Geoplot is an easy-to-use Python library that lets us create maps, generally with a single method call. It lets us create different kinds of maps, such as choropleth maps, scatter maps, bubble maps, cartograms, KDE maps, and connection/Sankey maps. Geoplot is built on top of cartopy and geopandas to make working with maps easier. We have already covered geoplot in another tutorial, where we explained how to create choropleth maps. Please feel free to check it out if you are interested in choropleth maps.
As a part of this tutorial, we'll cover how to create scatter and bubble maps using geoplot. We'll explain the API with simple, easy-to-follow examples.
We'll start by importing the necessary libraries. We have also imported geopandas as we'll be using geospatial datasets available from it by merging them with other datasets.
import geoplot
print("Geoplot Version : {}".format(geoplot.__version__))
import matplotlib.pyplot as plt
import geopandas as gpd
import pandas as pd
print("Geopandas Version : {}".format(gpd.__version__))
print("Available Datasets : {}".format(gpd.datasets.available))
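Below we have loaded the world geometries dataset available from geopandas. It has a column named geometry which holds polygons or multi-polygons representing the individual countries of the world. Please note that the geopandas.datasets module was deprecated in geopandas 0.14 and removed in 1.0, so the call below needs an older geopandas release.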
world = gpd.read_file(gpd.datasets.get_path("naturalearth_lowres"))
#world = gpd.read_file(geoplot.datasets.get_path("world"))
world.head()
Below we have loaded another dataset that has location information for Starbucks stores worldwide. The dataset can be downloaded from the below link.
The dataset has geospatial information for each store in the form of longitude and latitude. Geoplot requires that geospatial information be represented as shapely objects stored in the geometry column of the dataset. To satisfy this requirement, we have created shapely Point objects by looping through the longitude and latitude of each store, and stored them in the geometry column. Each Point object represents one location on the map.
Please note that we have also converted the plain pandas DataFrame to a GeoDataFrame.
from shapely.geometry import Point
starbucks = pd.read_csv("datasets/starbucks_store_locations.csv")
points = []
for lon, lat in zip(starbucks["Longitude"], starbucks["Latitude"]):
    points.append(Point(lon, lat))
starbucks["geometry"] = points
starbucks = gpd.GeoDataFrame(starbucks)
starbucks.head()
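The loop above can also be written in vectorized form. The sketch below uses `geopandas.points_from_xy` on a small made-up table (the three rows and their coordinates are illustrative, not taken from the Starbucks file). It assumes the coordinates are WGS84 longitude/latitude, which is why it sets `crs="EPSG:4326"`.

```python
import geopandas as gpd
import pandas as pd

# Hypothetical sample rows standing in for the Starbucks CSV.
stores = pd.DataFrame({
    "Store Number": ["A-1", "B-2", "C-3"],
    "Longitude": [-122.33, 2.35, 139.69],
    "Latitude": [47.61, 48.86, 35.69],
})

# Build one Point per row; x is longitude and y is latitude.
geometry = gpd.points_from_xy(stores["Longitude"], stores["Latitude"])

# Assumption: the coordinates are WGS84 degrees, so we declare EPSG:4326.
stores_gdf = gpd.GeoDataFrame(stores, geometry=geometry, crs="EPSG:4326")

print(stores_gdf.geometry.geom_type.unique())
print(stores_gdf.crs)
```

Declaring the CRS up front is optional for plotting, but it makes later reprojection calls such as `to_crs()` possible.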
Our first scatter map shows Starbucks store locations on the world map.
We have first created a world map using polyplot() method and stored the Axes object returned by it in a variable. We have then plotted the store locations on it using pointplot() method with the Starbucks GeoDataFrame we created earlier. We have given the Axes returned by polyplot() to pointplot() through the ax parameter so that it draws the points on the same map. Parameters like color, edgecolor and alpha control the styling of the points; geoplot forwards such extra keyword arguments to the underlying matplotlib scatter call.
world_map_axes = geoplot.polyplot(world, facecolor="lightgrey", alpha=0.5, figsize=(15, 10));
geoplot.pointplot(starbucks,
ax=world_map_axes,
color="tomato",
edgecolor="black",
alpha=0.5
);
In this section, we have created another scatter map that shows the locations of Starbucks stores on the US map. We have first loaded a GeoJSON dataset, using geopandas, which has a polygon for each US state; together they make up the whole US map. The dataset can be downloaded from the link below.
us_states = gpd.read_file("datasets/us-states.json")
us_states.head()
Below we have first drawn the US map using polyplot() method and stored the Axes reference returned by it in a variable. We have then plotted points on the map using pointplot() method. We have filtered our Starbucks GeoDataFrame to keep only US entries and given the result to pointplot(), along with the Axes reference returned by polyplot().
us_map_axes = geoplot.polyplot(us_states, facecolor="dodgerblue", alpha=0.1, figsize=(15, 15));
geoplot.pointplot(starbucks[starbucks["Country"] == "US"],
ax=us_map_axes,
color="dodgerblue",
edgecolor="black",
alpha=0.1
);
In this section, we have plotted a bubble map showing the population of each US state. To plot a bubble for each state, we need the center of that state, which is where the bubble will be drawn. The size of each bubble is based on the population of the state.
We have created a simple function named calculate_center() which takes a GeoDataFrame as input, calculates the center (centroid) of each polygon/multi-polygon geometry, and returns them. We have added a center for each state to our dataframe using this function, and stored its longitude and latitude in separate columns.
def calculate_center(df):
    """
    Calculate the centre of a geometry.
    This method first converts to a planar crs, gets the centroid,
    then converts back to the original crs. This gives a more
    accurate result.
    """
    original_crs = df.crs
    planar_crs = 'EPSG:3857'
    return df['geometry'].to_crs(planar_crs).centroid.to_crs(original_crs)
us_states["center"] = calculate_center(us_states)
us_states["Longitude"] = [val.x for val in us_states.center]
us_states["Latitude"] = [val.y for val in us_states.center]
us_states.head()
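To make the notion of a "center" concrete, here is a minimal pure-Python sketch of the area-weighted centroid of a simple polygon given in planar coordinates. The function name `polygon_centroid` is hypothetical and this is not how geopandas computes centroids internally; it only illustrates the quantity that calculate_center() returns for each state.

```python
def polygon_centroid(vertices):
    """Area-weighted centroid of a simple (non self-intersecting) polygon.

    vertices: list of (x, y) tuples in planar coordinates, listed in order
    around the ring (the closing vertex does not need to be repeated).
    """
    twice_area = 0.0
    cx = 0.0
    cy = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]  # wrap around to close the ring
        cross = x0 * y1 - x1 * y0
        twice_area += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    if twice_area == 0:
        raise ValueError("degenerate polygon with zero area")
    # Centroid = sum((p_i + p_next) * cross_i) / (6 * area), and area = twice_area / 2.
    return cx / (3.0 * twice_area), cy / (3.0 * twice_area)


square = [(0, 0), (2, 0), (2, 2), (0, 2)]
triangle = [(0, 0), (3, 0), (0, 3)]
print(polygon_centroid(square))    # (1.0, 1.0)
print(polygon_centroid(triangle))  # (1.0, 1.0)
```

The formula assumes flat x/y coordinates, which is one reason calculate_center() reprojects to a planar CRS before taking the centroid.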
Below we have loaded the dataset which has information about the population of each US state. The dataset can be downloaded from the below link.
us_pop = pd.read_csv("datasets/State Populations.csv")
us_pop.head()
Now we have merged the US states geo dataset with the US state population dataset. We have removed the geometry column, which held polygons/multi-polygons, and renamed the center column to geometry so that the points become the new geometry column. We'll be using this modified dataframe to create bubbles on the US map.
us_states_pop = us_states.merge(us_pop, left_on="name", right_on="State").drop(columns=["geometry"]).rename(columns={"center": "geometry"})
us_states_pop.head()
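An alternative to the drop-and-rename step is to switch the active geometry column explicitly with `GeoDataFrame.set_geometry()`. The sketch below uses a made-up two-row frame (the names, polygons and points are illustrative), so treat it as one possible approach under those assumptions.

```python
import geopandas as gpd
from shapely.geometry import Point, Polygon

# Hypothetical two-state frame: polygons as the active geometry, points in "center".
states = gpd.GeoDataFrame(
    {
        "name": ["A", "B"],
        "center": gpd.GeoSeries([Point(0.5, 0.5), Point(2.5, 0.5)], crs="EPSG:4326"),
    },
    geometry=[
        Polygon([(0, 0), (1, 0), (1, 1), (0, 1)]),
        Polygon([(2, 0), (3, 0), (3, 1), (2, 1)]),
    ],
    crs="EPSG:4326",
)

# Make "center" the active geometry column, then drop the polygon column.
state_points = states.set_geometry("center").drop(columns=["geometry"])

print(state_points.geometry.name)
print(state_points.geom_type.tolist())
```

With this variant the point column keeps its original name, and `state_points.geometry` refers to the points.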
Below we have first created a US map using polyplot() method, giving it the original US states GeoDataFrame which we loaded earlier. We have stored the Axes reference returned by it in a variable.
We have then created bubbles on the map using the merged geo dataframe which we created in the previous cell. We have instructed pointplot() method to use the 2018 Population column for both hue and scale, so the values of this column decide both the color and the size of the points. We have given it the Axes reference returned by polyplot() so that the bubbles are drawn on the same map. Setting legend_var to "scale" tells the method to build the legend from the scale (size) variable rather than from hue.
We have asked the method to use the Quantiles scheme, which groups the column values into 5 classes by default. We can see from the legend that it has 5 entries. The limits parameter is set to (5, 50), which asks for bubbles in this size range.
us_map_axes = geoplot.polyplot(us_states, facecolor="white", figsize=(15, 10));
geoplot.pointplot(us_states_pop,
hue="2018 Population",
scale="2018 Population",
scheme="Quantiles", ## mapclassify.Quantiles(us_states_pop["2018 Population"], k=5) will give same results.
ax=us_map_axes,
edgecolor="black",
legend=True,
legend_var="scale",
legend_kwargs={"loc":"best",
"fontsize": "large",
"title":"Population",
"title_fontsize":"large"},
limits=(5, 50),
);
plt.title("US State's 2018 Population", fontdict={"fontsize": 15}, pad=15);
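A quantile scheme puts roughly the same number of observations into each class. The sketch below illustrates the idea with `pandas.qcut` on made-up numbers. It is a stand-in for, not a reproduction of, the mapclassify classifier used by geoplot, whose exact bin edges may differ.

```python
import pandas as pd

# Hypothetical population values (in millions), not from the tutorial's CSV.
population = pd.Series([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], name="population")

# Split into 5 quantile classes; labels=False returns the class index 0..4.
classes = pd.qcut(population, q=5, labels=False)

# With 10 evenly spread values, each of the 5 classes holds 2 observations.
print(classes.value_counts().sort_index())
```

Because classes are defined by rank rather than by value, a few very large states do not squeeze all the others into one color.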
Our second bubble map shows the GDP of countries on the world map. We have first loaded a dataset which has happiness scores and a few other attributes for each country of the world. The dataset can be downloaded from the link below.
We have then merged the world geo dataset with the happiness dataset. We have then calculated the center of each country using calculate_center() function, removed the original geometry column which held polygons/multi-polygons, and renamed the center column to geometry.
happiness = pd.read_csv("datasets/world_happiness_2019.csv")
world_happiness = world.merge(happiness, left_on="name", right_on="Country or region")
world_happiness["center"] = calculate_center(world_happiness)
world_happiness = world_happiness.drop(columns=["geometry"]).rename(columns={"center": "geometry"})
world_happiness.head()
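A merge on country names uses an inner join by default, so it silently drops rows whose names are spelled differently in the two tables. One way to spot such rows is an outer merge with `indicator=True`. The two small tables below are hypothetical stand-ins with made-up scores, not the real datasets.

```python
import pandas as pd

# Hypothetical stand-ins for the two tables; names and scores are illustrative.
geo = pd.DataFrame({"name": ["France", "United States of America", "Japan"]})
scores = pd.DataFrame({
    "Country or region": ["France", "United States", "Japan"],
    "Score": [6.6, 6.9, 5.9],
})

# An outer merge with indicator=True labels each row by where its key was found.
check = geo.merge(scores, left_on="name", right_on="Country or region",
                  how="outer", indicator=True)

# Rows not marked "both" would be dropped by the default inner merge.
unmatched = check[check["_merge"] != "both"]
print(unmatched[["name", "Country or region", "_merge"]])
```

Any names listed here can then be harmonized (for example with a rename mapping) before doing the real merge.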
Below we have first created a world map using polyplot() method and the world geo dataset. We have stored the Axes reference returned by it in a variable.
We have then created bubbles showing the GDP of each country on the world map using pointplot() method. We have given it our merged world geo dataset from the previous cell and asked it to use the gdp_md_est column, which comes from the world geo dataset, for both hue and scale. The legend is built from the scale variable because legend_var is set to "scale".
world_map_axes = geoplot.polyplot(world, facecolor="white", figsize=(15, 10));
geoplot.pointplot(world_happiness,
hue="gdp_md_est",
scale="gdp_md_est",
cmap="RdBu",
ax=world_map_axes,
edgecolor="black",
legend=True,
legend_var="scale",
legend_kwargs={"loc":"best",
"fontsize": "large",
"title":"GDP",
"title_fontsize":"large"},
limits=(5, 50),
);
plt.title("GDP of Countries", fontdict={"fontsize": 15}, pad=15);
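The effect of limits=(5, 50) can be pictured as a mapping from data values to bubble sizes. The sketch below assumes a plain linear mapping from the smallest and largest data value onto the two limits. The helper name `scale_to_limits` and the sample values are hypothetical, and geoplot's internal implementation may differ in detail.

```python
def scale_to_limits(values, limits=(5, 50)):
    """Linearly map values onto the range [limits[0], limits[1]]."""
    lo, hi = min(values), max(values)
    smin, smax = limits
    if hi == lo:
        # All values are equal: fall back to the midpoint of the size range.
        return [(smin + smax) / 2.0 for _ in values]
    return [smin + (v - lo) * (smax - smin) / (hi - lo) for v in values]


gdp = [100, 550, 1000]  # hypothetical GDP values
sizes = scale_to_limits(gdp)
print(sizes)  # [5.0, 27.5, 50.0]
```

Under this assumption the smallest value always gets the minimum size and the largest value the maximum size, whatever the units of the data.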
In this section, we'll be creating a bubble map showing the count of Starbucks stores for each US state.
We have first created an intermediate dataframe by keeping only the US entries of the Starbucks dataset. We have then created a new dataframe that has the count of stores per US state using the grouping functionality of pandas.
We have then merged this store-count dataframe with the geo dataframe from the US state population bubble map section, which has the center of each US state. The final geo dataframe has both the center and the Starbucks store count for each state. We'll be using it for creating our bubble map.
starbucks_us = starbucks[starbucks["Country"] == "US"]
starbucks_us_state_cnt = starbucks_us.groupby("State/Province").count()[["Store Number"]].reset_index().rename(columns={"Store Number":"Store Count"})
starbucks_us_state_cnt.head()
starbucks_us_statewise = us_states_pop.merge(starbucks_us_state_cnt, left_on="id", right_on="State/Province")
starbucks_us_statewise.head()
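The per-state count can also be written with `groupby(...).size()`, which counts rows per group directly. The sketch below runs on a tiny made-up table; the store numbers and states are illustrative only.

```python
import pandas as pd

# Hypothetical store rows; not taken from the Starbucks CSV.
stores = pd.DataFrame({
    "Store Number": ["S1", "S2", "S3", "S4", "S5"],
    "State/Province": ["CA", "CA", "WA", "NY", "WA"],
})

# size() counts rows per group; reset_index(name=...) names the count column.
store_counts = (
    stores.groupby("State/Province")
          .size()
          .reset_index(name="Store Count")
)
print(store_counts)
```

Note that `count()` skips missing values in the counted column while `size()` counts every row, so the two can differ if some store numbers are missing.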
Below we have first created a US map using polyplot() method. We have stored the Axes reference returned by it in a variable which we'll pass on to pointplot() method.
We have then added bubbles to the US map using pointplot() method. We have asked it to use the Store Count column for both hue and scale. We have added a title to the chart as well.
us_map_axes = geoplot.polyplot(us_states, facecolor="white", figsize=(15, 15));
geoplot.pointplot(starbucks_us_statewise,
hue="Store Count",
scale="Store Count",
cmap="BrBG",
ax=us_map_axes,
edgecolor="black",
legend=True,
legend_var="scale",
limits=(5, 50),
);
plt.title("Starbucks US Stores Count Statewise", fontdict={"fontsize": 15}, pad=15);
This ends our small tutorial explaining how we can use geoplot to create scatter and bubble maps in Python. Please feel free to let us know your views in the comments section. If you want to learn about other Python libraries that can be used to plot maps, please feel free to go through the tutorials in the references section.