Updated On : Oct-08,2021 Time Investment : ~30 mins

Geoplot - Scatter & Bubble Maps [Python]

Geoplot is an easy-to-use Python library that generally lets us create a map with a single method call. It lets us create different kinds of maps, such as choropleth maps, scatter maps, bubble maps, cartograms, KDE maps, and connection/Sankey maps. Geoplot is built on top of cartopy and geopandas to make working with maps easier. We have already covered geoplot in an earlier tutorial that explains how to create choropleth maps. Please feel free to check it out if you are interested in choropleth maps.

Geoplot - Choropleth Maps

As a part of this tutorial, we'll cover how to create scatter and bubble maps using geoplot. We'll explain the API with simple, easy-to-follow examples.

We'll start by importing the necessary libraries. We have also imported geopandas because we'll be merging the geospatial datasets it provides with other datasets.

import geoplot

print("Geoplot Version : {}".format(geoplot.__version__))

import matplotlib.pyplot as plt

Geoplot Version : 0.4.4

import geopandas as gpd

import pandas as pd

print("Geopandas Version : {}".format(gpd.__version__))

print("Available Datasets : {}".format(gpd.datasets.available))

Geopandas Version : 0.9.0
Available Datasets : ['naturalearth_cities', 'naturalearth_lowres', 'nybb']

Below we have loaded the world geometries dataset available from geopandas. It has a column named geometry which holds instances of polygons or multi-polygons representing individual countries of the world.

world = gpd.read_file(gpd.datasets.get_path("naturalearth_lowres"))
#world = gpd.read_file(geoplot.datasets.get_path("world"))

world.head()

	pop_est	continent	name	iso_a3	gdp_md_est	geometry
0	920938	Oceania	Fiji	FJI	8374.0	MULTIPOLYGON (((180.00000 -16.06713, 180.00000...
1	53950935	Africa	Tanzania	TZA	150600.0	POLYGON ((33.90371 -0.95000, 34.07262 -1.05982...
2	603253	Africa	W. Sahara	ESH	906.5	POLYGON ((-8.66559 27.65643, -8.66512 27.58948...
3	35623680	North America	Canada	CAN	1674000.0	MULTIPOLYGON (((-122.84000 49.00000, -122.9742...
4	326625791	North America	United States of America	USA	18560000.0	MULTIPOLYGON (((-122.84000 49.00000, -120.0000...

Below we have loaded another dataset that has location information for Starbucks stores worldwide. The dataset can be downloaded from the below link.

Starbucks Store Locations

The dataset has geospatial information for each store in the form of longitude and latitude. Geoplot requires that geospatial information be represented as shapely objects stored in the geometry column of the dataset. To satisfy this requirement, we have created shapely Point objects by looping through the longitude and latitude of each store and stored them in the geometry column. Each Point object represents one location on the map.

Please note that we have also converted the regular pandas dataframe to a GeoDataFrame.

from shapely.geometry import Point

starbucks = pd.read_csv("datasets/starbucks_store_locations.csv")

points = []
for lon, lat in zip(starbucks["Longitude"], starbucks["Latitude"]):
    points.append(Point(lon, lat))

starbucks["geometry"] = points

starbucks = gpd.GeoDataFrame(starbucks)

starbucks.head()

	Brand	Store Number	Store Name	Ownership Type	Street Address	City	State/Province	Country	Postcode	Phone Number	Timezone	Longitude	Latitude	geometry
0	Starbucks	47370-257954	Meritxell, 96	Licensed	Av. Meritxell, 96	Andorra la Vella	7	AD	AD500	376818720	GMT+1:00 Europe/Andorra	1.53	42.51	POINT (1.53000 42.51000)
1	Starbucks	22331-212325	Ajman Drive Thru	Licensed	1 Street 69, Al Jarf	Ajman	AJ	AE	NaN	NaN	GMT+04:00 Asia/Dubai	55.47	25.42	POINT (55.47000 25.42000)
2	Starbucks	47089-256771	Dana Mall	Licensed	Sheikh Khalifa Bin Zayed St.	Ajman	AJ	AE	NaN	NaN	GMT+04:00 Asia/Dubai	55.47	25.39	POINT (55.47000 25.39000)
3	Starbucks	22126-218024	Twofour 54	Licensed	Al Salam Street	Abu Dhabi	AZ	AE	NaN	NaN	GMT+04:00 Asia/Dubai	54.38	24.48	POINT (54.38000 24.48000)
4	Starbucks	17127-178586	Al Ain Tower	Licensed	Khaldiya Area, Abu Dhabi Island	Abu Dhabi	AZ	AE	NaN	NaN	GMT+04:00 Asia/Dubai	54.54	24.51	POINT (54.54000 24.51000)
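
As an aside, the loop above can also be written as a single vectorized call. The sketch below assumes the installed geopandas provides gpd.points_from_xy() and uses a tiny hand-made dataframe with made-up store names in place of the Starbucks CSV. Tagging the result as EPSG:4326 is also an assumption, namely that the longitude/latitude values are plain WGS84 degrees.

```python
import pandas as pd
import geopandas as gpd

# Tiny stand-in for the Starbucks CSV (store names are illustrative only).
stores = pd.DataFrame({
    "Store Name": ["Store A", "Store B"],
    "Longitude": [1.53, 55.47],
    "Latitude": [42.51, 25.42],
})

# Build all Point geometries in one call instead of a Python loop.
stores_gdf = gpd.GeoDataFrame(
    stores,
    geometry=gpd.points_from_xy(stores["Longitude"], stores["Latitude"]),
    crs="EPSG:4326",  # assumption: coordinates are WGS84 longitude/latitude
)

print(stores_gdf.geometry.iloc[0])
```

Setting a CRS is not needed for the plots in this tutorial, but it can help later when reprojecting the data.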

Starbucks Store Locations Worldwide Scatter Map

Our first scatter map shows Starbucks store locations on the world map. We have first created a world map using the polyplot() method and then plotted the store locations on it as points using the pointplot() method of geoplot. The definition of pointplot() is given below for reference.

pointplot(df, projection=None, hue=None, cmap=None, norm=None, scheme=None, scale=None, limits=(1, 5), legend=False, legend_var=None, legend_values=None, legend_labels=None, legend_kwargs=None, figsize=(8, 6), extent=None, ax=None, **kwargs) - This method takes as input a GeoDataFrame whose geometry column holds locations on the map and plots a point at each of those locations.
- The projection parameter takes as input an instance of any projection available from the geoplot.crs module. It can be used to change the projection of the map.
- The hue parameter takes as input a column name from the GeoDataFrame, a GeoSeries, or an iterable specifying the values used to color the points on the map. The column can hold continuous or categorical data. It'll create a color bar for a continuous column and a categorical legend for a categorical column.
- The cmap parameter accepts a matplotlib colormap name.
- The scheme parameter accepts an instance of a mapclassify classifier specifying the scheme used to group the data. We can also give it a string naming any mapclassify class, in which case an instance of that class is created with default parameters. Below are some of the available classification schemes.
  - 'BoxPlot', 'EqualInterval', 'FisherJenks', 'FisherJenksSampled', 'HeadTailBreaks', 'JenksCaspall', 'JenksCaspallForced', 'JenksCaspallSampled', 'MaxP', 'MaximumBreaks', 'NaturalBreaks', 'Quantiles', 'Percentiles', 'StdMean'
- The scale parameter accepts a column name from the dataframe or an iterable whose values decide the size of points on the map.
- The limits parameter accepts a tuple of (min, max) specifying the minimum and maximum sizes of bubbles on the map. The values of scale are mapped to bubble sizes in the range specified by this parameter.
- The legend parameter accepts a boolean value specifying whether to include a legend.
- The legend_var parameter accepts one of the below strings specifying which variable to use for the legend.
  - 'hue' - It creates a legend based on the hue parameter value.
  - 'scale' - It creates a legend based on the scale parameter value.
- The legend_kwargs parameter accepts a dictionary specifying options to modify the legend of the map.
- The legend_labels parameter accepts a list of strings to use as labels in a categorical legend. When we group map values using the scheme parameter, this parameter lets us give a label to each group shown in the legend.
- The legend_values parameter accepts a list of values to be used for categorical legends.
- The extent parameter takes as input a tuple of 4 values (min_longitude, min_latitude, max_longitude, max_latitude) specifying the bounding box of the map.
- The figsize parameter accepts a tuple specifying the figure size.
- The ax parameter accepts a matplotlib Axes object.
- All other extra parameters are passed on to the underlying matplotlib scatter call that draws the points. These can be the fill color of points, edge color, edge width, etc.
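
To make the relationship between scale and limits concrete, here is a minimal sketch of a linear mapping from data values to marker sizes. It assumes the mapping is a straight linear interpolation between the two limits, which is our reading of geoplot's default behaviour rather than a copy of its source code. The helper name linear_scale is hypothetical.

```python
def linear_scale(values, limits=(1, 5)):
    """Map each value linearly onto the [min(limits), max(limits)] range."""
    low, high = min(limits), max(limits)
    vmin, vmax = min(values), max(values)
    if vmax == vmin:
        # All values are equal: fall back to the smallest size.
        return [float(low) for _ in values]
    slope = (high - low) / (vmax - vmin)
    return [low + slope * (v - vmin) for v in values]

# Three state populations taken from the tables in this tutorial.
populations = [738068, 4888949, 39776830]
sizes = linear_scale(populations, limits=(5, 50))
print([round(s, 1) for s in sizes])
```

Under this assumption the smallest value gets the lower limit, the largest gets the upper limit, and everything else falls proportionally in between.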

We have first created a world map using the polyplot() method and stored the Axes object it returns in a variable. We have then plotted points on the world map using the pointplot() method and the Starbucks dataset we created earlier. We have given the Axes returned by polyplot() to pointplot() so that the points are drawn on the same map. We have also given parameters like color, edgecolor, and alpha, which control the appearance of the plotted points.

world_map_axes = geoplot.polyplot(world, facecolor="lightgrey", alpha=0.5, figsize=(15, 10));

geoplot.pointplot(starbucks,
                  ax=world_map_axes,
                  color="tomato",
                  edgecolor="black",
                  alpha=0.5
                 );

Starbucks Store Locations Across US Scatter Map

In this section, we have created another scatter map that shows the locations of Starbucks stores on the US map. We have first loaded, using geopandas, a GeoJSON dataset that has a polygon for each US state; together these polygons make up the whole US map. The dataset can be downloaded from the below link.

US States Geo JSON

us_states = gpd.read_file("datasets/us-states.json")

us_states.head()

	id	name	geometry
0	AL	Alabama	POLYGON ((-87.35930 35.00118, -85.60667 34.984...
1	AK	Alaska	MULTIPOLYGON (((-131.60202 55.11798, -131.5691...
2	AZ	Arizona	POLYGON ((-109.04250 37.00026, -109.04798 31.3...
3	AR	Arkansas	POLYGON ((-94.47384 36.50186, -90.15254 36.496...
4	CA	California	POLYGON ((-123.23326 42.00619, -122.37885 42.0...

Below we have first drawn the US map using the polyplot() method and stored the Axes reference it returns in a variable. We have then plotted points on the map using the pointplot() method. We have filtered our Starbucks GeoDataFrame to keep only US entries and given the result to pointplot(), along with the Axes reference returned by polyplot().

us_map_axes = geoplot.polyplot(us_states, facecolor="dodgerblue", alpha=0.1, figsize=(15, 15));

geoplot.pointplot(starbucks[starbucks["Country"] == "US"],
                  ax=us_map_axes,
                  color="dodgerblue",
                  edgecolor="black",
                  alpha=0.1
                 );
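
The extent parameter described earlier can be used to zoom the map to a region of interest. The sketch below computes a padded (min_longitude, min_latitude, max_longitude, max_latitude) tuple from a handful of points. The helper name bounding_extent, the padding value, and the sample coordinates are illustrative assumptions and are not part of geoplot.

```python
def bounding_extent(points, pad=1.0):
    """Return (min_lon, min_lat, max_lon, max_lat) around (lon, lat) pairs."""
    lons = [lon for lon, _ in points]
    lats = [lat for _, lat in points]
    return (min(lons) - pad, min(lats) - pad, max(lons) + pad, max(lats) + pad)

# Rough (longitude, latitude) pairs inside the contiguous US (illustrative).
sample_points = [(-122.33, 47.61), (-80.19, 25.76), (-71.06, 42.36)]
extent = bounding_extent(sample_points, pad=2.0)
print(extent)
# The tuple could then be given to the map call, e.g. extent=extent.
```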

US States Population Bubble Map

In this section, we have plotted a bubble map showing the population of each US state. To plot a bubble for each state, we need the center of that state, which is where the bubble will be drawn. The size of each bubble is based on the population of the state.

We have created a simple function named calculate_center() which takes a GeoDataFrame as input, calculates the center of each polygon/multi-polygon geometry, and returns the centers. We have used it to add a center for each state to our dataframe.

def calculate_center(df):
    """
    Calculate the centre of each geometry.

    This method first converts to a planar crs, gets the centroid,
    then converts back to the original crs. This gives a more
    accurate result than taking the centroid directly in a
    geographic crs.
    """
    original_crs = df.crs
    planar_crs = 'EPSG:3857'
    return df['geometry'].to_crs(planar_crs).centroid.to_crs(original_crs)

us_states["center"] = calculate_center(us_states)
us_states["Longitude"] = [val.x for val in us_states.center]
us_states["Latitude"] = [val.y for val in us_states.center]

us_states.head()

	id	name	geometry	center	Longitude	Latitude
0	AL	Alabama	POLYGON ((-87.35930 35.00118, -85.60667 34.984...	POINT (-86.82705 32.81439)	-86.827048	32.814386
1	AK	Alaska	MULTIPOLYGON (((-131.60202 55.11798, -131.5691...	POINT (-152.52500 65.00297)	-152.525004	65.002968
2	AZ	Arizona	POLYGON ((-109.04250 37.00026, -109.04798 31.3...	POINT (-111.66516 34.33632)	-111.665157	34.336315
3	AR	Arkansas	POLYGON ((-94.47384 36.50186, -90.15254 36.496...	POINT (-92.43914 34.91573)	-92.439137	34.915733
4	CA	California	POLYGON ((-123.23326 42.00619, -122.37885 42.0...	POINT (-119.68388 37.38770)	-119.683878	37.387697
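
For intuition about what a centroid is, the sketch below computes the centroid of a simple planar polygon with the standard shoelace-based formula. It is a stand-alone illustration in pure Python and not what geopandas runs internally. The function name polygon_centroid is hypothetical.

```python
def polygon_centroid(vertices):
    """Centroid of a simple (non self-intersecting) planar polygon.

    `vertices` is a list of (x, y) tuples in order around the boundary.
    """
    area2 = 0.0  # twice the signed area
    cx = cy = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]  # wrap around to close the ring
        cross = x0 * y1 - x1 * y0
        area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    if area2 == 0:
        raise ValueError("polygon has zero area")
    return cx / (3.0 * area2), cy / (3.0 * area2)

# A 4 x 2 rectangle; its centroid is the middle of the rectangle.
rectangle = [(0, 0), (4, 0), (4, 2), (0, 2)]
print(polygon_centroid(rectangle))
```

The formula treats coordinates as flat x/y values, which is one reason the tutorial's function reprojects to a planar CRS before taking the centroid.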

Below we have loaded the dataset which has information about the population of each US state. The dataset can be downloaded from the below link.

US States Population 2018

us_pop = pd.read_csv("datasets/State Populations.csv")

us_pop.head()

	State	2018 Population
0	California	39776830
1	Texas	28704330
2	Florida	21312211
3	New York	19862512
4	Pennsylvania	12823989

Now we have merged the US states geo dataset with the US states population dataset. We have removed the geometry column, which held polygons/multi-polygons, and renamed the center column as the new geometry column. We'll be using this modified dataframe to create bubbles on the US map.

us_states_pop = us_states.merge(us_pop, left_on="name", right_on="State").drop(columns=["geometry"]).rename(columns={"center": "geometry"})

us_states_pop.head()

	id	name	geometry	Longitude	Latitude	State	2018 Population
0	AL	Alabama	POINT (-86.82705 32.81439)	-86.827048	32.814386	Alabama	4888949
1	AK	Alaska	POINT (-152.52500 65.00297)	-152.525004	65.002968	Alaska	738068
2	AZ	Arizona	POINT (-111.66516 34.33632)	-111.665157	34.336315	Arizona	7123898
3	AR	Arkansas	POINT (-92.43914 34.91573)	-92.439137	34.915733	Arkansas	3020327
4	CA	California	POINT (-119.68388 37.38770)	-119.683878	37.387697	California	39776830

Below we have first created a US map using the polyplot() method, giving it the original US states GeoDataFrame which we loaded earlier. We have stored the Axes reference it returns in a variable.

We have then created bubbles on the map using the merged geo dataframe which we created in the previous cell. We have instructed the pointplot() method to use the column 2018 Population for both hue and scale, so the values of this column decide the color and the size of points on the map. We have given it the Axes reference generated by polyplot() so that the bubbles are drawn on the same map. We have also instructed the method to use the value of the scale parameter to decide legend values.

We have asked the method to use the Quantiles scheme to group the column data into 5 classes. We can see that the legend has 5 entries. The limits parameter is set to (5, 50), which tells the method to create bubbles in this size range.

us_map_axes = geoplot.polyplot(us_states, facecolor="white", figsize=(15, 10));

geoplot.pointplot(us_states_pop,
                  hue="2018 Population",
                  scale="2018 Population",
                  scheme="Quantiles", ## mapclassify.Quantiles(us_states_pop["2018 Population"], k=5) will give same results.
                  ax=us_map_axes,
                  edgecolor="black",
                  legend=True,
                  legend_var="scale",
                  legend_kwargs={"loc":"best",
                                  "fontsize": "large",
                                  "title":"Population",
                                  "title_fontsize":"large"},
                  limits=(5, 50),

                 );

plt.title("US State's 2018 Population", fontdict={"fontsize": 15}, pad=15);
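
To illustrate what a quantile scheme does, the sketch below splits a small list of numbers into 5 classes of roughly equal size. It assumes class breaks are computed as quantiles with linear interpolation, which is how we understand mapclassify.Quantiles to behave; it is not the library's implementation. The helper names quantile_bins and classify are hypothetical.

```python
import bisect

def quantile_bins(values, k=5):
    """Upper bounds of k quantile classes (linear interpolation)."""
    data = sorted(values)
    n = len(data)
    bins = []
    for i in range(1, k + 1):
        pos = (n - 1) * i / k      # fractional index of the i/k quantile
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        bins.append(data[lo] + (data[hi] - data[lo]) * (pos - lo))
    return bins

def classify(values, bins):
    """Index of the first class whose upper bound is >= the value."""
    return [bisect.bisect_left(bins, v) for v in values]

values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
bins = quantile_bins(values, k=5)
labels = classify(values, bins)
print(bins)
print(labels)
```

Each class ends up with two of the ten values, which mirrors why the legend of the map above shows 5 entries.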

GDP of Countries Bubble Map

Our second bubble map shows the GDP of countries on the world map. We have first loaded a dataset which has happiness scores and a few other important parameters for each country of the world. The dataset can be downloaded from the below link.

World Happiness Dataset

We have then merged the world geo dataset with the happiness dataset. Next, we have found the center of each country using calculate_center(), removed the original geometry column which held polygons/multi-polygons, and renamed the center column as the new geometry column.

happiness = pd.read_csv("datasets/world_happiness_2019.csv")

world_happiness = world.merge(happiness, left_on="name", right_on="Country or region")

world_happiness["center"] = calculate_center(world_happiness)

world_happiness = world_happiness.drop(columns=["geometry"]).rename(columns={"center": "geometry"})

world_happiness.head()

	pop_est	continent	name	iso_a3	gdp_md_est	Overall rank	Country or region	Score	GDP per capita	Social support	Healthy life expectancy	Freedom to make life choices	Generosity	Perceptions of corruption	geometry
0	53950935	Africa	Tanzania	TZA	150600.0	153	Tanzania	3.231	0.476	0.885	0.499	0.417	0.276	0.147	POINT (34.75848 -6.27836)
1	35623680	North America	Canada	CAN	1674000.0	9	Canada	7.278	1.365	1.505	1.039	0.584	0.285	0.308	POINT (-96.99822 67.99064)
2	326625791	North America	United States of America	USA	18560000.0	19	United States of America	6.892	1.433	1.457	0.874	0.454	0.280	0.128	POINT (-119.45018 51.26001)
3	18556698	Asia	Kazakhstan	KAZ	460700.0	60	Kazakhstan	5.809	1.173	1.508	0.729	0.410	0.146	0.096	POINT (67.31752 48.46973)
4	29748859	Asia	Uzbekistan	UZB	202300.0	41	Uzbekistan	6.174	0.745	1.529	0.756	0.631	0.322	0.240	POINT (63.12119 41.82781)
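
A merge on country names keeps only the rows whose names match exactly in both datasets, so differently spelled names drop out without any warning. The sketch below uses small made-up tables rather than the real files and shows one way to list unmatched names with the indicator option of the pandas merge method.

```python
import pandas as pd

# Made-up stand-ins for the world and happiness tables.
geo = pd.DataFrame({"name": ["Canada", "Fiji", "W. Sahara"]})
scores = pd.DataFrame({
    "Country or region": ["Canada", "Fiji", "Finland"],
    "Score": [7.3, 5.0, 7.8],  # illustrative numbers only
})

# A left merge keeps every row of `geo`; the `_merge` column records
# whether each row found a partner in `scores`.
checked = geo.merge(scores, left_on="name", right_on="Country or region",
                    how="left", indicator=True)
unmatched = checked.loc[checked["_merge"] == "left_only", "name"].tolist()
print(unmatched)
```

Running a check like this before plotting makes it easier to spot countries that would otherwise be missing from the bubble map.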

Below we have first created a world map using the polyplot() method and the world geo dataset. We have stored the Axes reference it returns in a variable.

We have then created bubbles showing the GDP of each country on the world map using the pointplot() method. We have given it our merged world geo dataset from the previous cell and asked it to use the gdp_md_est column for both hue and scale. The legend is based on the value of the scale parameter.

world_map_axes = geoplot.polyplot(world, facecolor="white", figsize=(15, 10));

geoplot.pointplot(world_happiness,
                  hue="gdp_md_est",
                  scale="gdp_md_est",
                  cmap="RdBu",
                  ax=world_map_axes,
                  edgecolor="black",
                  legend=True,
                  legend_var="scale",
                  legend_kwargs={"loc":"best",
                                  "fontsize": "large",
                                  "title":"GDP",
                                  "title_fontsize":"large"},
                  limits=(5, 50),

                 );

plt.title("GDP of Countries", fontdict={"fontsize": 15}, pad=15);

Starbucks US Store per State Bubble Map

In this section, we'll be creating a bubble map showing the count of Starbucks stores for each US state.

We have first created an intermediate dataframe that keeps only the US Starbucks entries. We have then created a new dataframe that has a count of stores per US state by using the grouping functionality of pandas.

We have then merged this store count dataframe with the geo dataset from the US States Population Bubble Map section, which has the center of each US state. The final geo dataframe has both the center and the Starbucks store count for each state. We'll be using this dataset for creating our bubble map.

starbucks_us = starbucks[starbucks["Country"] == "US"]

starbucks_us_state_cnt = starbucks_us.groupby("State/Province").count()[["Store Number"]].reset_index().rename(columns={"Store Number":"Store Count"})

starbucks_us_state_cnt.head()

	State/Province	Store Count
0	AK	49
1	AL	85
2	AR	55
3	AZ	488
4	CA	2821
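
One detail worth knowing about the grouping step: count() counts only non-null entries in each column, while size() counts rows. The sketch below uses made-up rows, not the real Starbucks data, to show the difference when a store number is missing.

```python
import pandas as pd

# Made-up rows standing in for the filtered US stores dataframe.
stores = pd.DataFrame({
    "State/Province": ["CA", "CA", "AK", "AZ", "CA"],
    "Store Number": ["s1", "s2", "s3", None, "s5"],
})

# size() counts rows per group, including rows with missing values.
by_size = stores.groupby("State/Province").size().reset_index(name="Store Count")

# count() skips the missing store number in the AZ row.
by_count = stores.groupby("State/Province")["Store Number"].count()

print(by_size)
print(by_count)
```

If every row has a store number, as the tutorial's result suggests, both approaches give the same counts.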

starbucks_us_statewise = us_states_pop.merge(starbucks_us_state_cnt, left_on="id", right_on="State/Province")

starbucks_us_statewise.head()

	id	name	geometry	Longitude	Latitude	State	2018 Population	State/Province	Store Count
0	AL	Alabama	POINT (-86.82705 32.81439)	-86.827048	32.814386	Alabama	4888949	AL	85
1	AK	Alaska	POINT (-152.52500 65.00297)	-152.525004	65.002968	Alaska	738068	AK	49
2	AZ	Arizona	POINT (-111.66516 34.33632)	-111.665157	34.336315	Arizona	7123898	AZ	488
3	AR	Arkansas	POINT (-92.43914 34.91573)	-92.439137	34.915733	Arkansas	3020327	AR	55
4	CA	California	POINT (-119.68388 37.38770)	-119.683878	37.387697	California	39776830	CA	2821

Below we have first created a US map using the polyplot() method. We have stored the Axes reference it returns in a variable, which we'll pass on to the pointplot() method.

We have then added bubbles on the US map using the pointplot() method. We have asked it to use the Store Count column for both the hue and scale parameters. We have added a title to the chart as well.

us_map_axes = geoplot.polyplot(us_states, facecolor="white", figsize=(15, 15));

geoplot.pointplot(starbucks_us_statewise,
                  hue="Store Count",
                  scale="Store Count",
                  cmap="BrBG",
                  ax=us_map_axes,
                  edgecolor="black",
                  legend=True,
                  legend_var="scale",
                  limits=(5, 50),

                 );

plt.title("Starbucks US Stores Count Statewise", fontdict={"fontsize": 15}, pad=15);

This ends our small tutorial explaining how we can use geoplot to create scatter and bubble maps in Python. Please feel free to let us know your views in the comments section. If you want to learn about other Python libraries that can be used to plot maps, please go through the tutorials in the references section.

References

Sunny Solanki
