Choropleth Maps & Scatter Maps using cufflinks [Python]

Updated On : Sep-29,2020 Time Investment : ~20 mins


The cufflinks library wraps pandas so that we can create interactive plotly charts directly from a dataframe by calling its iplot() or figure() method. The iplot() API closely mirrors the pandas plot() API, which generates matplotlib-based charts. We have covered how to generate various charts with cufflinks in a separate tutorial, and we recommend going through it first if you have no background in cufflinks.

cufflinks - How to create plotly charts from pandas dataframe with one line of code?

In this tutorial, we'll use the same API to generate scatter maps and choropleth maps from a pandas dataframe, each with one line of code.

We'll start by loading the necessary libraries.

import pandas as pd
import numpy as np

import cufflinks as cf

print("List of Cufflinks Themes : ", cf.getThemes())

cf.set_config_file(theme='ggplot',sharing='public',offline=True)

List of Cufflinks Themes :  ['ggplot', 'pearl', 'solar', 'space', 'white', 'polar', 'henanigans']

Load Datasets

We'll be using the two datasets described below for plotting various maps. Both are available from Kaggle, and we suggest downloading them to follow along with the tutorial.

World Happiness Report Dataset - It has information about attributes like happiness score, GDP per capita, social support, healthy life expectancy, generosity, corruption, and freedom to make life choices for each country of the world.
Starbucks Store Locations Dataset - It has information about Starbucks store locations worldwide. It has information about each store's name, address, city, state, country, latitude, and longitude.

We have loaded both datasets as pandas dataframes.

starbucks_stores = pd.read_csv("datasets/starbucks_store_locations.csv")

starbucks_stores.head()

	Brand	Store Number	Store Name	Ownership Type	Street Address	City	State/Province	Country	Postcode	Phone Number	Timezone	Longitude	Latitude
0	Starbucks	47370-257954	Meritxell, 96	Licensed	Av. Meritxell, 96	Andorra la Vella	7	AD	AD500	376818720	GMT+1:00 Europe/Andorra	1.53	42.51
1	Starbucks	22331-212325	Ajman Drive Thru	Licensed	1 Street 69, Al Jarf	Ajman	AJ	AE	NaN	NaN	GMT+04:00 Asia/Dubai	55.47	25.42
2	Starbucks	47089-256771	Dana Mall	Licensed	Sheikh Khalifa Bin Zayed St.	Ajman	AJ	AE	NaN	NaN	GMT+04:00 Asia/Dubai	55.47	25.39
3	Starbucks	22126-218024	Twofour 54	Licensed	Al Salam Street	Abu Dhabi	AZ	AE	NaN	NaN	GMT+04:00 Asia/Dubai	54.38	24.48
4	Starbucks	17127-178586	Al Ain Tower	Licensed	Khaldiya Area, Abu Dhabi Island	Abu Dhabi	AZ	AE	NaN	NaN	GMT+04:00 Asia/Dubai	54.54	24.51
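Before plotting, it can help to confirm that the coordinate columns are numeric and fall within valid ranges. The snippet below is a minimal sketch on a small hand-made sample that mimics the Latitude and Longitude columns shown above. The helper name valid_coordinates and the "Broken Row" entry are hypothetical and only serve the illustration.

```python
import pandas as pd

def valid_coordinates(df, lat="Latitude", lon="Longitude"):
    """Return a boolean Series marking rows with usable coordinates."""
    # Coerce to numbers so that stray strings become NaN instead of raising.
    lat_vals = pd.to_numeric(df[lat], errors="coerce")
    lon_vals = pd.to_numeric(df[lon], errors="coerce")
    # NaN never satisfies between(), so missing values are marked False.
    return lat_vals.between(-90, 90) & lon_vals.between(-180, 180)

# Two rows copied from the output above plus one made-up broken row.
sample = pd.DataFrame({
    "Store Name": ["Meritxell, 96", "Ajman Drive Thru", "Broken Row"],
    "Latitude": [42.51, 25.42, None],
    "Longitude": [1.53, 55.47, 200.0],
})

mask = valid_coordinates(sample)
clean = sample[mask]
print(clean)
```

Applied to the real data, the same mask could be used as starbucks_stores[valid_coordinates(starbucks_stores)] before creating the map.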

world_happiness = pd.read_csv("datasets/world_happiness_2019.csv")
world_happiness.head()

	Overall rank	Country or region	Score	GDP per capita	Social support	Healthy life expectancy	Freedom to make life choices	Generosity	Perceptions of corruption
0	1	Finland	7.769	1.340	1.587	0.986	0.596	0.153	0.393
1	2	Denmark	7.600	1.383	1.573	0.996	0.592	0.252	0.410
2	3	Norway	7.554	1.488	1.582	1.028	0.603	0.271	0.341
3	4	Iceland	7.494	1.380	1.624	1.026	0.591	0.354	0.118
4	5	Netherlands	7.488	1.396	1.522	0.999	0.557	0.322	0.298

Scatter Maps

We can plot a scatter map from a pandas dataframe by calling its figure() method with the kind parameter set to scattergeo. We also need to pass the latitude and longitude column names to the lat and lon parameters. We have additionally passed the Store Name column to the text parameter so that hovering over a point shows that store's name in a tooltip.

Below we have plotted a scatter map of Starbucks store locations worldwide. We can clearly see a high concentration of stores in the US, Europe, and China.

starbucks_stores.figure(kind="scattergeo",
                        size=0.05,
                        margin=(0,0,0,0),
                        colors=["tomato"],
                        lat="Latitude", lon="Longitude", text="Store Name")
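To make the roles of lat, lon, and text more concrete, the sketch below builds a plain plotly-style figure dictionary with a single scattergeo trace from a dataframe. It is only an illustration of how the columns map onto a scattergeo trace, not a description of what cufflinks does internally. The helper name scattergeo_spec and the marker size of 2 are assumptions made for this example.

```python
import pandas as pd

def scattergeo_spec(df, lat, lon, text, color="tomato", size=2):
    """Build a plotly-style figure dict with one scattergeo trace."""
    trace = {
        "type": "scattergeo",
        "mode": "markers",
        "lat": df[lat].tolist(),    # one latitude per point
        "lon": df[lon].tolist(),    # one longitude per point
        "text": df[text].tolist(),  # hover text per point
        "marker": {"color": color, "size": size},
    }
    layout = {"margin": {"l": 0, "r": 0, "t": 0, "b": 0}}
    return {"data": [trace], "layout": layout}

# Two rows copied from the Starbucks output shown earlier.
stores = pd.DataFrame({
    "Store Name": ["Meritxell, 96", "Ajman Drive Thru"],
    "Latitude": [42.51, 25.42],
    "Longitude": [1.53, 55.47],
})

fig_spec = scattergeo_spec(stores, "Latitude", "Longitude", "Store Name")
print(fig_spec["data"][0]["text"])
```

With plotly installed, a dictionary of this shape can be rendered with plotly.io.show(fig_spec).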

Below we have created another scatter map in exactly the same way, this time only for stores located in the US. We have added one more parameter, projection, to override the default projection, which plots points on a world map. Setting it to albers usa restricts the map to the US.

We can see a high concentration of Starbucks stores on the east and west coasts of the US.

us_stores = starbucks_stores[starbucks_stores.Country=="US"]

us_stores.figure(kind="scattergeo",
                 size=0.05,
                 margin=(0,0,0,0),
                 colors="tomato",
                 projection={"type":"albers usa"},
                 lat="Latitude", lon="Longitude", text="Store Name")
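The filtering step can be tried on a tiny sample, as sketched below. The two US rows are made up for illustration. The sketch uses bracket indexing, which also works for column names that contain spaces, and shows where a projection setting sits in a plain plotly layout dictionary (under geo). How cufflinks forwards its projection parameter internally is not shown here.

```python
import pandas as pd

# Tiny illustrative sample; the two US rows are invented for this example.
stores = pd.DataFrame({
    "Store Name": ["Meritxell, 96", "Example Store A", "Example Store B"],
    "Country": ["AD", "US", "US"],
    "Latitude": [42.51, 47.61, 40.71],
    "Longitude": [1.53, -122.33, -74.01],
})

# Keep only the rows whose Country code is "US"; copy() avoids later
# chained-assignment warnings if the subset is modified.
us_only = stores[stores["Country"] == "US"].copy()

# In a plain plotly layout, the map projection is configured under "geo".
geo_layout = {"geo": {"projection": {"type": "albers usa"}}}

print(us_only)
```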

Choropleth Maps

The second chart type that we'll introduce is the choropleth map. We'll plot population, happiness score, and GDP per capita as choropleth maps. By default, choropleth maps in plotly identify countries by their three-letter ISO codes rather than by full names, whereas our happiness dataset has only full country names. We'll hence use a geopandas dataframe to look up the ISO code for each country name.

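Below we load the geopandas library and one of its bundled dataframes, which has information about each country of the world, including its ISO code. Note that the geopandas.datasets module used here was deprecated in geopandas 0.14 and removed in 1.0, so this code needs an older geopandas release.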

import geopandas as gpd

gpd.datasets.available

['naturalearth_cities', 'naturalearth_lowres', 'nybb']

world_geo_df = gpd.read_file(gpd.datasets.get_path("naturalearth_lowres"))
world_geo_df.head()

	pop_est	continent	name	iso_a3	gdp_md_est	geometry
0	920938	Oceania	Fiji	FJI	8374.0	MULTIPOLYGON (((180.00000 -16.06713, 180.00000...
1	53950935	Africa	Tanzania	TZA	150600.0	POLYGON ((33.90371 -0.95000, 34.07262 -1.05982...
2	603253	Africa	W. Sahara	ESH	906.5	POLYGON ((-8.66559 27.65643, -8.66512 27.58948...
3	35623680	North America	Canada	CAN	1674000.0	MULTIPOLYGON (((-122.84000 49.00000, -122.9742...
4	326625791	North America	United States of America	USA	18560000.0	MULTIPOLYGON (((-122.84000 49.00000, -120.0000...

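We merge the geopandas dataframe with the world happiness dataframe on country name so that the final dataframe has an ISO code alongside the happiness attributes of each country. Because this is a left merge, countries whose names have no match in the happiness data (Fiji and W. Sahara in the output below) keep their row but get NaN in the happiness columns.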

world_geo_df = world_geo_df.merge(world_happiness, how="left", left_on="name", right_on="Country or region")
world_geo_df.head()

	pop_est	continent	name	iso_a3	gdp_md_est	geometry	Overall rank	Country or region	Score	GDP per capita	Social support	Healthy life expectancy	Freedom to make life choices	Generosity	Perceptions of corruption
0	920938	Oceania	Fiji	FJI	8374.0	MULTIPOLYGON (((180.00000 -16.06713, 180.00000...	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
1	53950935	Africa	Tanzania	TZA	150600.0	POLYGON ((33.90371 -0.95000, 34.07262 -1.05982...	153.0	Tanzania	3.231	0.476	0.885	0.499	0.417	0.276	0.147
2	603253	Africa	W. Sahara	ESH	906.5	POLYGON ((-8.66559 27.65643, -8.66512 27.58948...	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
3	35623680	North America	Canada	CAN	1674000.0	MULTIPOLYGON (((-122.84000 49.00000, -122.9742...	9.0	Canada	7.278	1.365	1.505	1.039	0.584	0.285	0.308
4	326625791	North America	United States of America	USA	18560000.0	MULTIPOLYGON (((-122.84000 49.00000, -120.0000...	19.0	United States of America	6.892	1.433	1.457	0.874	0.454	0.280	0.128
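Since the merge relies on exact name matches, it is worth checking which countries were left without happiness data. The sketch below uses the indicator parameter of pandas merge on two small hand-made frames whose values are copied from the outputs above. The frame names geo and happiness are placeholders for this example.

```python
import pandas as pd

# Small stand-ins for the geopandas and happiness dataframes.
geo = pd.DataFrame({
    "name": ["Fiji", "Tanzania", "Canada"],
    "iso_a3": ["FJI", "TZA", "CAN"],
})
happiness = pd.DataFrame({
    "Country or region": ["Tanzania", "Canada", "Finland"],
    "Score": [3.231, 7.278, 7.769],
})

# indicator=True adds a "_merge" column that records where each row came from.
merged = geo.merge(happiness, how="left", left_on="name",
                   right_on="Country or region", indicator=True)

# "left_only" marks countries with no matching name in the happiness data.
unmatched = merged.loc[merged["_merge"] == "left_only", "name"].tolist()
print(unmatched)
```

Countries that appear in this list will have NaN in their happiness columns and therefore no Score or GDP per capita value to plot unless their names are harmonized first.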

We can create a choropleth map from the merged dataframe by calling its iplot() method with the kind parameter set to choropleth. Apart from the chart kind, we need to pass two other important parameters: locations and z. The locations parameter names the column holding the ISO codes, and the z parameter names the column whose values determine the color of each location. We have used the iso_a3 column for locations because it has the ISO code of each country, and pop_est for z because it has each country's population estimate.

We have first created a choropleth map of the world population, using Reds as the color palette. The chart shows that the population is concentrated in China and India.

world_geo_df.iplot(kind="choropleth",
                   locations="iso_a3", z="pop_est",
                   colorscale="Reds",
                   margin=(0,0,0,0), title="World Population Choropleth Map")
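The pairing of locations and z can be illustrated with a plain plotly-style figure dictionary, as sketched below. Each entry in locations is matched by position with the value in z. The helper name choropleth_spec is hypothetical, and the sketch does not claim to reproduce what cufflinks builds internally. The sample values are copied from the geopandas output above.

```python
import pandas as pd

def choropleth_spec(df, locations, z, colorscale="Reds", title=""):
    """Build a plotly-style figure dict with one choropleth trace."""
    trace = {
        "type": "choropleth",
        # plotly's default locationmode is "ISO-3", i.e. three-letter codes.
        "locations": df[locations].tolist(),
        "z": df[z].tolist(),  # value that sets the color of each location
        "colorscale": colorscale,
    }
    layout = {"title": {"text": title}}
    return {"data": [trace], "layout": layout}

world = pd.DataFrame({
    "iso_a3": ["FJI", "TZA", "CAN"],
    "pop_est": [920938, 53950935, 35623680],
})

spec = choropleth_spec(world, "iso_a3", "pop_est",
                       title="World Population Choropleth Map")
print(list(zip(spec["data"][0]["locations"], spec["data"][0]["z"])))
```

With plotly installed, a dictionary of this shape can be rendered with plotly.io.show(spec).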

Below we have created another choropleth map with exactly the same code as the previous chart; only the column used for the z parameter and the color palette differ. It shows the happiness score of each country of the world.

world_geo_df.iplot(kind="choropleth",
                   locations="iso_a3", z="Score",
                   colorscale="PiYG",
                   margin=(0,0,0,0), title="World Happiness Choropleth Map")

The third choropleth map is created the same way as the previous two. It shows GDP per capita for each country of the world.

world_geo_df.iplot(kind="choropleth",
                   locations="iso_a3", z="GDP per capita",
                   colorscale="RdBu",
                   margin=(0,0,0,0), title="World GDP Per Capita Choropleth Map")

The fourth choropleth map will show the number of Starbucks stores in each US state. We have hence created a new dataframe below that holds the count of Starbucks stores per US state.

us_stores = starbucks_stores[starbucks_stores.Country == "US"]
us_stores = us_stores.groupby(by=['State/Province']).count()[["Store Name"]].rename(columns={"Store Name":"Count"}).reset_index()
us_stores.head()

	State/Province	Count
0	AK	49
1	AL	85
2	AR	55
3	AZ	488
4	CA	2821
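The same per-state aggregation can also be written with groupby().size(), as sketched below on a made-up sample. One difference worth knowing: count() on a column ignores rows where that column is missing, while size() counts every row in the group. The store names and rows here are invented for illustration.

```python
import pandas as pd

# Made-up sample with the same column names as the Starbucks dataframe.
stores = pd.DataFrame({
    "Store Name": ["A", "B", "C", "D"],
    "State/Province": ["CA", "AK", "CA", "7"],
    "Country": ["US", "US", "US", "AD"],
})

us = stores[stores["Country"] == "US"]

# size() counts rows per group; reset_index(name=...) names the count column.
counts = (us.groupby("State/Province")
            .size()
            .reset_index(name="Count"))
print(counts)
```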

We can create a choropleth map from the us_stores dataframe by calling its iplot() method. We have used the State/Province column for locations and the Count column for z. We have also introduced two more parameters, locationmode and projection, which are needed for the US: locationmode tells plotly to interpret the locations as US state abbreviations, and projection restricts the map to the US. Without them, the chart would show the whole world map, which we don't need here.

us_stores.iplot(kind="choropleth",
                locations="State/Province", z="Count",
                colorscale="YlOrRd",
                margin=(0,0,0,0), locationmode="USA-states",
                projection={"type":"albers usa"},
                title="Starbucks Stores Count Per US State")
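Because the USA-states location mode expects two-letter state abbreviations, a quick format check before plotting can catch stray values. The sketch below runs such a check on the first three rows of the counts shown above and then assembles a plain plotly-style figure dictionary with the same settings. The variable names are placeholders, and the dictionary is an illustration rather than the exact figure cufflinks produces.

```python
import pandas as pd

# First three rows of the per-state counts shown above.
counts = pd.DataFrame({
    "State/Province": ["AK", "AL", "AR"],
    "Count": [49, 85, 55],
})

# True where the value looks like a two-letter uppercase abbreviation.
is_state_code = counts["State/Province"].str.fullmatch(r"[A-Z]{2}")

spec = {
    "data": [{
        "type": "choropleth",
        "locationmode": "USA-states",  # interpret locations as state codes
        "locations": counts["State/Province"].tolist(),
        "z": counts["Count"].tolist(),
        "colorscale": "YlOrRd",
    }],
    "layout": {
        "geo": {"projection": {"type": "albers usa"}},
        "title": {"text": "Starbucks Stores Count Per US State"},
    },
}
print(is_state_code.all())
```

The regular expression only checks the format of each value; it does not verify that a code belongs to an actual state.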

This ends our small tutorial explaining how to create scatter maps and choropleth maps with cufflinks using one line of code. Please feel free to let us know your views in the comments section.

Sunny Solanki
