Updated On : Apr-04,2020 Time Investment : ~20 mins

Treemap in Python (plotly)¶

Treemaps are a common way to visualize how a quantity is distributed across a hierarchy. Hierarchical data shows up in many places: population by continent and country, market capitalization by sector and company, land area by continent and country, and so on. A treemap draws each category as a rectangle whose area is proportional to its value and nests child rectangles inside their parents, so we can compare distributions a few levels deep at a glance. In this tutorial, we'll draw several treemaps using plotly as our primary library.

import pandas as pd
import numpy as np

import plotly.express as px

Loading Datasets¶

We'll start by loading the datasets below, which we'll use for plotting the treemaps.

district_wise_population = pd.read_csv("datasets/indian-census-data-with-geospatial-indexing/district wise population and centroids.csv")
# Add a constant column to act as the single root of the hierarchy.
district_wise_population["Country"] = "India"
district_wise_population.head()

	State	District	Latitude	Longitude	Population in 2001	Population in 2011	Country
0	Andhra Pradesh	Anantapur	14.312066	77.460158	3640478	4081148	India
1	Andhra Pradesh	Chittoor	13.331093	78.927639	3745875	4174064	India
2	Andhra Pradesh	East Godavari	16.782718	82.243207	4901420	5154296	India
3	Andhra Pradesh	Guntur	15.884926	80.586576	4465144	4887813	India
4	Andhra Pradesh	Krishna	16.143873	81.148051	4187841	4517398	India

world_data = pd.read_csv("datasets/countries of the world.csv")
world_data.head()

	Country	Region	Population	Area (sq. mi.)	Pop. Density (per sq. mi.)	Coastline (coast/area ratio)	Net migration	Infant mortality (per 1000 births)	GDP ($ per capita)	Literacy (%)	Phones (per 1000)	Arable (%)	Crops (%)	Other (%)	Climate	Birthrate	Deathrate	Agriculture	Industry	Service
0	Afghanistan	ASIA (EX. NEAR EAST)	31056997	647500	48,0	0,00	23,06	163,07	700.0	36,0	3,2	12,13	0,22	87,65	1	46,6	20,34	0,38	0,24	0,38
1	Albania	EASTERN EUROPE	3581655	28748	124,6	1,26	-4,93	21,52	4500.0	86,5	71,2	21,09	4,42	74,49	3	15,11	5,22	0,232	0,188	0,579
2	Algeria	NORTHERN AFRICA	32930091	2381740	13,8	0,04	-0,39	31	6000.0	70,0	78,1	3,22	0,25	96,53	1	17,14	4,61	0,101	0,6	0,298
3	American Samoa	OCEANIA	57794	199	290,4	58,29	-20,71	9,27	8000.0	97,0	259,5	10	15	75	2	22,46	3,27	NaN	NaN	NaN
4	Andorra	WESTERN EUROPE	71201	468	152,1	0,00	6,6	4,05	19000.0	100,0	497,2	2,22	0	97,78	3	8,71	6,25	NaN	NaN	NaN
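
The preview above shows decimal commas (for example 48,0) in several columns, so pandas will most likely load those columns as strings. The treemaps in this tutorial only size and color by Population, Area, and GDP, so this does not affect them. If you want to use one of the other columns numerically, a minimal sketch is to pass decimal="," to read_csv. The sample rows and the reduced set of columns below are hypothetical stand-ins that mimic the preview, not the real file.

```python
import io

import pandas as pd

# Hypothetical three-row sample that mimics the quoting and decimal commas
# visible in the preview above; it is not the real dataset.
sample_csv = io.StringIO(
    'Country,Population,"Pop. Density (per sq. mi.)",Literacy (%)\n'
    'Afghanistan,31056997,"48,0","36,0"\n'
    'Albania,3581655,"124,6","86,5"\n'
    'Algeria,32930091,"13,8","70,0"\n'
)

# decimal="," tells pandas to treat the comma as the decimal separator.
sample = pd.read_csv(sample_csv, decimal=",")

print(sample.dtypes)
print(sample["Pop. Density (per sq. mi.)"].tolist())
```

The same argument can be added to the pd.read_csv call for the full file, assuming it uses the same number format throughout.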

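starbucks_stores = pd.read_csv("datasets/starbucks_store_locations.csv")
# Count stores per (Country, State/Province, City); "Store Number" is only used as the column to count.
starbucks_stores = starbucks_stores.groupby(["Country","State/Province","City"]).count()[["Store Number"]].rename(columns={"Store Number":"Count"})
# Turn the group keys back into regular columns so they can be passed to `path`.
starbucks_stores = starbucks_stores.reset_index()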

starbucks_stores.head()

	Country	State/Province	City	Count
0	AD	7	Andorra la Vella	1
1	AE	AJ	Ajman	2
2	AE	AZ	Abu Dhabi	40
3	AE	AZ	Al Ain	8
4	AE	DU	Abu Dhabi	3

Indian State/District Population Distribution 2001 Treemap¶

Our first treemap shows India's population distribution by state and district for the year 2001. The hierarchy has three levels, ['Country', 'State', 'District'], with the constant 'Country' column we added while loading serving as the single root. We pass the categorical columns to the path parameter, ordered from root to leaf, and the numerical column to the values parameter; each rectangle's size then reflects the total of values beneath it in the hierarchy.

fig = px.treemap(district_wise_population,
                 path=['Country', 'State', 'District'],
                 values='Population in 2001')

fig.update_layout(title="Indian State/District Population Distribution 2001",
                  width=1000, height=700,)

fig.show()

Indian State/District Population Distribution 2011 Treemap¶

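Our second treemap shows the same state/district breakdown for the year 2011. This time we also color the rectangles by the District column, and pass width, height, and title directly to px.treemap instead of calling update_layout.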

fig = px.treemap(district_wise_population,
                 path=['Country', 'State', 'District'],
                 values='Population in 2011',
                 color="District",
                 width=1000, height=700,
                 title="Indian State/District Population Distribution 2011",
                 )

fig.show()

World Population Distribution Treemap¶

Our third treemap shows the population distribution by region and country. We color it by country and use hover_data to show each country's area and population density in the tooltip.

fig = px.treemap(world_data,
                 path=['Region', 'Country'],
                 values='Population',
                 color='Country',
                 hover_data=['Area (sq. mi.)','Pop. Density (per sq. mi.)'],
                 width=1000, height=700,
                 title="World Population Distribution",)

fig.show()

Starbucks Store Counts Per City, State, Country Treemap¶

Our fourth treemap shows Starbucks store counts by country, state/province, and city for the whole world. We have color-encoded it by country.

fig = px.treemap(starbucks_stores,
                 path=["Country","State/Province","City"],
                 values='Count',
                 color='Country',
                 width=1000, height=700,
                 title="Starbucks Store Counts Per City, State, Country",)

fig.show()

World Area Distribution Color-encoded by GDP Treemap¶

Our fifth treemap shows the area distribution by region and country for the whole world. We have also color-encoded it by GDP per capita so that we can see how area and GDP per capita are related.

In the graph below, countries like Russia, China, Canada, the US, Brazil, and Australia have large areas, but of these only Canada, the US, and Australia have a high GDP per capita.

fig = px.treemap(world_data,
                 path=['Region', 'Country'],
                 values='Area (sq. mi.)',
                 color='GDP ($ per capita)',
                 color_continuous_scale='RdYlGn',
                 )

fig.update_layout(title="World Area Distribution Color-encoded by GDP",
                  width=1000, height=600,)

fig.show()
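
When path is combined with a numeric color column, the Plotly documentation describes a parent's color as the average of its children's color values, weighted by their values. So a region's color here would be an area-weighted GDP per capita, not a simple mean across its countries. The sketch below reproduces that calculation in pandas on made-up numbers; the column names and data are hypothetical, and it illustrates the documented behavior without calling Plotly itself.

```python
import pandas as pd

# Hypothetical toy data (made-up numbers), not taken from the world dataset.
toy = pd.DataFrame({
    "Region":  ["East", "East", "West"],
    "Country": ["A", "B", "C"],
    "Area":    [900, 100, 500],
    "GDP":     [1000.0, 11000.0, 4000.0],
})

# Area-weighted mean GDP per region: sum(GDP * Area) / sum(Area).
weighted_sum = (toy["GDP"] * toy["Area"]).groupby(toy["Region"]).sum()
total_area = toy.groupby("Region")["Area"].sum()
region_color = weighted_sum / total_area

print(region_color)
```

In this toy example the large, low-GDP country A dominates the East region, so its weighted value is 2000 while the unweighted mean of A and B would be 6000.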

World Population Distribution Color-encoded by GDP Treemap¶

Our sixth treemap shows the population distribution by region and country for the whole world. We have also color-encoded it by GDP per capita so that we can see how population and GDP per capita are related.

In the graph below, countries like China, India, the US, Brazil, Pakistan, and Indonesia have large populations, while GDP per capita is high for the US, Australia, and most European countries.

fig = px.treemap(world_data,
                 path=['Region', 'Country'],
                 values='Population',
                 color='GDP ($ per capita)',
                 color_continuous_scale='RdBu',
                 width=1000, height=600,
                 title="World Population Distribution Color-encoded by GDP",)

fig.show()

NOTE

Please note that it's advisable to use treemaps for data with at most 3 hierarchical layers. Beyond 3 layers, the chart becomes difficult for the viewer to interpret.

This ends our short tutorial on generating treemaps in Python using plotly. Please feel free to share your views in the comments section.

References¶

https://plotly.com/python/treemaps/

Sunny Solanki
