How to Create Sunburst Chart/Diagram in Python [Plotly]?

A sunburst diagram visualizes how a quantity is distributed across hierarchical variables. It draws the distribution as a series of concentric rings around a central circle. The central circle represents the total quantity of an attribute, and each ring around it shows how that quantity is split at the next level of the hierarchy, with every sector sitting directly outside its parent. A common example is the world's population: the central circle represents the total world population, the first ring shows the split per continent, the next ring shows the split per country within each continent, and a further ring can show the split per state within each country.
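
To make the ring arithmetic concrete, here is a minimal sketch in plain Python. The continent and country names and the numbers are made up purely for illustration; they do not come from any dataset used in this tutorial.

```python
from collections import defaultdict

# Toy (made-up) leaf values: (continent, country) -> quantity in arbitrary units.
leaves = {
    ("Asia", "A1"): 50,
    ("Asia", "A2"): 30,
    ("Europe", "E1"): 15,
    ("Europe", "E2"): 5,
}


def ring_totals(leaves):
    """Return (center_total, per_continent_totals) rolled up from the leaves."""
    per_continent = defaultdict(int)
    for (continent, _country), value in leaves.items():
        per_continent[continent] += value
    return sum(per_continent.values()), dict(per_continent)


total, continents = ring_totals(leaves)
for name, value in continents.items():
    # Each sector's share of the ring is its value divided by its parent's value.
    print(f"{name}: {value} ({value / total:.0%} of the center circle)")
```

Every parent sector is simply the sum of the sectors drawn outside it, which is why the chart can be read from the center outwards.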

A sunburst chart is very similar to a treemap; the main difference is that the data is laid out radially. If you are interested in treemap plotting in Python, feel free to go through our tutorial on treemaps, which explains various ways to draw them.

We'll start by importing necessary libraries.

In [1]:
import pandas as pd
import numpy as np

pd.set_option("display.max_columns", 30)  # full option name; the short form can be ambiguous in newer pandas

import plotly.express as px
import plotly.graph_objects as go

We'll be using 3 datasets available from Kaggle for analysis and plotting.

  • countries of the world dataset - It has information about each country's region, population, area, GDP per capita, literacy, birth rate, and death rate.
  • Starbucks store location data - It has information about each Starbucks store location, including its address, city, country, phone number, latitude, and longitude.
  • Indian census data - It has information about the population of Indian districts in 2001 and 2011.

We suggest that you download all the datasets to follow along with us through the tutorial.

In [2]:
starbucks_locations = pd.read_csv("datasets/starbucks_store_locations.csv")
starbucks_locations.head()
Out[2]:
Brand Store Number Store Name Ownership Type Street Address City State/Province Country Postcode Phone Number Timezone Longitude Latitude
0 Starbucks 47370-257954 Meritxell, 96 Licensed Av. Meritxell, 96 Andorra la Vella 7 AD AD500 376818720 GMT+1:00 Europe/Andorra 1.53 42.51
1 Starbucks 22331-212325 Ajman Drive Thru Licensed 1 Street 69, Al Jarf Ajman AJ AE NaN NaN GMT+04:00 Asia/Dubai 55.47 25.42
2 Starbucks 47089-256771 Dana Mall Licensed Sheikh Khalifa Bin Zayed St. Ajman AJ AE NaN NaN GMT+04:00 Asia/Dubai 55.47 25.39
3 Starbucks 22126-218024 Twofour 54 Licensed Al Salam Street Abu Dhabi AZ AE NaN NaN GMT+04:00 Asia/Dubai 54.38 24.48
4 Starbucks 17127-178586 Al Ain Tower Licensed Khaldiya Area, Abu Dhabi Island Abu Dhabi AZ AE NaN NaN GMT+04:00 Asia/Dubai 54.54 24.51
In [3]:
world_countries_data = pd.read_csv("datasets/countries of the world.csv")
world_countries_data["World"] = "World"
world_countries_data.head()
Out[3]:
Country Region Population Area (sq. mi.) Pop. Density (per sq. mi.) Coastline (coast/area ratio) Net migration Infant mortality (per 1000 births) GDP ($ per capita) Literacy (%) Phones (per 1000) Arable (%) Crops (%) Other (%) Climate Birthrate Deathrate Agriculture Industry Service World
0 Afghanistan ASIA (EX. NEAR EAST) 31056997 647500 48,0 0,00 23,06 163,07 700.0 36,0 3,2 12,13 0,22 87,65 1 46,6 20,34 0,38 0,24 0,38 World
1 Albania EASTERN EUROPE 3581655 28748 124,6 1,26 -4,93 21,52 4500.0 86,5 71,2 21,09 4,42 74,49 3 15,11 5,22 0,232 0,188 0,579 World
2 Algeria NORTHERN AFRICA 32930091 2381740 13,8 0,04 -0,39 31 6000.0 70,0 78,1 3,22 0,25 96,53 1 17,14 4,61 0,101 0,6 0,298 World
3 American Samoa OCEANIA 57794 199 290,4 58,29 -20,71 9,27 8000.0 97,0 259,5 10 15 75 2 22,46 3,27 NaN NaN NaN World
4 Andorra WESTERN EUROPE 71201 468 152,1 0,00 6,6 4,05 19000.0 100,0 497,2 2,22 0 97,78 3 8,71 6,25 NaN NaN NaN World
In [4]:
indian_district_population = pd.read_csv("datasets/indian-census-data-with-geospatial-indexing/district wise population for year 2001 and 2011.csv")
indian_district_population["Country"] = "India"
indian_district_population.head()
Out[4]:
State District Population in 2001 Population in 2011 Country
0 Andaman & Nicobar Islands Nicobar 42068 36842 India
1 Andaman & Nicobar Islands North & Middle Andaman 105613 105597 India
2 Andaman & Nicobar Islands South Andaman 208471 238142 India
3 Andhra Pradesh Anantapur 3640478 4081148 India
4 Andhra Pradesh Chittoor 3745875 4174064 India

Plotly provides two APIs for generating sunburst charts.

  • plotly.express - It provides a function named sunburst() to create sunburst charts.
  • plotly.graph_objects - It provides a class named Sunburst to create sunburst charts.

We'll explain both approaches one by one below.

Plotly Express

Plotly has a module named express which provides an easy-to-use function named sunburst() for creating sunburst charts. It accepts a dataframe containing the data, the columns that form the hierarchy, and the column that holds the actual values of the distribution. We provide the hierarchy as a list of column names, ordered from root to leaves, to the path parameter. The column whose values decide the sizes of the sectors is given by name to the values parameter. We can also set the title, width, and height parameters of the figure. The sunburst() function returns a figure object, and calling show() on it displays the chart.
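
Before moving to the real datasets, here is a minimal sketch of that call on a tiny made-up table, so it can be tried without downloading anything. The column values and the helper name make_toy_sunburst are hypothetical, and plotly is imported inside the helper only so that the data part runs even where plotly is not installed.

```python
# Toy (made-up) data: one row per leaf of the hierarchy.
toy = {
    "World": ["World"] * 4,
    "Country": ["X", "X", "Y", "Y"],
    "City": ["X1", "X2", "Y1", "Y2"],
    "Count": [3, 1, 2, 4],
}


def make_toy_sunburst(data):
    """Build a sunburst figure whose rings follow World -> Country -> City."""
    import plotly.express as px  # lazy import; requires plotly to be installed

    return px.sunburst(
        data,
        path=["World", "Country", "City"],  # root first, leaves last
        values="Count",                     # column that sizes the sectors
        title="Toy store counts",
        width=500, height=500,
    )
```

Calling make_toy_sunburst(toy).show() should display a chart with one center circle, two country sectors, and four city sectors.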

Starbucks Store Count Distribution World Wide Sunburst Chart

We first need to prepare the dataset in order to show the distribution of Starbucks store counts per country, state/province, and city worldwide. We group the original Starbucks dataset by Country, State/Province, and City, and then call count() on it, which counts the entries for each combination of the three. We also introduce a new column named World in which every value is the string World. This column creates a circle in the center that shows the total worldwide count.

In [5]:
starbucks_dist = starbucks_locations.groupby(by=["Country", "State/Province", "City"]).count()[["Store Number"]].rename(columns={"Store Number":"Count"})
starbucks_dist["World"] = "World"
starbucks_dist = starbucks_dist.reset_index()
starbucks_dist.head()
Out[5]:
Country State/Province City Count World
0 AD 7 Andorra la Vella 1 World
1 AE AJ Ajman 2 World
2 AE AZ Abu Dhabi 40 World
3 AE AZ Al Ain 8 World
4 AE DU Abu Dhabi 3 World
In [ ]:
fig = px.sunburst(starbucks_dist,
                  path=["World", "Country", "State/Province", "City"],
                  values='Count',
                  title="Starbucks Store Count Distribution World Wide [Country, State, City]",
                  width=750, height=750)
fig.show()

Indian Districts Population Distribution Per State [2011] Sunburst Chart

Below we create a sunburst chart depicting the population distribution per district per state of India in 2011. We pass the path parameter the list of columns needed to create the hierarchy. We have covered this in our tutorial on treemaps as well.

In [ ]:
fig = px.sunburst(indian_district_population,
                  path=["Country", "State", "District",],
                  values='Population in 2011',
                  width=750, height=750,
                  title="Indian District Population Per State",
                  )
fig.show()

World Population Distribution Per Country Per Region Sunburst Chart

Below we have created a sunburst chart showing the population per country per region of the world. We have passed the columns that form the hierarchy to the path parameter of the function.

In [ ]:
fig = px.sunburst(world_countries_data,
                  path=["World", "Region", "Country"],
                  values='Population',
                  width=750, height=750,
                  title="World Population Per Country Per Region",
                  )
fig.show()

World Area Distribution Per Country Per Region Sunburst Chart

The sunburst chart below shows the area distribution per country per region worldwide.

In [ ]:
fig = px.sunburst(world_countries_data,
                  path=["World", "Region", "Country"],
                  values='Area (sq. mi.)',
                  width=750, height=750,
                  title="World Area Per Country Per Region",
                  )
fig.show()

World Population Distribution Per Country Per Region Color-Encoded By GDP Sunburst Chart

Below we again plot the population distribution per country per region, but this time each sector is also color-encoded by the GDP per capita of that country/region. This lets us compare the population and GDP per capita of countries. We can notice that countries like India and China have a low GDP per capita despite their large populations, whereas countries like the US, Japan, Germany, the UK, France, Australia, and Hong Kong have smaller populations but a higher GDP per capita.

In [ ]:
fig = px.sunburst(world_countries_data,
                  path=["World", "Region", "Country"],
                  values='Population',
                  width=750, height=750,
                  color_continuous_scale="BrBG",
                  color='GDP ($ per capita)',
                  title="World Population Per Country Per Region Color-Encoded By GDP"
                  )
fig.show()
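
One detail helps when reading the inner rings of this chart. According to the plotly documentation, when color is passed together with path, the color of a parent sector is the average of its children's color values, weighted by their values. The sketch below reproduces that calculation in plain Python under that assumption; the region and country names and all numbers are made up.

```python
# Made-up rows: (region, country, population, gdp_per_capita).
rows = [
    ("R1", "c1", 900, 1_000.0),
    ("R1", "c2", 100, 21_000.0),
    ("R2", "c3", 50, 40_000.0),
]


def weighted_region_color(rows):
    """Population-weighted mean of GDP per capita for each region."""
    weight_sum, weighted_sum = {}, {}
    for region, _country, population, gdp in rows:
        weight_sum[region] = weight_sum.get(region, 0) + population
        weighted_sum[region] = weighted_sum.get(region, 0.0) + population * gdp
    return {region: weighted_sum[region] / weight_sum[region] for region in weight_sum}


region_colors = weighted_region_color(rows)
print(region_colors)
```

In this toy case the populous, low-income country dominates the color of region R1, which is the same effect that pulls a region's color towards its most populous countries in the chart above.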

World Population Per Country Per Region Color-Encoded By Area Sunburst Chart

Below we again plot the population distribution per country per region of the world, but this time the sectors are color-encoded by area. This helps us compare population with area. We can notice that countries like India have a large population but a smaller area, whereas countries like Russia, the United States, and Brazil have visibly more area with a smaller population.

In [ ]:
fig = px.sunburst(world_countries_data,
                  path=["World", "Region", "Country"],
                  values='Population',
                  width=750, height=750,
                  color_continuous_scale="RdYlGn",
                  color='Area (sq. mi.)',
                  title="World Population Per Country Per Region Color-Encoded By Area"
                  )
fig.show()

Plotly Graph Objects

The second way of creating a sunburst chart with plotly is the Sunburst class of the graph_objects module. We need to give it every parent-child pair in the hierarchy, along with the value of each child, in order to create a chart this way.

World Population Per Country Per Region Sunburst Chart

To create a sunburst chart with graph_objects.Sunburst, we need a little preprocessing of the data. Sunburst expects the label of every node, the label of its parent, and its value. The region-country labels and values are already in the dataset, but to get the world-region labels and values we group the dataframe by region and sum the population per region. We then concatenate the labels, parents, and values of all three levels into parallel lists.

In [ ]:
# Population per region; selecting the column first means only it is summed.
region_wise_pop = world_countries_data.groupby(by="Region")["Population"].sum().reset_index()

# One entry per node: the root ("World"), then every region, then every country.
parents = [""] + ["World"] * region_wise_pop.shape[0] + world_countries_data["Region"].values.tolist()
labels = ["World"] + region_wise_pop["Region"].values.tolist() + world_countries_data["Country"].values.tolist()
values = [world_countries_data["Population"].sum()] + region_wise_pop["Population"].values.tolist() + world_countries_data["Population"].values.tolist()

fig = go.Figure(go.Sunburst(
    parents=parents,
    labels=labels,
    values=values,
))

fig.update_layout(title="World Population Per Country Per Region",
                  width=700, height=700)

fig.show()
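
The list construction above can be checked on a toy hierarchy without pandas. This is only a sketch: the rows are made up, and build_sunburst_lists is a hypothetical helper name, not part of plotly.

```python
# Made-up leaf rows: (region, country, population).
rows = [
    ("R1", "c1", 30),
    ("R1", "c2", 20),
    ("R2", "c3", 50),
]


def build_sunburst_lists(rows, root="World"):
    """Return parallel (labels, parents, values) lists: root, regions, then countries."""
    region_totals = {}
    for region, _country, population in rows:
        region_totals[region] = region_totals.get(region, 0) + population

    labels = [root] + list(region_totals) + [country for _, country, _ in rows]
    parents = [""] + [root] * len(region_totals) + [region for region, _, _ in rows]
    values = (
        [sum(region_totals.values())]
        + list(region_totals.values())
        + [population for _, _, population in rows]
    )
    return labels, parents, values


labels, parents, values = build_sunburst_lists(rows)
for label, parent, value in zip(labels, parents, values):
    print(f"{label!r:8} parent={parent!r:8} value={value}")
```

The three lists line up position by position, and the root is the only node whose parent is the empty string. They can be passed directly as the labels, parents, and values arguments of go.Sunburst.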

World Population Per Country Per Region Sunburst Chart

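Below we again create a sunburst chart of the population distribution, but this time it looks just like the one created with plotly.express. We have set the branchvalues parameter to the string "total", which tells plotly that each parent's value is already the total of its children, so every ring fills the whole circle. With the default setting ("remainder"), each parent's value is treated as an extra amount on top of its children's values, which is why the rings of the previous chart do not fill the full circle.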

In [ ]:
# Population per region; selecting the column first means only it is summed.
region_wise_pop = world_countries_data.groupby(by="Region")["Population"].sum().reset_index()

# One entry per node: the root ("World"), then every region, then every country.
parents = [""] + ["World"] * region_wise_pop.shape[0] + world_countries_data["Region"].values.tolist()
labels = ["World"] + region_wise_pop["Region"].values.tolist() + world_countries_data["Country"].values.tolist()
values = [world_countries_data["Population"].sum()] + region_wise_pop["Population"].values.tolist() + world_countries_data["Population"].values.tolist()

fig = go.Figure(go.Sunburst(
    parents=parents,
    labels=labels,
    values=values,
    branchvalues="total",  # parent values already include their children
))

fig.update_layout(title="World Population Per Country Per Region",
                  width=700, height=700)

fig.show()
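
The effect of branchvalues can be worked out by hand. The sketch below is plain Python, not plotly code: sector_sizes is a hypothetical helper that mimics, as far as we understand the documented behaviour, how the two modes interpret the same values list on a made-up hierarchy.

```python
def sector_sizes(labels, parents, values, branchvalues="remainder"):
    """Size of every sector as the two branchvalues modes interpret `values`."""
    own = dict(zip(labels, values))
    children = {}
    for label, parent in zip(labels, parents):
        children.setdefault(parent, []).append(label)

    def size(label):
        kids = children.get(label, [])
        if branchvalues == "total" or not kids:
            return own[label]  # the value already covers all descendants
        # "remainder": the node's own value is an extra on top of its children.
        return own[label] + sum(size(kid) for kid in kids)

    return {label: size(label) for label in labels}


labels = ["World", "A", "B", "a1", "a2", "b1"]
parents = ["", "World", "World", "A", "A", "B"]
values = [10, 6, 4, 4, 2, 4]  # parents hold the sum of their children

for mode in ("remainder", "total"):
    sizes = sector_sizes(labels, parents, values, mode)
    covered = (sizes["A"] + sizes["B"]) / sizes["World"]
    print(f"{mode}: first ring covers {covered:.0%} of the circle")
```

Because our values lists already store each parent as the sum of its children, "remainder" counts those amounts twice and leaves gaps in the rings, while "total" uses them as they are.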

World Population and Area Distribution Per Country Per Region Sunburst Chart Subplots

Below we combine two sunburst charts in a single figure. One shows the world population distribution per country per region and the other shows the area distribution per country per region. We can place related sunburst charts side by side this way to reveal possible relationships. Please go through the code to understand the small amount of preprocessing needed to create the charts.

In [ ]:
fig = go.Figure()

parents = [""] + ["World"] *region_wise_pop.shape[0] + world_countries_data["Region"].values.tolist()
labels = ["World"] + region_wise_pop["Region"].values.tolist() + world_countries_data["Country"].values.tolist()
values  = [world_countries_data["Population"].sum()] + region_wise_pop["Population"].values.tolist() + world_countries_data["Population"].values.tolist()

fig.add_trace(go.Sunburst(
    parents=parents,
    labels= labels,
    values= values,
    domain=dict(column=0),
    name="Population Distribution"
))


# Area per region; selecting the column first means only it is summed.
region_wise_area = world_countries_data.groupby(by="Region")["Area (sq. mi.)"].sum().reset_index()

parents = [""] + ["World"] * region_wise_area.shape[0] + world_countries_data["Region"].values.tolist()
labels = ["World"] + region_wise_area["Region"].values.tolist() + world_countries_data["Country"].values.tolist()
values = [world_countries_data["Area (sq. mi.)"].sum()] + region_wise_area["Area (sq. mi.)"].values.tolist() + world_countries_data["Area (sq. mi.)"].values.tolist()


fig.add_trace(go.Sunburst(
    parents=parents,
    labels= labels,
    values= values,
    domain=dict(column=1)
))


fig.update_layout(
    grid= dict(columns=2, rows=1),
    margin = dict(t=0, l=0, r=0, b=0),
    width=900, height=700
)

fig.show()

This ends our short tutorial explaining how to plot a sunburst chart in Python using plotly. Please feel free to share your views in the comments section.

Sunny Solanki