bqplot - Interactive Plotting in Python Jupyter Notebook

Interactive Plotting in Python Jupyter Notebook using bqplot

Introduction

bqplot is an interactive data visualization library developed at Bloomberg. It is built on d3.js (a JavaScript data visualization library) and ipywidgets (the Jupyter notebook widgets library). Its main aim is to bring the benefits of d3.js to Python while keeping every plot component an ipywidgets widget. Because each individual component of a figure is an interactive widget, changing a widget's value is reflected in the plot. This gives a lot of flexibility when creating interactive visualizations and makes integration with other notebook widgets easy.

bqplot provides 2 kinds of APIs for creating plots:

  • Matplotlib pyplot-like API: It provides a set of functions similar to those available in the matplotlib.pyplot module. We can easily create graphs by calling methods like scatter(), bar(), pie(), heatmap(), etc.

  • bqplot internal object model API: It lets us create an object for each individual graph component, such as the figure, axes, scales, etc. Each of these objects behaves as a widget and can be linked to other widgets. We then need to combine them to create a plot. This API gives more flexibility.

We'll be covering bqplot's matplotlib-like pyplot API in this tutorial. We'll also walk through examples that explain individual components of a graph and how to modify them to create aesthetically pleasing graphs. We'll use several datasets to demonstrate the chart types available in bqplot.

If you are interested in learning about plotting with internal object model API then please feel free to visit our tutorial on it:

We'll start by importing necessary libraries.

In [1]:
import bqplot
from bqplot import pyplot as plt

import pandas as pd
import numpy as np

import sklearn

import warnings
warnings.filterwarnings("ignore")

Load Datasets

We'll load all datasets up front and keep them as pandas dataframes to make plotting easy.

Wine Dataset

The first dataset that we'll load is the wine dataset available with scikit-learn. It has measurements of wine constituents (alcohol, malic acid, ash, etc.) for three different wine categories.

In [2]:
from sklearn.datasets import load_wine
wine  = load_wine()

print("Dataset Features : ", wine.feature_names)
print("Dataset Size : ", wine.data.shape)

wine_df = pd.DataFrame(data=wine.data, columns=wine.feature_names)
wine_df["Category"] = wine.target

wine_df.head()
Dataset Features :  ['alcohol', 'malic_acid', 'ash', 'alcalinity_of_ash', 'magnesium', 'total_phenols', 'flavanoids', 'nonflavanoid_phenols', 'proanthocyanins', 'color_intensity', 'hue', 'od280/od315_of_diluted_wines', 'proline']
Dataset Size :  (178, 13)
Out[2]:
alcohol malic_acid ash alcalinity_of_ash magnesium total_phenols flavanoids nonflavanoid_phenols proanthocyanins color_intensity hue od280/od315_of_diluted_wines proline Category
0 14.23 1.71 2.43 15.6 127.0 2.80 3.06 0.28 2.29 5.64 1.04 3.92 1065.0 0
1 13.20 1.78 2.14 11.2 100.0 2.65 2.76 0.26 1.28 4.38 1.05 3.40 1050.0 0
2 13.16 2.36 2.67 18.6 101.0 2.80 3.24 0.30 2.81 5.68 1.03 3.17 1185.0 0
3 14.37 1.95 2.50 16.8 113.0 3.85 3.49 0.24 2.18 7.80 0.86 3.45 1480.0 0
4 13.24 2.59 2.87 21.0 118.0 2.80 2.69 0.39 1.82 4.32 1.04 2.93 735.0 0

Apple OHLC Data from Yahoo Finance

Another dataset that we'll use is Apple OHLC (open, high, low, close) data downloaded from Yahoo Finance as a CSV file. We'll load it as a pandas dataframe.

In [3]:
apple_df = pd.read_csv("datasets/AAPL.csv", index_col=0, parse_dates=True)
apple_df.head()
Out[3]:
Open High Low Close Adj Close Volume
Date
2019-04-05 196.449997 197.100006 195.929993 197.000000 194.454758 18526600
2019-04-08 196.419998 200.229996 196.339996 200.100006 197.514709 25881700
2019-04-09 200.320007 202.850006 199.229996 199.500000 196.922470 35768200
2019-04-10 198.679993 200.740005 198.179993 200.619995 198.027985 21695300
2019-04-11 200.850006 201.000000 198.440002 198.949997 196.379578 20900800

World Happiness Dataset

The third dataset, which we'll use for the map charts, is the world happiness dataset available on Kaggle. It has attributes such as happiness score, perceptions of corruption, healthy life expectancy, social support, freedom to make life choices, generosity and GDP per capita for countries around the world. We'll load it as a pandas dataframe.

In [4]:
happiness_df = pd.read_csv("datasets/world_happiness_2019.csv")
happiness_df.head()
Out[4]:
Overall rank Country or region Score GDP per capita Social support Healthy life expectancy Freedom to make life choices Generosity Perceptions of corruption
0 1 Finland 7.769 1.340 1.587 0.986 0.596 0.153 0.393
1 2 Denmark 7.600 1.383 1.573 0.996 0.592 0.252 0.410
2 3 Norway 7.554 1.488 1.582 1.028 0.603 0.271 0.341
3 4 Iceland 7.494 1.380 1.624 1.026 0.591 0.354 0.118
4 5 Netherlands 7.488 1.396 1.522 0.999 0.557 0.322 0.298

We suggest that you download all datasets beforehand and keep them in a datasets folder next to the jupyter notebook (matching the paths used in the code above) to follow along with the tutorial. We'll now start plotting various charts to explain the usage of bqplot's pyplot API.

1. Scatter Plot

The first plot type that we'll introduce is a scatter plot. We'll plot the alcohol vs malic acid relationship using a scatter plot.

In [ ]:
fig = plt.figure(title="Alcohol vs Malic Acid Relation")

scat = plt.scatter(x=wine_df["alcohol"], y=wine_df["malic_acid"])

plt.xlabel("Alcohol")
plt.ylabel("Malic Acid")

plt.show()

Bqplot Simple Scatter Chart

Below we modify the scatter plot by passing arguments for color, edge color, edge width, marker size, marker type, opacity, etc. We also show another way of setting axis attributes: passing them as a dictionary to the axes_options parameter. The stroke and stroke_width parameters control the outline of the markers. We have used square markers for this scatter plot and 2 different colors to color individual markers.

In [ ]:
fig = plt.figure(title="Alcohol vs Malic Acid Relation", )

options = {'x':{'label':"Alcohol"}, 'y':{'label':'Malic Acid'}}

scat = plt.scatter(wine_df["alcohol"], wine_df["malic_acid"],
                   colors=["lime", "tomato"],
                   axes_options = options,
                   stroke="black", stroke_width=2.0,
                   default_size=150,
                   default_opacities=[0.7],
                   marker="square",
                   )

plt.show()

Bqplot Simple Scatter Chart

We can also access the layout object of the figure and modify the plot width and height by setting their values in pixels. We also set the x-axis label, y-axis label and x-axis limits to further enhance the graph. The points are color-encoded according to the wine category, and the color bar location is changed through the axes_options parameter.

In [ ]:
fig = plt.figure(title="Alcohol vs Malic Acid Relation")

fig.layout.height = "500px"
fig.layout.width = "600px"

options = {'color': dict(label='Category', orientation='vertical', side='right')}

scat = plt.scatter(x = wine_df["alcohol"], y = wine_df["malic_acid"],
                   color=wine_df["Category"],
                   axes_options = options,
                   stroke="black", stroke_width=2.0,
                   default_size=200,
                   default_opacities=[0.9],
                   marker="circle",
                   )

plt.xlabel("Alcohol")
plt.ylabel("Malic Acid")

plt.xlim(10.7, 15.3)

plt.show()

Bqplot Simple Scatter Chart

Below we introduce a tooltip which shows the Wine Category, Alcohol and Malic Acid values of a point when the mouse hovers over it. We need to pass the mark attributes that will be used to generate the tooltip contents. We are using the x, y and color (wine category) attributes for the tooltip.

In [ ]:
from bqplot import Tooltip
scat.tooltip = Tooltip(fields=["color", 'x', 'y'], labels=["Wine Category", "Alcohol", "Malic Acid"])

We can also enable movement of a point on the graph by setting enable_move attribute to True.

In [ ]:
scat.enable_move = True

2. Bar Chart

The second type of chart we'll introduce is the bar chart, along with its variants: side-by-side (grouped) and stacked bar charts.

2.1 Simple Bar Chart

Below we plot our first bar chart, depicting the average magnesium per wine category. We first group the rows of the wine dataframe by wine category and then take the mean, which gives a dataframe with the average value of every column per wine category. We'll reuse this dataframe of per-category averages with other charts as well.

In [ ]:
fig = plt.figure(title="Average Magnesium Per Wine Category")

fig.layout.height = "400px"
fig.layout.width = "600px"

avg_wine_df = wine_df.groupby(by="Category").mean()

bar_chart  = plt.bar(x = avg_wine_df.index, y= avg_wine_df["magnesium"])

bar_chart.colors = ["tomato"]

bar_chart.tooltip = Tooltip(fields=["x", "y"], labels=["Wine Category", "Avg Magnesium"])

plt.xlabel("Wine Category")
plt.ylabel("Average Magnesium")


plt.show()

Bqplot Simple Bar Chart
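
To make the grouping step used above more concrete, here is a minimal sketch on a tiny made-up dataframe (the values are illustrative and are not taken from the wine dataset). It shows that groupby(...).mean() returns one row per category with the category as the index, which is why the index can be passed as x and a column as y to plt.bar().

```python
import pandas as pd

# Toy data with hypothetical values, standing in for wine_df
toy_df = pd.DataFrame({
    "Category": [0, 0, 1, 1, 2],
    "magnesium": [100.0, 120.0, 90.0, 110.0, 95.0],
})

# One row per category; the category values become the index
avg_toy_df = toy_df.groupby(by="Category").mean()

# The index gives the bar positions, the column gives the bar heights
bar_x = avg_toy_df.index.tolist()
bar_y = avg_toy_df["magnesium"].tolist()

print(bar_x)
print(bar_y)
```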

2.2 Side by Side Bar Chart

The example below demonstrates how to generate a side-by-side bar chart. We plot the average ash and average flavanoids per wine category as grouped bars.

In [ ]:
fig = plt.figure(title="Average Ash and Flavanoids Per Wine Category",
                 fig_margin={'top':50, 'bottom':20, 'left':150, 'right':150},
                 legend_location="top-left")

avg_wine_df = wine_df.groupby(by="Category").mean()

bar_chart  = plt.bar(x = avg_wine_df.index, y= [avg_wine_df["ash"], avg_wine_df["flavanoids"]],
                     labels = ["Ash", "Flavanoids"],
                     display_legend=True)

bar_chart.type = "grouped"

bar_chart.colors = ["tomato", "lime"]

bar_chart.tooltip = Tooltip(fields=["x", "y"], labels=["Wine Category", "Avg Ash/Flavanoids"])

plt.xlabel("Wine Category")
plt.ylabel("Average Value")


plt.show()

Bqplot Side by Side Bar Chart

2.3 Stacked bar Chart

Below is a stacked bar chart example. We plot the average ash and flavanoids per wine category stacked on top of one another.

In [ ]:
fig = plt.figure(title="Average Ash and Flavanoids Per Wine Category",
                fig_margin={'top':50, 'bottom':20, 'left':150, 'right':150},)

avg_wine_df = wine_df.groupby(by="Category").mean()

bar_chart  = plt.bar(x = avg_wine_df.index, y= [avg_wine_df["ash"], avg_wine_df["flavanoids"]],
                     labels=["Ash", "Flavanoids"],
                     display_legend=True)

bar_chart.type = "stacked"

bar_chart.colors = bqplot.CATEGORY10

bar_chart.tooltip = Tooltip(fields=["x", "y"], labels=["Wine Category", "Avg Ash/Flavanoids"])

plt.xlabel("Wine Category")
plt.ylabel("Average Value")


plt.show()

Bqplot Stacked Bar Chart

3. Line Chart

The third chart type that we would like to introduce is the line chart. We'll plot a simple line chart as well as a chart with more than one line.

3.1 Apple Stock Close Price Line Chart

Below we plot the Apple stock close price for the whole period covered by the dataset, from Apr-2019 till Apr-2020. We'll use the plot() method, passing it the dates and closing prices to generate a line chart.

In [ ]:
fig = plt.figure(title="Apple Stock Close Price")

line_chart = plt.plot(x=apple_df.index, y=apple_df.Close)

plt.xlabel("Date")
plt.ylabel("Close Price")

plt.show()

Bqplot Single Line Chart

3.2 Apple Stock Open, High, Low and Close Prices Line Charts

Below we generate another line chart that plots the open, high, low and close prices of Apple for the same period. All lines are combined in a single figure, and a legend is displayed to differentiate the lines by color.

In [ ]:
fig = plt.figure(title="Apple Stock OHLC Prices", legend_location="top-left")

line_chart = plt.plot(x=apple_df.index, y=[apple_df.Open, apple_df.High, apple_df.Low, apple_df.Close],
                     labels=["Open","High", "Low", "Close"],
                     display_legend=True)

plt.xlabel("Date")
plt.ylabel("Price")

line_chart.tooltip = Tooltip(fields=["x", "y"], labels=["Date", "OHLC Price"])

plt.show()

Bqplot Multi Line Chart

4. Histogram

The fourth chart type that we'll introduce is the histogram. Histograms are commonly used to see the distribution of values of a particular column of data. Below we plot the alcohol distribution using 20 bins.

In [ ]:
fig = plt.figure(title="Alcohol Distribution")

fig.layout.width = "600px"
fig.layout.height = "500px"

histogram = plt.hist(sample = wine_df["alcohol"], bins=20)

histogram.colors = ["orangered"]
histogram.stroke="blue"
histogram.stroke_width = 2.0

plt.grids(value="none")
plt.xlim(10.5,15.5)

plt.show()

Bqplot Histogram Chart
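
As a mental model for what the bins argument does, the sketch below counts values per equal-width bin with NumPy on a small made-up sample. It is only an illustration of binning; we are assuming, not asserting, that bqplot's own binning behaves comparably.

```python
import numpy as np

# Hypothetical sample values, not taken from the wine dataset
sample = np.array([11.0, 11.5, 12.0, 12.2, 12.8, 13.0, 13.1, 13.9, 14.5, 15.0])

# Split the range [min, max] into 4 equal-width bins and count values per bin
counts, edges = np.histogram(sample, bins=4)

print(edges)   # the 5 bin boundaries
print(counts)  # number of values that fall into each of the 4 bins
```

More bins give a finer but noisier picture of the distribution, while fewer bins smooth it out.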

5. Pie Chart

The fifth chart type is the pie chart. Pie charts are commonly used to see the share of each value of a categorical variable. We'll look at the distribution of wine categories. We'll also modify various styling attributes of the pie chart.

In [ ]:
from collections import Counter

wine_cat = Counter(wine_df["Category"])

fig = plt.figure(title="Wine Category Distribution", animation_duration=1000)

pie = plt.pie(sizes = list(wine_cat.values()),
              labels =["Category %d"%val for val in list(wine_cat.keys())],
              display_values = True,
              values_format=".0f",
              display_labels='outside')

pie.stroke="black"
pie.colors = ["tomato","lawngreen", "skyblue"]
pie.opacities = [0.7,0.8,0.9]

pie.radius = 150
pie.inner_radius = 60

pie.label_color = 'orangered'
pie.font_size = '20px'
pie.font_weight = 'bold'

plt.show()

Bqplot Pie Chart
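
The pie chart above receives raw counts as sizes. The sketch below, on a made-up list of category labels, shows how those counts come out of Counter and what share of the whole each one represents. The input list is hypothetical and only stands in for the Category column.

```python
from collections import Counter

# Hypothetical category labels standing in for wine_df["Category"]
categories = [0, 0, 0, 1, 1, 1, 1, 2, 2, 2]

# Count how many times each category occurs
cat_counts = Counter(categories)
total = sum(cat_counts.values())

# Share of each category as a percentage of all rows
shares = {cat: 100 * n / total for cat, n in cat_counts.items()}

# Labels built the same way as in the pie chart cell
labels = ["Category %d" % cat for cat in cat_counts]

print(shares)
print(labels)
```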

6. BoxPlot

Our sixth chart type is the box plot. Box plots are commonly used to see where the majority of values of a particular quantity are concentrated, along with their spread. We'll plot a box plot for several columns of the wine data.

In [ ]:
fig = plt.figure(title="Box Plots")

mini_df = wine_df[["alcohol","malic_acid","ash","total_phenols", "flavanoids", "nonflavanoid_phenols", "proanthocyanins", "color_intensity", "hue"]]

boxes = plt.boxplot(x=range(mini_df.shape[1]), y=mini_df.values.T)

boxes.box_fill_color = 'lawngreen'
boxes.opacity = 0.6
boxes.box_width = 50

plt.grids(value="none")

plt.show()

Bqplot Box Plot
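
To recall what a box summarizes, here is a small sketch that computes the quartiles of a made-up list with Python's statistics module. Quartile conventions differ between tools, so this is an illustration of the idea rather than a description of how bqplot computes its boxes.

```python
import statistics

# Hypothetical values for a single column
values = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# Three cut points that split the sorted data into four quarters
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")

# The box spans q1 to q3; its height is the interquartile range
iqr = q3 - q1

print(q1, median, q3, iqr)
```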

7. HeatMap

The seventh chart type that we'll be introducing is a heatmap. We are using the heatmap below to depict the correlation between various columns of wine data.

In [ ]:
fig  = plt.figure(title="Correlation Heatmap",padding_y=0)

fig.layout.width = "700px"
fig.layout.height = "700px"

axes_options = {'color': {'orientation': "vertical","side":"right"}}

plt.heatmap(color=wine_df.corr().values, axes_options=axes_options)

plt.show()

Bqplot Heatmap
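
The heatmap above is driven by the 2D array returned by corr().values. The following sketch shows the shape and meaning of that array on a tiny made-up dataframe (the column names and values are hypothetical).

```python
import pandas as pd

# Toy data: b rises with a, c falls as a rises
toy_df = pd.DataFrame({
    "a": [1.0, 2.0, 3.0, 4.0],
    "b": [2.0, 4.0, 6.0, 8.0],
    "c": [4.0, 3.0, 2.0, 1.0],
})

# Pairwise Pearson correlation between all numeric columns
corr = toy_df.corr()

# Square 2D array with one row and one column per dataframe column
corr_values = corr.values

print(corr.round(2))
print(corr_values.shape)
```

Each cell of the heatmap corresponds to one entry of this matrix, and the diagonal is always 1 because every column is perfectly correlated with itself.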

8. Candlestick Chart

Candlestick charts are very common in the finance industry and are the eighth chart type that we would like to introduce. A candlestick chart represents the change in the value of a stock on each day over a period of time.

8.1 Apple Candlestick Chart with Candle Marker [January - 2020]

We are plotting a candlestick chart for Apple stock for January-2020. We need the open, high, low and close prices of the stock to generate candlestick charts.

In [ ]:
fig = plt.figure(title="Apple CandleStick Chart")

fig.layout.width="800px"

# Select the January 2020 rows; .loc is needed for partial date strings in recent pandas
apple_df_jan_2020 = apple_df.loc["2020-01"]

ohlc = plt.ohlc(x=apple_df_jan_2020.index, y=apple_df_jan_2020[["Open","High","Low","Close"]],
                marker="candle", stroke="blue")

ohlc.colors=["lime", "tomato"]
plt.xlabel("Date")

plt.show()

Bqplot Candlestick Chart Candle Marker
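
A candle is conventionally drawn in one color on days when the price rose and in another on days when it fell. The sketch below classifies a few made-up days that way in plain Python. The prices are hypothetical, counting an unchanged day as "up" is our own choice, and it is an assumption that the two entries of ohlc.colors correspond to rising and falling days in that order.

```python
# Hypothetical (label, open, high, low, close) rows, not real Apple prices
days = [
    ("day 1", 100.0, 105.0, 99.0, 104.0),   # close above open
    ("day 2", 104.0, 106.0, 101.0, 102.0),  # close below open
    ("day 3", 102.0, 103.0, 100.0, 102.0),  # close equal to open
]


def direction(open_price, close_price):
    """Return 'up' when the close is at or above the open, else 'down'."""
    return "up" if close_price >= open_price else "down"


directions = [
    direction(open_price, close_price)
    for _label, open_price, _high, _low, close_price in days
]

print(directions)
```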

8.2 Apple OHLC Chart [January - 2020]

Below we introduce another variation of the candlestick chart (marker="bar") which displays only lines instead of a filled candle for each change in stock value.

In [ ]:
fig = plt.figure(title="Apple CandleStick Chart")

fig.layout.width="800px"

# Select the January 2020 rows; .loc is needed for partial date strings in recent pandas
apple_df_jan_2020 = apple_df.loc["2020-01"]

ohlc = plt.ohlc(x=apple_df_jan_2020.index, y=apple_df_jan_2020[["Open","High","Low","Close"]],
                marker="bar", stroke="blue")

ohlc.colors=["lime", "tomato"]
plt.xlabel("Date")

plt.show()

Bqplot Candlestick Chart Bar Marker

9. Choropleth Maps

The ninth and last chart type that we'd like to introduce is the choropleth map. bqplot provides a way to create interactive choropleth maps as well. We'll use the world happiness dataset that we loaded earlier to plot various choropleth maps.

We first need a simple mapping function which takes the map data as input and maps each country id to a particular value, such as the happiness score, healthy life expectancy or perceptions of corruption of that country. bqplot's geo() method generates the map, and the resulting map object needs this mapping from country id to value as its color attribute in order to color the choropleth.

We'll follow the steps below to generate choropleth maps with bqplot:

  • We'll first generate a world map using bqplot's geo(). It initializes the graph with data about each country in the world.
  • We'll then retrieve the map data from the bqplot map object and pass it to the function below, which generates a mapping from country id to the chosen column's value.
  • We'll then set this mapping from country id to value (like happiness score, life expectancy, corruption perception, etc.) as the color attribute of the map object. We also set a default color, such as grey, for countries where no mapping is found.
In [ ]:
def map_data_to_color_mapping(map_data, column="Score"):
    """
    Function to Map Country ID to Column Value from Happiness DataFrame
    """
    name_to_id_mapping = []
    for entry in map_data:
        if entry["properties"]["name"] == "Russian Federation":
            name_to_id_mapping.append(("Russia", entry["id"]))
        else:
            name_to_id_mapping.append((entry["properties"]["name"], entry["id"]))

    name_to_id_mapping = dict(name_to_id_mapping)

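    color = []
    for name, idx in name_to_id_mapping.items():
        # Look up the requested column (not always "Score") for rows whose
        # country name contains the map's country name as a plain substring
        matches = happiness_df["Country or region"].str.contains(name, regex=False)
        score = happiness_df[matches][column].values
        if len(score) > 0:
            color.append((idx, score[0]))
    return dict(color)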
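
The mapping logic can be tried without bqplot or the happiness dataset. The sketch below is a simplified, self-contained variant that uses made-up map entries and scores, and exact name matching instead of substring matching. All names and values in it are hypothetical.

```python
# Hypothetical stand-in for the geometries list read from the map object
toy_map_data = [
    {"id": 1, "properties": {"name": "Finland"}},
    {"id": 2, "properties": {"name": "Russian Federation"}},
    {"id": 3, "properties": {"name": "Atlantis"}},  # absent from the scores
]

# Hypothetical scores keyed by the country names used in the data table
toy_scores = {"Finland": 7.8, "Russia": 5.6}

# Names that are spelled differently in the map and in the table
renames = {"Russian Federation": "Russia"}

# Build the country id -> value mapping, skipping countries without data
id_to_value = {}
for entry in toy_map_data:
    name = entry["properties"]["name"]
    name = renames.get(name, name)
    if name in toy_scores:
        id_to_value[entry["id"]] = toy_scores[name]

print(id_to_value)
```

Countries that are missing from the mapping, like the third entry here, are the ones that end up drawn in the default color.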

9.1 World Happiness Choropleth Map

Below we generate a choropleth map that colors each country of the world by its happiness score.

In [ ]:
fig = plt.figure(title='World Happiness Report')

plt.scales(scales={'color': bqplot.ColorScale(scheme='Blues')})

choropleth_map = plt.geo(map_data='WorldMap',
                     colors={'default_color': 'Grey'})

map_data = choropleth_map.map_data["objects"]["subunits"]["geometries"]

choropleth_map.color = map_data_to_color_mapping(map_data)

choropleth_map.tooltip = Tooltip(fields=["color"], labels=["Happiness Score"])

fig

Bqplot Happiness Choropleth Map

9.2 World Healthy Life Expectancy Choropleth Map

Below we generate a choropleth map that colors each country of the world by its healthy life expectancy.

In [ ]:
fig = plt.figure(title='World Healthy life expectancy Report')

plt.scales(scales={'color': bqplot.ColorScale(scheme='RdYlBu')})

choropleth_map = plt.geo(map_data='WorldMap',
                     colors={'default_color': 'white'})

map_data = choropleth_map.map_data["objects"]["subunits"]["geometries"]

choropleth_map.color = map_data_to_color_mapping(map_data, "Healthy life expectancy")

choropleth_map.tooltip = Tooltip(fields=["color"], labels=["Healthy life expectancy"])

fig

Bqplot Healthy Life Expectancy Choropleth Map

9.3 World Perceptions of Corruption Choropleth Map

Below we generate a choropleth map that colors each country of the world by its perceptions of corruption value.

In [ ]:
fig = plt.figure(title='World Perceptions of corruption Report')

plt.scales(scales={'color': bqplot.ColorScale(scheme='BrBG')})

choropleth_map = plt.geo(map_data='WorldMap',
                     colors={'default_color': 'white'})

map_data = choropleth_map.map_data["objects"]["subunits"]["geometries"]

choropleth_map.color = map_data_to_color_mapping(map_data, "Perceptions of corruption")

choropleth_map.tooltip = Tooltip(fields=["color"], labels=["Perceptions of corruption"])

fig


Sunny Solanki