hvplot - How to Convert Static Pandas Python Plot (Matplotlib) to Interactive?

Introduction

Pandas is a popular Python library that provides an easy-to-use interface for working with tabular data through its efficient dataframe data structure. It is widely used, and most developers who work with tabular data rely on it in some form. Pandas also provides plotting functionality, but by default all of its plots are static: it uses matplotlib, a well-known Python library for static graphs, as its plotting backend. Developers who have plotted with pandas will already be familiar with this plotting API.

But what if you want your plots to be interactive?

What if you want to use the same interface of pandas for plotting which you are used to but want interactive graphs instead of static?

What if you are very well aware of pandas and in the future want to explore interactive plotting with it?

If the answer to any of these questions is yes, then this tutorial is for you. It shows how to keep using the familiar pandas plotting interface while producing interactive graphs. We'll introduce a library called hvplot, which adds a pandas-style plotting API to dataframes and uses the interactive plotting library holoviews to draw the graphs. We'll walk through several examples of generating interactive graphs with hvplot, using the wine dataset from scikit-learn, which is a classification dataset.

Loading Wine Dataset

We'll start by loading the wine dataset available in scikit-learn. It has 13 features and a target variable with 3 different classes of wine. We'll store the whole dataset in a pandas dataframe so that it is easy to plot and manipulate. The numeric targets 0, 1 and 2 are relabelled as Class_1, Class_2 and Class_3.

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

from sklearn.datasets import load_wine

wine = load_wine()

print("Feature Names : ", wine.feature_names)
print("\nTarget Names : ", wine.target_names)

wine_df = pd.DataFrame(wine.data, columns = wine.feature_names)
wine_df["Target"] = wine.target
wine_df["Target"] = ["Class_1" if typ==0 else "Class_2" if typ==1 else "Class_3"  for typ in wine_df["Target"]]

print("\nDataset Size : ", wine_df.shape)

wine_df.head()
Feature Names :  ['alcohol', 'malic_acid', 'ash', 'alcalinity_of_ash', 'magnesium', 'total_phenols', 'flavanoids', 'nonflavanoid_phenols', 'proanthocyanins', 'color_intensity', 'hue', 'od280/od315_of_diluted_wines', 'proline']

Target Names :  ['class_0' 'class_1' 'class_2']

Dataset Size :  (178, 14)
Out[1]:
alcohol malic_acid ash alcalinity_of_ash magnesium total_phenols flavanoids nonflavanoid_phenols proanthocyanins color_intensity hue od280/od315_of_diluted_wines proline Target
0 14.23 1.71 2.43 15.6 127.0 2.80 3.06 0.28 2.29 5.64 1.04 3.92 1065.0 Class_1
1 13.20 1.78 2.14 11.2 100.0 2.65 2.76 0.26 1.28 4.38 1.05 3.40 1050.0 Class_1
2 13.16 2.36 2.67 18.6 101.0 2.80 3.24 0.30 2.81 5.68 1.03 3.17 1185.0 Class_1
3 14.37 1.95 2.50 16.8 113.0 3.85 3.49 0.24 2.18 7.80 0.86 3.45 1480.0 Class_1
4 13.24 2.59 2.87 21.0 118.0 2.80 2.69 0.39 1.82 4.32 1.04 2.93 735.0 Class_1
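
As an alternative sketch, scikit-learn versions 0.23 and later can return the data as a dataframe directly through the as_frame=True argument. The names wine_bunch and wine_frame below are ours, and the numeric target column is left as scikit-learn provides it rather than relabelled as above.

```python
from sklearn.datasets import load_wine

# as_frame=True asks scikit-learn to return pandas objects instead of arrays.
wine_bunch = load_wine(as_frame=True)

# .frame holds the 13 feature columns plus a numeric "target" column.
wine_frame = wine_bunch.frame

print(wine_frame.shape)
print(wine_frame["target"].value_counts().sort_index())
```

This avoids building the dataframe by hand, at the cost of requiring a newer scikit-learn.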

1. Plotting With Pandas

We'll first plot a static graph to show the current pandas plotting API, which uses matplotlib. We'll then convert the graph to an interactive one using hvplot.

In [ ]:
## Newer matplotlib releases no longer include the plain "seaborn" style, so only "ggplot" is used here.
with plt.style.context("ggplot"):
    wine_df.plot(
                 x="alcohol",
                 y="malic_acid",
                 kind="scatter",
                 s=100, alpha=0.7,
                 title="Alcohol vs Malic Acid Scatter Chart")

We can see that the above-generated graph is static and does not let us interact with individual points to look at values. We'll now try to convert this graph to an interactive graph and introduce a few other interactive graphs as well for the exploration of hvplot API.

2. Converting Static Plots to Interactive using Hvplot

It's quite simple to convert static pandas plots to interactive ones. We just need to import the pandas module of hvplot, which adds an hvplot accessor to pandas dataframes and series. This accessor exposes the hvplot API that we'll explore in the rest of this tutorial.

In [ ]:
import hvplot.pandas

After importing the pandas module of hvplot, we call the hvplot() method on the pandas dataframe to generate an interactive graph. Note that we call hvplot() with almost the same parameters as plot() above; the main difference is that the marker size is passed as size instead of s.

In [ ]:
scat1 = wine_df.hvplot(
                       x="alcohol",
                       y="malic_acid",
                       kind="scatter",
                       size=70,
                       alpha=0.7,
                       title="Alcohol vs Malic Acid Scatter Chart")
scat1

We'll now explore a few more graph types. For some of them, we group the wine dataframe by wine class and take the mean of each column per class, then use this averaged dataset for plotting.

3. Bar Chart

To create a bar chart, we first take the average of all columns of the wine dataframe after grouping it by the wine class column. We then plot a bar chart by calling hvplot() with the columns ["malic_acid", "ash", "total_phenols"] to compare their averages. We also pass kind="bar" to create a bar chart.

In [ ]:
average_wine_df = wine_df.groupby(by="Target").mean()

bar1 = average_wine_df.hvplot(
                        y=["malic_acid", "ash", "total_phenols"],
                        kind="bar",
                        height=400, width=800,
                        ylim=(0, 4),
                        ylabel="Average Ash, Malic Acid & Total Phenols",
                        title="Average Ash, Malic Acid & Total Phenols per Wine Class", )

bar1

3.1 Stacked Bar Chart

We can easily convert the side-by-side bar chart to a stacked bar chart to see how ["malic_acid", "ash", "total_phenols"] add up in each wine category. We just need to pass stacked=True to convert the bar chart to a stacked bar chart.

In [ ]:
bar2 = average_wine_df.hvplot(
                            y=["malic_acid", "ash", "total_phenols"],
                            kind="bar",
                            height=400, width=600,
                            bar_width=0.6,
                            ylim=(0, 8),
                            stacked=True,
                            ylabel="Average Ash, Malic Acid & Total Phenols",
                            title="Average Ash, Malic Acid & Total Phenols per Wine Class")

bar2

Just as pandas lets us call methods such as plot.bar() on its plot accessor, hvplot lets us call plot-specific methods on the hvplot accessor. The example below shows this.

3.2 Horizontal Bar Chart

We'll now call the barh() method on the hvplot accessor of the dataframe to create a horizontal bar chart of ["color_intensity", "hue"] by wine category.

In [ ]:
bar3 = average_wine_df.hvplot.barh(
                                y=["color_intensity", "hue"],
                                height=400, width=600,
                                ylim=(0.0, 8.0),
                                ylabel="Average Color Intensity & Hue",
                                color=["limegreen","tomato"],
                                title="Average Color Intensity & Hue per Wine Class")
bar3

4. Scatter Plot

We can easily create a scatter plot by calling the scatter() method on the hvplot accessor and passing the x and y columns. We also color-encode the points by wine category by passing by="Target".

In [ ]:
scat = wine_df.hvplot.scatter(
               x="proanthocyanins",
               y="total_phenols",
               by="Target",
               width=700, height=400,
               size=70, alpha=0.6,
               xlim=(0.0, 4.0),
               ylim=(0.0, 4.0),
               xlabel="Proanthocyanins",
               ylabel="Total Phenols",
               title="proanthocyanins vs total phenols color-encoded by wine class")

scat

5. Histogram

We can create a histogram by calling the hist() method on the hvplot accessor. Here we pass malic_acid as y to plot its distribution. The number of bins can be changed through the bins parameter.

In [ ]:
wine_df.hvplot.hist(
                    y=["malic_acid"],
                    width=500, height=400,
                    ylim=(0,50),
                    bins=20,
                    alpha=0.9,
                    color="orangered",
                    ylabel="Frequency",
                    title="Malic Acid Distribution")
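
To make the effect of bins concrete, the sketch below reproduces the binning numerically with NumPy. We assume equal-width bins over the data range, which is NumPy's default; the plotted bars should correspond to counts of this kind, though hvplot's own binning is not shown here.

```python
import numpy as np
from sklearn.datasets import load_wine

# Column 1 of the wine data is "malic_acid".
malic_acid = load_wine().data[:, 1]

for n_bins in (10, 20):
    # counts has n_bins entries, edges has n_bins + 1 entries.
    counts, edges = np.histogram(malic_acid, bins=n_bins)
    width = edges[1] - edges[0]
    print(f"bins={n_bins}: bin width={width:.3f}, tallest bar={counts.max()}")
```

More bins give narrower bars and lower counts per bar, which is why ylim may need adjusting when bins changes.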

5.1 Overlapped Histograms

We can also create overlapped histograms to compare the distribution of a variable across categories. Below we create overlapped histograms of the proline variable per wine category.

In [ ]:
wine_df.hvplot.hist(
                    y="proline",
                    by="Target", ## Grouping by Wine Class Type.
                    width=600, height=400,
                    ylim=(0,16),
                    alpha=0.7,
                    bins=20,
                    ylabel="Frequency",
                    title="Proline Distribution per Wine Class")

6. KDE Graph

We can also create Kernel Density Estimation graphs by passing the column for which estimation is needed.

In [ ]:
wine_df.hvplot.kde(
                    y=["malic_acid"],
                    width=500, height=400,
                    alpha=0.9,
                    color="orangered",
                    ylabel="Density",
                    title="Malic Acid Distribution")
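
A KDE curve is a smoothed estimate of the probability density, so its y axis shows density rather than counts. As a rough illustration, the sketch below fits a Gaussian KDE with SciPy's gaussian_kde and checks that the area under the curve is close to 1. This is our own stand-in; the bandwidth hvplot picks may differ.

```python
import numpy as np
from scipy.stats import gaussian_kde
from sklearn.datasets import load_wine

# Column 1 of the wine data is "malic_acid".
malic_acid = load_wine().data[:, 1]

kde = gaussian_kde(malic_acid)  # default bandwidth (Scott's rule)

# Evaluate on a grid that extends past the data range on both sides.
grid = np.linspace(malic_acid.min() - 2, malic_acid.max() + 2, 500)
density = kde(grid)

# Riemann-sum approximation of the area under the curve.
area = float(np.sum(density) * (grid[1] - grid[0]))
print(f"Area under KDE curve: {area:.3f}")
```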

6.1 Overlapped KDEs

We can create overlapped KDEs the same way we created overlapped histograms.

In [ ]:
wine_df.hvplot.kde(
                    y="proline",
                    by="Target", ## Grouping by Wine Class Type.
                    width=600, height=400,
                    alpha=0.7,
                    ylabel="Density",
                    title="Proline Distribution per Wine Class")

7. Box Plot

We can create box plots by passing the list of columns for which they are needed. Below we create box plots for ["malic_acid", "ash", "flavanoids", "total_phenols"].

In [ ]:
wine_df.hvplot.box(
                    y=["malic_acid", "ash","flavanoids", "total_phenols"],
                    ylim=(0,6),
                    height=400,
                    box_width=0.6,
                    color="crimson",
                    title="Malic Acid, Ash, Flavanoids & Total Phenols")

We can also create a box plot of one variable split by category. Below we create a box plot of proline showing its distribution per wine category.

In [ ]:
wine_df.hvplot.box(
                    y="proline",
                    by="Target",
                    ylim=(0,1750),
                    height=400,
                    color="forestgreen",
                    title="Proline distribution per wine category")
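
The box in each box plot spans the first to the third quartile, with a line at the median. As a cross-check, the sketch below computes those three numbers per wine class with plain pandas. The dataframe is rebuilt here so that the block runs on its own, and the column labels Q1, median and Q3 are ours.

```python
import pandas as pd
from sklearn.datasets import load_wine

wine = load_wine()
df = pd.DataFrame(wine.data, columns=wine.feature_names)
# Same relabelling as in the tutorial: 0 -> Class_1, 1 -> Class_2, 2 -> Class_3.
df["Target"] = [f"Class_{t + 1}" for t in wine.target]

# Quartiles of proline for each wine class, one row per class.
quartiles = (
    df.groupby("Target")["proline"]
      .quantile([0.25, 0.5, 0.75])
      .unstack()
)
quartiles.columns = ["Q1", "median", "Q3"]
print(quartiles)
```

Hovering over a box in the interactive plot should show values in line with this table.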

8. Line Chart

We can easily create line charts by calling the hvplot() method with x and y values, since it creates a line chart by default. Line charts are ideal for time-series plots where a datetime is on the x axis and other quantities are on the y axis. We can pass more than one column as a list to the y parameter to draw a line for each of them. Below we create a line chart for ["malic_acid", "ash", "total_phenols"]; since no x is given, the dataframe index (the sample number) is used for the x axis.

In [ ]:
wine_df.hvplot(
                y = ["malic_acid", "ash", "total_phenols"],
                width=700, height=400,
                title="Line Chart of All samples of Malic Acid, Ash & Total Phenols")

9. Violin Chart

We can easily create a violin chart by calling the violin() method with the column for which the violin is needed. We can either pass several columns to the y parameter to create one violin per column, or pass a single column to y and a categorical column to by to create one violin per category.

In [ ]:
wine_df.hvplot.violin(
                        y="proanthocyanins",
                        by="Target",
                        width=700, height=400,
                        ylim=(0.0,4.0),
                        title="Violin Chart of proanthocyanins per Wine Category")

10. Area Chart

We can easily create an area chart by passing a column name to the area() method.

In [ ]:
wine_df.hvplot.area(y="ash", alpha=0.4)

10.1 Overlapped Area Chart

We can pass more than one column to area() method to create an overlapped area chart. We can either keep them stacked over one another or prevent them from stacking over one another using stacked parameter.

In [ ]:
wine_df.hvplot.area(y=["ash", "malic_acid", "total_phenols"], stacked=False, alpha=0.4)
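
To clarify what stacking means numerically: in a stacked area chart each band starts where the previous one ends, so the top edge of the last band is the row-wise sum of the columns. The sketch below shows that arithmetic with plain pandas; it is not hvplot's internal code.

```python
import pandas as pd
from sklearn.datasets import load_wine

wine = load_wine()
df = pd.DataFrame(wine.data, columns=wine.feature_names)

cols = ["ash", "malic_acid", "total_phenols"]

# A running total across the columns gives the upper edge of each stacked band.
upper_edges = df[cols].cumsum(axis=1)
print(upper_edges.head(3))
```

Passing stacked=True to area() draws bands stacked in this way instead of overlapping ones.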

11. Table

We can even plot dataframe data as tables by calling table() method with column names.

In [ ]:
table = average_wine_df.hvplot.table(
                             columns=["alcohol","malic_acid", "ash"],
                             width=600, height=120,)

table

Hvplot also lets us combine two or more related graphs into a single figure, and it lets us overlay one graph on top of another. It supports 2 operators for this.

  • + - It merges graphs by placing them next to each other.
  • * - It overlays one graph on another to create a single combined graph.

We'll start with + operation to merge 3 bar charts. We'll create a bar chart of malic_acid, alcohol and magnesium grouped by wine category. We'll then merge all of them using + operation.

In [ ]:
## Compute the per-class averages once and select one column for each chart.
class_means = wine_df.groupby(by="Target").mean()
avg_malic_acid_per_class = class_means[["malic_acid"]]
avg_alcohol_per_class = class_means[["alcohol"]]
avg_magnesium_per_class = class_means[["magnesium"]]

bar1 = avg_malic_acid_per_class.hvplot.bar(
                                        ylim=(0.0, 4.0),
                                        color="tomato",
                                        width=300,height=300,
                                        title="Average Malic Acid per Wine Class")

bar2 = avg_alcohol_per_class.hvplot.bar(
                                        ylim=(0.0, 14.0),
                                        width=300,height=300,
                                        title="Average Alcohol per Wine Class")

bar3 = avg_magnesium_per_class.hvplot.bar(
                                        ylim=(0.0, 120.0),
                                        color="lawngreen",
                                        width=300,height=300,
                                        title="Average Magnesium per Wine Class")

bar1 + bar2 + bar3

We can overlay one graph over others using the * operation. We'll create 3 scatter plots of color_intensity versus hue, one for each wine category, and then merge them with * into a single scatter plot.

In [ ]:
scat1 = wine_df[wine_df["Target"] == "Class_1"].hvplot.scatter(x="color_intensity", y="hue",
                                                               width=600, height=400,
                                                               xlim=(0,14),ylim=(0,2.0),
                                                               size=50, alpha=0.6,
                                                               title="Color Intensity vs Hue color-encoded by Wine Class")

scat2 = wine_df[wine_df["Target"] == "Class_2"].hvplot.scatter(x="color_intensity", y="hue", size=70, alpha=0.6)

scat3 = wine_df[wine_df["Target"] == "Class_3"].hvplot.scatter(x="color_intensity", y="hue", size=90, alpha=0.6)

scat1 * scat2 * scat3

Below we have created another example of * operation by merging area and step graphs of ash attribute.

In [ ]:
wine_df.hvplot.area(y="ash", alpha=0.4) * wine_df.hvplot.step(y="ash", color="red")

When we merge graphs using the + operation, it creates a Layout object containing the graphs. We can call the cols() method on it to set the number of columns instead of placing all graphs in a single row. Below we merge 4 bar plots and, rather than putting all 4 next to each other, arrange them in 2 columns so that each column has 2 plots.

In [ ]:
avg_color_intens_per_class = wine_df.groupby(by="Target").mean()[["color_intensity"]]

bar4 = avg_color_intens_per_class.hvplot.bar(
                                        ylim=(0.0, 8.0),
                                        color="yellow",
                                        alpha=0.8,
                                        width=300,height=300,
                                        title="Average Color Intensity per Wine Class")


(bar1 + bar2 + bar3 + bar4).cols(2)

Below we have given another example of + operation where we are merging bar chart and table.

In [ ]:
bar_1 = average_wine_df.hvplot(y=["malic_acid", "ash", "total_phenols"],
                        kind="bar",
                        height=400, width=750,
                        ylim=(0, 4),
                        ylabel="Average Ash, Malic Acid & Total Phenols",
                        title="Average Ash, Malic Acid & Total Phenols per Wine Class", )

table = average_wine_df.hvplot.table(
                             columns=["malic_acid", "ash", "total_phenols"],
                             width=750, height=120,)

(bar_1 + table).cols(1)

This concludes our tutorial on converting static pandas plots to interactive plots. Please feel free to ask questions in the comments section if you have any doubts. We tried to cover as much material as possible. To summarize, the hvplot library that we used to create interactive graphs builds on the holoviews library, which renders plots with bokeh by default.

Sunny Solanki