Updated On : Mar-29,2023 Time Investment : ~15 mins

# Hexbin Charts using Matplotlib | Hexagonal Binning | Python

Data visualization is an essential tool for gaining insights into complex data sets. One popular visualization technique is the hexbin chart, which handles large amounts of data by grouping and aggregating data points into hexagonal bins. Hexbin charts are an alternative to traditional scatter plots and are particularly useful for large datasets with many overlapping data points.

#### What Can You Learn From This Article?

In this tutorial, we explore how to create hexbin charts using Matplotlib, a popular data visualization library in Python. The tutorial covers in detail how to use the hexbin() method of matplotlib to create hexbin charts, and explains its main parameters with examples. By the end of this article, readers should have a solid understanding of how to use hexbin charts to visualize their data using Matplotlib.


First, we have imported matplotlib and printed the version that we have used in our tutorial.

```python
import matplotlib

print("Matplotlib Version : {}".format(matplotlib.__version__))
```
```
Matplotlib Version : 3.5.3
```

## Load Wine Dataset

In this section, we have loaded the dataset that we'll use to create hexbin charts in our tutorial.

The code loads the wine dataset using the load_wine() method from the datasets module, which returns a Bunch object that contains the wine dataset's data and metadata. After loading the dataset, the code converts it into a pandas DataFrame, and adds a target variable column to the DataFrame.

The wine dataset records measured properties (referred to as ingredients in this tutorial), such as alcohol and malic acid content, for 3 different types of wine.

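```python
from sklearn import datasets
import pandas as pd

# Load the wine dataset as a Bunch object (data + metadata)
wine = datasets.load_wine()

# Convert it to a DataFrame and add the target as a column
wine_df = pd.DataFrame(data=wine.data, columns=wine.feature_names)
wine_df["WineType"] = wine.target

wine_df.head()
```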
| | alcohol | malic_acid | ash | alcalinity_of_ash | magnesium | total_phenols | flavanoids | nonflavanoid_phenols | proanthocyanins | color_intensity | hue | od280/od315_of_diluted_wines | proline | WineType |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 14.23 | 1.71 | 2.43 | 15.6 | 127.0 | 2.80 | 3.06 | 0.28 | 2.29 | 5.64 | 1.04 | 3.92 | 1065.0 | 0 |
| 1 | 13.20 | 1.78 | 2.14 | 11.2 | 100.0 | 2.65 | 2.76 | 0.26 | 1.28 | 4.38 | 1.05 | 3.40 | 1050.0 | 0 |
| 2 | 13.16 | 2.36 | 2.67 | 18.6 | 101.0 | 2.80 | 3.24 | 0.30 | 2.81 | 5.68 | 1.03 | 3.17 | 1185.0 | 0 |
| 3 | 14.37 | 1.95 | 2.50 | 16.8 | 113.0 | 3.85 | 3.49 | 0.24 | 2.18 | 7.80 | 0.86 | 3.45 | 1480.0 | 0 |
| 4 | 13.24 | 2.59 | 2.87 | 21.0 | 118.0 | 2.80 | 2.69 | 0.39 | 1.82 | 4.32 | 1.04 | 2.93 | 735.0 | 0 |

## 1. Initial Hexbin Chart

In this section, we have created our first hexbin chart, exploring the relationship between the values of the ingredients "alcohol" and "malic acid" in our wine dataset.

The code starts by importing the pyplot API of matplotlib.

Then, the code creates a new figure object with a width of 10 inches and a height of 8 inches, and adds a subplot to the figure.

The next line creates a hexbin plot of the two variables, 'alcohol' and 'malic_acid', from the wine_df DataFrame using the hexbin() method of the pyplot API. The gridsize parameter specifies the number of hexagons in the x and y directions, and the cmap parameter specifies the color map to use. Then, the code adds a colorbar to the chart. The colorbar shows the number of samples that fall into each hexagon, i.e. that match a particular combination of the two ingredients.

The next line of code hides the spines (the lines around the chart) of the subplot on all sides: bottom, top, left, and right.

Then, we have added the x-axis label, y-axis label, and chart title.

```python
import matplotlib.pyplot as plt

fig = plt.figure(figsize=(10,8))
ax = fig.add_subplot(111)  # subplot whose spines are hidden below

# Bin the points into hexagons; color encodes the count per hexagon
plt.hexbin(x=wine_df["alcohol"], y=wine_df["malic_acid"],
           gridsize=(15,10), cmap="magma"
          );

plt.colorbar();

ax.spines[["bottom", "top", "left", "right"]].set_visible(False);

plt.xlabel("Alcohol", fontsize=16, fontweight="bold")
plt.ylabel("Malic Acid", fontsize=16, fontweight="bold")
plt.title("Alcohol vs Malic Acid Hexbin Chart", loc="left", pad=10, fontsize=25, fontweight="bold");
```
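To see what the colorbar encodes, it can help to inspect the object that hexbin() returns. The sketch below is a minimal illustration that uses synthetic NumPy data as a stand-in for the wine columns (an assumption, so that it runs on its own); the variable names are ours. hexbin() returns a PolyCollection, and its get_array() method gives one value per drawn hexagon, which in this default mode is the number of points that fell into it.

```python
import matplotlib
matplotlib.use("Agg")  # optional: non-interactive backend, no display needed
import matplotlib.pyplot as plt
import numpy as np

# Synthetic stand-in for two numeric columns (not the wine data)
rng = np.random.default_rng(0)
x = rng.normal(13.0, 0.8, size=500)
y = rng.normal(2.3, 1.1, size=500)

fig, ax = plt.subplots(figsize=(10, 8))
hb = ax.hexbin(x, y, gridsize=(15, 10), cmap="magma")
fig.colorbar(hb, ax=ax)

counts = hb.get_array()  # one count per hexagon
print(type(hb).__name__)
print(len(counts), "hexagons")
print(int(counts.sum()), "points binned in total")
```

Because every point lands in exactly one hexagon, the counts should add up to the number of samples passed in.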

## 2. Modify Line Properties

In this example, we have explained how to modify various line properties of hexagons in the hexbin chart.

The majority of the code is the same as in our previous example, with the addition of a few new parameters in the call to hexbin(). The linewidth parameter lets us specify the line width, the edgecolor parameter lets us specify the line color, the linestyle parameter lets us specify the line style (dashed, dotted, etc.), and the alpha parameter lets us specify the opacity of the hexagons.

```python
import matplotlib.pyplot as plt

fig = plt.figure(figsize=(10,8))
ax = fig.add_subplot(111)

# Style the hexagon outlines and make the hexagons slightly transparent
plt.hexbin(x=wine_df["alcohol"], y=wine_df["malic_acid"],
           linewidth=1.5, edgecolor="white", linestyle="dotted", alpha=0.8,
           gridsize=(15,10), cmap="magma"
          );

plt.colorbar();

ax.spines[["bottom", "top", "left", "right"]].set_visible(False);

plt.xlabel("Alcohol", fontsize=16, fontweight="bold")
plt.ylabel("Malic Acid", fontsize=16, fontweight="bold")
plt.title("Alcohol vs Malic Acid Hexbin Chart", loc="left", pad=10, fontsize=25, fontweight="bold");
```
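The same styling can also be applied through the object-oriented interface, and the settings can be read back from the returned collection to confirm that they took effect. This is a small sketch on synthetic data (an assumption, so that it runs standalone); it uses "dashed" instead of "dotted" only to show another linestyle value.

```python
import matplotlib
matplotlib.use("Agg")  # optional: non-interactive backend, no display needed
import matplotlib.pyplot as plt
import numpy as np

# Synthetic stand-in data (not the wine data)
rng = np.random.default_rng(1)
x = rng.normal(size=300)
y = rng.normal(size=300)

fig, ax = plt.subplots(figsize=(10, 8))
hb = ax.hexbin(x, y, gridsize=(15, 10), cmap="magma",
               linewidth=1.5, edgecolor="white",
               linestyle="dashed", alpha=0.8)

print(hb.get_linewidth())  # line width applied to the hexagon edges
print(hb.get_alpha())      # opacity applied to the hexagons
```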

## 3. Include Range Of Values in Colorbar

In this section, we have explored how to restrict the range of count values shown in the hexbin chart and its colorbar.

Our previous hexbin charts drew every hexagon in the grid, including those that contain no samples. The code below draws only the hexagons that contain at least 1 sample for the combination of the two ingredients (alcohol & malic acid). We can set this minimum count using the mincnt parameter of the hexbin() method.

The resulting hexbin chart omits hexagons with 0 counts.

```python
import matplotlib.pyplot as plt

fig = plt.figure(figsize=(10,8))
ax = fig.add_subplot(111)

# mincnt=1 drops hexagons that contain no samples
plt.hexbin(x=wine_df["alcohol"], y=wine_df["malic_acid"],
           linewidth=1.5, edgecolor="white", mincnt=1,
           gridsize=(15,10), cmap="magma"
          );

plt.colorbar();

ax.spines[["bottom", "top", "left", "right"]].set_visible(False);

plt.xlabel("Alcohol", fontsize=16, fontweight="bold")
plt.ylabel("Malic Acid", fontsize=16, fontweight="bold")
plt.title("Alcohol vs Malic Acid Hexbin Chart", loc="left", pad=10, fontsize=25, fontweight="bold");
```
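The effect of mincnt can also be checked numerically. The following sketch, which assumes synthetic data in place of the wine columns, draws the same points with and without mincnt=1 and compares the values stored on the two returned collections.

```python
import matplotlib
matplotlib.use("Agg")  # optional: non-interactive backend, no display needed
import matplotlib.pyplot as plt
import numpy as np

# Synthetic stand-in data (not the wine data)
rng = np.random.default_rng(2)
x = rng.normal(size=300)
y = rng.normal(size=300)

fig, (ax_all, ax_min) = plt.subplots(1, 2, figsize=(12, 5))

hb_all = ax_all.hexbin(x, y, gridsize=(15, 10), cmap="magma")
hb_min = ax_min.hexbin(x, y, gridsize=(15, 10), cmap="magma", mincnt=1)

counts_all = hb_all.get_array()
counts_min = hb_min.get_array()

print("hexagons drawn without mincnt:", len(counts_all))
print("hexagons drawn with mincnt=1 :", len(counts_min))
print("smallest count with mincnt=1 :", counts_min.min())
```

Only empty hexagons are removed, so the total number of binned points is the same in both charts.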

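Below, we have created another example explaining how to specify a range of values. This time, we have used the vmin and vmax parameters, which set the lower and upper limits of the color scale. Unlike mincnt, they do not remove any hexagons: counts below vmin or above vmax are drawn with the colors at the two ends of the colormap.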

```python
import matplotlib.pyplot as plt

fig = plt.figure(figsize=(10,8))
ax = fig.add_subplot(111)

# vmin/vmax limit the color scale to counts between 1 and 5
plt.hexbin(x=wine_df["alcohol"], y=wine_df["malic_acid"],
           linewidth=1.5, edgecolor="white",
           vmin=1, vmax=5,
           gridsize=(15,10), cmap="magma"
          );

plt.colorbar();

ax.spines[["bottom", "top", "left", "right"]].set_visible(False);

plt.xlabel("Alcohol", fontsize=16, fontweight="bold")
plt.ylabel("Malic Acid", fontsize=16, fontweight="bold")
plt.title("Alcohol vs Malic Acid Hexbin Chart", loc="left", pad=10, fontsize=25, fontweight="bold");
```
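To illustrate that vmin and vmax change only the color mapping and not the binned data, the sketch below (again on synthetic data, which is an assumption for the sake of a standalone example) reads back the color limits and the underlying counts from the returned collection.

```python
import matplotlib
matplotlib.use("Agg")  # optional: non-interactive backend, no display needed
import matplotlib.pyplot as plt
import numpy as np

# Synthetic stand-in data (not the wine data)
rng = np.random.default_rng(3)
x = rng.normal(size=2000)
y = rng.normal(size=2000)

fig, ax = plt.subplots(figsize=(10, 8))
hb = ax.hexbin(x, y, gridsize=(15, 10), cmap="magma", vmin=1, vmax=5)
fig.colorbar(hb, ax=ax)

counts = hb.get_array()
print("color limits :", hb.get_clim())
print("largest count:", counts.max())
print("total points :", int(counts.sum()))
```

The largest count is well above vmax here, yet no data is lost; those dense hexagons are simply shown in the top color of the colormap.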

## 4. Distribution of Third Variable

In this section, we have explained how to use a hexbin chart to understand the distribution of values of a third variable for each combination of the two main variables.

Below, we have created a hexbin chart that shows the relationship between the values of alcohol and malic acid, but the hexagon color is based on the values of the column WineType. The third column is specified using the C parameter. By default, the WineType values of the samples that fall into a hexagon are averaged.

```python
import matplotlib.pyplot as plt

fig = plt.figure(figsize=(10,8))
ax = fig.add_subplot(111)

# C supplies a third variable; its values are averaged per hexagon by default
plt.hexbin(x=wine_df["alcohol"], y=wine_df["malic_acid"], C=wine_df["WineType"],
           linewidth=1.5, edgecolor="black",
           gridsize=(15,10), cmap="RdYlGn"
          );

plt.colorbar();

ax.spines[["bottom", "top", "left", "right"]].set_visible(False);

plt.xlabel("Alcohol", fontsize=16, fontweight="bold")
plt.ylabel("Malic Acid", fontsize=16, fontweight="bold")
plt.title("Alcohol vs Malic Acid Hexbin Chart", loc="left", pad=10, fontsize=25, fontweight="bold");
```
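The default averaging is easy to verify on a tiny hand-made example. The sketch below is an illustration under our own assumptions: four made-up samples at two locations, with c playing the role of the WineType column and a coarse gridsize of 2 so that each location gets its own hexagon.

```python
import matplotlib
matplotlib.use("Agg")  # optional: non-interactive backend, no display needed
import matplotlib.pyplot as plt
import numpy as np

# Two locations with two samples each (made-up values)
x = np.array([0.0, 0.0, 1.0, 1.0])
y = np.array([0.0, 0.0, 1.0, 1.0])
c = np.array([0, 2, 2, 2])  # labels of the samples at each location

fig, ax = plt.subplots()
hb = ax.hexbin(x, y, C=c, gridsize=2, cmap="RdYlGn")

# One value per non-empty hexagon: the mean of the labels inside it
bin_values = sorted(hb.get_array().tolist())
print(bin_values)  # [1.0, 2.0]
```

Note that the value 1.0 comes from mixing labels 0 and 2, not from samples of type 1, so the mean of a categorical column such as WineType should be read with care.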

As we said in the previous example, the values provided through the C parameter are averaged by default. But what if you want to apply some other function to the values, such as minimum, maximum, or standard deviation? We can do that using the reduce_C_function parameter. We need to provide it with a function that takes a list of values and returns a single number. The function is applied to all WineType values of the samples that fall into a particular hexagon, i.e. that match a particular combination of alcohol and malic acid values.

```python
import matplotlib.pyplot as plt
import numpy as np

fig = plt.figure(figsize=(10,8))
ax = fig.add_subplot(111)

# reduce_C_function=np.max colors each hexagon by the largest WineType in it
plt.hexbin(x=wine_df["alcohol"], y=wine_df["malic_acid"], C=wine_df["WineType"],
           linewidth=1.5, edgecolor="black", reduce_C_function=np.max,
           gridsize=(15,10), cmap="RdYlGn"
          );

plt.colorbar();

ax.spines[["bottom", "top", "left", "right"]].set_visible(False);

plt.xlabel("Alcohol", fontsize=16, fontweight="bold")
plt.ylabel("Malic Acid", fontsize=16, fontweight="bold")
plt.title("Alcohol vs Malic Acid Hexbin Chart", loc="left", pad=10, fontsize=25, fontweight="bold");
```
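Any callable that reduces a group of values to a single number can be passed as reduce_C_function. As a rough sketch, the example below compares np.max, np.std, and a small helper of our own, n_distinct (a hypothetical name, not part of matplotlib), which counts how many different labels fall into a hexagon. The data is made up so that the results are easy to check by hand.

```python
import matplotlib
matplotlib.use("Agg")  # optional: non-interactive backend, no display needed
import matplotlib.pyplot as plt
import numpy as np

# Three samples at one location, two at another (made-up values)
x = np.array([0.0, 0.0, 0.0, 1.0, 1.0])
y = np.array([0.0, 0.0, 0.0, 1.0, 1.0])
c = np.array([0, 1, 2, 1, 1])

def n_distinct(values):
    """Number of distinct labels among the samples in one hexagon."""
    return len(np.unique(values))

reducers = [("max", np.max), ("std", np.std), ("n_distinct", n_distinct)]
fig, axes = plt.subplots(1, 3, figsize=(15, 4))

results = {}
for ax, (name, func) in zip(axes, reducers):
    hb = ax.hexbin(x, y, C=c, gridsize=2, cmap="RdYlGn",
                   reduce_C_function=func)
    ax.set_title(name)
    results[name] = sorted(hb.get_array().tolist())

print(results)
```

A reducer like n_distinct can be a more natural choice than the mean for a categorical column, since it highlights hexagons where several wine types overlap.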

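## Summary

• plt.hexbin() - Hexbin is a 2D histogram plot in which the bins are hexagons. By default, the color represents the number of data points within each bin; when the C parameter is given, the color represents the reduced value (mean by default) of C within each bin.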

Sunny Solanki
