Updated On : Jan-09,2023 Time Investment : ~20 mins

# Waterfall Chart using Matplotlib | Python

Waterfall charts are commonly used to understand how an initial value (for example, net revenue) is affected by a series of positive and negative values. They help us understand the cumulative effect of those changes.

The most common use cases of waterfall charts are financial analysis and quantitative analysis (inventory analysis and performance analysis).

In this tutorial, we explain how to create waterfall charts using the Python library "matplotlib". The tutorial first builds a simple waterfall chart and then improves its look and feel. Apart from regular waterfall charts with vertical bars, it also covers waterfall charts with horizontal bars. All charts are created using the "pyplot" API of matplotlib.

If you are new to matplotlib and want to learn it from the basics, please feel free to check our detailed tutorial on it.

Below, we have imported matplotlib and printed the version that we have used in our tutorial.

```python
import matplotlib

print("Matplotlib Version : {}".format(matplotlib.__version__))
```
```
Matplotlib Version : 3.5.3
```

## Chart 1

In this section, we have explained our first waterfall chart.

Below, we have created a sample dataset that we'll use for our purpose. The dataset has 3 types of entries:

• Positive values (earnings/revenues)
• Zeros (0) - placeholders marking where we want a bar showing the cumulative total of all previous values
• Negative values (Cost / Purchases)

```python
import pandas as pd

labels = ["Sales", "Consulting", "Net Revenue", "Purchases", "Other Expenses", "Profit"]
values = [60000, 80000, 0, -40000, -30000, 0]

df = pd.DataFrame({"Labels": labels, "Vals": values})

df
```
| | Labels | Vals |
|---|---|---|
| 0 | Sales | 60000 |
| 1 | Consulting | 80000 |
| 2 | Net Revenue | 0 |
| 3 | Purchases | -40000 |
| 4 | Other Expenses | -30000 |
| 5 | Profit | 0 |

Below, we have calculated cumulative values and then adjusted them for rows where the original value is negative. For a negative row, the adjusted value is the running total before the deduction, i.e. the top of that bar. These adjusted cumulative values are used later in the bottom/height logic and for placing text annotations.

Apart from this, we have also added a color column to the dataframe specifying the bar colors of the waterfall chart. Green is used for positive values, red for negative values, and dodgerblue for the bars that show the cumulative total of all previous values.

```python
# Running total of all entries
df["Cumulative"] = df["Vals"].cumsum()
# For negative entries, store the level before the deduction (top of the bar)
df["Cumulative"] = [cum-val if val<0 else cum for cum, val in df[["Cumulative", "Vals"]].values]

# Bar colors
df["Color"] = ["green" if val>0 else "red" if val<0 else "dodgerblue" for val in df["Vals"]]

df
```
| | Labels | Vals | Cumulative | Color |
|---|---|---|---|---|
| 0 | Sales | 60000 | 60000 | green |
| 1 | Consulting | 80000 | 140000 | green |
| 2 | Net Revenue | 0 | 140000 | dodgerblue |
| 3 | Purchases | -40000 | 140000 | red |
| 4 | Other Expenses | -30000 | 100000 | red |
| 5 | Profit | 0 | 70000 | dodgerblue |
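
The adjustment described above can also be written without a list comprehension. The snippet below is only a sketch of an equivalent formulation using `Series.where()`; the variable names `running_total` and `label_level` are ours and not part of the tutorial's code.

```python
import pandas as pd

values = [60000, 80000, 0, -40000, -30000, 0]
vals = pd.Series(values)

# Plain running total after each entry
running_total = vals.cumsum()

# Keep the running total for non-negative entries; for negative entries
# step back to the level before the deduction (running_total - val)
label_level = running_total.where(vals >= 0, running_total - vals)

print(label_level.tolist())
# [60000, 140000, 140000, 140000, 100000, 70000]
```

The printed list matches the "Cumulative" column shown in the table above.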

Below, we have included the logic that calculates the bottom and height values for the bars of our waterfall chart. The logic looks at the current value and the previous value to derive them. The bottom value is the Y-axis position where a bar starts, and the height is how far the bar extends upward from that position.

Once we have calculated the bottom and height values of the bars, we can easily create a waterfall chart.

```python
bottom = [0,]
height = [values[0],]

for i, val in enumerate(values[1:], start=1):
    if val==0: ## Current value equal to 0: total bar starts at 0
        bottom.append(0)
        height.append(df["Cumulative"][i])
    elif val > 0: ## Current value greater than 0
        if values[i-1] >=0:
            bottom.append(df["Cumulative"][i-1])
        else:
            bottom.append(bottom[i-1])
        height.append(val)
    elif val < 0: ## Current value less than 0
        if values[i-1] >=0:
            bottom.append(df["Cumulative"][i-1]+val)
        else:
            bottom.append(bottom[i-1]+val)
        height.append(-val)

df["Bottom"] = bottom
df["Height"] = height

df
```
| | Labels | Vals | Cumulative | Color | Bottom | Height |
|---|---|---|---|---|---|---|
| 0 | Sales | 60000 | 60000 | green | 0 | 60000 |
| 1 | Consulting | 80000 | 140000 | green | 60000 | 80000 |
| 2 | Net Revenue | 0 | 140000 | dodgerblue | 0 | 140000 |
| 3 | Purchases | -40000 | 140000 | red | 100000 | 40000 |
| 4 | Other Expenses | -30000 | 100000 | red | 70000 | 30000 |
| 5 | Profit | 0 | 70000 | dodgerblue | 0 | 70000 |
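
For comparison, the same bottom and height values can be derived from a single running total instead of branching on the previous value. This is a sketch of an alternative formulation, not the tutorial's method; the function name `waterfall_geometry` is hypothetical, and it assumes (as in our dataset) that a value of 0 marks a total bar.

```python
def waterfall_geometry(values):
    """Return (bottoms, heights) for a list of waterfall entries.

    Assumption: a 0 entry marks a total bar drawn from 0 up to the
    running total; any other entry is a delta applied to that total.
    """
    bottoms, heights = [], []
    running = 0
    for val in values:
        if val == 0:
            # Total bar: starts at the axis and reaches the running total
            bottoms.append(0)
            heights.append(running)
        else:
            # Delta bar: spans the levels before and after the change
            start, end = running, running + val
            bottoms.append(min(start, end))
            heights.append(abs(val))
            running = end
    return bottoms, heights


bottoms, heights = waterfall_geometry([60000, 80000, 0, -40000, -30000, 0])
print(bottoms)   # [0, 60000, 0, 100000, 70000, 0]
print(heights)   # [60000, 80000, 140000, 40000, 30000, 70000]
```

Both lists agree with the "Bottom" and "Height" columns in the table above.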

Below, we have created our first waterfall chart using various columns of our modified dataset.

First of all, we have created a figure object.

Then, we have plotted bars of waterfall chart using height, bottom, and color columns of our dataframe.

Then, we have modified X and Y axes' tick labels.

After that, we have added a text annotation on top of each bar.

At last, we have added the X-axis label, Y-axis label, and title of the chart.

Some waterfall charts also draw a line connecting the bars. If you want one, you can uncomment the call to the plt.step() method.

```python
import matplotlib.pyplot as plt

fig = plt.figure(figsize=(12,7))

plt.bar(x=df.index, height=df["Height"], bottom=df["Bottom"], color=df["Color"]);
#plt.step(df.index, df["Cumulative"], where="mid", color="black");

# Raw strings (r"...") keep the backslash so matplotlib renders "\$" as a literal dollar sign
plt.xticks(df.index, df["Labels"], fontdict=dict(fontsize=14));
plt.yticks(range(0, 160001, 20000), [r"{:,} \$".format(val) for val in range(0, 160001, 20000)],
           fontdict=dict(fontsize=14)
          );

# Annotate each bar with its own value, or with the cumulative total for the 0 entries
for idx in range(len(df)):
    plt.text(x=df.index[idx], y=df["Cumulative"][idx],
             s=r"{:,} \$".format(df["Vals"][idx] if df["Vals"][idx]!=0 else df["Cumulative"][idx]),
             ha="center", va="bottom", fontdict=dict(fontsize=16)
            );

plt.xlabel("Earnings/Purchases", fontdict=dict(fontsize=16, fontweight="bold"))
plt.ylabel(r"Cost (\$)", fontdict=dict(fontsize=16, fontweight="bold"))
plt.title("WaterFall Chart", loc="left", pad=10, fontdict=dict(fontsize=20, fontweight="bold"));
```
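
As an alternative to the commented-out plt.step() call, connector lines can be drawn as short horizontal segments at the level reached after each bar. The sketch below is one possible way to do it, not part of the original chart code: it hard-codes the Chart 1 values from the tables above, uses `Axes.hlines()`, assumes the default bar width of 0.8, and selects the "Agg" backend only so that it runs without a display (drop that line in a notebook).

```python
from itertools import accumulate

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt

values = [60000, 80000, 0, -40000, -30000, 0]
bottom = [0, 60000, 0, 100000, 70000, 0]
height = [60000, 80000, 140000, 40000, 30000, 70000]
colors = ["green", "green", "dodgerblue", "red", "red", "dodgerblue"]
positions = list(range(len(values)))

# Level reached after each bar: the plain running total
level_after = list(accumulate(values))

fig, ax = plt.subplots(figsize=(12, 7))
ax.bar(x=positions, height=height, bottom=bottom, color=colors)

# One segment per pair of neighbouring bars, from the left edge of bar i
# to the right edge of bar i+1 (bar edges sit at position -/+ 0.4)
connectors = ax.hlines(
    y=level_after[:-1],
    xmin=[i - 0.4 for i in positions[:-1]],
    xmax=[i + 1.4 for i in positions[:-1]],
    colors="black", linewidth=1,
)
```

Each segment sits at the level where one bar ends and the next begins, so it joins the two bars without a separate step curve.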

### Add Theme to Improve Look of Chart

Our previous chart uses the default matplotlib theme, which is not that attractive. Matplotlib lets us apply themes (style sheets) modeled on well-known libraries and blogs with just one line of code.

Below, we have used the "fivethirtyeight" theme, which is based on the FiveThirtyEight website.

Now, we can see that the look of our chart has improved. The theme modifies things like the chart background, label fonts, and so on. It even adds a grid to the chart.

If you want to learn more about improving the look and feel of your matplotlib charts with themes, please feel free to check our video tutorial, where we have covered the topic in detail.

```python
plt.style.use("fivethirtyeight");
```
```python
import matplotlib.pyplot as plt

fig = plt.figure(figsize=(12,7))

plt.bar(x=df.index, height=df["Height"], bottom=df["Bottom"], color=df["Color"]);
#plt.step(df.index, df["Cumulative"], where="mid", color="black");

# Raw strings (r"...") keep the backslash so matplotlib renders "\$" as a literal dollar sign
plt.xticks(df.index, df["Labels"], fontdict=dict(fontsize=14));
plt.yticks(range(0, 160001, 20000), [r"{:,} \$".format(val) for val in range(0, 160001, 20000)],
           fontdict=dict(fontsize=14)
          );

# Annotate each bar with its own value, or with the cumulative total for the 0 entries
for idx in range(len(df)):
    plt.text(x=df.index[idx], y=df["Cumulative"][idx],
             s=r"{:,} \$".format(df["Vals"][idx] if df["Vals"][idx]!=0 else df["Cumulative"][idx]),
             ha="center", va="bottom", fontdict=dict(fontsize=16)
            );

plt.xlabel("Earnings/Purchases", fontdict=dict(fontsize=16, fontweight="bold"))
plt.ylabel(r"Cost (\$)", fontdict=dict(fontsize=16, fontweight="bold"))
plt.title("WaterFall Chart", loc="left", pad=10, fontdict=dict(fontsize=20, fontweight="bold"));
```
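
Note that plt.style.use() changes the theme for every chart created afterwards. If you only want the theme for a single chart, matplotlib also provides the `plt.style.context()` context manager, and `plt.style.available` lists the installed theme names. The snippet below is a small sketch with two made-up bars; the "Agg" backend line is only there so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt

# Names of all themes that ship with the installed matplotlib
print("fivethirtyeight" in plt.style.available)

# The theme applies only to charts created inside the with-block
with plt.style.context("fivethirtyeight"):
    fig, ax = plt.subplots(figsize=(6, 4))
    ax.bar(["Sales", "Consulting"], [60000, 80000], color="green")
    ax.set_title("Themed chart")

# Charts created here use whatever style was active before the block
```

To return to the default look after a global plt.style.use() call, `plt.style.use("default")` can be used.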

## Chart 2

In our previous waterfall chart, all green bars came first and were followed by all red bars; they did not alternate. There can be situations where red and green bars are interleaved.

Hence, we have created one more example to check our logic for calculating height and bottom values.

Below, we have created a new dataset and calculated cumulative values for it. We have also added a color column for the color of the bars.

```python
import pandas as pd

labels = ["Q1", "Q2", "Q3", "Q4", "Total", "Q1", "Q2", "Q3", "Q4", "Total"]
values = [60000, 80000, -40000, 30000, 0, -30000, 80000, -40000, 30000, 0]

df = pd.DataFrame({"Labels": labels, "Vals": values})

df["Cumulative"] = df["Vals"].cumsum()
df["Cumulative"] = [cum-val if val<0 else cum for cum, val in df[["Cumulative", "Vals"]].values]

df["Color"] = ["green" if val>0 else "red" if val<0 else "dodgerblue" for val in df["Vals"]]

df
```
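| | Labels | Vals | Cumulative | Color |
|---|---|---|---|---|
| 0 | Q1 | 60000 | 60000 | green |
| 1 | Q2 | 80000 | 140000 | green |
| 2 | Q3 | -40000 | 140000 | red |
| 3 | Q4 | 30000 | 130000 | green |
| 4 | Total | 0 | 130000 | dodgerblue |
| 5 | Q1 | -30000 | 130000 | red |
| 6 | Q2 | 80000 | 180000 | green |
| 7 | Q3 | -40000 | 180000 | red |
| 8 | Q4 | 30000 | 170000 | green |
| 9 | Total | 0 | 170000 | dodgerblue |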

Below, we have used exactly the same logic as in our first chart to calculate the bottom and height values.

```python
bottom = [0,]
height = [values[0],]

for i, val in enumerate(values[1:], start=1):
    if val==0: ## Total bar starts at 0
        bottom.append(0)
        height.append(df["Cumulative"][i])
    elif val > 0:
        if values[i-1] >=0:
            bottom.append(df["Cumulative"][i-1])
        else:
            bottom.append(bottom[i-1])
        height.append(val)
    elif val < 0:
        if values[i-1] >=0:
            bottom.append(df["Cumulative"][i-1]+val)
        else:
            bottom.append(bottom[i-1]+val)
        height.append(-val)

df["Bottom"] = bottom
df["Height"] = height

df
```
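| | Labels | Vals | Cumulative | Color | Bottom | Height |
|---|---|---|---|---|---|---|
| 0 | Q1 | 60000 | 60000 | green | 0 | 60000 |
| 1 | Q2 | 80000 | 140000 | green | 60000 | 80000 |
| 2 | Q3 | -40000 | 140000 | red | 100000 | 40000 |
| 3 | Q4 | 30000 | 130000 | green | 100000 | 30000 |
| 4 | Total | 0 | 130000 | dodgerblue | 0 | 130000 |
| 5 | Q1 | -30000 | 130000 | red | 100000 | 30000 |
| 6 | Q2 | 80000 | 180000 | green | 100000 | 80000 |
| 7 | Q3 | -40000 | 180000 | red | 140000 | 40000 |
| 8 | Q4 | 30000 | 170000 | green | 140000 | 30000 |
| 9 | Total | 0 | 170000 | dodgerblue | 0 | 170000 |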
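
Since this dataset exists to test the logic, the result can also be checked programmatically rather than only by eye. The helper below is a hypothetical sanity check of our own, not part of the tutorial's code: it walks through the values with a running total and confirms that every bar starts and ends at the expected levels. The lists are copied from the table above.

```python
values = [60000, 80000, -40000, 30000, 0, -30000, 80000, -40000, 30000, 0]
bottom = [0, 60000, 100000, 100000, 0, 100000, 100000, 140000, 140000, 0]
height = [60000, 80000, 40000, 30000, 130000, 30000, 80000, 40000, 30000, 170000]


def first_bad_bar(values, bottom, height):
    """Return the index of the first inconsistent bar, or None if all are fine."""
    running = 0
    for i, val in enumerate(values):
        top = bottom[i] + height[i]
        if val == 0:
            # Total bar: from 0 up to the running total
            ok = bottom[i] == 0 and top == running
        elif val > 0:
            # Increase: starts at the previous level and rises by val
            ok = bottom[i] == running and top == running + val
        else:
            # Decrease: hangs down from the previous level by abs(val)
            ok = top == running and bottom[i] == running + val
        if not ok:
            return i
        running += val
    return None


print(first_bad_bar(values, bottom, height))  # None
```

A result of None means every bar in the table lines up with the running total.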

The code below creates our second waterfall chart using the new dataset. We can see that our logic for calculating bottom and height values works for interleaved bars as well. The code is the same as for our previous chart, apart from a larger figure size and a wider Y-axis tick range.

```python
import matplotlib.pyplot as plt

fig = plt.figure(figsize=(15,8))

plt.bar(x=df.index, height=df["Height"], bottom=df["Bottom"], color=df["Color"]);
#plt.step(df.index, df["Cumulative"], where="mid", color="black");

# Raw strings (r"...") keep the backslash so matplotlib renders "\$" as a literal dollar sign
plt.xticks(df.index, df["Labels"], fontdict=dict(fontsize=14));
plt.yticks(range(0, 220001, 20000), [r"{:,} \$".format(val) for val in range(0, 220001, 20000)],
           fontdict=dict(fontsize=14)
          );

# Annotate each bar with its own value, or with the cumulative total for the 0 entries
for idx in range(len(df)):
    plt.text(x=df.index[idx], y=df["Cumulative"][idx],
             s=r"{:,} \$".format(df["Vals"][idx] if df["Vals"][idx]!=0 else df["Cumulative"][idx]),
             ha="center", va="bottom", fontdict=dict(fontsize=16)
            );

plt.xlabel("Earnings/Purchases", fontdict=dict(fontsize=16, fontweight="bold"))
plt.ylabel(r"Cost (\$)", fontdict=dict(fontsize=16, fontweight="bold"))
plt.title("WaterFall Chart", loc="left", pad=10, fontdict=dict(fontsize=20, fontweight="bold"));
```
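
The tick range in plt.yticks() had to be widened by hand for this dataset. One way to avoid that is to let matplotlib place and format the ticks through `matplotlib.ticker`. The snippet below is a sketch of that idea using the Axes API with two made-up bars; the `dollars` helper is a hypothetical name of ours, and the "Agg" backend line is only there so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
from matplotlib.ticker import FuncFormatter, MultipleLocator


def dollars(value, pos=None):
    """Format a tick value as e.g. '140,000 $' (the dollar sign is escaped for matplotlib)."""
    return r"{:,.0f} \$".format(value)


fig, ax = plt.subplots(figsize=(6, 4))
ax.bar(x=[0, 1], height=[60000, 80000], bottom=[0, 60000], color="green")

# A tick every 20,000 units, whatever range the data ends up covering
ax.yaxis.set_major_locator(MultipleLocator(20000))
# Every tick label is produced by the dollars() helper
ax.yaxis.set_major_formatter(FuncFormatter(dollars))
```

With this setup the tick labels follow the data, so the same code works for both datasets without editing a hard-coded range.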

## Chart 3

In this section, we have created a variation of the waterfall chart in which the bars are laid out horizontally. We have done this by using the barh() method instead of the bar() method.

The code is almost the same as our previous example, with the method name and its parameters changed (width and left instead of height and bottom). We have also swapped the labels and ticks of the two axes.

```python
import matplotlib.pyplot as plt

fig = plt.figure(figsize=(12,10))

plt.barh(y=df.index, width=df["Height"], left=df["Bottom"], color=df["Color"]);
#plt.step(df.index, df["Cumulative"], where="mid", color="black");

# Raw strings (r"...") keep the backslash so matplotlib renders "\$" as a literal dollar sign
plt.yticks(df.index, df["Labels"], fontdict=dict(fontsize=14));
plt.xticks(range(0, 220001, 20000), [r"{:,} \$".format(val) for val in range(0, 220001, 20000)],
           fontdict=dict(fontsize=14)
          );

# Annotate each bar with its own value, or with the cumulative total for the 0 entries
for idx in range(len(df)):
    plt.text(y=df.index[idx], x=df["Cumulative"][idx],
             s=r"{:,} \$".format(df["Vals"][idx] if df["Vals"][idx]!=0 else df["Cumulative"][idx]),
             ha="right", va="center", fontdict=dict(fontsize=16)
            );

plt.ylabel("Earnings/Purchases", fontdict=dict(fontsize=16, fontweight="bold"))
plt.xlabel(r"Cost (\$)", fontdict=dict(fontsize=16, fontweight="bold"))
plt.title("WaterFall Chart", loc="left", pad=10, fontdict=dict(fontsize=20, fontweight="bold"));
```

## Chart 4

Below, we have created our fourth and last waterfall chart. Like the previous example, it uses horizontal bars.

But there is one change.

In our previous example, the bars started from the bottom of the chart, whereas in this example they start from the top. We have done this by reversing the order of the values.

```python
import matplotlib.pyplot as plt

fig = plt.figure(figsize=(12,10))

# Reverse the row order once so the first entry is drawn at the top of the chart
rev = df[::-1].reset_index(drop=True)

plt.barh(y=rev.index, width=rev["Height"], left=rev["Bottom"], color=rev["Color"]);
#plt.step(df.index, df["Cumulative"], where="mid", color="black");

# Raw strings (r"...") keep the backslash so matplotlib renders "\$" as a literal dollar sign
plt.yticks(rev.index, rev["Labels"], fontdict=dict(fontsize=14));
plt.xticks(range(0, 220001, 20000), [r"{:,} \$".format(val) for val in range(0, 220001, 20000)],
           fontdict=dict(fontsize=14)
          );

# Annotate each bar with its own value, or with the cumulative total for the 0 entries
for idx in range(len(rev)):
    plt.text(y=rev.index[idx], x=rev["Cumulative"][idx],
             s=r"{:,} \$".format(rev["Vals"][idx] if rev["Vals"][idx]!=0 else rev["Cumulative"][idx]),
             ha="right", va="center", fontdict=dict(fontsize=16)
            );

plt.ylabel("Earnings/Purchases", fontdict=dict(fontsize=16, fontweight="bold"))
plt.xlabel(r"Cost (\$)", fontdict=dict(fontsize=16, fontweight="bold"))
plt.title("WaterFall Chart", loc="left", pad=10, fontdict=dict(fontsize=20, fontweight="bold"));
```
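
Another way to get the first entry at the top, without reversing any data, is to flip the Y axis itself with `Axes.invert_yaxis()`. The snippet below is a sketch of that alternative, not the approach used above; it hard-codes the first five rows of the Chart 2 table and selects the "Agg" backend only so that it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt

# First five rows of the Chart 2 dataset, in their original order
labels = ["Q1", "Q2", "Q3", "Q4", "Total"]
bottom = [0, 60000, 100000, 100000, 0]
height = [60000, 80000, 40000, 30000, 130000]
colors = ["green", "green", "red", "green", "dodgerblue"]
positions = list(range(len(labels)))

fig, ax = plt.subplots(figsize=(12, 6))
ax.barh(y=positions, width=height, left=bottom, color=colors)
ax.set_yticks(positions)
ax.set_yticklabels(labels)

# Flip the axis so position 0 (the first entry) is drawn at the top
ax.invert_yaxis()
```

Because the data stays in its original order, annotations and tick labels can keep using the same indices as in Chart 3.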

This ends our small tutorial explaining how to create waterfall charts using matplotlib, a popular Python data visualization library.

Sunny Solanki
