**Numba** is a widely used library for speeding up computations in Python. It lets us accelerate our code simply by decorating functions with one of the decorators it provides; the speed-up is then handled behind the scenes without any extra work from the developer. Most of the time, numba-decorated functions run considerably faster than plain Python functions, and numba is designed to speed up numpy code as well.

Though **Numba** can speed up numpy code, it does not directly speed up code involving pandas, the most commonly used data manipulation library, which is built on top of numpy. We have already created a tutorial introducing numba's **@jit** decorator, where we discussed that numba cannot speed up operations performed directly on pandas dataframes.

If you want to check our tutorial on the **numba @jit** decorator, please feel free to do so from the link below.

We have created this tutorial to guide developers on how to use numba to speed up code involving pandas dataframes. As a part of this tutorial, we'll explain with examples various ways to speed up pandas operations. There are basically two ways to do so, which we have listed below.

- **Using 'numba' Engine Available for Selected Pandas Methods** - There are selected methods (**rolling()**, **groupby()**, etc.) in pandas that work on a batch of values at a time. These methods let us provide an argument named **engine** which, if set to **'numba'**, will speed up operations using **Numba** behind the scenes.
- **Create Custom Numba Functions to Work with Pandas DataFrame** - We can jit-decorate functions for working with pandas dataframes. We need to design the jit-decorated functions so that they work on numpy arrays or Python loops, retrieve numpy arrays from our pandas dataframes, and give those arrays as input to the jit-decorated functions. We can also create custom **numba** functions to replace commonly used pandas operations (like **mean()**, **std()**, etc.), using the various decorators available from **numba** to speed up their code. These approaches show performance improvements when the data is large (generally > **1M** entries).

Below we have highlighted the important sections of our tutorial to give an overview of the material covered.

- Using 'numba' Engine Available for Selected Pandas Methods
  - Example 1: Trying Various Engines with Pandas Series
  - Example 2: Trying Various Engines with Numpy Arrays
  - Example 3: Giving Arguments for Numba Engine
  - Example 4: Trying Custom Functions
- Create Custom Numba Functions to Work with Pandas DataFrame
  - Example 1: Decorate Functions with Simply @jit Decorator
  - Example 2: Strict nopython Mode (@jit(nopython=True) | @njit)
  - Example 3: Provide DataType for Further Speed Up
  - Example 4: Introduce Python Loops
  - Example 5: Try to Replace Existing Pandas DataFrame Functions with Custom Jit-Decorated Functions
  - Example 6: Try to Vectorize Functions using @vectorize Decorator for Further Speed Up

We'll now explain the two ways of speeding up pandas code described above with simple examples. We have imported the necessary libraries below to get started.

In [1]:

```
import pandas as pd
print("Pandas Version : {}".format(pd.__version__))
```

In [2]:

```
import numpy as np
```

Below we have created a dataframe with random data that we'll be using in our examples. The dataframe has 5 columns of random floats and one column of categorical values.

In [3]:

```
np.random.seed(123)
data = np.random.rand(int(1e5),5)
df = pd.DataFrame(data=data, columns=list("ABCDE"))
df["Type"] = np.random.choice(["Class1","Class2","Class3","Class4","Class5"], size=(len(df)))
df
```

Out[3]:

In this section, we'll explain the pandas dataframe methods that let us use **numba** for some operations. Pandas generally lets us use **numba** with methods that work on a batch of values at a time, like **groupby()**, **rolling()**, etc. These methods collect entries of the main dataframe into groups or windows and then apply various aggregate functions to them. We can instruct them to use **numba** for these aggregate operations by setting the **engine** argument to **'numba'**.

Below we have first created a rolling dataframe with a window size of 1000. We can call various aggregate functions on this rolled dataframe to find rolling statistics. The majority of the functions we can call on the rolled dataframe accept an **engine** argument which can be set to **'numba'**.

In the next cell below, we have grouped entries of the dataframe based on the **Type** column. There are two methods (**transform()** and **agg()/aggregate()**) that work on grouped dataframes and accept the **engine** argument.

We'll be using these rolled and grouped dataframes in our examples.

In [6]:

```
rolling_df = df.rolling(1000)
rolling_df
```

Out[6]:

In [7]:

```
grouped_by_types = df.groupby("Type")
grouped_by_types
```

Out[7]:

In our first example, we are simply calling the **mean()** function on the rolled dataframe to calculate the rolling average. We have called **mean()** with various arguments: without any argument, with **engine** set to **'cython'**, and with **engine** set to **'numba'**.

**Cython** is a compiled superset of Python that generally runs faster than the standard Python implementation.

When we provide **engine='numba'**, the function uses numba to speed up operations behind the scenes. It's not guaranteed that **engine='numba'** will always improve performance; we need to benchmark it to check.

We are using the jupyter notebook magic command **%time** to measure the time taken by a particular statement; we'll use it in all our examples. If you are interested in learning about the various magic commands available in jupyter notebooks, please feel free to check our tutorial on the topic, which covers the majority of them.

In [11]:

```
%time out = rolling_df.mean()
%time out = rolling_df.mean(engine='cython')
%time out = rolling_df.mean(engine='numba')
%time out = rolling_df.mean(engine='numba')
```

In this section, we have again called the **mean()** function on our rolling dataframe just like in the previous example, with one difference: we have provided **raw=True**, which makes the underlying function receive numpy arrays instead of pandas series. If we don't provide **raw=True**, column values are passed as pandas series. We set **raw=True** because **numba** does well with functions that operate on numpy arrays.

In [7]:

```
%time out = rolling_df.mean(raw=True)
%time out = rolling_df.mean(engine='cython', raw=True)
%time out = rolling_df.mean(engine='numba', raw=True)
%time out = rolling_df.mean(engine='numba', raw=True)
```

In the cell below, we have called the **std()** function on our rolled dataframe to calculate the rolling standard deviation, measuring the time taken by each call. We can notice that numba seems to be doing better this time, though the difference is not very noticeable.

In [9]:

```
%time out = rolling_df.std(raw=True)
%time out = rolling_df.std(engine='cython', raw=True)
%time out = rolling_df.std(engine='numba', raw=True)
%time out = rolling_df.std(engine='numba', raw=True)
```

The methods that accept the **engine='numba'** argument also let us specify the various arguments that we generally provide to the **numba @jit** decorator. The common **@jit** arguments are **nopython**, **nogil**, **cache**, and **parallel**.

In the cell below, we have calculated the standard deviation on our rolled dataframe, providing different arguments for the **numba** engine. We can notice that the **numba** engine seems to be doing better this time compared to the normal call.

If you want to know in detail about these **numba @jit** decorator arguments, please feel free to check our tutorial on them, which covers all arguments in detail with examples.

In [8]:

```
%time out = rolling_df.std(raw=True)
%time out = rolling_df.std(engine='cython', raw=True)
%time out = rolling_df.std(engine='numba', nopython=True, raw=True)
%time out = rolling_df.std(engine='numba', nopython=True, cache=True, raw=True)
%time out = rolling_df.std(engine='numba', nopython=True, cache=True, parallel=True, raw=True)
```

We can also provide custom user-defined functions to perform aggregations that are not available through pandas.

Below we have created a simple custom function that takes an array as input, squares its values, and then takes the mean of the squared values. We'll be using this function as an aggregate function on the rolled dataframe.

In [20]:

```
def custom_mean(x):
    return (x * x).mean()
```

In the cell below, we have called the **apply()** function on our rolled dataframe, asking it to execute the custom mean function we designed in the previous cell. As in our previous examples, we have tried the function with no backend engine, the **cython** engine, and the **numba** engine.

We can notice from the results that numba does a slightly better job compared to the other backend engines.

In [25]:

```
%time out = rolling_df.apply(custom_mean, raw=True)
%time out = rolling_df.apply(custom_mean, engine='cython', raw=True)
%time out = rolling_df.apply(custom_mean, engine='numba', raw=True)
%time out = rolling_df.apply(custom_mean, engine='numba', raw=True)
```

In the cell below, we have created a custom standard deviation function that squares the input array and then calculates the standard deviation of the squared values.

We have then tried this function on our rolled dataframe using **apply()** function. We have also recorded the time taken by each call for comparison purposes.

In [26]:

```
def custom_std(x):
    return (x * x).std()
```

In [27]:

```
%time out = rolling_df.apply(custom_std, raw=True)
%time out = rolling_df.apply(custom_std, engine='cython', raw=True)
%time out = rolling_df.apply(custom_std, engine='numba', raw=True)
%time out = rolling_df.apply(custom_std, engine='numba', raw=True)
```

In the cell below, we have created a function that takes an array of values and an index as input and calculates the mean of the values.

We'll be using this function on our grouped dataframe to calculate the mean of grouped entries. We'll be comparing the time taken by different engines as usual.

Please make a **NOTE** that currently only the **transform()**, **agg()** and **aggregate()** functions support the **engine** argument, which can be set to **'numba'**. The **agg()** and **aggregate()** methods perform the same operation.

In [13]:

```
from numba import jit

def func(values, index):
    return values.mean()

%time out = grouped_by_types.agg('mean')
%time out = grouped_by_types.agg('mean', engine='cython')
%time out = grouped_by_types.agg(func, engine='numba')
%time out = grouped_by_types.agg(func, engine='numba')
```

In this section, we'll be creating **@jit**-decorated functions to work with our pandas dataframe and comparing their performance against non-decorated functions. We'll also try to create functions to replace aggregate functions already provided by the pandas dataframe. Apart from **@jit**, we'll also use the **@vectorize** decorator to speed up operations.

As we said earlier, we'll be retrieving numpy arrays from our pandas dataframe before giving them to **numba** functions, because **numba** works well with numpy arrays and python loops.

Please make a **NOTE** that the performance difference of **numba** functions might not be visible with small arrays; it becomes visible as array size increases. A **numba** function is also compiled the first time it runs, hence the first execution can take more time, but all subsequent executions are much faster because the compiled version is reused from memory.

Below we have again created rolled dataframe and grouped dataframe like our previous section. We'll be trying various **numba** functions on them this time.

In [6]:

```
rolling_df = df.rolling(1000)
```

In [7]:

```
grouped_by_types = df.groupby("Type")
```

As a part of our first example, we have created two functions that perform the same operation on the input array, one of them decorated with the **numba @jit** decorator. The functions take an array as input, square its values, and then calculate the mean of the squared values.

Then in the next cell, we have tried these functions on our rolled dataframe using the **apply()** function. We have called **apply()** more than once with different backend engines (None, **cython**, and **numba**), like in our previous examples, and recorded the time taken by each execution.

We can notice from the results that the **@jit**-decorated function takes less time compared to the normal, non-decorated function.

In [16]:

```
from numba import jit, njit, vectorize, float64

def custom_mean(x):
    return (x * x).mean()

@jit(cache=True)
def custom_mean_jitted(x):
    return (x * x).mean()
```

In [17]:

```
%time out = rolling_df.apply(custom_mean, raw=True)
%time out = rolling_df.apply(custom_mean_jitted, raw=True)
%time out = rolling_df.apply(custom_mean, engine='cython', raw=True)
%time out = rolling_df.apply(custom_mean_jitted, engine='cython', raw=True)
%time out = rolling_df.apply(custom_mean, engine='numba', raw=True)
%time out = rolling_df.apply(custom_mean_jitted, engine='numba', raw=True)
```

Our code for this example is almost exactly the same as in the previous example, with one minor change: we are using the **@njit** decorator instead of **@jit**. The **@njit** decorator compiles the function in **numba**'s pure **nopython** mode, which is generally faster. We can also force **nopython** mode with the **@jit** decorator by providing the **nopython=True** argument.

If you want to know about **numba nopython** mode then please feel free to check our tutorial that covers it.

We have then executed these functions on our rolled dataframe using **apply()** function with different backends for comparison purposes.

In [18]:

```
from numba import jit, njit

def custom_mean(x):
    return (x * x).mean()

@njit(cache=True)
def custom_mean_jitted(x):
    return (x * x).mean()
```

In [19]:

```
%time out = rolling_df.apply(custom_mean, raw=True)
%time out = rolling_df.apply(custom_mean_jitted, raw=True)
%time out = rolling_df.apply(custom_mean, engine='cython', raw=True)
%time out = rolling_df.apply(custom_mean_jitted, engine='cython', raw=True)
%time out = rolling_df.apply(custom_mean, engine='numba', raw=True)
%time out = rolling_df.apply(custom_mean_jitted, engine='numba', raw=True)
```

We can further speed up our **@jit**-decorated functions by providing input and output data types. **Numba** will then create a compiled version for those datatypes, which can improve performance. Below we have provided **float64** as both the input and output data type of our function.

We have then called these functions on our rolled dataframe using **apply()** method with different backend engines for comparing performance.

In [20]:

```
from numba import jit, njit, float64

def custom_mean(x):
    return (x * x).mean()

@jit(float64(float64[:]), nopython=True, cache=True)
def custom_mean_jitted(x):
    return (x * x).mean()
```

In [21]:

```
%time out = rolling_df.apply(custom_mean, raw=True)
%time out = rolling_df.apply(custom_mean_jitted, raw=True)
%time out = rolling_df.apply(custom_mean, engine='cython', raw=True)
%time out = rolling_df.apply(custom_mean_jitted, engine='cython', raw=True)
%time out = rolling_df.apply(custom_mean, engine='numba', raw=True)
%time out = rolling_df.apply(custom_mean_jitted, engine='numba', raw=True)
```

As we know, **numba** works really well with python loops, so we can also modify our function to use them. In this example, we have modified our **@jit**-decorated function to calculate the mean of the squared values in a loop.

We have then executed these functions on our rolled dataframe with different backend engines to compare performance. We can notice that it seems to be doing a little better compared to our previous examples.

In [22]:

```
from numba import jit, njit, vectorize, float64

def custom_mean(x):
    return (x * x).mean()

@jit(float64(float64[:]), nopython=True, cache=True)
def custom_mean_loops_jitted(x):
    out = 0.0
    for i in x:
        out += (i * i)
    return out / len(x)
```

In [23]:

```
%time out = rolling_df.apply(custom_mean, raw=True)
%time out = rolling_df.apply(custom_mean_loops_jitted, raw=True)
%time out = rolling_df.apply(custom_mean, engine='cython', raw=True)
%time out = rolling_df.apply(custom_mean_loops_jitted, engine='cython', raw=True)
%time out = rolling_df.apply(custom_mean, engine='numba', raw=True)
%time out = rolling_df.apply(custom_mean_loops_jitted, engine='numba', raw=True)
```

In this example, we'll create a custom **@jit** decorated function to replace an existing **mean()** function available from the pandas dataframe.

Below we have first calculated the mean of 5 columns of the dataframe using the in-built **mean()** function and recorded the time taken for the operation.

In [27]:

```
%time out = df[list("ABCDE")].mean()
```

In the cell below, we have designed a function that takes a numpy array as input and calculates its mean. We have **@jit**-decorated the function and also specified input/output data types. We have also provided the **nopython=True** argument to run **numba** in strict **nopython** mode.

In [28]:

```
from numba import jit, njit, vectorize, float64, float32

@jit([float32(float32[:]), float64(float64[:])], nopython=True, cache=True)
def custom_mean(x):
    return x.mean()
```

In the cell below, we have looped through the column names of the dataframe and calculated the mean of each column using our custom mean function, recording the time taken to compute the means of all columns. We can notice that it takes a little less time compared to pandas' in-built function, and we expect this difference to grow with the size of the array and the number of columns.

Please make a **NOTE** that the difference in performance becomes more visible as the array size increases and goes beyond **1M** values.

In [29]:

```
%%time
avg_cols = {}
for col in list("ABCDE"):
    avg_cols[col] = custom_mean(df[col].values)
```

In this section, we'll explain another example where we'll use the **@vectorize** decorator to replace existing pandas functionality.

Below we have taken a column of our pandas dataframe, squared its values, and added the scalar value 2 to them. We have performed this operation by providing a simple function to the **apply()** method, recording the time taken.

In the next cell, we have performed the same computation on the column's underlying numpy array using simple multiplication and addition, again recording the time taken.

In [30]:

```
%time out = df.A.apply(lambda x : x**2 + 2)
```

In [17]:

```
%time out = (df.A.values * df.A.values) + 2
```

In the cell below, we have created a simple function that takes a single value as input, squares it, and adds the scalar value 2. We have then vectorized this function using the **numba @vectorize** decorator. We'll be using it to perform the same operation we performed with pandas' in-built methods in the previous cells.

If you want to know how the **numba @vectorize** decorator works, please feel free to check our tutorial on it from the link below.

In [18]:

```
from numba import vectorize, float32, float64

@vectorize([float32(float32), float64(float64)])
def square(x):
    return x**2 + 2
```

In the cell below, we have called our vectorized function 3 times on the values of a column of our dataframe, recording the time taken each time. We can notice that our vectorized function takes considerably less time compared to pandas' in-built functionality.

Please make a **NOTE** that difference in performance will be more visible as array size increases and goes beyond **1M** values.

In [19]:

```
%time out = square(df["A"].values)
%time out = square(df["A"].values)
%time out = square(df["A"].values)
```

This ends our small tutorial explaining how we can use **numba** when working with pandas dataframes to speed up code involving them. Please feel free to let us know your views in the comments section.

**Thank You** for visiting our website. If you like our work, please support us so that we can keep on creating new tutorials/blogs on interesting topics (like AI, ML, Data Science, Python, Digital Marketing, SEO, etc.) that can help people learn new things faster. You can support us by clicking on the **Coffee** button at the bottom right corner. We would appreciate even if you can give a thumbs-up to our article in the comments section below.

If you want to

- provide some suggestions on a topic
- share your views
- include some details in a tutorial
- suggest new topics on which we should create tutorials/blogs

then please feel free to comment below.

Sunny Solanki
