Share @ LinkedIn Facebook  profiling, line_profiler
line_profiler: Line by Line Profiling of Python Code

line_profiler: Line by Line Profiling of Python Code

The profiling is a process where we analyze time taken by various parts of our code. It can help us better understand the time/space complexity of our code. Python has many profilers like cProfile, profile, line_profiler, memory_profiler, etc which can help us know time/memory taken by our python code. We have already covered tutorials on cProfile and profile which helps us understand how much time is taken by the various functions of our code.

The cProfile and profile provides profiling results on a function basis but don't give us information on line by line basis of function. The results generated by these libraries have time taken by function calls but no information about time taken by individual lines of each function. Python has a library called line_profiler which can help us better understand the time taken by various lines of our code. We'll be covering how to use line_profiler to profile Python code as a part of this tutorial.

Example 1 : kernprof : Command Line/Shell

Our first example of using line_profiler explains how to use it from a command prompt/shell. We have designed a simple python script that has one main_func() function which calls 3 other functions. All three other functions generate 100000 random numbers between 1-100 and then take an average of it. We have manually inserted time wait in all functions so that each takes different time to complete even though all perform the same functionality. We need to decorate each function which we want to profile in script with @profile decorator.

random_number_average.py

import time
import random

@profile
def very_slow_random_generator():
    time.sleep(5)
    arr = [random.randint(1,100) for i in range(100000)]
    return sum(arr) / len(arr)

@profile
def slow_random_generator():
    time.sleep(2)
    arr = [random.randint(1,100) for i in range(100000)]
    return sum(arr) / len(arr)

@profile
def fast_random_generator():
    time.sleep(1)
    arr = [random.randint(1,100) for i in range(100000)]
    return sum(arr) / len(arr)

@profile
def main_func():
    result = fast_random_generator()
    print(result)

    result = slow_random_generator()
    print(result)

    result = very_slow_random_generator()
    print(result)

main_func()

We can profile the script from the command line/shell using the below command. The kernprof command will generate a file named script_name.lprof after it has completed profiling. The profiling results are stored in the .lprof file generated in the same directory.

  • kernprof -l random_number_average.py

We can then call the below line to see the output of profiling by executing the below line.

  • python -m line_profiler random_number_average.py.lprof

Output

Timer unit: 1e-06 s

Total time: 5.35466 s
File: addition.py
Function: very_slow_random_generator at line 4

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
     4                                           @profile
     5                                           def very_slow_random_generator():
     6         1    5005108.0 5005108.0     93.5      time.sleep(5)
     7         1     349107.0 349107.0      6.5      arr = [random.randint(1,100) for i in range(100000)]
     8         1        449.0    449.0      0.0      return sum(arr) / len(arr)

Total time: 2.34696 s
File: addition.py
Function: slow_random_generator at line 10

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    10                                           @profile
    11                                           def slow_random_generator():
    12         1    2002354.0 2002354.0     85.3      time.sleep(2)
    13         1     344160.0 344160.0     14.7      arr = [random.randint(1,100) for i in range(100000)]
    14         1        444.0    444.0      0.0      return sum(arr) / len(arr)

Total time: 1.34553 s
File: addition.py
Function: fast_random_generator at line 16

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    16                                           @profile
    17                                           def fast_random_generator():
    18         1    1001299.0 1001299.0     74.4      time.sleep(1)
    19         1     343781.0 343781.0     25.5      arr = [random.randint(1,100) for i in range(100000)]
    20         1        447.0    447.0      0.0      return sum(arr) / len(arr)

Total time: 9.04754 s
File: addition.py
Function: main_func at line 22

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    22                                           @profile
    23                                           def main_func():
    24         1    1345633.0 1345633.0     14.9      result = fast_random_generator()
    25         1         30.0     30.0      0.0      print(result)
    26
    27         1    2347071.0 2347071.0     25.9      result = slow_random_generator()
    28         1         20.0     20.0      0.0      print(result)
    29
    30         1    5354771.0 5354771.0     59.2      result = very_slow_random_generator()
    31         1         20.0     20.0      0.0      print(result)

The result generated by line_profiler has one table per each function. The table has the same rows as per the number of lines of code into that function.

  • Hits: The first column represents how many times that line was hit inside that function. In our current example hits has a value of 1 but it can be more than one in case of recurrence.
  • Time: The second column represents the time taken by that line in total for all hits.
  • Per Hit: The third column represents time taken per each call of that line.
  • % Time: The fourth column represents % of time taken by that line of total function time.
  • Line Contents: The fifth column represents a line of function.

If we want to generate output to the shell/command line after the script is profiled using kernprof than we can call it using -v option.

  • kernprof -l -v random_number_average.py

Example 2: Profiling From Code Only

As a part of our second example of using line_profiler, we'll explain how we can profile python code by adding few lines of extra code to our script rather than passing it to kernprof from the command line/shell.

Below we have re-written code from our previous example again.

In [31]:
import time
import random

def very_slow_random_generator():
    time.sleep(5)
    arr = [random.randint(1,100) for i in range(100000)]
    return sum(arr) / len(arr)

def slow_random_generator():
    time.sleep(2)
    arr = [random.randint(1,100) for i in range(100000)]
    return sum(arr) / len(arr)

def fast_random_generator():
    time.sleep(1)
    arr = [random.randint(1,100) for i in range(100000)]
    return sum(arr) / len(arr)

def main_func():
    result = fast_random_generator()
    print(result)

    result = slow_random_generator()
    print(result)

    result = very_slow_random_generator()
    print(result)

We then need to create an object of LineProfiler class first. We then need to create a wrapper around main_func() by calling the LineProfiler instance passing it main_func. We can then execute that line profiler wrapper which will execute main_func().

In [33]:
from line_profiler import LineProfiler

lprofiler = LineProfiler()

lp_wrapper = lprofiler(main_func)

lp_wrapper()
50.48633
50.37566
50.55234

Once the line profiler wrapper (main_func) completes, we can call the print_stats() method on the line profiler instance and it'll print the profiling result.

We can see from the below results printed that it only profiled code of main_func. We'll need to make a modification to code a bit if we want to profile other functions as well.

In [34]:
lprofiler.print_stats()
Timer unit: 1e-06 s

Total time: 9.032 s
File: <ipython-input-31-c9c4218f4095>
Function: main_func at line 19

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    19                                           def main_func():
    20         1    1338047.0 1338047.0     14.8      result = fast_random_generator()
    21         1         76.0     76.0      0.0      print(result)
    22
    23         1    2339920.0 2339920.0     25.9      result = slow_random_generator()
    24         1        309.0    309.0      0.0      print(result)
    25
    26         1    5353573.0 5353573.0     59.3      result = very_slow_random_generator()
    27         1         79.0     79.0      0.0      print(result)

Below we are creating the LineProfiler instance again. We are then calling add_function() on it three times passing it the name of three random average generator functions so that it profiles code of them as well. We are then wrapping main_func by calling the line profiler instance. We can then call the line profiler wrapper which will execute main_func() but this time it'll profile code for other three random average generator functions as well along with main_func.

In [36]:
lprofiler = LineProfiler()

lprofiler.add_function(fast_random_generator)
lprofiler.add_function(slow_random_generator)
lprofiler.add_function(very_slow_random_generator)

lp_wrapper = lprofiler(main_func)

lp_wrapper()
50.63049
50.4163
50.39218
In [37]:
lprofiler.print_stats()
Timer unit: 1e-06 s

Total time: 5.34017 s
File: <ipython-input-31-c9c4218f4095>
Function: very_slow_random_generator at line 4

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
     4                                           def very_slow_random_generator():
     5         1    5005293.0 5005293.0     93.7      time.sleep(5)
     6         1     334440.0 334440.0      6.3      arr = [random.randint(1,100) for i in range(100000)]
     7         1        434.0    434.0      0.0      return sum(arr) / len(arr)

Total time: 2.35658 s
File: <ipython-input-31-c9c4218f4095>
Function: slow_random_generator at line 9

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
     9                                           def slow_random_generator():
    10         1    2002075.0 2002075.0     85.0      time.sleep(2)
    11         1     354025.0 354025.0     15.0      arr = [random.randint(1,100) for i in range(100000)]
    12         1        478.0    478.0      0.0      return sum(arr) / len(arr)

Total time: 1.37797 s
File: <ipython-input-31-c9c4218f4095>
Function: fast_random_generator at line 14

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    14                                           def fast_random_generator():
    15         1    1001241.0 1001241.0     72.7      time.sleep(1)
    16         1     376286.0 376286.0     27.3      arr = [random.randint(1,100) for i in range(100000)]
    17         1        446.0    446.0      0.0      return sum(arr) / len(arr)

Total time: 9.07578 s
File: <ipython-input-31-c9c4218f4095>
Function: main_func at line 19

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    19                                           def main_func():
    20         1    1378068.0 1378068.0     15.2      result = fast_random_generator()
    21         1        342.0    342.0      0.0      print(result)
    22
    23         1    2356694.0 2356694.0     26.0      result = slow_random_generator()
    24         1         92.0     92.0      0.0      print(result)
    25
    26         1    5340256.0 5340256.0     58.8      result = very_slow_random_generator()
    27         1        329.0    329.0      0.0      print(result)

Example 3: Jupyter Notebook

Our third example explains how to use line_profiler into a jupyter notebook. We first need to load line_profiler as an external extension into the jupyter notebook using the %load_ext magic command. This will make the %lprun line magic command available to us which we can use to profile Python code.

In [2]:
%load_ext line_profiler

Below we have created a simple function that generated a numpy array of a random number between 1-100 of size 1000x1000. We are then taking an average of an array and returning average. At the start of the function, we are pausing script for 5 seconds as well.

In [1]:
import time
import numpy as np
from concurrent.futures import ProcessPoolExecutor

def very_slow_random_generator():
    time.sleep(5)
    arr1 = np.random.randint(1,100, size=(1000,1000))
    avg = arr1.mean()
    return avg

We can then call the very_slow_random_generator() function by first registering it with the %lprun line magic command and then calling it so that code inside it will be profiled line by line.

In [3]:
%lprun -f very_slow_random_generator very_slow_random_generator()
Out[3]:
49.998236
Timer unit: 1e-06 s

Total time: 5.01748 s
File: <ipython-input-13-8e4f20603e75>
Function: very_slow_random_generator at line 5

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
     5                                           def very_slow_random_generator():
     6         1    5005205.0 5005205.0     99.8      time.sleep(5)
     7         1      10931.0  10931.0      0.2      arr1 = np.random.randint(1,100, size=(1000,1000))
     8         1       1344.0   1344.0      0.0      avg = arr1.mean()
     9         1          1.0      1.0      0.0      return avg

Below we have re-written code from the previous example. We'll try to profile all functions in the below code by using the %lprun line magic command.

In [38]:
import time
import random

def very_slow_random_generator():
    time.sleep(5)
    arr = [random.randint(1,100) for i in range(100000)]
    return sum(arr) / len(arr)

def slow_random_generator():
    time.sleep(2)
    arr = [random.randint(1,100) for i in range(100000)]
    return sum(arr) / len(arr)

def fast_random_generator():
    time.sleep(1)
    arr = [random.randint(1,100) for i in range(100000)]
    return sum(arr) / len(arr)

def main_func():
    result = fast_random_generator()
    print(result)

    result = slow_random_generator()
    print(result)

    result = very_slow_random_generator()
    print(result)

Below we have explained how we can profile all functions using the %lprun line magic command. We first need to register all functions with %lprun with -f option and then calling main_func(). This way it'll profile code of all functions.

In [40]:
%lprun -f main_func -f fast_random_generator -f slow_random_generator -f very_slow_random_generator main_func()
50.5898
50.31312
50.35173
Timer unit: 1e-06 s

Total time: 5.37966 s
File: <ipython-input-38-c9c4218f4095>
Function: very_slow_random_generator at line 4

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
     4                                           def very_slow_random_generator():
     5         1    5005309.0 5005309.0     93.0      time.sleep(5)
     6         1     373891.0 373891.0      7.0      arr = [random.randint(1,100) for i in range(100000)]
     7         1        457.0    457.0      0.0      return sum(arr) / len(arr)

Total time: 2.33906 s
File: <ipython-input-38-c9c4218f4095>
Function: slow_random_generator at line 9

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
     9                                           def slow_random_generator():
    10         1    2002214.0 2002214.0     85.6      time.sleep(2)
    11         1     336388.0 336388.0     14.4      arr = [random.randint(1,100) for i in range(100000)]
    12         1        454.0    454.0      0.0      return sum(arr) / len(arr)

Total time: 1.35545 s
File: <ipython-input-38-c9c4218f4095>
Function: fast_random_generator at line 14

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    14                                           def fast_random_generator():
    15         1    1001058.0 1001058.0     73.9      time.sleep(1)
    16         1     353907.0 353907.0     26.1      arr = [random.randint(1,100) for i in range(100000)]
    17         1        482.0    482.0      0.0      return sum(arr) / len(arr)

Total time: 9.07528 s
File: <ipython-input-38-c9c4218f4095>
Function: main_func at line 19

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    19                                           def main_func():
    20         1    1355544.0 1355544.0     14.9      result = fast_random_generator()
    21         1        118.0    118.0      0.0      print(result)
    22
    23         1    2339153.0 2339153.0     25.8      result = slow_random_generator()
    24         1        351.0    351.0      0.0      print(result)
    25
    26         1    5379754.0 5379754.0     59.3      result = very_slow_random_generator()
    27         1        362.0    362.0      0.0      print(result)

This ends our small tutorial explaining how to use line_profiler to profile Python code. Please feel free to let us know your views in the comments section

References



Sunny Solanki  Sunny Solanki