How to Profile Python Code using cProfile & profile?

Profiling is the process of finding out how much time each part of your code takes. It is useful when optimizing code that is taking longer than expected. Profiling can target time or memory; there are profilers that report memory usage at each statement of code. In this tutorial we'll concentrate on profilers that measure the time spent in various parts of the code. Python ships with two profilers for this purpose.

  • cProfile - It is implemented as a C extension, hence it adds reasonably low overhead to the code being profiled.
  • profile - It is written entirely in Python, hence it adds a significant amount of overhead to the code being profiled.

Both of the above-mentioned profilers perform deterministic profiling of Python code. Deterministic profiling monitors every function call, return, exception, etc. The other technique, statistical profiling, periodically samples the instruction being executed and records the results to determine where time is being spent. Because statistical profiling does not monitor every call but only samples, it adds less overhead to the profiled task.

As a part of this tutorial, we'll explain how to use the cProfile and profile modules to profile Python code. We'll also cover the pstats module, which can be used to format and filter the output of the profilers. We'll explain the usage of both libraries with a few examples. If you are interested in other profilers available in Python, please go through the references section for tutorials on them.

1. cProfile

In [1]:
import cProfile

Example 1 : run() function

The cProfile module has a function named run() which accepts any Python statement as a string. It passes that statement to the exec() function for execution and generates profiling results for it.

Below we run a loop 10000 times, generating a random number between 1 and 10 on each iteration. We keep all the random numbers in a list and then sum them up. We have passed this statement to the run() function of cProfile to see where the time is spent.

In [2]:
import random

cProfile.run("sum([random.randint(1,10) for i in range(10000)])")
         55958 function calls in 0.014 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.002    0.002    0.014    0.014 <string>:1(<listcomp>)
        1    0.000    0.000    0.014    0.014 <string>:1(<module>)
    10000    0.004    0.000    0.010    0.000 random.py:174(randrange)
    10000    0.002    0.000    0.013    0.000 random.py:218(randint)
    10000    0.004    0.000    0.006    0.000 random.py:224(_randbelow)
        1    0.000    0.000    0.014    0.014 {built-in method builtins.exec}
        1    0.000    0.000    0.000    0.000 {built-in method builtins.sum}
    10000    0.000    0.000    0.000    0.000 {method 'bit_length' of 'int' objects}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
    15953    0.001    0.000    0.001    0.000 {method 'getrandbits' of '_random.Random' objects}


We can see from the above output that a total of 55958 function calls were made. The Ordered by: standard name line indicates that the table is sorted by the last column (filename:lineno(function)).

The meaning of each column in the table is explained below:

  • ncalls : The number of times the function was called. It can show two numbers separated by a slash for recursive functions; the first number is the total number of calls including recursive calls and the second is the number of primitive (non-recursive) calls.
  • tottime : The total time spent in the function itself, excluding time spent in the sub-functions it calls.
  • percall : tottime divided by ncalls.
  • cumtime : The total time spent in the function, including time spent in the sub-functions it calls.
  • percall : cumtime divided by the number of primitive calls.
  • filename:lineno(function) : The filename, the line number within the file, and the function name.

The run() function also accepts a parameter named filename where we can provide a file name to which the profiling output will be saved. We'll load such a file later using the pstats module.

In [3]:
cProfile.run("sum([random.randint(1,10) for i in range(10000)])", filename="out.prof")

Example 2 : Profile() Constructor

The run() function explained in the previous example has the limitation that we need to pass it a single statement to be profiled. If we want more control over which parts of the code are profiled, the Profile() constructor is a better option. It provides enable() and disable() methods which let us turn the profiler on and off so that only a particular part of the code gets profiled. This helps us avoid unnecessarily profiling the whole program when we only need to profile certain parts of it.

Below we have created two functions, slow_add() and fast_add(), which take two numbers and return their sum. The slow_add() function sleeps for half a second so that it takes more time to perform the sum. We have then created main_func(), which loops through the numbers 0-9 and calls slow_add() if the number is even, else fast_add(), passing the same number twice as arguments. We append the output of each call to the arr list and return it. We have declared the profiler outside of main_func(), enable it before the loop, and disable it after the loop. This prevents profiling of the other parts of the function.

In [4]:
import time

def slow_add(a, b):
    time.sleep(0.5)
    return a+b

def fast_add(a, b):
    return a+b

prof = cProfile.Profile()

def main_func():
    arr = []
    prof.enable()
    for i in range(10):

        if i%2==0:
            arr.append(slow_add(i,i))
        else:
            arr.append(fast_add(i,i))
    prof.disable()
    return arr

main_func()
Out[4]:
[0, 2, 4, 6, 8, 10, 12, 14, 16, 18]

Once profiling is complete, we can call the print_stats() method to print the profiling statistics.

In [5]:
prof.print_stats()
         26 function calls in 2.503 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        5    0.000    0.000    2.503    0.501 <ipython-input-4-2cfef41b561a>:3(slow_add)
        5    0.000    0.000    0.000    0.000 <ipython-input-4-2cfef41b561a>:7(fast_add)
        5    2.503    0.501    2.503    0.501 {built-in method time.sleep}
       10    0.000    0.000    0.000    0.000 {method 'append' of 'list' objects}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
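
The print_stats() method also accepts a sort key if we want the table ordered by a particular column instead of the default standard name. Below is a minimal sketch, assuming the same prof object as above; any of the pstats sort keys such as 'cumulative' or 'tottime' can be used.

# Print the same statistics again, this time ordered by cumulative time.
prof.print_stats('cumulative')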


The Profile object also has a method named dump_stats() which lets us dump the profiling results to a file.

In [6]:
prof.dump_stats("main_func.prof")

From Python 3.8 onward, we can even use the Profile() object as a context manager. Below we have recreated the code of main_func(), but this time we declare Profile() as a context manager using a with statement. We also return the profiler object so that it can be used to print the profiling results.

def main_func():
    arr = []
    with cProfile.Profile() as prof:
        for i in range(10):
            if i%2==0:
                arr.append(slow_add(i,i))
            else:
                arr.append(fast_add(i,i))
    return arr, prof

out, prof = main_func()

print(out)

prof.print_stats()

Output:

[0, 2, 4, 6, 8, 10, 12, 14, 16, 18]

26 function calls in 2.504 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        5    0.000    0.000    2.504    0.501 <ipython-input-178-2cfef41b561a>:3(slow_add)
        5    0.000    0.000    0.000    0.000 <ipython-input-178-2cfef41b561a>:7(fast_add)
        5    2.503    0.501    2.503    0.501 {built-in method time.sleep}
       10    0.000    0.000    0.000    0.000 {method 'append' of 'list' objects}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

Example 3 : Command Line / Shell

Python also lets us run the profiler from the command line directly on a Python file. We have put the code of this example into a file named addition.py as shown below.

addition.py

import time

def slow_add(a, b):
    time.sleep(0.5)
    return a+b

def fast_add(a, b):
    return a+b

def main_func():
    arr = []
    for i in range(10):
        if i%2==0:
            arr.append(slow_add(i,i))
        else:
            arr.append(fast_add(i,i))
    return arr

if __name__ == '__main__':
    main_func()

Once we have put the code into a Python file, we can use the below command in the command prompt or Unix shell to profile it.

Command to Profile File:

  • python -m cProfile addition.py

Output

29 function calls in 2.504 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    2.504    2.504 addition.py:1(<module>)
        1    0.000    0.000    2.504    2.504 addition.py:10(main_func)
        5    0.000    0.000    2.504    0.501 addition.py:3(slow_add)
        5    0.000    0.000    0.000    0.000 addition.py:7(fast_add)
        1    0.000    0.000    2.504    2.504 {built-in method builtins.exec}
        5    2.504    0.501    2.504    0.501 {built-in method time.sleep}
       10    0.000    0.000    0.000    0.000 {method 'append' of 'list' objects}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

We can also save the profiling output to a file when profiling from the command line, as shown below.

  • python -m cProfile -o prof_out.prof addition.py

We can also sort the profiling output by a particular column of the table by passing that column's sort key to the -s parameter, as shown below.
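
For example, to order the table by cumulative time we can use the cumtime sort key (the available keys are the same ones accepted by sort_stats(), covered in the next example):

  • python -m cProfile -s cumtime addition.py

Note that -s only applies when the results are printed to the console, i.e. when -o is not given.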

Example 4 : pstats

Python also provides the pstats module, which lets us modify and filter profiling results. It lets us load a profiler results file using its Stats() constructor. We can pass the Stats() constructor a single file or a list of files containing profiling results.

Below we have loaded the prof_out.prof file created in the previous example. We can print the profiling results from the Stats object by calling its print_stats() method. The print_stats() method accepts arguments that can be used to filter the results, which is explained below with simple examples.

In [66]:
import pstats
In [10]:
p = pstats.Stats("prof_out.prof")

p.print_stats();
Fri Oct  2 17:31:21 2020    prof_out.prof

         29 function calls in 2.504 seconds

   Random listing order was used

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        5    2.504    0.501    2.504    0.501 {built-in method time.sleep}
        1    0.000    0.000    2.504    2.504 {built-in method builtins.exec}
       10    0.000    0.000    0.000    0.000 {method 'append' of 'list' objects}
        5    0.000    0.000    2.504    0.501 addition.py:3(slow_add)
        5    0.000    0.000    0.000    0.000 addition.py:7(fast_add)
        1    0.000    0.000    2.504    2.504 addition.py:10(main_func)
        1    0.000    0.000    2.504    2.504 addition.py:1(<module>)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
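
Since Stats() accepts more than one results file, we can also aggregate several profiling runs into a single report. Below is a minimal sketch, assuming the out.prof and prof_out.prof files created earlier are present in the working directory.

# Load and combine two profiling dumps into a single Stats object.
combined = pstats.Stats("out.prof", "prof_out.prof")

# Print the aggregated statistics.
combined.print_stats()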


The pstats module also provides various methods that let us modify the profiling output. One important method is sort_stats(), which accepts a column name from the profiling results and sorts the data accordingly. The available sort keys are stored in the pstats.SortKey enum.
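
If we want to see which sort keys are available, we can simply list the members of the SortKey enum (added in Python 3.7); plain strings such as 'cumulative' or 'tottime' are accepted by sort_stats() as well.

# List the names of all sort keys provided by the pstats module.
print([key.name for key in pstats.SortKey])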

Below we have sorted the profiling results by cumulative time and then printed only the first 3 entries.

In [11]:
p.sort_stats(pstats.SortKey.CUMULATIVE).print_stats(3);
Fri Oct  2 17:31:21 2020    prof_out.prof

         29 function calls in 2.504 seconds

   Ordered by: cumulative time
   List reduced from 8 to 3 due to restriction <3>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    2.504    2.504 {built-in method builtins.exec}
        1    0.000    0.000    2.504    2.504 addition.py:1(<module>)
        1    0.000    0.000    2.504    2.504 addition.py:10(main_func)


Below we have called sort_stats() again, but this time we sort the profiling results by total (internal) time and then by cumulative time.

In [12]:
p.sort_stats(pstats.SortKey.TIME, pstats.SortKey.CUMULATIVE).print_stats();
Fri Oct  2 17:31:21 2020    prof_out.prof

         29 function calls in 2.504 seconds

   Ordered by: internal time, cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        5    2.504    0.501    2.504    0.501 {built-in method time.sleep}
        1    0.000    0.000    2.504    2.504 addition.py:10(main_func)
        5    0.000    0.000    2.504    0.501 addition.py:3(slow_add)
        1    0.000    0.000    2.504    2.504 addition.py:1(<module>)
       10    0.000    0.000    0.000    0.000 {method 'append' of 'list' objects}
        5    0.000    0.000    0.000    0.000 addition.py:7(fast_add)
        1    0.000    0.000    2.504    2.504 {built-in method builtins.exec}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}


The example below demonstrates that we can pass a string to print_stats() and it will filter the profiling results to the entries that contain that string.

In [13]:
p = pstats.Stats("prof_out.prof")

p.print_stats("sleep");
Fri Oct  2 17:31:21 2020    prof_out.prof

         29 function calls in 2.504 seconds

   Random listing order was used
   List reduced from 8 to 1 due to restriction <'sleep'>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        5    2.504    0.501    2.504    0.501 {built-in method time.sleep}


The example below demonstrates another usage of print_stats() where we pass a float to cut the results down to half. It multiplies the total number of rows by that float and prints only that many rows.

In [14]:
p = pstats.Stats("prof_out.prof")

p.print_stats(0.5);
Fri Oct  2 17:31:21 2020    prof_out.prof

         29 function calls in 2.504 seconds

   Random listing order was used
   List reduced from 8 to 4 due to restriction <0.5>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        5    2.504    0.501    2.504    0.501 {built-in method time.sleep}
        1    0.000    0.000    2.504    2.504 {built-in method builtins.exec}
       10    0.000    0.000    0.000    0.000 {method 'append' of 'list' objects}
        5    0.000    0.000    2.504    0.501 addition.py:3(slow_add)
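
The restrictions passed to print_stats() can also be combined; they are applied in the order given. The sketch below, assuming the same Stats object, first keeps only the entries whose name contains "addition" and then limits the output to the top 2 of those rows.

# Filter entries matching "addition" first, then keep only 2 rows.
p.print_stats("addition", 2)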


Another important method provided by the Stats object is strip_dirs(). It removes all leading path information from the filenames in the last column of the profiling results. This helps when long paths cause rows to wrap onto the next line and exceed the screen width.

In [15]:
p = pstats.Stats("prof_out.prof")

p.strip_dirs().print_stats();
Fri Oct  2 17:31:21 2020    prof_out.prof

         29 function calls in 2.504 seconds

   Random listing order was used

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        5    2.504    0.501    2.504    0.501 {built-in method time.sleep}
        1    0.000    0.000    2.504    2.504 {built-in method builtins.exec}
       10    0.000    0.000    0.000    0.000 {method 'append' of 'list' objects}
        5    0.000    0.000    2.504    0.501 addition.py:3(slow_add)
        5    0.000    0.000    0.000    0.000 addition.py:7(fast_add)
        1    0.000    0.000    2.504    2.504 addition.py:10(main_func)
        1    0.000    0.000    2.504    2.504 addition.py:1(<module>)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}


We can even reverse the order of the profiling results by calling the reverse_order() method on the Stats object.

In [16]:
p = pstats.Stats("prof_out.prof")

p.sort_stats(pstats.SortKey.CUMULATIVE).reverse_order().print_stats();
Fri Oct  2 17:31:21 2020    prof_out.prof

         29 function calls in 2.504 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        5    0.000    0.000    0.000    0.000 addition.py:7(fast_add)
       10    0.000    0.000    0.000    0.000 {method 'append' of 'list' objects}
        5    2.504    0.501    2.504    0.501 {built-in method time.sleep}
        5    0.000    0.000    2.504    0.501 addition.py:3(slow_add)
        1    0.000    0.000    2.504    2.504 addition.py:10(main_func)
        1    0.000    0.000    2.504    2.504 addition.py:1(<module>)
        1    0.000    0.000    2.504    2.504 {built-in method builtins.exec}


The Stats object has two other methods which are very useful for understanding the call relationships between functions. The print_callers() method shows, for each function, which other functions called it. The print_callees() method shows, for each function, which other functions it called.

In [17]:
p = pstats.Stats("prof_out.prof")

p.print_callers();
   Random listing order was used

Function                                          was called by...
                                                      ncalls  tottime  cumtime
{built-in method time.sleep}                      <-       5    2.504    2.504  addition.py:3(slow_add)
{built-in method builtins.exec}                   <-
{method 'append' of 'list' objects}               <-      10    0.000    0.000  addition.py:10(main_func)
addition.py:3(slow_add)                           <-       5    0.000    2.504  addition.py:10(main_func)
addition.py:7(fast_add)                           <-       5    0.000    0.000  addition.py:10(main_func)
addition.py:10(main_func)                         <-       1    0.000    2.504  addition.py:1(<module>)
addition.py:1(<module>)                           <-       1    0.000    2.504  {built-in method builtins.exec}
{method 'disable' of '_lsprof.Profiler' objects}  <-


In [18]:
p.print_callees();
   Random listing order was used

Function                                          called...
                                                      ncalls  tottime  cumtime
{built-in method time.sleep}                      ->
{built-in method builtins.exec}                   ->       1    0.000    2.504  addition.py:1(<module>)
{method 'append' of 'list' objects}               ->
addition.py:3(slow_add)                           ->       5    2.504    2.504  {built-in method time.sleep}
addition.py:7(fast_add)                           ->
addition.py:10(main_func)                         ->       5    0.000    2.504  addition.py:3(slow_add)
                                                           5    0.000    0.000  addition.py:7(fast_add)
                                                          10    0.000    0.000  {method 'append' of 'list' objects}
addition.py:1(<module>)                           ->       1    0.000    2.504  addition.py:10(main_func)
{method 'disable' of '_lsprof.Profiler' objects}  ->


Example 5 : Recursive Function

Our fifth example explains the profiling of a recursive function. We have written a recursive function below which returns the nth Fibonacci number, and we profile it by calling it with the argument 25. We can see from the profiling output that there were 150052 total function calls, of which 4 were primitive calls. We can also notice from the ncalls value on the first line that the Fibonacci function had 150049 total calls but only 1 primitive call.

In [19]:
def Fibonacci(n):
    if n<=0:
        print("Incorrect input")
    elif n==1:
        return 0
    elif n==2:
        return 1
    else:
        return Fibonacci(n-1)+Fibonacci(n-2)
In [20]:
cProfile.run("Fibonacci(25)")
         150052 function calls (4 primitive calls) in 0.029 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
 150049/1    0.029    0.000    0.029    0.029 <ipython-input-19-b552bd3bf56e>:1(Fibonacci)
        1    0.000    0.000    0.029    0.029 <string>:1(<module>)
        1    0.000    0.000    0.029    0.029 {built-in method builtins.exec}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}


2. profile

The second profiler that we'll explain as a part of this tutorial is profile. We'll import it to get started.

In [21]:
import profile

Example 1 : run()

The profile module has the same API as cProfile. Our first example of profile is exactly the same as that of cProfile: we use the run() function of profile to profile a Python statement which sums up 10000 random integers between 1 and 10.

In [22]:
import random

profile.run("sum([random.randint(1,10) for i in range(10000)])")
         56063 function calls in 0.149 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    10000    0.014    0.000    0.014    0.000 :0(bit_length)
        1    0.000    0.000    0.149    0.149 :0(exec)
    16057    0.021    0.000    0.021    0.000 :0(getrandbits)
        1    0.000    0.000    0.000    0.000 :0(setprofile)
        1    0.000    0.000    0.000    0.000 :0(sum)
        1    0.011    0.011    0.149    0.149 <string>:1(<listcomp>)
        1    0.000    0.000    0.149    0.149 <string>:1(<module>)
        0    0.000             0.000          profile:0(profiler)
        1    0.000    0.000    0.149    0.149 profile:0(sum([random.randint(1,10) for i in range(10000)]))
    10000    0.031    0.000    0.116    0.000 random.py:174(randrange)
    10000    0.022    0.000    0.139    0.000 random.py:218(randint)
    10000    0.051    0.000    0.085    0.000 random.py:224(_randbelow)


The output of profile is formatted in exactly the same way as the output of cProfile.

Example 2 : Profile() Constructor

Our second example explains how to use the Profile() constructor to profile code. Here we profile by passing a statement to the profiler's run() method and then call print_stats() once profiling is complete to see the results. We can also dump the profiling output to a file using the dump_stats() method.

In [41]:
import time

def slow_add(a, b):
    time.sleep(0.5)
    return a+b

def fast_add(a, b):
    return a+b

def main_func():
    arr = []
    for i in range(10):
        if i%2==0:
            arr.append(slow_add(i,i))
        else:
            arr.append(fast_add(i,i))
    return arr
In [49]:
prof = profile.Profile()

prof.run("main_func()")
Out[49]:
<profile.Profile at 0x7f29802b2ba8>
In [50]:
prof.print_stats()
         30 function calls in 0.015 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
       10    0.000    0.000    0.000    0.000 :0(append)
        1    0.000    0.000    0.002    0.002 :0(exec)
        1    0.012    0.012    0.012    0.012 :0(setprofile)
        5    0.002    0.000    0.002    0.000 :0(sleep)
        1    0.000    0.000    0.002    0.002 <ipython-input-41-ef931e1ad756>:10(main_func)
        5    0.000    0.000    0.002    0.000 <ipython-input-41-ef931e1ad756>:3(slow_add)
        5    0.000    0.000    0.000    0.000 <ipython-input-41-ef931e1ad756>:7(fast_add)
        1    0.000    0.000    0.002    0.002 <string>:1(<module>)
        1    0.000    0.000    0.015    0.015 profile:0(main_func())
        0    0.000             0.000          profile:0(profiler)


In [44]:
prof.dump_stats("prof_out2.prof")

Example 3 : Calibrated Profiling

The profiler available through the profile module subtracts a small constant from the time recorded for each event, removing the extra overhead added by the profiler itself. The default value of this constant is 0. The process of determining and removing this overhead is commonly referred to as calibration.

We can determine a better constant for a particular system by executing the code below. The calibrate() method runs 10000 timed calls with and without the profiler and computes the per-event overhead as a float. We perform this process five times, append the overhead found each time to a list, and then take the average to get the final constant that the profiler will subtract to remove its own overhead.

In [45]:
pr = profile.Profile()

cal = []
for i in range(5):
    cal.append(pr.calibrate(10000, verbose=1))
elapsed time without profiling = 0.013557597999999782
elapsed time with profiling = 0.06708630000000015
'CPU seconds' profiler reported = 0.034153742000013754
total # calls = 10001.0
mean stopwatch overhead per profile event = 1.029704229577741e-06
elapsed time without profiling = 0.010843444000000257
elapsed time with profiling = 0.05433191400000004
'CPU seconds' profiler reported = 0.028127470000001598
total # calls = 10001.0
mean stopwatch overhead per profile event = 8.641148885112159e-07
elapsed time without profiling = 0.010875469999999776
elapsed time with profiling = 0.054664718999999806
'CPU seconds' profiler reported = 0.02814929999997018
total # calls = 10001.0
mean stopwatch overhead per profile event = 8.636051394845717e-07
elapsed time without profiling = 0.011092839999999882
elapsed time with profiling = 0.07285476399999968
'CPU seconds' profiler reported = 0.03783517099996603
total # calls = 10001.0
mean stopwatch overhead per profile event = 1.336982851713136e-06
elapsed time without profiling = 0.010917874999999633
elapsed time with profiling = 0.054198018999999764
'CPU seconds' profiler reported = 0.028088194000002176
total # calls = 10001.0
mean stopwatch overhead per profile event = 8.584301069894282e-07
In [46]:
bias = sum(cal)/5
bias
Out[46]:
9.905674432552185e-07

Once we have found the bias constant that we want subtracted for each event, we can supply it to the profiler through the bias parameter when creating the Profile object, as shown below.

We can then again run code with this calibrated profiler to see how it affects the measurements.

In [63]:
prof = profile.Profile(bias=bias)

prof.run("main_func()")
Out[63]:
<profile.Profile at 0x7f29802dcc88>
In [64]:
prof.print_stats()
         30 function calls in 0.014 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
       10    0.000    0.000    0.000    0.000 :0(append)
        1    0.000    0.000    0.002    0.002 :0(exec)
        1    0.012    0.012    0.012    0.012 :0(setprofile)
        5    0.002    0.000    0.002    0.000 :0(sleep)
        1    0.000    0.000    0.002    0.002 <ipython-input-41-ef931e1ad756>:10(main_func)
        5    0.000    0.000    0.002    0.000 <ipython-input-41-ef931e1ad756>:3(slow_add)
        5    0.000    0.000    0.000    0.000 <ipython-input-41-ef931e1ad756>:7(fast_add)
        1    0.000    0.000    0.002    0.002 <string>:1(<module>)
        1    0.000    0.000    0.014    0.014 profile:0(main_func())
        0    0.000             0.000          profile:0(profiler)


We can see from the above results that the time reported by the calibrated profiler is slightly less than the time reported without calibration.

Example 4: Jupyter Notebook

Jupyter Notebook provides line and cell magic commands which can be used for profiling inside notebooks. We can use %prun to profile a single line of code and %%prun to profile a whole cell.

Below we first use %prun to profile the function call Fibonacci(25).

In [39]:
%prun Fibonacci(25)
 
150052 function calls (4 primitive calls) in 0.035 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
 150049/1    0.035    0.000    0.035    0.035 <ipython-input-114-b552bd3bf56e>:1(Fibonacci)
        1    0.000    0.000    0.035    0.035 {built-in method builtins.exec}
        1    0.000    0.000    0.035    0.035 <string>:1(<module>)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
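
The %prun magic also accepts options similar to the command-line interface; for example, -s changes the sort order and -l restricts how much of the profile gets printed. Below is a minimal sketch, assuming the same Fibonacci() function.

# Profile the call, sort by cumulative time, and print only the top 3 lines.
%prun -s cumulative -l 3 Fibonacci(25)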

Below we profile the body of our earlier main_func() function using the %%prun cell magic command.

In [40]:
%%prun

arr = []
for i in range(10):
    if i%2==0:
        arr.append(slow_add(i,i))
    else:
        arr.append(fast_add(i,i))

print(arr)
[0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
 
63 function calls in 2.504 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        5    2.504    0.501    2.504    0.501 {built-in method time.sleep}
        3    0.001    0.000    0.001    0.000 socket.py:337(send)
        1    0.000    0.000    2.504    2.504 <string>:1(<module>)
        1    0.000    0.000    2.504    2.504 {built-in method builtins.exec}
        5    0.000    0.000    2.504    0.501 <ipython-input-28-f6d5027c3479>:3(slow_add)
        3    0.000    0.000    0.001    0.000 iostream.py:197(schedule)
        2    0.000    0.000    0.001    0.000 iostream.py:386(write)
        1    0.000    0.000    0.001    0.001 {built-in method builtins.print}
        3    0.000    0.000    0.000    0.000 threading.py:1080(is_alive)
        3    0.000    0.000    0.000    0.000 threading.py:1038(_wait_for_tstate_lock)
       10    0.000    0.000    0.000    0.000 {method 'append' of 'list' objects}
        3    0.000    0.000    0.000    0.000 {method 'acquire' of '_thread.lock' objects}
        2    0.000    0.000    0.000    0.000 {built-in method posix.getpid}
        2    0.000    0.000    0.000    0.000 iostream.py:310(_is_master_process)
        5    0.000    0.000    0.000    0.000 <ipython-input-28-f6d5027c3479>:7(fast_add)
        2    0.000    0.000    0.000    0.000 iostream.py:323(_schedule_flush)
        3    0.000    0.000    0.000    0.000 iostream.py:93(_event_pipe)
        3    0.000    0.000    0.000    0.000 {method 'append' of 'collections.deque' objects}
        2    0.000    0.000    0.000    0.000 {built-in method builtins.isinstance}
        3    0.000    0.000    0.000    0.000 threading.py:507(is_set)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

Example 5 : Command Line / Shell

The profile profiler can be used from the command line/shell in exactly the same way as cProfile. Below we profile the addition.py file created in the earlier examples.

Command to Profile File:

  • python -m profile addition.py

Output

30 function calls in 0.003 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
       10    0.000    0.000    0.000    0.000 :0(append)
        1    0.000    0.000    0.001    0.001 :0(exec)
        1    0.003    0.003    0.003    0.003 :0(setprofile)
        5    0.000    0.000    0.000    0.000 :0(sleep)
        1    0.000    0.000    0.000    0.000 addition.py:1(<module>)
        1    0.000    0.000    0.000    0.000 addition.py:10(main_func)
        5    0.000    0.000    0.000    0.000 addition.py:3(slow_add)
        5    0.000    0.000    0.000    0.000 addition.py:7(fast_add)
        1    0.000    0.000    0.003    0.003 profile:0(<code object <module> at 0x7f17e69a2270, file "addition.py", line 1>)
        0    0.000             0.000          profile:0(profiler)

We can also save the profiling output to a file when profiling from the command line, as shown below.

  • python -m profile -o prof_out3.prof addition.py
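
The dump produced by the profile module can be loaded and inspected with pstats exactly as we did for the cProfile dumps earlier. Below is a minimal sketch, assuming the prof_out3.prof file created by the above command exists.

import pstats

# Load the dump created by the profile module and print the top 5 entries
# sorted by cumulative time.
stats = pstats.Stats("prof_out3.prof")
stats.sort_stats(pstats.SortKey.CUMULATIVE).print_stats(5)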

This ends our small tutorial explaining how to profile Python code using cProfile and profile profilers. Please feel free to let us know your views in the comments section.

References



Sunny Solanki