The yappi stands for yet another python profiler. The yappi is a Python profiler that is designed in C and supports the profiling of multi-threaded, asyncio, and gevent code. As it’s designed in C, it runs quite fast compared to other profilers which are designed in pure python. The yappi provides a flexible API which lets us profile particular parts of code by starting and stopping profiling. The yappi returns profiling results as function-based and not line by line based like line_profiler
and few other profilers which are line by line based. The yappi also lets us format profiling output in different formats like callgrind and pstat. It even lets us decide whether we want to profile wall time (total time for which code ran from the first statement to the last statement) or CPU time (time for which code actually used CPU) taken by code. The yappi stores profiling output as a special stat object which lets us filter profiling results as well as sort results. As a part of this tutorial, we'll be explaining with simple examples, how we can profile our python code using yappi. We'll also try to explain the majority of yappi API in the process.
We'll be explaining the usage of yappi in two ways:
As a part of this section, we'll explain how we can profile script from the command line using yappi. We'll also explain various options available when profiling code from the command line.
As a part of our first example, we'll explain how we can profile a simple script using yappi. Below we have designed a simple python script which loops 10000 times, calls the add function passing it index twice, adding the result of the add function to a list, and returning the list. We have then profiled this script using yappi.
def add(a, b):
return a+b
def get_sum_of_list():
final_list = []
for i in range(10000):
out = add(i, i)
final_list.append(out)
return final_list
if __name__ == "__main__":
l = get_sum_of_list()
#print(l)
Please make a note that we have called yappi from the jupyter notebook cell and the profiling script is kept in the profiling_examples folder.
We can see from the below output that yappi has reported profiling for each function calls. By default, yappi measures CPU time taken by functions. The table represented is sorted based on ttot
time and in descending order.
The function call table has a list of columns mentioned below:
The thread call table has a list of columns mentioned below:
threading.Thread
.!yappi profiling_examples/example_1.py
The yappi lets us profile results in different formats using -f
or --output-format=output_format
option. We need to provide format after -f
options which can be one of the pstat
, callgrind
, and ystat
. Below we have called our same profiling example and formatted results as pstat
.
!yappi -f pstat profiling_examples/example_1.py
The yappi also lets us save the results of profiling to an output file. It provides option named -o
or --output-file=output_file
for this purpose. We have below stored the result of profiling our example script to an output file named yappi_ex1.out
as callgrind
format. We have then displayed file contents as well.
!yappi -f callgrind -o yappi_ex1.out profiling_examples/example_1.py
!cat yappi_ex1.out
The yappi provides us with the option of whether we want to see the time taken by inbuilt python functions or not. We can instruct yappi to include time taken by inbuilt python functions by using the -b
or --builtins
option. Below we have instructed our yappi to include built-in function times when profiling our script. We can see that it has included the time of functions like append()
list function and few other built-in function calls which were not present in previous profiling results.
!yappi -b profiling_examples/example_1.py
The yappi lets us choose whether we want it to report wall time or CPU time for the profiling results. The wall time is the total time taken by code which includes waiting for CPU, waiting for inputs, waiting for results from networks, etc. The wall time is generally total time including all waits and actual CPU usage. The CPU time is the time taken by code when it actually used CPU excluding the time spent waiting for CPU. The yappi lets us specify whether we want to profile wall time or CPU time using -c
or --clock-type=clock_type
options. Below we have profiled our script using both ways.
!yappi -c wall profiling_examples/example_1.py
!yappi -c cpu profiling_examples/example_1.py
start()
¶The yappi lets us profile results for multi-threaded code as well. It lets us decide whether we want it to report the time spent on each thread or the main thread which started other thread. It lets us do this using -s
or --single-thread
options.
To explain this example, below we have created a simple script that uses multi-threading. We have created a function that generates a random number between 1-100 and has introduced artificial wait time in the function so that it takes 2 seconds for it. We have then created the ThreadPoolExecutor
of 3 threads and called the random number generator function three times in different threads. We have also printed the output of the calls.
If you are interested in learning about the concurrent.futures
python module then please feel free to check our tutorial on the same.
import time
import random
from concurrent.futures import ThreadPoolExecutor
def slow_random_gen():
time.sleep(2)
val = random.randint(1,100)
return val
def main_func():
ppe = ThreadPoolExecutor(max_workers=3)
futures = []
futures.append(ppe.submit(slow_random_gen))
futures.append(ppe.submit(slow_random_gen))
futures.append(ppe.submit(slow_random_gen))
print([future.result() for future in futures])
if __name__ == '__main__':
main_func()
Below we have profiled our script written above in a simple way without using any option. We can notice that by default, it has accounted for all threads (main thread and 3 threads created by thread pool executor).
In the next cell, we have profiled script using -s
option which only reports the main thread which had created other threads, and all other threads created by thread pool executor is not reported.
Please make a note that in the below example, we have a shortened the output of function calls as it was quite long due to the usage of different libraries. Please don't be surprised if you see long function calls output when you profile this code using yappi.
!yappi profiling_examples/multi_threaded_example.py
[15, 69, 69]
Clock type: CPU
Ordered by: totaltime, desc
name ncall tsub ttot tavg
..lti_threaded_example.py:1 <module> 1 0.000034 0.014990 0.014990
..lib._bootstrap>:978 _find_and_load 20/3 0.000162 0.013865 0.000693
..strap>:948 _find_and_load_unlocked 20/3 0.000089 0.013721 0.000686
..lib._bootstrap>:663 _load_unlocked 20/4 0.000149 0.012675 0.000634
..>:722 SourceFileLoader.exec_module 11/4 0.000044 0.012373 0.001125
..rap>:211 _call_with_frames_removed 30/4 0.000052 0.012340 0.000411
...
name id tid ttot scnt
_MainThread 0 139947519743808 0.016440 6
Thread 2 139947415697152 0.000374 2
Thread 1 139947491821312 0.000356 2
Thread 3 139947407304448 0.000246 2
!yappi -s profiling_examples/multi_threaded_example.py
[84, 59, 18]
Clock type: CPU
Ordered by: totaltime, desc
name ncall tsub ttot tavg
..lti_threaded_example.py:1 <module> 1 0.000041 0.014064 0.014064
..lib._bootstrap>:978 _find_and_load 20/3 0.000148 0.012756 0.000638
..strap>:948 _find_and_load_unlocked 20/3 0.000080 0.012646 0.000632
..lib._bootstrap>:663 _load_unlocked 20/4 0.000142 0.011733 0.000587
..>:722 SourceFileLoader.exec_module 11/4 0.000040 0.011459 0.001042
..rap>:211 _call_with_frames_removed 30/4 0.000047 0.011400 0.000380
..ent/futures/__init__.py:4 <module> 1 0.000006 0.005445 0.005445
..urrent/futures/_base.py:4 <module> 1 0.000027 0.005024 0.005024
..ib/python3.7/random.py:38 <module> 1 0.000123 0.004458 0.004458
...7/logging/__init__.py:24 <module> 1 0.000098 0.004047 0.004047
...
name id tid ttot scnt
_MainThread 0 139785483880256 0.015532 1
As apart of this section, we'll explain how we can use yappi as a python library and profile a particular part of the python code. We'll explain various API functions of yappi as a part of this section.
run()
Function¶The simplest way to profile code using yappi is using the run()
function as a context manager for code that we want to profile.
YFuncStats
object which has information about each function calls and its profiling data.YThreadStats
object which has profiling information about threads.out
which is set to sys.stdout
hence it prints output to standard output. We can output profiling results to a file as well.Below we have explained the usage of all the above-mentioned method with a simple example. We are simply looping 10000 times generating two random numbers each time, adding both, and then adding results of that addition to a python list. We are returning that python list at the end. We have called our main function into yappi.run()
context manager so that it gets profiled along with all functions that it calls. We have also printed a few constants available with yappi which stores few default options of the library.
CODE
import random
import yappi
def add(a, b):
return a+b
def get_sum_of_list():
final_list = []
for i in range(10000):
rand1 = random.randint(1,100)
rand2 = random.randint(1,100)
out = add(rand1, rand2)
final_list.append(out)
return final_list
with yappi.run(builtins=True):
l = get_sum_of_list()
# print(l)
print("================ Func Stats ===================")
yappi.get_func_stats().print_all()
print("\n================ Thread Stats ===================")
yappi.get_thread_stats().print_all()
print("\nYappi Backend Types : ",yappi.BACKEND_TYPES)
print("Yappi Clock Types : ", yappi.CLOCK_TYPES)
print("Yappi Stats Columns : ", yappi.COLUMNS_FUNCSTATS)
print("Yappi Line Sep : ", yappi.LINESEP)
OUTPUT
================ Func Stats ===================
Clock type: CPU
Ordered by: totaltime, desc
name ncall tsub ttot tavg
yappi_ex1.py:8 get_sum_of_list 1 0.029480 0.167297 0.167297
..on3.7/random.py:218 Random.randint 20000 0.025818 0.125842 0.000006
..3.7/random.py:174 Random.randrange 20000 0.029624 0.100024 0.000005
...7/random.py:224 Random._randbelow 20000 0.043398 0.070399 0.000004
..bits' of '_random.Random' objects> 25450 0.015492 0.015492 0.000001
..hod 'bit_length' of 'int' objects> 20000 0.011510 0.011510 0.000001
yappi_ex1.py:4 add 10000 0.006068 0.006068 0.000001
..method 'append' of 'list' objects> 10000 0.005907 0.005907 0.000001
.. _GeneratorContextManager.__exit__ 1 0.000000 0.000000 0.000000
builtins.next 1 0.000000 0.000000 0.000000
================ Thread Stats ===================
name id tid ttot scnt
_MainThread 0 139678150301504 0.167317 1
Yappi Backend Types : {'NATIVE_THREAD': 0, 'GREENLET': 1}
Yappi Clock Types : {'WALL': 0, 'CPU': 1}
Yappi Stats Columns : ['name', 'ncall', 'ttot', 'tsub', 'tavg']
Yappi Line Sep :
As a part of this example, we'll explain how we can change between wall and CPU time profiling as well as how we can sort the results of profiling.
Below we have explained the usage of the above-mentioned methods where we sort the output based on a number of calls in ascending order.
CODE
import random
import yappi
def add(a, b):
return a+b
def get_sum_of_list():
final_list = []
for i in range(10000):
rand1 = random.randint(1,100)
rand2 = random.randint(1,100)
out = add(rand1, rand2)
final_list.append(out)
return final_list
yappi.set_clock_type("WALL")
with yappi.run(builtins=True):
l = get_sum_of_list()
# print(l)
print("Default Sort Type : ", yappi.DEFAULT_SORT_TYPE)
print("Default Sort Order : ", yappi.DEFAULT_SORT_ORDER)
print("Default Clock Type : ", yappi.get_clock_type())
print("Yappi Memoery Usage : ", yappi.get_mem_usage())
print("\n================ Func Stats ===================")
stats = yappi.get_func_stats()
sorted_stats = stats.sort("ncall", sort_order="asc")
sorted_stats.print_all()
OUTPUT
Default Sort Type : totaltime
Default Sort Order : desc
Default Clock Type : wall
Yappi Memoery Usage : 134808
================ Func Stats ===================
Clock type: WALL
Ordered by: ncall, asc
name ncall tsub ttot tavg
yappi_ex2.py:8 get_sum_of_list 1 0.009664 0.050853 0.050853
builtins.next 1 0.000000 0.000000 0.000000
.. _GeneratorContextManager.__exit__ 1 0.000000 0.000000 0.000000
yappi_ex2.py:4 add 10000 0.001418 0.001418 0.000000
..method 'append' of 'list' objects> 10000 0.001230 0.001230 0.000000
..on3.7/random.py:218 Random.randint 20000 0.006829 0.038541 0.000002
..3.7/random.py:174 Random.randrange 20000 0.011719 0.031712 0.000002
...7/random.py:224 Random._randbelow 20000 0.013457 0.019993 0.000001
..hod 'bit_length' of 'int' objects> 20000 0.002394 0.002394 0.000000
..bits' of '_random.Random' objects> 25697 0.004142 0.004142 0.000000
As a part of this example, we'll explain how we can store profiling results to an output file and then reload it again.
Below we have explained the usage of the above-mentioned methods with a simple example.
CODE
import random
import yappi
def add(a, b):
return a+b
def get_sum_of_list():
final_list = []
for i in range(10000):
rand1 = random.randint(1,100)
rand2 = random.randint(1,100)
out = add(rand1, rand2)
final_list.append(out)
return final_list
yappi.set_clock_type("WALL")
with yappi.run(builtins=True):
l = get_sum_of_list()
# print(l)
print("Default Clock Type : ", yappi.get_clock_type())
print("Yappi Memory Usage : ", yappi.get_mem_usage())
stats = yappi.get_func_stats()
print("Saving Stats to output File")
stats.save("yappi_ex3.out")
print("Clearing existing Stats ")
stats.clear()
print("Checking Current Stats After Clearing Stats")
stats.print_all()
print("Loading Stats From File")
stats.add("yappi_ex3.out")
print("Loaded Stats")
stats.print_all()
OUTPUT
Default Clock Type : wall
Yappi Memory Usage : 134808
Saving Stats to output File
Clearing existing Stats
Checking Current Stats After Clearing Stats
Loading Stats From File
Loaded Stats
Clock type: WALL
Ordered by: totaltime, desc
name ncall tsub ttot tavg
yappi_ex3.py:8 get_sum_of_list 1 0.009884 0.051883 0.051883
..on3.7/random.py:218 Random.randint 20000 0.007478 0.039296 0.000002
..3.7/random.py:174 Random.randrange 20000 0.011625 0.031818 0.000002
...7/random.py:224 Random._randbelow 20000 0.013642 0.020193 0.000001
..bits' of '_random.Random' objects> 25673 0.004147 0.004147 0.000000
..hod 'bit_length' of 'int' objects> 20000 0.002404 0.002404 0.000000
yappi_ex3.py:4 add 10000 0.001439 0.001439 0.000000
..method 'append' of 'list' objects> 10000 0.001264 0.001264 0.000000
.. _GeneratorContextManager.__exit__ 1 0.000000 0.000000 0.000000
builtins.next 1 0.000000 0.000000 0.000000
start()
and stop()
Functions¶As a part of this function, we'll explain how we can profile a particular part of the code by using the start()
and stop()
yappi function.
CODE
import random
import yappi
def add(a, b):
return a+b
def get_sum_of_list():
final_list = []
for i in range(10000):
rand1 = random.randint(1,100)
rand2 = random.randint(1,100)
out = add(rand1, rand2)
final_list.append(out)
return final_list
yappi.start()
l = get_sum_of_list()
print("Is Profiling Running : ", yappi.is_running())
yappi.stop()
print("Is Profiling Running : ", yappi.is_running())
print("========== Func Stats ==============")
yappi.get_func_stats().print_all()
print("\n========== Thread Stats ==============")
yappi.get_thread_stats().print_all()
yappi.clear_stats()
yappi.start(builtins=True, profile_threads=False)
l = get_sum_of_list()
yappi.stop()
print("========== Func Stats =================")
yappi.get_func_stats().print_all()
OUTPUT
Is Profiling Running : True
Is Profiling Running : False
========== Func Stats ==============
Clock type: CPU
Ordered by: totaltime, desc
name ncall tsub ttot tavg
yappi_ex4.py:8 get_sum_of_list 1 0.029489 0.168133 0.168133
..on3.7/random.py:218 Random.randint 20000 0.025782 0.126814 0.000006
..3.7/random.py:174 Random.randrange 20000 0.030228 0.101032 0.000005
...7/random.py:224 Random._randbelow 20000 0.043764 0.070804 0.000004
yappi_ex4.py:4 add 10000 0.005995 0.005995 0.000001
========== Thread Stats ==============
name id tid ttot scnt
_MainThread 0 140700123936576 0.168184 1
========== Func Stats =================
Clock type: CPU
Ordered by: totaltime, desc
name ncall tsub ttot tavg
yappi_ex4.py:8 get_sum_of_list 1 0.025540 0.147629 0.147629
..on3.7/random.py:218 Random.randint 20000 0.022682 0.111618 0.000006
..3.7/random.py:174 Random.randrange 20000 0.026678 0.088935 0.000004
...7/random.py:224 Random._randbelow 20000 0.038397 0.062257 0.000003
..bits' of '_random.Random' objects> 25748 0.013910 0.013910 0.000001
..hod 'bit_length' of 'int' objects> 20000 0.009951 0.009951 0.000000
yappi_ex4.py:4 add 10000 0.005356 0.005356 0.000001
..method 'append' of 'list' objects> 10000 0.005115 0.005115 0.000001
As a part of this example, we have explained how we can profile multi-threaded code using yappi. We have used the same code that we had used when explaining the profiling of multi-threaded code using yappi from command prompt/shell.
CODE
import time
import random
from concurrent.futures import ThreadPoolExecutor
import yappi
def slow_random_gen():
time.sleep(2)
val = random.randint(1,100)
return val
def main_func():
ppe = ThreadPoolExecutor(max_workers=3)
futures = []
futures.append(ppe.submit(slow_random_gen))
futures.append(ppe.submit(slow_random_gen))
futures.append(ppe.submit(slow_random_gen))
print([future.result() for future in futures])
yappi.start()
main_func()
yappi.stop()
#print("========== Func Stats ==============")
#yappi.get_func_stats().print_all()
print("\n========== Thread Stats ==============")
yappi.get_thread_stats().print_all()
yappi.clear_stats()
yappi.start(profile_threads=False)
main_func()
yappi.stop()
print("\n========== Thread Stats ==============")
yappi.get_thread_stats().print_all()
OUTPUT
[55, 77, 43]
========== Thread Stats ==============
name id tid ttot scnt
_MainThread 0 139640863278912 0.001367 6
Thread 1 139640835811072 0.000469 2
Thread 3 139640819025664 0.000438 2
Thread 2 139640827418368 0.000436 2
[47, 76, 59]
========== Thread Stats ==============
name id tid ttot scnt
_MainThread 0 139640863278912 0.002273 1
@profile
Decorator¶Our sixth and last example explains how we can use the @profile
decorator to inform yappi to profile only that function from the script.
clock_type
which can be set to CPU or wall
and profile_builtins
which can be set to True or False informing yappi whether to profile built-in functions or not.Below we have explained with a simple example, how we can use this decorator.
CODE
import yappi
def add(a, b):
return a+b
@yappi.profile(clock_type="wall", profile_builtins=False)
def get_sum_of_list():
final_list = []
for i in range(10000):
out = add(i, i)
final_list.append(out)
return final_list
l = get_sum_of_list()
OUTPUT
Executed in 0.037366 WALL clock seconds
Clock type: WALL
Ordered by: totaltime, desc
name ncall tsub ttot tavg
yappi_ex6.py:6 get_sum_of_list 1 0.025634 0.037346 0.037346
yappi_ex6.py:3 add 100.. 0.006410 0.006410 0.000000
This ends our small tutorial explaining how to use yappi
from command prompt/shell and in code to profile Python code. Please feel free to let us know your views in the comments section.
If you are more comfortable learning through video tutorials then we would recommend that you subscribe to our YouTube channel.
When going through coding examples, it's quite common to have doubts and errors.
If you have doubts about some code examples or are stuck somewhere when trying our code, send us an email at coderzcolumn07@gmail.com. We'll help you or point you in the direction where you can find a solution to your problem.
You can even send us a mail if you are trying something new and need guidance regarding coding. We'll try to respond as soon as possible.
If you want to