pprofile
- Deterministic & Statistical Profiler For Python Code¶Profiling is an important aspect of checking the performance of a running code. It gives us insight into the time taken by various parts of our code which can help us make a decision about optimizing code to make it run faster for critical applications. The python has plenty of profilers that provides profiling functionality like cProfile
, profile
, line_profiler
, memory_profiler
, guppy/heapy
etc. The profilers are generally of two types (Deterministic Profilers & Statistical profilers). The deterministic profilers take more time compared to statistical profilers because they measure each part of the code whereas statistical profilers randomly sample part of code for profiling hence bit faster but not accurate like deterministic profilers. All the profilers that we mentioned earlier are deterministic profilers. If you are interested in learning about them then please feel free to refer to the references section which has a link to all of them.
As a part of this tutorial, we'll be primarily concentrating on pprofile
profiler which provides both deterministic and statistical profiling functionality. We'll explain through various examples how to use pprofile
to profile Python code. It’s a pure python profiler inspired by Robert Kern's line_profiler
and provides profiling results for each line of code. It supports profiling in multi-threaded code as well. We can easily install pprofile
by using pip
.
We'll be explaining the usage of pprofile
with few examples in two ways:
First, we'll explain through various examples how to use pprofile
from the command line to profile scripts. We'll explain various options available to perform different types of profiling as well as different features of pprofile
.
Our first example is the profiling of a simple python script written below. The script simple loops from 0-99 and add numbers to itself by calling the add()
function. It maintains all numbers in a list and prints it at last.
CODE example_1.py
def add(a, b):
return a+b
def get_sum_of_list():
final_list = []
for i in range(100):
out = add(i, i)
final_list.append(out)
return final_list
if __name__ == "__main__":
l = get_sum_of_list()
print(l)
PROFILE SCRIPT
We can simply profile any python script by calling the below command from the shell/command prompt once we have installed pprofile
. This will profile all lines of code as well as all lines of imports used. This will return the result as a table shown in the output section.
pprofile example_1.py
OUTPUT
Below we have printed output generated from profiling. The table has information about the number of times a particular line is executed, total time taken to execute that line, time is taken to execute that line each individual time, % of the time from total taken by that line from the total time, and source code of that line. We can analyze this output to check which part of our code is taking the most time and take decisions accordingly to improve it.
[0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, 122, 124, 126, 128, 130, 132, 134, 136, 138, 140, 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, 168, 170, 172, 174, 176, 178, 180, 182, 184, 186, 188, 190, 192, 194, 196, 198]
Command line: example_1.py
Total duration: 0.00135207s
File: example_1.py
File duration: 0.00130534s (96.54%)
Line #| Hits| Time| Time per hit| %|Source code
------+----------+-------------+-------------+-------+-----------
1| 102| 0.000148535| 1.45622e-06| 10.99%|def add(a, b):
2| 100| 0.000195265| 1.95265e-06| 14.44%| return a+b
3| 0| 0| 0| 0.00%|
4| 0| 0| 0| 0.00%|
5| 2| 5.96046e-06| 2.98023e-06| 0.44%|def get_sum_of_list():
6| 1| 2.38419e-06| 2.38419e-06| 0.18%| final_list = []
7| 101| 0.000189304| 1.8743e-06| 14.00%| for i in range(100):
8| 100| 0.000530481| 5.30481e-06| 39.23%| out = add(i, i)
(call)| 100| 0.000323296| 3.23296e-06| 23.91%|# example_1.py:1 add
9| 100| 0.000193834| 1.93834e-06| 14.34%| final_list.append(out)
10| 1| 1.66893e-06| 1.66893e-06| 0.12%| return final_list
11| 0| 0| 0| 0.00%|
12| 0| 0| 0| 0.00%|
13| 1| 3.09944e-06| 3.09944e-06| 0.23%|if __name__ == "__main__":
14| 1| 9.05991e-06| 9.05991e-06| 0.67%| l = get_sum_of_list()
(call)| 1| 0.00124359| 0.00124359| 91.98%|# example_1.py:5 get_sum_of_list
15| 1| 2.57492e-05| 2.57492e-05| 1.90%| print(l)
16| 0| 0| 0| 0.00%|
If our python script is taking parameters as input then we can pass those parameters one by one after the script name as explained below.
pprofile script arg1 arg2
If we want to store the output of profiling to some file rather than flooding standard output then we can use -o
or --out
option followed by output filename to which save profiling output as explained below.
pprofile -o example_1.out example_1.py
pprofile --out example_1.out example_1.py
We can even profile the python module using the pprofile
option -m
followed by the module name. It'll profile all files in that module.
pprofile -m module_name
Our second example demonstrating the usage of profiling is a python script exactly like the previous example with the only twist that now we are generating two random integers in range 1-100
and adding them.
CODE example_2.py
import random
def add(a, b):
return a+b
def get_sum_of_list():
final_list = []
for i in range(10):
rand1 = random.randint(1,100)
rand2 = random.randint(1,100)
out = add(rand1, rand2)
final_list.append(out)
return final_list
if __name__ == "__main__":
l = get_sum_of_list()
print(l)
PROFILE SCRIPT
pprofile example_2.py
OUTPUT
Below we have printed output generated from profiling above script which makes use of random module of python. Please make a note that this time it profiled the random.py
file which has code for function randint
that we have used in our script. We have not included all lines of random.py
into the output here as it was flooding output making page size too long. If you run the above code on your computer then you'll be able to see the whole file. We have only included the part which is actually getting executed.
[36, 164, 85, 112, 89, 161, 7, 43, 83, 135]
Command line: example_2.py
Total duration: 0.00148344s
File: /home/sunny/anaconda3/lib/python3.7/random.py
File duration: 0.00109649s (73.92%)
Line #| Hits| Time| Time per hit| %|Source code
------+----------+-------------+-------------+-------+-----------
72| 0| 0| 0| 0.00%|class Random(_random.Random):
174| 20| 4.22001e-05| 2.11e-06| 2.84%| def randrange(self, start, stop=None, step=1, _int=int):
184| 20| 5.29289e-05| 2.64645e-06| 3.57%| istart = _int(start)
185| 20| 4.79221e-05| 2.39611e-06| 3.23%| if istart != start:
186| 0| 0| 0| 0.00%| raise ValueError("non-integer arg 1 for randrange()")
187| 20| 4.76837e-05| 2.38419e-06| 3.21%| if stop is None:
188| 0| 0| 0| 0.00%| if istart > 0:
189| 0| 0| 0| 0.00%| return self._randbelow(istart)
190| 0| 0| 0| 0.00%| raise ValueError("empty range for randrange()")
191| 0| 0| 0| 0.00%|
192| 0| 0| 0| 0.00%| # stop argument supplied.
193| 20| 4.74453e-05| 2.37226e-06| 3.20%| istop = _int(stop)
194| 20| 4.60148e-05| 2.30074e-06| 3.10%| if istop != stop:
195| 0| 0| 0| 0.00%| raise ValueError("non-integer stop for randrange()")
196| 20| 4.57764e-05| 2.28882e-06| 3.09%| width = istop - istart
197| 20| 4.50611e-05| 2.25306e-06| 3.04%| if step == 1 and width > 0:
198| 20| 0.000138998| 6.9499e-06| 9.37%| return istart + self._randbelow(width)
(call)| 20| 0.000417709| 2.08855e-05| 28.16%|# /home/sunny/anaconda3/lib/python3.7/random.py:224 _randbelow
199| 0| 0| 0| 0.00%| if step == 1:
200| 0| 0| 0| 0.00%| raise ValueError("empty range for randrange() (%d,%d, %d)" % (istart, istop, width))
201| 0| 0| 0| 0.00%|
202| 0| 0| 0| 0.00%| # Non-unit step argument supplied.
203| 0| 0| 0| 0.00%| istep = _int(step)
204| 0| 0| 0| 0.00%| if istep != step:
205| 0| 0| 0| 0.00%| raise ValueError("non-integer step for randrange()")
206| 0| 0| 0| 0.00%| if istep > 0:
207| 0| 0| 0| 0.00%| n = (width + istep - 1) // istep
208| 0| 0| 0| 0.00%| elif istep < 0:
209| 0| 0| 0| 0.00%| n = (width + istep + 1) // istep
210| 0| 0| 0| 0.00%| else:
211| 0| 0| 0| 0.00%| raise ValueError("zero step for randrange()")
212| 0| 0| 0| 0.00%|
213| 0| 0| 0| 0.00%| if n <= 0:
214| 0| 0| 0| 0.00%| raise ValueError("empty range for randrange()")
215| 0| 0| 0| 0.00%|
216| 0| 0| 0| 0.00%| return istart + istep*self._randbelow(n)
217| 0| 0| 0| 0.00%|
218| 20| 3.0756e-05| 1.5378e-06| 2.07%| def randint(self, a, b):
219| 0| 0| 0| 0.00%| """Return random integer in range [a, b], including both end points.
220| 0| 0| 0| 0.00%| """
221| 0| 0| 0| 0.00%|
222| 20| 0.000133991| 6.69956e-06| 9.03%| return self.randrange(a, b+1)
(call)| 20| 0.00093174| 4.6587e-05| 62.81%|# /home/sunny/anaconda3/lib/python3.7/random.py:174 randrange
223| 0| 0| 0| 0.00%|
224| 20| 4.62532e-05| 2.31266e-06| 3.12%| def _randbelow(self, n, int=int, maxsize=1<<BPF, type=type,
225| 0| 0| 0| 0.00%| Method=_MethodType, BuiltinMethod=_BuiltinMethodType):
226| 0| 0| 0| 0.00%| "Return a random int in the range [0,n). Raises ValueError if n==0."
227| 0| 0| 0| 0.00%|
228| 20| 5.19753e-05| 2.59876e-06| 3.50%| random = self.random
229| 20| 5.10216e-05| 2.55108e-06| 3.44%| getrandbits = self.getrandbits
230| 0| 0| 0| 0.00%| # Only call self.getrandbits if the original random() builtin method
231| 0| 0| 0| 0.00%| # has not been overridden or if a new getrandbits() was supplied.
232| 20| 4.98295e-05| 2.49147e-06| 3.36%| if type(random) is BuiltinMethod or type(getrandbits) is Method:
233| 20| 4.95911e-05| 2.47955e-06| 3.34%| k = n.bit_length() # don't use (n-1) here because n can be 1
234| 20| 5.22137e-05| 2.61068e-06| 3.52%| r = getrandbits(k) # 0 <= r < 2**k
235| 25| 5.91278e-05| 2.36511e-06| 3.99%| while r >= n:
236| 5| 1.21593e-05| 2.43187e-06| 0.82%| r = getrandbits(k)
237| 20| 4.55379e-05| 2.2769e-06| 3.07%| return r
238| 0| 0| 0| 0.00%| # There's an overridden random() method but no new getrandbits() method,
239| 0| 0| 0| 0.00%| # so we can only use random() from here.
240| 0| 0| 0| 0.00%| if n >= maxsize:
241| 0| 0| 0| 0.00%| _warn("Underlying random() generator does not supply \n"
242| 0| 0| 0| 0.00%| "enough bits to choose from a population range this large.\n"
243| 0| 0| 0| 0.00%| "To remove the range limitation, add a getrandbits() method.")
244| 0| 0| 0| 0.00%| return int(random() * n)
245| 0| 0| 0| 0.00%| if n == 0:
246| 0| 0| 0| 0.00%| raise ValueError("Boundary cannot be zero")
247| 0| 0| 0| 0.00%| rem = maxsize % n
248| 0| 0| 0| 0.00%| limit = (maxsize - rem) / maxsize # int(limit * maxsize) % n == 0
249| 0| 0| 0| 0.00%| r = random()
250| 0| 0| 0| 0.00%| while r >= limit:
251| 0| 0| 0| 0.00%| r = random()
252| 0| 0| 0| 0.00%| return int(r*maxsize) % n
253| 0| 0| 0| 0.00%|
254| 0| 0| 0| 0.00%|## -------------------- sequence methods -------------------
File: example_2.py
File duration: 0.000351667s (23.71%)
Line #| Hits| Time| Time per hit| %|Source code
------+----------+-------------+-------------+-------+-----------
1| 2| 1.88351e-05| 9.41753e-06| 1.27%|import random
2| 0| 0| 0| 0.00%|
3| 11| 1.74046e-05| 1.58223e-06| 1.17%|def add(a, b):
4| 10| 2.07424e-05| 2.07424e-06| 1.40%| return a+b
5| 0| 0| 0| 0.00%|
6| 0| 0| 0| 0.00%|
7| 2| 5.72205e-06| 2.86102e-06| 0.39%|def get_sum_of_list():
8| 1| 2.86102e-06| 2.86102e-06| 0.19%| final_list = []
9| 11| 2.40803e-05| 2.18912e-06| 1.62%| for i in range(10):
10| 10| 6.27041e-05| 6.27041e-06| 4.23%| rand1 = random.randint(1,100)
(call)| 10| 0.000565767| 5.65767e-05| 38.14%|# /home/sunny/anaconda3/lib/python3.7/random.py:218 randint
11| 10| 5.93662e-05| 5.93662e-06| 4.00%| rand2 = random.randint(1,100)
(call)| 10| 0.00053072| 5.3072e-05| 35.78%|# /home/sunny/anaconda3/lib/python3.7/random.py:218 randint
12| 10| 5.79357e-05| 5.79357e-06| 3.91%| out = add(rand1, rand2)
(call)| 10| 3.48091e-05| 3.48091e-06| 2.35%|# example_2.py:3 add
13| 10| 2.19345e-05| 2.19345e-06| 1.48%| final_list.append(out)
14| 1| 2.14577e-06| 2.14577e-06| 0.14%| return final_list
15| 0| 0| 0| 0.00%|
16| 0| 0| 0| 0.00%|
17| 1| 2.86102e-06| 2.86102e-06| 0.19%|if __name__ == "__main__":
18| 1| 9.53674e-06| 9.53674e-06| 0.64%| l = get_sum_of_list()
(call)| 1| 0.00136471| 0.00136471| 92.00%|# example_2.py:7 get_sum_of_list
19| 1| 4.55379e-05| 4.55379e-05| 3.07%| print(l)
20| 0| 0| 0| 0.00%|
As a part of the third example, we'll explain how we can eliminate files/modules for which we don't want profiling out. The pprofile
provides an option named exclude-syspath
which will remove reference to all modules that come with python like random, datetime, sys, etc. It also provides another option named --exclude
following which we can give string and it'll eliminate all files who starts with that string from profiling output. You can also mention more than one string/file names after --exclude
but you'll be able to specify the only one after one time calling that option. If you want to exclude say for example 2 modules named module1 and module2 then you can do it with the below command.
pprofile --exclude module1 --exclude module2 example_1.py
PROFILE SCRIPT
Below we have shown two commands which we executed on our script from the second example to remove the random module from profiling output. Both the command has the same impact in this case.
pprofile --exclude-syspath example_1.py
pprofile --exclude random example_1.py
OUTPUT
[121, 99, 103, 159, 60, 148, 94, 91, 118, 120]
Command line: example_2.py
Total duration: 0.0015018s
File: example_2.py
File duration: 0.000342607s (22.81%)
Line #| Hits| Time| Time per hit| %|Source code
------+----------+-------------+-------------+-------+-----------
1| 2| 1.93119e-05| 9.65595e-06| 1.29%|import random
2| 0| 0| 0| 0.00%|
3| 11| 1.88351e-05| 1.71228e-06| 1.25%|def add(a, b):
4| 10| 2.02656e-05| 2.02656e-06| 1.35%| return a+b
5| 0| 0| 0| 0.00%|
6| 0| 0| 0| 0.00%|
7| 2| 6.4373e-06| 3.21865e-06| 0.43%|def get_sum_of_list():
8| 1| 3.09944e-06| 3.09944e-06| 0.21%| final_list = []
9| 11| 2.40803e-05| 2.18912e-06| 1.60%| for i in range(10):
10| 10| 7.03335e-05| 7.03335e-06| 4.68%| rand1 = random.randint(1,100)
(call)| 10| 0.000574112| 5.74112e-05| 38.23%|# /home/sunny/anaconda3/lib/python3.7/random.py:218 randint
11| 10| 6.07967e-05| 6.07967e-06| 4.05%| rand2 = random.randint(1,100)
(call)| 10| 0.000538826| 5.38826e-05| 35.88%|# /home/sunny/anaconda3/lib/python3.7/random.py:218 randint
12| 10| 5.96046e-05| 5.96046e-06| 3.97%| out = add(rand1, rand2)
(call)| 10| 3.55244e-05| 3.55244e-06| 2.37%|# example_2.py:3 add
13| 10| 2.26498e-05| 2.26498e-06| 1.51%| final_list.append(out)
14| 1| 2.14577e-06| 2.14577e-06| 0.14%| return final_list
15| 0| 0| 0| 0.00%|
16| 0| 0| 0| 0.00%|
17| 1| 3.09944e-06| 3.09944e-06| 0.21%|if __name__ == "__main__":
18| 1| 9.29832e-06| 9.29832e-06| 0.62%| l = get_sum_of_list()
(call)| 1| 0.00139403| 0.00139403| 92.82%|# example_2.py:7 get_sum_of_list
19| 1| 2.26498e-05| 2.26498e-05| 1.51%| print(l)
20| 0| 0| 0| 0.00%|
--include
Option¶It might happen sometime that string passed in --exclude
excludes all the files matching it and you want to include some of the files then you can give that file names after the --include
option. If there is more than one file then you'll need to include the --include
option more than once like --exclude
.
pprofile --include random example_1.py
Callgrind
¶The callgrind is a format that counts function calls and CPU instructions executed within each call. The callgrind format has information about each function called in each file. If you are interested in learning about callgrind format in-depth than we suggest that you go through this tutorial.
The pprofile
also lets us generate output in this format by using the -f
or --format
option followed by the string callgrind
.
PROFILE SCRIPT
pprofile -f callgrind example_2.py
pprofile --format callgrind example_2.py
OUTPUT
Below is the output generated with callgrind format. There are generally function names followed by four integers in each row. Each row is referred to as the cost line.
[29, 90, 87, 100, 96, 139, 109, 138, 63, 39]
# callgrind format
version: 1
creator: pprofile
event: usphit :microseconds/hit
events: hits microseconds usphit
cmd: example_2.py
fl=example_2.py
fn=<module>:1
1 2 19 9
3 1 4 4
7 1 3 3
17 1 3 3
18 1 9 9
cfl=example_2.py
cfn=get_sum_of_list:7
calls=1 7
18 1 1384 1384
19 1 23 23
fn=add:3
3 10 14 1
4 10 20 2
fn=get_sum_of_list:7
7 1 2 2
8 1 3 3
9 11 24 2
10 10 70 7
cfl=/home/sunny/anaconda3/lib/python3.7/random.py
cfn=randint:218
calls=10 218
10 10 571 57
11 10 60 6
cfl=/home/sunny/anaconda3/lib/python3.7/random.py
cfn=randint:218
calls=10 218
11 10 533 53
12 10 58 5
cfl=example_2.py
cfn=add:3
calls=10 3
12 10 35 3
13 10 22 2
14 1 2 2
fl=/home/sunny/anaconda3/lib/python3.7/random.py
fn=randint:218
218 20 30 1
222 20 133 6
cfl=/home/sunny/anaconda3/lib/python3.7/random.py
cfn=randrange:174
calls=20 174
222 20 940 47
fn=randrange:174
174 20 40 2
184 20 53 2
185 20 48 2
187 20 46 2
193 20 46 2
194 20 47 2
196 20 45 2
197 20 46 2
198 20 138 6
cfl=/home/sunny/anaconda3/lib/python3.7/random.py
cfn=_randbelow:224
calls=20 224
198 20 425 21
fn=_randbelow:224
224 20 45 2
228 20 52 2
229 20 50 2
232 20 50 2
233 20 50 2
234 20 55 2
235 25 61 2
236 5 12 2
237 20 46 2
The pprofile
also lets us see profiling tracing out if we want to analyze how profiler is performing and whether it’s profiling all files as needed. We can do that by using the -v
option.
PROFILE SCRIPT
pprofile -v example_1.py
OUTPUT
0.000006 call example_1.py:1 <module>+0
0.000054 line example_1.py:1 <module>+0
0.000073 line example_1.py:5 <module>+4
0.000082 line example_1.py:13 <module>+12
0.000091 line example_1.py:14 <module>+13
0.000099 call example_1.py:5 get_sum_of_list+0
0.000106 line example_1.py:6 get_sum_of_list+1
0.000114 line example_1.py:7 get_sum_of_list+2
0.000124 line example_1.py:8 get_sum_of_list+3
0.000132 call example_1.py:1 add+0
0.000139 line example_1.py:2 add+1
0.000147 return example_1.py:2 add+1
0.000182 line example_1.py:9 get_sum_of_list+4
0.000204 line example_1.py:7 get_sum_of_list+2
0.000211 line example_1.py:8 get_sum_of_list+3
0.000217 call example_1.py:1 add+0
0.000223 line example_1.py:2 add+1
0.000230 return example_1.py:2 add+1
0.000238 line example_1.py:9 get_sum_of_list+4
0.000246 line example_1.py:7 get_sum_of_list+2
0.000253 line example_1.py:8 get_sum_of_list+3
0.000260 call example_1.py:1 add+0
0.000267 line example_1.py:2 add+1
0.000274 return example_1.py:2 add+1
0.000283 line example_1.py:9 get_sum_of_list+4
0.000290 line example_1.py:7 get_sum_of_list+2
0.000297 line example_1.py:8 get_sum_of_list+3
0.000304 call example_1.py:1 add+0
0.000310 line example_1.py:2 add+1
0.000318 return example_1.py:2 add+1
0.000326 line example_1.py:9 get_sum_of_list+4
0.000334 line example_1.py:7 get_sum_of_list+2
0.000340 line example_1.py:8 get_sum_of_list+3
0.000348 call example_1.py:1 add+0
0.000354 line example_1.py:2 add+1
0.000361 return example_1.py:2 add+1
0.000370 line example_1.py:9 get_sum_of_list+4
0.000377 line example_1.py:7 get_sum_of_list+2
0.000385 line example_1.py:10 get_sum_of_list+5
0.000392 return example_1.py:10 get_sum_of_list+5
0.000417 line example_1.py:15 <module>+14
[0, 2, 4, 6, 8]
0.000431 return example_1.py:15 <module>+14
0.000473 call /home/sunny/anaconda3/lib/python3.7/site-packages/pprofile/__init__.py:886 __exit__+0
0.000482 line /home/sunny/anaconda3/lib/python3.7/site-packages/pprofile/__init__.py:890 __exit__+4
0.000492 call /home/sunny/anaconda3/lib/python3.7/site-packages/pprofile/__init__.py:869 disable+0
0.000500 line /home/sunny/anaconda3/lib/python3.7/site-packages/pprofile/__init__.py:873 disable+4
0.000508 line /home/sunny/anaconda3/lib/python3.7/site-packages/pprofile/__init__.py:874 disable+5
Command line: example_1.py
Total duration: 0.000518084s
File: example_1.py
File duration: 0.000416517s (80.40%)
Line #| Hits| Time| Time per hit| %|Source code
------+----------+-------------+-------------+-------+-----------
1| 7| 7.93934e-05| 1.13419e-05| 15.32%|def add(a, b):
2| 5| 5.84126e-05| 1.16825e-05| 11.27%| return a+b
3| 0| 0| 0| 0.00%|
4| 0| 0| 0| 0.00%|
5| 2| 1.54972e-05| 7.7486e-06| 2.99%|def get_sum_of_list():
6| 1| 8.34465e-06| 8.34465e-06| 1.61%| final_list = []
7| 6| 4.41074e-05| 7.35124e-06| 8.51%| for i in range(5):
8| 5| 8.60691e-05| 1.72138e-05| 16.61%| out = add(i, i)
(call)| 5| 9.01222e-05| 1.80244e-05| 17.40%|# example_1.py:1 add
9| 5| 4.98295e-05| 9.9659e-06| 9.62%| final_list.append(out)
10| 1| 2.14577e-05| 2.14577e-05| 4.14%| return final_list
11| 0| 0| 0| 0.00%|
12| 0| 0| 0| 0.00%|
13| 1| 8.10623e-06| 8.10623e-06| 1.56%|if __name__ == "__main__":
14| 1| 2.02656e-05| 2.02656e-05| 3.91%| l = get_sum_of_list()
(call)| 1| 0.000307322| 0.000307322| 59.32%|# example_1.py:5 get_sum_of_list
15| 1| 2.5034e-05| 2.5034e-05| 4.83%| print(l)
16| 0| 0| 0| 0.00%|
The pprofile
also lets us perform statistical profiling if deterministic profiling is taking a lot of time and we just want to oversee which parts of code are taking a lot of time and not in-depth profiling results. We can perform statistical profiling using the -s
option followed by time specifying how often the profiler should record time. Below we are performing statistical profiling every 0.00001
seconds for a script for our previous example. Though the results generated are mostly specifying 0% because the time taken by the script is quite less. We have also excluded the thread
module from the output because statistical profiling seems to make use of it and it gets included in the output.
The statistical profiling will work well when particular functions in your code are taking a reasonable amount of time so that the statistical profiler will capture only that function and won't waste much time in profiling all other parts of code that might be running reasonably fast.
PROFILE SCRIPT
pprofile -s 0.00001 --exclude thread --exclude pprofile example_1.py
OUTPUT
[0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, 122, 124, 126, 128, 130, 132, 134, 136, 138, 140, 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, 168, 170, 172, 174, 176, 178, 180, 182, 184, 186, 188, 190, 192, 194, 196, 198]
Command line: example_1.py
Total duration: 0.000355482s
File: example_1.py
File duration: 0s (0.00%)
Line #| Hits| Time| Time per hit| %|Source code
------+----------+-------------+-------------+-------+-----------
1| 0| 0| 0| 0.00%|def add(a, b):
2| 0| 0| 0| 0.00%| return a+b
3| 0| 0| 0| 0.00%|
4| 0| 0| 0| 0.00%|
5| 0| 0| 0| 0.00%|def get_sum_of_list():
6| 0| 0| 0| 0.00%| final_list = []
7| 0| 0| 0| 0.00%| for i in range(100):
8| 0| 0| 0| 0.00%| out = add(i, i)
9| 0| 0| 0| 0.00%| final_list.append(out)
10| 0| 0| 0| 0.00%| return final_list
11| 0| 0| 0| 0.00%|
12| 0| 0| 0| 0.00%|
13| 0| 0| 0| 0.00%|if __name__ == "__main__":
14| 0| 0| 0| 0.00%| l = get_sum_of_list()
15| 1| 0| 0| 0.00%| print(l)
16| 0| 0| 0| 0.00%|
The pprofile
module also provides us with functionality to monitor profiling results in a multi-threading environment. We'll need to use the -t
option followed by a maximum number of threads that we expect to monitor. Below we have created a simple script for explaining multi-threading profiling. We have created a function named slow_random_gen()
which generates a random number between 1-100 and takes 2 seconds for it. We are creating 3 different threads and calling these functions in all threads to get a random number. Please make a note that to understand the below code it’s required to have knowledge of concurrent.futures
modules of python. If you want to learn about it then please feel free to check our tutorial on the same.
CODE multi_threaded_example.py
import time
import random
from concurrent.futures import ThreadPoolExecutor
def slow_random_gen():
time.sleep(2)
val = random.randint(1,100)
return val
def main_func():
ppe = ThreadPoolExecutor(max_workers=3)
futures = []
futures.append(ppe.submit(slow_random_gen))
futures.append(ppe.submit(slow_random_gen))
futures.append(ppe.submit(slow_random_gen))
print([future.result() for future in futures])
if __name__ == '__main__':
main_func()
PROFILE SCRIPT Below we have executed a script with profiling for 4 threads.
pprofile -t 4 --exclude-syspath --exclude frozen multi_threaded_example.py
OUTPUT
[34, 56, 13]
Command line: multi_threaded_example.py
Total duration: 2.01113s
File: multi_threaded_example.py
File duration: 6.00999s (298.84%)
Line #| Hits| Time| Time per hit| %|Source code
------+----------+-------------+-------------+-------+-----------
1| 2| 1.90735e-05| 9.53674e-06| 0.00%|import time
2| 1| 3.8147e-06| 3.8147e-06| 0.00%|import random
3| 1| 1.4782e-05| 1.4782e-05| 0.00%|from concurrent.futures import ThreadPoolExecutor
(call)| 1| 0.00248432| 0.00248432| 0.12%|# <frozen importlib._bootstrap>:1009 _handle_fromlist
4| 0| 0| 0| 0.00%|
5| 0| 0| 0| 0.00%|
6| 4| 1.12057e-05| 2.80142e-06| 0.00%|def slow_random_gen():
7| 3| 6.00906| 2.00302|298.79%| time.sleep(2)
8| 3| 0.000433922| 0.000144641| 0.02%| val = random.randint(1,100)
(call)| 3| 0.00138569| 0.000461896| 0.07%|# /home/sunny/anaconda3/lib/python3.7/random.py:218 randint
9| 3| 4.79221e-05| 1.5974e-05| 0.00%| return val
10| 0| 0| 0| 0.00%|
11| 2| 5.48363e-06| 2.74181e-06| 0.00%|def main_func():
12| 1| 1.12057e-05| 1.12057e-05| 0.00%| ppe = ThreadPoolExecutor(max_workers=3)
(call)| 1| 4.57764e-05| 4.57764e-05| 0.00%|# /home/sunny/anaconda3/lib/python3.7/concurrent/futures/thread.py:113 __init__
13| 1| 3.8147e-06| 3.8147e-06| 0.00%| futures = []
14| 0| 0| 0| 0.00%|
15| 1| 1.04904e-05| 1.04904e-05| 0.00%| futures.append(ppe.submit(slow_random_gen))
(call)| 1| 0.00103927| 0.00103927| 0.05%|# /home/sunny/anaconda3/lib/python3.7/concurrent/futures/thread.py:145 submit
16| 1| 1.04904e-05| 1.04904e-05| 0.00%| futures.append(ppe.submit(slow_random_gen))
(call)| 1| 0.000769615| 0.000769615| 0.04%|# /home/sunny/anaconda3/lib/python3.7/concurrent/futures/thread.py:145 submit
17| 1| 1.07288e-05| 1.07288e-05| 0.00%| futures.append(ppe.submit(slow_random_gen))
(call)| 1| 0.000736713| 0.000736713| 0.04%|# /home/sunny/anaconda3/lib/python3.7/concurrent/futures/thread.py:145 submit
18| 0| 0| 0| 0.00%|
19| 6| 0.000220537| 3.67562e-05| 0.01%| print([future.result() for future in futures])
(call)| 3| 2.00542| 0.668475| 99.72%|# /home/sunny/anaconda3/lib/python3.7/concurrent/futures/_base.py:405 result
(call)| 1| 2.00553| 2.00553| 99.72%|# multi_threaded_example.py:19 <listcomp>
20| 0| 0| 0| 0.00%|
21| 1| 3.09944e-06| 3.09944e-06| 0.00%|if __name__ == '__main__':
22| 1| 0.000114918| 0.000114918| 0.01%| main_func()
(call)| 1| 2.00829| 2.00829| 99.86%|# multi_threaded_example.py:11 main_func
(call)| 3| 7.43866e-05| 2.47955e-05| 0.00%|# /home/sunny/anaconda3/lib/python3.7/concurrent/futures/thread.py:167 weakref_cb
The pprofile
also provides a programming interface that lets us create profiling objects and profile only a particular part of code rather than profiling total script. We'll be explaining through various examples how we can perform profiling of python code this way as well.
The first and simple way of doing profiling in python code by using the pprofile
module is to create an object of class Profile
from it. Then we can use this object as a context manager for code for which we want to perform profiling. We can then call the print_stats()
method and it'll print profiling output.
Below we have written the same script as our earlier example. We have put the part where we are calling the main function into profiling instance context manager so that only that function will be profiled.
import pprofile
def add(a, b):
return a+b
def get_sum_of_list():
final_list = []
for i in range(100):
out = add(i, i)
final_list.append(out)
return final_list
profiler = pprofile.Profile()
with profiler():
l = get_sum_of_list()
profiler.print_stats()
The profiler object has an attribute named total_time
which will return the time taken for profiling.
profiler.total_time
callgrind()
¶We can also call the callgrind()
method on profiler object passing it sys.stdout
to output callgrind format output to standard output as explained below.
import sys
profiler.callgrind(sys.stdout)
dump_stats()
¶We can also dump output to some file by calling the dump_stats()
method passing the file name to it.
profiler.dump_stats("prof_results.out")
!cat prof_results.out
run()
¶The pprofile
also have a method named run which will let us run a single line of python code and profile it. We need to provide code as a string.
pprofile.run("[i*i for i in range(100)]")
runpath()
¶The runpath()
function lets us profile python script.
profiler = pprofile.Profile()
p = profiler.runpath("/home/sunny/profiling_examples/example_1.py", [])
p.print_stats()
[0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, 122, 124, 126, 128, 130, 132, 134, 136, 138, 140, 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, 168, 170, 172, 174, 176, 178, 180, 182, 184, 186, 188, 190, 192, 194, 196, 198]
Total duration: 0.000794649s
File: /home/sunny/profiling_examples/example_1.py
File duration: 0.000756741s (95.23%)
Line #| Hits| Time| Time per hit| %|Source code
------+----------+-------------+-------------+-------+-----------
1| 102| 9.53674e-05| 9.34975e-07| 12.00%|def add(a, b):
2| 100| 0.000100851| 1.00851e-06| 12.69%| return a+b
3| 0| 0| 0| 0.00%|
4| 0| 0| 0| 0.00%|
5| 2| 4.05312e-06| 2.02656e-06| 0.51%|def get_sum_of_list():
6| 1| 2.14577e-06| 2.14577e-06| 0.27%| final_list = []
7| 101| 0.000101566| 1.00561e-06| 12.78%| for i in range(100):
8| 100| 0.000309706| 3.09706e-06| 38.97%| out = add(i, i)
(call)| 100| 0.000180483| 1.80483e-06| 22.71%|# /home/sunny/profiling_examples/example_1.py:1 add
9| 100| 0.000106335| 1.06335e-06| 13.38%| final_list.append(out)
10| 1| 9.53674e-07| 9.53674e-07| 0.12%| return final_list
11| 0| 0| 0| 0.00%|
12| 0| 0| 0| 0.00%|
13| 1| 2.14577e-06| 2.14577e-06| 0.27%|if __name__ == "__main__":
14| 1| 6.91414e-06| 6.91414e-06| 0.87%| l = get_sum_of_list()
(call)| 1| 0.000703096| 0.000703096| 88.48%|# /home/sunny/profiling_examples/example_1.py:5 get_sum_of_list
15| 1| 2.67029e-05| 2.67029e-05| 3.36%| print(l)
16| 0| 0| 0| 0.00%|
The profiler object of pprofile
also has two interesting methods named enable()
and disable()
. This method lets us enable and disable profiler whenever needed. This way we can disable profiler when we don't want to profiler particular part of code which we are sure is running fast and optimized. We can enable profiling only during parts where we feel there might be bottlenecks.
import pprofile
profiler = pprofile.Profile()
def add(a, b):
return a+b
def get_sum_of_list():
profiler.enable()
final_list = []
for i in range(100):
out = add(i, i)
final_list.append(out)
profiler.disable()
return final_list
l = get_sum_of_list()
profiler.print_stats()
We can also perform statistical profiling in python code by creating an object of class StatisticProfile
. We can then use this instance as a context manager to perform statistical profiling of code. We can call this object passing its period specifying how often sampling recording time should happen.
Below we have performed statistical profiling of the same code as previous examples. Please make a note that below you see 0% because the script is taking quite less time.
import pprofile
stats_profiler = pprofile.StatisticProfile()
def add(a, b):
return a+b
def get_sum_of_list():
final_list = []
for i in range(1000000):
out = add(i, i)
final_list.append(out)
return final_list
with stats_profiler(period=0.00001, single=False):
l = get_sum_of_list()
stats_profiler.print_stats()
Total duration: 0.118409s
File: testing_pprofile.py
File duration: 0s (0.00%)
Line #| Hits| Time| Time per hit| %|Source code
------+----------+-------------+-------------+-------+-----------
1| 0| 0| 0| 0.00%|import pprofile
2| 0| 0| 0| 0.00%|
3| 0| 0| 0| 0.00%|stats_profiler = pprofile.StatisticProfile()
4| 0| 0| 0| 0.00%|
5| 2| 0| 0| 0.00%|def add(a, b):
6| 1| 0| 0| 0.00%| return a+b
7| 0| 0| 0| 0.00%|
8| 0| 0| 0| 0.00%|
9| 0| 0| 0| 0.00%|def get_sum_of_list():
10| 0| 0| 0| 0.00%| final_list = []
11| 2| 0| 0| 0.00%| for i in range(1000000):
12| 9| 0| 0| 0.00%| out = add(i, i)
(call)| 3| 0| 0| 0.00%|# testing_pprofile.py:5 add
13| 8| 0| 0| 0.00%| final_list.append(out)
14| 0| 0| 0| 0.00%| return final_list
15| 0| 0| 0| 0.00%|
16| 0| 0| 0| 0.00%|
17| 0| 0| 0| 0.00%|with stats_profiler(period=0.00001, single=False):
(call)| 1| 0| 0| 0.00%|# /home/sunny/anaconda3/lib/python3.7/site-packages/pprofile/__init__.py:1119 __enter__
18| 0| 0| 0| 0.00%| l = get_sum_of_list()
(call)| 22| 0| 0| 0.00%|# testing_pprofile.py:9 get_sum_of_list
19| 0| 0| 0| 0.00%|
20| 0| 0| 0| 0.00%|stats_profiler.print_stats()
The pprofile
also lets us perform profiling for multi-threading code. It provides us with the ThreadProfile
class which lets us perform deterministic profiling in a multi-threaded environment.
Below we have performed deterministic profiling of multi-threaded code which we had designed in our previous example on multi-threading.
import pprofile
import time
import random
from concurrent.futures import ThreadPoolExecutor
def slow_random_gen():
time.sleep(5)
val = random.randint(1,100)
return val
def main_func():
ppe = ThreadPoolExecutor(max_workers=3)
futures = []
futures.append(ppe.submit(slow_random_gen))
futures.append(ppe.submit(slow_random_gen))
futures.append(ppe.submit(slow_random_gen))
print([future.result() for future in futures])
threaded_profiler = pprofile.ThreadProfile()
with threaded_profiler():
main_func()
threaded_profiler.print_stats()
Total duration: 5.01263s
File: <ipython-input-143-7e1c86652dbd>
File duration: 15.0137s (299.52%)
Line #| Hits| Time| Time per hit| %|Source code
------+----------+-------------+-------------+-------+-----------
1| 0| 0| 0| 0.00%|import time
2| 0| 0| 0| 0.00%|import random
3| 0| 0| 0| 0.00%|from concurrent.futures import ThreadPoolExecutor
4| 0| 0| 0| 0.00%|
5| 0| 0| 0| 0.00%|
6| 3| 9.53674e-06| 3.17891e-06| 0.00%|def slow_random_gen():
7| 3| 15.013| 5.00435|299.50%| time.sleep(5)
8| 3| 0.000275612| 9.18706e-05| 0.01%| val = random.randint(1,100)
(call)| 3| 0.00121427| 0.000404755| 0.02%|# /home/sunny/anaconda3/lib/python3.7/random.py:218 randint
9| 3| 3.8147e-05| 1.27157e-05| 0.00%| return val
10| 0| 0| 0| 0.00%|
11| 1| 9.29832e-06| 9.29832e-06| 0.00%|def main_func():
12| 1| 3.0756e-05| 3.0756e-05| 0.00%| ppe = ThreadPoolExecutor(max_workers=3)
(call)| 1| 7.15256e-05| 7.15256e-05| 0.00%|# /home/sunny/anaconda3/lib/python3.7/concurrent/futures/thread.py:113 __init__
13| 1| 3.09944e-06| 3.09944e-06| 0.00%| futures = []
14| 0| 0| 0| 0.00%|
15| 1| 1.23978e-05| 1.23978e-05| 0.00%| futures.append(ppe.submit(slow_random_gen))
(call)| 1| 0.00184584| 0.00184584| 0.04%|# /home/sunny/anaconda3/lib/python3.7/concurrent/futures/thread.py:145 submit
16| 1| 1.07288e-05| 1.07288e-05| 0.00%| futures.append(ppe.submit(slow_random_gen))
(call)| 1| 0.000720739| 0.000720739| 0.01%|# /home/sunny/anaconda3/lib/python3.7/concurrent/futures/thread.py:145 submit
17| 1| 1.3113e-05| 1.3113e-05| 0.00%| futures.append(ppe.submit(slow_random_gen))
(call)| 1| 0.00124812| 0.00124812| 0.02%|# /home/sunny/anaconda3/lib/python3.7/concurrent/futures/thread.py:145 submit
18| 0| 0| 0| 0.00%|
19| 6| 0.000213623| 3.56038e-05| 0.00%| print([future.result() for future in futures])
(call)| 3| 5.00662| 1.66887| 99.88%|# /home/sunny/anaconda3/lib/python3.7/concurrent/futures/_base.py:405 result
(call)| 1| 5.00673| 5.00673| 99.88%|# <ipython-input-143-7e1c86652dbd>:19 <listcomp>
(call)| 2| 0.00160623| 0.000803113| 0.03%|# /home/sunny/anaconda3/lib/python3.7/site-packages/ipykernel/iostream.py:386 write
20| 0| 0| 0| 0.00%|
21| 0| 0| 0| 0.00%|threaded_profiler = pprofile.ThreadProfile()
22| 0| 0| 0| 0.00%|
23| 0| 0| 0| 0.00%|with threaded_profiler():
24| 0| 0| 0| 0.00%| main_func()
(call)| 1| 5.0124| 5.0124|100.00%|# <ipython-input-143-7e1c86652dbd>:11 main_func
(call)| 3| 6.19888e-05| 2.06629e-05| 0.00%|# /home/sunny/anaconda3/lib/python3.7/concurrent/futures/thread.py:167 weakref_cb
The pprofile
also provides us with a class named StatisticalThread
to perform statistical profiling in multi-threaded code. Below we have performed statistical profiling of multi-threaded code.
import pprofile
import time
import random
from concurrent.futures import ThreadPoolExecutor
def slow_random_gen():
time.sleep(5)
val = random.randint(1,100)
return val
def main_func():
ppe = ThreadPoolExecutor(max_workers=3)
futures = []
futures.append(ppe.submit(slow_random_gen))
futures.append(ppe.submit(slow_random_gen))
futures.append(ppe.submit(slow_random_gen))
print([future.result() for future in futures])
threaded_profiler = pprofile.StatisticalThread(period=0.00001, single=False)
with threaded_profiler():
main_func()
threaded_profiler.print_stats()
Total duration: 5.00778s
File: <ipython-input-149-73facb723294>
File duration: 0s (0.00%)
Line #| Hits| Time| Time per hit| %|Source code
------+----------+-------------+-------------+-------+-----------
1| 0| 0| 0| 0.00%|import time
2| 0| 0| 0| 0.00%|import random
3| 0| 0| 0| 0.00%|from concurrent.futures import ThreadPoolExecutor
4| 0| 0| 0| 0.00%|
5| 0| 0| 0| 0.00%|
6| 0| 0| 0| 0.00%|def slow_random_gen():
7| 95490| 0| 0| 0.00%| time.sleep(5)
8| 0| 0| 0| 0.00%| val = random.randint(1,100)
9| 0| 0| 0| 0.00%| return val
10| 0| 0| 0| 0.00%|
11| 0| 0| 0| 0.00%|def main_func():
12| 0| 0| 0| 0.00%| ppe = ThreadPoolExecutor(max_workers=3)
13| 0| 0| 0| 0.00%| futures = []
14| 0| 0| 0| 0.00%|
15| 0| 0| 0| 0.00%| futures.append(ppe.submit(slow_random_gen))
(call)| 1| 0| 0| 0.00%|# /home/sunny/anaconda3/lib/python3.7/concurrent/futures/thread.py:145 submit
16| 0| 0| 0| 0.00%| futures.append(ppe.submit(slow_random_gen))
(call)| 2| 0| 0| 0.00%|# /home/sunny/anaconda3/lib/python3.7/concurrent/futures/thread.py:145 submit
17| 0| 0| 0| 0.00%| futures.append(ppe.submit(slow_random_gen))
(call)| 1| 0| 0| 0.00%|# /home/sunny/anaconda3/lib/python3.7/concurrent/futures/thread.py:145 submit
18| 0| 0| 0| 0.00%|
19| 0| 0| 0| 0.00%| print([future.result() for future in futures])
(call)| 31837| 0| 0| 0.00%|# /home/sunny/anaconda3/lib/python3.7/concurrent/futures/_base.py:405 result
(call)| 31837| 0| 0| 0.00%|# <ipython-input-149-73facb723294>:19 <listcomp>
(call)| 1| 0| 0| 0.00%|# /home/sunny/anaconda3/lib/python3.7/site-packages/ipykernel/iostream.py:386 write
20| 0| 0| 0| 0.00%|
21| 0| 0| 0| 0.00%|threaded_profiler = pprofile.StatisticalThread(period=0.00001, single=False)
22| 0| 0| 0| 0.00%|
23| 0| 0| 0| 0.00%|with threaded_profiler():
(call)| 1| 0| 0| 0.00%|# /home/sunny/anaconda3/lib/python3.7/site-packages/pprofile/__init__.py:1119 __enter__
24| 0| 0| 0| 0.00%| main_func()
(call)| 31842| 0| 0| 0.00%|# <ipython-input-149-73facb723294>:11 main_func
This ends our small tutorial explaining how to use pprofile
to profile Python code. Please feel free to let us know your views in the comments section.
If you are more comfortable learning through video tutorials then we would recommend that you subscribe to our YouTube channel.
When going through coding examples, it's quite common to have doubts and errors.
If you have doubts about some code examples or are stuck somewhere when trying our code, send us an email at coderzcolumn07@gmail.com. We'll help you or point you in the direction where you can find a solution to your problem.
You can even send us a mail if you are trying something new and need guidance regarding coding. We'll try to respond as soon as possible.
If you want to