Updated On : Nov-28,2019 Time Investment : ~15 mins


The glob module finds all paths that match a given pattern, following the rules used by the Unix shell.

It handles the wildcards * and ?, as well as character sets and ranges written inside [].

It does not perform tilde expansion (~ for the user's home directory), though.

It returns results in arbitrary order rather than in sorted order.

It supports both relative and absolute path matching.
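The two limitations above (no tilde expansion, arbitrary ordering) are usually worked around with os.path.expanduser and sorted. The snippet below is a minimal sketch of that idea; the helper name sorted_glob and the file names are hypothetical, chosen only for illustration, and it works in a throwaway directory so nothing in the current folder is touched.

```python
import glob
import os
import tempfile

def sorted_glob(pattern):
    """Expand a leading ~ (if any) and return the matches in sorted order."""
    expanded = os.path.expanduser(pattern)
    return sorted(glob.glob(expanded))

# Create a few empty files in a temporary directory for the demonstration.
with tempfile.TemporaryDirectory() as tmp:
    for name in ('c.txt', 'a.txt', 'b.txt', 'notes.md'):
        open(os.path.join(tmp, name), 'w').close()

    # glob.escape protects the directory part in case it contains *, ? or [.
    matches = sorted_glob(os.path.join(glob.escape(tmp), '*.txt'))
    demo_names = [os.path.basename(p) for p in matches]
    print(demo_names)
```

A pattern such as '~/*.txt' can be passed to the same helper; expanduser replaces the tilde before glob ever sees it.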

import glob
import sys

print('Creating directory structures and files for experimentation purpose.')
%mkdir folder_l1_1
%mkdir folder_l1_2
%mkdir folder_l1_1/folder_l2

!touch temp.txt temp.jpg temp.png a.png b.jpg c.txt d.txt t1.png t2.jpg .temp.png .temp.jpg
!touch folder_l1_1/t.txt folder_l1_1/t2.png folder_l1_1/a.mp4 folder_l1_1/b.mp4 folder_l1_1/c.mpeg folder_l1_1/t3.jpg
!touch folder_l1_2/t2.txt folder_l1_2/t1.png folder_l1_2/b.mp4 folder_l1_2/c.mp4 folder_l1_2/d.mpeg
!touch folder_l1_1/folder_l2/t.txt folder_l1_1/folder_l2/t2.png folder_l1_1/folder_l2/a.mp4

print('\nCurrent directory contents : ')
%ls
print('\nfolder_l1_1 directory contents : ')
%ls folder_l1_1
print('\nfolder_l1_2 directory contents :')
%ls folder_l1_2
print('\nfolder_l1_1/folder_l2 directory contents : ')
%ls folder_l1_1/folder_l2
Creating directory structures and files for experimentation purpose.

Current directory contents :
__notebook_source__.ipynb  c.txt         folder_l1_2/  temp.jpg
a.png                      d.txt         t1.png        temp.png
b.jpg                      folder_l1_1/  t2.jpg        temp.txt

folder_l1_1 directory contents :
a.mp4  b.mp4  c.mpeg  folder_l2/  t.txt  t2.png  t3.jpg

folder_l1_2 directory contents :
b.mp4  c.mp4  d.mpeg  t1.png  t2.txt

folder_l1_1/folder_l2 directory contents :
a.mp4  t.txt  t2.png
  • glob(pathname, recursive=False) - Returns a list of all paths that match the pattern. If recursive is True, the ** pattern also matches files in any subdirectories.
print(glob.glob('[a-z]+.txt')) ## This does not work like a regular expression: + is treated as a literal character. Only *, ? and [] are special.
['d.txt', 'c.txt', 'temp.txt']
['temp.png', 't1.png', 'a.png']
['temp.jpg', 't2.jpg', 'b.jpg']
['temp.png', 't1.png', 'temp.jpg', 't2.jpg', 'a.png', 'b.jpg']
['.temp.jpg', '.temp.png', '.ipynb_checkpoints']
['.temp.jpg', '.temp.png']
['d.txt', 'c.txt', 'temp.txt']
['t1.png', 't2.jpg']
['t1.png', 't2.jpg']
['t1.png', 't2.jpg']
['t1.png', 't2.jpg']
['folder_l1_1/t.txt', 'folder_l1_2/t2.txt']
['folder_l1_1/t2.png', 'folder_l1_2/t1.png']
['folder_l1_1/t2.png', 'folder_l1_1/c.mpeg', 'folder_l1_1/t3.jpg', 'folder_l1_2/t1.png', 'folder_l1_2/d.mpeg']
['folder_l1_1/folder_l2/t2.png', 'folder_l1_1/folder_l2/a.mp4', 'folder_l1_1/folder_l2/t.txt']
['folder_l1_2/t1.png', 'folder_l1_2/b.mp4', 'folder_l1_2/d.mpeg', 'folder_l1_2/c.mp4', 'folder_l1_2/t2.txt']
['d.txt', 'c.txt', 'temp.txt', 'folder_l1_1/t.txt', 'folder_l1_1/folder_l2/t.txt', 'folder_l1_2/t2.txt']
['temp.png', 't1.png', 'a.png', 'folder_l1_1/t2.png', 'folder_l1_1/folder_l2/t2.png', 'folder_l1_2/t1.png']
['temp.png', 't1.png', 'temp.jpg', 't2.jpg', 'a.png', 'b.jpg', 'folder_l1_1/t2.png', 'folder_l1_1/c.mpeg', 'folder_l1_1/t3.jpg', 'folder_l1_1/folder_l2/t2.png', 'folder_l1_2/t1.png', 'folder_l1_2/d.mpeg']
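The wildcard rules described above can be tried out with a sketch like the one below. It is not the original notebook code: it rebuilds a small, assumed subset of the directory tree inside a temporary folder, and the helper names() is hypothetical. Results are sorted and made relative to the temporary root so they are stable from run to run.

```python
import glob
import os
import tempfile

def names(pattern, root, recursive=False):
    """Run glob under root and return sorted paths relative to root."""
    full_pattern = os.path.join(glob.escape(root), pattern)
    hits = glob.glob(full_pattern, recursive=recursive)
    return sorted(os.path.relpath(p, root).replace(os.sep, '/') for p in hits)

with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, 'folder_l1_1', 'folder_l2'))
    for rel in ('temp.txt', 'c.txt', 't1.png', 't2.jpg', '.temp.png',
                'folder_l1_1/t.txt', 'folder_l1_1/folder_l2/t.txt'):
        open(os.path.join(root, *rel.split('/')), 'w').close()

    star = names('*.txt', root)             # * matches any run of characters
    question = names('t?.*', root)          # ? matches exactly one character
    char_range = names('t[0-9].*', root)    # [] matches one character from a set
    hidden = names('.*', root)              # dotfiles need an explicit leading dot
    one_level = names('*/*.txt', root)      # * does not cross a directory separator
    deep = names('**/*.txt', root, recursive=True)  # ** spans zero or more directories
    regex_like = names('[a-z]+.txt', root)  # + is literal, so nothing matches here

    for label, result in [('*.txt', star), ('t?.*', question),
                          ('t[0-9].*', char_range), ('.*', hidden),
                          ('*/*.txt', one_level), ('**/*.txt', deep),
                          ('[a-z]+.txt', regex_like)]:
        print(label, '->', result)
```

Note that files whose names start with a dot are skipped by * and ** unless the pattern itself begins with a dot.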
  • iglob(pathname, recursive=False) - Returns an iterator over all paths that match the pattern. If recursive is True, the ** pattern also matches files in any subdirectories. An iterator is the better choice when a large number of files can match, because the matching paths are produced one at a time instead of being held in one big list in memory.
%time normal_dir_list = glob.glob('**/*.*g',recursive=True) ## This one takes more time and memory because it builds the full list of matching paths up front.
%time iterator_dir_list = glob.iglob('**/*.*g',recursive=True) ## This takes far less time because it only creates the iterator and does not build the whole list in memory. Each path is generated when it is requested from the iterator.
print('Size of list : %d bytes'%sys.getsizeof(normal_dir_list))
print('Size of iterator : %d bytes'%sys.getsizeof(iterator_dir_list))
CPU times: user 4 ms, sys: 0 ns, total: 4 ms
Wall time: 2.06 ms
CPU times: user 0 ns, sys: 0 ns, total: 0 ns
Wall time: 9.78 µs
Size of list : 160 bytes
Size of iterator : 88 bytes
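Because iglob is lazy, it pairs well with tools that stop early. The following is a minimal sketch, assuming a temporary folder with five hypothetical img_N.png files: it pulls only the first two matches with itertools.islice and then shows that the same iterator continues from where it stopped.

```python
import glob
import itertools
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Five empty files to match against.
    for i in range(5):
        open(os.path.join(root, 'img_%d.png' % i), 'w').close()

    pattern = os.path.join(glob.escape(root), '*.png')
    lazy_matches = glob.iglob(pattern)  # creating the iterator returns immediately

    # Take only the first two paths; the rest are not requested yet.
    first_two = list(itertools.islice(lazy_matches, 2))

    # The iterator remembers its position, so counting now sees the other three.
    remaining = sum(1 for _ in lazy_matches)

    print(len(first_two), 'taken,', remaining, 'left')
```

An iterator can be consumed only once; call iglob again if the matches are needed a second time.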
  • escape(pathname) - Escapes all special characters (*, ? and [) in pathname, which is useful when a literal file name contains any of these characters.
!touch temp?tea.txt
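A short sketch of how escape changes a match is given below. It is an illustration rather than the original notebook cell: it uses the hypothetical file names report[1].txt and report1.txt, because square brackets are allowed in file names on all major platforms, whereas ? is not allowed on Windows. The last line shows what escape does to the temp?tea.txt name used above.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    for name in ('report[1].txt', 'report1.txt'):
        open(os.path.join(root, name), 'w').close()

    safe_root = glob.escape(root)

    # Unescaped, [1] is a character set, so the pattern matches report1.txt.
    raw = glob.glob(os.path.join(safe_root, 'report[1].txt'))

    # Escaped, the brackets are taken literally and match report[1].txt.
    escaped = glob.glob(os.path.join(safe_root, glob.escape('report[1].txt')))

    raw_names = [os.path.basename(p) for p in raw]
    escaped_names = [os.path.basename(p) for p in escaped]
    print(raw_names, escaped_names)

# escape wraps each special character in its own bracket pair.
escaped_pattern = glob.escape('temp?tea.txt')
print(escaped_pattern)
```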
Sunny Solanki
