Updated On : Mar-10,2021 Tags mimetypes
mimetypes - Guide to Determine MIME Type of File

A Multipurpose Internet Mail Extensions (MIME) type is a string that identifies the kind of data a file contains. It is used on the Internet (originally in email) so that software receiving the data knows how to handle it. A MIME type consists of two parts separated by a slash: the first is the main type and the second is the subtype, as in text/html. Early email carried only plain text, but it later gained support for attachments such as audio, video, XML, and PDF. Each of these kinds of data is stored in files of a different format, and the MIME type provides the name used to identify each one.
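
As a small illustration of the main type/subtype structure, the sketch below splits a few MIME type strings at the slash. The helper name split_mime_type is our own, introduced only for this example, and it assumes the string carries no extra parameters such as a charset.

```python
def split_mime_type(mime_type):
    """Split 'type/subtype' into its two parts (hypothetical helper)."""
    main_type, _, subtype = mime_type.partition("/")
    return main_type, subtype


for value in ["text/html", "application/pdf", "video/mp4"]:
    main_type, subtype = split_mime_type(value)
    print("{:18s} main type : {:12s} subtype : {}".format(value, main_type, subtype))
```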

Developers often do not know the MIME type of a file in advance and need to determine it programmatically. Python provides a module named mimetypes whose functions map file extensions to MIME types and vice versa. The guesses are based only on the file name or URL; the contents of the file are not inspected. In this tutorial, we'll explain the functions of the mimetypes module that find the MIME type for a file name or URL and the file extensions for a MIME type.

We'll start by importing the mimetypes module.

In [1]:
import mimetypes

Determine MIME Type Based on URL/File Name


Important Methods of mimetypes Module

  • guess_type(url, strict=True) - This function accepts a URL or filename and returns a tuple of two values: the MIME type and the encoding. The encoding is the name of the program used to encode the file (for example gzip) and is suitable for use in the Content-Encoding HTTP header. If the type cannot be guessed, the MIME type is None.
    • If strict is True (the default), only IANA-registered MIME types are considered. If it is False, some additional non-standard but commonly used MIME types are also recognized.

Below are a few simple examples of using guess_type() to determine the MIME type of a file or URL.

In [20]:
mime_type, encoding = mimetypes.guess_type("https://docs.python.org/3/library/platform.html")
print("MIME Type : {:20s}, Encoding : {}".format(mime_type, encoding))

mime_type, encoding = mimetypes.guess_type("Deploying a Django Application to Google App Engine.pdf")
print("MIME Type : {:20s}, Encoding : {}".format(mime_type, encoding))

mime_type, encoding = mimetypes.guess_type("docs.zip")
print("MIME Type : {:20s}, Encoding : {}".format(mime_type, encoding))

mime_type, encoding = mimetypes.guess_type("brazil_flights_data.csv")
print("MIME Type : {:20s}, Encoding : {}".format(mime_type, encoding))

mime_type, encoding = mimetypes.guess_type("dr_apj_kalam.jpeg")
print("MIME Type : {:20s}, Encoding : {}".format(mime_type, encoding))

mime_type, encoding = mimetypes.guess_type("https://coderzcolumn.com/sitemap.xml")
print("MIME Type : {:20s}, Encoding : {}".format(mime_type, encoding))

mime_type, encoding = mimetypes.guess_type("Videos/snakeviz_1.mp4")
print("MIME Type : {:20s}, Encoding : {}".format(mime_type, encoding))
MIME Type : text/html           , Encoding : None
MIME Type : application/pdf     , Encoding : None
MIME Type : application/zip     , Encoding : None
MIME Type : text/csv            , Encoding : None
MIME Type : image/jpeg          , Encoding : None
MIME Type : application/xml     , Encoding : None
MIME Type : video/mp4           , Encoding : None
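
All of the names above resolve to a MIME type and have no encoding. The sketch below covers the two remaining cases: a compressed file, where the second value of the tuple is filled in, and a name without a recognizable extension, where the guessed type is None. The file names and the describe helper are made up for illustration. Note that passing None to a {:20s} format field raises a TypeError, so the helper converts the guess to a string first.

```python
import mimetypes


def describe(name, strict=True):
    """Return a printable summary (hypothetical helper, not part of mimetypes)."""
    mime_type, encoding = mimetypes.guess_type(name, strict=strict)
    # str() keeps the format call safe when the guessed type is None.
    return "{:15s} MIME Type : {:20s}, Encoding : {}".format(name, str(mime_type), encoding)


print(describe("backup.tar.gz"))  # type of the inner file plus the gzip encoding
print(describe("backup.tgz"))     # .tgz is treated like .tar.gz
print(describe("README"))         # no extension, so nothing can be guessed
```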

Determine File Extension from MIME Type


Important Methods of mimetypes Module

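  • guess_extension(type, strict=True) - This function accepts a MIME type string and returns one file extension (including the leading dot) associated with it, or None if no extension is known for that type. The strict parameter works exactly as in guess_type().

Below we first retrieve the MIME type of a few files and URLs and then pass it to guess_extension() to determine a file extension. Note that the extension returned is not necessarily the one the file name started with: in the output below, text/html maps to .htm and image/jpeg to .jpe. Which extension is chosen can differ between Python versions.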

In [21]:
mime_type, encoding = mimetypes.guess_type("https://docs.python.org/3/library/platform.html")
extension = mimetypes.guess_extension(mime_type)
print("MIME Type : {:20s}, Extension : {}".format(mime_type, extension))

mime_type, encoding = mimetypes.guess_type("Deploying a Django Application to Google App Engine.pdf")
extension = mimetypes.guess_extension(mime_type)
print("MIME Type : {:20s}, Extension : {}".format(mime_type, extension))

mime_type, encoding = mimetypes.guess_type("docs.zip")
extension = mimetypes.guess_extension(mime_type)
print("MIME Type : {:20s}, Extension : {}".format(mime_type, extension))

mime_type, encoding = mimetypes.guess_type("brazil_flights_data.csv")
extension = mimetypes.guess_extension(mime_type)
print("MIME Type : {:20s}, Extension : {}".format(mime_type, extension))

mime_type, encoding = mimetypes.guess_type("dr_apj_kalam.jpeg")
extension = mimetypes.guess_extension(mime_type)
print("MIME Type : {:20s}, Extension : {}".format(mime_type, extension))

mime_type, encoding = mimetypes.guess_type("https://coderzcolumn.com/sitemap.xml")
extension = mimetypes.guess_extension(mime_type)
print("MIME Type : {:20s}, Extension : {}".format(mime_type, extension))

mime_type, encoding = mimetypes.guess_type("Videos/snakeviz_1.mp4")
extension = mimetypes.guess_extension(mime_type)
print("MIME Type : {:20s}, Extension : {}".format(mime_type, extension))
MIME Type : text/html           , Extension : .htm
MIME Type : application/pdf     , Extension : .pdf
MIME Type : application/zip     , Extension : .zip
MIME Type : text/csv            , Extension : .csv
MIME Type : image/jpeg          , Extension : .jpe
MIME Type : application/xml     , Extension : .rdf
MIME Type : video/mp4           , Extension : .mp4
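
Because guess_extension() returns None for a MIME type it does not know, code that builds file names from the result usually needs a fallback. A minimal sketch follows; the extension_for helper, the ".bin" default, and the unknown type string are our own choices for illustration.

```python
import mimetypes


def extension_for(mime_type, default=".bin"):
    """Hypothetical helper: fall back to `default` when no extension is known."""
    extension = mimetypes.guess_extension(mime_type)
    return extension if extension is not None else default


print("download" + extension_for("application/pdf"))
print("download" + extension_for("application/x-no-such-type"))
```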

Determine All File Extensions Represented by MIME Type


Important Methods of mimetypes Module

  • guess_all_extensions(type, strict=True) - This function accepts a MIME type string and returns a list of all file extensions associated with it. The strict parameter works exactly as in guess_type(). A single MIME type is often associated with more than one file extension.

Our code for this example is almost the same as the previous example; the only difference is that we use guess_all_extensions() instead.

In [22]:
mime_type, encoding = mimetypes.guess_type("https://docs.python.org/3/library/platform.html")
extension = mimetypes.guess_all_extensions(mime_type)
print("MIME Type : {:20s}, Extension : {}".format(mime_type, extension))

mime_type, encoding = mimetypes.guess_type("Deploying a Django Application to Google App Engine.pdf")
extension = mimetypes.guess_all_extensions(mime_type)
print("MIME Type : {:20s}, Extension : {}".format(mime_type, extension))

mime_type, encoding = mimetypes.guess_type("docs.zip")
extension = mimetypes.guess_all_extensions(mime_type)
print("MIME Type : {:20s}, Extension : {}".format(mime_type, extension))

mime_type, encoding = mimetypes.guess_type("brazil_flights_data.csv")
extension = mimetypes.guess_all_extensions(mime_type)
print("MIME Type : {:20s}, Extension : {}".format(mime_type, extension))

mime_type, encoding = mimetypes.guess_type("dr_apj_kalam.jpeg")
extension = mimetypes.guess_all_extensions(mime_type)
print("MIME Type : {:20s}, Extension : {}".format(mime_type, extension))

mime_type, encoding = mimetypes.guess_type("https://coderzcolumn.com/sitemap.xml")
extension = mimetypes.guess_all_extensions(mime_type)
print("MIME Type : {:20s}, Extension : {}".format(mime_type, extension))

mime_type, encoding = mimetypes.guess_type("Videos/snakeviz_1.mp4")
extension = mimetypes.guess_all_extensions(mime_type)
print("MIME Type : {:20s}, Extension : {}".format(mime_type, extension))
MIME Type : text/html           , Extension : ['.htm', '.html', '.shtml']
MIME Type : application/pdf     , Extension : ['.pdf']
MIME Type : application/zip     , Extension : ['.zip']
MIME Type : text/csv            , Extension : ['.csv']
MIME Type : image/jpeg          , Extension : ['.jpe', '.jpeg', '.jpg']
MIME Type : application/xml     , Extension : ['.rdf', '.wsdl', '.xpdl', '.xsl', '.xml', '.xsd']
MIME Type : video/mp4           , Extension : ['.mp4']
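
One possible use of guess_all_extensions() is checking whether a file name carries an extension that is known for a declared MIME type. The sketch below assumes that scenario; the extension_matches helper and the file names are hypothetical. It also shows that an unknown MIME type gives an empty list rather than None.

```python
import mimetypes
import os


def extension_matches(filename, mime_type):
    """Hypothetical helper: True if the file's extension is known for mime_type."""
    extension = os.path.splitext(filename)[1].lower()
    return extension in mimetypes.guess_all_extensions(mime_type)


print(extension_matches("photo.JPG", "image/jpeg"))
print(extension_matches("photo.png", "image/jpeg"))
print(mimetypes.guess_all_extensions("application/x-no-such-type"))
```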

MIME Type Details Files


Important Attributes of mimetypes Module

  • knownfiles - A list of file names (mime.types files) commonly installed on systems. Where present, the module reads them for mappings between MIME types and file extensions.
  • suffix_map - A dictionary mapping suffixes to suffixes. It lets the module recognize files whose type and encoding are indicated by a single extension; for example, .tgz is mapped to .tar.gz.
  • encodings_map - A dictionary mapping file extensions to encodings.
  • types_map - A dictionary mapping file extensions to MIME types.
  • common_types - A dictionary mapping file extensions to non-standard but commonly found MIME types.

In [51]:
print("List of files having MIME mapping details : {}".format(mimetypes.knownfiles))

print("\nMIME Types Suffix Mapping : {}".format(mimetypes.suffix_map))

print("\nExtension to Encoding Mapping : {}".format(mimetypes.encodings_map))

types_mapping = mimetypes.types_map
print("\nMIME Types Mapping : {}".format(list(types_mapping.items())[:10]))
print("\nMIME Type of 3gp : {}".format(types_mapping['.3gp']))
print("MIME Type of mp4 : {}".format(types_mapping['.mp4']))
print("MIME Type of pdf : {}".format(types_mapping['.pdf']))

print("\nCommon MIME Types Mapping : {}".format(mimetypes.common_types))
List of files having MIME mapping details : ['/etc/mime.types', '/etc/httpd/mime.types', '/etc/httpd/conf/mime.types', '/etc/apache/mime.types', '/etc/apache2/mime.types', '/usr/local/etc/httpd/conf/mime.types', '/usr/local/lib/netscape/mime.types', '/usr/local/etc/httpd/conf/mime.types', '/usr/local/etc/mime.types']

MIME Types Suffix Mapping : {'.svgz': '.svg.gz', '.tgz': '.tar.gz', '.taz': '.tar.gz', '.tz': '.tar.gz', '.tbz2': '.tar.bz2', '.txz': '.tar.xz'}

Extension to Encoding Mapping : {'.gz': 'gzip', '.Z': 'compress', '.bz2': 'bzip2', '.xz': 'xz'}

MIME Types Mapping : [('.a', 'application/octet-stream'), ('.ai', 'application/postscript'), ('.aif', 'audio/x-aiff'), ('.aifc', 'audio/x-aiff'), ('.aiff', 'audio/x-aiff'), ('.au', 'audio/basic'), ('.avi', 'video/x-msvideo'), ('.bat', 'application/x-msdos-program'), ('.bcpio', 'application/x-bcpio'), ('.bin', 'application/octet-stream')]

MIME Type of 3gp : video/3gpp
MIME Type of mp4 : video/mp4
MIME Type of pdf : application/pdf

Common MIME Types Mapping : {'.jpg': 'image/jpg', '.mid': 'audio/midi', '.midi': 'audio/midi', '.pct': 'image/pict', '.pic': 'image/pict', '.pict': 'image/pict', '.rtf': 'application/rtf', '.xul': 'text/xul'}
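
Since these attributes are ordinary dictionaries, they can be queried and rearranged with the usual dictionary tools. The sketch below, which is only one way to do it, inverts types_map to group extensions by MIME type and uses get() to avoid a KeyError for an unknown extension. The ".nosuchext" extension and the "application/octet-stream" fallback are our own choices for illustration.

```python
import mimetypes
from collections import defaultdict

# Invert types_map: MIME type -> list of extensions that map to it.
extensions_by_type = defaultdict(list)
for extension, mime_type in mimetypes.types_map.items():
    extensions_by_type[mime_type].append(extension)

print("Extensions for image/jpeg : {}".format(sorted(extensions_by_type["image/jpeg"])))

# get() returns the fallback instead of raising KeyError for unknown extensions.
print("Unknown extension : {}".format(mimetypes.types_map.get(".nosuchext", "application/octet-stream")))

# suffix_map and encodings_map together explain how .tgz is interpreted.
expanded = mimetypes.suffix_map[".tgz"]
print(".tgz expands to {} and .gz means {}".format(expanded, mimetypes.encodings_map[".gz"]))
```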

Read MIME Types File


  • read_mime_types(filename) - This function accepts the name of a mime.types file and loads the type map it contains. It returns a dictionary mapping file extensions (including the leading dot) to MIME types, or None if the file does not exist or cannot be read.

In [56]:
mime_types = mimetypes.read_mime_types("/etc/mime.types")

list(mime_types.items())[:10]
Out[56]:
[('.a', 'application/octet-stream'),
 ('.ai', 'application/postscript'),
 ('.aif', 'audio/x-aiff'),
 ('.aifc', 'audio/x-aiff'),
 ('.aiff', 'audio/x-aiff'),
 ('.au', 'audio/basic'),
 ('.avi', 'video/x-msvideo'),
 ('.bat', 'application/x-msdos-program'),
 ('.bcpio', 'application/x-bcpio'),
 ('.bin', 'application/octet-stream')]
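
The file /etc/mime.types is not present on every system, in which case read_mime_types() returns None and the call to items() above fails. The sketch below avoids that dependency by writing a small file in the mime.types format (a MIME type followed by its extensions, one type per line) to a temporary directory and reading it back. The MIME type and extensions in it are made-up names for illustration.

```python
import mimetypes
import os
import tempfile

# Hypothetical mapping file in the mime.types format: "type ext1 ext2 ...".
content = "# custom mappings\napplication/x-example-data exdat exd\n"

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "mime.types")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(content)
    custom_types = mimetypes.read_mime_types(path)
    # A file that does not exist yields None instead of raising an error.
    missing = mimetypes.read_mime_types(os.path.join(folder, "absent.types"))

print("MIME Type of .exdat : {}".format(custom_types[".exdat"]))
print("MIME Type of .exd   : {}".format(custom_types[".exd"]))
print("Result for a missing file : {}".format(missing))
```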

Add New MIME Type to File Extension Mapping


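  • add_type(type, ext, strict=True) - This function accepts a MIME type and a file extension and adds a mapping from the extension to the type. If the extension is already known, the new type replaces the old one. When strict is True (the default), the mapping is added to the official MIME types available through the types_map attribute; otherwise it is added to the non-standard ones.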

In [62]:
mimetypes.add_type("text/json", ".ipynb", strict=True)

types_mapping = mimetypes.types_map

print(".ipynb present in mapping? : {}".format(".ipynb" in types_mapping))

print("\nLast few mappings : {}".format(list(types_mapping.items())[-5:]))
.ipynb present in mapping? : True

Last few mappings : [('.mkv', 'video/x-matroska'), ('.ice', 'x-conference/x-cooltalk'), ('.sisx', 'x-epoc/x-sisx-app'), ('.vrm', 'x-world/x-vrml'), ('.ipynb', 'text/json')]
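
The strict flag of add_type() decides which lookups can see the new mapping. The sketch below registers one extension in each mode and then queries both with guess_type(). The MIME types and the .exnb and .exdraft extensions are made-up names used only for this illustration.

```python
import mimetypes

# Both MIME types and both extensions are hypothetical, chosen for illustration.
mimetypes.add_type("application/x-example-notebook", ".exnb")  # strict=True by default
mimetypes.add_type("application/x-example-draft", ".exdraft", strict=False)

# A strict mapping is found by the default (strict) lookup.
print(mimetypes.guess_type("report.exnb"))

# A non-strict mapping is only found when strict=False is passed.
print(mimetypes.guess_type("notes.exdraft"))
print(mimetypes.guess_type("notes.exdraft", strict=False))

print(".exdraft in types_map?    : {}".format(".exdraft" in mimetypes.types_map))
print(".exdraft in common_types? : {}".format(".exdraft" in mimetypes.common_types))
```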

This ends our small tutorial explaining how to use the mimetypes module to determine MIME types from file names/URLs and vice versa. Please feel free to let us know your views in the comments section.

Sunny Solanki