Updated On : Dec-30,2021 Tags beautifulsoup, modify-ht…

BeautifulSoup: Guide To Modify HTML

BeautifulSoup is a widely used Python library for parsing HTML/XML documents and retrieving information from them. Apart from parsing and searching, its API also provides methods to modify the document itself: we can change the text of tags, add new tags, rename existing tags, add or change attributes, remove tags, and so on. These operations change both the contents and the structure of the HTML/XML document, and BeautifulSoup updates the parsed tree (parent and sibling links) for us as we make them.

We have already covered how to use BeautifulSoup to parse HTML documents in a separate tutorial that goes through the majority of its API. Please feel free to check that tutorial as well.

As a part of this tutorial, we'll be primarily concentrating on how to use the API of BeautifulSoup to modify the parsed HTML document. Below we have listed important sections of the tutorial to give an overview of the material covered.

Important Sections Of Tutorial

  1. Create Soup Object to Easily Parse HTML of Web Page
  2. How to Change HTML Tag Name?
  3. How to Modify Text of HTML Tag?
    • Replace Existing Text using '.string' Attribute.
    • Add Text using append() Method
    • Add List of Strings using extend() Method
  4. How to Add New Attributes to HTML Tag?
  5. How to Modify existing Attribute's value for HTML Tag?
  6. How to Create New HTML Tag and Add it in HTML Document (soup)?
    • Add New Tag using 'append()' Method
    • Add New Tag using 'insert()' Method
    • Insert Tag using 'insert_before()' and 'insert_after()' Methods
  7. How to Clear Contents of a HTML Tag?
  8. How to Remove Tag from HTML Document (soup)?
  9. How to Replace a Tag with Another Tag in HTML Document (soup)?
  10. How to Wrap HTML Tag inside Another HTML Tag?
  11. How to Replace HTML Tag with its Contents (Unwrap HTML Tag)?
In [173]:
import bs4

print("BeautifulSoup Version : {}".format(bs4.__version__))
BeautifulSoup Version : 4.10.0

1. Create Soup Object to Easily Parse HTML of Web Page

In this section, we have created a simple HTML document with a few tags. This will make things easier to understand when we add new tags, remove tags, modify attributes, etc.

We create a BeautifulSoup object by passing this HTML document as a string to the BeautifulSoup() constructor. The second argument is a string naming the parser that will be used to parse the document; here we use 'html.parser', the parser from Python's standard library. The resulting BeautifulSoup object holds the parsed HTML and provides the methods that we'll use to modify it.

If you want to know more about the BeautifulSoup object, please feel free to check our tutorial on parsing HTML with BeautifulSoup.

In [341]:
sample_html= '''<html>
<head>
    <title>CoderzColumn : Developed for Developers by Developers for the betterment of Development.</title>
    <script src="static/script1.js"></script>
    <script src="static/script2.js"></script>
    <link rel="stylesheet" href="static/stylesheet.css" type="text/css" />
</head>
<body>
    <p id='start'>Welcome to CoderzColumn</p>
    <p id='main_para'>We regularly publish tutorials on various topics 
    (Python, Machine learning, Data Visualization, Digital Marketing, etc.) regularly explaining 
    how to use various Python libraries.</p>
    <p id='sub_para'>Below are list of Important Sections of Our Website : </p>
        <ul>
            <li><a href='https://coderzcolumn.com/blogs'>Blogs</a></li>
            <li><a href='https://coderzcolumn.com/tutorials'>Tutorials</a></li>
            <li><a href='https://coderzcolumn.com/about'>About</a></li>
            <li><a href='https://coderzcolumn.com/contact-us'>Contact US</a></li>
        </ul>
    <p id='end'>Please feel free to send us mail @ coderzcolumn07@gmail.com if you need any 
    information about any article or want us to publish article on particular topic.</p>
</body>
</html>'''
In [342]:
from bs4 import BeautifulSoup

soup = BeautifulSoup(sample_html, 'html.parser')

print(soup)
<html>
<head>
<title>CoderzColumn : Developed for Developers by Developers for the betterment of Development.</title>
<script src="static/script1.js"></script>
<script src="static/script2.js"></script>
<link href="static/stylesheet.css" rel="stylesheet" type="text/css"/>
</head>
<body>
<p id="start">Welcome to CoderzColumn</p>
<p id="main_para">We regularly publish tutorials on various topics
    (Python, Machine learning, Data Visualization, Digital Marketing, etc.) regularly explaining
    how to use various Python libraries.</p>
<p id="sub_para">Below are list of Important Sections of Our Website : </p>
<ul>
<li><a href="https://coderzcolumn.com/blogs">Blogs</a></li>
<li><a href="https://coderzcolumn.com/tutorials">Tutorials</a></li>
<li><a href="https://coderzcolumn.com/about">About</a></li>
<li><a href="https://coderzcolumn.com/contact-us">Contact US</a></li>
</ul>
<p id="end">Please feel free to send us mail @ coderzcolumn07@gmail.com if you need any
    information about any article or want us to publish article on particular topic.</p>
</body>
</html>

2. How to Change HTML Tag Name?

In this section, we'll explain how we can change the name of an HTML tag. Every Tag object in BeautifulSoup has a property named name which holds the name of the HTML tag. Assigning a new value to this property changes the tag name, and both the opening and closing tags reflect the change when the document is printed.

Below, we first create a copy of our original BeautifulSoup object using copy.deepcopy() and apply the modifications to the copy, so the original stays intact. We follow this approach in every section.

We have modified the name of a few HTML tags in the copied BeautifulSoup object. Note that soup_new.a refers only to the first 'a' tag, so only the first link is renamed in the second example.

Please make a NOTE that we'll be using various methods available through BeautifulSoup to find tags. These methods are explained in detail in our first tutorial on BeautifulSoup, hence we have not included their description here.

In [176]:
import copy

soup_new = copy.deepcopy(soup)
In [177]:
main_para = soup_new.find(id="main_para")

print(type(main_para))

main_para
<class 'bs4.element.Tag'>
Out[177]:
<p id="main_para">We regularly publish tutorials on various topics
    (Python, Machine learning, Data Visualization, Digital Marketing, etc.) regularly explaining
    how to use various Python libraries.</p>
In [178]:
main_para.name = "div"
In [179]:
main_para
Out[179]:
<div id="main_para">We regularly publish tutorials on various topics
    (Python, Machine learning, Data Visualization, Digital Marketing, etc.) regularly explaining
    how to use various Python libraries.</div>
In [180]:
soup_new.find(id="main_para")
Out[180]:
<div id="main_para">We regularly publish tutorials on various topics
    (Python, Machine learning, Data Visualization, Digital Marketing, etc.) regularly explaining
    how to use various Python libraries.</div>
In [181]:
soup_new.ul
Out[181]:
<ul>
<li><a href="https://coderzcolumn.com/blogs">Blogs</a></li>
<li><a href="https://coderzcolumn.com/tutorials">Tutorials</a></li>
<li><a href="https://coderzcolumn.com/about">About</a></li>
<li><a href="https://coderzcolumn.com/contact-us">Contact US</a></li>
</ul>
In [182]:
soup_new.a.name = "link"
In [183]:
soup_new.ul
Out[183]:
<ul>
<li><link href="https://coderzcolumn.com/blogs">Blogs</link></li>
<li><a href="https://coderzcolumn.com/tutorials">Tutorials</a></li>
<li><a href="https://coderzcolumn.com/about">About</a></li>
<li><a href="https://coderzcolumn.com/contact-us">Contact US</a></li>
</ul>
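
To rename every matching tag rather than only the first one, we can loop over the result of find_all(). The snippet below is a minimal sketch on a small stand-in document (not the sample_html used above), and the choice of 'span' as the new name is only for illustration.

```python
from bs4 import BeautifulSoup

# Small stand-in document used only for this sketch.
html = "<ul><li><a href='/blogs'>Blogs</a></li><li><a href='/about'>About</a></li></ul>"
soup = BeautifulSoup(html, "html.parser")

# find_all() returns every match, so each 'a' tag gets renamed.
for tag in soup.find_all("a"):
    tag.name = "span"

print(soup)
```

Attributes such as href are kept as they are; only the tag name changes.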

3. How to Modify Text of HTML Tag?

In this section, we'll explain how we can modify the text stored between the start and end of a particular HTML tag. The text within the tag is stored as a NavigableString object. There are different ways to modify it.

  1. Using '.string' property of Tag object.
  2. Using append() method of Tag object.
  3. Using extend() method of Tag object.

We'll explain all three ways of modifying text below with simple examples.

Replace Existing Text using '.string' Attribute.

We can replace the existing text of an HTML tag by assigning a new string to the '.string' property of the Tag object. The assignment replaces the entire existing contents of the tag, including any child tags, with the new string.

In [184]:
import copy

soup_new = copy.deepcopy(soup)
In [185]:
first_link = soup_new.a

first_link
Out[185]:
<a href="https://coderzcolumn.com/blogs">Blogs</a>
In [186]:
first_link.string = "Blogs (143)"

first_link.string
Out[186]:
'Blogs (143)'
In [187]:
soup_new.a.string
Out[187]:
'Blogs (143)'
In [188]:
p_start = soup_new.find(id="start")

p_start
Out[188]:
<p id="start">Welcome to CoderzColumn</p>
In [189]:
p_start.string = "Welcome to CoderzColumn, Have a Great Learning Experience."
In [190]:
p_start
Out[190]:
<p id="start">Welcome to CoderzColumn, Have a Great Learning Experience.</p>
In [191]:
soup_new.find(id="start")
Out[191]:
<p id="start">Welcome to CoderzColumn, Have a Great Learning Experience.</p>
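
One point worth checking before assigning to '.string' is whether the tag has child tags, because they are dropped as well. The snippet below is a small sketch on a made-up one-line document that shows this behavior.

```python
from bs4 import BeautifulSoup

# Made-up document: the paragraph holds a string and a child <b> tag.
soup = BeautifulSoup("<p>Hello <b>World</b></p>", "html.parser")
p = soup.p

# With more than one child, .string has no single string to return.
before = p.string

# Assigning to .string clears all children and stores the new text.
p.string = "Hi"
after = str(p)

print(before)
print(after)
```

If the child tags need to survive, modifying the individual NavigableString (for example with replace_with()) is the safer route.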

Add Text using append() Method

The append() method of the Tag object accepts a string and adds it after the existing contents of the HTML tag. It works like the append() method of a Python list: the new string becomes an additional child of the tag.

In [192]:
import copy

soup_new = copy.deepcopy(soup)
In [193]:
p_start = soup_new.p

p_start
Out[193]:
<p id="start">Welcome to CoderzColumn</p>
In [194]:
p_start.append(", Have a Great Learning Experience.")
In [195]:
p_start, soup_new.p
Out[195]:
(<p id="start">Welcome to CoderzColumn, Have a Great Learning Experience.</p>,
 <p id="start">Welcome to CoderzColumn, Have a Great Learning Experience.</p>)
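
Although the printed tag looks like it holds a single string, append() stores the new text as a separate child. The sketch below, on a made-up one-line document, inspects the children and then merges them with smooth(), which we assume is available (it was added in Beautiful Soup 4.8.0).

```python
from bs4 import BeautifulSoup

soup = BeautifulSoup("<p>Welcome</p>", "html.parser")
p = soup.p

p.append(", friend")

children_after_append = len(p.contents)  # two separate string nodes
string_after_append = p.string           # None, because there are two children
text_after_append = p.get_text()         # combined text of all children

# smooth() merges adjacent strings (assumption: Beautiful Soup 4.8.0 or newer).
p.smooth()
children_after_smooth = len(p.contents)

print(children_after_append, text_after_append, children_after_smooth)
```

When only the combined text matters, get_text() is enough and smooth() can be skipped.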

Add List of Strings using extend() Method

The extend() method accepts a list of strings and appends each of them, in order, to the end of the existing contents of the HTML tag. It works like the extend() method of a Python list.

In [196]:
import copy

soup_new = copy.deepcopy(soup)
In [197]:
p_start = soup_new.p

p_start
Out[197]:
<p id="start">Welcome to CoderzColumn</p>
In [198]:
p_start.extend([", ", "Have a Great", " Learning Experience", "."])
In [199]:
p_start, soup_new.p
Out[199]:
(<p id="start">Welcome to CoderzColumn, Have a Great Learning Experience.</p>,
 <p id="start">Welcome to CoderzColumn, Have a Great Learning Experience.</p>)
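
The same extend() call also accepts Tag objects, which is handy for adding several new elements at once. The following is a minimal sketch on an empty list; the labels are placeholders chosen for illustration.

```python
from bs4 import BeautifulSoup

soup = BeautifulSoup("<ul></ul>", "html.parser")

# Build the new <li> tags first (placeholder labels).
items = []
for label in ["Blogs", "Tutorials"]:
    li = soup.new_tag("li")
    li.string = label
    items.append(li)

# extend() appends every element of the list, in order.
soup.ul.extend(items)

print(soup.ul)
```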

4. How to Add New Attributes to HTML Tag?

We can retrieve the value of any attribute of an HTML tag by treating the Tag object like a Python dictionary. The same approach adds a new attribute: assigning a value to a key that the tag does not have yet creates that attribute.

In [200]:
import copy

soup_new = copy.deepcopy(soup)
In [201]:
p_start = soup_new.p

p_start
Out[201]:
<p id="start">Welcome to CoderzColumn</p>
In [202]:
p_start["name"] = "Welcome Paragraph"
In [203]:
p_start, soup_new.p
Out[203]:
(<p id="start" name="Welcome Paragraph">Welcome to CoderzColumn</p>,
 <p id="start" name="Welcome Paragraph">Welcome to CoderzColumn</p>)
In [204]:
link = soup_new.a

link
Out[204]:
<a href="https://coderzcolumn.com/blogs">Blogs</a>
In [205]:
link["target"] = "_blank"
In [206]:
link, soup_new.a
Out[206]:
(<a href="https://coderzcolumn.com/blogs" target="_blank">Blogs</a>,
 <a href="https://coderzcolumn.com/blogs" target="_blank">Blogs</a>)
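
The dictionary-style interface covers a few more cases that come up when editing attributes. The sketch below, on a made-up paragraph, sets a multi-valued class attribute from a list, reads a missing attribute safely with get(), and deletes an attribute with del. The attribute names are only examples.

```python
from bs4 import BeautifulSoup

soup = BeautifulSoup("<p id='start'>Welcome</p>", "html.parser")
p = soup.p

# A list value is rendered as a space-separated attribute.
p["class"] = ["intro", "highlight"]
p["data-section"] = "welcome"
rendered = str(p)

# get() returns None instead of raising KeyError for a missing attribute.
missing = p.get("title")

# del removes an attribute, just like deleting a dictionary key.
del p["data-section"]

print(rendered)
print(p.attrs)
```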

5. How to Modify existing Attribute's value for HTML Tag?

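We can modify the value of any existing attribute of an HTML tag by treating the Tag object as a dictionary-like object and assigning a new value with the attribute name as the key. Assigning to a key that already exists overwrites its value. Assigning to a key that does not exist adds the attribute, which is what happens with the 'name' attribute in the second example below.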

In [207]:
import copy

soup_new = copy.deepcopy(soup)
In [208]:
link = soup_new.a

link
Out[208]:
<a href="https://coderzcolumn.com/blogs">Blogs</a>
In [209]:
link["href"] = "https://coderzcolumn.com/blogs_latest"
In [210]:
link, soup_new.a
Out[210]:
(<a href="https://coderzcolumn.com/blogs_latest">Blogs</a>,
 <a href="https://coderzcolumn.com/blogs_latest">Blogs</a>)
In [211]:
p_end = soup_new.find(id="end")

p_end
Out[211]:
<p id="end">Please feel free to send us mail @ coderzcolumn07@gmail.com if you need any
    information about any article or want us to publish article on particular topic.</p>
In [212]:
p_end["name"] = "End Paragraph"
In [213]:
p_end
Out[213]:
<p id="end" name="End Paragraph">Please feel free to send us mail @ coderzcolumn07@gmail.com if you need any
    information about any article or want us to publish article on particular topic.</p>
In [214]:
soup_new.find(id="end")
Out[214]:
<p id="end" name="End Paragraph">Please feel free to send us mail @ coderzcolumn07@gmail.com if you need any
    information about any article or want us to publish article on particular topic.</p>

6. How to Create New HTML Tag and Add it in HTML Document (soup)?

In this section, we'll explain how we can create a new HTML tag and add it to a BeautifulSoup object. The standard way to create a new tag is the new_tag() method of the BeautifulSoup object. We pass the HTML tag name as a string, and the method creates a new Tag object and returns it. Attributes of the tag can be given after the tag name, either as keyword arguments or as a dictionary through the attrs parameter. The new Tag object is not part of the document until we add it using one of the methods below.

Below is the list of methods available from BeautifulSoup and Tag objects that let us add a new Tag to an HTML document.

  • append()
  • insert()
  • insert_before()
  • insert_after()

Add New Tag using 'append()' Method

In this section, we have explained how we can use append() method to add new Tag to HTML document. We have created a few 'li' HTML tags and added them to our existing unordered list tag.

In [215]:
import copy

soup_new = copy.deepcopy(soup)
In [216]:
unordered_list = soup_new.ul

unordered_list
Out[216]:
<ul>
<li><a href="https://coderzcolumn.com/blogs">Blogs</a></li>
<li><a href="https://coderzcolumn.com/tutorials">Tutorials</a></li>
<li><a href="https://coderzcolumn.com/about">About</a></li>
<li><a href="https://coderzcolumn.com/contact-us">Contact US</a></li>
</ul>
In [217]:
new_option = soup_new.new_tag("li")

new_option
Out[217]:
<li></li>
In [218]:
unordered_list.append(new_option)
In [219]:
unordered_list
Out[219]:
<ul>
<li><a href="https://coderzcolumn.com/blogs">Blogs</a></li>
<li><a href="https://coderzcolumn.com/tutorials">Tutorials</a></li>
<li><a href="https://coderzcolumn.com/about">About</a></li>
<li><a href="https://coderzcolumn.com/contact-us">Contact US</a></li>
<li></li></ul>
In [220]:
soup_new.ul
Out[220]:
<ul>
<li><a href="https://coderzcolumn.com/blogs">Blogs</a></li>
<li><a href="https://coderzcolumn.com/tutorials">Tutorials</a></li>
<li><a href="https://coderzcolumn.com/about">About</a></li>
<li><a href="https://coderzcolumn.com/contact-us">Contact US</a></li>
<li></li></ul>
In [221]:
new_option = soup_new.new_tag("li")

new_option
Out[221]:
<li></li>
In [222]:
#new_link = soup_new.new_tag("a", attrs={'href':"https://coderzcolumn.com/privacy_policy"})
new_link = soup_new.new_tag("a", href="https://coderzcolumn.com/privacy_policy")

new_link.string = "Privacy Policy"

new_link
Out[222]:
<a href="https://coderzcolumn.com/privacy_policy">Privacy Policy</a>
In [223]:
new_option.append(new_link)

new_option
Out[223]:
<li><a href="https://coderzcolumn.com/privacy_policy">Privacy Policy</a></li>
In [224]:
soup_new.ul.append("\n")
soup_new.ul.append(new_option)
soup_new.ul.append("\n")
In [225]:
soup_new.ul
Out[225]:
<ul>
<li><a href="https://coderzcolumn.com/blogs">Blogs</a></li>
<li><a href="https://coderzcolumn.com/tutorials">Tutorials</a></li>
<li><a href="https://coderzcolumn.com/about">About</a></li>
<li><a href="https://coderzcolumn.com/contact-us">Contact US</a></li>
<li></li>
<li><a href="https://coderzcolumn.com/privacy_policy">Privacy Policy</a></li>
</ul>
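
One behavior to keep in mind with append() is that a Tag can live in only one place: appending a tag that is already part of the document moves it instead of duplicating it. The sketch below uses a made-up two-list document and copy.copy() to get an independent duplicate.

```python
import copy

from bs4 import BeautifulSoup

soup = BeautifulSoup("<ul id='a'><li>One</li></ul><ul id='b'></ul>", "html.parser")
first_list = soup.find(id="a")
second_list = soup.find(id="b")

# Appending an attached tag moves it out of its old parent.
item = first_list.li
second_list.append(item)
first_after_move = str(first_list)

# copy.copy() creates a detached duplicate that can be added elsewhere.
first_list.append(copy.copy(item))

print(first_after_move)
print(soup)
```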

Add New Tag using 'insert()' Method

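In this section, we have explained how we can insert a new tag using the insert() method. Its first argument is the index at which to insert, and its second argument is the Tag object (or string) to insert. The index is a position among the children of the tag (its .contents list), not a character offset within the text, and it works like the insert() method of a Python list. In the first example below, the 'a' tag has a single child (the string 'Blogs'), so index 5 is past the end and the new string is simply placed last.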

In [250]:
import copy

soup_new = copy.deepcopy(soup)
In [251]:
link = soup_new.a

link
Out[251]:
<a href="https://coderzcolumn.com/blogs">Blogs</a>
In [252]:
link.insert(5, " (143)")
In [253]:
link
Out[253]:
<a href="https://coderzcolumn.com/blogs">Blogs (143)</a>
In [227]:
unordered_list = soup_new.ul

unordered_list
Out[227]:
<ul>
<li><a href="https://coderzcolumn.com/blogs">Blogs</a></li>
<li><a href="https://coderzcolumn.com/tutorials">Tutorials</a></li>
<li><a href="https://coderzcolumn.com/about">About</a></li>
<li><a href="https://coderzcolumn.com/contact-us">Contact US</a></li>
</ul>
In [228]:
new_option = soup_new.new_tag("li")

new_link = soup_new.new_tag("a", href="https://coderzcolumn.com/privacy_policy")

new_link.string = "Privacy Policy"

new_option.append(new_link)

new_option
Out[228]:
<li><a href="https://coderzcolumn.com/privacy_policy">Privacy Policy</a></li>
In [229]:
unordered_list.insert(0, "\n")
unordered_list.insert(0, new_option)
unordered_list.insert(0, "\n")
In [230]:
unordered_list
Out[230]:
<ul>
<li><a href="https://coderzcolumn.com/privacy_policy">Privacy Policy</a></li>

<li><a href="https://coderzcolumn.com/blogs">Blogs</a></li>
<li><a href="https://coderzcolumn.com/tutorials">Tutorials</a></li>
<li><a href="https://coderzcolumn.com/about">About</a></li>
<li><a href="https://coderzcolumn.com/contact-us">Contact US</a></li>
</ul>
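
Because whitespace between tags is stored as string children, the numeric index passed to insert() is easy to get wrong. The sketch below looks up the position of an existing child with the index() method of Tag and inserts right before it. The small list is a made-up example.

```python
from bs4 import BeautifulSoup

html = "<ul>\n<li>A</li>\n<li>C</li>\n</ul>"
soup = BeautifulSoup(html, "html.parser")
ul = soup.ul

# The newlines count as children: "\n", <li>A</li>, "\n", <li>C</li>, "\n".
child_count = len(ul.contents)

new_item = soup.new_tag("li")
new_item.string = "B"

# Find the child position of the second <li> and insert before it.
target = ul.find_all("li")[1]
ul.insert(ul.index(target), new_item)

labels = [li.get_text() for li in ul.find_all("li")]
print(child_count, labels)
```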

Insert Tag using 'insert_before()' and 'insert_after()' Methods

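The insert_before() and insert_after() methods work like the insert() method, but the position is given relative to an existing element instead of as an index. They insert an HTML tag (or string) immediately before or after the element on which they are called. Below we have explained with simple examples how we can use them to add tags to an HTML document.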

In [275]:
import copy

soup_new = copy.deepcopy(soup)
In [276]:
p_inter1 = soup_new.new_tag("p", id="intermediate_para1")

p_inter1.string = "We have more than 250 Tutorials on Python."

p_inter1
Out[276]:
<p id="intermediate_para1">We have more than 250 Tutorials on Python.</p>
In [277]:
soup_new.ul.insert_before(p_inter1)
In [278]:
soup_new.find(id="intermediate_para1")
Out[278]:
<p id="intermediate_para1">We have more than 250 Tutorials on Python.</p>
In [279]:
p_inter2 = soup_new.new_tag("p", id="intermediate_para2")

p_inter2.string = "We have more than 50 Tutorials on Digital marketing."

p_inter2
Out[279]:
<p id="intermediate_para2">We have more than 50 Tutorials on Digital marketing.</p>
In [280]:
soup_new.ul.insert_after(p_inter2)
In [281]:
soup_new.find(id="intermediate_para2")
Out[281]:
<p id="intermediate_para2">We have more than 50 Tutorials on Digital marketing.</p>
In [282]:
soup_new.find_all("p")
Out[282]:
[<p id="start">Welcome to CoderzColumn</p>,
 <p id="main_para">We regularly publish tutorials on various topics
     (Python, Machine learning, Data Visualization, Digital Marketing, etc.) regularly explaining
     how to use various Python libraries.</p>,
 <p id="sub_para">Below are list of Important Sections of Our Website : </p>,
 <p id="intermediate_para1">We have more than 250 Tutorials on Python.</p>,
 <p id="intermediate_para2">We have more than 50 Tutorials on Digital marketing.</p>,
 <p id="end">Please feel free to send us mail @ coderzcolumn07@gmail.com if you need any
     information about any article or want us to publish article on particular topic.</p>]
In [283]:
bold = soup_new.new_tag("b")

bold.string = " (143)"

bold
Out[283]:
<b> (143)</b>
In [284]:
soup_new.a
Out[284]:
<a href="https://coderzcolumn.com/blogs">Blogs</a>
In [285]:
soup_new.a.string.insert_after(bold)
In [286]:
soup_new.a
Out[286]:
<a href="https://coderzcolumn.com/blogs">Blogs<b> (143)</b></a>
In [287]:
italic = soup_new.new_tag("i")

italic.string = "All "

italic
Out[287]:
<i>All </i>
In [297]:
soup_new.a.contents
Out[297]:
['Blogs', <b> (143)</b>]
In [298]:
soup_new.a.contents[0].insert_before(italic)
In [299]:
soup_new.a
Out[299]:
<a href="https://coderzcolumn.com/blogs"><i>All </i>Blogs<b> (143)</b></a>
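
Both methods need the reference element to already be part of a tree, since "before" and "after" are defined relative to its parent. The sketch below, on a made-up document, places two paragraphs around an existing one and then shows the ValueError we expect when the reference tag is detached (behavior as observed in the 4.x releases we are aware of).

```python
from bs4 import BeautifulSoup

soup = BeautifulSoup("<div><p id='mid'>Middle</p></div>", "html.parser")
mid = soup.find(id="mid")

first = soup.new_tag("p")
first.string = "First"
last = soup.new_tag("p")
last.string = "Last"

mid.insert_before(first)
mid.insert_after(last)
order = [p.get_text() for p in soup.find_all("p")]

# A freshly created tag has no parent, so there is nothing to insert next to.
orphan = soup.new_tag("p")
try:
    orphan.insert_before(soup.new_tag("hr"))
    orphan_failed = False
except ValueError:
    orphan_failed = True

print(order, orphan_failed)
```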

7. How to Clear Contents of a HTML Tag?

The Tag and BeautifulSoup objects provide a method named clear() which removes the contents of a tag. It deletes all sub-tags and text of the HTML tag on which it is called, while the tag itself and its attributes stay in the document.

Below we demonstrate clear() with a few simple examples.

In [300]:
import copy

soup_new = copy.deepcopy(soup)
In [302]:
soup_new.ul
Out[302]:
<ul>
<li><a href="https://coderzcolumn.com/blogs">Blogs</a></li>
<li><a href="https://coderzcolumn.com/tutorials">Tutorials</a></li>
<li><a href="https://coderzcolumn.com/about">About</a></li>
<li><a href="https://coderzcolumn.com/contact-us">Contact US</a></li>
</ul>
In [303]:
soup_new.ul.clear()
In [304]:
soup_new.ul
Out[304]:
<ul></ul>
In [309]:
p_end = soup_new.find(id="end")

p_end
Out[309]:
<p id="end">Please feel free to send us mail @ coderzcolumn07@gmail.com if you need any
    information about any article or want us to publish article on particular topic.</p>
In [310]:
p_end.clear()
In [311]:
p_end
Out[311]:
<p id="end"></p>
In [312]:
soup_new.find(id="end")
Out[312]:
<p id="end"></p>
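
A typical reason to clear a tag is to rebuild its contents from fresh data. The sketch below empties a made-up navigation list and refills it; the labels are placeholders.

```python
from bs4 import BeautifulSoup

soup = BeautifulSoup("<ul class='nav'><li>Old</li></ul>", "html.parser")
ul = soup.ul

# clear() removes the children but keeps the tag and its attributes.
ul.clear()

for label in ["Home", "Docs"]:
    li = soup.new_tag("li")
    li.string = label
    ul.append(li)

labels = [li.get_text() for li in ul.find_all("li")]
print(ul)
```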

8. How to Remove Tag from HTML Document (soup)?

In this section, we have explained how we can remove a particular HTML tag from a BeautifulSoup object. The Tag object provides a method named extract() which removes the tag, together with everything inside it, from the document tree and returns it. The same method is available on NavigableString objects, so text can be removed in the same way.

Below we have explained a few examples demonstrating how extract() works. We call extract() on the Tag object that we want to remove from the soup object.

In [343]:
import copy

soup_new = copy.deepcopy(soup)
In [344]:
soup_new.ul
Out[344]:
<ul>
<li><a href="https://coderzcolumn.com/blogs">Blogs</a></li>
<li><a href="https://coderzcolumn.com/tutorials">Tutorials</a></li>
<li><a href="https://coderzcolumn.com/about">About</a></li>
<li><a href="https://coderzcolumn.com/contact-us">Contact US</a></li>
</ul>
In [345]:
soup_new.ul.a.extract()
Out[345]:
<a href="https://coderzcolumn.com/blogs">Blogs</a>
In [346]:
soup_new.ul
Out[346]:
<ul>
<li></li>
<li><a href="https://coderzcolumn.com/tutorials">Tutorials</a></li>
<li><a href="https://coderzcolumn.com/about">About</a></li>
<li><a href="https://coderzcolumn.com/contact-us">Contact US</a></li>
</ul>
In [347]:
soup_new.p
Out[347]:
<p id="start">Welcome to CoderzColumn</p>
In [348]:
soup_new.p.string.extract()
Out[348]:
'Welcome to CoderzColumn'
In [349]:
soup_new.p
Out[349]:
<p id="start"></p>
In [351]:
soup_new.li.parent
Out[351]:
<ul>
<li></li>
<li><a href="https://coderzcolumn.com/tutorials">Tutorials</a></li>
<li><a href="https://coderzcolumn.com/about">About</a></li>
<li><a href="https://coderzcolumn.com/contact-us">Contact US</a></li>
</ul>
In [352]:
soup_new.li.parent.extract()
Out[352]:
<ul>
<li></li>
<li><a href="https://coderzcolumn.com/tutorials">Tutorials</a></li>
<li><a href="https://coderzcolumn.com/about">About</a></li>
<li><a href="https://coderzcolumn.com/contact-us">Contact US</a></li>
</ul>
In [354]:
soup_new.body
Out[354]:
<body>
<p id="start"></p>
<p id="main_para">We regularly publish tutorials on various topics
    (Python, Machine learning, Data Visualization, Digital Marketing, etc.) regularly explaining
    how to use various Python libraries.</p>
<p id="sub_para">Below are list of Important Sections of Our Website : </p>

<p id="end">Please feel free to send us mail @ coderzcolumn07@gmail.com if you need any
    information about any article or want us to publish article on particular topic.</p>
</body>
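
To remove every tag of a kind, extract() can be combined with find_all(). The sketch below strips all 'script' tags from a made-up head section and keeps the removed tags in a list in case they are needed later.

```python
from bs4 import BeautifulSoup

html = (
    "<head><title>T</title>"
    "<script src='a.js'></script><script src='b.js'></script></head>"
)
soup = BeautifulSoup(html, "html.parser")

# Each extract() call detaches the tag and returns it.
removed = [script.extract() for script in soup.find_all("script")]
removed_sources = [script["src"] for script in removed]

print(soup.head)
print(removed_sources)
```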

9. How to Replace a Tag with Another Tag in HTML Document (soup)?

In this section, we have explained how we can replace one HTML tag with another in an HTML document. The Tag object has a method named replace_with() which replaces the tag it is called on with whatever is passed to it. If we pass a string, the original tag is replaced with that string. If we pass another Tag object, the original tag is replaced with that new tag. The method returns the element that was replaced. We call replace_with() on the Tag object (or NavigableString) that we want to replace in the BeautifulSoup object.

Below we have explained with a few examples how we can use replace_with() to replace a particular HTML tag in a document.

In [355]:
import copy

soup_new = copy.deepcopy(soup)
In [357]:
soup_new.ul
Out[357]:
<ul>
<li><a href="https://coderzcolumn.com/blogs">Blogs</a></li>
<li><a href="https://coderzcolumn.com/tutorials">Tutorials</a></li>
<li><a href="https://coderzcolumn.com/about">About</a></li>
<li><a href="https://coderzcolumn.com/contact-us">Contact US</a></li>
</ul>
In [359]:
first_link = soup_new.find("a")

first_link
Out[359]:
<a href="https://coderzcolumn.com/blogs">Blogs</a>
In [361]:
first_link.replace_with("Blogs")
Out[361]:
<a href="https://coderzcolumn.com/blogs">Blogs</a>
In [363]:
soup_new.ul
Out[363]:
<ul>
<li>Blogs</li>
<li><a href="https://coderzcolumn.com/tutorials">Tutorials</a></li>
<li><a href="https://coderzcolumn.com/about">About</a></li>
<li><a href="https://coderzcolumn.com/contact-us">Contact US</a></li>
</ul>
In [366]:
soup_new.ul.li.string, type(soup_new.ul.li.string)
Out[366]:
('Blogs', bs4.element.NavigableString)
In [367]:
new_link_tag = soup_new.new_tag("link", href="https://coderzcolumn.com/blogs")

new_link_tag
Out[367]:
<link href="https://coderzcolumn.com/blogs"/>
In [370]:
soup_new.ul.li.string.replace_with(new_link_tag)
Out[370]:
'Blogs'
In [371]:
soup_new.ul
Out[371]:
<ul>
<li><link href="https://coderzcolumn.com/blogs"/></li>
<li><a href="https://coderzcolumn.com/tutorials">Tutorials</a></li>
<li><a href="https://coderzcolumn.com/about">About</a></li>
<li><a href="https://coderzcolumn.com/contact-us">Contact US</a></li>
</ul>
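
As a further illustration, replace_with() can turn every link into plain text that still shows its URL. This is a minimal sketch on a made-up paragraph; the "text (url)" format is our own choice.

```python
from bs4 import BeautifulSoup

html = (
    "<p>See <a href='https://coderzcolumn.com/blogs'>Blogs</a> and "
    "<a href='https://coderzcolumn.com/about'>About</a>.</p>"
)
soup = BeautifulSoup(html, "html.parser")

# Replace each link with a string built from its text and href.
for link in soup.find_all("a"):
    link.replace_with(f"{link.get_text()} ({link['href']})")

text = soup.p.get_text()
print(text)
```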

10. How to Wrap HTML Tag inside Another HTML Tag?

In this section, we have explained how we can wrap one HTML tag inside of another new HTML tag. The Tag object provides a method named wrap() which accepts another Tag object and wraps the tag it is called on inside the provided one. It returns the new wrapper. We call wrap() on the Tag (or NavigableString) that we want to wrap, and pass the wrapping Tag object as the argument.

Below we have explained with examples how we can wrap one HTML tag inside of another using wrap() method.

In [372]:
import copy

soup_new = copy.deepcopy(soup)
In [374]:
soup_new.ul
Out[374]:
<ul>
<li><a href="https://coderzcolumn.com/blogs">Blogs</a></li>
<li><a href="https://coderzcolumn.com/tutorials">Tutorials</a></li>
<li><a href="https://coderzcolumn.com/about">About</a></li>
<li><a href="https://coderzcolumn.com/contact-us">Contact US</a></li>
</ul>
In [375]:
bold = soup_new.new_tag("b")

bold
Out[375]:
<b></b>
In [377]:
soup_new.ul.li.a.wrap(bold)
Out[377]:
<b><a href="https://coderzcolumn.com/blogs">Blogs</a></b>
In [378]:
soup_new.ul
Out[378]:
<ul>
<li><b><a href="https://coderzcolumn.com/blogs">Blogs</a></b></li>
<li><a href="https://coderzcolumn.com/tutorials">Tutorials</a></li>
<li><a href="https://coderzcolumn.com/about">About</a></li>
<li><a href="https://coderzcolumn.com/contact-us">Contact US</a></li>
</ul>
In [376]:
italic = soup_new.new_tag("i")

italic
Out[376]:
<i></i>
In [379]:
soup_new.ul.li.a.string.wrap(italic)
Out[379]:
<i>Blogs</i>
In [380]:
soup_new.ul
Out[380]:
<ul>
<li><b><a href="https://coderzcolumn.com/blogs"><i>Blogs</i></a></b></li>
<li><a href="https://coderzcolumn.com/tutorials">Tutorials</a></li>
<li><a href="https://coderzcolumn.com/about">About</a></li>
<li><a href="https://coderzcolumn.com/contact-us">Contact US</a></li>
</ul>
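
When wrapping several elements, each one needs its own wrapper, because a single Tag object cannot appear in two places. The sketch below wraps the text of every list item of a made-up list in a fresh 'b' tag.

```python
from bs4 import BeautifulSoup

soup = BeautifulSoup("<ul><li>A</li><li>B</li></ul>", "html.parser")

# Create a new wrapper per item; reusing one tag would move it around.
for li in soup.find_all("li"):
    li.string.wrap(soup.new_tag("b"))

print(soup.ul)
```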

11. How to Replace HTML Tag with its Contents (Unwrap HTML Tag)?

In this section, we have explained how we can replace an HTML tag with its contents in an HTML document. The Tag object provides a method named unwrap() that replaces the tag with its contents inside of the BeautifulSoup object: the tag is removed, but its children stay in place. The method returns the removed tag, which is now empty. It is the opposite of the wrap() method we explained in the previous section.

Below we have explained with a few simple examples how we can use unwrap() method.

In [381]:
import copy

soup_new = copy.deepcopy(soup)
In [382]:
soup_new.ul
Out[382]:
<ul>
<li><a href="https://coderzcolumn.com/blogs">Blogs</a></li>
<li><a href="https://coderzcolumn.com/tutorials">Tutorials</a></li>
<li><a href="https://coderzcolumn.com/about">About</a></li>
<li><a href="https://coderzcolumn.com/contact-us">Contact US</a></li>
</ul>
In [383]:
soup_new.ul.li.a.unwrap()
Out[383]:
<a href="https://coderzcolumn.com/blogs"></a>
In [384]:
soup_new.ul
Out[384]:
<ul>
<li>Blogs</li>
<li><a href="https://coderzcolumn.com/tutorials">Tutorials</a></li>
<li><a href="https://coderzcolumn.com/about">About</a></li>
<li><a href="https://coderzcolumn.com/contact-us">Contact US</a></li>
</ul>
In [385]:
soup_new.ul.li.unwrap()
Out[385]:
<li></li>
In [386]:
soup_new.ul
Out[386]:
<ul>
Blogs
<li><a href="https://coderzcolumn.com/tutorials">Tutorials</a></li>
<li><a href="https://coderzcolumn.com/about">About</a></li>
<li><a href="https://coderzcolumn.com/contact-us">Contact US</a></li>
</ul>
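
A frequent use of unwrap() is stripping formatting tags while keeping their text. The sketch below removes all 'b' and 'i' tags from a made-up paragraph.

```python
from bs4 import BeautifulSoup

soup = BeautifulSoup("<p><b>Bold</b> and <i>italic</i> text</p>", "html.parser")

# Passing a list to find_all() matches any of the given tag names.
for tag in soup.find_all(["b", "i"]):
    tag.unwrap()

print(soup.p)
```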

This ends our small tutorial explaining how we can modify the contents of an HTML document parsed as BeautifulSoup object. Please feel free to let us know your views in the comments section.

Sunny Solanki
