How to remove empty tags using BeautifulSoup in Python?

BeautifulSoup is a Python library for extracting data from HTML and XML files. It can also be used to remove the empty tags present in an HTML or XML document, which leaves cleaner, more readable markup.

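First, install the BeautifulSoup library in the local environment. The example below uses the lxml parser, which is a separate package, so install it as well: pip install beautifulsoup4 lxml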
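To confirm that the installation worked, a quick check along the following lines may help. This is only a sketch: it uses Python's built-in "html.parser" rather than lxml so that it runs even if lxml is missing, and the sample markup is made up for illustration.

```python
import bs4
from bs4 import BeautifulSoup

# Print the installed version of the library
print(bs4.__version__)

# Parse a tiny, made-up snippet with the parser bundled with Python
soup = BeautifulSoup("<p>ok</p>", "html.parser")
print(soup.p.get_text())  # ok
```

If the import fails, the package was most likely installed into a different Python environment than the one running the script.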

Example

# Import the BeautifulSoup library
from bs4 import BeautifulSoup

# The HTML document to clean
html_object = """
<p>Python is an interpreted, high-level and general-purpose
programming language. Python's design
philosophy emphasizes code readability with its notable use of
significant indentation.</p>
"""

# Create the soup for the given HTML document
soup = BeautifulSoup(html_object, "lxml")

# Visit every tag and remove those that contain no text
for x in soup.find_all():
    if len(x.get_text(strip=True)) == 0:
        x.extract()

print(soup)

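Output

Running the code above prints the parsed document with every tag that contains no text removed. The sample paragraph contains text, so it is kept, and the lxml parser wraps it in <html> and <body> tags.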

<html><body><p>Python is an interpreted, high-level and general-purpose
programming language. Python's design
philosophy emphasizes code readability with its notable use of
significant indentation.</p>
</body></html>
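The sample document above contains no empty tags, so the effect of the loop is hard to see. The sketch below applies the same idea to input that does contain empty tags. The function name remove_empty_tags, the VOID_TAGS set, and the sample markup are all hypothetical and introduced only for illustration. It also skips elements such as <img> and <br>, which never contain text and would otherwise be removed by the text-length check. It uses the built-in "html.parser" so the result is not wrapped in <html> and <body>.

```python
from bs4 import BeautifulSoup

# Elements that are empty by design; this set is an illustrative assumption
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}


def remove_empty_tags(html, keep=VOID_TAGS):
    """Return html with tags that contain no text removed (hypothetical helper)."""
    soup = BeautifulSoup(html, "html.parser")
    names_to_keep = list(keep)
    for tag in soup.find_all():
        # Leave elements such as <img> alone
        if tag.name in keep:
            continue
        # Leave containers that hold a kept element, e.g. <p><img/></p>
        if tag.find(names_to_keep) is not None:
            continue
        # Same test as the example above: no text once whitespace is stripped
        if not tag.get_text(strip=True):
            tag.extract()
    return str(soup)


# Made-up sample input with several kinds of empty tags
sample = (
    '<div><p>Hello</p><p></p><span>  </span>'
    '<p><img src="a.png"/></p><div><b></b></div></div>'
)
result = remove_empty_tags(sample)
print(result)
# <div><p>Hello</p><p><img src="a.png"/></p></div>
```

Whether image-only containers should survive depends on the use case. Passing an empty set as keep gives the same behavior as the original loop.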

Dev Prakash Sharma

Updated on: 06-Mar-2021
