Beautiful Soup - decode() Method

Method Description

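In Beautiful Soup, the decode() method renders the parse tree as a Unicode string containing an HTML or XML document. It is the counterpart of encode(): call encode() to get the document as a byte string (bytes), and call decode() to get it as Unicode text (str). Despite the shared name, it is not Python's bytes.decode(), which decodes bytes using a registered codec; Beautiful Soup's decode() serializes the tree. Let us look at an example of decode() in use.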

Syntax

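decode(pretty_print=False, eventual_encoding='utf-8', formatter='minimal')

Parameters

pretty_print − If True, the document is indented to make it easier to read.
eventual_encoding − The encoding the document is eventually expected to be stored in. decode() does not convert to this encoding; the result is still a Unicode string, and the value is only used for details such as the encoding named in an XML declaration.
formatter − A Formatter object, or a string naming one of the standard formatters (for example 'minimal' or 'html').

Unlike encode(), decode() takes no errors argument. Error-handling schemes such as 'strict', 'ignore' and 'replace' apply where bytes are produced or consumed, as in encode() or Python's bytes.decode().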
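
As a rough illustration of how pretty_print and formatter affect the result, the sketch below renders one small document three ways. The sample markup and the variable names are invented for this example, and recent Beautiful Soup releases may emit a DeprecationWarning for pretty_print, so treat it as a sketch rather than a reference.

```python
from bs4 import BeautifulSoup

# Sample markup made up for this illustration.
soup = BeautifulSoup("<p>Café &amp; <b>bar</b></p>", "html.parser")

# Default rendering: one line, only the characters that must be escaped are escaped.
compact = soup.decode()

# pretty_print=True indents the output, one element per line.
pretty = soup.decode(pretty_print=True)

# The "html" formatter writes named HTML entities where one exists.
entities = soup.decode(formatter="html")

print(compact)
print(pretty)
print(entities)
```

All three results are str objects; only the layout and the escaping differ.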

Return Value

The decode() method returns a Unicode string (str).
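
To make the difference in return types concrete, here is a minimal sketch that compares decode() with encode() on the same tree. The markup and variable names are assumptions chosen for the illustration.

```python
from bs4 import BeautifulSoup

# Sample markup made up for this illustration.
soup = BeautifulSoup("<p>Hello “World!”</p>", "html.parser")

as_text = soup.decode()           # Unicode string
as_bytes = soup.encode("utf-8")   # byte string

print(type(as_text).__name__)
print(type(as_bytes).__name__)

# Decoding the bytes with the same codec gives back the decode() result.
round_trip_matches = as_bytes.decode("utf-8") == as_text
print(round_trip_matches)
```

The final comparison uses Python's own bytes.decode(), which is a different method from Beautiful Soup's decode() even though the two share a name.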

Example

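from bs4 import BeautifulSoup

soup = BeautifulSoup("Hello “World!”", 'html.parser')
# encode() renders the tree as a UTF-8 byte string
enc = soup.encode('utf-8')
print(enc)
# decode() renders the same tree as a Unicode string
dec = soup.decode()
print(dec)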

Output

b'Hello \xe2\x80\x9cWorld!\xe2\x80\x9d'
Hello “World!”
