How to filter rows in Pandas by regex?

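A regular expression (regex) is a sequence of characters that defines a search pattern. To filter rows in Pandas by regex, we can use the str.match() method, which tests whether the pattern matches at the start of each string in a column and returns a Boolean mask.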

Steps

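Create a two-dimensional, size-mutable, potentially heterogeneous tabular data structure, df.
Print the input DataFrame, df.
Initialize a variable, regex, for the expression. Supply a string value as the regex; for example, the string 'J.*' will match all entries that start with the letter "J".
Use df.column_name.str.match(regex) to build a Boolean mask for the given column, then index df with that mask to keep only the matching rows.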
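Because str.match() only tests the start of each string, it can help to compare it with two related string methods. The following is a minimal sketch, not part of the original steps; the Series `names` and the patterns are illustrative assumptions, and `str.fullmatch` assumes pandas 1.1 or later.

```python
import pandas as pd

# Hypothetical sample data for illustration only.
names = pd.Series(["John", "Jacob", "Tom", "Tim", "Ally"])

# str.match anchors the pattern at the start of each string (like re.match),
# so "o" matches nothing here because no name begins with "o".
starts_with_o = names[names.str.match("o")].tolist()

# str.contains searches anywhere in the string (like re.search).
contains_o = names[names.str.contains("o")].tolist()

# str.fullmatch requires the entire string to match (like re.fullmatch).
t_any_m = names[names.str.fullmatch("T.m")].tolist()

print(starts_with_o)
print(contains_o)
print(t_any_m)
```

If the goal is "starts with", str.match() is a reasonable fit; for "appears anywhere", str.contains() is usually the better choice.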

Example

import pandas as pd

# Build a small DataFrame to filter.
df = pd.DataFrame(
    dict(
        name=['John', 'Jacob', 'Tom', 'Tim', 'Ally'],
        marks=[89, 23, 100, 56, 90],
        subjects=["Math", "Physics", "Chemistry", "Biology", "English"]
    )
)

print(f"Input DataFrame is:\n{df}")

# str.match() anchors at the start of each name, so 'J.*' keeps names beginning with "J".
regex = 'J.*'
print(f"After applying {regex} DataFrame is:\n{df[df['name'].str.match(regex)]}")

# The same approach with a different pattern: names beginning with "A".
regex = 'A.*'
print(f"After applying {regex} DataFrame is:\n{df[df['name'].str.match(regex)]}")

Output

Input DataFrame is:

     name    marks   subjects
0    John     89        Math
1   Jacob     23     Physics
2     Tom    100   Chemistry
3     Tim     56     Biology
4    Ally     90     English

After applying J.* DataFrame is:

    name   marks   subjects
0   John     89        Math
1  Jacob     23     Physics

After applying A.* DataFrame is:

    name   marks   subjects
4   Ally     90     English
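The example above assumes every name is a non-missing string with consistent capitalization. As a hedged sketch of how the same filter might be made more robust, the code below uses the `na` and `flags` parameters of str.match(); the DataFrame `df_missing` and its values are hypothetical and do not come from the original example.

```python
import re
import pandas as pd

# Hypothetical data: one missing name and one lowercase name.
df_missing = pd.DataFrame(
    {
        "name": ["John", "jacob", None, "Tim", "Ally"],
        "marks": [89, 23, 100, 56, 90],
    }
)

# na=False treats missing names as non-matches, so the mask stays purely Boolean.
# flags=re.IGNORECASE lets "j.*" match both "John" and "jacob".
mask = df_missing["name"].str.match("j.*", na=False, flags=re.IGNORECASE)

filtered = df_missing[mask]
print(filtered)
```

Without na=False, a missing value can propagate into the mask, and indexing a DataFrame with a mask that contains missing values may raise an error.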

Rishikesh Kumar Rishi

Updated on: 14 September 2021
