Python - Remove Duplicate Values from a Pandas DataFrame
To remove duplicate rows from a Pandas DataFrame, use the drop_duplicates() method. First, create a DataFrame with 3 columns:
dataFrame = pd.DataFrame({'Car': ['BMW', 'Mercedes', 'Lamborghini', 'BMW', 'Mercedes', 'Porsche'],'Place': ['Delhi', 'Hyderabad', 'Chandigarh', 'Delhi', 'Hyderabad', 'Mumbai'],'UnitsSold': [95, 70, 80, 95, 70, 90]})
Remove the duplicate rows:
dataFrame = dataFrame.drop_duplicates()
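To make the effect of this call concrete, here is a minimal sketch that uses the same data. It relies on the default behaviour of drop_duplicates(), which compares all columns and keeps the first occurrence of each repeated row. The duplicated() method and the names mask, deduped and same are not part of the original example; they are introduced here only for illustration.

```python
import pandas as pd

dataFrame = pd.DataFrame({
    'Car': ['BMW', 'Mercedes', 'Lamborghini', 'BMW', 'Mercedes', 'Porsche'],
    'Place': ['Delhi', 'Hyderabad', 'Chandigarh', 'Delhi', 'Hyderabad', 'Mumbai'],
    'UnitsSold': [95, 70, 80, 95, 70, 90],
})

# True for every row that repeats an earlier row in all columns
mask = dataFrame.duplicated()
print(mask.tolist())  # [False, False, False, True, True, False]

# Default behaviour: compare all columns, keep the first occurrence
deduped = dataFrame.drop_duplicates()
print(deduped.index.tolist())  # [0, 1, 2, 5]

# Filtering with the inverted mask selects the same rows
same = dataFrame[~mask]
print(deduped.equals(same))  # True
```

Rows 3 and 4 are flagged because they match rows 0 and 1 in every column, so only those two rows are dropped.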
Example
Following is the complete code:
import pandas as pd

# Create DataFrame
dataFrame = pd.DataFrame({'Car': ['BMW', 'Mercedes', 'Lamborghini', 'BMW', 'Mercedes', 'Porsche'],'Place': ['Delhi', 'Hyderabad', 'Chandigarh', 'Delhi', 'Hyderabad', 'Mumbai'], 'UnitsSold': [95, 70, 80, 95, 70, 90]})
print("Dataframe...\n", dataFrame)

# Count how often each value appears in column Car
count = dataFrame['Car'].value_counts()
print("\nCount in column Car")
print(count)

# Remove duplicate rows (all columns compared, first occurrence kept)
dataFrame = dataFrame.drop_duplicates()
print("\nUpdated DataFrame after removing duplicates...\n", dataFrame)

# Count how often each value appears in column Car after removing duplicates
count = dataFrame['Car'].value_counts()
print("\nCount in column Car")
print(count)
Output
This will produce the following output:
Dataframe...
           Car       Place  UnitsSold
0          BMW       Delhi         95
1     Mercedes   Hyderabad         70
2  Lamborghini  Chandigarh         80
3          BMW       Delhi         95
4     Mercedes   Hyderabad         70
5      Porsche      Mumbai         90

Count in column Car
BMW            2
Mercedes       2
Porsche        1
Lamborghini    1
Name: Car, dtype: int64

Updated DataFrame after removing duplicates...
           Car       Place  UnitsSold
0          BMW       Delhi         95
1     Mercedes   Hyderabad         70
2  Lamborghini  Chandigarh         80
5      Porsche      Mumbai         90

Count in column Car
BMW            1
Porsche        1
Lamborghini    1
Mercedes       1
Name: Car, dtype: int64
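The updated DataFrame keeps its original index labels (0, 1, 2, 5). The sketch below shows a few optional parameters of drop_duplicates() that change this and other defaults. None of them appear in the original example: ignore_index assumes pandas 1.0 or later, and the variable names are hypothetical, chosen only for illustration.

```python
import pandas as pd

dataFrame = pd.DataFrame({
    'Car': ['BMW', 'Mercedes', 'Lamborghini', 'BMW', 'Mercedes', 'Porsche'],
    'Place': ['Delhi', 'Hyderabad', 'Chandigarh', 'Delhi', 'Hyderabad', 'Mumbai'],
    'UnitsSold': [95, 70, 80, 95, 70, 90],
})

# Renumber the surviving rows 0..n-1 (assumes pandas 1.0+ for ignore_index)
renumbered = dataFrame.drop_duplicates(ignore_index=True)
print(renumbered.index.tolist())  # [0, 1, 2, 3]

# Compare only the 'Car' column and keep the last occurrence of each car
last_per_car = dataFrame.drop_duplicates(subset=['Car'], keep='last')
print(last_per_car.index.tolist())  # [2, 3, 4, 5]

# keep=False drops every row that has a duplicate, leaving only unique rows
unique_only = dataFrame.drop_duplicates(keep=False)
print(unique_only['Car'].tolist())  # ['Lamborghini', 'Porsche']
```

On older pandas versions, calling reset_index(drop=True) on the result is a common way to get the same renumbering as ignore_index=True.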