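How to get a dictionary-like object from a dataset using Python Scikit-learn?
With the Scikit-learn Python library, we can load a dataset as a dictionary-like object. Some useful attributes of this object are:
data - the data to learn from.
target - the regression target.
DESCR - the full description of the dataset.
target_names - the names of the target(s).
feature_names - the names of the features.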
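To make the term "dictionary-like" concrete, here is a minimal sketch of a container whose entries can be read either as keys or as attributes. The class name `MiniBunch` and the sample values are hypothetical and purely illustrative; scikit-learn's own container is more elaborate, so treat this as an explanation of the idea rather than its actual implementation.

```python
class MiniBunch(dict):
    """Hypothetical, simplified stand-in for a dictionary-like container."""

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails,
        # so dict methods such as keys() keep working.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        # Store attributes as dictionary entries.
        self[name] = value


# Made-up values, used only to show the access patterns.
sample = MiniBunch(data=[[8.3, 41.0], [7.2, 21.0]], target=[4.5, 3.6])
sample.feature_names = ["MedInc", "HouseAge"]

print(sample.keys())
print(sample["target"] is sample.target)
```

Because the class subclasses `dict`, everything a plain dictionary supports (`keys()`, `in`, iteration) is available, while attribute access is added as a convenience on top.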
Example 1
In the example below, we load the California housing dataset and print the keys of the dictionary-like object it returns.
# Import necessary libraries
from sklearn.datasets import fetch_california_housing

# Load the California housing dataset
housing = fetch_california_housing()

# Print the keys of the dictionary-like object
print(housing.keys())
Output
It will produce the following output:
dict_keys(['data', 'target', 'frame', 'target_names', 'feature_names', 'DESCR'])
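Each of these keys can be read either with dictionary syntax or as an attribute. The sketch below uses `load_diabetes` instead of the housing data, purely as an assumption of convenience: it ships with scikit-learn and needs no download, and it exposes the same `data` and `target` entries.

```python
from sklearn.datasets import load_diabetes

# load_diabetes is bundled with scikit-learn, so no download is needed.
diabetes = load_diabetes()

# Attribute access and key access return the same underlying object.
print(diabetes.data is diabetes["data"])

# Membership tests and iteration work as they do for a plain dict.
print("target" in diabetes)
for key in diabetes:
    print(key, type(diabetes[key]).__name__)
```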
Example 2
We can also inspect the individual entries of the dictionary-like object in more detail, as shown below:
# Import necessary libraries
from sklearn.datasets import fetch_california_housing

# Load the California housing dataset
housing = fetch_california_housing()

print(housing.data.shape)
print('\n')
print(housing.target.shape)
print('\n')
print(housing.feature_names)
print('\n')
print(housing.target_names)
print('\n')
print(housing.DESCR)
Output
It will produce the following output:
(20640, 8)
(20640,)
['MedInc', 'HouseAge', 'AveRooms', 'AveBedrms', 'Population', 'AveOccup', 'Latitude', 'Longitude']
['MedHouseVal']
.. _california_housing_dataset:
California Housing dataset
--------------------------
**Data Set Characteristics:**
:Number of Instances: 20640
:Number of Attributes: 8 numeric, predictive attributes and the target
:Attribute Information:
- MedInc median income in block group
- HouseAge median house age in block group
- AveRooms average number of rooms per household
- AveBedrms average number of bedrooms per household
- Population block group population
- AveOccup average number of household members
- Latitude block group latitude
- Longitude block group longitude
:Missing Attribute Values: None
Omitted due to length of the output…
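Since `DESCR` is an ordinary string, a long description can be trimmed before printing. A small sketch, again assuming the bundled `load_diabetes` dataset as a stand-in and an arbitrary cut-off of ten lines:

```python
from sklearn.datasets import load_diabetes

diabetes = load_diabetes()

# DESCR is a plain string, so it can be split into lines and sliced.
preview_lines = diabetes.DESCR.splitlines()[:10]
print("\n".join(preview_lines))
```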
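Example 3
Passing as_frame=True loads the dataset as a pandas DataFrame, which is available through the frame attribute.
# Import necessary libraries
from sklearn.datasets import fetch_california_housing

# Load the California housing dataset as a pandas DataFrame
housing = fetch_california_housing(as_frame=True)

# info() prints its summary directly, so no print() call is needed
housing.frame.info()
Output
It will produce the following output: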
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 20640 entries, 0 to 20639
Data columns (total 9 columns):
 #   Column       Non-Null Count  Dtype
---  ------       --------------  -----
 0   MedInc       20640 non-null  float64
 1   HouseAge     20640 non-null  float64
 2   AveRooms     20640 non-null  float64
 3   AveBedrms    20640 non-null  float64
 4   Population   20640 non-null  float64
 5   AveOccup     20640 non-null  float64
 6   Latitude     20640 non-null  float64
 7   Longitude    20640 non-null  float64
 8   MedHouseVal  20640 non-null  float64
dtypes: float64(9)
memory usage: 1.4 MB
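When a dataset is loaded with `as_frame=True`, the `data` and `target` entries also become pandas objects, which is handy for separating features from the target. A minimal sketch, assuming the bundled `load_diabetes` dataset as a stand-in so that it runs without a download; the variable names `X` and `y` are just a common convention:

```python
from sklearn.datasets import load_diabetes

diabetes = load_diabetes(as_frame=True)

# With as_frame=True, data is a DataFrame and target is a Series.
X = diabetes.data
y = diabetes.target

# frame holds the feature columns followed by the target column.
print(type(X).__name__, type(y).__name__)
print(X.shape, y.shape, diabetes.frame.shape)
```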