Python Program to Calculate Standard Deviation

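In this article, we will learn how to implement a Python program that calculates the standard deviation of a dataset.

Consider a set of values plotted along an axis. The standard deviation of these values (collectively called the population) measures how far they spread out from their mean. If the standard deviation is low, the values cluster tightly around the mean. If it is high, the values are scattered farther from the mean.

It is defined as the square root of the dataset's variance. There are two types of standard deviation:

The population standard deviation is calculated from every data value in the population, so it is a fixed value. Its formula is: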

$$\mathrm{SD\:=\:\sqrt{\frac{\sum(X_i\:-\:X_m)^2}{n}}}$$

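where,

X_m is the mean of the dataset.
X_i is an element of the dataset.
n is the number of elements in the dataset.

The sample standard deviation, in contrast, is a statistic calculated from only some of the population's data values, so its value depends on the sample chosen. Its formula is: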

$$\mathrm{SD\:=\:\sqrt{\frac{\sum(X_i\:-\:X_m)^2}{n\:-\:1}}}$$

where,

X_m is the mean of the sample.
X_i is an element of the sample.
n is the number of elements in the sample.
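
As a worked example of both formulas, take the dataset [2, 3, 4, 1, 2, 5] that the rest of this article uses. The rounded values below are hand arithmetic for illustration, not program output:

$$\mathrm{X_m\:=\:\frac{2+3+4+1+2+5}{6}\:=\:\frac{17}{6}\:\approx\:2.8333}$$

$$\mathrm{\sum(X_i\:-\:X_m)^2\:=\:\frac{65}{6}\:\approx\:10.8333}$$

$$\mathrm{SD_{population}\:=\:\sqrt{\frac{10.8333}{6}}\:\approx\:1.3437}$$

$$\mathrm{SD_{sample}\:=\:\sqrt{\frac{10.8333}{5}}\:\approx\:1.4720}$$

The only difference between the two results is the divisor: n = 6 for the population and n - 1 = 5 for the sample.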

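Input Output Scenarios

Let us now look at some input-output scenarios for various datasets:

Suppose the dataset contains only positive integers:

Input: [2, 3, 4, 1, 2, 5]
Result: Population Standard Deviation: 1.3437096247164249
Sample Standard Deviation: 1.4719601443879744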

Suppose the dataset contains only negative integers:

Input: [-2, -3, -4, -1, -2, -5]
Result: Population Standard Deviation: 1.3437096247164249
Sample Standard Deviation: 1.4719601443879744

Suppose the dataset contains both positive and negative integers:

Input: [-2, -3, -4, 1, 2, 5]
Result: Population Standard Deviation: 3.131382371342656
Sample Standard Deviation: 3.4303 (rounded to four decimal places)
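
The first two scenarios give identical results because flipping the sign of every value mirrors the data around zero without changing its spread. A minimal sketch of that check follows; it relies on the standard library's statistics module, which is covered later in this article, and the variable names are chosen here purely for illustration:

```python
import math
import statistics

positive = [2, 3, 4, 1, 2, 5]
negative = [-x for x in positive]  # same values with the sign flipped

# Negating every value mirrors the data around zero; the spread is unchanged.
same_population_sd = math.isclose(statistics.pstdev(positive), statistics.pstdev(negative))
same_sample_sd = math.isclose(statistics.stdev(positive), statistics.stdev(negative))

# For the same data, dividing by n - 1 gives a larger result than dividing by n.
sample_is_larger = statistics.stdev(positive) > statistics.pstdev(positive)

print(same_population_sd, same_sample_sd, sample_is_larger)  # prints: True True True
```

This also works as a quick sanity check: a sample standard deviation that comes out smaller than the population value for the same data usually points to a mistake in the divisor.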

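Using the Mathematical Formula

We have already seen the formulas for standard deviation in this article; now let us look at a Python program that implements them on a dataset.

Example

In the following example, we import the math library and calculate the standard deviation of the dataset by applying the sqrt() function to its variance.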


import math

# declare the dataset list
dataset = [2, 3, 4, 1, 2, 5]
n = len(dataset)

# find the mean of the dataset
sm = 0
for value in dataset:
    sm += value
mean = sm / n

# sum the squared deviations from the mean
deviation_sum = 0
for value in dataset:
    deviation_sum += (value - mean) ** 2

# population standard deviation: divide by n
psd = math.sqrt(deviation_sum / n)

# sample standard deviation: divide by n - 1 (note the parentheses)
ssd = math.sqrt(deviation_sum / (n - 1))

# display output
print("Population standard deviation of the dataset is", psd)
print("Sample standard deviation of the dataset is", ssd)

Output

The standard deviations obtained are as follows (the final digit may vary slightly with floating-point rounding):

Population standard deviation of the dataset is 1.3437096247164249
Sample standard deviation of the dataset is 1.4719601443879744
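
The same calculation can be wrapped in a small reusable function. The sketch below is one possible way to do that, not a library API: std_dev() and its ddof parameter are hypothetical names introduced here, where ddof is the amount subtracted from n in the divisor (0 for population, 1 for sample).

```python
import math

def std_dev(data, ddof=0):
    """Return the standard deviation of data.

    ddof=0 divides by n (population); ddof=1 divides by n - 1 (sample).
    Hypothetical helper for illustration; not part of any library.
    """
    n = len(data)
    if n - ddof <= 0:
        raise ValueError("need more than ddof data points")
    mean = sum(data) / n
    squared_deviations = sum((x - mean) ** 2 for x in data)
    return math.sqrt(squared_deviations / (n - ddof))

dataset = [2, 3, 4, 1, 2, 5]
population_sd = std_dev(dataset)        # divides by 6
sample_sd = std_dev(dataset, ddof=1)    # divides by 5
print(f"{population_sd:.4f} {sample_sd:.4f}")  # prints: 1.3437 1.4720
```

Keeping the divisor in a single expression, (n - ddof), makes it harder to misplace the parentheses around n - 1.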

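Using the std() Function in the numpy Module

In this approach, we import the numpy module and use only the numpy.std() function to calculate the population standard deviation of the elements of a numpy array. By default, numpy.std() divides by n, which is why the result is the population value.

Example

The following Python program calculates the standard deviation of the elements of a numpy array: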


import numpy as np

#declare the dataset as a numpy array
dataset = np.array([2, 3, 4, 1, 2, 5])

#calculating standard deviation of the dataset
sd = np.std(dataset)

#display output
print("Population standard deviation of the dataset is", sd)

Output

The standard deviation is displayed as the following output:

Population standard deviation of the dataset is 1.3437096247164249
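
numpy.std() can also return the sample standard deviation through its ddof ("delta degrees of freedom") parameter, which is subtracted from n in the divisor. A short sketch, assuming numpy is installed and using variable names chosen here for illustration:

```python
import numpy as np

dataset = np.array([2, 3, 4, 1, 2, 5])

# ddof is subtracted from n in the divisor.
population_sd = np.std(dataset)          # default ddof=0, divides by n
sample_sd = np.std(dataset, ddof=1)      # divides by n - 1

print(f"{population_sd:.4f} {sample_sd:.4f}")  # prints: 1.3437 1.4720
```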

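Using the stdev() and pstdev() Functions in the statistics Module

The statistics module in Python provides the stdev() and pstdev() functions to calculate the standard deviation of a dataset. The stdev() function calculates only the sample standard deviation, whereas the pstdev() function calculates the population standard deviation.

Both functions accept the dataset as their first argument and return the same type of value.

Example 1: Using the stdev() Function

The following Python program demonstrates how to use the stdev() function to find the sample standard deviation of a dataset: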


import statistics as st

#declare the dataset list
dataset = [2, 3, 4, 1, 2, 5]

#calculating standard deviation of the dataset
sd = st.stdev(dataset)

#display output
print("Standard Deviation of the dataset is", sd)

Output

The sample standard deviation of the dataset is obtained as follows:

Standard Deviation of the dataset is 1.4719601443879744

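Example 2: Using the pstdev() Function

The following Python program demonstrates how to use the pstdev() function to find the population standard deviation of a dataset: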


import statistics as st

#declare the dataset list
dataset = [2, 3, 4, 1, 2, 5]

#calculating standard deviation of the dataset
sd = st.pstdev(dataset)

#display output
print("Standard Deviation of the dataset is", sd)

Output

The population standard deviation of the dataset is obtained as follows:

Standard Deviation of the dataset is 1.3437096247164249
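
One practical difference between the two functions shows up with very small datasets. Because stdev() divides by n - 1, it needs at least two values and raises statistics.StatisticsError otherwise, while pstdev() accepts a single value. A minimal sketch of that behaviour, with a one-element list made up for illustration:

```python
import statistics

single = [5]  # illustrative one-element dataset

# pstdev() accepts a single value: there is no spread, so the result is 0.0.
population_sd = statistics.pstdev(single)

# stdev() divides by n - 1, so it needs at least two values.
try:
    statistics.stdev(single)
    stdev_failed = False
except statistics.StatisticsError:
    stdev_failed = True

print(population_sd, stdev_failed)  # prints: 0.0 True
```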

Alekhya Nagulavancha

Updated on: 26-Oct-2022
