Machine Learning - Stochastic Gradient Descent

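Gradient descent is a popular optimization algorithm used to minimize the cost function of a machine learning model. It iteratively adjusts the model parameters to reduce the difference between the predicted output and the actual output. At each iteration, the algorithm computes the gradient of the cost function with respect to the model parameters and then moves the parameters in the direction opposite to the gradient.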
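The loop described above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not part of the scikit-learn example later in this article: the linear model, the mean squared error cost, the function name batch_gradient_descent, the learning rate, and the synthetic data are all assumptions chosen for the demonstration.

```python
import numpy as np

def batch_gradient_descent(X, y, learning_rate=0.1, n_iters=500):
    """Minimize the mean squared error of a linear model y ~ X @ w + b."""
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0
    for _ in range(n_iters):
        error = X @ w + b - y                         # residuals over the whole dataset
        grad_w = (2.0 / n_samples) * (X.T @ error)    # gradient of the cost w.r.t. w
        grad_b = 2.0 * error.mean()                   # gradient of the cost w.r.t. b
        w -= learning_rate * grad_w                   # step against the gradient
        b -= learning_rate * grad_b
    return w, b

# Synthetic, noise-free data generated from y = 3x + 1 (illustrative only)
rng = np.random.default_rng(0)
X_demo = rng.uniform(-1.0, 1.0, size=(100, 1))
y_demo = 3.0 * X_demo[:, 0] + 1.0

w_gd, b_gd = batch_gradient_descent(X_demo, y_demo)
print("slope:", w_gd[0], "intercept:", b_gd)
```

Every iteration here touches all 100 samples before the parameters change once, which is the cost that the stochastic variant avoids.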

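Stochastic gradient descent (SGD) is a variant of gradient descent that updates the parameters after each training sample rather than after evaluating the entire dataset. In other words, SGD computes the gradient of the cost function from a single training sample instead of from the whole dataset. Each update is therefore much cheaper, which often lets the algorithm make progress faster on large datasets, although the individual updates are noisier. It also needs less memory, because only one sample has to be processed per update.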
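The two update rules can be written side by side. The notation is an assumption introduced here for illustration: \theta denotes the model parameters, \eta the learning rate, J the cost averaged over n training samples, and J_i the cost on sample i alone.

```latex
% Batch gradient descent: one update per pass over all n samples
\theta \leftarrow \theta - \eta \, \nabla_\theta J(\theta),
\qquad J(\theta) = \frac{1}{n} \sum_{i=1}^{n} J_i(\theta)

% Stochastic gradient descent: one update per training sample i
\theta \leftarrow \theta - \eta \, \nabla_\theta J_i(\theta)
```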

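How the Stochastic Gradient Descent Algorithm Works

Stochastic gradient descent works by randomly selecting a single training sample from the dataset and using it to update the model parameters. This process is repeated for a fixed number of epochs, or until the model converges to a minimum of the cost function.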

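The algorithm proceeds as follows:

1. Initialize the model parameters to random values.
2. For each epoch, randomly shuffle the training data.
3. For each training sample:
- Compute the gradient of the cost function with respect to the model parameters.
- Update the model parameters in the direction opposite to the gradient.
4. Repeat until convergence.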
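The steps above can be sketched from scratch as follows. This is a minimal illustration under stated assumptions, not the implementation used by scikit-learn: it assumes a linear regression model with a squared-error cost, the function name sgd_linear_regression and its default values are hypothetical, and it runs a fixed number of epochs instead of testing for convergence.

```python
import numpy as np

def sgd_linear_regression(X, y, learning_rate=0.05, n_epochs=50, seed=0):
    """Fit y ~ X @ w + b with one parameter update per training sample."""
    rng = np.random.default_rng(seed)
    n_samples, n_features = X.shape
    # Initialize the model parameters to small random values
    w = rng.normal(scale=0.01, size=n_features)
    b = rng.normal(scale=0.01)
    for _ in range(n_epochs):
        # Visit the training samples in a freshly shuffled order each epoch
        for i in rng.permutation(n_samples):
            # Gradient of the squared error on sample i only
            error = X[i] @ w + b - y[i]
            grad_w = 2.0 * error * X[i]
            grad_b = 2.0 * error
            # Move the parameters against the gradient
            w -= learning_rate * grad_w
            b -= learning_rate * grad_b
    return w, b

# Synthetic, noise-free data generated from y = 2x - 1 (illustrative only)
rng = np.random.default_rng(42)
X_demo = rng.uniform(-1.0, 1.0, size=(200, 1))
y_demo = 2.0 * X_demo[:, 0] - 1.0

w_sgd, b_sgd = sgd_linear_regression(X_demo, y_demo)
print("slope:", w_sgd[0], "intercept:", b_sgd)
```

Because the demo data contains no noise, the estimates should settle close to the true slope and intercept. On noisy data a constant learning rate would leave the parameters fluctuating around the minimum, which is why practical implementations usually decay the learning rate over time.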

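The main difference between stochastic gradient descent and ordinary gradient descent lies in how the gradient is computed and how often the model parameters are updated. Stochastic gradient descent computes the gradient from a single training sample and updates the parameters immediately, whereas ordinary gradient descent computes the gradient from the entire dataset before making one update.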
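The relationship between the two gradients can be checked numerically. The sketch below assumes a mean squared error cost for a linear model without an intercept; the data, the helper name sample_gradient, and the parameter values are made up for illustration. A single-sample gradient is a noisy estimate, but the average of all single-sample gradients should equal the full-dataset gradient.

```python
import numpy as np

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(50, 2))
y_demo = X_demo @ np.array([1.5, -0.5]) + 0.3
w = np.zeros(2)  # arbitrary current parameter values

def sample_gradient(x_i, y_i, w):
    """Gradient of the squared error (x_i @ w - y_i)**2 for one sample."""
    return 2.0 * (x_i @ w - y_i) * x_i

# Gradient used by ordinary gradient descent: computed from the whole dataset
full_gradient = (2.0 / len(X_demo)) * (X_demo.T @ (X_demo @ w - y_demo))

# Gradients used by SGD: one per training sample
per_sample = np.array(
    [sample_gradient(x_i, y_i, w) for x_i, y_i in zip(X_demo, y_demo)]
)

print("one-sample gradient:  ", per_sample[0])
print("full-dataset gradient:", full_gradient)
print("mean of per-sample gradients matches:",
      np.allclose(per_sample.mean(axis=0), full_gradient))
```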

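Implementing Stochastic Gradient Descent in Python

Let's look at an example of how to implement stochastic gradient descent in Python. We will use the scikit-learn library to apply the algorithm to the Iris dataset, a popular dataset for classification tasks. In this example, we use its first two features (sepal length and sepal width) to predict the species of the iris flower:

Example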

# Import required libraries
from sklearn import datasets, metrics
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Load the Iris flower dataset
iris = datasets.load_iris()
X_data, y_data = iris.data, iris.target

# Keep only the first two features (sepal length and sepal width)
X, y = X_data[:, :2], y_data

# Split the dataset into a training set and a testing set (20 percent)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, random_state=1)

# Standardize the features using statistics from the training set only
scaler = StandardScaler().fit(X_train)
X_train = scaler.transform(X_train)
X_test = scaler.transform(X_test)

# Create the linear model SGDClassifier
clfmodel_SGD = SGDClassifier(alpha=0.001, max_iter=200)

# Train the classifier using the fit() function
clfmodel_SGD.fit(X_train, y_train)

# Evaluate the result (note: this is accuracy on the training set)
y_train_pred = clfmodel_SGD.predict(X_train)
print("\nThe Accuracy of SGD classifier is:",
      metrics.accuracy_score(y_train, y_train_pred) * 100)

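Output

Running this code produces output similar to the following. The accuracy is measured on the training set, and the exact value can vary between runs because no random_state is set for SGDClassifier: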

The Accuracy of SGD classifier is: 77.5
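
The example creates a test set but reports accuracy only on the training data. A possible way to also score the held-out test set is sketched below. The random_state=0 argument passed to SGDClassifier and the variable names clf, train_acc, and test_acc are assumptions added here for repeatability and are not part of the original example, so the numbers it prints may differ from the output above.

```python
from sklearn import datasets, metrics
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Same data preparation as in the example: first two Iris features, 80/20 split
iris = datasets.load_iris()
X, y = iris.data[:, :2], iris.target
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, random_state=1)

scaler = StandardScaler().fit(X_train)
X_train = scaler.transform(X_train)
X_test = scaler.transform(X_test)

# random_state=0 is an assumption added here so repeated runs give the same model
clf = SGDClassifier(alpha=0.001, max_iter=200, random_state=0)
clf.fit(X_train, y_train)

# Compare accuracy on the data the model saw with accuracy on unseen data
train_acc = metrics.accuracy_score(y_train, clf.predict(X_train)) * 100
test_acc = metrics.accuracy_score(y_test, clf.predict(X_test)) * 100
print("Training accuracy:", train_acc)
print("Test accuracy:", test_acc)
```

The test accuracy is usually the more meaningful figure, since it estimates how the classifier behaves on samples it was not trained on.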
