Building a Deep Learning Model with the PyTorch Library

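PyTorch is a widely used open-source machine learning framework developed by Facebook's AI research team. It is known for its flexibility, its speed, and the ease with which complex models can be built. PyTorch is based on the Torch library (originally developed with Lua) and provides Python bindings.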

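PyTorch is used throughout academia and industry for machine learning tasks such as computer vision, natural language processing, and speech recognition. In this tutorial, we will learn how to build a deep learning model with the PyTorch library.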

Getting Started

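PyTorch is not part of the Python standard library, so it must be installed before we can use it. This can be done with the pip package manager.

To install PyTorch, together with the torchvision package used later in this tutorial, open your terminal and type the following command:

pip install torch torchvision

This downloads and installs both packages and their dependencies. Once the installation finishes, we can start using PyTorch and its modules.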
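A quick way to confirm that the installation worked is to import the library and create a tensor. The following is only a minimal sanity check; the variable name x is illustrative, and the printed version depends on your environment.

```python
import torch

# Print the installed version to confirm that the import works.
print(torch.__version__)

# Report whether a CUDA-capable GPU is visible.
# False is fine: everything in this tutorial also runs on the CPU.
print(torch.cuda.is_available())

# Create a small tensor as a basic sanity check.
x = torch.ones(2, 3)
print(x.shape)
```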

In this tutorial, we will use PyTorch to build a convolutional neural network (CNN) for image classification. CNNs are a class of deep learning model widely used for image classification tasks.

Step 1: Import the Required Libraries

The first step is to import the required libraries. We will use torch, torch.nn, and torchvision.

import torch
import torch.nn as nn
import torchvision
import torchvision.transforms as transforms

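Step 2: Load and Preprocess the Dataset

We will use CIFAR-10, a dataset widely used for image classification tasks. It contains 60,000 32x32 color images in 10 classes, with 6,000 images per class, split into 50,000 training images and 10,000 test images.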

transform = transforms.Compose(
   [transforms.ToTensor(),
   transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

trainset = torchvision.datasets.CIFAR10(root='./data', train=True,
   download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=4,
   shuffle=True, num_workers=2)

testset = torchvision.datasets.CIFAR10(root='./data', train=False,
   download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=4,
   shuffle=False, num_workers=2)

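We use torchvision.transforms to preprocess the images. ToTensor first converts each image to a tensor with values in [0, 1], and Normalize, with a mean and standard deviation of 0.5 per channel, then rescales those values to [-1, 1]. We then load the dataset and create data loaders for the training set and the test set.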

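Step 3: Define the CNN Model

With the data prepared, the next step is to define the structure of the CNN model in PyTorch. Our model consists of two convolutional layers followed by three fully connected layers.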

import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
   def __init__(self):
      super(Net, self).__init__()
      # input image channel, 3 for RGB images
      # output channel, 6 for 6 filters
      # kernel size = 5
      self.conv1 = nn.Conv2d(3, 6, 5)
      # input channel, 6 from previous layer
      # output channel, 16 for 16 filters
      # kernel size = 5
      self.conv2 = nn.Conv2d(6, 16, 5)
      # an affine operation: y = Wx + b
      # 16 * 5 * 5 is the size of the image after convolutional layers
      # 120 is the output size of the first fully connected layer
      self.fc1 = nn.Linear(16 * 5 * 5, 120)
      # 84 is the output size of the second fully connected layer
      self.fc2 = nn.Linear(120, 84)
      # 10 is the output size of the last fully connected layer
      self.fc3 = nn.Linear(84, 10)

   def forward(self, x):
      # max pooling over a (2, 2) window
      x = F.max_pool2d(F.relu(self.conv1(x)), (2, 2))
      # if the window is square, a single number is enough
      x = F.max_pool2d(F.relu(self.conv2(x)), 2)
      # flatten the input for fully connected layers
      x = x.view(-1, self.num_flat_features(x))
      x = F.relu(self.fc1(x))
      x = F.relu(self.fc2(x))
      x = self.fc3(x)
      return x

   def num_flat_features(self, x):
      size = x.size()[1:]  # all dimensions except the batch dimension
      num_features = 1
      for s in size:
         num_features *= s
      return num_features

net = Net()
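The 16 * 5 * 5 input size of the first fully connected layer follows from how each layer changes the spatial size of a 32x32 image. The sketch below traces those shapes using standalone layers with the same settings as the model above; the variable names are illustrative. It also shows that, for this tensor, torch.flatten(x, 1) yields the same result as the view call combined with num_flat_features.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

conv1 = nn.Conv2d(3, 6, 5)
conv2 = nn.Conv2d(6, 16, 5)

x = torch.randn(1, 3, 32, 32)      # one fake CIFAR-10-sized image

a = F.relu(conv1(x))               # 5x5 kernel, no padding: 32 - 5 + 1 = 28
b = F.max_pool2d(a, 2)             # 2x2 pooling halves the size: 28 / 2 = 14
c = F.relu(conv2(b))               # 14 - 5 + 1 = 10
d = F.max_pool2d(c, 2)             # 10 / 2 = 5

print(a.shape)  # torch.Size([1, 6, 28, 28])
print(b.shape)  # torch.Size([1, 6, 14, 14])
print(c.shape)  # torch.Size([1, 16, 10, 10])
print(d.shape)  # torch.Size([1, 16, 5, 5])

# Flatten everything except the batch dimension: 16 * 5 * 5 = 400 features.
flat = d.view(-1, 16 * 5 * 5)
flat_alt = torch.flatten(d, 1)
print(flat.shape)                   # torch.Size([1, 400])
print(torch.equal(flat, flat_alt))  # True
```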

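Step 4: Train the Model

Now that the CNN model is defined, it is time to train it on the CIFAR-10 training set. To do so, we use the training DataLoader from Step 2 to load the data in batches and feed it to the model. We also define the loss function and the optimizer.

The code for training the model is as follows: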

import torch.optim as optim

# Define the loss function and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)

# Train the model
num_epochs = 10
for epoch in range(num_epochs):
   running_loss = 0.0
   for i, data in enumerate(trainloader, 0):
      # Each batch is a pair: images and their class labels
      inputs, labels = data
      # Clear gradients left over from the previous batch
      optimizer.zero_grad()

      # Forward pass
      outputs = net(inputs)
      loss = criterion(outputs, labels)

      # Backward and optimize
      loss.backward()
      optimizer.step()

      # Print statistics
      running_loss += loss.item()
      if i % 2000 == 1999:    # Print every 2000 mini-batches
         print('[%d, %5d] loss: %.3f' %
            (epoch + 1, i + 1, running_loss / 2000))
         running_loss = 0.0
print('Finished Training')

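We loop over the dataset for 10 epochs, training the model on the training data. At the start of each epoch, we reset the running loss to 0 and then loop over the batches of data.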

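For each batch, we run the model's forward pass, compute the loss, run backpropagation, and update the model's parameters with the optimizer. Finally, we print the average training loss every 2,000 mini-batches.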
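The per-batch steps can be tried in isolation on random data. The sketch below runs a single optimization step; toy_model is a hypothetical one-layer stand-in for the CNN, and the inputs and labels are random rather than taken from CIFAR-10.

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)

# Hypothetical tiny model standing in for the CNN: flatten, then one linear layer.
toy_model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(toy_model.parameters(), lr=0.001, momentum=0.9)

# One random batch of 4 "images" with random labels in 0..9.
inputs = torch.randn(4, 3, 32, 32)
labels = torch.randint(0, 10, (4,))

weights_before = toy_model[1].weight.detach().clone()

optimizer.zero_grad()               # clear old gradients
outputs = toy_model(inputs)         # forward pass: one score per class
loss = criterion(outputs, labels)   # compare scores with the true labels
loss.backward()                     # backpropagation: compute gradients
optimizer.step()                    # update the parameters

weights_after = toy_model[1].weight.detach()
print(outputs.shape)
print(loss.item())
# The weights should differ after the step, so this should print False.
print(torch.equal(weights_before, weights_after))
```

Calling optimizer.zero_grad() before each backward pass matters because PyTorch accumulates gradients across calls to backward() by default.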

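Step 5: Evaluate the Model

Now that the model is trained, it is time to evaluate its performance on the test set. We use the test DataLoader from Step 2 to load the test data in batches and feed it to the model.

The code for evaluating the model is as follows: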

# Evaluate the model
correct = 0
total = 0
# Gradients are not needed for evaluation, so disable tracking
with torch.no_grad():
   for data in testloader:
      images, labels = data
      outputs = net(images)
      # The index of the highest score is the predicted class
      _, predicted = torch.max(outputs, 1)
      total += labels.size(0)
      correct += (predicted == labels).sum().item()

print('Accuracy of the network on the 10000 test images: %d %%' % (
   100 * correct / total))

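In this code, we first initialize the **correct** and **total** variables to 0. We then loop over the test set with the DataLoader and feed each batch to the model. We use the **torch.max()** function to get the index of the highest output value, which represents the predicted class. We then compare the predicted classes with the true classes and update the **correct** and **total** variables accordingly.

Finally, we print the model's accuracy on the test set.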
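To see how the prediction and accuracy bookkeeping works, here is a small worked example on hand-written scores. The values in outputs and labels are made up for illustration and do not come from the trained model.

```python
import torch

# Made-up scores for 2 images over 3 classes, plus their true labels.
outputs = torch.tensor([[0.1, 2.0, -1.0],
                        [1.5, 0.2, 0.3]])
labels = torch.tensor([1, 2])

# torch.max along dimension 1 returns (max values, indices of the max values).
values, predicted = torch.max(outputs, 1)
print(predicted)  # tensor([1, 0])

# The first prediction matches its label; the second does not.
correct = (predicted == labels).sum().item()
total = labels.size(0)
print(100 * correct / total)  # 50.0
```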

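Conclusion

In summary, PyTorch is a powerful deep learning package with an easy-to-use interface for building and training neural networks. In this tutorial, we covered the basics of using PyTorch to build a convolutional neural network for image classification.

PyTorch's flexibility and ease of use make it an excellent choice for researchers and practitioners interested in deep learning. Its dynamic computation graph and automatic differentiation engine make it straightforward to build complex models and optimize them efficiently. PyTorch also has a large and active community, so there are plenty of resources for learning and for getting help when needed.

Overall, PyTorch is a strong choice for anyone getting started with deep learning, whether a newcomer or an experienced practitioner. With its simple API and powerful features, it can help you quickly build and train deep learning models for a wide range of applications.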

S Vijay Balaji

Updated on: August 31, 2023
