PyTorch - Recurrent Neural Networks

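A recurrent neural network is a deep learning algorithm that processes data sequentially. In an ordinary feed-forward neural network, every input is assumed to be independent of the others. Recurrent networks drop that assumption: they are called recurrent because they perform the same computation for each element of a sequence in order, with each step depending on the result of the previous one.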
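As a rough illustration of that step-by-step dependence, the NumPy sketch below carries a hidden state from one element of a sequence to the next. The names `step`, `W` and `hidden_size`, the weight scale and the three input values are all assumptions made for this example; they are not part of the tutorial's model.

```python
import numpy as np

rng = np.random.default_rng(0)
hidden_size = 3  # hypothetical size, chosen only for this illustration

# One weight matrix applied to the current input joined with the previous state.
W = rng.normal(0.0, 0.5, size=(1 + hidden_size, hidden_size))

def step(x_t, h_prev):
    """Compute the next hidden state from the current input and the previous state."""
    xh = np.concatenate(([x_t], h_prev))  # shape (1 + hidden_size,)
    return np.tanh(xh @ W)                # shape (hidden_size,)

h = np.zeros(hidden_size)  # initial state
states = []
for x_t in [0.1, 0.2, 0.3]:
    h = step(x_t, h)       # each step reuses the state left by the previous one
    states.append(h)
```

Because `h` is passed back into `step`, the same input value can yield a different state depending on what came before it in the sequence.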

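The following diagram illustrates the overall approach and working of a recurrent neural network:

In the figure above, c1, c2, c3 and x1 are the inputs; the hidden values h1, h2 and h3 are computed from them and produce the corresponding output o1. We will now focus on using PyTorch and a recurrent neural network to generate a sine wave.

During training, we feed the model one data point at a time. The input sequence x consists of 20 data points, and the target sequence is the same sequence shifted one step ahead, so the model learns to predict the next value.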
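The pairing of inputs and targets can be sketched in plain NumPy as follows. The names `wave`, `inputs`, `targets` and `pairs` are chosen for this illustration only; the sampling range matches the one used later in the tutorial.

```python
import numpy as np

seq_length = 20

# Sample one extra point so the input and target sequences can overlap
# in all but one position.
t = np.linspace(2, 10, seq_length + 1)
wave = np.sin(t)

inputs = wave[:-1]   # points 0..19
targets = wave[1:]   # points 1..20: the next value for each input

# One training step sees one (input, target) pair.
pairs = list(zip(inputs, targets))
```

Each target is simply the input that follows it, which is what turns the task into next-value prediction.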

Step 1

Import the packages needed to implement the recurrent neural network:

import torch
from torch.autograd import Variable
import numpy as np
import pylab as pl
import torch.nn.init as init

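Step 2

We set the model hyperparameters. The input layer has size 7: 1 input neuron plus 6 context neurons, which together are used to produce the target sequence.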

dtype = torch.FloatTensor
input_size, hidden_size, output_size = 7, 6, 1
epochs = 300
seq_length = 20
lr = 0.1
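# Sample seq_length + 1 points of a sine wave on [2, 10].
data_time_steps = np.linspace(2, 10, seq_length + 1)
data = np.sin(data_time_steps)
# Reshape to a column vector: one row per time step.
data = data.reshape(seq_length + 1, 1)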

x = Variable(torch.Tensor(data[:-1]).type(dtype), requires_grad=False)
y = Variable(torch.Tensor(data[1:]).type(dtype), requires_grad=False)

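This generates the training data, where x is the input sequence and y is the desired target sequence.

Step 3

In a recurrent neural network, the weights are initialized from a normal distribution with zero mean. w1 holds the weights applied to the input (together with the context state), and w2 holds the weights that produce the output, as shown below: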

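# Weights from (input + context) to the hidden layer, drawn from N(0, 0.4^2).
w1 = torch.empty(input_size, hidden_size).type(dtype)
init.normal_(w1, 0.0, 0.4)
w1 = Variable(w1, requires_grad = True)
# Weights from the hidden layer to the output, drawn from N(0, 0.3^2).
w2 = torch.empty(hidden_size, output_size).type(dtype)
init.normal_(w2, 0.0, 0.3)
w2 = Variable(w2, requires_grad = True)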
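Since the weights are random, two runs of the tutorial will generally start from different values. A minimal sketch of one way to make the initialization repeatable is shown below, assuming you want to seed PyTorch's generator with `torch.manual_seed`; the helper `make_weights` and the seed value are hypothetical and do not appear in the original steps.

```python
import torch
import torch.nn.init as init

def make_weights(seed, input_size=7, hidden_size=6, output_size=1):
    """Hypothetical helper: build w1 and w2 from a fixed seed."""
    torch.manual_seed(seed)  # assumption: seeding is not part of the original tutorial
    w1 = torch.empty(input_size, hidden_size)
    init.normal_(w1, mean=0.0, std=0.4)
    w2 = torch.empty(hidden_size, output_size)
    init.normal_(w2, mean=0.0, std=0.3)
    # Mark both tensors as trainable.
    return w1.requires_grad_(), w2.requires_grad_()

w1_a, w2_a = make_weights(0)
w1_b, w2_b = make_weights(0)  # same seed, so the same starting weights
```

Calling the helper twice with the same seed should give identical tensors, which makes loss curves from separate runs easier to compare.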

Step 4

Next, we define the feed-forward function, which fully specifies the network's computation.

def forward(input, context_state, w1, w2):
   xh = torch.cat((input, context_state), 1)
   context_state = torch.tanh(xh.mm(w1))
   out = context_state.mm(w2)
   return (out, context_state)
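To see what a single call does, the sketch below runs `forward` on one time step and records the resulting shapes. The function body is copied from the step above; the weights use `torch.randn` scaled by the same standard deviations, and the input value 0.5 is arbitrary, both chosen only for this illustration.

```python
import torch

input_size, hidden_size, output_size = 7, 6, 1

def forward(input, context_state, w1, w2):
    # Join the 1 input value with the 6 context values: shape (1, 7).
    xh = torch.cat((input, context_state), 1)
    # New context state, squashed into (-1, 1): shape (1, 6).
    context_state = torch.tanh(xh.mm(w1))
    # Linear read-out of the prediction: shape (1, 1).
    out = context_state.mm(w2)
    return (out, context_state)

# Illustrative random weights (assumption: not the tutorial's trained values).
w1 = 0.4 * torch.randn(input_size, hidden_size)
w2 = 0.3 * torch.randn(hidden_size, output_size)

step_input = torch.tensor([[0.5]])            # one data point, shape (1, 1)
context_state = torch.zeros(1, hidden_size)   # initial context, shape (1, 6)

out, new_context = forward(step_input, context_state, w1, w2)
```

The concatenation is why `input_size` is 7 even though each data point is a single number: the remaining 6 entries are the context carried over from the previous step.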

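Step 5

The next step is to train the recurrent network on the sine wave. The outer loop iterates over epochs and the inner loop iterates over the elements of the sequence. Here we also compute a squared-error loss (the basis of the mean squared error, MSE), which is suited to predicting continuous variables.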

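for i in range(epochs):
   total_loss = 0.0
   # Reset the context (hidden) state at the start of every epoch.
   context_state = Variable(torch.zeros((1, hidden_size)).type(dtype), requires_grad = True)
   for j in range(x.size(0)):
      input = x[j:(j+1)]
      target = y[j:(j+1)]
      (pred, context_state) = forward(input, context_state, w1, w2)
      # Half squared error for this single time step.
      loss = (pred - target).pow(2).sum()/2
      # Accumulate a plain Python float so the graph is not kept alive.
      total_loss += loss.item()
      loss.backward()
      # Manual gradient descent step, then clear the gradients.
      w1.data -= lr * w1.grad.data
      w2.data -= lr * w2.grad.data
      w1.grad.data.zero_()
      w2.grad.data.zero_()
      # Detach the state so gradients do not flow into earlier steps.
      context_state = Variable(context_state.data)
   if i % 10 == 0:
      print("Epoch: {} loss {}".format(i, total_loss))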

# After training, run the network over the input sequence to collect predictions.
context_state = Variable(torch.zeros((1, hidden_size)).type(dtype), requires_grad = False)
predictions = []

for i in range(x.size(0)):
   input = x[i:i+1]
   (pred, context_state) = forward(input, context_state, w1, w2)
   predictions.append(pred.data.numpy().ravel()[0])
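The training loop above updates the weights by hand. As an alternative, the sketch below performs the same per-step gradient descent with `torch.optim.SGD` and `detach()`. This is a swapped-in variant, not the tutorial's own code: the seed, the reduced epoch count of 50 and the `losses` list are assumptions made to keep the example short and inspectable.

```python
import numpy as np
import torch

torch.manual_seed(0)  # assumption: fixed seed for a repeatable sketch

input_size, hidden_size, output_size = 7, 6, 1
seq_length, epochs, lr = 20, 50, 0.1  # fewer epochs than the tutorial's 300

# Same sine-wave data as the tutorial: inputs and next-step targets.
t = np.linspace(2, 10, seq_length + 1)
wave = torch.tensor(np.sin(t), dtype=torch.float32).reshape(-1, 1)
x, y = wave[:-1], wave[1:]

w1 = (0.4 * torch.randn(input_size, hidden_size)).requires_grad_()
w2 = (0.3 * torch.randn(hidden_size, output_size)).requires_grad_()
optimizer = torch.optim.SGD([w1, w2], lr=lr)  # plain SGD, no momentum

def forward(inp, context_state, w1, w2):
    xh = torch.cat((inp, context_state), 1)
    context_state = torch.tanh(xh.mm(w1))
    return context_state.mm(w2), context_state

losses = []
for epoch in range(epochs):
    total_loss = 0.0
    context_state = torch.zeros(1, hidden_size)  # reset state each epoch
    for j in range(x.size(0)):
        pred, context_state = forward(x[j:j+1], context_state, w1, w2)
        loss = (pred - y[j:j+1]).pow(2).sum() / 2
        optimizer.zero_grad()   # clear gradients from the previous step
        loss.backward()
        optimizer.step()        # w <- w - lr * grad
        # Cut the graph so the next step does not backpropagate into this one.
        context_state = context_state.detach()
        total_loss += loss.item()
    losses.append(total_loss)
```

Keeping the per-epoch totals in `losses` makes it easy to check afterwards whether training is converging, for example by comparing the first and last entries.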

Step 6

Now it is time to plot the sine wave together with the model's predictions.

pl.scatter(data_time_steps[:-1], x.data.numpy(), s = 90, label = "Actual")
pl.scatter(data_time_steps[1:], predictions, label = "Predicted")
pl.legend()
pl.show()

Output

The output of the above process is as follows:
