How to apply a 2D transposed convolution operation in PyTorch?
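We can apply a 2D transposed convolution operation to an input image composed of several input planes using the **torch.nn.ConvTranspose2d()** module. This module can be seen as the gradient of **Conv2d** with respect to its input.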
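The "gradient of Conv2d with respect to its input" relationship can be checked numerically. The following is a minimal sketch, assuming stride 1, no padding, and arbitrary random tensors chosen only for illustration; it compares the gradient computed by autograd with a direct call to the functional form `torch.nn.functional.conv_transpose2d`.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Input to an ordinary convolution; we want its gradient, so track it.
x = torch.randn(1, 2, 4, 4, requires_grad=True)
# conv2d weight layout: [out_channels, in_channels, kH, kW]
w = torch.randn(3, 2, 2, 2)

y = F.conv2d(x, w)          # shape [1, 3, 3, 3]
g = torch.randn_like(y)     # an arbitrary upstream gradient

# Gradient of conv2d with respect to its input, computed by autograd.
(grad_x,) = torch.autograd.grad(y, x, grad_outputs=g)

# The same quantity computed as a transposed convolution of g.
# conv_transpose2d expects weights as [in_channels, out_channels, kH, kW];
# here its "in_channels" is the 3 channels of g, so w can be reused as-is.
via_convt = F.conv_transpose2d(g, w)

print(grad_x.shape)
print(torch.allclose(grad_x, via_convt, atol=1e-5))
```

With a stride larger than 1 the same comparison may additionally need `output_padding` to recover the exact input size, which is why this sketch stays with stride 1.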
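The input to a 2D transposed convolution layer is expected to be of size **[N,C,H,W]**, where **N** is the batch size, **C** is the number of channels, and **H** and **W** are the height and width of the input image, respectively.
Generally, a 2D transposed convolution operation is applied to image tensors. For an RGB image, the number of channels is 3. The main parameters of a transposed convolution operation are the filter (kernel) size and the stride. This module supports **TensorFloat32**.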
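Syntax
torch.nn.ConvTranspose2d(in_channels, out_channels, kernel_size)
Parameters
**in_channels** – Number of channels in the input image.
**out_channels** – Number of channels produced by the transposed convolution operation.
**kernel_size** – Size of the convolving kernel.
Besides the three parameters above, there are some optional parameters such as **stride, padding, dilation**, etc. We will look at these parameters in the Python examples below.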
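How these parameters determine the output size can be made concrete. The sketch below uses the output-size formula from the PyTorch documentation, applied per spatial dimension: `out = (in - 1) * stride - 2 * padding + dilation * (kernel_size - 1) + output_padding + 1`. The helper name `convt_out_size` and the parameter values are assumptions made for illustration.

```python
import torch
import torch.nn as nn

def convt_out_size(size_in, kernel_size, stride=1, padding=0,
                   dilation=1, output_padding=0):
    """Expected output length along one spatial dimension (hypothetical helper)."""
    return ((size_in - 1) * stride - 2 * padding
            + dilation * (kernel_size - 1) + output_padding + 1)

x = torch.randn(1, 2, 4, 4)  # [N, C, H, W]
convt = nn.ConvTranspose2d(2, 3, kernel_size=(3, 5), stride=(2, 1),
                           padding=(1, 2), dilation=(2, 1))
out = convt(x)

# Height and width are computed independently from the per-dimension settings.
h = convt_out_size(4, 3, stride=2, padding=1, dilation=2)
w = convt_out_size(4, 5, stride=1, padding=2, dilation=1)

print("Layer output size:", out.shape)
print("Formula (H, W):", (h, w))
```

Computing the expected size this way before building a network is a quick check that a chosen combination of stride, padding, and dilation produces the spatial size you intend.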
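Steps
You can use the following steps to apply a 2D transposed convolution operation:
Import the required libraries. In all the following examples, the required Python library is **torch**. Make sure you have already installed it. To apply a 2D transposed convolution operation on an image, we also need **torchvision** and **Pillow**.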
import torch
import torchvision
from PIL import Image
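Define the **input** tensor or read the input image. If the input is an image, we first convert it into a torch tensor.
Define **in_channels, out_channels, kernel_size**, and the other parameters.
Next, define a transposed convolution operation convt by passing the parameters defined above to **torch.nn.ConvTranspose2d()**.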
convt = torch.nn.ConvTranspose2d(in_channels, out_channels, kernel_size)
Apply the transposed convolution operation convt to the input tensor or image tensor.
output = convt(input)
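Next, print the tensor obtained after the transposed convolution operation. If the input was an image tensor, then to visualize the image we first convert the tensor obtained after the transposed convolution operation into a PIL image and then display it.
Let's look at some examples for a clearer understanding.
Input Image
We will use the following image as the input file in Example 2.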

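Example 1
In the following Python example, we perform a 2D transposed convolution operation on an input tensor. We apply different combinations of **kernel_size, stride, padding**, and **dilation**.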
# Python 3 program to perform 2D transpose convolution operation
import torch
import torch.nn as nn
'''torch.nn.ConvTranspose2d(in_channels, out_channels, kernel_size, stride=1, padding=0)
'''
in_channels = 2
out_channels = 3
kernel_size = 2
convt = nn.ConvTranspose2d(in_channels, out_channels, kernel_size)
# conv = nn.ConvTranspose2d(3, 6, 2)
'''input of size [N,C,H, W]
N==>batch size,
C==> number of channels,
H==> height of input planes in pixels,
W==> width in pixels.
'''
# define the input with below info
N=1
C=2
H=4
W=4
input = torch.empty(N,C,H,W).random_(256)
# input = torch.randn(2,3,32,64)
print("Input Tensor:\n", input)
print("Input Size:",input.size())
# Perform transpose convolution operation
output = convt(input)
print("Output Tensor:\n", output)
print("Output Size:",output.size())
# With square kernels (3,3) and equal stride
convt = nn.ConvTranspose2d(2, 3, 3, stride=2)
output = convt(input)
print("Output Size:",output.size())
# non-square kernels and unequal stride and with padding
convt = nn.ConvTranspose2d(2, 3, (3, 5), stride=(2, 1), padding=(4, 2))
output = convt(input)
print("Output Size:",output.size())
# non-square kernels and unequal stride and with padding and dilation
convt = nn.ConvTranspose2d(2, 3, (3, 5), stride=(2, 1),
                           padding=(4, 2), dilation=(3, 1))
output = convt(input)
print("Output Size:",output.size())
Output
Input Tensor:
 tensor([[[[115.,  76., 102.,   6.],
           [221., 173.,  23., 205.],
           [123.,  23., 112.,  18.],
           [189., 178., 167., 143.]],
          [[239., 180., 226.,  88.],
           [224.,  30., 196., 224.],
           [ 57., 222.,  47.,  84.],
           [ 25., 255., 201., 114.]]]])
Input Size: torch.Size([1, 2, 4, 4])
Output Tensor:
 tensor([[[[  48.1156,   64.6112,   64.9630,   47.2604,    3.9925],
           [  74.9169,   80.7055,  138.8992,   82.8471,   54.3722],
           [  20.0938,   49.5610,   30.2914,   93.3563,    3.1597],
           [ -27.1410,  118.8138,   92.8670,   50.6170,   37.5564],
           [ -27.7676,    6.5762,   33.6408,    6.7176,   -8.8372]],
          [[ -18.2188,  -56.5362,  -49.8063,  -43.3336,  -16.8645],
           [ -23.4012,   -6.1607,   40.5064,  -17.4547,  -25.1738],
           [  -5.7752,   53.6838,  -27.9412,   36.7660,   44.0866],
           [ -23.5205,    1.1443,  -29.0826,  -34.7213,   -4.1535],
           [   5.6746,   38.4026,   72.8414,   59.2990,   34.9241]],
          [[ -35.0380,  -31.4031,  -38.0059,  -19.3247,   -5.6272],
           [-109.2401,  -12.9763,  -62.2776,  -31.0825,   19.2766],
           [ -93.6596,  -18.5403,  -67.5457,  -61.8533,   32.3005],
           [ -27.7020,  -71.3938,  -18.9532,  -26.8304,   20.0184],
           [ -29.2334,  -85.8179,  -35.4292,  -16.4065,   19.0788]]]],
        grad_fn=<SlowConvTranspose2DBackward>)
Output Size: torch.Size([1, 3, 5, 5])
Output Size: torch.Size([1, 3, 9, 9])
Output Size: torch.Size([1, 3, 1, 4])
Output Size: torch.Size([1, 3, 5, 4])
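Example 1 reports a 9×9 output for a 3×3 kernel with stride 2 on a 4×4 input. To see where that size comes from, here is a minimal loop-based sketch of a transposed convolution, assuming a single sample, a single input and output channel, no bias, no padding, and no dilation. The helper `naive_conv_transpose2d` is hypothetical and written only for illustration; it is not how PyTorch implements the operation internally.

```python
import torch
import torch.nn.functional as F

def naive_conv_transpose2d(x, weight, stride=1):
    """Loop-based transposed convolution for one 2D plane (illustrative helper)."""
    h_in, w_in = x.shape
    k_h, k_w = weight.shape
    out = torch.zeros((h_in - 1) * stride + k_h, (w_in - 1) * stride + k_w)
    for i in range(h_in):
        for j in range(w_in):
            # Each input value adds a scaled copy of the kernel to the output,
            # placed at an offset of `stride` pixels per input position.
            out[i * stride:i * stride + k_h,
                j * stride:j * stride + k_w] += x[i, j] * weight
    return out

torch.manual_seed(0)
x = torch.randn(4, 4)
weight = torch.randn(3, 3)

manual = naive_conv_transpose2d(x, weight, stride=2)

# Reference: the built-in op expects [N, C, H, W] input and
# [in_channels, out_channels, kH, kW] weights, so add two leading axes.
reference = F.conv_transpose2d(x[None, None], weight[None, None], stride=2)[0, 0]

print(manual.shape)
print(torch.allclose(manual, reference, atol=1e-5))
```

Because neighbouring kernel copies overlap when the kernel is larger than the stride, overlapping output positions receive contributions from several input values.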
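Example 2
In the following Python example, we perform a 2D transposed convolution operation on an input image. To apply the 2D transposed convolution, we first convert the image into a torch tensor, and after the transposed convolution we convert it back into a PIL image for visualization.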
# Python program to perform 2D transpose convolution operation
# Import the required libraries
import torch
import torchvision
from PIL import Image
import torchvision.transforms as T
# Read input image
img = Image.open('car.jpg')
# convert the input image to torch tensor
img = T.ToTensor()(img)
print("Input image size:", img.size()) # size = [3, 464, 700]
# unsqueeze the image to make it 4D tensor
img = img.unsqueeze(0) # image size = [1, 3, 464, 700]
# define transpose convolution layer
# convt = nn.ConvTranspose2d(in_channels, out_channels, kernel_size)
convt = torch.nn.ConvTranspose2d(3, 3, 2)
# apply transpose convolution operation on image
img = convt(img)
# squeeze image to make it 3D
img = img.squeeze(0) # now image is again 3D
print("Output image size:",img.size())
# convert image to PIL image
img = T.ToPILImage()(img)
# display the image after convolution
img.show()
'''
Note: You may get different output image after the convolution operation
because the weights initialized may be different at different runs.
'''
Output
Input image size: torch.Size([3, 464, 700])
Output image size: torch.Size([3, 465, 701])

Note that because the **weights** and **bias** are initialized randomly, you may see some variation in the resulting image on each run.
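If you want the same result on every run, one option is to seed PyTorch's random number generator before creating the layer. The following is a minimal sketch; the helper `make_layer` and the seed values are assumptions made for illustration.

```python
import torch
import torch.nn as nn

def make_layer(seed):
    # Seeding the global RNG right before construction fixes the initial
    # weights and bias that the layer draws at random (hypothetical helper).
    torch.manual_seed(seed)
    return nn.ConvTranspose2d(3, 3, 2)

a = make_layer(0)
b = make_layer(0)
c = make_layer(1)

# Same seed: identical parameters, hence identical outputs for the same input.
print(torch.equal(a.weight, b.weight), torch.equal(a.bias, b.bias))
# Different seed: different parameters.
print(torch.equal(a.weight, c.weight))
```

Seeding fixes only the initialization; the layer is still untrained, so the output image remains a random linear mixing of the input rather than a meaningful transformation.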