How to convert an image to a PyTorch tensor?


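A PyTorch tensor is an n-dimensional array (matrix) whose elements all share a single data type. Tensors are similar to NumPy arrays; the main difference is that tensors can run numerical computations on a GPU for acceleration. Converting an image to a tensor lets PyTorch operations and models process it and take advantage of this acceleration.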
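As a small illustration of the relationship between NumPy arrays and tensors, the sketch below wraps an array in a tensor and moves it to a GPU only if one is available. The array contents and variable names are made up for this example.

```python
import numpy as np
import torch

# A small NumPy array standing in for real data (values are arbitrary)
array = np.array([[1.0, 2.0], [3.0, 4.0]], dtype=np.float32)

# torch.from_numpy creates a tensor that shares memory with the array
tensor = torch.from_numpy(array)

# Use the GPU only when CUDA is available; otherwise stay on the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
tensor = tensor.to(device)

# The same arithmetic runs on whichever device holds the tensor
result = (tensor * 2).sum()
print(result.item())  # 20.0
```

On a machine without a CUDA device the code simply runs on the CPU, so the result is the same either way.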

To convert an image to a PyTorch tensor, we can take the following steps:

Steps

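  • Import the required libraries. The required libraries are **torch, torchvision, and Pillow**.

  • Read the image. The image must be either a PIL image or a **numpy.ndarray (HxWxC)** with values in the range [0, 255]. Here **H, W**, and **C** are the height, width, and number of channels of the image, respectively.

  • Define a transform to convert the image to a tensor. We use **transforms.ToTensor()** to define the transform.

  • Convert the image to a tensor using the transform defined above. For 8-bit input, the result is a float tensor of shape (CxHxW) with values scaled to the range [0.0, 1.0].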

Input image: Penguins.jpg

Example 1

# Import the required libraries
import torch
from PIL import Image
import torchvision.transforms as transforms

# Read the image
image = Image.open('Penguins.jpg')

# Define a transform to convert the image to tensor
transform = transforms.ToTensor()

# Convert the image to PyTorch tensor
tensor = transform(image)

# Print the converted image tensor
print(tensor)

Output

tensor([[[0.4510, 0.4549, 0.4667, ..., 0.3333, 0.3333, 0.3333],
         [0.4549, 0.4510, 0.4627, ..., 0.3373, 0.3373, 0.3373],
         [0.4667, 0.4588, 0.4667, ..., 0.3451, 0.3451, 0.3412],
         ...,
         [0.6706, 0.5020, 0.5490, ..., 0.4627, 0.4275, 0.3333],
         [0.4196, 0.5922, 0.6784, ..., 0.4627, 0.4549, 0.3569],
         [0.3569, 0.3529, 0.4784, ..., 0.3922, 0.4314, 0.3490]],
        [[0.6824, 0.6863, 0.7020, ..., 0.6392, 0.6392, 0.6392],
         [0.6863, 0.6824, 0.6980, ..., 0.6314, 0.6314, 0.6314],
         [0.6980, 0.6902, 0.6980, ..., 0.6392, 0.6392, 0.6353],
         ...,
         [0.7255, 0.5412, 0.5765, ..., 0.5255, 0.5020, 0.4157],
         [0.4706, 0.6314, 0.7098, ..., 0.5255, 0.5294, 0.4392],
         [0.4196, 0.3961, 0.5020, ..., 0.4510, 0.5059, 0.4314]],
        [[0.8157, 0.8196, 0.8353, ..., 0.7922, 0.7922, 0.7922],
         [0.8196, 0.8157, 0.8314, ..., 0.7882, 0.7882, 0.7882],
         [0.8314, 0.8235, 0.8314, ..., 0.7961, 0.7961, 0.7922],
         ...,
         [0.6235, 0.5059, 0.6157, ..., 0.4863, 0.4941, 0.4196],
         [0.3922, 0.6000, 0.7176, ..., 0.4863, 0.5216, 0.4431],
         [0.3686, 0.3647, 0.4863, ..., 0.4235, 0.4980, 0.4353]]])

In the above Python program, we converted a PIL image to a tensor.

Example 2

We can also read the image using **OpenCV**. An image read with OpenCV is of type **numpy.ndarray**. We can use **transforms.ToTensor()** to convert the **numpy.ndarray** to a tensor. See the example below.

# Import the required libraries
import torch
import cv2
import torchvision.transforms as transforms

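# Read the image (OpenCV returns a numpy.ndarray in BGR channel order)
image = cv2.imread('Penguins.jpg')
# Convert BGR to RGB so the channel order matches the PIL example
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)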

# Define a transform to convert the image to tensor
transform = transforms.ToTensor()

# Convert the image to PyTorch tensor
tensor = transform(image)

# Print the converted image tensor
print(tensor)

Output

tensor([[[0.4510, 0.4549, 0.4667, ..., 0.3333, 0.3333, 0.3333],
         [0.4549, 0.4510, 0.4627, ..., 0.3373, 0.3373, 0.3373],
         [0.4667, 0.4588, 0.4667, ..., 0.3451, 0.3451, 0.3412],
         ...,
         [0.6706, 0.5020, 0.5490, ..., 0.4627, 0.4275, 0.3333],
         [0.4196, 0.5922, 0.6784, ..., 0.4627, 0.4549, 0.3569],
         [0.3569, 0.3529, 0.4784, ..., 0.3922, 0.4314, 0.3490]],
        [[0.6824, 0.6863, 0.7020, ..., 0.6392, 0.6392, 0.6392],
         [0.6863, 0.6824, 0.6980, ..., 0.6314, 0.6314, 0.6314],
         [0.6980, 0.6902, 0.6980, ..., 0.6392, 0.6392, 0.6353],
         ...,
         [0.7255, 0.5412, 0.5765, ..., 0.5255, 0.5020, 0.4157],
         [0.4706, 0.6314, 0.7098, ..., 0.5255, 0.5294, 0.4392],
         [0.4196, 0.3961, 0.5020, ..., 0.4510, 0.5059, 0.4314]],
        [[0.8157, 0.8196, 0.8353, ..., 0.7922, 0.7922, 0.7922],
         [0.8196, 0.8157, 0.8314, ..., 0.7882, 0.7882, 0.7882],
         [0.8314, 0.8235, 0.8314, ..., 0.7961, 0.7961, 0.7922],
         ...,
         [0.6235, 0.5059, 0.6157, ..., 0.4863, 0.4941, 0.4196],
         [0.3922, 0.6000, 0.7176, ..., 0.4863, 0.5216, 0.4431],
         [0.3686, 0.3647, 0.4863, ..., 0.4235, 0.4980, 0.4353]]])

Updated on: November 6, 2021
