How to read a JPEG or PNG image in PyTorch?

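Reading images is a fundamental step in image processing and computer vision tasks. The **torchvision.io** package provides functions for performing various **IO** operations. To read an image, it provides the **read_image()** function, which reads **JPEG** and **PNG** images and returns a **3D RGB** or **grayscale** tensor.

The three dimensions of the tensor correspond to **[C,H,W]**. **C** is the number of channels, and **W** and **H** are the width and height of the image, respectively.

For **RGB** images the number of channels is 3, so reading an image yields a tensor of shape **[3,H,W]**. The values of the output tensor are in the range **[0,255]**.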

Syntax

torchvision.io.read_image(path)

Parameters

**path** - Path of the input JPEG or PNG image.

Output

It returns a torch tensor of size **[image_channels, image_height, image_width]**.

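Steps

You can use the following steps to read and visualize a JPEG or PNG image in PyTorch.

Import the required libraries. In all the following examples, the required Python libraries are **torch** and **torchvision**. Make sure you have already installed them.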

import torch
import torchvision
from torchvision.io import read_image
import torchvision.transforms as T

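Read a **JPEG** or **PNG** image using the **read_image()** function. Specify the full image path, including the file extension (.jpg or .png). The output of this function is a torch tensor of size **[image_channels, image_height, image_width]**.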

img = read_image('butterfly.jpg')

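Optionally, inspect image properties such as the type and size of the image.

To display the image, first convert the image tensor to a PIL image, then show it.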

img = T.ToPILImage()(img)
img.show()

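Input Images

The examples below use the images butterfly.jpg and elephant.png as input files.

Example 1

The following is the complete Python code to read a JPEG image using PyTorch.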

# Import the required libraries
import torch
import torchvision
from torchvision.io import read_image
import torchvision.transforms as T

# read a JPEG image
img = read_image('butterfly.jpg')

# display the image properties
print("Image data:\n", img)

# check if input image is a PyTorch tensor
print("Is image a PyTorch Tensor:", torch.is_tensor(img))
print("Type of Image:", type(img))

# size of the image
print(img.size())

# convert the torch tensor to PIL image
img = T.ToPILImage()(img)

# display the image
img.show()

Output

Image data:
   tensor([[[146, 169, 191, ..., 71, 61, 53],
      [140, 169, 192, ..., 75, 63, 53],
      [126, 161, 186, ..., 85, 68, 58],
      ...,
      [ 33, 31, 30, ..., 218, 221, 223],
      [ 30, 30, 31, ..., 216, 219, 224],
      [ 41, 45, 52, ..., 218, 219, 220]],

      [[130, 151, 170, ..., 47, 41, 35],
      [124, 151, 171, ..., 52, 42, 36],
      [110, 145, 168, ..., 61, 48, 39],
      ...,
      [ 29, 26, 25, ..., 197, 198, 200],
      [ 25, 25, 26, ..., 195, 198, 200],
      [ 20, 25, 33, ..., 200, 201, 202]],

      [[ 79, 101, 123, ..., 21, 17, 13],
      [ 73, 101, 126, ..., 21, 13, 10],
      [ 61, 96, 122, ..., 23, 11, 6],
      ...,
      [ 20, 20, 19, ..., 166, 167, 169],
      [ 19, 19, 20, ..., 164, 167, 172],
      [ 25, 27, 29, ..., 164, 165, 166]]],
dtype=torch.uint8)
Is image a PyTorch Tensor: True
Type of Image: <class 'torch.Tensor'>
torch.Size([3, 465, 700])

Note that the output of **read_image()** is a torch tensor with values in the range [0,255], and the tensor's dtype is **torch.uint8**.
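Because the tensor is **torch.uint8**, it often needs to be converted before further numerical work. One common convention, though not the only one, is to scale it to floating point in [0,1]. The sketch below uses a tiny hand-written tensor as a stand-in for the result of reading an image.

```python
import torch

# Stand-in for a decoded image: one channel, 2x2 pixels, uint8 values
img = torch.tensor([[[0, 128], [255, 64]]], dtype=torch.uint8)

# Convert to float32 first, then scale from [0, 255] down to [0, 1]
img_float = img.to(torch.float32) / 255.0

print("dtype:", img_float.dtype)
print("min:", float(img_float.min()), "max:", float(img_float.max()))
```

Converting to float before dividing matters here: the cast makes the target dtype explicit instead of relying on type promotion rules.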

Example 2

In this Python code, we will see how to read a **PNG** image using PyTorch.

import torch
import torchvision

# read a png image
img = torchvision.io.read_image('elephant.png')

# display the properties of image
print("Image data:\n", img)
print(img.size())
print(type(img))

# display the png image
# convert the image tensor to PIL image
img = torchvision.transforms.ToPILImage()(img)

# display the PIL image
img.show()

Output

Image data:
   tensor([[[ 14, 13, 11, ..., 22, 21, 13],
      [ 13, 12, 9, ..., 24, 27, 21],
      [ 12, 10, 7, ..., 26, 33, 32],
      ...,
      [ 54, 15, 25, ..., 39, 76, 111],
      [ 79, 29, 32, ..., 38, 61, 84],
      [112, 60, 60, ..., 23, 47, 72]],

      [[ 14, 13, 11, ..., 11, 11, 5],
      [ 13, 12, 9, ..., 14, 17, 13],
      [ 12, 10, 7, ..., 15, 23, 23],
      ...,
      [ 38, 0, 9, ..., 25, 62, 97],
      [ 58, 8, 9, ..., 28, 50, 70],
      [ 91, 39, 37, ..., 13, 36, 58]],

      [[ 12, 11, 9, ..., 15, 12, 2],
      [ 11, 10, 7, ..., 15, 16, 10],
      [ 10, 8, 5, ..., 13, 21, 18],
      ...,
      [ 38, 0, 9, ..., 24, 61, 96],
      [ 65, 15, 15, ..., 27, 48, 67],
      [ 98, 46, 43, ..., 12, 34, 55]]],
dtype=torch.uint8)
torch.Size([3, 466, 700])
<class 'torch.Tensor'>

As with the JPEG image, the output of **read_image()** for a PNG image is a torch tensor with values in the range [0,255], and the tensor's dtype is **torch.uint8**.

Shahid Akhtar Khan

Updated on: January 20, 2022
