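Image Recognition Using MobileNet

Introduction

Image recognition is the process of identifying objects or features in an image. It is widely used in many fields, including medical imaging, automotive, security, and defect detection.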

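What Is MobileNet, and Why Is It So Popular?

MobileNet is a deep learning CNN model built on depthwise separable convolutions. Compared with other models of the same depth, it has far fewer parameters. The model is lightweight and optimized to run on mobile and edge devices. Three versions have been released so far: MobileNet V1, V2, and V3. MobileNet was developed by Google.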
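To make the parameter saving concrete, the sketch below counts the weights of a standard convolution and of a depthwise separable convolution with the same kernel size and channel counts. The layer sizes (a 3x3 kernel, 64 input channels, 128 output channels) are illustrative assumptions, not values from MobileNet itself, and biases are ignored.

```python
def standard_conv_params(kernel, in_channels, out_channels):
    """Weights in a standard kernel x kernel convolution (bias ignored)."""
    return kernel * kernel * in_channels * out_channels


def separable_conv_params(kernel, in_channels, out_channels):
    """Weights in a depthwise separable convolution (bias ignored)."""
    depthwise = kernel * kernel * in_channels  # one kernel x kernel filter per input channel
    pointwise = in_channels * out_channels     # 1x1 convolution that mixes the channels
    return depthwise + pointwise


# Illustrative layer sizes (assumed for this example).
standard = standard_conv_params(3, 64, 128)
separable = separable_conv_params(3, 64, 128)
print(standard, separable, round(standard / separable, 1))
# prints: 73728 8768 8.4
```

For this layer the separable version needs roughly one eighth of the weights, which is where most of MobileNet's size reduction comes from.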

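Let's take a brief look at MobileNet V1 and V2, which have been around in the ML field for some time.

MobileNetV2 adds two important features over MobileNet V1:

Linear bottlenecks between layers. These preserve information by keeping non-linearities from destroying too much of it.
Shortcut connections between the bottlenecks.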
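The two features can be sketched as a plan of the layers inside one bottleneck block. This is a simplified description rather than a trainable layer: the function name `bottleneck_plan` is hypothetical, and skipping the expansion layer when the expansion factor is 1 follows common implementations and is an assumption here. The last 1x1 convolution has no activation (the linear bottleneck), and the shortcut is added only when the input and output shapes match.

```python
def bottleneck_plan(in_channels, out_channels, expansion, stride):
    """Describe the layers of one MobileNetV2-style bottleneck block.

    Returns (layers, use_shortcut), where layers is a list of
    (description, output_channels) tuples.
    """
    hidden = in_channels * expansion
    layers = []
    if expansion != 1:
        # Assumption: implementations commonly skip this layer when expansion is 1.
        layers.append(("1x1 conv + ReLU6 (expand)", hidden))
    layers.append(("3x3 depthwise conv + ReLU6, stride %d" % stride, hidden))
    # Linear bottleneck: the projection has no non-linearity after it.
    layers.append(("1x1 conv, linear (project)", out_channels))
    # The shortcut adds the block input to its output, so shapes must match.
    use_shortcut = stride == 1 and in_channels == out_channels
    return layers, use_shortcut


layers, use_shortcut = bottleneck_plan(24, 24, expansion=6, stride=1)
for description, channels in layers:
    print(description, channels)
print("shortcut:", use_shortcut)
```

With 24 input and output channels, an expansion factor of 6, and stride 1, the block widens to 144 channels internally and uses the shortcut. A block that changes the channel count or uses stride 2 has no shortcut.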

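MobileNetV2 Architecture

In the table, t is the expansion factor, c is the number of output channels, n is the number of times the block is repeated, and s is the stride.

Input	Operator	t	c	n	s
224² × 3	conv2d	-	32	1	2
112² × 32	bottleneck	1	16	1	1
112² × 16	bottleneck	6	24	2	2
56² × 24	bottleneck	6	32	3	2
28² × 32	bottleneck	6	64	4	2
14² × 64	bottleneck	6	96	3	1
14² × 96	bottleneck	6	160	3	2
7² × 160	bottleneck	6	320	1	1
7² × 320	conv2d 1x1	-	1280	1	1
7² × 1280	avgpool 7x7	-	-	1	-
1 × 1 × 1280	conv2d 1x1	-	k	-	-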
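As a quick check on how the rows of the table fit together, the sketch below walks the convolutional stages and derives each stage's input resolution and channel count from the stride and output channels of the stage before it. The `STAGES` list and `trace_shapes` function are hypothetical names used for illustration; the numbers are copied from the table.

```python
# (operator, t, c, n, s) rows copied from the table; None marks an empty cell.
STAGES = [
    ("conv2d", None, 32, 1, 2),
    ("bottleneck", 1, 16, 1, 1),
    ("bottleneck", 6, 24, 2, 2),
    ("bottleneck", 6, 32, 3, 2),
    ("bottleneck", 6, 64, 4, 2),
    ("bottleneck", 6, 96, 3, 1),
    ("bottleneck", 6, 160, 3, 2),
    ("bottleneck", 6, 320, 1, 1),
    ("conv2d 1x1", None, 1280, 1, 1),
]


def trace_shapes(resolution=224, channels=3):
    """Return the input (operator, resolution, channels) of every stage
    and the final (resolution, channels) that reaches the average pool."""
    rows = []
    for operator, _t, out_channels, _n, stride in STAGES:
        rows.append((operator, resolution, channels))
        # Only the first layer of a stage applies the listed stride,
        # so the resolution shrinks at most once per stage.
        resolution //= stride
        channels = out_channels
    return rows, (resolution, channels)


rows, final = trace_shapes()
for operator, resolution, channels in rows:
    print(f"{resolution}x{resolution}x{channels} -> {operator}")
print("before avgpool:", final)

# Total number of bottleneck blocks is the sum of the n column.
total_bottlenecks = sum(n for op, _t, _c, n, _s in STAGES if op == "bottleneck")
print("bottleneck blocks:", total_bottlenecks)
```

The traced inputs reproduce the Input column, ending at a 7x7x1280 feature map, and the n column sums to 17 bottleneck blocks.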

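Comparison of MobileNet V1 and MobileNet V2

Each entry gives the maximum number of channels / memory (in Kb) that must be materialized at that spatial resolution.

Size	MobileNetV1	MobileNetV2	ShuffleNet (2x, g=3)
112x112	64/1600	16/400	32/800
56x56	128/800	32/200	48/300
28x28	256/400	64/100	400/600K
14x14	512/200	160/62	800/310
7x7	1024/199	320/32	1600/156
1x1	1024/2	1280/2	1600/3
max	1600K	400K	600K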

Code Implementation of Image Recognition

Example

## MOBILENET

import numpy as np
import tensorflow as tf
from tensorflow.keras.preprocessing import image
from tensorflow.keras.applications import imagenet_utils
from IPython.display import Image, display

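# Load MobileNet (V1) with its pretrained ImageNet weights.
mobile = tf.keras.applications.mobilenet.MobileNet()

def format_image(file):
  # Load an image from /content/images/ and resize it to the
  # 224x224 input size that MobileNet expects.
  image_path = '/content/images/'
  img = image.load_img(image_path + file, target_size=(224, 224))
  img_array = image.img_to_array(img)
  # Add a batch dimension: (224, 224, 3) -> (1, 224, 224, 3).
  img_array_exp_dims = np.expand_dims(img_array, axis=0)
  # Scale pixel values the way MobileNet was trained.
  return tf.keras.applications.mobilenet.preprocess_input(img_array_exp_dims)


# Show the input image in the notebook.
display(Image(filename='/content/images/image.jpg', width=300, height=200))

# Run the model on the preprocessed image.
preprocessed_img = format_image('image.jpg')
prediction_results = mobile.predict(preprocessed_img)

# Convert the 1000 class probabilities into the top-5 readable labels.
results = imagenet_utils.decode_predictions(prediction_results)
print(results)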

Output

[[('n02279972', 'monarch', 0.58884907), ('n02281406', 'sulphur_butterfly', 
0.18508224), ('n02277742', 'ringlet', 0.15471826), ('n02281787', 'lycaenid', 0.04744451), 
('n02276258', 'admiral', 0.01013135)]]
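The output lists the five most probable classes because `decode_predictions` returns the top 5 by default. The ranking step can be sketched in plain Python as below. The helper `top_k` is a hypothetical name, and the labels and scores are toy values for illustration only; a real model returns 1000 ImageNet probabilities.

```python
def top_k(probabilities, labels, k=5):
    """Return the k (label, probability) pairs with the highest probability."""
    ranked = sorted(zip(labels, probabilities),
                    key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Toy scores (assumed values, not real model output).
labels = ["monarch", "ringlet", "admiral", "lycaenid"]
probabilities = [0.59, 0.15, 0.01, 0.05]
print(top_k(probabilities, labels, k=2))
# prints: [('monarch', 0.59), ('ringlet', 0.15)]
```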

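Advantages of MobileNet over Other Networks

For its size, MobileNet delivers high classification accuracy with comparatively few parameters.
MobileNets are small, low-latency, and optimized for power consumption, which makes them well suited to mobile and embedded devices.
They are efficient feature extractors for segmentation and object detection.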

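Benefits of Image Recognition

Obstacle detection in self-driving cars and robots.
Widely used in OCR technology to retrieve information from images.
Lane line detection.
Face detection and attendance systems.
Image captioning and tagging, which is useful on social media sites.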

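Conclusion

Image recognition has become a preliminary step for most object detection and video-related tasks. Because a large number of pretrained models and architectures are already available, it has become very important in today's vision-related AI field.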

Mithilesh Pradhan

Updated on: 30-Dec-2022
