
- Microsoft Cognitive Toolkit (CNTK) 教程
- 首頁
- 介紹
- 入門
- CPU 和 GPU
- CNTK - 序列分類
- CNTK - 邏輯迴歸模型
- CNTK - 神經網路 (NN) 概念
- CNTK - 建立第一個神經網路
- CNTK - 訓練神經網路
- CNTK - 記憶體資料集和大型資料集
- CNTK - 效能測量
- 神經網路分類
- 神經網路二元分類
- CNTK - 神經網路迴歸
- CNTK - 分類模型
- CNTK - 迴歸模型
- CNTK - 記憶體不足的資料集
- CNTK - 監控模型
- CNTK - 卷積神經網路
- CNTK - 迴圈神經網路
- Microsoft Cognitive Toolkit 資源
- Microsoft Cognitive Toolkit - 快速指南
- Microsoft Cognitive Toolkit - 資源
- Microsoft Cognitive Toolkit - 討論
CNTK - 神經網路二元分類
本章將介紹如何使用 CNTK 進行神經網路二元分類。
使用神經網路進行二元分類類似於多類別分類,唯一的區別是隻有兩個輸出節點而不是三個或更多。在這裡,我們將使用兩種技術(單節點和雙節點技術)來使用神經網路執行二元分類。單節點技術比雙節點技術更常用。
載入資料集
為了使用神經網路實現這兩種技術,我們將使用鈔票資料集。該資料集可以從 UCI 機器學習知識庫下載,網址為 https://archive.ics.uci.edu/ml/datasets/banknote+authentication。
在我們的示例中,我們將使用 50 個具有 forgery = 0 類別的真實資料項和 50 個具有 forgery = 1 類別的假資料項。
準備訓練和測試檔案
完整的資料集包含 1372 個數據項。原始資料集如下所示:
3.6216, 8.6661, -2.8076, -0.44699, 0 4.5459, 8.1674, -2.4586, -1.4621, 0 … -1.3971, 3.3191, -1.3927, -1.9948, 1 0.39012, -0.14279, -0.031994, 0.35084, 1
現在,首先我們需要將此原始資料轉換為雙節點 CNTK 格式,如下所示:
|stats 3.62160000 8.66610000 -2.80730000 -0.44699000 |forgery 0 1 |# authentic |stats 4.54590000 8.16740000 -2.45860000 -1.46210000 |forgery 0 1 |# authentic . . . |stats -1.39710000 3.31910000 -1.39270000 -1.99480000 |forgery 1 0 |# fake |stats 0.39012000 -0.14279000 -0.03199400 0.35084000 |forgery 1 0 |# fake
您可以使用以下 Python 程式從原始資料建立 CNTK 格式的資料:
fin = open(".\\...", "r") #provide the location of saved dataset text file. for line in fin: line = line.strip() tokens = line.split(",") if tokens[4] == "0": print("|stats %12.8f %12.8f %12.8f %12.8f |forgery 0 1 |# authentic" % \ (float(tokens[0]), float(tokens[1]), float(tokens[2]), float(tokens[3])) ) else: print("|stats %12.8f %12.8f %12.8f %12.8f |forgery 1 0 |# fake" % \ (float(tokens[0]), float(tokens[1]), float(tokens[2]), float(tokens[3])) ) fin.close()
雙節點二元分類模型
雙節點分類和多類別分類之間幾乎沒有區別。在這裡,我們首先需要以 CNTK 格式處理資料檔案,為此我們將使用名為 create_reader 的輔助函式,如下所示:
def create_reader(path, input_dim, output_dim, rnd_order, sweeps): x_strm = C.io.StreamDef(field='stats', shape=input_dim, is_sparse=False) y_strm = C.io.StreamDef(field='forgery', shape=output_dim, is_sparse=False) streams = C.io.StreamDefs(x_src=x_strm, y_src=y_strm) deserial = C.io.CTFDeserializer(path, streams) mb_src = C.io.MinibatchSource(deserial, randomize=rnd_order, max_sweeps=sweeps) return mb_src
現在,我們需要為我們的神經網路設定架構引數,並提供資料檔案的位置。這可以透過以下 Python 程式碼完成:
def main(): print("Using CNTK version = " + str(C.__version__) + "\n") input_dim = 4 hidden_dim = 10 output_dim = 2 train_file = ".\\...\\" #provide the name of the training file test_file = ".\\...\\" #provide the name of the test file
現在,藉助以下程式碼行,我們的程式將建立未經訓練的神經網路:
X = C.ops.input_variable(input_dim, np.float32) Y = C.ops.input_variable(output_dim, np.float32) with C.layers.default_options(init=C.initializer.uniform(scale=0.01, seed=1)): hLayer = C.layers.Dense(hidden_dim, activation=C.ops.tanh, name='hidLayer')(X) oLayer = C.layers.Dense(output_dim, activation=None, name='outLayer')(hLayer) nnet = oLayer model = C.ops.softmax(nnet)
現在,一旦我們建立了未經訓練的雙節點模型,我們需要設定一個 Learner 演算法物件,然後使用它來建立一個 Trainer 訓練物件。我們將使用 SGD 學習器和 cross_entropy_with_softmax 損失函式:
tr_loss = C.cross_entropy_with_softmax(nnet, Y) tr_clas = C.classification_error(nnet, Y) max_iter = 500 batch_size = 10 learn_rate = 0.01 learner = C.sgd(nnet.parameters, learn_rate) trainer = C.Trainer(nnet, (tr_loss, tr_clas), [learner])
現在,一旦我們完成了 Trainer 物件,我們需要建立一個 reader 函式來讀取訓練資料:
rdr = create_reader(train_file, input_dim, output_dim, rnd_order=True, sweeps=C.io.INFINITELY_REPEAT) banknote_input_map = { X : rdr.streams.x_src, Y : rdr.streams.y_src }
現在,是時候訓練我們的神經網路模型了:
for i in range(0, max_iter): curr_batch = rdr.next_minibatch(batch_size, input_map=iris_input_map) trainer.train_minibatch(curr_batch) if i % 500 == 0: mcee = trainer.previous_minibatch_loss_average macc = (1.0 - trainer.previous_minibatch_evaluation_average) * 100 print("batch %4d: mean loss = %0.4f, accuracy = %0.2f%% " \ % (i, mcee, macc))
訓練完成後,讓我們使用測試資料項評估模型:
print("\nEvaluating test data \n") rdr = create_reader(test_file, input_dim, output_dim, rnd_order=False, sweeps=1) banknote_input_map = { X : rdr.streams.x_src, Y : rdr.streams.y_src } num_test = 20 all_test = rdr.next_minibatch(num_test, input_map=iris_input_map) acc = (1.0 - trainer.test_minibatch(all_test)) * 100 print("Classification accuracy = %0.2f%%" % acc)
在評估我們訓練好的神經網路模型的準確性之後,我們將使用它來對未見過的 資料進行預測:
np.set_printoptions(precision = 1, suppress=True) unknown = np.array([[0.6, 1.9, -3.3, -0.3]], dtype=np.float32) print("\nPredicting Banknote authenticity for input features: ") print(unknown[0]) pred_prob = model.eval(unknown) np.set_printoptions(precision = 4, suppress=True) print("Prediction probabilities are: ") print(pred_prob[0]) if pred_prob[0,0] < pred_prob[0,1]: print(“Prediction: authentic”) else: print(“Prediction: fake”)
完整的雙節點分類模型
def create_reader(path, input_dim, output_dim, rnd_order, sweeps): x_strm = C.io.StreamDef(field='stats', shape=input_dim, is_sparse=False) y_strm = C.io.StreamDef(field='forgery', shape=output_dim, is_sparse=False) streams = C.io.StreamDefs(x_src=x_strm, y_src=y_strm) deserial = C.io.CTFDeserializer(path, streams) mb_src = C.io.MinibatchSource(deserial, randomize=rnd_order, max_sweeps=sweeps) return mb_src def main(): print("Using CNTK version = " + str(C.__version__) + "\n") input_dim = 4 hidden_dim = 10 output_dim = 2 train_file = ".\\...\\" #provide the name of the training file test_file = ".\\...\\" #provide the name of the test file X = C.ops.input_variable(input_dim, np.float32) Y = C.ops.input_variable(output_dim, np.float32) withC.layers.default_options(init=C.initializer.uniform(scale=0.01, seed=1)): hLayer = C.layers.Dense(hidden_dim, activation=C.ops.tanh, name='hidLayer')(X) oLayer = C.layers.Dense(output_dim, activation=None, name='outLayer')(hLayer) nnet = oLayer model = C.ops.softmax(nnet) tr_loss = C.cross_entropy_with_softmax(nnet, Y) tr_clas = C.classification_error(nnet, Y) max_iter = 500 batch_size = 10 learn_rate = 0.01 learner = C.sgd(nnet.parameters, learn_rate) trainer = C.Trainer(nnet, (tr_loss, tr_clas), [learner]) rdr = create_reader(train_file, input_dim, output_dim, rnd_order=True, sweeps=C.io.INFINITELY_REPEAT) banknote_input_map = { X : rdr.streams.x_src, Y : rdr.streams.y_src } for i in range(0, max_iter): curr_batch = rdr.next_minibatch(batch_size, input_map=iris_input_map) trainer.train_minibatch(curr_batch) if i % 500 == 0: mcee = trainer.previous_minibatch_loss_average macc = (1.0 - trainer.previous_minibatch_evaluation_average) * 100 print("batch %4d: mean loss = %0.4f, accuracy = %0.2f%% " \ % (i, mcee, macc)) print("\nEvaluating test data \n") rdr = create_reader(test_file, input_dim, output_dim, rnd_order=False, sweeps=1) banknote_input_map = { X : rdr.streams.x_src, Y : rdr.streams.y_src } num_test = 20 all_test = rdr.next_minibatch(num_test, input_map=iris_input_map) acc = (1.0 - trainer.test_minibatch(all_test)) * 100 print("Classification accuracy = %0.2f%%" % acc) np.set_printoptions(precision = 1, suppress=True) unknown = np.array([[0.6, 1.9, -3.3, -0.3]], dtype=np.float32) print("\nPredicting Banknote authenticity for input features: ") print(unknown[0]) pred_prob = model.eval(unknown) np.set_printoptions(precision = 4, suppress=True) print("Prediction probabilities are: ") print(pred_prob[0]) if pred_prob[0,0] < pred_prob[0,1]: print(“Prediction: authentic”) else: print(“Prediction: fake”) if __name__== ”__main__”: main()
輸出
Using CNTK version = 2.7 batch 0: mean loss = 0.6928, accuracy = 80.00% batch 50: mean loss = 0.6877, accuracy = 70.00% batch 100: mean loss = 0.6432, accuracy = 80.00% batch 150: mean loss = 0.4978, accuracy = 80.00% batch 200: mean loss = 0.4551, accuracy = 90.00% batch 250: mean loss = 0.3755, accuracy = 90.00% batch 300: mean loss = 0.2295, accuracy = 100.00% batch 350: mean loss = 0.1542, accuracy = 100.00% batch 400: mean loss = 0.1581, accuracy = 100.00% batch 450: mean loss = 0.1499, accuracy = 100.00% Evaluating test data Classification accuracy = 84.58% Predicting banknote authenticity for input features: [0.6 1.9 -3.3 -0.3] Prediction probabilities are: [0.7847 0.2536] Prediction: fake
單節點二元分類模型
實現程式與我們上面為雙節點分類所做的幾乎相同。主要變化在於使用雙節點分類技術時。
我們可以使用 CNTK 內建的 classification_error() 函式,但是對於單節點分類,CNTK 不支援 classification_error() 函式。這就是我們需要實現一個程式定義的函式的原因,如下所示:
def class_acc(mb, x_var, y_var, model): num_correct = 0; num_wrong = 0 x_mat = mb[x_var].asarray() y_mat = mb[y_var].asarray() for i in range(mb[x_var].shape[0]): p = model.eval(x_mat[i] y = y_mat[i] if p[0,0] < 0.5 and y[0,0] == 0.0 or p[0,0] >= 0.5 and y[0,0] == 1.0: num_correct += 1 else: num_wrong += 1 return (num_correct * 100.0)/(num_correct + num_wrong)
有了這個更改,讓我們看看完整的單節點分類示例:
完整的單節點分類模型
import numpy as np import cntk as C def create_reader(path, input_dim, output_dim, rnd_order, sweeps): x_strm = C.io.StreamDef(field='stats', shape=input_dim, is_sparse=False) y_strm = C.io.StreamDef(field='forgery', shape=output_dim, is_sparse=False) streams = C.io.StreamDefs(x_src=x_strm, y_src=y_strm) deserial = C.io.CTFDeserializer(path, streams) mb_src = C.io.MinibatchSource(deserial, randomize=rnd_order, max_sweeps=sweeps) return mb_src def class_acc(mb, x_var, y_var, model): num_correct = 0; num_wrong = 0 x_mat = mb[x_var].asarray() y_mat = mb[y_var].asarray() for i in range(mb[x_var].shape[0]): p = model.eval(x_mat[i] y = y_mat[i] if p[0,0] < 0.5 and y[0,0] == 0.0 or p[0,0] >= 0.5 and y[0,0] == 1.0: num_correct += 1 else: num_wrong += 1 return (num_correct * 100.0)/(num_correct + num_wrong) def main(): print("Using CNTK version = " + str(C.__version__) + "\n") input_dim = 4 hidden_dim = 10 output_dim = 1 train_file = ".\\...\\" #provide the name of the training file test_file = ".\\...\\" #provide the name of the test file X = C.ops.input_variable(input_dim, np.float32) Y = C.ops.input_variable(output_dim, np.float32) with C.layers.default_options(init=C.initializer.uniform(scale=0.01, seed=1)): hLayer = C.layers.Dense(hidden_dim, activation=C.ops.tanh, name='hidLayer')(X) oLayer = C.layers.Dense(output_dim, activation=None, name='outLayer')(hLayer) model = oLayer tr_loss = C.cross_entropy_with_softmax(model, Y) max_iter = 1000 batch_size = 10 learn_rate = 0.01 learner = C.sgd(model.parameters, learn_rate) trainer = C.Trainer(model, (tr_loss), [learner]) rdr = create_reader(train_file, input_dim, output_dim, rnd_order=True, sweeps=C.io.INFINITELY_REPEAT) banknote_input_map = {X : rdr.streams.x_src, Y : rdr.streams.y_src } for i in range(0, max_iter): curr_batch = rdr.next_minibatch(batch_size, input_map=iris_input_map) trainer.train_minibatch(curr_batch) if i % 100 == 0: mcee=trainer.previous_minibatch_loss_average ca = class_acc(curr_batch, X,Y, model) print("batch %4d: mean loss = %0.4f, accuracy = %0.2f%% " \ % (i, mcee, ca)) print("\nEvaluating test data \n") rdr = create_reader(test_file, input_dim, output_dim, rnd_order=False, sweeps=1) banknote_input_map = { X : rdr.streams.x_src, Y : rdr.streams.y_src } num_test = 20 all_test = rdr.next_minibatch(num_test, input_map=iris_input_map) acc = class_acc(all_test, X,Y, model) print("Classification accuracy = %0.2f%%" % acc) np.set_printoptions(precision = 1, suppress=True) unknown = np.array([[0.6, 1.9, -3.3, -0.3]], dtype=np.float32) print("\nPredicting Banknote authenticity for input features: ") print(unknown[0]) pred_prob = model.eval({X:unknown}) print("Prediction probability: ") print(“%0.4f” % pred_prob[0,0]) if pred_prob[0,0] < 0.5: print(“Prediction: authentic”) else: print(“Prediction: fake”) if __name__== ”__main__”: main()
輸出
Using CNTK version = 2.7 batch 0: mean loss = 0.6936, accuracy = 10.00% batch 100: mean loss = 0.6882, accuracy = 70.00% batch 200: mean loss = 0.6597, accuracy = 50.00% batch 300: mean loss = 0.5298, accuracy = 70.00% batch 400: mean loss = 0.4090, accuracy = 100.00% batch 500: mean loss = 0.3790, accuracy = 90.00% batch 600: mean loss = 0.1852, accuracy = 100.00% batch 700: mean loss = 0.1135, accuracy = 100.00% batch 800: mean loss = 0.1285, accuracy = 100.00% batch 900: mean loss = 0.1054, accuracy = 100.00% Evaluating test data Classification accuracy = 84.00% Predicting banknote authenticity for input features: [0.6 1.9 -3.3 -0.3] Prediction probability: 0.8846 Prediction: fake