Keras をつかって顔認識してみる。

はじめに

以前乃木坂のメンバーの顔認識をchainerでやりましたが、今回それを Keras で試してみます。基本的なCNNの実装ってだけで技術的にはkerasの公式とソースはあまり変わりませんが、自分で用意したデータセットを使って学習したいという人は参考にしてください。
全ソースコード下にも書いていますがgithubにも公開しています。

TOMMY NOTES

chainer メモ（その３）機械学習で顔認識！CNN を実装

https://toxweblog.toxbe.com/2016/10/01/chainer-メモ（その３）機械学習で乃木坂メンバーの顔

はじめにいままでchainerについてちょいちょい書いているのですが（その１）（その２）、実際に画像のクラス分けをしてみたいので、その途中段階までをまとめます。顔認識ということですが、乃木坂のメンバーを認識してみようと思います。画像集めの問題もあって今回は入力画像をメンバー4人+それ以外人たちの5つのクラスの誰であるかを判別します。(adsbygoogle = window.adsbygoogle || ).push({}); データセットデータですが、乃木坂メンバーからとりあえず、秋元真夏、白石麻衣、西野七瀬、生田絵梨花、そして誰やねん...

TOMMY NOTES

chainer メモ（その４）機械学習で顔認識！実際に画像をつっこむ（OpenCV3使用）

https://toxweblog.toxbe.com/2016/10/01/chainer-メモ（その４）機械学習で顔認識！実際に画像

はじめに前回のブログで乃木坂メンバーの顔を学習するところまでやったので、実際に適当な画像を突っ込んで、顔認識からの誰が写っているかを試してみましょう。(adsbygoogle = window.adsbygoogle || ).push({}); コードコードの全文はgithubに下のコードは実際に画像を指定してpredictする。from image2TrainAndTest import image2TrainAndTestfrom image2TrainAndTest import getValueDataFromPathfrom image2TrainAndTest import getValueDataFromImgfrom faceDetection import faceDetectionFromPathimport alexLikede...

学習の前に

kerasでは画像を扱うために
keras.preprocessing.image
というものをインポートすることでかんたんに扱う事ができますが、これは内部でpillowという画像を扱うパッケージを使っているのでインストールしていない人はインストールする必要があります。

$ pip install pillow

また学習済みのモデルを保存する際、h5pyというパッケージが必要なのでこれもインストールする。

$ pip install h5py

予測する際に普通の画像を入れて顔を切り出してkerasに投げるのですが、opencv3を利用するのでこれもインストールします。

$ conda install -c https://conda.binstar.org/menpo opencv3

データセット

データですが、乃木坂メンバーからとりあえず、秋元真夏、白石麻衣、西野七瀬、生田絵梨花、そして誰やねんって人を大量に入れたやつ
それぞれの画像を300枚ほどあつめてOpencvを利用して顔画像のみを切り取り128×128のサイズにして保存しています。
データはここで公開しときます。
こんな感じの画像

keras のコード

訓練データのフォルダ構造が以下のようになっているとします。
-images
　|- the_others
　　|- 画像1
　　|- 画像2
　　|- …
　|- akimoto
　　|- 画像1
　　|- 画像2
　　|- …
　|- shiraishi
　　|- 画像1
　　|- 画像2
　　|- …
　|- ikuta
　　|- 画像1
　　|- 画像2
　　|- …
　…


from __future__ import print_function
import keras
from keras.models import Sequential, load_model
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D, BatchNormalization, ZeroPadding2D
from keras import backend as K
import argparse
from load_images import load_images_from_labelFolder

batch_size = 20
num_classes = 7
epoch = 30

img_rows, img_cols = 128, 128


parser = argparse.ArgumentParser()
parser.add_argument('--path', '-p', default='.\\images')
args = parser.parse_args()


(x_train, y_train), (x_test, y_test) = load_images_from_labelFolder(args.path,img_cols, img_rows, train_test_ratio=(6,1))

if K.image_data_format() == 'channels_first':
    x_train = x_train.reshape(x_train.shape[0], 1, img_rows, img_cols)
    x_test = x_test.reshape(x_test.shape[0], 1, img_rows, img_cols)
    input_shape = (1, img_rows, img_cols)
else:
    x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 3)
    x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 3)
    input_shape = (img_rows, img_cols, 3)

x_train = x_train.astype('float32')
x_test  = x_test.astype('float32')
x_train /= 255
x_test /= 255


# convert class vectors to binary class matrices
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)

model = Sequential()

model.add(Conv2D(32, kernel_size=(3,3),
            activation='relu',
            input_shape=input_shape))
model.add(MaxPooling2D(pool_size=(2, 2), strides=2))
model.add(Dropout(0.2))

model.add(ZeroPadding2D(padding=(1, 1)))
model.add(Conv2D(96, kernel_size=(3,3),activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2), strides=2))
model.add(Dropout(0.2))

model.add(ZeroPadding2D(padding=(1, 1)))
model.add(Conv2D(96, kernel_size=(3,3)))
model.add(MaxPooling2D(pool_size=(2, 2), strides=2))
model.add(Flatten())

model.add(Dense(units=1024, activation='relu'))
model.add(Dropout(rate=0.5))
model.add(Dense(units=num_classes, activation='softmax'))

model.compile(loss=keras.losses.categorical_crossentropy,
                optimizer=keras.optimizers.Adam(),
                metrics=['accuracy'])

model.fit(x_train, y_train,
            batch_size=batch_size,
            epochs=epoch,
            verbose=1,
            validation_data=(x_test, y_test))

score = model.evaluate(x_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])

model.save('mymodel.h5') # 学習済みモデルを保存

kerasのコードに関してはほぼほぼ公式サンプルのmnist_cnnと同じです。
load_images_from_labelFolderという、画像が入っているフォルダのパスと画像のサイズと学習とテストの割合を指定してあげると学習データを返してくれる関数を作りました。


import numpy as np
import glob
from keras.preprocessing.image import load_img, img_to_array
import re

def load_images_from_labelFolder(path, img_width, img_height, train_test_ratio=(9,1)):
    pathsAndLabels = []
    label_i = 0
    data_list = glob.glob(path + '\\*')
    datatxt = open('whoiswho.txt' ,'w')
    print('data_list', data_list)
    for dataFolderName in data_list:
        pathsAndLabels.append([dataFolderName, label_i])
        pattern = r".*\\(.*)" #pathが階層を色々含んでいるので、画像が入っているフォルダ名だけを取っている。
        matchOB = re.finditer(pattern, dataFolderName)
        directoryname = ""
        if matchOB:
            for a in matchOB:
                directoryname += a.groups()[0]
        datatxt.write(directoryname + "," + str(label_i) + "\n") # 誰が何番かテキストで保存しとく
        label_i = label_i + 1
    datatxt.close()
    allData = []
    for pathAndLabel in pathsAndLabels: # 全画像のパスとそのラベルをタプルでリストにし
        path = pathAndLabel[0]
        label = pathAndLabel[1]
        imagelist = glob.glob(path + "\\*")
        for imgName in imagelist:
            allData.append((imgName, label))
    allData = np.random.permutation(allData) # シャッフルする。

    train_x = []
    train_y = []
    for (imgpath, label) in allData: #kerasが提供しているpreprocessing.imageでは画像の前処理のメソッドがある。
        img = load_img(imgpath, target_size=(img_width,img_height)) # 画像を読み込む
        imgarry = img_to_array(img) # 画像ファイルを学習のためにarrayに変換する。
        train_x.append(imgarry)
        train_y.append(label)

    threshold = (train_test_ratio[0]*len(train_x))//(train_test_ratio[0]+train_test_ratio[1])
    test_x = np.array(train_x[threshold:])
    test_y = np.array(train_y[threshold:])
    train_x = np.array(train_x[:threshold])
    train_y = np.array(train_y[:threshold])

    return (train_x, train_y), (test_x, test_y)

学習

$ python .\train.py
Using TensorFlow backend.
data_list ['D:\\forwin\\deepLearning\\nogi_images\\images\\0_the_others', 'D:\\forwin\\deepLearning\\nogi_images\\images\\akimoto', 'D:\\forwin\\deepLearnin
g\\nogi_images\\images\\hashimoto', 'D:\\forwin\\deepLearning\\nogi_images\\images\\ikoma', 'D:\\forwin\\deepLearning\\nogi_images\\images\\ikuta', 'D:\\for
win\\deepLearning\\nogi_images\\images\\nishino', 'D:\\forwin\\deepLearning\\nogi_images\\images\\shiraishi']
Train on 2354 samples, validate on 393 samples
Epoch 1/30
2017-06-03 16:16:51.249284: W c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlo
w library wasn't compiled to use SSE instructions, but these are available on your machine and could speed up CPU computations.
2017-06-03 16:16:51.249426: W c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlo
w library wasn't compiled to use SSE2 instructions, but these are available on your machine and could speed up CPU computations.
2017-06-03 16:16:51.249870: W c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlo
w library wasn't compiled to use SSE3 instructions, but these are available on your machine and could speed up CPU computations.
2017-06-03 16:16:51.250132: W c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlo
w library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-06-03 16:16:51.250631: W c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlo
w library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-06-03 16:16:51.251505: W c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlo
w library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-06-03 16:16:51.251783: W c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlo
w library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2017-06-03 16:16:51.252292: W c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlo
w library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
2017-06-03 16:16:51.577805: I c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\core\common_runtime\gpu\gpu_device.cc:887] Found dev
ice 0 with properties:
name: GeForce GTX 1050 Ti
major: 6 minor: 1 memoryClockRate (GHz) 1.392
pciBusID 0000:01:00.0
Total memory: 4.00GiB
Free memory: 3.32GiB
2017-06-03 16:16:51.577926: I c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\core\common_runtime\gpu\gpu_device.cc:908] DMA: 0
2017-06-03 16:16:51.579198: I c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\core\common_runtime\gpu\gpu_device.cc:918] 0:   Y
2017-06-03 16:16:51.579586: I c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\core\common_runtime\gpu\gpu_device.cc:977] Creating
TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1050 Ti, pci bus id: 0000:01:00.0)
2354/2354 [==============================] - 14s - loss: 1.6099 - acc: 0.4571 - val_loss: 1.3219 - val_acc: 0.5827
Epoch 2/30
2354/2354 [==============================] - 11s - loss: 1.0838 - acc: 0.6206 - val_loss: 0.9719 - val_acc: 0.6539
Epoch 3/30
2354/2354 [==============================] - 11s - loss: 0.8529 - acc: 0.7022 - val_loss: 0.9590 - val_acc: 0.6794
Epoch 4/30
2354/2354 [==============================] - 11s - loss: 0.7101 - acc: 0.7685 - val_loss: 0.9310 - val_acc: 0.6768
Epoch 5/30
2354/2354 [==============================] - 11s - loss: 0.5977 - acc: 0.7935 - val_loss: 0.7474 - val_acc: 0.7150
Epoch 6/30
2354/2354 [==============================] - 11s - loss: 0.4693 - acc: 0.8386 - val_loss: 0.7349 - val_acc: 0.7634
Epoch 7/30
2354/2354 [==============================] - 11s - loss: 0.3611 - acc: 0.8768 - val_loss: 0.8088 - val_acc: 0.7328
Epoch 8/30
2354/2354 [==============================] - 11s - loss: 0.2873 - acc: 0.8985 - val_loss: 0.6574 - val_acc: 0.8015
Epoch 9/30
2354/2354 [==============================] - 11s - loss: 0.2519 - acc: 0.9129 - val_loss: 0.7998 - val_acc: 0.7735
Epoch 10/30
2354/2354 [==============================] - 11s - loss: 0.2004 - acc: 0.9350 - val_loss: 0.8732 - val_acc: 0.7684
Epoch 11/30
2354/2354 [==============================] - 11s - loss: 0.1950 - acc: 0.9376 - val_loss: 0.7465 - val_acc: 0.7964
Epoch 12/30
2354/2354 [==============================] - 11s - loss: 0.1400 - acc: 0.9494 - val_loss: 0.7360 - val_acc: 0.8041
Epoch 13/30
2354/2354 [==============================] - 11s - loss: 0.0927 - acc: 0.9673 - val_loss: 0.8668 - val_acc: 0.8142
Epoch 14/30
2354/2354 [==============================] - 11s - loss: 0.0902 - acc: 0.9647 - val_loss: 0.7354 - val_acc: 0.8372
Epoch 15/30
2354/2354 [==============================] - 11s - loss: 0.1122 - acc: 0.9579 - val_loss: 0.7828 - val_acc: 0.8066
Epoch 16/30
2354/2354 [==============================] - 11s - loss: 0.0982 - acc: 0.9707 - val_loss: 1.0783 - val_acc: 0.7863
Epoch 17/30
2354/2354 [==============================] - 11s - loss: 0.1339 - acc: 0.9575 - val_loss: 1.0563 - val_acc: 0.7735
Epoch 18/30
2354/2354 [==============================] - 11s - loss: 0.1289 - acc: 0.9562 - val_loss: 0.9549 - val_acc: 0.7939
Epoch 19/30
2354/2354 [==============================] - 11s - loss: 0.0929 - acc: 0.9677 - val_loss: 0.7774 - val_acc: 0.8015
Epoch 20/30
2354/2354 [==============================] - 11s - loss: 0.0662 - acc: 0.9788 - val_loss: 0.7677 - val_acc: 0.8372
Epoch 21/30
2354/2354 [==============================] - 11s - loss: 0.0905 - acc: 0.9720 - val_loss: 1.0242 - val_acc: 0.7812
Epoch 22/30
2354/2354 [==============================] - 11s - loss: 0.0733 - acc: 0.9775 - val_loss: 0.7614 - val_acc: 0.8219
Epoch 23/30
2354/2354 [==============================] - 11s - loss: 0.0395 - acc: 0.9877 - val_loss: 0.8486 - val_acc: 0.8372
Epoch 24/30
2354/2354 [==============================] - 11s - loss: 0.0396 - acc: 0.9843 - val_loss: 1.0073 - val_acc: 0.7888
Epoch 25/30
2354/2354 [==============================] - 11s - loss: 0.0341 - acc: 0.9902 - val_loss: 0.9374 - val_acc: 0.8092
Epoch 26/30
2354/2354 [==============================] - 11s - loss: 0.0472 - acc: 0.9851 - val_loss: 1.0889 - val_acc: 0.8015
Epoch 27/30
2354/2354 [==============================] - 11s - loss: 0.1202 - acc: 0.9618 - val_loss: 1.1969 - val_acc: 0.7786
Epoch 28/30
2354/2354 [==============================] - 654s - loss: 0.1190 - acc: 0.9669 - val_loss: 1.2336 - val_acc: 0.7939
Epoch 29/30
2354/2354 [==============================] - 11s - loss: 0.1724 - acc: 0.9511 - val_loss: 1.4932 - val_acc: 0.7913
Epoch 30/30
2354/2354 [==============================] - 11s - loss: 0.1195 - acc: 0.9698 - val_loss: 0.9864 - val_acc: 0.8168
Test loss: 0.986444676195
Test accuracy: 0.816793893433

予測

保存したモデルファイルを使って実際に予測するコード。
顔を抜き出すときにカスケードファイルが必要なので、保存してlib/においてください。


from keras.models import Sequential, load_model
from keras.preprocessing.image import load_img, img_to_array
import numpy as np

import argparse
import cv2
from PIL import Image

def faceDetectionFromPath(path, size):
    cvImg = cv2.imread(path)
    cascade_path = "./lib/haarcascade_frontalface_alt.xml"
    cascade = cv2.CascadeClassifier(cascade_path)
    facerect = cascade.detectMultiScale(cvImg, scaleFactor=1.1, minNeighbors=1, minSize=(1, 1))
    faceData = []
    for rect in facerect:
        faceImg = cvImg[rect[1]:rect[1]+rect[3],rect[0]:rect[0]+rect[2]]
        resized = cv2.resize(faceImg,None, fx=float(size/faceImg.shape[0]),fy=float( size/faceImg.shape[1]))
        CV_im_RGB = resized[:, :, ::-1].copy()
        pilImg=Image.fromarray(CV_im_RGB)
        faceData.append(pilImg)

    return faceData

def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('--model', '-m', default='.\\mymodel.h5', required=True)
    parser.add_argument('--testpath', '-t', default='.\\images\\shiraishi.jpg')
    args = parser.parse_args()

    num_classes = 7
    img_rows, img_cols = 128, 128

    ident = [""] * num_classes
    for line in open("whoiswho.txt", "r"):
        dirname = line.split(",")[0]
        label = line.split(",")[1]
        ident[int(label)] = dirname

    model = load_model(args.model)
    faceImgs = faceDetectionFromPath(args.testpath, img_rows)
    imgarray = []
    for faceImg in faceImgs:
        faceImg.show()
        imgarray.append(img_to_array(faceImg))
    imgarray = np.array(imgarray) / 255.0
    imgarray.astype('float32')

    preds = model.predict(imgarray, batch_size=imgarray.shape[0])
    for pred in preds:
        predR = np.round(pred)
        for pre_i in np.arange(len(predR)):
            if predR[pre_i] == 1:
                print("he/she is {}".format(ident[pre_i]))

if __name__ == '__main__':
    main()

この画像を指定して、実行する。

[tensorflow] $ python .\prediction.py -m .\mymodel.h5
Using TensorFlow backend.
2017-06-03 19:16:43.844870: W c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlo
w library wasn't compiled to use SSE instructions, but these are available on your machine and could speed up CPU computations.
2017-06-03 19:16:43.844982: W c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlo
w library wasn't compiled to use SSE2 instructions, but these are available on your machine and could speed up CPU computations.
2017-06-03 19:16:43.846376: W c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlo
w library wasn't compiled to use SSE3 instructions, but these are available on your machine and could speed up CPU computations.
2017-06-03 19:16:43.846776: W c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlo
w library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-06-03 19:16:43.847155: W c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlo
w library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-06-03 19:16:43.847555: W c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlo
w library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-06-03 19:16:43.847949: W c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlo
w library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2017-06-03 19:16:43.848033: W c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlo
w library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
2017-06-03 19:16:44.117985: I c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\core\common_runtime\gpu\gpu_device.cc:887] Found dev
ice 0 with properties:
name: GeForce GTX 1050 Ti
major: 6 minor: 1 memoryClockRate (GHz) 1.392
pciBusID 0000:01:00.0
Total memory: 4.00GiB
Free memory: 3.32GiB
2017-06-03 19:16:44.118111: I c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\core\common_runtime\gpu\gpu_device.cc:908] DMA: 0
2017-06-03 19:16:44.119410: I c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\core\common_runtime\gpu\gpu_device.cc:918] 0:   Y
2017-06-03 19:16:44.119459: I c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\core\common_runtime\gpu\gpu_device.cc:977] Creating
TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1050 Ti, pci bus id: 0000:01:00.0)
he/she is shiraishi

実行するとこのように画像を切り出し入力に用いて予測する。最後の行でshi is shiraishiと正しくできています。

が所詮テストデータで正解率80%弱なので、結構間違えます。
学習データが少なすぎるので(300枚行かないぐらい)当たり前ではありますが、、、