Overview
InceptionV3 is one of the pre-trained models for image classification. We can easily use it from TensorFlow or Keras. In this article, I'll check its architecture and try to build a fine-tuning model.
There are several image classification models we can use for fine-tuning.
Their weights are already trained, so with a small number of training steps you can build models for your own data.
About fine-tuning itself, please check the following articles.
TensorFlow and Keras also have nice documentation about fine-tuning.
From TensorFlow
From Keras
InceptionV3
At first, I'll check the architecture of InceptionV3 with Keras. Please install Keras in advance.
In addition, to visualize the architecture, we need to install some packages with the following commands in your terminal.
pip install pydot graphviz
pip install pydot3 pydot-ng
By executing the following code, we can visualize the InceptionV3 model's architecture.
from IPython.display import SVG
from keras.applications.inception_v3 import InceptionV3
from keras.utils.vis_utils import model_to_dot
# load InceptionV3 with ImageNet weights and without the top classification layers
inception_model = InceptionV3(weights='imagenet', include_top=False)
# render the architecture as an inline SVG
SVG(model_to_dot(inception_model).create(prog='dot', format='svg'))
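If you prefer a file to the inline SVG, the same visualization utilities can write the diagram to disk. This is just a minimal sketch assuming the same Keras version as above; the file name is an arbitrary example.
from keras.utils.vis_utils import plot_model
# write the architecture diagram to a PNG file (show_shapes adds the tensor shapes)
plot_model(inception_model, to_file='inception_v3.png', show_shapes=True)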
Visually, this architecture is huge. And it looks quite complex compared with VGG.
It is composed of several modules. Each module applies convolutions of different sizes to extract features.
For the details, the paper below is a good reference, especially when you think about how the architecture was updated.
To check the correspondence between the modules and the layer numbers, we can print each layer's name and index.
# check the layers by name
for i, layer in enumerate(inception_model.layers):
    print(i, layer.name)
0 input_1
1 conv2d_1
2 batch_normalization_1
3 activation_1
4 conv2d_2
5 batch_normalization_2
6 activation_2
7 conv2d_3
8 batch_normalization_3
.
.
.
299 conv2d_94
300 batch_normalization_86
301 activation_88
302 activation_89
303 activation_92
304 activation_93
305 batch_normalization_94
306 activation_86
307 mixed9_1
308 concatenate_2
309 activation_94
310 mixed10
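Instead of scrolling through this printout, we can also look up the index of a specific layer by name. A minimal sketch; 'mixed9' is one of the module outputs that appears in the listing above.
# find the index of a layer by its name
layer_names = [layer.name for layer in inception_model.layers]
print(layer_names.index('mixed9'))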
Data
Same as in the article VGG19 Fine-tuning model, I used CIFAR-10, a simple color image dataset.
This time, for fine-tuning, I limited the amount of training data and resized the images.
When you make a fine-tuning model, be careful about the input image size. Each model has its own restriction on it. You can check it on the Keras Applications page.
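One way to make that restriction explicit is to pass input_shape when building the base model, so Keras rejects shapes that are too small for InceptionV3. A minimal sketch, assuming the 140x140 images prepared below; the variable name base is just for illustration.
from keras.applications.inception_v3 import InceptionV3
# declare the expected input size up front; too-small shapes raise an error
base = InceptionV3(weights='imagenet', include_top=False, input_shape=(140, 140, 3))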
import random
import cv2
from keras.datasets import cifar10
from keras.utils import to_categorical
from keras.models import Model
from keras.layers import Dense, GlobalAveragePooling2D, Dropout
from keras.optimizers import SGD
import numpy as np
# read data
(x_train, y_train), (x_test, y_test) = cifar10.load_data()
# limit the amount of the data
# train data
ind_train = random.sample(list(range(x_train.shape[0])), 2000)
x_train = x_train[ind_train]
y_train = y_train[ind_train]
# test data
ind_test = random.sample(list(range(x_test.shape[0])), 2000)
x_test = x_test[ind_test]
y_test = y_test[ind_test]
def resize_data(data):
    data_upscaled = np.zeros((data.shape[0], 140, 140, 3))
    for i, img in enumerate(data):
        large_img = cv2.resize(img, dsize=(140, 140), interpolation=cv2.INTER_CUBIC)
        data_upscaled[i] = large_img
    return data_upscaled
# resize train and test data
x_train_resized = resize_data(x_train)
x_test_resized = resize_data(x_test)
# one-hot encode the labels
y_train_hot_encoded = to_categorical(y_train)
y_test_hot_encoded = to_categorical(y_test)
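Note that the code above feeds raw 0-255 pixel values into the network, and the training run below uses them as they are. Keras also ships a preprocess_input function for InceptionV3 that rescales pixels to the range the original model was trained with; applying it is optional here and shown only as a sketch.
from keras.applications.inception_v3 import preprocess_input
# optionally rescale pixel values the way the original InceptionV3 training did
x_train_preprocessed = preprocess_input(x_train_resized.copy())
x_test_preprocessed = preprocess_input(x_test_resized.copy())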
Fine-tuning
In the fine-tuning phase, I added fully connected layers, chose which layers to train, and trained the model.
For the details of these steps, please check the article below.
inc_model = InceptionV3(weights='imagenet', include_top=False)
# get layers and add average pooling layer
x = inc_model.output
x = GlobalAveragePooling2D()(x)
# add fully-connected layer
x = Dense(512, activation='relu')(x)
# add output layer
predictions = Dense(10, activation='softmax')(x)
model = Model(inputs=inc_model.input, outputs=predictions)
# freeze pre-trained model area's layer
for layer in inc_model.layers:
    layer.trainable = False
# first train only the weights of the newly added layers
model.compile(optimizer='rmsprop', loss='categorical_crossentropy')
model.fit(x_train_resized, y_train_hot_encoded)
# choose the layers which are updated by training
layer_num = len(model.layers)
for layer in model.layers[:279]:
    layer.trainable = False
for layer in model.layers[279:]:
    layer.trainable = True
# training
model.compile(optimizer=SGD(lr=0.01, momentum=0.9), loss='categorical_crossentropy', metrics=['accuracy'])
history = model.fit(x_train_resized, y_train_hot_encoded, batch_size=128, epochs=5, shuffle=True, validation_split=0.3)
Epoch 1/1
2000/2000 [==============================] - 159s - loss: 2.3449
Train on 1400 samples, validate on 600 samples
Epoch 1/5
1400/1400 [==============================] - 192s - loss: 1.1798 - acc: 0.6171 - val_loss: 1.3677 - val_acc: 0.5517
Epoch 2/5
1400/1400 [==============================] - 172s - loss: 0.6560 - acc: 0.8043 - val_loss: 1.1426 - val_acc: 0.6250
Epoch 3/5
1400/1400 [==============================] - 177s - loss: 0.3608 - acc: 0.9193 - val_loss: 1.0056 - val_acc: 0.6783
Epoch 4/5
1400/1400 [==============================] - 173s - loss: 0.2144 - acc: 0.9614 - val_loss: 0.9486 - val_acc: 0.6917
Epoch 5/5
1400/1400 [==============================] - 272s - loss: 0.1337 - acc: 0.9829 - val_loss: 0.9229 - val_acc: 0.7133
To check how the training went, we can visualize it.
import matplotlib.pyplot as plt
def show_history(history):
    plt.plot(history.history['acc'])
    plt.plot(history.history['val_acc'])
    plt.ylabel('accuracy')
    plt.xlabel('epoch')
    plt.legend(['train_accuracy', 'validation_accuracy'], loc='best')
    plt.show()
show_history(history)
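The test split prepared in the Data section is not used above. As a final check, we can evaluate the fine-tuned model on it; a minimal sketch reusing the variables defined earlier.
# evaluate on the held-out test images (returns loss and accuracy)
score = model.evaluate(x_test_resized, y_test_hot_encoded, batch_size=128)
print(score)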