Overview
In the article How to make Fine tuning model, I built fine-tuning models from several pre-trained models. At that time, I didn't write about the pre-trained models' architecture or about which part of it to target for training.
This time, I focus on VGG19 as the pre-trained model. In a nutshell, I try to build a fine-tuning model in a better manner, checking some important points along the way.
What is fine-tuning?
A simple explanation of fine-tuning is already in the article How to make Fine tuning model, but just in case, let's go over it again.
Fine-tuning is a simple and flexible method for building a large-scale model in relatively little time and with a small amount of data. You reuse an already trained model's architecture and weights, and build your model by training only some of the layers.
In the image above, the blue circles and red lines represent nodes and weights, which were pre-trained on a huge amount of data. You add some layers after them to adapt the model to your own data. In the training phase, you train only the layers you added, plus a few of the layers just before them, on your own data.
The number of layers you need to train is much smaller than the number of layers in the original model.
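In Keras terms, the idea looks roughly like the sketch below (a minimal sketch assuming a 10-class problem; the full working version is built step by step later in this article):
from keras.applications.vgg19 import VGG19
from keras.layers import Dense, GlobalAveragePooling2D
from keras.models import Model
# load a pre-trained base without its classification head and freeze it
base = VGG19(weights='imagenet', include_top=False)
for layer in base.layers:
    layer.trainable = False
# add a small head for your own classes and train only that part
x = GlobalAveragePooling2D()(base.output)
predictions = Dense(10, activation='softmax')(x)  # 10 is an assumed number of classes
model = Model(inputs=base.input, outputs=predictions)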
VGG19
Architecture
By visualizing a model's architecture, you can check the model's scale and the key points in it. Keras makes it easy to see a model's architecture.
First, you need to install the packages used for visualization.
pip install pydot graphviz
pip install pydot3 pydot-ng
With the following code, you can check VGG19's architecture as a plot.
from IPython.display import SVG
from keras.utils.vis_utils import model_to_dot
from keras.applications.vgg19 import VGG19

# load VGG19 with ImageNet weights, without the fully-connected top layers
vgg_model = VGG19(weights='imagenet', include_top=False)
SVG(model_to_dot(vgg_model).create(prog='dot', format='svg'))
The output is the image below.
Basically, the model is composed of convolutional and pooling layers, and it has no branches at all.
For fine-tuning, you will add some layers on top of this and train that added part, together with some of the layers in this architecture.
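If you prefer a text summary to the plot, Keras can also print the layers and parameter counts, which shows the model's scale at a glance:
# print each layer's output shape and the total number of parameters
vgg_model.summary()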
How to get the layers' information
Usually, when I choose the layers to train, I set the trainable attribute to True or False by using the layer's index. So with the following code, I checked each layer's index and name.
# check the layers by name
for i, layer in enumerate(vgg_model.layers):
    print(i, layer.name)
When I build a fine-tuning model, I check the architecture plot and the layers' indices and names to decide which layers to train and which not to train. In this case, the output is as follows (a short example of using these indices comes right after the list).
0 input_5
1 block1_conv1
2 block1_conv2
3 block1_pool
4 block2_conv1
5 block2_conv2
6 block2_pool
7 block3_conv1
8 block3_conv2
9 block3_conv3
10 block3_conv4
11 block3_pool
12 block4_conv1
13 block4_conv2
14 block4_conv3
15 block4_conv4
16 block4_pool
17 block5_conv1
18 block5_conv2
19 block5_conv3
20 block5_conv4
21 block5_pool
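For example, using the indices above, you could freeze everything up through block4_pool (index 16) and leave block5 trainable. This split is only an illustration, not the exact split used later in this article:
# freeze layers 0-16 (up to block4_pool), keep block5 trainable
for layer in vgg_model.layers[:17]:
    layer.trainable = False
for layer in vgg_model.layers[17:]:
    layer.trainable = True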
Data
This time, I used the CIFAR-10 data set, which is composed of color images in 10 classes.
The image above shows part of the data set.
Because fine-tuning doesn't need much data or training time (and, honestly, training on a huge amount of data is too hard on my laptop), I limited the amount of data.
The code below imports the libraries and prepares the data.
import random
import cv2
from keras.datasets import cifar10
from keras.utils import to_categorical
from keras.models import Model
from keras.layers import Dense, GlobalAveragePooling2D, Dropout
from keras.optimizers import SGD
from keras.applications.vgg19 import VGG19
import numpy as np
# read data
(x_train, y_train), (x_test, y_test) = cifar10.load_data()
# limit the amount of the data
# train data
ind_train = random.sample(list(range(x_train.shape[0])), 2000)
x_train = x_train[ind_train]
y_train = y_train[ind_train]
# test data
ind_test = random.sample(list(range(x_test.shape[0])), 2000)
x_test = x_test[ind_test]
y_test = y_test[ind_test]
def resize_data(data):
    # upscale the 32x32 CIFAR-10 images to 48x48 before feeding them to VGG19
    data_upscaled = np.zeros((data.shape[0], 48, 48, 3))
    for i, img in enumerate(data):
        large_img = cv2.resize(img, dsize=(48, 48), interpolation=cv2.INTER_CUBIC)
        data_upscaled[i] = large_img
    return data_upscaled
# resize train and test data
x_train_resized = resize_data(x_train)
x_test_resized = resize_data(x_test)
# one-hot encode the target variable
y_train_hot_encoded = to_categorical(y_train)
y_test_hot_encoded = to_categorical(y_test)
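Before moving on, it can be worth confirming the array shapes. With the 2,000-sample subset and 48x48 resizing above, they should look like this:
# quick sanity check of the prepared arrays
print(x_train_resized.shape, y_train_hot_encoded.shape)  # expected: (2000, 48, 48, 3) (2000, 10)
print(x_test_resized.shape, y_test_hot_encoded.shape)    # expected: (2000, 48, 48, 3) (2000, 10)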
Fine tuning
Here, I added some layers to the pre-trained model and trained them. I wrote about this part step by step in the article How to make Fine tuning model.
# take the pre-trained model's output and add a global average pooling layer
x = vgg_model.output
x = GlobalAveragePooling2D()(x)
# add fully-connected layer
x = Dense(512, activation='relu')(x)
x = Dropout(0.3)(x)
# add output layer
predictions = Dense(10, activation='softmax')(x)
model = Model(inputs=vgg_model.input, outputs=predictions)
# freeze the pre-trained model's layers
for layer in vgg_model.layers:
    layer.trainable = False
# first, train only the newly added layers so their weights are updated
model.compile(optimizer='rmsprop', loss='categorical_crossentropy')
model.fit(x_train_resized, y_train_hot_encoded)
# choose the layers which are updated by training
layer_num = len(model.layers)
for layer in model.layers[:21]:
    layer.trainable = False
for layer in model.layers[21:]:
    layer.trainable = True
# training
model.compile(optimizer=SGD(lr=0.0001, momentum=0.9), loss='categorical_crossentropy', metrics=['accuracy'])
history = model.fit(x_train_resized, y_train_hot_encoded, batch_size=256, epochs=50, shuffle=True, validation_split=0.1)
For the part that chooses which layers are trained on the data, I picked index 21 by checking the model's architecture plot and the layer list above.
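If you want to confirm which layers were actually updated in the second training stage, printing each layer's trainable flag makes it explicit (with the settings above, indices 0 to 20 should show False):
# confirm which layers are frozen and which are trainable
for i, layer in enumerate(model.layers):
    print(i, layer.name, layer.trainable)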
How did the training go?
To check how the training went, we can plot the change in accuracy over the epochs (loss would also work).
import matplotlib.pyplot as plt
def show_history(history):
    # 'acc' and 'val_acc' are the metric keys used by this version of Keras
    plt.plot(history.history['acc'])
    plt.plot(history.history['val_acc'])
    plt.ylabel('accuracy')
    plt.xlabel('epoch')
    plt.legend(['train_accuracy', 'validation_accuracy'], loc='best')
    plt.show()
show_history(history)
The plot is shown below.
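The test split prepared earlier can also be used for a quick held-out check; model.evaluate returns the loss and the accuracy metric defined at compile time:
# evaluate on the held-out test images (returns [loss, accuracy] with the compile settings above)
score = model.evaluate(x_test_resized, y_test_hot_encoded, batch_size=256)
print(score)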