Transfer learning with TF Hub

First, we import the necessary libraries: we install TensorFlow 2.0 and import tensorflow. Note that we install tensorflow_gpu because we are training a CNN on images, which runs faster on a GPU; we use the GPU as the hardware accelerator for this Colab.
For TensorFlow Hub, we import the tensorflow_hub library, and we also import layers from tensorflow.keras. Next, we download the classifier from TF Hub. We use hub.Module to load a MobileNet and tf.keras.layers.Lambda to wrap it as a Keras layer.
This is the URL of the classifier, the MobileNet on TensorFlow Hub. We define the image shape, which is 224 by 224, and we define a sequential model containing the hub layer. Let us run this model on a single image and see what we get. We download the image using tf.keras.utils.get_file and resize it to the image shape.
You can see the input image. The input tensor is a color image with three channels and a height and width of 224 each. We know that CNNs take a 4D tensor as input, so we add a batch dimension and pass the image to the model.
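As a minimal sketch of the batching step, the snippet below uses NumPy with a zero-filled array standing in for the resized image (the real pixels are not reproduced here). The names `image` and `batch` are illustrative, not taken from the notebook.

```python
import numpy as np

# Stand-in for the resized input image: height 224, width 224, 3 color channels.
image = np.zeros((224, 224, 3), dtype=np.float32)

# Add a leading batch dimension so the model sees a batch containing one image.
batch = image[np.newaxis, ...]

print(image.shape)  # (224, 224, 3)
print(batch.shape)  # (1, 224, 224, 3)
```

The leading axis is the batch axis, so a single image becomes a batch of size 1.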
The result of the classifier is a 2D tensor with one row of 1001 elements; these are logits that rate the probability of each class for the image. The top class ID can be found using argmax. You can see that the class ID for the input image is 653. To get the text representation of the class, we download the ImageNet labels file and use it to decode the name of the class.
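The argmax-and-decode step can be sketched with NumPy alone. The logits here are synthetic, built so that class 653 wins as in the walkthrough, and the label list is a placeholder in which only entry 653 carries the name reported in this session; the real names come from the ImageNet labels file.

```python
import numpy as np

# Synthetic stand-in for the classifier output: one row of 1001 logits.
logits = np.zeros((1, 1001), dtype=np.float32)
logits[0, 653] = 9.0  # assumption: force class 653 to be the top class

# Placeholder labels; in the notebook these are read from the ImageNet labels file.
labels = [f"class_{i}" for i in range(1001)]
labels[653] = "military uniform"

# argmax over the class axis returns the ID of the highest-scoring class.
class_id = int(np.argmax(logits[0], axis=-1))
print(class_id, labels[class_id])
```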
You can see that the prediction, ID 653, corresponds to Military Uniform. We can use TF Hub to retrain the top layer of the model to recognize the classes in our own dataset. Let us download a flower dataset and demonstrate transfer learning with TF Hub. We load the data into our model using ImageDataGenerator, which you can see here, and we pass the rescaling parameter to it. TensorFlow Hub's image modules expect float inputs in the range 0 to 1, hence we rescale the input images.
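To make the rescaling concrete, here is a small NumPy sketch. It assumes 8-bit pixel values and a scale factor of 1/255; the exact factor passed to the generator is not stated above, so treat it as an assumption. The tiny random batch is a stand-in for real images.

```python
import numpy as np

# Hypothetical 8-bit image batch: 2 images of 4 x 4 pixels with 3 channels.
rng = np.random.default_rng(0)
pixels = rng.integers(0, 256, size=(2, 4, 4, 3), dtype=np.uint8)

# Convert to float and divide by 255 so every value lies in [0, 1].
scaled = pixels.astype(np.float32) / 255.0

print(scaled.dtype, float(scaled.min()) >= 0.0, float(scaled.max()) <= 1.0)
```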
We also resize the images to the desired shape. We can look at the shapes of the image batch and the label batch. The image batch is a 4D tensor holding 32 images, each with a height and width of 224 and 3 channels. The label batch has a vector of size 5 for each image: the flower dataset has 5 classes, and each class is represented with one-hot encoding. Let us run the classifier on the image batch. Note that the classifier currently contains only the Keras layer from the hub.
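The one-hot label vectors described above can be illustrated as follows. The integer class IDs are made up for the example; only the number of classes (5) comes from the walkthrough.

```python
import numpy as np

num_classes = 5  # the flower dataset has 5 classes

# Hypothetical integer class IDs for four images.
class_ids = np.array([0, 3, 4, 1])

# Row i of the identity matrix is the one-hot vector for class i.
one_hot = np.eye(num_classes, dtype=np.float32)[class_ids]

# argmax along the class axis recovers the original IDs.
recovered = one_hot.argmax(axis=1)

print(one_hot.shape)  # (4, 5)
print(one_hot[1])     # [0. 0. 0. 1. 0.]
```

With a batch of 32 images the same construction gives a label batch of shape (32, 5).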
If we apply the classifier to the image batch, we get a 2D tensor of shape (32, 1001). TensorFlow Hub also distributes models without the top classification layer, and these can be used for transfer learning. We create a feature extractor as a Keras layer with an input shape of 224 x 224 x 3; it returns a vector of length 1280 for each image, so the feature batch is a 2D tensor of shape (32, 1280). We freeze the variables in the feature extractor layer so that training only modifies the new classifier layer.
We then attach a new classifier layer to the model. The new classifier layer has as many units as there are classes in the dataset, and we use softmax as the activation function. Since we have 5 classes, the dense layer outputs 5 probabilities, one for each class. The number of parameters in the Keras layer is equal to the number of parameters in the MobileNet.
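What the new classifier layer computes can be sketched in plain NumPy, without Keras or the hub model. The feature batch, weights, and bias below are random stand-ins (assumptions for illustration); only the shapes, 1280 features in and 5 classes out, follow the walkthrough.

```python
import numpy as np

def softmax(z):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)

# Stand-in for the frozen feature extractor's output on a batch of 32 images.
features = rng.normal(size=(32, 1280)).astype(np.float32)

# Hypothetical parameters of the new dense layer: 1280 inputs, 5 units.
weights = rng.normal(scale=0.01, size=(1280, 5)).astype(np.float32)
bias = np.zeros(5, dtype=np.float32)

# Dense layer followed by softmax: one probability per class for every image.
probs = softmax(features @ weights + bias)

print(probs.shape)  # (32, 5)
```

Each row of `probs` sums to 1, which is what lets us read the 5 outputs as class probabilities.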
MobileNet has about 2.2 million parameters, and the dense layer has 1280 inputs. Every unit therefore has 1280 weights, one per input, plus 1 bias. That gives 1281 parameters per unit, and with 5 units the dense layer has 1281 x 5 = 6405 parameters. We can see that the total parameter count is the sum of the parameters in the Keras layer and the parameters in the dense layer.
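The dense-layer count above is easy to check by hand. The variable names in this short sketch are illustrative.

```python
feature_dim = 1280  # length of the feature vector fed to the dense layer
num_classes = 5     # one dense unit per flower class

# Each unit has one weight per input plus a single bias.
params_per_unit = feature_dim + 1
dense_params = params_per_unit * num_classes

print(params_per_unit, dense_params)  # 1281 6405
```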
Of these, the parameters in the Keras layer are non-trainable, whereas the parameters in the dense layer are trainable. Let us compile the model. Since the labels are one-hot vectors over 5 classes, we use the categorical cross-entropy loss. We use Adam as the optimizer. Let us fit the model; we will train it for just 2 epochs.
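For intuition about the loss chosen above, here is a hedged NumPy sketch of categorical cross-entropy on a single one-hot label. The helper `categorical_crossentropy` and the two example predictions are written for this illustration; this is not the Keras implementation.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Per-example cross-entropy between one-hot labels and predicted probabilities."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)  # avoid log(0)
    return -np.sum(y_true * np.log(y_pred), axis=-1)

# One image whose true class is index 2 of the 5 classes.
y_true = np.array([[0.0, 0.0, 1.0, 0.0, 0.0]])

# A confident, correct prediction and an uninformative uniform one (made-up values).
confident = np.array([[0.02, 0.02, 0.90, 0.03, 0.03]])
uniform = np.full((1, 5), 0.2)

loss_confident = categorical_crossentropy(y_true, confident)[0]  # -ln(0.9), about 0.105
loss_uniform = categorical_crossentropy(y_true, uniform)[0]      # ln(5), about 1.609

print(loss_confident, loss_uniform)
```

Only the probability assigned to the true class enters the loss, so the loss falls as the model becomes more confident in the correct class.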
To visualize the training progress, we use a custom callback that logs the loss and accuracy of each batch individually instead of the epoch average. We compute the steps per epoch and define a CollectBatchStats callback, then pass both the callback and the computed steps per epoch to fit. You can see that after two epochs we reach an accuracy close to 94 percent.
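The two ideas in this step can be sketched without TensorFlow. The sample count of 1000 is a hypothetical number, not the size of the flower dataset, and `BatchStats` is a framework-free stand-in for the CollectBatchStats callback. In Keras, such a class would subclass `tf.keras.callbacks.Callback`, whose `on_train_batch_end(batch, logs)` hook it mirrors. The log keys `"loss"` and `"acc"` and the fed-in values are assumptions for illustration.

```python
import math

num_samples = 1000  # hypothetical dataset size
batch_size = 32     # batch size used in the walkthrough

# Round up so the final, smaller batch is still processed.
steps_per_epoch = math.ceil(num_samples / batch_size)
print(steps_per_epoch)  # 32


class BatchStats:
    """Collects per-batch loss and accuracy instead of epoch averages."""

    def __init__(self):
        self.batch_losses = []
        self.batch_acc = []

    def on_train_batch_end(self, batch, logs=None):
        logs = logs or {}
        self.batch_losses.append(logs.get("loss"))
        self.batch_acc.append(logs.get("acc"))


# Simulate three training steps with made-up metric values.
stats = BatchStats()
for step, (loss, acc) in enumerate([(1.6, 0.20), (1.1, 0.55), (0.7, 0.80)]):
    stats.on_train_batch_end(step, {"loss": loss, "acc": acc})

print(stats.batch_losses)  # [1.6, 1.1, 0.7]
```

Plotting `batch_losses` and `batch_acc` against the step index gives the per-step curves discussed in this session.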
If we look at the training accuracy by step, we can see that it increases as training progresses. Let us get the predictions for the image batch and plot the results. We use green if the model's prediction is correct and red if it is incorrect.
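The color rule for the plot amounts to a per-image comparison of predicted and true class IDs. The IDs below are invented for the example; the plotting call itself is omitted.

```python
# Hypothetical true and predicted class IDs for four images.
true_ids = [0, 2, 4, 1]
predicted_ids = [0, 2, 3, 1]

# Green for a correct prediction, red for an incorrect one.
colors = ["green" if p == t else "red" for p, t in zip(predicted_ids, true_ids)]

print(colors)  # ['green', 'green', 'red', 'green']
```

Each entry of `colors` would then be used as the title color of the corresponding image in the plot.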
Now you can see that most of the predictions are correct. Now that the model is trained, we can export it as a SavedModel so that we can deploy it on another device or reload it for future use. After saving the model, we reload it and check that the results of the reloaded model match those of the original model, which we do by taking the difference between the two sets of results.
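The comparison at the end can be sketched as below. The two output arrays are stand-ins: `original_out` is random, and `reloaded_out` is a copy that plays the role of a faithfully reloaded model's predictions. No model is actually saved or loaded here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the trained model's predictions on a batch: 32 images, 5 classes.
original_out = rng.random((32, 5))

# Stand-in for the reloaded model's predictions on the same batch.
reloaded_out = original_out.copy()

# The largest absolute difference should be zero, or very close to it.
max_abs_diff = np.abs(reloaded_out - original_out).max()
matches = np.allclose(reloaded_out, original_out)

print(max_abs_diff, matches)
```

With real models, comparing within a small tolerance, as `np.allclose` does, is usually more robust than requiring an exact zero difference.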
In this session, we looked at TF Hub and saw how to use the models published on TF Hub for transfer learning with CNNs. Hope you had fun learning these concepts. See you in the next article.