4.2 - Convolutional Neural Networks#

The course material requires a TensorFlow version lower than the default one used in Google Colab. Run the following cell to downgrade TensorFlow accordingly.

!wget -nc --no-cache -O init.py -q https://raw.githubusercontent.com/rramosp/2021.deeplearning/main/content/init.py
import init; init.init(force_download=False);
/content/init.py:2: SyntaxWarning: invalid escape sequence '\S'
  course_id = '\S*deeplearning\S*'
replicating local resources
import tensorflow as tf
from time import time
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
from local.lib import mlutils
%matplotlib inline

Image analytics tasks#

from IPython.display import Image
Image(filename='local/imgs/imgs_tasks.jpeg', width=800)

Explore COCO Dataset#


Convolutional Neural Networks#

see video series Tensorflow and Deep Learning without a PhD

see convolutions summary | filter activation demo | confusion matrix

see The 9 Deep Learning Papers You Should Know

RECOMMENDATION#

First level filters and activations maps#

the filters in the middle are applied to the image on the left. Observe, for instance, in which parts of the image the seventh filter of the first row activates (the second-to-last one in that row).

Image(filename='local/imgs/cnn_swan.png', width=800)
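as a minimal sketch of how such an activation map arises (using a hypothetical hand-crafted vertical-edge kernel and a random placeholder image, not the course's learned filters), we can convolve a single filter over an image and see where it fires:

# hedged sketch: one hand-crafted 3x3 vertical-edge kernel applied to a placeholder image
img = np.random.rand(1, 32, 32, 1).astype("float32")          # batch of one grayscale image
kernel = np.array([[-1., 0., 1.],
                   [-2., 0., 2.],
                   [-1., 0., 1.]], dtype="float32").reshape(3, 3, 1, 1)
activation = tf.nn.relu(tf.nn.conv2d(img, kernel, strides=1, padding="SAME"))
plt.imshow(activation.numpy()[0, :, :, 0], cmap=plt.cm.Greys_r)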

Hierarchy of filters and activation maps#

Image(filename='local/imgs/cnn_features.png', width=600)
Image(filename='local/imgs/conv1.jpg', width=800)
Image(filename='local/imgs/conv2.jpg', width=800)

other examples of first-level filters

Image(filename='local/imgs/cnn_features2.png', width=600)

We have a small image dataset based on CIFAR-10, where each image is 32x32x3.

!wget -nc https://s3.amazonaws.com/rlx/mini_cifar.h5
--2025-09-03 22:13:45--  https://s3.amazonaws.com/rlx/mini_cifar.h5
Resolving s3.amazonaws.com (s3.amazonaws.com)... 3.5.21.44, 52.217.172.48, 3.5.24.43, ...
Connecting to s3.amazonaws.com (s3.amazonaws.com)|3.5.21.44|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 14803609 (14M) [binary/octet-stream]
Saving to: ‘mini_cifar.h5’

mini_cifar.h5       100%[===================>]  14.12M  66.7MB/s    in 0.2s    

2025-09-03 22:13:45 (66.7 MB/s) - ‘mini_cifar.h5’ saved [14803609/14803609]
import h5py
with h5py.File('mini_cifar.h5','r') as h5f:
    x_cifar = h5f["x"][:]
    y_cifar = h5f["y"][:]
mlutils.show_labeled_image_mosaic(x_cifar, y_cifar)
../_images/5d89ad64f72c0076c5272adb8397bd44c2d1a850b9c5f3728c35c4acf2b9c437.png
print (np.min(x_cifar), np.max(x_cifar))
0.0 1.0
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(x_cifar, y_cifar, test_size=.25)
print (x_train.shape, y_train.shape, x_test.shape, y_test.shape)
print ("\ndistribution of train classes")
print (pd.Series(y_train).value_counts())
print ("\ndistribution of test classes")
print (pd.Series(y_test).value_counts())
(2253, 32, 32, 3) (2253,) (751, 32, 32, 3) (751,)

distribution of train classes
2    788
0    750
1    715
Name: count, dtype: int64

distribution of test classes
1    259
0    255
2    237
Name: count, dtype: int64
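the split above is purely random, so the class counts can drift a little between train and test; if exact proportions matter, train_test_split accepts a stratify argument (an optional variant, not used in the rest of this notebook):

# optional: preserve class proportions in both splits
x_train, x_test, y_train, y_test = train_test_split(x_cifar, y_cifar, test_size=.25, stratify=y_cifar)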

build a Keras model

def get_conv_model_A(num_classes, img_size=32, compile=True):
    tf.keras.backend.clear_session()
    print ("using",num_classes,"classes")
    inputs = tf.keras.Input(shape=(img_size,img_size,3), name="input_1")
    layers = tf.keras.layers.Conv2D(15,(3,3), activation="relu", padding="SAME")(inputs)
    layers = tf.keras.layers.Flatten()(layers)
    layers = tf.keras.layers.Dense(16, activation=tf.nn.relu)(layers)
    layers = tf.keras.layers.Dropout(0.2)(layers)
    predictions = tf.keras.layers.Dense(num_classes, activation=tf.nn.softmax, name="output_1")(layers)
    model = tf.keras.Model(inputs = inputs, outputs=predictions)
    if compile:
        model.compile(optimizer='adam',
                      loss='sparse_categorical_crossentropy',
                      metrics=['accuracy'])
    return model
num_classes = len(np.unique(y_cifar))
model = get_conv_model_A(num_classes)
using 3 classes

observe the initialized weights and their shapes

weights = model.get_weights()
for i in weights:
    print (i.shape)
(3, 3, 3, 15)
(15,)
(15360, 16)
(16,)
(16, 3)
(3,)
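each pair of shapes is a kernel followed by its bias: (3, 3, 3, 15) is a 3x3 kernel over 3 input channels producing 15 filters, and (15360, 16) is the dense layer acting on the flattened 32*32*15 = 15360 activations. A quick sketch to check the parameter counts that will appear in the model summary below:

# params per layer = prod(kernel shape) + bias size -> 420, 245776, 51
for kernel, bias in zip(weights[0::2], weights[1::2]):
    print(kernel.shape, "->", np.prod(kernel.shape) + bias.shape[0], "params")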

we keep a copy of the first-layer filters so we can later compare them with the same filters after training.

initial_w0 = model.get_weights()[0].copy()
y_test.shape, y_train.shape, x_test.shape, x_train.shape
((751,), (2253,), (751, 32, 32, 3), (2253, 32, 32, 3))
num_classes = len(np.unique(y_cifar))

def train(model, batch_size, epochs, model_name=""):
    # log each run under a unique timestamped folder so runs can be compared in TensorBoard
    tensorboard = tf.keras.callbacks.TensorBoard(log_dir="logs/"+model_name+"_"+"{}".format(time()))
    model.fit(x_train, y_train, epochs=epochs, callbacks=[tensorboard],
              batch_size=batch_size,
              validation_data=(x_test, y_test))
    # evaluate on the test set and return {metric name: value}
    metrics = model.evaluate(x_test, y_test)
    return {k: v for k, v in zip(model.metrics_names, metrics)}

observe the shapes of the model weights obtained above and try to see how they relate to the output shapes and the number of parameters in the summary below

model = get_conv_model_A(num_classes)
model.summary()
using 3 classes
Model: "functional"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                     Output Shape                  Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ input_1 (InputLayer)            │ (None, 32, 32, 3)      │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ conv2d (Conv2D)                 │ (None, 32, 32, 15)     │           420 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ flatten (Flatten)               │ (None, 15360)          │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense (Dense)                   │ (None, 16)             │       245,776 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout (Dropout)               │ (None, 16)             │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ output_1 (Dense)                │ (None, 3)              │            51 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
 Total params: 246,247 (961.90 KB)
 Trainable params: 246,247 (961.90 KB)
 Non-trainable params: 0 (0.00 B)
train(model, batch_size=32, epochs=10, model_name="model_A")
Epoch 1/10
71/71 ━━━━━━━━━━━━━━━━━━━━ 4s 33ms/step - accuracy: 0.4473 - loss: 1.2124 - val_accuracy: 0.6405 - val_loss: 0.8304
Epoch 2/10
71/71 ━━━━━━━━━━━━━━━━━━━━ 1s 19ms/step - accuracy: 0.6213 - loss: 0.8343 - val_accuracy: 0.6578 - val_loss: 0.7908
Epoch 3/10
71/71 ━━━━━━━━━━━━━━━━━━━━ 3s 19ms/step - accuracy: 0.6574 - loss: 0.7616 - val_accuracy: 0.6897 - val_loss: 0.7229
Epoch 4/10
71/71 ━━━━━━━━━━━━━━━━━━━━ 1s 17ms/step - accuracy: 0.6527 - loss: 0.7387 - val_accuracy: 0.7031 - val_loss: 0.7145
Epoch 5/10
71/71 ━━━━━━━━━━━━━━━━━━━━ 1s 17ms/step - accuracy: 0.6950 - loss: 0.6999 - val_accuracy: 0.6937 - val_loss: 0.7106
Epoch 6/10
71/71 ━━━━━━━━━━━━━━━━━━━━ 1s 17ms/step - accuracy: 0.7206 - loss: 0.6592 - val_accuracy: 0.6897 - val_loss: 0.7089
Epoch 7/10
71/71 ━━━━━━━━━━━━━━━━━━━━ 1s 17ms/step - accuracy: 0.7170 - loss: 0.6470 - val_accuracy: 0.7044 - val_loss: 0.6813
Epoch 8/10
71/71 ━━━━━━━━━━━━━━━━━━━━ 1s 19ms/step - accuracy: 0.7434 - loss: 0.5900 - val_accuracy: 0.7044 - val_loss: 0.6998
Epoch 9/10
71/71 ━━━━━━━━━━━━━━━━━━━━ 3s 21ms/step - accuracy: 0.7374 - loss: 0.5816 - val_accuracy: 0.7403 - val_loss: 0.6335
Epoch 10/10
71/71 ━━━━━━━━━━━━━━━━━━━━ 1s 19ms/step - accuracy: 0.7540 - loss: 0.5684 - val_accuracy: 0.7350 - val_loss: 0.6252
24/24 ━━━━━━━━━━━━━━━━━━━━ 0s 7ms/step - accuracy: 0.7377 - loss: 0.6379
{'loss': 0.6252090930938721, 'compile_metrics': 0.7350199818611145}
test_preds = model.predict(x_test).argmax(axis=1)
mlutils.plot_confusion_matrix(y_test, test_preds, classes=np.r_[0,1,2], normalize=True)
24/24 ━━━━━━━━━━━━━━━━━━━━ 0s 7ms/step
Normalized confusion matrix
[[0.56078431 0.14509804 0.29411765]
 [0.05019305 0.86872587 0.08108108]
 [0.11392405 0.10970464 0.77637131]]
<Axes: title={'center': 'Normalized confusion matrix'}, xlabel='Predicted label', ylabel='True label'>
../_images/f7b067f19f2a89b63a24aa14b35d50799b48f07c99e37dbecc4af8503d9b821c.png
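mlutils.plot_confusion_matrix is a course helper; the same numbers can be reproduced directly with sklearn (a hedged equivalent, since the helper's internals are not shown here):

# row-normalized confusion matrix: each row shows how a true class spreads over predictions
from sklearn.metrics import confusion_matrix
cm = confusion_matrix(y_test, test_preds)
print(cm / cm.sum(axis=1, keepdims=True))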

observe the output in TensorBoard

tensorboard --logdir logs
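in Colab or Jupyter the same dashboard can also be opened inline with the TensorBoard notebook extension:

%load_ext tensorboard
%tensorboard --logdir logs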

first-layer filters before training

mlutils.display_imgs(initial_w0)
../_images/998b61adc771b60f97e71e2eb3d27dd442ab08321bdd3d7226f2e34186eb514e.png

and after training

w0 = model.get_weights()[0]
print (w0.shape)
mlutils.display_imgs(w0)
(3, 3, 3, 15)
../_images/1f298fddcd540da43116d6ba7503a8d30e9b1835201733d40a531506daf6e0fe.png
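since we kept initial_w0, a small sketch to quantify how much training actually moved the first-layer filters:

# mean absolute change per filter between initialization and after training
delta = np.abs(w0 - initial_w0).mean(axis=(0, 1, 2))
print("mean absolute change per filter:", np.round(delta, 3))
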
idxs = np.random.permutation(len(x_test))[:5]
preds = model.predict(x_test[idxs])
mlutils.show_preds(x_test[idxs],y_test[idxs], preds)
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 53ms/step
../_images/8cbd7e6ac14a310cb0c09bf00159aeeb24e28ef773c5583e4c7015d3387379f4.png ../_images/39c13266e36407a351e21e4498752748da690ff431753fb902fb1c665913de2c.png ../_images/4329c37816b5ca21cfb04802a1d94717c6c6af299b589fa02d6f740a83c9b1fe.png ../_images/97d54e8bd85bf10079b8b45d1d147e8a7523d5157937f4fe29497e48040e4761.png ../_images/d7aca469f46f30c7bc1b7baa0afaa2f9b40b4ab195843a31beee678d653844a1.png

Let’s try a more complex network#

def get_conv_model_B(num_classes, img_size=32, compile=True):
    tf.keras.backend.clear_session()
    print ("using",num_classes,"classes")
    inputs = tf.keras.Input(shape=(img_size,img_size,3), name="input_1")
    layers = tf.keras.layers.Conv2D(15,(5,5), activation="relu")(inputs)
    layers = tf.keras.layers.MaxPool2D((2,2))(layers)
    layers = tf.keras.layers.Conv2D(60,(5,5), activation="relu")(layers)
    layers = tf.keras.layers.Flatten()(layers)
    layers = tf.keras.layers.Dense(16, activation=tf.nn.relu)(layers)
    layers = tf.keras.layers.Dropout(0.2)(layers)
    predictions = tf.keras.layers.Dense(num_classes, activation=tf.nn.softmax, name="output_1")(layers)
    model = tf.keras.Model(inputs = inputs, outputs=predictions)
    if compile:
        model.compile(optimizer='adam',
                      loss='sparse_categorical_crossentropy',
                      metrics=['accuracy'])
    return model
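with no padding (the Keras default, padding="valid") a convolution shrinks each spatial dimension to i - k + 1, and a 2x2 max-pool halves it; a quick sanity check of the shapes that the summary below reports:

# spatial sizes through model B (valid padding: o = i - k + 1)
size = 32
size = size - 5 + 1    # conv 5x5    -> 28
size = size // 2       # maxpool 2x2 -> 14
size = size - 5 + 1    # conv 5x5    -> 10
print("flatten size:", size * size * 60)   # 10*10*60 = 6000
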
model = get_conv_model_B(num_classes)
model.summary()
train(model, batch_size=32, epochs=10, model_name="model_B")
using 3 classes
Model: "functional"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                     Output Shape                  Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ input_1 (InputLayer)            │ (None, 32, 32, 3)      │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ conv2d (Conv2D)                 │ (None, 28, 28, 15)     │         1,140 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ max_pooling2d (MaxPooling2D)    │ (None, 14, 14, 15)     │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ conv2d_1 (Conv2D)               │ (None, 10, 10, 60)     │        22,560 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ flatten (Flatten)               │ (None, 6000)           │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense (Dense)                   │ (None, 16)             │        96,016 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout (Dropout)               │ (None, 16)             │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ output_1 (Dense)                │ (None, 3)              │            51 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
 Total params: 119,767 (467.84 KB)
 Trainable params: 119,767 (467.84 KB)
 Non-trainable params: 0 (0.00 B)
Epoch 1/10
71/71 ━━━━━━━━━━━━━━━━━━━━ 4s 37ms/step - accuracy: 0.4160 - loss: 1.0678 - val_accuracy: 0.5885 - val_loss: 0.9273
Epoch 2/10
71/71 ━━━━━━━━━━━━━━━━━━━━ 4s 52ms/step - accuracy: 0.5649 - loss: 0.9118 - val_accuracy: 0.6125 - val_loss: 0.8346
Epoch 3/10
71/71 ━━━━━━━━━━━━━━━━━━━━ 4s 34ms/step - accuracy: 0.5947 - loss: 0.8255 - val_accuracy: 0.6911 - val_loss: 0.7538
Epoch 4/10
71/71 ━━━━━━━━━━━━━━━━━━━━ 2s 34ms/step - accuracy: 0.6596 - loss: 0.7524 - val_accuracy: 0.6937 - val_loss: 0.7103
Epoch 5/10
71/71 ━━━━━━━━━━━━━━━━━━━━ 3s 34ms/step - accuracy: 0.6510 - loss: 0.7328 - val_accuracy: 0.7137 - val_loss: 0.6942
Epoch 6/10
71/71 ━━━━━━━━━━━━━━━━━━━━ 4s 53ms/step - accuracy: 0.6763 - loss: 0.7173 - val_accuracy: 0.7217 - val_loss: 0.6599
Epoch 7/10
71/71 ━━━━━━━━━━━━━━━━━━━━ 4s 35ms/step - accuracy: 0.7023 - loss: 0.6559 - val_accuracy: 0.7417 - val_loss: 0.6539
Epoch 8/10
71/71 ━━━━━━━━━━━━━━━━━━━━ 2s 34ms/step - accuracy: 0.7208 - loss: 0.6330 - val_accuracy: 0.7590 - val_loss: 0.6169
Epoch 9/10
71/71 ━━━━━━━━━━━━━━━━━━━━ 3s 35ms/step - accuracy: 0.7530 - loss: 0.5603 - val_accuracy: 0.7590 - val_loss: 0.5748
Epoch 10/10
71/71 ━━━━━━━━━━━━━━━━━━━━ 4s 53ms/step - accuracy: 0.7677 - loss: 0.5238 - val_accuracy: 0.7523 - val_loss: 0.6104
24/24 ━━━━━━━━━━━━━━━━━━━━ 0s 12ms/step - accuracy: 0.7349 - loss: 0.6249
{'loss': 0.6103829145431519, 'compile_metrics': 0.7523302435874939}
w0 = model.get_weights()[0]
print (w0.shape)
mlutils.display_imgs(w0)
(5, 5, 3, 15)
../_images/845910d8a39ad7c2de11903772a1f9278897a5981765f60da46167ed2c978886.png

or with larger filters#

def get_conv_model_C(num_classes, img_size=32, compile=True):
    tf.keras.backend.clear_session()
    print ("using",num_classes,"classes")
    inputs = tf.keras.Input(shape=(img_size,img_size,3), name="input_1")
    layers = tf.keras.layers.Conv2D(96,(11,11), activation="relu")(inputs)
    layers = tf.keras.layers.MaxPool2D((2,2))(layers)
    layers = tf.keras.layers.Conv2D(60,(11,11), activation="relu")(layers)
    layers = tf.keras.layers.Flatten()(layers)
    layers = tf.keras.layers.Dense(16, activation=tf.nn.relu)(layers)
    layers = tf.keras.layers.Dropout(0.2)(layers)
    predictions = tf.keras.layers.Dense(num_classes, activation=tf.nn.softmax, name="output_1")(layers)
    model = tf.keras.Model(inputs = inputs, outputs=predictions)
    if compile:
        model.compile(optimizer='adam',
                      loss='sparse_categorical_crossentropy',
                      metrics=['accuracy'])
    return model
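the same arithmetic shows how aggressive two 11x11 valid convolutions are on 32x32 inputs: the second conv collapses the feature maps to a single spatial position, as the summary below confirms.

# spatial sizes through model C: 32 -> 22 -> 11 -> 1
size = 32
size = size - 11 + 1   # conv 11x11  -> 22
size = size // 2       # maxpool 2x2 -> 11
size = size - 11 + 1   # conv 11x11  -> 1
print("spatial size after second conv:", size)
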
model = get_conv_model_C(num_classes)
model.summary()
train(model, batch_size=32, epochs=10, model_name="model_C")
using 3 classes
Model: "functional"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                     Output Shape                  Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ input_1 (InputLayer)            │ (None, 32, 32, 3)      │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ conv2d (Conv2D)                 │ (None, 22, 22, 96)     │        34,944 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ max_pooling2d (MaxPooling2D)    │ (None, 11, 11, 96)     │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ conv2d_1 (Conv2D)               │ (None, 1, 1, 60)       │       697,020 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ flatten (Flatten)               │ (None, 60)             │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense (Dense)                   │ (None, 16)             │           976 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout (Dropout)               │ (None, 16)             │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ output_1 (Dense)                │ (None, 3)              │            51 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
 Total params: 732,991 (2.80 MB)
 Trainable params: 732,991 (2.80 MB)
 Non-trainable params: 0 (0.00 B)
Epoch 1/10
71/71 ━━━━━━━━━━━━━━━━━━━━ 8s 99ms/step - accuracy: 0.3655 - loss: 1.1214 - val_accuracy: 0.5140 - val_loss: 0.9933
Epoch 2/10
71/71 ━━━━━━━━━━━━━━━━━━━━ 8s 114ms/step - accuracy: 0.5219 - loss: 0.9886 - val_accuracy: 0.5965 - val_loss: 0.8874
Epoch 3/10
71/71 ━━━━━━━━━━━━━━━━━━━━ 10s 115ms/step - accuracy: 0.5585 - loss: 0.9041 - val_accuracy: 0.6138 - val_loss: 0.8526
Epoch 4/10
71/71 ━━━━━━━━━━━━━━━━━━━━ 11s 120ms/step - accuracy: 0.5946 - loss: 0.8621 - val_accuracy: 0.6431 - val_loss: 0.8478
Epoch 5/10
71/71 ━━━━━━━━━━━━━━━━━━━━ 7s 97ms/step - accuracy: 0.5931 - loss: 0.8814 - val_accuracy: 0.6471 - val_loss: 0.8161
Epoch 6/10
71/71 ━━━━━━━━━━━━━━━━━━━━ 11s 110ms/step - accuracy: 0.6173 - loss: 0.8420 - val_accuracy: 0.6471 - val_loss: 0.8194
Epoch 7/10
71/71 ━━━━━━━━━━━━━━━━━━━━ 11s 115ms/step - accuracy: 0.6380 - loss: 0.8193 - val_accuracy: 0.6485 - val_loss: 0.8030
Epoch 8/10
71/71 ━━━━━━━━━━━━━━━━━━━━ 10s 115ms/step - accuracy: 0.6524 - loss: 0.7944 - val_accuracy: 0.6485 - val_loss: 0.7796
Epoch 9/10
71/71 ━━━━━━━━━━━━━━━━━━━━ 7s 98ms/step - accuracy: 0.6609 - loss: 0.7736 - val_accuracy: 0.6658 - val_loss: 0.7591
Epoch 10/10
71/71 ━━━━━━━━━━━━━━━━━━━━ 10s 98ms/step - accuracy: 0.6640 - loss: 0.7502 - val_accuracy: 0.6551 - val_loss: 0.7783
24/24 ━━━━━━━━━━━━━━━━━━━━ 1s 42ms/step - accuracy: 0.6572 - loss: 0.7931
{'loss': 0.7782898545265198, 'compile_metrics': 0.6551265120506287}
w0 = model.get_weights()[0]
print (w0.shape)
mlutils.display_imgs(w0)
(11, 11, 3, 96)
../_images/5411e8c4c19febaa4a558bc869f0a29a86fa76bff81f68df843e14062c36bcce.png
bi = np.random.randint(len(x_test))
plt.imshow(x_test[bi])
plt.axis("off");
../_images/48d8885b0f9278650961cb86a1bcd33f40ded916aeecec7796e92f7a760fb4d6.png
def output_at_layer(X, model, layer_name):
    # build a sub-model that ends at the requested layer and return its activations for X
    from tensorflow.keras.models import Model
    return Model(inputs=model.input, outputs=model.get_layer(layer_name).output)(X).numpy()
acts = output_at_layer(x_test[bi:bi+1], model, "conv2d")[0]
acts.shape
(22, 22, 96)
plt.figure(figsize=(10,10))
for i in range(acts.shape[-1]):
    plt.subplot(10,10,i+1)
    plt.imshow(acts[:,:,i], cmap=plt.cm.Greys_r )
    plt.axis("off")
../_images/598ae6d5a20a9f0bc92a6c466c2acabc418f88dce101b5f38f32224daf1815f7.png
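to see which of the 96 filters respond most strongly to this particular image, we can rank them by mean activation (a small sketch over the acts array computed above):

# indices of the five filters with the highest mean activation on this image
strongest = np.argsort(acts.mean(axis=(0, 1)))[::-1][:5]
print("most active filters:", strongest)
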
idxs = np.random.permutation(len(x_test))[:5]
preds = model.predict(x_test[idxs])
mlutils.show_preds(x_test[idxs],y_test[idxs], preds)
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 86ms/step
../_images/f3de2a443a300e8fce5b0fa5ab1dfbf90eacb245844f7e385ad1ef00ef6b9288.png ../_images/5c193fb9b65b0627300c6cb9b99182473245d8b9a58a6e00a3a032bd46d8e41d.png ../_images/92ba60ba9fcb20741a3cbd8e18ab4041960ab0e7e2006685fcf4ccd669f972a3.png ../_images/8eb575eee3b31e8eb641f767f8562582b28020499d1522a96b44018f75b91766.png ../_images/55b4f10744a8de71a7e114491f3fd5b668b691978b821b02fa44fb58d5f43cdd.png

see Class activation maps: https://jacobgil.github.io/deeplearning/class-activation-maps