07.04 - TENSORFLOW

!wget --no-cache -O init.py -q https://raw.githubusercontent.com/rramosp/20201.xai4eng/master/content/init.py
import init; init.init(force_download=False); init.get_weblink()
endpoint https://m5knaekxo6.execute-api.us-west-2.amazonaws.com/dev-v0001/rlxmooc
import numpy as np
import matplotlib.pyplot as plt
from local.lib import mlutils
from IPython.display import Image

try:
    %tensorflow_version 2.x
    print("Using TF2 in Google Colab")
except:
    pass

import tensorflow as tf
%matplotlib inline

A dataset (again)

## KEEPOUTPUT
from sklearn.datasets import make_moons
X,y = make_moons(300, noise=.15)
plt.scatter(X[:,0][y==0], X[:,1][y==0], color="blue", label="class 0", alpha=.5)
plt.scatter(X[:,0][y==1], X[:,1][y==1], color="red", label="class 1", alpha=.5)
<matplotlib.collections.PathCollection at 0x7f34b2988f70>
[Figure: scatter plot of the two-moons dataset, class 0 in blue, class 1 in red]

A neural network in TensorFlow, with 4 neurons in the hidden layer and 1 output

model = tf.keras.Sequential([
    tf.keras.layers.Dense(4, activation='tanh'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])
model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=.5),
              loss=tf.keras.losses.BinaryCrossentropy(from_logits=False), metrics=['accuracy'])
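Since the last layer uses a sigmoid, the model outputs probabilities, hence from_logits=False. An equivalent alternative (a sketch, not used in the rest of these notes) is to drop the final activation and let the loss apply the sigmoid internally, which tends to be numerically more stable:

logits_model = tf.keras.Sequential([
    tf.keras.layers.Dense(4, activation='tanh'),
    tf.keras.layers.Dense(1)                   # no activation: the outputs are raw logits
])
logits_model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=.5),
                     loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
                     metrics=['accuracy'])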
## KEEPOUTPUT
model.fit(X,y, epochs=10, batch_size=1)
Epoch 1/10
300/300 [==============================] - 0s 480us/step - loss: 0.4007 - accuracy: 0.8267
Epoch 2/10
300/300 [==============================] - 0s 478us/step - loss: 0.3605 - accuracy: 0.8467
Epoch 3/10
300/300 [==============================] - 0s 469us/step - loss: 0.2444 - accuracy: 0.8900
Epoch 4/10
300/300 [==============================] - 0s 473us/step - loss: 0.1709 - accuracy: 0.9467
Epoch 5/10
300/300 [==============================] - 0s 470us/step - loss: 0.1837 - accuracy: 0.9467
Epoch 6/10
300/300 [==============================] - 0s 455us/step - loss: 0.1485 - accuracy: 0.9600
Epoch 7/10
300/300 [==============================] - 0s 467us/step - loss: 0.1072 - accuracy: 0.9633
Epoch 8/10
300/300 [==============================] - 0s 456us/step - loss: 0.0981 - accuracy: 0.9667
Epoch 9/10
300/300 [==============================] - 0s 472us/step - loss: 0.1252 - accuracy: 0.9600
Epoch 10/10
300/300 [==============================] - 0s 465us/step - loss: 0.1253 - accuracy: 0.9633
<tensorflow.python.keras.callbacks.History at 0x7f3430485850>
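After training we can measure loss and accuracy over a dataset with model.evaluate (a minimal sketch; here we reuse the training data, whereas in practice you would evaluate on a held-out split):

loss, acc = model.evaluate(X, y, verbose=0)    # returns the loss and the compiled metrics
print("loss %.4f, accuracy %.4f" % (loss, acc))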
## KEEPOUTPUT
predict = lambda X: (model.predict(X)[:,0]>.5).astype(int)
mlutils.plot_2Ddata_with_boundary(predict, X, y)
(0.5318, 0.4682)
[Figure: decision boundary learned by the network over the two-moons data]

A bigger network

  • different activation functions

  • different optimizer

model = tf.keras.Sequential([
    tf.keras.layers.Dense(20, activation='tanh'),
    tf.keras.layers.Dense(50, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=.01),
              loss=tf.keras.losses.BinaryCrossentropy(from_logits=False), metrics=['accuracy'])
## KEEPOUTPUT
model.fit(X,y, epochs=10, batch_size=1)
Epoch 1/10
300/300 [==============================] - 0s 510us/step - loss: 0.3400 - accuracy: 0.8567
Epoch 2/10
300/300 [==============================] - 0s 508us/step - loss: 0.2804 - accuracy: 0.8867
Epoch 3/10
300/300 [==============================] - 0s 487us/step - loss: 0.2450 - accuracy: 0.8967
Epoch 4/10
300/300 [==============================] - 0s 532us/step - loss: 0.1722 - accuracy: 0.9433
Epoch 5/10
300/300 [==============================] - 0s 516us/step - loss: 0.1660 - accuracy: 0.9500
Epoch 6/10
300/300 [==============================] - 0s 527us/step - loss: 0.0917 - accuracy: 0.9700
Epoch 7/10
300/300 [==============================] - 0s 516us/step - loss: 0.1127 - accuracy: 0.9667
Epoch 8/10
300/300 [==============================] - 0s 507us/step - loss: 0.1118 - accuracy: 0.9600
Epoch 9/10
300/300 [==============================] - 0s 526us/step - loss: 0.1076 - accuracy: 0.9500
Epoch 10/10
300/300 [==============================] - 0s 487us/step - loss: 0.0840 - accuracy: 0.9800
<tensorflow.python.keras.callbacks.History at 0x7f340807ac10>
## KEEPOUTPUT
predict = lambda X: (model.predict(X)[:,0]>.5).astype(int)
mlutils.plot_2Ddata_with_boundary(predict, X, y)
(0.522575, 0.477425)
[Figure: decision boundary learned by the bigger network over the two-moons data]

Cross entropy - multiclass classification

follow THIS EXAMPLE in the TensorFlow documentation. Observe that:

  • the labels correspond to a 10-class classification problem

  • the network contains 10 output neurons, one per output class

  • the loss function is SparseCategoricalCrossentropy (a minimal sketch of this setup is shown below)
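For reference, here is a minimal sketch of such a 10-class setup, assuming 784 input features (e.g. flattened 28x28 images) and integer labels 0 to 9; the layer sizes are illustrative, not taken from the example:

model10 = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation='relu', input_shape=(784,)),
    tf.keras.layers.Dense(10)                  # one output neuron per class, raw logits
])
model10.compile(optimizer='adam',
                loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
                metrics=['accuracy'])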

Observe how cross entropy works with 4 classes:

  • first we convert the labels to a one-hot encoding

  • we simulate a network with four output neurons (one per class) and no activation

  • we interpret each neuron’s output as an element of a probability distribution

  • we normalize the probability distribution with softmax (it must add up to one)

  • we consider the network output better when it assigns more probability to the correct class

expected classes for five data points

## KEEPOUTPUT
y = np.random.randint(4, size=5)
y
array([3, 1, 2, 0, 3])

convert them to a one-hot encoding

## KEEPOUTPUT
y_ohe = np.eye(4)[y].astype(int)
y_ohe
array([[0, 0, 0, 1],
       [0, 1, 0, 0],
       [0, 0, 1, 0],
       [1, 0, 0, 0],
       [0, 0, 0, 1]])

simulate some neural network output with NO ACTIVATION function

with 4 output neurons, so for each input element (we have five) we get 4 outputs.

these raw outputs are called LOGITS in TensorFlow

## KEEPOUTPUT
y_hat = np.round(np.random.normal(size=y_ohe.shape), 2)
y_hat
array([[ 0.06, -0.31, -0.95,  0.39],
       [ 0.92, -0.48, -0.08,  0.53],
       [-0.5 ,  0.22, -0.18,  1.81],
       [-0.49, -1.41,  0.09, -0.11],
       [-0.73,  0.26, -1.63, -0.68]])

normalize the LOGITS. This is the SOFTMAX function

LOGITS obtained from the network's last layer with no activation:

$$\hat{y}^{(i)} = [\hat{y}^{(i)}_0, \hat{y}^{(i)}_1, ..., \hat{y}^{(i)}_9]$$

SOFTMAX ACTIVATION:

$$\hat{\bar{y}}^{(i)} = [\hat{\bar{y}}^{(i)}_0, \hat{\bar{y}}^{(i)}_1, ..., \hat{\bar{y}}^{(i)}_9]$$

with

$$\hat{\bar{y}}^{(i)}_k = \frac{e^{\hat{y}^{(i)}_k}}{\sum_{j=0}^{9} e^{\hat{y}^{(i)}_j}}$$

this ensures:

  • $\sum_{k=0}^{9} \hat{\bar{y}}^{(i)}_k = 1$

  • $0 \le \hat{\bar{y}}^{(i)}_k \le 1$

this way, for each input we have a nice probability distribution in its outputs.

This is implemented in TensorFlow

## KEEPOUTPUT
y_hatb = tf.nn.softmax(y_hat).numpy()
y_hatb
array([[0.29019814, 0.20044982, 0.10569567, 0.40365637],
       [0.43638904, 0.10761221, 0.16053855, 0.2954602 ],
       [0.06893706, 0.14162659, 0.09493514, 0.69450122],
       [0.21519991, 0.08576126, 0.38435531, 0.31468351],
       [0.19420963, 0.52266365, 0.07895974, 0.20416697]])

check sums

## KEEPOUTPUT
y_hatb.sum(axis=1)
array([1., 1., 1., 1., 1.])
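As a sanity check, the same normalization can be computed directly in NumPy from the softmax formula above (a minimal sketch):

# exponentiate the logits and divide each row by its sum
y_hatb_np = np.exp(y_hat) / np.exp(y_hat).sum(axis=1, keepdims=True)
np.allclose(y_hatb_np, y_hatb)                 # should be True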

how would you now measure how close y_hatb is to the expected output y_ohe?

cross entropy: just take the probability assigned to the correct class (and pass it through a negative log)

$$\text{loss}(\bar{y}^{(i)}, \hat{\bar{y}}^{(i)}) = -\sum_{k=0}^{9} \bar{y}^{(i)}_k \log(\hat{\bar{y}}^{(i)}_k)$$

where $\bar{y}^{(i)}$ is the one-hot encoding of the expected class (label) for data point $i$.

observe that,

  • in the one-hot encoding $\bar{y}^{(i)}$ only one of the elements is 1 and the rest are 0’s, so the summation above only takes the log of the probability assigned to the correct label.

  • the negative sign accounts for the fact that logs of values < 1 are negative, and we will later want to minimize the loss.
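Before using the built-in loss, we can compute this formula directly in NumPy (a minimal sketch; it should reproduce the TensorFlow values below):

# the one-hot mask selects the probability of the correct class in each row,
# and the negative log turns it into a loss
-(y_ohe * np.log(y_hatb)).sum(axis=1)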

This is implemented in TensorFlow

## KEEPOUTPUT
tf.keras.losses.categorical_crossentropy(y_ohe, y_hatb)
<tf.Tensor: shape=(5,), dtype=float64, numpy=array([0.90719134, 2.22922115, 2.35456137, 1.53618786, 1.58881713])>

Observe that TensorFlow also implements the corresponding sparse convenience function, which works directly with our integer labels (no one-hot encoding needed)

## KEEPOUTPUT
tf.keras.losses.sparse_categorical_crossentropy(y, y_hatb)
<tf.Tensor: shape=(5,), dtype=float64, numpy=array([0.90719134, 2.22922115, 2.35456137, 1.53618786, 1.58881713])>