Eli5 is one of the most commonly used libraries for interpreting the predictions of machine learning models. It lets us interpret predictions of models created with scikit-learn, XGBoost, LightGBM, CatBoost, lightning, sklearn-crfsuite, and Keras. We have covered how to use Eli5 with scikit-learn models in a separate, detailed tutorial.
In this tutorial, we concentrate on Keras models. Currently, Eli5 only supports image classifiers created with Keras, and it explains their predictions using the Grad-CAM (Gradient-weighted Class Activation Mapping) algorithm. Grad-CAM builds a heatmap from the output of the last convolution layer and the gradients of the class score with respect to that output. The heatmap is then resized to the shape of the original image so that we can visualize which parts of the image contribute to the prediction. Eli5 lets us run Grad-CAM with a single function call. Readers who want to know how Grad-CAM works internally can refer to our other tutorial, which walks through the algorithm step by step using PyTorch.
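To make the description above more concrete, here is a minimal NumPy sketch of the core Grad-CAM computation for a single image. It is only an illustration under stated assumptions, not Eli5's implementation: the function name gradcam_heatmap is hypothetical, and the activations and gradients are random stand-ins for values that a real network would provide.

```python
import numpy as np

def gradcam_heatmap(activations, gradients):
    """Illustrative Grad-CAM heatmap for one image (hypothetical helper).

    activations: (H, W, K) output of the chosen convolution layer.
    gradients:   (H, W, K) gradient of the class score w.r.t. those activations.
    """
    # One weight per channel: the gradient averaged over all spatial positions.
    weights = gradients.mean(axis=(0, 1))                           # shape (K,)
    # Weighted sum of the K activation maps.
    heatmap = np.tensordot(activations, weights, axes=([2], [0]))   # shape (H, W)
    # ReLU keeps only the regions that increase the class score.
    return np.maximum(heatmap, 0.0)

# Random stand-ins for a layer with 8 channels on a 28x28 feature map.
rng = np.random.default_rng(0)
activations = rng.random((28, 28, 8))
gradients = rng.normal(size=(28, 28, 8))

heatmap = gradcam_heatmap(activations, gradients)
print(heatmap.shape, heatmap.min())
```

The heatmap has the spatial size of the convolution layer's output; a library then typically resizes it to the input image before overlaying it.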
In this tutorial, we have trained a simple convolutional neural network on the Fashion MNIST dataset. Then, we have explained the predictions of the network using Eli5's Grad-CAM implementation.
Below, we have imported the necessary libraries and printed the versions that we have used in our tutorial.
Please NOTE that we have disabled TensorFlow eager execution, because Eli5's Grad-CAM implementation does not work with eager mode enabled.
import tensorflow
tensorflow.compat.v1.disable_eager_execution()
from tensorflow import keras
print("Keras Version : {}".format(keras.__version__))
import eli5
print("Eli5 Version : {}".format(eli5.__version__))
Below, we have loaded the Fashion MNIST dataset available from keras. The dataset has grayscale images of shape (28,28) pixels for 10 different fashion items. The dataset is already divided into the train (60k images) and test (10k images) sets. Below we have included a table that has a mapping from target class index to target class name.
Label | Description |
---|---|
0 | T-shirt/top |
1 | Trouser |
2 | Pullover |
3 | Dress |
4 | Coat |
5 | Sandal |
6 | Shirt |
7 | Sneaker |
8 | Bag |
9 | Ankle boot |
from tensorflow import keras
import numpy as np
(X_train, Y_train), (X_test, Y_test) = keras.datasets.fashion_mnist.load_data()
#(X_train, Y_train), (X_test, Y_test) = keras.datasets.cifar10.load_data()
X_train, X_test = X_train.reshape(-1,28,28,1), X_test.reshape(-1,28,28,1)
X_train, X_test = X_train/255.0, X_test/255.0
classes = np.unique(Y_train)
class_labels = ["T-shirt/top","Trouser","Pullover","Dress","Coat","Sandal","Shirt","Sneaker","Bag","Ankle boot"]
#class_labels = ["airplane","automobile","bird","cat","deer","dog","frog","horse","ship","truck"]
mapping = dict(zip(classes, class_labels))
X_train.shape, X_test.shape, Y_train.shape, Y_test.shape
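The two preprocessing steps above (adding a channel axis and scaling pixel values) can be tried in isolation. The sketch below uses a small synthetic array as a stand-in for the real dataset, so the pixel values are made up; only the shapes and value ranges are the point.

```python
import numpy as np

# Synthetic stand-in for a few Fashion MNIST images: uint8 pixels, shape (N, 28, 28).
rng = np.random.default_rng(0)
raw = rng.integers(0, 256, size=(4, 28, 28), dtype=np.uint8)

# Add a trailing channel axis so each image has shape (28, 28, 1), as Conv2D layers expect.
images = raw.reshape(-1, 28, 28, 1)

# Scale pixel values from the [0, 255] range to the [0, 1] range.
images = images / 255.0

print(images.shape, images.dtype, images.min(), images.max())
```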
In this section, we have designed a small convolutional neural network to classify images of the Fashion MNIST dataset. The network has 3 convolution layers and one dense layer. The three convolution layers have 32, 16, and 8 output channels respectively. All of them apply kernels of size (3,3) with "same" padding, so the spatial size of the data stays (28,28), and all of them apply relu activation to their output. The output of the third convolution layer is flattened and fed to a dense layer that has 10 output units (one per target class). The dense layer has softmax activation, which transforms its output into class probabilities.
After defining the network, we have compiled it to use the Adam optimizer, sparse categorical cross-entropy loss, and the accuracy metric.
At last, we have trained the network for 8 epochs with a batch size of 256, using the test set as validation data. We can notice from the accuracy printed after each epoch that our model is doing a good job at classifying images.
from tensorflow.keras.models import Sequential
from tensorflow.keras import layers
model = Sequential([
    layers.Input(shape=X_train.shape[1:]),
    layers.Conv2D(filters=32, kernel_size=(3,3), padding="same", activation="relu"),
    layers.Conv2D(filters=16, kernel_size=(3,3), padding="same", activation="relu"),
    layers.Conv2D(filters=8, kernel_size=(3,3), padding="same", activation="relu"),
    layers.Flatten(),
    layers.Dense(len(classes), activation="softmax")
])
model.summary()
model.compile("adam", "sparse_categorical_crossentropy", ["accuracy"])
model.fit(X_train, Y_train, batch_size=256, epochs=8, validation_data=(X_test, Y_test))
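As a sanity check on the architecture described in this section, the number of trainable parameters can be worked out by hand and compared with what model.summary() prints. The helper names below are ours, introduced only for this illustration.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    # Each filter has kernel_h * kernel_w * in_channels weights plus one bias.
    return (kernel_h * kernel_w * in_channels + 1) * filters

def dense_params(in_units, out_units):
    # Each output unit has one weight per input unit plus one bias.
    return (in_units + 1) * out_units

conv1 = conv2d_params(3, 3, 1, 32)     # 320
conv2 = conv2d_params(3, 3, 32, 16)    # 4624
conv3 = conv2d_params(3, 3, 16, 8)     # 1160

# padding="same" keeps the 28x28 spatial size, so Flatten yields 28 * 28 * 8 units.
flat_units = 28 * 28 * 8               # 6272
dense = dense_params(flat_units, 10)   # 62730

total = conv1 + conv2 + conv3 + dense
print(total)                           # 68834
```

Most of the parameters sit in the dense layer, because the flattened feature map is large compared with the small convolution kernels.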
In this section, we have evaluated the performance of our network by calculating the accuracy, confusion matrix, and classification report on the test predictions. We can notice from the classification report and confusion matrix that our model is doing a good job for all categories except Shirt, for which the accuracy is quite low compared to other categories.
We have calculated all metrics using functions available from scikit-learn (accuracy_score(), confusion_matrix(), and classification_report()). We have covered the ML metrics available from sklearn in detail in a separate tutorial.
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report
Y_test_preds = model.predict(X_test)
Y_test_preds = np.argmax(Y_test_preds, axis=1)
print("Test Accuracy : {}".format(accuracy_score(Y_test, Y_test_preds)))
print("\nConfusion Matrix : ")
print(confusion_matrix(Y_test, Y_test_preds))
print("\nClassification Report :")
print(classification_report(Y_test, Y_test_preds, target_names=class_labels))
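To see how a weak class such as Shirt shows up in these metrics, it helps to compute per-class recall and precision directly from a confusion matrix. The sketch below uses a made-up 3-class matrix, not the results of our model, and follows scikit-learn's convention that rows are actual classes and columns are predicted classes.

```python
import numpy as np

# Hypothetical confusion matrix (rows: actual class, columns: predicted class).
cm = np.array([[90,  5,  5],
               [10, 60, 30],
               [ 2,  8, 90]])

# Recall: correctly predicted samples of a class / all actual samples of that class.
recall = cm.diagonal() / cm.sum(axis=1)

# Precision: correctly predicted samples of a class / all samples predicted as that class.
precision = cm.diagonal() / cm.sum(axis=0)

# Overall accuracy: all correct predictions / all samples.
accuracy = cm.diagonal().sum() / cm.sum()

print(recall, precision, accuracy)
```

In this made-up matrix the second class has a clearly lower recall than the others because many of its samples are predicted as the third class, which is the kind of pattern to look for in the Shirt row of the real confusion matrix.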
In this section, we have explained how we can use the Grad-CAM algorithm available from Eli5 to explain predictions. The Grad-CAM algorithm returns a heatmap that can be overlaid on our original image to show which parts of the image are contributing to the prediction.
In order to use Grad-CAM from Eli5, we need to call the explain_prediction() function available from the keras sub-module of the library. It returns an instance of the Explanation class, which holds the original image and the heatmap. We can pass this object to the format_as_image() function to generate an image that has the Grad-CAM heatmap overlaid on the original image, and then visualize that image.
Below, we have randomly selected a sample from the test data, made a prediction on it, and printed its actual and predicted labels. Then, we have called explain_prediction() with our model and the selected sample, and printed the type, method, and description of the returned Explanation object. Finally, we have called format_as_image() with the explanation object and visualized the resulting image.
import numpy as np
idx = np.random.choice(range(10000))
prediction = model.predict(X_test[idx:idx+1]).argmax()
print("Actual Target : {}".format(mapping[Y_test[idx]]))
print("Predicted Target : {}".format(mapping[prediction]))
explanation = eli5.keras.explain_prediction.explain_prediction(model, X_test[idx:idx+1])
print("Explanation Object : {}".format(type(explanation)))
print("Explanation Method : {}".format(explanation.method))
print("Explanation Description : {}".format(explanation.description))
import matplotlib.pyplot as plt
# Generate an image with the Grad-CAM heatmap overlaid on the original image.
image = eli5.format_as_image(explanation)
def show_image(image):
    # Display the image without axis ticks.
    fig = plt.figure(figsize=(6,6))
    plt.imshow(image)
    plt.xticks([],[]); plt.yticks([],[])
    plt.title("Image Overlaid with Grad-CAM Heatmap")
show_image(image)
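Conceptually, overlaying a heatmap on an image is a per-pixel alpha blend. The NumPy sketch below illustrates that idea only; it is not how format_as_image() is implemented, the function name overlay_heatmap is hypothetical, and its alpha_limit argument merely mimics the role of a maximum overlay opacity.

```python
import numpy as np

def overlay_heatmap(gray_image, heatmap, alpha_limit=0.9):
    """Blend a red overlay onto a grayscale image (illustrative helper).

    gray_image: (H, W) array with values in [0, 1].
    heatmap:    (H, W) array with non-negative values.
    """
    # Normalize the heatmap to [0, 1]; an all-zero heatmap stays all zero.
    peak = heatmap.max()
    norm = heatmap / peak if peak > 0 else np.zeros_like(heatmap, dtype=float)
    # Per-pixel opacity of the overlay, capped at alpha_limit.
    alpha = (norm * alpha_limit)[..., np.newaxis]                # shape (H, W, 1)
    # Turn the grayscale image into RGB by repeating the single channel.
    base = np.repeat(gray_image[..., np.newaxis], 3, axis=2)     # shape (H, W, 3)
    red = np.array([1.0, 0.0, 0.0])
    # Standard alpha blending of the red overlay with the base image.
    return (1.0 - alpha) * base + alpha * red

demo_image = np.zeros((2, 2))
demo_heatmap = np.array([[0.0, 1.0], [2.0, 4.0]])
blended = overlay_heatmap(demo_image, demo_heatmap)
print(blended.shape)
```

Pixels where the heatmap is zero keep their original color, while the pixel with the largest heatmap value is tinted with opacity alpha_limit.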
In the below cell, we have called format_as_image() function again to generate images with a different colormap.
import matplotlib
from PIL import Image
image = eli5.format_as_image(explanation, colormap=matplotlib.cm.Reds, alpha_limit=0.9)
show_image(image)
In the below cell, we have called format_as_image() function with a resampling filter to check whether it helps improve results.
from PIL import Image
image = eli5.format_as_image(explanation, resampling_filter=Image.NEAREST,
colormap=matplotlib.cm.Reds, alpha_limit=0.9)
show_image(image)
By default, the explain_prediction() function calculates the heatmap using the last convolution layer. We can instead pass the name of another convolution layer through the layer parameter to generate the heatmap with respect to that layer.
In the first cell below, we have generated and visualized an image with respect to the second-to-last convolution layer ("conv2d_1").
In the second cell, we have done the same with respect to the first convolution layer ("conv2d").
import matplotlib.pyplot as plt
explanation = eli5.keras.explain_prediction.explain_prediction(model, X_test[idx:idx+1], layer="conv2d_1")
image = eli5.format_as_image(explanation)
show_image(image)
import matplotlib.pyplot as plt
explanation = eli5.keras.explain_prediction.explain_prediction(model, X_test[idx:idx+1], layer="conv2d")
image = eli5.format_as_image(explanation)
show_image(image)
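The layer names used above ("conv2d" and "conv2d_1") are generated automatically by Keras, and the numeric suffix keeps increasing if the model is rebuilt in the same session, so hard-coded names may not match on a re-run. A small helper like the one below, which is our own sketch and not part of Eli5, can look the names up instead. It only relies on the model having a layers list whose items have a name attribute.

```python
def conv_layer_names(model):
    """Return the names of the Conv2D layers of a Keras-style model, in order."""
    return [layer.name for layer in model.layers
            if type(layer).__name__ == "Conv2D"]

# Example usage with the model defined earlier in the tutorial:
# names = conv_layer_names(model)
# explanation = eli5.keras.explain_prediction.explain_prediction(
#     model, X_test[idx:idx+1], layer=names[-2])
```

Indexing the returned list (for example names[0] for the first convolution layer) selects a layer by position regardless of the generated names.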
Apart from explain_prediction(), Eli5 provides another function named explain_prediction_keras() that accepts the same arguments and works the same way for Keras models. Below, we have included one example showing its usage.
import matplotlib.pyplot as plt
explanation = eli5.keras.explain_prediction.explain_prediction_keras(model, X_test[idx:idx+1], layer="conv2d")
image = eli5.format_as_image(explanation)
show_image(image)
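Eli5 also lets us run the Grad-CAM algorithm separately and generate the heatmap ourselves. It provides two functions for that purpose: gradcam_backend(), which computes the weights, activations, and gradients for a given model, input, target class, and activation layer, and gradcam(), which combines the weights and activations into a heatmap.
Below, we have explained with a simple example how we can use them. After generating a heatmap, we have visualized it next to the original image for comparison purposes.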
# Both functions live in the eli5.keras.gradcam module.
weights, activations, gradients, predicted_idx, predicted_val = eli5.keras.gradcam.gradcam_backend(
    model, X_test[idx:idx+1],
    targets=[int(Y_test[idx])],
    activation_layer=model.get_layer("conv2d"))
heatmap = eli5.keras.gradcam.gradcam(weights, activations)
import matplotlib.pyplot as plt
fig = plt.figure(figsize=(8,8))
ax1 = fig.add_subplot(121)
ax1.imshow(X_test[idx], cmap="gray");
ax1.set_title("Actual Image");
ax1.set_xticks([],[]); ax1.set_yticks([],[]);
ax2 = fig.add_subplot(122)
ax2.imshow(heatmap, cmap="Greens");
ax2.set_title("GradCAM Heatmap");
ax2.set_xticks([],[]); ax2.set_yticks([],[]);
This ends our small tutorial explaining how we can use the Grad-CAM implementation available through Eli5 to explain the predictions of Keras networks.