Updated On : Mar-22,2022 Time Investment : ~30 mins

Eli5: Explain Image Classifier Predictions Using Grad-CAM (Keras)

Eli5 is one of the most commonly used libraries for interpreting the predictions of machine learning models. It lets us interpret predictions of models created using scikit-learn, XGBoost, LightGBM, CatBoost, lightning, sklearn-crfsuite, and keras. We have covered Eli5 for scikit-learn models in a separate, detailed tutorial (link below).

How to Use eli5 to Understand sklearn Model Predictions?

As a part of this tutorial, we'll concentrate on keras models. For keras, Eli5 currently supports only image classifiers. It explains the predictions of an image classifier using the Grad-CAM (Gradient-weighted Class Activation Mapping) algorithm. Grad-CAM combines the output of the last convolution layer with the gradients of the predicted class score with respect to that output to build a heatmap with the same shape as the original image. Visualizing the heatmap shows which parts of the image contribute to the prediction. Eli5 lets us use Grad-CAM with only one function call. If you want to know how Grad-CAM works internally, we recommend another tutorial where we have given a step-by-step guide to Grad-CAM using PyTorch.

PyTorch: Grad-CAM
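For readers who want a quick intuition before following the link, below is a minimal NumPy sketch of the core Grad-CAM computation. It is not Eli5's code: the function name gradcam_heatmap and the random stand-in arrays are our own assumptions, and the final scaling to [0, 1] is only a display convenience.

```python
import numpy as np

def gradcam_heatmap(activations, gradients):
    """Illustrative Grad-CAM heatmap (hypothetical helper, not Eli5's API).

    activations, gradients: arrays of shape (height, width, channels) holding a
    convolution layer's output and the class-score gradients w.r.t. that output.
    """
    # Channel weights: average the gradients over the two spatial axes.
    weights = gradients.mean(axis=(0, 1))                 # shape (channels,)
    # Weighted sum of the feature maps, then ReLU to keep positive evidence only.
    heatmap = np.maximum((activations * weights).sum(axis=-1), 0.0)
    # Scale to [0, 1] for display (assumption: not part of the algorithm itself).
    peak = heatmap.max()
    return heatmap / peak if peak > 0 else heatmap

rng = np.random.default_rng(0)
acts = rng.random((28, 28, 8))          # stand-in for conv layer outputs
grads = rng.normal(size=(28, 28, 8))    # stand-in for d(score)/d(outputs)
heatmap = gradcam_heatmap(acts, grads)
print(heatmap.shape)
```

In a real network the two arrays come from a forward and a backward pass; here they are random, so only the shapes and the arithmetic are meaningful.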

In this tutorial, we have trained a simple convolutional neural network on the Fashion MNIST dataset. Then, we have explained the predictions of the network using Eli5's Grad-CAM implementation.

Below, we have listed the important sections of the tutorial to give an overview of the material covered.

Important Sections Of Tutorial

Load Dataset
Define And Train CNN
Evaluate Network Performance
Explain Prediction Using Eli5 Grad-CAM Implementation

Below, we have imported the necessary libraries and printed the versions that we have used in our tutorial.

Note that we have disabled TensorFlow eager execution because Eli5's Grad-CAM implementation does not work with eager mode enabled.

import tensorflow
tensorflow.compat.v1.disable_eager_execution()

from tensorflow import keras

print("Keras Version : {}".format(keras.__version__))

Keras Version : 2.6.0

import eli5

print("Eli5 Version : {}".format(eli5.__version__))

Eli5 Version : 0.11.0

1. Load Dataset

Below, we have loaded the Fashion MNIST dataset available from keras. The dataset has 28x28-pixel grayscale images of 10 different fashion items. It is already divided into train (60k images) and test (10k images) sets. The table below maps each target class index to its class name.

Label	Description
0	T-shirt/top
1	Trouser
2	Pullover
3	Dress
4	Coat
5	Sandal
6	Shirt
7	Sneaker
8	Bag
9	Ankle boot

from tensorflow import keras
import numpy as np

(X_train, Y_train), (X_test, Y_test) = keras.datasets.fashion_mnist.load_data()
#(X_train, Y_train), (X_test, Y_test) = keras.datasets.cifar10.load_data()

X_train, X_test = X_train.reshape(-1,28,28,1), X_test.reshape(-1,28,28,1)

X_train, X_test = X_train/255.0, X_test/255.0

classes =  np.unique(Y_train)
class_labels = ["T-shirt/top","Trouser","Pullover","Dress","Coat","Sandal","Shirt","Sneaker","Bag","Ankle boot"]
#class_labels = ["airplane","automobile","bird","cat","deer","dog","frog","horse","ship","truck"]
mapping = dict(zip(classes, class_labels))

X_train.shape, X_test.shape, Y_train.shape, Y_test.shape

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/train-labels-idx1-ubyte.gz
32768/29515 [=================================] - 0s 0us/step
40960/29515 [=========================================] - 0s 0us/step
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/train-images-idx3-ubyte.gz
26427392/26421880 [==============================] - 0s 0us/step
26435584/26421880 [==============================] - 0s 0us/step
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/t10k-labels-idx1-ubyte.gz
16384/5148 [===============================================================================================] - 0s 0us/step
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/t10k-images-idx3-ubyte.gz
4423680/4422102 [==============================] - 0s 0us/step
4431872/4422102 [==============================] - 0s 0us/step

((60000, 28, 28, 1), (10000, 28, 28, 1), (60000,), (10000,))
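The reshape and scaling steps above are easy to check on a small synthetic batch. The sketch below uses random data in place of the downloaded images (the array raw is an assumption for illustration only) and shows that reshape(-1,28,28,1) only adds a channel axis, while dividing by 255.0 converts the pixels to floats in the range [0, 1].

```python
import numpy as np

# Synthetic stand-in for a batch of three 28x28 uint8 grayscale images.
raw = np.random.default_rng(0).integers(0, 256, size=(3, 28, 28), dtype=np.uint8)

# Conv2D layers expect a trailing channel axis; -1 lets NumPy infer the batch size.
batch = raw.reshape(-1, 28, 28, 1)

# Dividing by 255.0 promotes the integers to floats and scales them into [0, 1].
batch = batch / 255.0

print(batch.shape, batch.dtype, batch.min(), batch.max())
```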

2. Define And Train CNN

In this section, we have designed a small convolutional neural network to classify Fashion MNIST images. The network has three convolution layers and one dense layer. The convolution layers have 32, 16, and 8 output channels respectively. Each applies a (3,3) kernel with "same" padding, which preserves the 28x28 spatial size, followed by relu activation. The output of the third convolution layer is flattened and fed to a dense layer with 10 output units (one per target class). The dense layer uses softmax activation, which transforms its output into probabilities.

After defining the network, we have compiled it to use the Adam optimizer, sparse categorical cross-entropy loss (our targets are integer class indices rather than one-hot vectors), and the accuracy metric.
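To make the loss and metric concrete, below is a small NumPy sketch of softmax, sparse categorical cross-entropy, and accuracy on two made-up samples. The helper functions and the example logits are our own assumptions. Keras's implementation differs in details such as clipping probabilities, so treat this as an illustration of the idea only.

```python
import numpy as np

def softmax(logits):
    # Subtract the row maximum for numerical stability before exponentiating.
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def sparse_categorical_crossentropy(labels, probs):
    # Mean negative log-probability assigned to the true class of each sample.
    picked = probs[np.arange(len(labels)), labels]
    return float(-np.log(picked).mean())

logits = np.array([[2.0, 0.5, 0.1],
                   [0.2, 0.2, 3.0]])   # made-up network outputs for two samples
labels = np.array([0, 2])              # integer class indices, not one-hot vectors

probs = softmax(logits)
loss = sparse_categorical_crossentropy(labels, probs)
accuracy = float((probs.argmax(axis=1) == labels).mean())
print(probs.sum(axis=1), loss, accuracy)
```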

Finally, we have trained the network for 8 epochs with a batch size of 256, passing the test set as validation data. The accuracy printed after each epoch shows that the model is doing a good job at classifying images: validation accuracy rises from 0.8638 after the first epoch to 0.9097 after the last.

from tensorflow.keras.models import Sequential
from tensorflow.keras import layers

model = Sequential([
    layers.Input(shape=X_train.shape[1:]),
    layers.Conv2D(filters=32, kernel_size=(3,3), padding="same", activation="relu"),
    layers.Conv2D(filters=16, kernel_size=(3,3), padding="same", activation="relu"),
    layers.Conv2D(filters=8, kernel_size=(3,3), padding="same", activation="relu"),

    layers.Flatten(),
    layers.Dense(len(classes), activation="softmax")
])

model.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv2d (Conv2D)              (None, 28, 28, 32)        320
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 28, 28, 16)        4624
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 28, 28, 8)         1160
_________________________________________________________________
flatten (Flatten)            (None, 6272)              0
_________________________________________________________________
dense (Dense)                (None, 10)                62730
=================================================================
Total params: 68,834
Trainable params: 68,834
Non-trainable params: 0
_________________________________________________________________
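The parameter counts in the summary above can be reproduced by hand. The two helper functions below are our own (hypothetical) names. They encode the usual rule that each convolution filter has one weight per kernel cell and input channel plus one bias, and each dense unit has one weight per input feature plus one bias.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    # One kernel_h x kernel_w x in_channels kernel plus one bias per filter.
    return (kernel_h * kernel_w * in_channels + 1) * filters

def dense_params(in_features, units):
    # One weight per input feature plus one bias, per output unit.
    return (in_features + 1) * units

counts = [
    conv2d_params(3, 3, 1, 32),      # conv2d
    conv2d_params(3, 3, 32, 16),     # conv2d_1
    conv2d_params(3, 3, 16, 8),      # conv2d_2
    dense_params(28 * 28 * 8, 10),   # dense, fed by the 6272-value flatten output
]
print(counts, sum(counts))  # [320, 4624, 1160, 62730] 68834
```

Most of the parameters sit in the dense layer because "same" padding keeps all 28x28 positions for each of the 8 channels before flattening.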

model.compile("adam", "sparse_categorical_crossentropy", ["accuracy"])

model.fit(X_train, Y_train, batch_size=256, epochs=8, validation_data=(X_test, Y_test))

Train on 60000 samples, validate on 10000 samples

2022-03-23 07:53:45.383517: I tensorflow/core/common_runtime/process_util.cc:146] Creating new thread pool with default inter op setting: 2. Tune using inter_op_parallelism_threads for best performance.

Epoch 1/8
59904/60000 [============================>.] - ETA: 0s - loss: 0.5507 - accuracy: 0.8072

/opt/conda/lib/python3.7/site-packages/keras/engine/training.py:2470: UserWarning: `Model.state_updates` will be removed in a future version. This property should not be used in TensorFlow 2.0, as `updates` are applied automatically.
  warnings.warn('`Model.state_updates` will be removed in a future version. '

60000/60000 [==============================] - 23s 380us/sample - loss: 0.5503 - accuracy: 0.8073 - val_loss: 0.3916 - val_accuracy: 0.8638
Epoch 2/8
60000/60000 [==============================] - 23s 376us/sample - loss: 0.3410 - accuracy: 0.8794 - val_loss: 0.3370 - val_accuracy: 0.8798
Epoch 3/8
60000/60000 [==============================] - 22s 366us/sample - loss: 0.3017 - accuracy: 0.8927 - val_loss: 0.3269 - val_accuracy: 0.8797
Epoch 4/8
60000/60000 [==============================] - 22s 373us/sample - loss: 0.2756 - accuracy: 0.9013 - val_loss: 0.3033 - val_accuracy: 0.8906
Epoch 5/8
60000/60000 [==============================] - 23s 379us/sample - loss: 0.2532 - accuracy: 0.9083 - val_loss: 0.3003 - val_accuracy: 0.8928
Epoch 6/8
60000/60000 [==============================] - 22s 370us/sample - loss: 0.2371 - accuracy: 0.9153 - val_loss: 0.2757 - val_accuracy: 0.9015
Epoch 7/8
60000/60000 [==============================] - 24s 402us/sample - loss: 0.2228 - accuracy: 0.9202 - val_loss: 0.2658 - val_accuracy: 0.9086
Epoch 8/8
60000/60000 [==============================] - 23s 390us/sample - loss: 0.2112 - accuracy: 0.9244 - val_loss: 0.2587 - val_accuracy: 0.9097

<keras.callbacks.History at 0x7fe734e0eb50>

3. Evaluate Network Performance

In this section, we have evaluated the performance of our network by calculating the accuracy, confusion matrix, and classification report on test predictions. The classification report and confusion matrix show that the model does a good job for all categories except Shirt, whose precision and recall (both 0.74) are well below those of the other categories. Shirts are most often confused with T-shirt/top.

We have calculated all metrics using various functions available from scikit-learn. Please feel free to check the below link if you want to know about various ML metrics available from sklearn in detail.

Scikit-Learn - Model Evaluation & Scoring Metrics

from sklearn.metrics import accuracy_score, confusion_matrix, classification_report

Y_test_preds = model.predict(X_test)
Y_test_preds = np.argmax(Y_test_preds, axis=1)

print("Test Accuracy : {}".format(accuracy_score(Y_test, Y_test_preds)))
print("\nConfusion Matrix : ")
print(confusion_matrix(Y_test, Y_test_preds))
print("\nClassification Report :")
print(classification_report(Y_test, Y_test_preds, target_names=class_labels))

Test Accuracy : 0.9097

Confusion Matrix :
[[864   0  16  15   4   2  94   0   5   0]
 [  1 982   0  10   4   0   3   0   0   0]
 [ 24   2 841   8  49   0  73   0   3   0]
 [ 19   4  13 912  22   1  26   0   3   0]
 [  2   0  50  23 868   0  56   0   1   0]
 [  0   0   0   1   0 981   0  12   0   6]
 [110   2  49  27  59   0 745   0   8   0]
 [  0   0   0   0   0  13   0 960   1  26]
 [  2   1   1   4   2   4   4   3 979   0]
 [  1   0   0   0   0   5   0  29   0 965]]

Classification Report :
              precision    recall  f1-score   support

 T-shirt/top       0.84      0.86      0.85      1000
     Trouser       0.99      0.98      0.99      1000
    Pullover       0.87      0.84      0.85      1000
       Dress       0.91      0.91      0.91      1000
        Coat       0.86      0.87      0.86      1000
      Sandal       0.98      0.98      0.98      1000
       Shirt       0.74      0.74      0.74      1000
     Sneaker       0.96      0.96      0.96      1000
         Bag       0.98      0.98      0.98      1000
  Ankle boot       0.97      0.96      0.97      1000

    accuracy                           0.91     10000
   macro avg       0.91      0.91      0.91     10000
weighted avg       0.91      0.91      0.91     10000
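The per-class numbers in the report can be derived directly from the confusion matrix. Below is a short NumPy sketch that copies the matrix printed above (rows are actual classes, columns are predicted classes) and recomputes the accuracy, precision, and recall. The variable names are our own, and scikit-learn remains the reference implementation.

```python
import numpy as np

class_labels = ["T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
                "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"]

# Confusion matrix copied from the output above.
cm = np.array([
    [864,   0,  16,  15,   4,   2,  94,   0,   5,   0],
    [  1, 982,   0,  10,   4,   0,   3,   0,   0,   0],
    [ 24,   2, 841,   8,  49,   0,  73,   0,   3,   0],
    [ 19,   4,  13, 912,  22,   1,  26,   0,   3,   0],
    [  2,   0,  50,  23, 868,   0,  56,   0,   1,   0],
    [  0,   0,   0,   1,   0, 981,   0,  12,   0,   6],
    [110,   2,  49,  27,  59,   0, 745,   0,   8,   0],
    [  0,   0,   0,   0,   0,  13,   0, 960,   1,  26],
    [  2,   1,   1,   4,   2,   4,   4,   3, 979,   0],
    [  1,   0,   0,   0,   0,   5,   0,  29,   0, 965],
])

accuracy = cm.trace() / cm.sum()             # correct predictions / all predictions
recall = cm.diagonal() / cm.sum(axis=1)      # correct / actual count, per class
precision = cm.diagonal() / cm.sum(axis=0)   # correct / predicted count, per class

worst = int(recall.argmin())
# Second-largest entry in the worst class's row is its most common wrong prediction.
confused_with = class_labels[int(np.argsort(cm[worst])[-2])]
print(accuracy, class_labels[worst], recall[worst], precision[worst], confused_with)
```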

4. Explain Prediction Using Eli5 Grad-CAM Implementation

In this section, we have explained how to use the Grad-CAM algorithm available from Eli5 to explain predictions. Grad-CAM returns a heatmap that can be overlaid on the original image to show which parts of the image contribute to the prediction.

To use Grad-CAM from Eli5, we call the explain_prediction() function available from the keras sub-module of Eli5. It returns an instance of the Explanation class, which we can pass to the format_as_image() function to generate an image with the Grad-CAM heatmap overlaid on the original image. We can then visualize the returned image to see the results.

Below, we have randomly selected a sample from the test data and made a prediction on it, then printed the actual and predicted labels. The sample is passed as X_test[idx:idx+1] so that it keeps its batch dimension. Next, we have called explain_prediction() with our model and the selected sample. The function runs Grad-CAM internally and returns an Explanation object that contains the original image and the heatmap. Finally, we have called Eli5's format_as_image() with the explanation object to generate the final image, which has the Grad-CAM heatmap overlaid on our original image, and visualized it.

import numpy as np

idx = np.random.choice(range(10000))

prediction = model.predict(X_test[idx:idx+1]).argmax()
print("Actual Target    : {}".format(mapping[Y_test[idx]]))
print("Predicted Target : {}".format(mapping[prediction]))

explanation = eli5.keras.explain_prediction.explain_prediction(model, X_test[idx:idx+1])

Actual Target    : Sneaker
Predicted Target : Sneaker

print("Explanation Object : {}".format(type(explanation)))
print("Explanation Method : {}".format(explanation.method))
print("Explanation Description : {}".format(explanation.description))

Explanation Object : <class 'eli5.base.Explanation'>
Explanation Method : Grad-CAM
Explanation Description : Grad-CAM visualization for image classification;
output is explanation object that contains input image
and heatmap image for a target.

import matplotlib.pyplot as plt

image = eli5.format_as_image(explanation)

def show_image(image):
    # Display the given image without axis ticks.
    fig = plt.figure(figsize=(6,6))
    plt.imshow(image)
    plt.xticks([],[]); plt.yticks([],[])
    plt.title("Image Overlaid with Grad-CAM Heatmap")

show_image(image)

In the below cell, we have called format_as_image() again to generate the image with a different colormap (Reds) and an alpha_limit of 0.9.

import matplotlib
from PIL import Image

image = eli5.format_as_image(explanation, colormap=matplotlib.cm.Reds, alpha_limit=0.9)

show_image(image)

In the below cell, we have called format_as_image() with a nearest-neighbor resampling filter (Image.NEAREST) to check whether it helps improve results.

from PIL import Image

image = eli5.format_as_image(explanation, resampling_filter=Image.NEAREST,
                             colormap=matplotlib.cm.Reds, alpha_limit=0.9)

show_image(image)
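To give a feel for what the colormap, alpha_limit, and resampling options control, below is a rough NumPy sketch of overlaying a heatmap on a grayscale image. This is not Eli5's implementation: the function overlay_heatmap, the fixed red color, and the requirement that the image size be a multiple of the heatmap size are all simplifying assumptions of ours.

```python
import numpy as np

def overlay_heatmap(image, heatmap, alpha_limit=0.9):
    """Blend a [0, 1] heatmap over a [0, 1] grayscale image (illustrative only)."""
    h, w = image.shape
    # Nearest-neighbor upsampling: repeat each heatmap cell to match the image size.
    ry, rx = h // heatmap.shape[0], w // heatmap.shape[1]
    resized = np.kron(heatmap, np.ones((ry, rx)))
    # Per-pixel opacity follows the heatmap value but is capped at alpha_limit.
    alpha = np.minimum(resized, alpha_limit)[..., None]
    # Turn the grayscale image into RGB and blend it with a solid red layer.
    base = np.repeat(image[..., None], 3, axis=-1)
    red = np.zeros_like(base)
    red[..., 0] = 1.0
    return (1.0 - alpha) * base + alpha * red

image = np.zeros((4, 4))                       # tiny all-black stand-in image
heatmap = np.array([[0.0, 1.0],
                    [0.5, 0.0]])               # tiny stand-in heatmap
blended = overlay_heatmap(image, heatmap, alpha_limit=0.9)
print(blended.shape, blended[0, 2], blended[2, 0])
```

Capping the opacity keeps the original image faintly visible even where the heatmap is strongest, which is the practical effect of lowering alpha_limit.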

By default, explain_prediction() calculates the heatmap using the last convolution layer. We can instead pass the name of another convolution layer through the layer parameter to generate the heatmap with respect to that layer.

In the below cell, we have generated and visualized an image with respect to the second-to-last convolution layer (conv2d_1).

In the cell after that, we have done the same with respect to the first convolution layer (conv2d).

import matplotlib.pyplot as plt

explanation = eli5.keras.explain_prediction.explain_prediction(model, X_test[idx:idx+1], layer="conv2d_1")

image = eli5.format_as_image(explanation)

show_image(image)

import matplotlib.pyplot as plt

explanation = eli5.keras.explain_prediction.explain_prediction(model, X_test[idx:idx+1], layer="conv2d")

image = eli5.format_as_image(explanation)

show_image(image)

Apart from explain_prediction(), Eli5 provides another function named explain_prediction_keras() which works exactly like it. Below, we have included one example showing its usage.

import matplotlib.pyplot as plt

explanation = eli5.keras.explain_prediction.explain_prediction_keras(model, X_test[idx:idx+1], layer="conv2d")

image = eli5.format_as_image(explanation)

show_image(image)

Eli5 also lets us execute the Grad-CAM algorithm separately and generate a heatmap ourselves. It provides two functions for that purpose.

gradcam_backend(model, sample, targets=None, activation_layer=None) - This function takes as input the model, the selected sample, the target class index, and the keras layer with respect to which Grad-CAM should be executed. It returns the weights of the selected layer, the activated output of the layer, the gradients, the predicted index, and the predicted value.
gradcam(weights, activations) - This function takes the weights and activations returned by gradcam_backend() and executes the Grad-CAM algorithm to generate the heatmap.

Below, we have explained with a simple example how we can use them. After generating a heatmap, we have visualized it next to the original image for comparison purposes.

weights, activations, gradients, predicted_idx, predicted_val = eli5.keras.gradcam_backend(
    model,
    X_test[idx:idx+1],
    targets=[int(Y_test[idx])],
    activation_layer=model.get_layer("conv2d"),
)

heatmap = eli5.keras.gradcam(weights, activations)

import matplotlib.pyplot as plt

fig = plt.figure(figsize=(8,8))
ax1 = fig.add_subplot(121)
ax1.imshow(X_test[idx], cmap="gray");
ax1.set_title("Actual Image");
ax1.set_xticks([],[]); ax1.set_yticks([],[]);

ax2 = fig.add_subplot(122)
ax2.imshow(heatmap, cmap="Greens");
ax2.set_title("GradCAM Heatmap");
ax2.set_xticks([],[]); ax2.set_yticks([],[]);

This ends our small tutorial explaining how to use the Grad-CAM implementation for keras networks available through Eli5.

Sunny Solanki
