Interpreting the predictions of our ML model helps us better understand whether the model has generalized or not. It also reveals which tasks the model handles well and where it gets confused, so that we can make informed changes to further improve performance. We can easily generate feature importances for simple ML models like linear regression, decision trees, random forests, etc. But when it comes to deep neural networks, it becomes hard to determine which features contribute to a prediction. LIME (Local Interpretable Model-agnostic Explanations) is an algorithm that helps us solve this problem. It explains a prediction of our deep network by training a simple ML model (like a decision tree or linear regression) on perturbed copies of the input sample; this simple model tries to mimic the predictions of our network locally. We have covered in detail how LIME works internally in a separate tutorial. Please feel free to check the below link for a step-by-step walkthrough of its internal workings.
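To make the idea concrete, below is a heavily simplified, hypothetical sketch of LIME's core loop for text: randomly mask words of the input sample, query the black-box model on the perturbed copies, and fit a weighted linear surrogate whose coefficients act as word importances. This illustrates the concept only; the lime library's real implementation differs in sampling, distance weighting, and feature selection.

import numpy as np
from sklearn.linear_model import Ridge

def lime_sketch(text, predict_proba, target_label, n_samples=500, rng=None):
    # predict_proba is assumed to accept a list of strings and return an
    # (n_samples, n_classes) array of class probabilities.
    if rng is None:
        rng = np.random.RandomState(0)
    words = text.split()
    # 1. Perturb: randomly drop words from the sample.
    masks = rng.randint(0, 2, size=(n_samples, len(words)))
    masks[0, :] = 1  # keep the original sample intact
    perturbed = [" ".join(w for w, keep in zip(words, m) if keep) for m in masks]
    # 2. Query the black-box model on the perturbed copies.
    probs = np.asarray(predict_proba(perturbed))[:, target_label]
    # 3. Weight samples by closeness to the original (more words kept = closer).
    weights = masks.mean(axis=1)
    # 4. Fit an interpretable local surrogate on the binary word masks.
    surrogate = Ridge(alpha=1.0).fit(masks, probs, sample_weight=weights)
    # Coefficients serve as per-word importances for target_label.
    return sorted(zip(words, surrogate.coef_), key=lambda p: -abs(p[1]))[:10]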
As a part of this tutorial, we have used LIME to explain the predictions made by our Keras text classification network. We have used the 20 newsgroups text dataset available from scikit-learn for our purpose. The tutorial trains models on data vectorized with different text vectorization approaches to see which one does a better job. We recommend that readers go through the below link if they do not have a background in text classification using Keras networks.
Below, we have listed the important sections of the tutorial to give an overview of the material covered.
Below, we have imported the necessary libraries and printed the versions that we have used in our tutorial.
import tensorflow
from tensorflow import keras

print("Tensorflow Version : {}".format(tensorflow.__version__))
print("Keras Version : {}".format(keras.__version__))
In this section, we have vectorized our input text data using the word frequency approach and then trained a network on it. After training, we have evaluated the network's performance by calculating various ML metrics and explained the predictions it made. We have used the TextVectorization layer available from Keras to vectorize the data.
In this section, we have loaded the 20 newsgroups dataset that we'll be using throughout the tutorial. The dataset is available from scikit-learn and has ~18k text documents across 20 different categories. For our example, we have selected the 5 categories listed in the code below. We have loaded the train and test sets using the fetch_20newsgroups() function available from scikit-learn.
import numpy as np
from sklearn import datasets
import gc
all_categories = ['alt.atheism','comp.graphics','comp.os.ms-windows.misc','comp.sys.ibm.pc.hardware',
'comp.sys.mac.hardware','comp.windows.x', 'misc.forsale','rec.autos','rec.motorcycles',
'rec.sport.baseball','rec.sport.hockey','sci.crypt','sci.electronics','sci.med',
'sci.space','soc.religion.christian','talk.politics.guns','talk.politics.mideast',
'talk.politics.misc','talk.religion.misc']
selected_categories = ['misc.forsale','rec.autos','rec.sport.baseball','sci.electronics','soc.religion.christian']
X_train, Y_train = datasets.fetch_20newsgroups(subset="train", categories=selected_categories, return_X_y=True)
X_test , Y_test = datasets.fetch_20newsgroups(subset="test", categories=selected_categories, return_X_y=True)
X_train = np.array(X_train)
X_test = np.array(X_test)
classes = np.unique(Y_train)
mapping = dict(zip(classes, selected_categories))
len(X_train), len(X_test), classes, mapping
In this section, we have adapted the TextVectorization layer available from Keras to our dataset. Calling adapt() populates the vocabulary of the vectorization layer. We'll later use this layer as a part of our Keras network.
text_vectorizer = keras.layers.TextVectorization(max_tokens=50000, standardize="lower_and_strip_punctuation",
split="whitespace", output_mode="count", pad_to_max_tokens=True)
text_vectorizer.adapt(np.concatenate((X_train, X_test)), batch_size=512)
vocab = text_vectorizer.get_vocabulary()
print("Vocab : {}".format(vocab[:10]))
print("Vocab Size : {}".format(text_vectorizer.vocabulary_size()))
out = text_vectorizer(X_train[:5])
print("Output Shape : {}".format(out.shape))
In this section, we have created the Keras network that we'll use to classify text documents. The network consists of the text vectorization layer that we adapted in the previous section (to populate its vocabulary), followed by 3 dense layers with 128, 64, and 5 units respectively. The first two dense layers have relu activation and the last dense layer has softmax activation. During a forward pass, the text vectorization layer first vectorizes the text data and then hands the vectorized output to the dense layers. After creating the network, we have also summarized it to show the network parameters.
Then, we have compiled the network to use the Adam optimizer, cross entropy loss, and the accuracy metric.
At last, we have called the fit() method on the model to train it on the train data for 8 epochs with a batch size of 256. We have also provided validation data to monitor the model's accuracy on it.
from tensorflow.keras.models import Sequential
from tensorflow.keras import layers
def create_model(text_vectorizer):
    return Sequential([
        layers.Input(shape=(1,), dtype="string"),
        text_vectorizer,
        layers.Dense(128, activation="relu"),
        layers.Dense(64, activation="relu"),
        layers.Dense(len(classes), activation="softmax"),
    ])
model = create_model(text_vectorizer)
model.summary()
model.compile("adam", "sparse_categorical_crossentropy", metrics=["accuracy"])
history = model.fit(X_train, Y_train, batch_size=256, epochs=8, validation_data=(X_test, Y_test))
gc.collect()
In this section, we have evaluated the performance of the network by calculating accuracy, a classification report, and a confusion matrix on the test dataset. We can notice from the results that our model seems to be doing a good job as per the accuracy metric. As per the confusion matrix, the model confuses some samples among the categories misc.forsale, rec.autos, and sci.electronics.
We have calculated the various ML metrics using functions available from scikit-learn. Please feel free to check the below link if you want to learn about them and the other metrics available through scikit-learn in detail.
Two cells below, we have also plotted the confusion matrix using the scikit-plot Python library, which provides visualizations for many ML metrics. Please feel free to check the below link if you want to learn about it.
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
train_preds = model.predict(X_train)
test_preds = model.predict(X_test)
print("Train Accuracy : {}".format(accuracy_score(Y_train, np.argmax(train_preds, axis=1))))
print("Test Accuracy : {}".format(accuracy_score(Y_test, np.argmax(test_preds, axis=1))))
print("\nClassification Report : ")
print(classification_report(Y_test, np.argmax(test_preds, axis=1), target_names=selected_categories))
print("\nConfusion Matrix : ")
print(confusion_matrix(Y_test, np.argmax(test_preds, axis=1)))
import scikitplot as skplt
import matplotlib.pyplot as plt
skplt.metrics.plot_confusion_matrix([selected_categories[i] for i in Y_test], [selected_categories[i] for i in np.argmax(test_preds, axis=1)],
normalize=True,
title="Confusion Matrix",
cmap="Purples",
hide_zeros=True,
figsize=(5,5)
);
plt.xticks(rotation=90);
In this section, we have explained the predictions made by our model using the LIME algorithm. The lime library provides LimeTextExplainer, which explains a prediction by creating a visualization that highlights the words contributing to the predicted category.
In order to explain predictions using lime, we first need to create an instance of LimeTextExplainer. Then, we need to call explain_instance() method on it with a sample to explain. The method returns an instance of Explanation on which we can call show_in_notebook() method to create visualization explaining prediction.
Below, we have first created an instance of LimeTextExplainer that we'll use to explain predictions. We have given it labels of all the output categories of our model.
We recommend that readers go through another simple tutorial on lime where we have covered the arguments of the LimeTextExplainer() constructor.
from lime import lime_text
explainer = lime_text.LimeTextExplainer(class_names=selected_categories, verbose=True)
explainer
In this section, we have explained a correct prediction made by our model. We have first randomly selected one sample from the test dataset. The category of the selected sample is soc.religion.christian and the same is predicted by our model.
Then, we have called the explain_instance() method on the LimeTextExplainer instance, giving it the text sample, a reference to the model's predict() function, and the actual label of the sample. This returns an Explanation instance holding the explanation details.
At last, we have called the show_in_notebook() method on the explanation instance to show the explanation. The visualization has a bar chart showing which words contributed positively to the prediction categories, as well as the original text of the sample with words highlighted to show whether they contributed positively or negatively to the predicted label. The words are shaded based on the strength of their contribution. We can notice from the visualization that words like 'church', 'homosexual', 'churches', 'group', etc. contribute to predicting the category soc.religion.christian.
import numpy as np
rng = np.random.RandomState(42)
idx = rng.randint(1, len(X_test))
print("Prediction : ", selected_categories[model.predict(X_test[idx:idx+1]).argmax(axis=-1)[0]])
print("Actual : ", selected_categories[Y_test[idx]])
explanation = explainer.explain_instance(X_test[idx], classifier_fn=model.predict, labels=Y_test[idx:idx+1])
explanation
explanation.show_in_notebook()
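If you prefer the word weights as plain Python data rather than a visualization, the Explanation object also exposes an as_list() method. A small sketch, assuming the label we explained above:

# Retrieve (word, weight) pairs for the explained label.
# Positive weights push the prediction toward the label, negative ones away.
for word, weight in explanation.as_list(label=Y_test[idx]):
    print("{:<15s} : {:.4f}".format(word, weight))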
In this section, we have explained a wrong prediction using our explainer object. We have first found the indices of all test samples that the model predicted incorrectly and then randomly selected one of them. The actual category of the selected sample is sci.electronics whereas the model predicted rec.autos. This time, we have generated two explanation objects (one with respect to the actual label and one with respect to the predicted label) so that we can see which words contribute to each category.
We can notice from the visualizations that words like 'batteries', 'old', etc contributed to predicting category sci.electronics and words like 'car', 'oil', 'engine', 'Honda', etc contributed to predicting category rec.autos.
import numpy as np
Y_test_preds = np.argmax(test_preds, axis=-1)
wrong_preds = np.argwhere(Y_test!=Y_test_preds)
rng = np.random.RandomState(123)
idx = rng.choice(wrong_preds.flatten())
print("Actual : ", selected_categories[Y_test[idx]])
print("Prediction : ", selected_categories[model.predict(X_test[idx:idx+1]).argmax(axis=-1)[0]])
explanation_actual = explainer.explain_instance(X_test[idx], classifier_fn=model.predict,
labels=Y_test[idx:idx+1])
explanation_pred = explainer.explain_instance(X_test[idx], classifier_fn=model.predict,
labels=Y_test_preds[idx:idx+1])
explanation_actual.show_in_notebook()
explanation_pred.show_in_notebook()
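As a side note, explain_instance() accepts an iterable of labels, so the two explanations above could also be produced with a single call; show_in_notebook() likewise takes a labels argument. A short sketch of this alternative:

# Alternative: request explanations for both labels in one call.
explanation_both = explainer.explain_instance(X_test[idx], classifier_fn=model.predict,
                                              labels=[Y_test[idx], Y_test_preds[idx]])
explanation_both.show_in_notebook(labels=[Y_test[idx], Y_test_preds[idx]])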
In this section, we have again explained a correct prediction, but this time we have set the feature_selection parameter of LimeTextExplainer to 'none'. This makes lime use all the words of the sample when fitting its local surrogate model instead of first selecting a subset of features.
from lime import lime_text
explainer = lime_text.LimeTextExplainer(class_names=selected_categories, verbose=True, feature_selection="none")
rng = np.random.RandomState(42)
idx = rng.randint(1, len(X_test))
print("Actual : ", selected_categories[Y_test[idx]])
print("Prediction : ", selected_categories[model.predict(X_test[idx:idx+1]).argmax(axis=-1)[0]])
explanation = explainer.explain_instance(X_test[idx], classifier_fn=model.predict, num_features=10,
labels=Y_test[idx:idx+1])
explanation.show_in_notebook()
In this section, we have again used the word frequency text vectorization approach, but this time we have vectorized the text data using CountVectorizer available from scikit-learn. Our Keras network will work directly on the vectorized data this time.
Please NOTE that we have not covered the text vectorization approaches used in this tutorial in detail, as we expect that the reader has some background in them. Please feel free to check the below link if you want to learn about them in detail.
Below, we have vectorized our text data using CountVectorizer available from scikit-learn.
import sklearn
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
vectorizer = CountVectorizer(max_features=50000)
vectorizer.fit(np.concatenate((X_train, X_test)))
X_train_vect = vectorizer.transform(X_train)
X_test_vect = vectorizer.transform(X_test)
X_train_vect, X_test_vect = X_train_vect.toarray(), X_test_vect.toarray()
X_train_vect.shape, X_test_vect.shape
In this section, we have first created a Keras network that works on the vectorized data. It has 3 dense layers with 128, 64, and 5 units respectively.
After creating the network, we have compiled it to use the Adam optimizer, cross entropy loss, and the accuracy metric.
At last, we have trained the network by calling the fit() function for 8 epochs with a batch size of 256. We can notice from the loss and accuracy printed after each epoch that the model is doing a good job.
from tensorflow.keras.models import Sequential
from tensorflow.keras import layers
def create_model(input_shape):
    return Sequential([
        layers.Input(shape=input_shape),
        layers.Dense(128, activation="relu"),
        layers.Dense(64, activation="relu"),
        layers.Dense(len(selected_categories), activation="softmax"),
    ])
model = create_model(X_train_vect.shape[1:])
model.summary()
model.compile("adam", "sparse_categorical_crossentropy", metrics=["accuracy"])
history = model.fit(X_train_vect, Y_train, batch_size=256, epochs=8, validation_data=(X_test_vect, Y_test))
In this section, we have explained the correct prediction made by our model. We have randomly selected one sample from the test set. The actual and predicted category of that sample is soc.religion.christian.
This time, we have created a small function that takes a batch of text samples as input and returns prediction probabilities for them; this is what the classifier_fn parameter of explain_instance() requires. In our previous example, we did not need such a function because the model itself took text samples as input; here the model works on vectorized data, so the wrapper must vectorize the text before calling predict().
We can notice from the visualization that the words like 'church', 'homosexual', 'group', 'topic', etc are contributing to predicting category soc.religion.christian.
from lime import lime_text
def make_predictions(X_batch):
    # Vectorize the raw text batch and return prediction probabilities.
    X_batch_vect = vectorizer.transform(X_batch)
    preds = model.predict(X_batch_vect.toarray())
    return preds
explainer = lime_text.LimeTextExplainer(class_names=selected_categories)
rng = np.random.RandomState(42)
idx = rng.randint(1, len(X_test))
print("Prediction : ", selected_categories[model.predict(X_test_vect[idx:idx+1]).argmax(axis=-1)[0]])
print("Actual : ", selected_categories[Y_test[idx]])
explanation = explainer.explain_instance(X_test[idx], classifier_fn=make_predictions, num_features=10,
labels=Y_test[idx:idx+1])
explanation.show_in_notebook()
In this section, we have explained one wrong prediction. The actual category of our selected sample is sci.electronics whereas our model predicted misc.forsale. As usual, we have created two visualizations to see which words are contributing to actual and predicted labels.
We can notice that words like 'sells', 'com', etc. contribute to the category sci.electronics whereas words like 'tools', 'Mac', 'Macintosh', etc. contribute to the category misc.forsale. This does not make much sense, as words that should point to the sci.electronics category are contributing to the misc.forsale category. This hints that our model has not generalized that well.
Y_test_preds = model.predict(X_test_vect).argmax(axis=-1)
wrong_preds = np.argwhere(Y_test!=Y_test_preds)
rng = np.random.RandomState(123)
idx = rng.choice(wrong_preds.flatten())
print("Actual : ", selected_categories[Y_test[idx]])
print("Prediction : ", selected_categories[model.predict(X_test_vect[idx:idx+1]).argmax(axis=-1)[0]])
explanation_actual = explainer.explain_instance(X_test[idx], classifier_fn=make_predictions,
num_features=10, labels=Y_test[idx:idx+1])
explanation_pred = explainer.explain_instance(X_test[idx], classifier_fn=make_predictions,
num_features=10, labels=Y_test_preds[idx:idx+1])
explanation_actual.show_in_notebook()
explanation_pred.show_in_notebook()
In this example, we have again used the word frequency approach, but this time we have removed commonly occurring English words (stop words like 'the', 'a', 'an', 'then', etc.) that do not contribute much to classification since they appear in almost every text document. The majority of the code in this section repeats the previous section, hence we have not explained the repeated parts in detail.
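In case you are curious which words get removed, scikit-learn ships its built-in English stop word list as a frozenset, importable from sklearn.feature_extraction.text; a quick way to peek at it:

from sklearn.feature_extraction.text import ENGLISH_STOP_WORDS

# The built-in list is a frozenset of a few hundred common English words.
print("Stop Words Count : {}".format(len(ENGLISH_STOP_WORDS)))
print("Examples : {}".format(sorted(ENGLISH_STOP_WORDS)[:10]))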
In this section, we have vectorized the data using scikit-learn's CountVectorizer. We have set the stop_words parameter to 'english', asking it to exclude English stop words from the vocabulary.
import sklearn
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
vectorizer = CountVectorizer(max_features=50000, stop_words="english")
vectorizer.fit(np.concatenate((X_train, X_test)))
X_train_vect = vectorizer.transform(X_train)
X_test_vect = vectorizer.transform(X_test)
X_train_vect, X_test_vect = X_train_vect.toarray(), X_test_vect.toarray()
X_train_vect.shape, X_test_vect.shape
Here, we have created a model, compiled it, and trained it using vectorized data. We can notice from the loss and accuracy getting printed at the end of all epochs that the model is performing well.
model = create_model(X_train_vect.shape[1:])
model.compile("adam", "sparse_categorical_crossentropy", metrics=["accuracy"])
history = model.fit(X_train_vect, Y_train, batch_size=256, epochs=8, validation_data=(X_test_vect, Y_test))
In this section, we have explained the correct prediction made by our model. The actual category of the selected sample is soc.religion.christian and the same is predicted by our model. The words like 'church', 'group', 'homosexual', 'presbyterian', etc are contributing to the prediction.
from lime import lime_text
def make_predictions(X_batch):
    # Vectorize the raw text batch and return prediction probabilities.
    X_batch_vect = vectorizer.transform(X_batch)
    preds = model.predict(X_batch_vect.toarray())
    return preds
explainer = lime_text.LimeTextExplainer(class_names=selected_categories)
rng = np.random.RandomState(42)
idx = rng.randint(1, len(X_test))
print("Prediction : ", selected_categories[model.predict(X_test_vect[idx:idx+1]).argmax(axis=-1)[0]])
print("Actual : ", selected_categories[Y_test[idx]])
explanation = explainer.explain_instance(X_test[idx], classifier_fn=make_predictions, num_features=10, labels=Y_test[idx:idx+1])
explanation.show_in_notebook()
In this section, we have explained one incorrect prediction. The actual category of the sample is 'misc.forsale' whereas our model predicted 'sci.electronics'. We have generated an explanation with respect to both actual and predicted labels.
Y_test_preds = model.predict(X_test_vect).argmax(axis=-1)
wrong_preds = np.argwhere(Y_test!=Y_test_preds)
rng = np.random.RandomState(123)
idx = rng.choice(wrong_preds.flatten())
print("Actual : ", selected_categories[Y_test[idx]])
print("Prediction : ", selected_categories[model.predict(X_test_vect[idx:idx+1]).argmax(axis=-1)[0]])
explanation_actual = explainer.explain_instance(X_test[idx], classifier_fn=make_predictions, num_features=10, labels=Y_test[idx:idx+1])
explanation_pred = explainer.explain_instance(X_test[idx], classifier_fn=make_predictions, num_features=10, labels=Y_test_preds[idx:idx+1])
explanation_actual.show_in_notebook()
explanation_pred.show_in_notebook()
In this section, we have vectorized our data using TF-IDF (Term Frequency-Inverse Document Frequency). It assigns a float value to each word such that words appearing commonly across many documents get low values and words appearing rarely get high values.
Please feel free to check the below link if you want to learn about TF-IDF in detail.
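To give a rough feel for the computation, below is a tiny hand-rolled sketch of the smoothed TF-IDF formula that scikit-learn uses by default, tf * (ln((1 + n) / (1 + df)) + 1); note that TfidfVectorizer additionally L2-normalizes each document vector, which this sketch skips:

import numpy as np

corpus = ["the car engine", "the church group", "the car oil"]  # toy corpus
word = "car"
n_docs = len(corpus)
df = sum(word in doc.split() for doc in corpus)   # documents containing the word
tf = corpus[0].split().count(word)                # raw count in the first document
idf = np.log((1 + n_docs) / (1 + df)) + 1         # smoothed inverse document frequency
print("TF-IDF of '{}' in doc 0 : {:.3f}".format(word, tf * idf))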
In this section, we have vectorized the data using TfidfVectorizer available from scikit-learn, which is an implementation of the TF-IDF concept.
import sklearn
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
vectorizer = TfidfVectorizer(max_features=50000, stop_words="english")
vectorizer.fit(np.concatenate((X_train, X_test)))
X_train_vect = vectorizer.transform(X_train)
X_test_vect = vectorizer.transform(X_test)
X_train_vect, X_test_vect = X_train_vect.toarray(), X_test_vect.toarray()
X_train_vect.shape, X_test_vect.shape
In this section, we have created and trained our model as usual. We can notice from the loss and accuracy that the model has done a good job at prediction.
model = create_model(X_train_vect.shape[1:])
model.compile("adam", "sparse_categorical_crossentropy", metrics=["accuracy"])
history = model.fit(X_train_vect, Y_train, batch_size=256, epochs=8, validation_data=(X_test_vect, Y_test))
In this section, we have explained a correct prediction. The model predicted the correct category soc.religion.christian. Words like 'homosexual', 'church', 'passages', 'group', 'presbyterian', etc. contributed to the prediction.
from lime import lime_text
def make_predictions(X_batch):
    # Vectorize the raw text batch and return prediction probabilities.
    X_batch_vect = vectorizer.transform(X_batch)
    preds = model.predict(X_batch_vect.toarray())
    return preds
explainer = lime_text.LimeTextExplainer(class_names=selected_categories)
rng = np.random.RandomState(42)
idx = rng.randint(1, len(X_test))
print("Prediction : ", selected_categories[model.predict(X_test_vect[idx:idx+1]).argmax(axis=-1)[0]])
print("Actual : ", selected_categories[Y_test[idx]])
explanation = explainer.explain_instance(X_test[idx], classifier_fn=make_predictions, num_features=10, labels=Y_test[idx:idx+1])
explanation.show_in_notebook()
In this section, we have explained one incorrect prediction. The actual category of the selected sample is rec.autos whereas our model predicted rec.sport.baseball. We can check from visualizations which words contributed to a particular category.
Y_test_preds = model.predict(X_test_vect).argmax(axis=-1)
wrong_preds = np.argwhere(Y_test!=Y_test_preds)
rng = np.random.RandomState(123)
idx = rng.choice(wrong_preds.flatten())
print("Actual : ", selected_categories[Y_test[idx]])
print("Prediction : ", selected_categories[model.predict(X_test_vect[idx:idx+1]).argmax(axis=-1)[0]])
explanation_actual = explainer.explain_instance(X_test[idx], classifier_fn=make_predictions, num_features=10, labels=Y_test[idx:idx+1])
explanation_pred = explainer.explain_instance(X_test[idx], classifier_fn=make_predictions, num_features=10, labels=Y_test_preds[idx:idx+1])
explanation_actual.show_in_notebook()
explanation_pred.show_in_notebook()
This ends our small tutorial explaining how we can use lime to explain predictions made by our Keras text classification networks. Please feel free to let us know your views in the comments section.
If you are more comfortable learning through video tutorials then we would recommend that you subscribe to our YouTube channel.
When going through coding examples, it's quite common to have doubts and errors.
If you have doubts about some code examples or are stuck somewhere when trying our code, send us an email at coderzcolumn07@gmail.com. We'll help you or point you in the direction where you can find a solution to your problem.
You can even send us a mail if you are trying something new and need guidance regarding coding. We'll try to respond as soon as possible.