How to Use eli5 to Understand sklearn Models, their Performance, and their Predictions [Python]?

eli5

scikit-learn is the go-to library for many machine learning practitioners who want to build machine learning workflows quickly. It provides an easy-to-use interface and implementations of many machine learning algorithms. Even so, metrics like accuracy, R^2 score, ROC curve, and precision-recall curve only summarize aggregate performance; they do not tell us why the model behaves a particular way for a particular example. To judge whether a model has generalized well and is reliable, we need to understand how different features contribute to its predictions. Python has a library called eli5 that helps with exactly this: it explains why a trained model makes a particular prediction for a particular sample, and which features matter most to the model overall.

eli5 supports the libraries listed below and can explain their white-box models, though we'll primarily concentrate on sklearn in this tutorial.

  • Scikit-learn
  • XGBoost
  • CatBoost
  • Keras
  • lightning
  • LightGBM
  • sklearn-crfsuite

eli5 provides two ways to understand ML models:

  • It lets us analyze model weights to understand the global behavior of the model.
  • It lets us analyze the prediction for an individual sample to understand the local behavior of the model. This helps us drill down into why a particular prediction was made and which features played what role in that prediction.

eli5 divides its API into two parts. The first part takes a model or a sample as input and generates an Explanation object. The second part consists of formatting methods that render this object in various ways, such as HTML, image, text, dict, or dataframe.

eli5 can also handle sklearn pipelines of estimators and can reverse the encoding performed on the data. It can handle text data, showing which parts of the text contribute to predicting a particular label, and image data, highlighting which parts of the image were used to make a prediction. eli5 has explanation methods for all major model types available in the libraries mentioned above. For black-box models that it cannot explain directly, eli5 provides an implementation of LIME (Local Interpretable Model-agnostic Explanations).

We'll be explaining the usage of eli5's API as a part of this tutorial through various examples and datasets.

In [1]:
import pandas as pd
import numpy as np

import sklearn

import eli5

import warnings
warnings.filterwarnings("ignore")
Using TensorFlow backend.

Structured Data : Regression

First, we'll use eli5 charts for explaining machine learning models which are used for regression tasks and structured data.

Example 1

As a part of our first example, we'll be using the Boston housing dataset available from scikit-learn. The dataset has information about various housing attributes, and the target to predict is the median value of homes in thousands of dollars. Note that load_boston() was deprecated in scikit-learn 1.0 and removed in 1.2, so the code below requires an older scikit-learn release. Below we have printed the dataset description explaining the meaning of each feature and have also loaded the dataset as a pandas dataframe.

In [2]:
from sklearn.datasets import load_boston

boston = load_boston()

for line in boston.DESCR.split("\n")[5:27]:
    print(line)

boston_df = pd.DataFrame(data=boston.data, columns = boston.feature_names)
boston_df["Price"] = boston.target

boston_df.head()
**Data Set Characteristics:**

    :Number of Instances: 506

    :Number of Attributes: 13 numeric/categorical predictive. Median Value (attribute 14) is usually the target.

    :Attribute Information (in order):
        - CRIM     per capita crime rate by town
        - ZN       proportion of residential land zoned for lots over 25,000 sq.ft.
        - INDUS    proportion of non-retail business acres per town
        - CHAS     Charles River dummy variable (= 1 if tract bounds river; 0 otherwise)
        - NOX      nitric oxides concentration (parts per 10 million)
        - RM       average number of rooms per dwelling
        - AGE      proportion of owner-occupied units built prior to 1940
        - DIS      weighted distances to five Boston employment centres
        - RAD      index of accessibility to radial highways
        - TAX      full-value property-tax rate per $10,000
        - PTRATIO  pupil-teacher ratio by town
        - B        1000(Bk - 0.63)^2 where Bk is the proportion of blacks by town
        - LSTAT    % lower status of the population
        - MEDV     Median value of owner-occupied homes in $1000's

Out[2]:
CRIM ZN INDUS CHAS NOX RM AGE DIS RAD TAX PTRATIO B LSTAT Price
0 0.00632 18.0 2.31 0.0 0.538 6.575 65.2 4.0900 1.0 296.0 15.3 396.90 4.98 24.0
1 0.02731 0.0 7.07 0.0 0.469 6.421 78.9 4.9671 2.0 242.0 17.8 396.90 9.14 21.6
2 0.02729 0.0 7.07 0.0 0.469 7.185 61.1 4.9671 2.0 242.0 17.8 392.83 4.03 34.7
3 0.03237 0.0 2.18 0.0 0.458 6.998 45.8 6.0622 3.0 222.0 18.7 394.63 2.94 33.4
4 0.06905 0.0 2.18 0.0 0.458 7.147 54.2 6.0622 3.0 222.0 18.7 396.90 5.33 36.2

We'll now divide the dataset into train (90%) and test (10%) sets using the train_test_split() function of scikit-learn.

In [3]:
from sklearn.model_selection import train_test_split

X, Y = boston.data, boston.target

X_train, X_test, Y_train, Y_test = train_test_split(X, Y, train_size=0.90, test_size=0.1, random_state=123, shuffle=True)

X_train.shape, X_test.shape, Y_train.shape, Y_test.shape
Out[3]:
((455, 13), (51, 13), (455,), (51,))
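The split sizes in the output above follow from simple rounding. The sketch below reproduces them, assuming scikit-learn rounds a fractional test_size up and a fractional train_size down; the helper name split_sizes is ours and is not part of scikit-learn.

```python
import math

def split_sizes(n_samples, train_size, test_size):
    """Return (n_train, n_test) for fractional split sizes."""
    n_test = math.ceil(test_size * n_samples)     # test share is rounded up
    n_train = math.floor(train_size * n_samples)  # train share is rounded down
    return n_train, n_test

# 506 samples in the Boston housing dataset, split 90% / 10%
n_train, n_test = split_sizes(506, train_size=0.90, test_size=0.10)
print(n_train, n_test)
```

With 506 samples, 10% is 50.6 rows, which becomes 51 test rows, leaving 455 for training.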

Below we have fitted a simple linear regression model to train data and then printed the r2 score of the model on both test and train data. If you are interested in learning about various machine learning metrics available from sklearn then please feel free to check our tutorial on the same.

In [4]:
from sklearn.linear_model import LinearRegression

lr = LinearRegression()
lr.fit(X_train, Y_train)

print("Test R^2 Score  : ", lr.score(X_test, Y_test))
print("Train R^2 Score : ", lr.score(X_train, Y_train))
Test R^2 Score  :  0.6412254020969463
Train R^2 Score :  0.7511685217987627
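For regressors, score() returns the R^2 (coefficient of determination). As a minimal sketch of what that number measures, the function below computes R^2 by hand on a small made-up set of targets and predictions; the data and the function name r2_score_manual are illustrative only.

```python
def r2_score_manual(y_true, y_pred):
    """R^2 = 1 - (residual sum of squares / total sum of squares)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Made-up targets and predictions, not taken from the housing data
y_true = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5, 0.0, 2.0, 8.0]

r2 = r2_score_manual(y_true, y_pred)
print(round(r2, 4))
```

A score of 1.0 means perfect predictions, while a score near 0 means the model does no better than always predicting the mean of the targets.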

The eli5 provides two common methods that can be used across different models.

  • show_weights() - It displays estimator's global weights as HTML table, text, etc. These are generally the same as coef_ and feature_importances_ available with linear and tree or ensemble sklearn estimators.
  • show_prediction() - It displays how features contribute to the estimator's prediction for a particular sample as an HTML table, tree, text, etc.

show_weights()

The first method that we'll explore using eli5 is show_weights(). Below we have used show_weights() to show feature weights for the linear regression model which we trained in the previous cell.

In [ ]:
from eli5 import show_weights

show_weights(lr, feature_names=boston.feature_names)

Important Parameters of show_weights()

Below we have given a list of important parameters of the show_weights() method which can help us modify visualizations generated by the method.

  • feature_names: This parameter accepts a list of string specifying feature names.
  • top: This parameter accepts an int or a (pos: int, neg: int) tuple. If we specify a single integer then that many top features from the model will be included in the figure. If we specify a (pos: int, neg: int) tuple then that many positive and negative features will be displayed.
  • target_names: This parameter accepts a list of string or dictionary with mapping from old target names to new target names.
  • targets: This parameter accepts a list of targets for which we want weights to be displayed. This can be useful if our classification task has many classes and we want to see the weights of only a few of them.
  • feature_re: This parameter accepts string specifying regular expression which lets us select feature names that satisfy the criteria of that regular expression.
  • feature_filter: This parameter serves the same purpose as feature_re but lets us specify a callable which accepts the feature name and returns True/False. Based on that return value, the feature is selected or rejected for display.
  • show: This parameter accepts a list of string values specifying which sections of explanation to show. Below is a list of values available. Please make a note that not all sections will be available with all models. E.g : decision_tree won't be available with LinearRegression.
    • targets - Shows feature weights for target class
    • transition_features - Shows transition feature of a CRF model.
    • feature_importances - Shows feature importances of the decision tree, random forest, etc.
    • decision_tree - Shows decision tree.
    • method - A string explaining the method.
    • description - Text explaining method used by the model and its caveats.
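The two forms of the top parameter can be illustrated with a small pure-Python sketch. It assumes, as we understand the eli5 documentation, that an integer selects the features with the largest absolute weights while a (pos, neg) tuple caps positive and negative features separately. The function select_top and the weights are hypothetical and are not part of eli5.

```python
def select_top(weights, top):
    """Pick features from a {name: weight} dict, mimicking the `top` parameter."""
    ranked = sorted(weights.items(), key=lambda kv: kv[1], reverse=True)
    if isinstance(top, tuple):
        n_pos, n_neg = top
        positives = [kv for kv in ranked if kv[1] > 0][:n_pos]
        # Walk from the most negative weight upwards, then restore display order
        negatives = [kv for kv in reversed(ranked) if kv[1] < 0][:n_neg]
        return positives + negatives[::-1]
    # Single integer: keep the largest weights by absolute value
    largest = sorted(ranked, key=lambda kv: abs(kv[1]), reverse=True)[:top]
    return sorted(largest, key=lambda kv: kv[1], reverse=True)

# Made-up weights for illustration only
weights = {"RM": 3.8, "CHAS": 2.7, "RAD": 0.3, "ZN": 0.05, "DIS": -1.5, "NOX": -17.8}

print([name for name, _ in select_top(weights, 3)])
print([name for name, _ in select_top(weights, (3, 1))])
```

Note how the integer form can return a strongly negative feature ahead of weaker positive ones, whereas the tuple form guarantees a fixed mix of both signs.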

Below we have again used show_weights() to show model weights as a table but this time we are only displaying the top 7 important features.

In [ ]:
show_weights(lr, feature_names=boston.feature_names, top=7)

Below we are again displaying model weights as an HTML table with the top 3 positive features and 1 negative feature in the table.

In [ ]:
show_weights(lr, feature_names=boston.feature_names, top=(3,1))

Below we are displaying model weights only for the features whose names start with the letter R.

In [ ]:
show_weights(lr, feature_names=boston.feature_names, top=5, feature_re=r"^R")

Below we have displayed model weights again, this time only for the features whose names end with the character T.

In [ ]:
show_weights(lr, feature_names=boston.feature_names, top=5, feature_filter=lambda fet : fet.endswith("T"))
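To preview which features a given feature_re or feature_filter will keep, we can apply the same test to the feature names directly. The sketch below assumes feature_re is checked with re.search semantics (a match anywhere in the name unless the pattern is anchored); the feature list is copied from the dataset description above.

```python
import re

feature_names = ["CRIM", "ZN", "INDUS", "CHAS", "NOX", "RM", "AGE",
                 "DIS", "RAD", "TAX", "PTRATIO", "B", "LSTAT"]

# Equivalent of feature_re=r"^R": anchored at the start of the name
starts_with_r = [name for name in feature_names if re.search(r"^R", name)]

# Equivalent of feature_filter=lambda fet: fet.endswith("T")
ends_with_t = [name for name in feature_names if name.endswith("T")]

# An unanchored pattern such as "O" matches anywhere in the name
contains_o = [name for name in feature_names if re.search("O", name)]

print(starts_with_r)
print(ends_with_t)
print(contains_o)
```

A regular expression is convenient for simple name patterns, while a callable is the better fit when the selection logic is easier to express in Python.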

Below we have used the show_weights() method to show which method was used by the model for prediction as well as a description explaining how the model works and its caveats.

In [10]:
show_weights(lr, feature_names=boston.feature_names, show=["method", "description"])
Out[10]:

Explained as: linear model

Features with largest coefficients.
Caveats:
1. Be careful with features which are not
   independent - weights don't show their importance.
2. If scale of input features is different then scale of coefficients
   will also be different, making direct comparison between coefficient values
   incorrect.
3. Depending on regularization, rare features sometimes may have high
   coefficients; this doesn't mean they contribute much to the
   classification result for most examples.
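The second caveat is easy to demonstrate. The sketch below fits a one-feature least-squares line twice on made-up data, once with the feature in its original units and once multiplied by 100. The helper fit_line and the numbers are illustrative only and are not taken from the housing data.

```python
def fit_line(xs, ys):
    """Ordinary least squares for a single feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

rooms = [4.0, 5.0, 6.0, 7.0]
price = [12.0, 17.0, 22.0, 27.0]           # made up: price = 5 * rooms - 8

slope_a, intercept_a = fit_line(rooms, price)

rooms_x100 = [r * 100 for r in rooms]      # same information, different unit
slope_b, intercept_b = fit_line(rooms_x100, price)

print(slope_a, intercept_a)
print(slope_b, intercept_b)
```

The second coefficient is 100 times smaller even though both models make identical predictions, which is why coefficients of unscaled features should not be compared directly.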

show_prediction()

The second important method that eli5 provides is show_prediction(), which shows how each individual feature contributes to the prediction for a particular sample. This gives us more insight into how the model behaves for that sample.

Below we are taking a random sample from the test data and then displaying, as an HTML table, how each feature contributes to the predicted output for this sample along with the feature values.

In [ ]:
from eli5 import show_prediction

import random

rand = random.randint(0, len(X_test) - 1)  # randint includes both endpoints

print("Actual Target Value : ", Y_test[rand])

show_prediction(lr, X_test[rand], feature_names=boston.feature_names, show_feature_values=True)

The above table starts with the BIAS value and then adds the contribution of each feature (its weight multiplied by its value) one by one to generate the final prediction.
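To make the arithmetic behind such a table concrete, here is a minimal sketch for a linear model, where each feature's contribution is its coefficient times its value. The bias, coefficients, and sample are made up for illustration, and explain_linear_prediction is a hypothetical helper, not an eli5 function.

```python
def explain_linear_prediction(bias, coefs, sample):
    """Split a linear prediction into per-feature contributions plus the bias."""
    contributions = {name: coefs[name] * sample[name] for name in coefs}
    prediction = bias + sum(contributions.values())
    return contributions, prediction

# Made-up model and sample, not the fitted Boston model
bias = 10.0
coefs = {"RM": 4.0, "LSTAT": -0.5, "CHAS": 2.0}
sample = {"RM": 6.0, "LSTAT": 8.0, "CHAS": 0.0}

contributions, prediction = explain_linear_prediction(bias, coefs, sample)
print(contributions)
print(prediction)
```

A feature with a large coefficient can still contribute nothing to one prediction when its value for that sample is zero, as CHAS does here.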

Important Parameters of show_prediction()

The show_prediction() method shares almost all of its parameters with show_weights() and adds a few of its own. Below we list all parameters, with descriptions given only for the new ones.

  • top: It has same meaning as in show_weights().
  • top_targets: It accepts an int specifying how many targets to show, which is useful for tasks with many classes. It should not be combined with targets.
  • target_names: It has same meaning as in show_weights().
  • targets: It has same meaning as in show_weights().
  • feature_names: It has same meaning as in show_weights().
  • feature_re: It has same meaning as in show_weights().
  • feature_filter: It has same meaning as in show_weights().
  • show: It has same meaning as in show_weights().
  • horizontal_layout: It accepts bool value. If set to True then feature weights tables for classification tasks are laid out horizontally else vertically.
  • highlight_spaces: It accepts bool value specifying whether to highlight space between features when highlighting text data or not.
  • force_weights: It accepts bool value specifying whether to show weights table if features are already highlighted in the text representation.
  • preserve_density: It accepts bool value specifying intensities of the color of text sample when working with text data.
  • show_feature_values: It accepts a bool value specifying whether feature values should be shown along with weights in the table.

Below we are again displaying results from show_prediction(), this time showing only the top 5 features in the resulting table.

In [ ]:
show_prediction(lr, X_test[rand], feature_names=boston.feature_names, show_feature_values=True, top=5)

Below we are displaying results from show_prediction() for only the features whose names contain the letter O.

In [ ]:
show_prediction(lr, X_test[rand], feature_names=boston.feature_names, show_feature_values=True, feature_re="O")

Example 2

As a part of our second example, we'll be using the California housing dataset available from scikit-learn. We have printed the dataset description explaining the meaning of the various features available in the dataset.

We have then divided data into train/test sets, fitted model on train data, and evaluated it on test data. We have printed the r2 score of the model on both train and test sets.

We'll be explaining tree estimators using the show_weights() and show_prediction() method as a part of this example.

In [14]:
from sklearn.datasets import fetch_california_housing
from sklearn.tree import DecisionTreeRegressor

calif_housing = fetch_california_housing()

for line in calif_housing.DESCR.split("\n")[5:21]:
    print(line)

X, Y = calif_housing.data, calif_housing.target

print("\nData Size : ", X.shape, Y.shape)

X_train, X_test, Y_train, Y_test = train_test_split(X, Y, train_size=0.90, test_size=0.1, random_state=123)

print("Train/Test Split : ", X_train.shape, X_test.shape, Y_train.shape, Y_test.shape)

dtree = DecisionTreeRegressor(max_depth=4, max_leaf_nodes=250, max_features="log2")

dtree.fit(X_train, Y_train)

print("Test  R^2 Score : %.2f"%dtree.score(X_test, Y_test))
print("Train R^2 Score : %.2f"%dtree.score(X_train, Y_train))
**Data Set Characteristics:**

    :Number of Instances: 20640

    :Number of Attributes: 8 numeric, predictive attributes and the target

    :Attribute Information:
        - MedInc        median income in block
        - HouseAge      median house age in block
        - AveRooms      average number of rooms
        - AveBedrms     average number of bedrooms
        - Population    block population
        - AveOccup      average house occupancy
        - Latitude      house block latitude
        - Longitude     house block longitude


Data Size :  (20640, 8) (20640,)
Train/Test Split :  (18576, 8) (2064, 8) (18576,) (2064,)
Test  R^2 Score : 0.53
Train R^2 Score : 0.54

Below we have plotted feature importances for the model.

In [ ]:
show_weights(dtree, feature_names=calif_housing.feature_names,
             show=["feature_importances"])

Below we have displayed the fitted decision tree, the method used for the explanation, and a description of that method using the show_weights() method.

In [ ]:
show_weights(dtree, feature_names=calif_housing.feature_names,
             show=["decision_tree", "method", "description"])

Below we have explained the model's prediction for a random test sample using the show_prediction() method.

In [ ]:
rand = random.randint(0, len(X_test) - 1)  # randint includes both endpoints

print("Actual Target Value : ", Y_test[rand])

show_prediction(dtree,
                X_test[rand],
                feature_names=calif_housing.feature_names,
                show_feature_values=True,
                )

Example 3

As a part of our third example, we'll be using the same California housing dataset from the previous example. We have trained a decision tree on it as earlier. In this example, we'll explain how to create explanation objects from machine learning models and then format the output of those explanations.

In [18]:
from sklearn.datasets import fetch_california_housing
from sklearn.tree import DecisionTreeRegressor

calif_housing = fetch_california_housing()

X, Y = calif_housing.data, calif_housing.target

print("Data Size : ", X.shape, Y.shape)

X_train, X_test, Y_train, Y_test = train_test_split(X, Y, train_size=0.90, test_size=0.1, random_state=123)

print("Train/Test Split : ", X_train.shape, X_test.shape, Y_train.shape, Y_test.shape)

dtree = DecisionTreeRegressor(max_depth=3, max_leaf_nodes=250, max_features="log2")

dtree.fit(X_train, Y_train)

print("Test  R^2 Score : %.2f"%dtree.score(X_test, Y_test))
print("Train R^2 Score : %.2f"%dtree.score(X_train, Y_train))
Data Size :  (20640, 8) (20640,)
Train/Test Split :  (18576, 8) (2064, 8) (18576,) (2064,)
Test  R^2 Score : 0.47
Train R^2 Score : 0.47

explain_weights_sklearn()

The explain_weights_sklearn() method, available from the sklearn module of eli5, takes a trained model and feature names as input and returns an explanation object of type Explanation. We can pass this explanation object to the different formatting methods available in eli5 to display the explanation in different formats.

Below we have generated an explanation object from our decision tree regressor model.

In [19]:
from eli5.sklearn import explain_weights_sklearn

explanation = explain_weights_sklearn(dtree, feature_names=calif_housing.feature_names)
type(explanation)
Out[19]:
eli5.base.Explanation

format_as_dataframe()

The format_as_dataframe() method takes as input explanation object and returns model weights as pandas dataframe.

In [20]:
from eli5.formatters import format_as_dataframe, format_as_dataframes

format_as_dataframe(explanation)
Out[20]:
feature weight
0 MedInc 0.582056
1 AveRooms 0.255254
2 AveOccup 0.157394
3 Longitude 0.005296
4 Latitude 0.000000
5 Population 0.000000
6 AveBedrms 0.000000
7 HouseAge 0.000000

format_as_text()

The format_as_text() method takes as input explanation object and formats explanation as text data.

In [21]:
from eli5.formatters import format_as_text

print(format_as_text(explanation))
Explained as: decision tree

Decision tree feature importances; values are numbers 0 <= x <= 1;
all values sum to 1.

0.5821  MedInc
0.2553  AveRooms
0.1574  AveOccup
0.0053  Longitude
     0  Latitude
     0  Population
     0  AveBedrms
     0  HouseAge

AveRooms <= 6.374  (82.2%)
    MedInc <= 3.547  (48.0%)
        AveOccup <= 2.295  (10.0%)  ---> 1.9974354115115716
        AveOccup > 2.295  (38.0%)  ---> 1.347902199886746
    MedInc > 3.547  (34.1%)
        AveOccup <= 2.445  (9.4%)  ---> 3.1369938917378946
        AveOccup > 2.445  (24.7%)  ---> 2.159249705433122
AveRooms > 6.374  (17.8%)
    MedInc <= 5.720  (8.3%)
        Longitude <= -116.865  (7.6%)  ---> 1.9912349575671855
        Longitude > -116.865  (0.7%)  ---> 1.2608730952380947
    MedInc > 5.720  (9.6%)
        MedInc <= 7.815  (6.2%)  ---> 3.338185060763899
        MedInc > 7.815  (3.4%)  ---> 4.598185987158948

format_as_html()

The format_as_html() method takes as input explanation object and returns explanation as HTML stored in a python string. We can then display this string HTML using IPython's display functionality.

In [ ]:
from eli5.formatters import format_as_html
from IPython.display import HTML

html_rep = format_as_html(explanation)
HTML(data=html_rep)

explain_decision_tree()

The explain_decision_tree() method available from the sklearn module of eli5 takes as input decision tree model and returns explanation object of type Explanation. We can then display this object and it'll show weights HTML table as well as decision tree as HTML.

In [ ]:
from eli5.sklearn import explain_decision_tree

tree_explanation = explain_decision_tree(dtree, feature_names=calif_housing.feature_names)
print(type(tree_explanation))
tree_explanation

explain_prediction_tree_regressor()

The explain_prediction_tree_regressor() method, available from the sklearn.explain_prediction module of eli5, takes a decision tree and a data sample as input and returns an explanation object. We can then format this object with the various formatters available in eli5.

In [24]:
from eli5.sklearn import explain_prediction

rand = random.randint(0, len(X_test) - 1)  # randint includes both endpoints

print("Actual Target Value : ", Y_test[rand])

explanation = explain_prediction.explain_prediction_tree_regressor(dtree, X_test[rand],
                                                     feature_names=calif_housing.feature_names,
                                                     top=len(calif_housing.feature_names)
                                                    )
Actual Target Value :  2.109

Below we are formatting the explanation generated in the previous step as a pandas dataframe.

In [25]:
format_as_dataframe(explanation)
Out[25]:
target feature weight value
0 y <BIAS> 2.062921 1.000000
1 y MedInc 0.553566 3.821400
2 y AveRooms -0.186498 5.897436
3 y AveOccup -0.270739 2.948718
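The numbers in this dataframe can be cross-checked against the tree printed by format_as_text() earlier in this example. The sketch below hard-codes that printed tree as nested conditions, follows the sample's feature values down to a leaf, and compares the leaf value with the sum of the BIAS and feature weights. The function predict_from_printed_tree is our own transcription for illustration and is not an eli5 or scikit-learn API.

```python
def predict_from_printed_tree(sample):
    """Follow the splits of the depth-3 tree printed by format_as_text()."""
    if sample["AveRooms"] <= 6.374:
        if sample["MedInc"] <= 3.547:
            return 1.9974354115115716 if sample["AveOccup"] <= 2.295 else 1.347902199886746
        return 3.1369938917378946 if sample["AveOccup"] <= 2.445 else 2.159249705433122
    if sample["MedInc"] <= 5.720:
        return 1.9912349575671855 if sample["Longitude"] <= -116.865 else 1.2608730952380947
    return 3.338185060763899 if sample["MedInc"] <= 7.815 else 4.598185987158948

# Feature values and weights copied from the dataframe above
sample = {"MedInc": 3.8214, "AveRooms": 5.897436, "AveOccup": 2.948718}
weights = {"<BIAS>": 2.062921, "MedInc": 0.553566,
           "AveRooms": -0.186498, "AveOccup": -0.270739}

leaf_value = predict_from_printed_tree(sample)
total = sum(weights.values())

print(leaf_value)
print(round(total, 6))
```

Both numbers agree up to rounding: the BIAS is the starting value at the root, and each weight is the change contributed by the feature used along the decision path, so they add up to the value of the leaf the sample lands in.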

Structured Data : Classification

As a part of this section, we'll be explaining the usage of eli5 for the classification of structured data. We'll be loading various structured datasets, performing classification on them, and explaining predictions using eli5.

Example 1

The first example that we'll use for explaining the usage of eli5 for classification tasks uses a breast cancer dataset available from scikit-learn. We have loaded the dataset and printed description which explains dataset features and the target variable. The target, in this case, is a binary variable specifying whether the tumor is malignant or benign. We have then divided the dataset into train/test sets, fitted logistic regression on it, and printed classification metrics like accuracy, confusion matrix, and classification report of test data.

In [26]:
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.linear_model import LogisticRegression

breast_cancer = load_breast_cancer()

for line in breast_cancer.DESCR.split("\n")[5:32]:
    print(line)

X, Y = breast_cancer.data, breast_cancer.target

print("\nData Size : ", X.shape, Y.shape)

X_train, X_test, Y_train, Y_test = train_test_split(X, Y, train_size=0.90, test_size=0.1, stratify=Y, random_state=123)

print("Train/Test Sizes : ", X_train.shape, X_test.shape, Y_train.shape, Y_test.shape)

lr = LogisticRegression()

lr.fit(X_train, Y_train)

print("Test  Accuracy : %.2f"%lr.score(X_test, Y_test))
print("Train Accuracy : %.2f"%lr.score(X_train, Y_train))
print()
print("Confusion Matrix : ")
print(confusion_matrix(Y_test, lr.predict(X_test)))
print()
print("Classification Report")
print(classification_report(Y_test, lr.predict(X_test)))
**Data Set Characteristics:**

    :Number of Instances: 569

    :Number of Attributes: 30 numeric, predictive attributes and the class

    :Attribute Information:
        - radius (mean of distances from center to points on the perimeter)
        - texture (standard deviation of gray-scale values)
        - perimeter
        - area
        - smoothness (local variation in radius lengths)
        - compactness (perimeter^2 / area - 1.0)
        - concavity (severity of concave portions of the contour)
        - concave points (number of concave portions of the contour)
        - symmetry
        - fractal dimension ("coastline approximation" - 1)

        The mean, standard error, and "worst" or largest (mean of the three
        largest values) of these features were computed for each image,
        resulting in 30 features.  For instance, field 3 is Mean Radius, field
        13 is Radius SE, field 23 is Worst Radius.

        - class:
                - WDBC-Malignant
                - WDBC-Benign


Data Size :  (569, 30) (569,)
Train/Test Sizes :  (512, 30) (57, 30) (512,) (57,)
Test  Accuracy : 0.96
Train Accuracy : 0.96

Confusion Matrix :
[[20  1]
 [ 1 35]]

Classification Report
              precision    recall  f1-score   support

           0       0.95      0.95      0.95        21
           1       0.97      0.97      0.97        36

    accuracy                           0.96        57
   macro avg       0.96      0.96      0.96        57
weighted avg       0.96      0.96      0.96        57
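The figures in the classification report can be derived directly from the confusion matrix, where rows are actual classes and columns are predicted classes. A minimal sketch using the matrix printed above (the helper per_class_metrics is ours):

```python
# Confusion matrix from the output above: rows = actual, columns = predicted
confusion = [[20, 1],
             [1, 35]]

def per_class_metrics(matrix, cls):
    """Return (precision, recall) for one class of a confusion matrix."""
    true_pos = matrix[cls][cls]
    predicted_as_cls = sum(row[cls] for row in matrix)  # column total
    actually_cls = sum(matrix[cls])                     # row total
    return true_pos / predicted_as_cls, true_pos / actually_cls

correct = sum(confusion[i][i] for i in range(len(confusion)))
accuracy = correct / sum(map(sum, confusion))

precision_0, recall_0 = per_class_metrics(confusion, 0)
precision_1, recall_1 = per_class_metrics(confusion, 1)

print(round(accuracy, 2))
print(round(precision_0, 2), round(recall_0, 2))
print(round(precision_1, 2), round(recall_1, 2))
```

The two off-diagonal entries are the misclassified test samples, which are the ones worth inspecting with show_prediction().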

Below we have displayed the weights of the model using the show_weights() method. This time we have passed target names to the method as well.

In [ ]:
show_weights(lr,
             targets=[0, 1], target_names=breast_cancer.target_names,
             feature_names=breast_cancer.feature_names,
             top=len(breast_cancer.feature_names)+1)

Below we are using the show_prediction() method to show the contribution of features in the prediction of the random sample chosen from the test dataset.

In [ ]:
rand  = random.randint(0, len(X_test) - 1)  # randint includes both endpoints

print("Actual Target Value : ", breast_cancer.target_names[Y_test[rand]])
print("Model Prediction : ", breast_cancer.target_names[lr.predict(X_test[rand].reshape(1,-1))[0]])

show_prediction(lr, X_test[rand],
                targets=[0, 1], target_names=breast_cancer.target_names,
                feature_names=breast_cancer.feature_names,
                show_feature_values=True)

Below we have displayed another example for show_prediction() but this time we have chosen a sample randomly from the test dataset which was predicted wrong by the model.

In [ ]:
preds = lr.predict(X_test)

false_preds = np.argwhere((preds != Y_test)).flatten()

rand  = random.choice(false_preds)

print("Actual Target Value : ", breast_cancer.target_names[Y_test[rand]])
print("Model Prediction : ", breast_cancer.target_names[lr.predict(X_test[rand].reshape(1,-1))[0]])

show_prediction(lr, X_test[rand],
                targets=[0, 1], target_names=breast_cancer.target_names,
                feature_names=breast_cancer.feature_names,
                show_feature_values=True)

Example 2

The second example that we'll use for explaining the usage of eli5 for classification tasks uses the wine dataset, which has information about the chemical constituents measured in three different types of wines. It's a multi-class classification problem. We have divided the dataset into train/test sets, fitted the model on the train data, and printed various classification metrics like accuracy, confusion matrix, and classification report calculated on the test data.

In [30]:
from sklearn.datasets import load_wine

wine = load_wine()

X, Y = wine.data, wine.target

print("Data : ", X.shape, Y.shape)

X_train, X_test, Y_train, Y_test = train_test_split(X, Y, train_size=0.90, test_size=0.1, stratify=Y, random_state=123)

print("Train/Test Split : ", X_train.shape, X_test.shape, Y_train.shape, Y_test.shape)

lr = LogisticRegression()

lr.fit(X_train, Y_train)

print("Test  Accuracy : %.2f"%lr.score(X_test, Y_test))
print("Train Accuracy : %.2f"%lr.score(X_train, Y_train))

print("Confusion Matrix : ")
print(confusion_matrix(Y_test, lr.predict(X_test)))
print()
print("Classification Report")
print(classification_report(Y_test, lr.predict(X_test)))
Data :  (178, 13) (178,)
Train/Test Split :  (160, 13) (18, 13) (160,) (18,)
Test  Accuracy : 1.00
Train Accuracy : 0.97
Confusion Matrix :
[[6 0 0]
 [0 7 0]
 [0 0 5]]

Classification Report
              precision    recall  f1-score   support

           0       1.00      1.00      1.00         6
           1       1.00      1.00      1.00         7
           2       1.00      1.00      1.00         5

    accuracy                           1.00        18
   macro avg       1.00      1.00      1.00        18
weighted avg       1.00      1.00      1.00        18

Below we are using the show_weights() method to display the weights for a multi-class classification problem. We can see three different tables, one for each class of wine.

In [ ]:
show_weights(lr,
             targets=[0, 1, 2], target_names=wine.target_names,
             feature_names=wine.feature_names)

Below we have used the show_prediction() method to explain how different features contribute to the prediction for a random sample from the test dataset. There are three tables in this case as well, one per class. If we sum the BIAS and the feature contributions in each table, the class with the highest total is the class predicted by the model.

In [ ]:
rand = random.randint(0, len(X_test) - 1)  # randint includes both endpoints

print("Actual Target Value : ", wine.target_names[Y_test[rand]])

show_prediction(lr, X_test[rand],
                targets=[0, 1, 2], target_names=wine.target_names,
                feature_names=wine.feature_names)
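The "highest total wins" rule can be sketched in a few lines of plain Python. The biases, coefficients, and sample below are made up and use only two features for brevity. The softmax step assumes a multinomial logistic regression and is shown only to illustrate how scores relate to probabilities.

```python
import math

def class_scores(biases, coefs, sample):
    """One score per class: bias plus the sum of coefficient * feature value."""
    return [b + sum(w * x for w, x in zip(ws, sample))
            for b, ws in zip(biases, coefs)]

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    shift = max(scores)                      # subtract max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Made-up 3-class model with 2 features, not the fitted wine model
biases = [0.5, -0.2, 0.1]
coefs = [[1.0, -1.0], [0.0, 0.5], [-1.0, 1.0]]
sample = [2.0, 1.0]

scores = class_scores(biases, coefs, sample)
probabilities = softmax(scores)
predicted_class = scores.index(max(scores))

print(scores)
print(predicted_class)
```

Because softmax preserves the ordering of the scores, the class with the highest total in the tables is also the class with the highest predicted probability.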

Below we have shown another example of show_prediction(), this time stacking the tables vertically: we also wanted to show the actual feature values in the tables, and the horizontally laid-out tables were overlapping.

In [ ]:
show_prediction(lr, X_test[rand],
                targets=[0, 1, 2], target_names=wine.target_names,
                feature_names=wine.feature_names,
                horizontal_layout=False,
                show_feature_values=True)

Example 3

Our third example for explaining classification on structured data uses the Iris flowers dataset available from scikit-learn. It has measurements of three different types of Iris flowers. We'll load the dataset, divide it into train/test sets, fit a decision tree to the train data, and print various classification metrics evaluated on the test dataset.

In [34]:
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix, classification_report

iris = load_iris()

for line in iris.DESCR.split("\n")[5:19]:
    print(line)

X, Y = iris.data, iris.target

print("\nData Size : ", X.shape, Y.shape)

X_train, X_test, Y_train, Y_test = train_test_split(X, Y, train_size=0.90, test_size=0.1, stratify=Y, random_state=123)

print("Train/Test Split : ", X_train.shape, X_test.shape, Y_train.shape, Y_test.shape)

dtree = DecisionTreeClassifier(max_depth=None, max_features="log2")

dtree.fit(X_train, Y_train)

print("Test  Accuracy : %.2f"%dtree.score(X_test, Y_test))
print("Train Accuracy : %.2f"%dtree.score(X_train, Y_train))
print()
print("Confusion Matrix : ")
print(confusion_matrix(Y_test, dtree.predict(X_test)))
print()
print("Classification Report")
print(classification_report(Y_test, dtree.predict(X_test)))
**Data Set Characteristics:**

    :Number of Instances: 150 (50 in each of three classes)
    :Number of Attributes: 4 numeric, predictive attributes and the class
    :Attribute Information:
        - sepal length in cm
        - sepal width in cm
        - petal length in cm
        - petal width in cm
        - class:
                - Iris-Setosa
                - Iris-Versicolour
                - Iris-Virginica


Data Size :  (150, 4) (150,)
Train/Test Split :  (135, 4) (15, 4) (135,) (15,)
Test  Accuracy : 0.87
Train Accuracy : 1.00

Confusion Matrix :
[[5 0 0]
 [0 3 2]
 [0 0 5]]

Classification Report
              precision    recall  f1-score   support

           0       1.00      1.00      1.00         5
           1       1.00      0.60      0.75         5
           2       0.71      1.00      0.83         5

    accuracy                           0.87        15
   macro avg       0.90      0.87      0.86        15
weighted avg       0.90      0.87      0.86        15
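
The numbers in the classification report follow directly from the confusion matrix printed above. As a small worked check (a sketch using only NumPy, with the matrix copied from the output above), precision is the diagonal divided by the column sums, recall is the diagonal divided by the row sums, and accuracy is the trace divided by the total count.

```python
import numpy as np

# Confusion matrix copied from the output above (rows = actual, columns = predicted)
cm = np.array([[5, 0, 0],
               [0, 3, 2],
               [0, 0, 5]])

correct = np.diag(cm)
precision = correct / cm.sum(axis=0)  # correct predictions / everything predicted as that class
recall = correct / cm.sum(axis=1)     # correct predictions / everything that truly is that class
accuracy = np.trace(cm) / cm.sum()

print("Precision :", np.round(precision, 2))
print("Recall    :", np.round(recall, 2))
print("Accuracy  : %.2f" % accuracy)
```

The two versicolor samples predicted as virginica explain both the 0.60 recall for class 1 and the 0.71 precision for class 2.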

Below we use the show_weights() method to display the feature importances, the decision tree itself, the method used by the ML algorithm for prediction, and a description of the algorithm.

In [ ]:
show_weights(dtree, feature_names=iris.feature_names,
             show=["feature_importances", "decision_tree", "method", "description"])
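
For comparison, the same two pieces of information can be read from scikit-learn directly. This is a rough sketch that assumes a separately fitted tree (named tree here, with a fixed random_state, so its numbers will differ from dtree above); it uses the feature_importances_ attribute and the export_text() helper from sklearn.tree rather than any eli5 function.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()

# Illustrative tree, assumed for this sketch only
tree = DecisionTreeClassifier(random_state=0).fit(iris.data, iris.target)

# Feature importances, sorted from most to least important
ranked = sorted(zip(iris.feature_names, tree.feature_importances_),
                key=lambda pair: pair[1], reverse=True)
for name, weight in ranked:
    print("%.4f  %s" % (weight, name))

# Plain-text rendering of the split rules
tree_text = export_text(tree, feature_names=list(iris.feature_names))
print(tree_text)
```

The importances sum to 1, which is consistent with the "all values sum to 1" note that eli5 prints for decision trees.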

Below we are using a random sample from the test dataset and plotting the contribution of individual features in predicting its class.

In [ ]:
rand = random.randint(0, len(X_test) - 1)  # randint includes both endpoints, so stay within valid indices

print("Actual Target Value : ", iris.target_names[Y_test[rand]])

show_prediction(dtree,
                X_test[rand],
                feature_names=iris.feature_names,
                targets=[0,1,2], target_names=iris.target_names,
                show_feature_values=True,
                show=["targets", "method", "description"]
                )
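
To give an intuition for where per-feature contributions of a tree prediction can come from, here is a sketch of the decision-path idea: start from the class distribution at the root (the bias) and attribute each change in that distribution to the feature used at the split. This is a simplified reimplementation using scikit-learn only, written under the assumption that it mirrors the general approach; it is not eli5's implementation, and the names tree, sample, and node_proba are illustrative.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()

# Illustrative tree, assumed for this sketch only
tree = DecisionTreeClassifier(random_state=0).fit(iris.data, iris.target)
sample = iris.data[100:101]  # a single sample, kept 2-D

t = tree.tree_
# Class distribution at every node, normalized to probabilities
node_values = t.value[:, 0, :]
node_proba = node_values / node_values.sum(axis=1, keepdims=True)

# Node ids visited by the sample, from root to leaf
path = tree.decision_path(sample).indices

bias = node_proba[path[0]]  # distribution at the root
contributions = np.zeros((iris.data.shape[1], len(iris.target_names)))
for parent, child in zip(path[:-1], path[1:]):
    # Credit the change in class probabilities to the feature split on at the parent
    contributions[t.feature[parent]] += node_proba[child] - node_proba[parent]

total = bias + contributions.sum(axis=0)
print("Bias          :", np.round(bias, 3))
print("Contributions :")
print(np.round(contributions, 3))
print("Total         :", np.round(total, 3))
```

Because the changes telescope along the path, the bias plus all contributions equals the class probabilities at the leaf, which is what predict_proba() returns for the sample.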

We'll now generate a few Explanation objects and format the explanations generated by eli5 in different ways.

Below we are generating the eli5 Explanation object by using the explain_weights_sklearn() method passing it decision tree, feature names, and target names.

In [37]:
from eli5.sklearn import explain_weights_sklearn

explanation = explain_weights_sklearn(dtree, feature_names=iris.feature_names, target_names=iris.target_names)

Below we format the Explanation object generated in the previous cell as a pandas dataframe.

In [38]:
from eli5.formatters import format_as_dataframe, format_as_dataframes

format_as_dataframe(explanation)
Out[38]:
feature weight
0 petal length (cm) 0.506667
1 petal width (cm) 0.400028
2 sepal width (cm) 0.064444
3 sepal length (cm) 0.028861

Below we format the explanation generated earlier as text using the format_as_text() method. The output lists the feature importances followed by the decision tree, showing the split conditions that lead to each leaf and the class probabilities at that leaf.

In [39]:
from eli5.formatters import format_as_text

print(format_as_text(explanation))
Explained as: decision tree

Decision tree feature importances; values are numbers 0 <= x <= 1;
all values sum to 1.

0.5067  petal length (cm)
0.4000  petal width (cm)
0.0644  sepal width (cm)
0.0289  sepal length (cm)

petal length (cm) <= 2.450  (33.3%)  ---> [1.000, 0.000, 0.000]
petal length (cm) > 2.450  (66.7%)
    petal width (cm) <= 1.750  (36.3%)
        sepal length (cm) <= 7.100  (35.6%)
            petal width (cm) <= 1.350  (20.7%)  ---> [0.000, 1.000, 0.000]
            petal width (cm) > 1.350  (14.8%)
                sepal width (cm) <= 2.650  (2.2%)  ---> [0.000, 0.000, 1.000]
                sepal width (cm) > 2.650  (12.6%)
                    sepal width (cm) <= 2.850  (3.7%)
                        petal length (cm) <= 4.950  (2.2%)  ---> [0.000, 1.000, 0.000]
                        petal length (cm) > 4.950  (1.5%)
                            petal width (cm) <= 1.550  (0.7%)  ---> [0.000, 0.000, 1.000]
                            petal width (cm) > 1.550  (0.7%)  ---> [0.000, 1.000, 0.000]
                    sepal width (cm) > 2.850  (8.9%)  ---> [0.000, 1.000, 0.000]
        sepal length (cm) > 7.100  (0.7%)  ---> [0.000, 0.000, 1.000]
    petal width (cm) > 1.750  (30.4%)
        sepal length (cm) <= 6.050  (5.2%)
            sepal length (cm) <= 5.850  (3.7%)  ---> [0.000, 0.000, 1.000]
            sepal length (cm) > 5.850  (1.5%)
                sepal width (cm) <= 3.100  (0.7%)  ---> [0.000, 0.000, 1.000]
                sepal width (cm) > 3.100  (0.7%)  ---> [0.000, 1.000, 0.000]
        sepal length (cm) > 6.050  (25.2%)  ---> [0.000, 0.000, 1.000]

Below we have plotted the explanation as HTML.

In [ ]:
from eli5.formatters import format_as_html
from IPython.display import HTML

html_rep = format_as_html(explanation)
HTML(data=html_rep)

Below we are plotting feature importance and decision tree as HTML using the explain_decision_tree() method of eli5.

In [ ]:
from eli5.sklearn import explain_weights_sklearn, explain_decision_tree

explain_decision_tree(dtree, feature_names=iris.feature_names)

Below we generate an explanation for a random sample from the test data using the explain_prediction_tree_classifier() method of the sklearn.explain_prediction module of eli5.

In [42]:
from eli5.sklearn import explain_prediction

rand = random.randint(0, len(X_test) - 1)  # randint includes both endpoints, so stay within valid indices

print("Actual Target Value : ", iris.target_names[Y_test[rand]])

explanation = explain_prediction.explain_prediction_tree_classifier(dtree, X_test[rand],
                                                     targets=[0,1,2], target_names=iris.target_names,
                                                     feature_names=iris.feature_names
                                                    )
Actual Target Value :  versicolor

Below we render the explanation generated for the random sample in the previous cell as HTML.

In [ ]:
HTML(format_as_html(explanation))

Unstructured Data (Text) : Classification

In this section, we'll show how to use eli5 to explain a model trained on unstructured text data. If you want to learn in-depth about the various scikit-learn estimators used in this section, please feel free to check our tutorial on the same.

Example 1

The dataset that we'll use for explaining the usage of eli5 with unstructured data (text) is the spam/ham messages dataset available from UCI. Below we are downloading it from the UCI repository and then unzipping it.

In [47]:
!wget https://archive.ics.uci.edu/ml/machine-learning-databases/00228/smsspamcollection.zip
!unzip smsspamcollection.zip
--2020-10-14 17:47:04--  https://archive.ics.uci.edu/ml/machine-learning-databases/00228/smsspamcollection.zip
Resolving archive.ics.uci.edu (archive.ics.uci.edu)... 128.195.10.252
Connecting to archive.ics.uci.edu (archive.ics.uci.edu)|128.195.10.252|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 203415 (199K) [application/x-httpd-php]
Saving to: ‘smsspamcollection.zip’

smsspamcollection.z 100%[===================>] 198.65K  87.0KB/s    in 2.3s

2020-10-14 17:47:08 (87.0 KB/s) - ‘smsspamcollection.zip’ saved [203415/203415]

Archive:  smsspamcollection.zip
  inflating: SMSSpamCollection
  inflating: readme

Below we are loading the spam/ham dataset by splitting each line of the file.

In [44]:
with open('SMSSpamCollection') as f:
    data = [line.strip().split('\t') for line in f.readlines()]

y, text = zip(*data)
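
To make the parsing step concrete, here is a tiny self-contained sketch of the same idea on in-memory lines. The two sample lines are made up for illustration, and the maxsplit argument is an optional variation (not in the cell above) that keeps a message intact even if it happens to contain a tab.

```python
# Made-up lines in the same "label<TAB>message" layout as the SMSSpamCollection file
sample_lines = [
    "ham\tSee you at lunch tomorrow\n",
    "spam\tFree entry to win a prize, call now\n",
]

# Split each line once on the first tab: left part is the label, right part is the message
rows = [line.rstrip("\n").split("\t", 1) for line in sample_lines]

# zip(*rows) transposes the list of [label, message] pairs into two tuples
labels, messages = zip(*rows)

print(labels)
print(messages)
```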

We can count the number of spam and ham samples by passing the target values to Counter from the collections library.

In [45]:
import collections

collections.Counter(y)
Out[45]:
Counter({'ham': 4827, 'spam': 747})

All scikit-learn models expect the input data to be a numeric matrix. The dataset that we have is a list of strings. We'll transform this text dataset into a matrix of floats using the TfidfVectorizer class of scikit-learn, which calculates a tf-idf value for each word and creates a matrix based on the words present in each sample.

We'll start by dividing the dataset into train/test sets, transform both sets using TfidfVectorizer, fit a random forest classifier on the train data, and print various classification metrics like accuracy, confusion matrix, and classification report evaluated on the test dataset.

If you are interested in learning about the inner workings of TfidfVectorizer, please feel free to visit our tutorial on feature extraction from text data using scikit-learn where we explain it in-depth.

In [46]:
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix, classification_report

text_train, text_test, y_train, y_test = train_test_split(text, y,
                                                          random_state=42,
                                                          test_size=0.25,
                                                          stratify=y)


tfidf_vectorizer = TfidfVectorizer(analyzer="word", stop_words='english')
tfidf_vectorizer.fit(text_train)

X_train_tfidf = tfidf_vectorizer.transform(text_train)
X_test_tfidf = tfidf_vectorizer.transform(text_test)


print("Train/Test Vector Size : ", X_train_tfidf.shape, X_test_tfidf.shape)

rf = RandomForestClassifier()

rf.fit(X_train_tfidf, y_train)

print("Test  Accuracy : %.2f"%rf.score(X_test_tfidf, y_test))
print("Train Accuracy : %.2f"%rf.score(X_train_tfidf, y_train))
print()
print("Confusion Matrix : ")
print(confusion_matrix(y_test, rf.predict(X_test_tfidf)))
print()
print("Classification Report")
print(classification_report(y_test, rf.predict(X_test_tfidf)))
Train/Test Vector Size :  (4180, 7184) (1394, 7184)
Test  Accuracy : 0.97
Train Accuracy : 1.00

Confusion Matrix :
[[1207    0]
 [  38  149]]

Classification Report
              precision    recall  f1-score   support

         ham       0.97      1.00      0.98      1207
        spam       1.00      0.80      0.89       187

    accuracy                           0.97      1394
   macro avg       0.98      0.90      0.94      1394
weighted avg       0.97      0.97      0.97      1394
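
As a brief aside on the text-to-matrix step used above, the sketch below runs TfidfVectorizer on a three-message toy corpus (made up for illustration) so the resulting matrix is small enough to inspect. It relies only on the vocabulary_ attribute and default settings; the names corpus and vec are illustrative.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

# Toy corpus, made up for this sketch
corpus = ["free prize call now", "call me later", "win a free prize"]

vec = TfidfVectorizer(analyzer="word")
X = vec.fit_transform(corpus)  # sparse matrix: one row per message, one column per word

# Words ordered by their column index in the matrix
words = sorted(vec.vocabulary_, key=vec.vocabulary_.get)

print("Matrix shape :", X.shape)
print("Vocabulary   :", words)
print(np.round(X.toarray(), 2))
```

Note that the single-letter token "a" does not get a column because the default token pattern keeps only tokens of two or more characters, and each row is scaled to unit length by default.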

Below we display weights using the show_weights() method, but unlike the previous examples on structured data, this time the weights belong to the words of the text corpus. To show the mapping between words and weights, we need to pass the TF-IDF vectorizer created in the previous cell to the vec parameter.

We can also pass the list returned by the get_feature_names() method of the TF-IDF vectorizer (get_feature_names_out() in recent scikit-learn releases, where get_feature_names() has been removed) to the feature_names parameter of show_weights() and it'll work as well.

In [ ]:
eli5.show_weights(rf, vec=tfidf_vectorizer, targets=[0,1], target_names=["ham", "spam"], top=15)
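
The word-to-weight mapping that the vectorizer provides can also be reproduced by hand. The following sketch assumes a toy corpus and a small forest (all names and messages here are made up for illustration) and pairs each column of the TF-IDF matrix with the forest's feature_importances_ using the vectorizer's vocabulary_ attribute.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer

# Toy data, made up for this sketch
messages = [
    "win a free prize now", "free cash prize waiting", "claim your free prize",
    "are we meeting for lunch", "call me when you are home", "see you at lunch tomorrow",
]
labels = ["spam", "spam", "spam", "ham", "ham", "ham"]

vec = TfidfVectorizer()
X = vec.fit_transform(messages)
forest = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, labels)

# vocabulary_ maps word -> column index; sorting by index lines words up with the columns
words = sorted(vec.vocabulary_, key=vec.vocabulary_.get)

# Pair each word with its importance and list the strongest ones first
ranked = sorted(zip(words, forest.feature_importances_),
                key=lambda pair: pair[1], reverse=True)
for word, weight in ranked[:5]:
    print("%.4f  %s" % (weight, word))
```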

Below we show the contribution of individual words from a randomly chosen misclassified test sample to the prediction of its class. The original text message is shown as well, with the words that contributed most to the prediction highlighted.

In [ ]:
preds = rf.predict(X_test_tfidf)

false_preds = np.argwhere((preds != y_test)).flatten()

rand  = random.choice(false_preds)

print("Actual Target Value : ", y_test[rand])
print("Model Prediction : ", rf.predict(X_test_tfidf[rand])[0])

eli5.show_prediction(rf, text_test[rand], vec=tfidf_vectorizer, target_names=["ham", "spam"], top=10)
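
The selection of a misclassified sample can be illustrated in isolation. This is a minimal sketch on made-up label arrays (y_true and y_pred are hypothetical stand-ins for y_test and preds); it also guards against the case where the model makes no mistakes, in which case choosing from an empty list would raise an error.

```python
import random
import numpy as np

# Made-up labels standing in for the actual targets and the model predictions
y_true = np.array(["ham", "spam", "ham", "spam", "ham"])
y_pred = np.array(["ham", "ham", "ham", "spam", "spam"])

# Indices where the prediction differs from the actual label
wrong = np.flatnonzero(y_pred != y_true)

if wrong.size > 0:
    idx = random.choice(wrong.tolist())  # plain Python int, safe for indexing lists
    print("Index :", idx, "| Actual :", y_true[idx], "| Predicted :", y_pred[idx])
else:
    print("No misclassified samples to explain")
```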

Example 2

The second example for explaining unstructured text data uses the same spam/ham messages. This time, the TF-IDF vectorizer is based on character n-grams of length 3 to 5 instead of words. All other steps are the same as in the previous example; only the way TF-IDF transforms text into a float matrix changes.

In [49]:
tfidf_vectorizer = TfidfVectorizer(analyzer="char", ngram_range=(3,5))  # stop_words is dropped: scikit-learn applies it only when analyzer="word"
tfidf_vectorizer.fit(text_train)

X_train_tfidf = tfidf_vectorizer.transform(text_train)
X_test_tfidf = tfidf_vectorizer.transform(text_test)


print("Train/Test Vector Size : ", X_train_tfidf.shape, X_test_tfidf.shape)

rf = RandomForestClassifier()

rf.fit(X_train_tfidf, y_train)

print("Test  Accuracy : %.2f"%rf.score(X_test_tfidf, y_test))
print("Train Accuracy : %.2f"%rf.score(X_train_tfidf, y_train))
print()
print("Confusion Matrix : ")
print(confusion_matrix(y_test, rf.predict(X_test_tfidf)))
print()
print("Classification Report")
print(classification_report(y_test, rf.predict(X_test_tfidf)))
Train/Test Vector Size :  (4180, 130143) (1394, 130143)
Test  Accuracy : 0.98
Train Accuracy : 1.00

Confusion Matrix :
[[1207    0]
 [  25  162]]

Classification Report
              precision    recall  f1-score   support

         ham       0.98      1.00      0.99      1207
        spam       1.00      0.87      0.93       187

    accuracy                           0.98      1394
   macro avg       0.99      0.93      0.96      1394
weighted avg       0.98      0.98      0.98      1394
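
To see what the character n-gram features in this example look like, the sketch below asks a vectorizer configured the same way for its analyzer and applies it to a short made-up string. It uses the build_analyzer() method of TfidfVectorizer; the input "Spam!" is only an illustration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

vec = TfidfVectorizer(analyzer="char", ngram_range=(3, 5))

# build_analyzer() returns the function that turns raw text into features
analyze = vec.build_analyzer()

# Text is lowercased first, then every run of 3, 4, and 5 characters is emitted
ngrams = analyze("Spam!")
print(ngrams)
```

A five-character string already yields six features, which helps explain why the feature count jumps from 7184 words in the first example to 130143 n-grams here.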

Below we are plotting a random sample from test data which was predicted wrong by the model and highlighting which n-grams contributed most to that prediction.

In [ ]:
preds = rf.predict(X_test_tfidf)

false_preds = np.argwhere((preds != y_test)).flatten()

rand  = random.choice(false_preds)

print("Actual Target Value : ", y_test[rand])
print("Model Prediction : ", rf.predict(X_test_tfidf[rand])[0])

eli5.show_prediction(rf, text_test[rand], vec=tfidf_vectorizer, target_names=["ham", "spam"], top=10)

Sunny Solanki