Share @ LinkedIn Facebook  dice-ml, interpret-ml-models
dice-ml - Diverse Counterfactual Explanations for ML Models [Python]

dice-ml - Diverse Counterfactual Explanations for ML Models

The machine learning models have become quite common nowadays and people are using them in almost all domains (finance, insurance, education, etc) to make the first round of decisions. It has become quite common to expect explanations of a prediction made by the model. Python has a list of libraries (SHAP, lime, treeinterpreter, interpret-ml, interpret-text, eli5, etc) which can help us understand why the particular prediction was made and how much each data feature contributed to the prediction. The explanation of the model also gives confidence to the developer and others about the reliability of the model letting them know whether the features which should have ideally contributed to explanations are the only feature contributing or not. We have already created various tutorials on libraries mentioned earlier about machine learning model prediction interpretation (see references section for links to them). As a part of this tutorial, we'll be concentrating on how to use another Python library named dice-ml designed to explain machine learning predictions. It's open-source and designed by the Microsoft research team.

The DiCE is based on a concept of generating counterfactual example to our original example. It'll generate other examples that will have the majority of feature values almost the same as our original example with few values tweaked which will result in the model predicting an opposite class than the one it has predicted currently. As an example, let’s consider that a person applies for a loan at a bank that uses the ML algorithm to predict whether his/her application will be approved. After a person applies online by filling the loan application form, the model predicts that his loan will be rejected. Now just saying the person that his loan was rejected due to a low credit score won't help him/her as he'll need to understand what he/she could do that will result in his/her application getting approved next time. The DiCE generates counterfactual examples in this situation which will generate other examples that will have a majority of feature values same as the applicant with few changes like increasing monthly income, showing collateral with high value, etc will result in his loan getting approved. This will guide a person taking steps in a direction that will get him/her loan approved next time. We can even generate counterfactual examples of the same class as well with DiCE.

The dice-ml is mainly used to generate counterfactual examples for binary classification problems as of now. We'll be explaining how we can generate counterfactual examples for classification problems with Keras/Tensorflow and Pytorch models. We'll also try to generate counterfactual example with regression tasks. Please make a note that this tutorial expects that reader has a basic knowledge of Keras, scikit-learn metrics, and Pytorch as we'll be building on them and won't be spending much time in the detailed explanation on them due to this assumption.

We'll start by importing the necessary libraries.

In [1]:
import pandas as pd
import numpy as np

import warnings

warnings.filterwarnings("ignore")
pd.set_option("display.max_columns", 35)

import dice_ml

Load Datasets

We'll be using the below-mentioned datasets when explaining the usage of dice-ml.

  • Boston Housing Dataset - Its a regression dataset which has information about housing price in Boston area.
  • Breast Cancer Dataset - It’s a classification dataset that has information about breast cancer tumor type (malignant & benign).

We have first loaded both the datasets and even printed their feature description. We have even loaded the dataset as a pandas dataframe and printed the first few examples.

In [2]:
from sklearn.datasets import fetch_california_housing, load_boston

boston = load_boston()

boston_df = pd.DataFrame(data=boston.data, columns=boston.feature_names)
boston_df["Price"] = boston.target

boston_df.head()
Out[2]:
CRIM ZN INDUS CHAS NOX RM AGE DIS RAD TAX PTRATIO B LSTAT Price
0 0.00632 18.0 2.31 0.0 0.538 6.575 65.2 4.0900 1.0 296.0 15.3 396.90 4.98 24.0
1 0.02731 0.0 7.07 0.0 0.469 6.421 78.9 4.9671 2.0 242.0 17.8 396.90 9.14 21.6
2 0.02729 0.0 7.07 0.0 0.469 7.185 61.1 4.9671 2.0 242.0 17.8 392.83 4.03 34.7
3 0.03237 0.0 2.18 0.0 0.458 6.998 45.8 6.0622 3.0 222.0 18.7 394.63 2.94 33.4
4 0.06905 0.0 2.18 0.0 0.458 7.147 54.2 6.0622 3.0 222.0 18.7 396.90 5.33 36.2
In [3]:
from sklearn.datasets import load_breast_cancer

breast_cancer = load_breast_cancer()

breast_cancer_df = pd.DataFrame(data=breast_cancer.data, columns=breast_cancer.feature_names)
breast_cancer_df["TumorType"] = breast_cancer.target

breast_cancer_df.head()
Out[3]:
mean radius mean texture mean perimeter mean area mean smoothness mean compactness mean concavity mean concave points mean symmetry mean fractal dimension radius error texture error perimeter error area error smoothness error compactness error concavity error concave points error symmetry error fractal dimension error worst radius worst texture worst perimeter worst area worst smoothness worst compactness worst concavity worst concave points worst symmetry worst fractal dimension TumorType
0 17.99 10.38 122.80 1001.0 0.11840 0.27760 0.3001 0.14710 0.2419 0.07871 1.0950 0.9053 8.589 153.40 0.006399 0.04904 0.05373 0.01587 0.03003 0.006193 25.38 17.33 184.60 2019.0 0.1622 0.6656 0.7119 0.2654 0.4601 0.11890 0
1 20.57 17.77 132.90 1326.0 0.08474 0.07864 0.0869 0.07017 0.1812 0.05667 0.5435 0.7339 3.398 74.08 0.005225 0.01308 0.01860 0.01340 0.01389 0.003532 24.99 23.41 158.80 1956.0 0.1238 0.1866 0.2416 0.1860 0.2750 0.08902 0
2 19.69 21.25 130.00 1203.0 0.10960 0.15990 0.1974 0.12790 0.2069 0.05999 0.7456 0.7869 4.585 94.03 0.006150 0.04006 0.03832 0.02058 0.02250 0.004571 23.57 25.53 152.50 1709.0 0.1444 0.4245 0.4504 0.2430 0.3613 0.08758 0
3 11.42 20.38 77.58 386.1 0.14250 0.28390 0.2414 0.10520 0.2597 0.09744 0.4956 1.1560 3.445 27.23 0.009110 0.07458 0.05661 0.01867 0.05963 0.009208 14.91 26.50 98.87 567.7 0.2098 0.8663 0.6869 0.2575 0.6638 0.17300 0
4 20.29 14.34 135.10 1297.0 0.10030 0.13280 0.1980 0.10430 0.1809 0.05883 0.7572 0.7813 5.438 94.44 0.011490 0.02461 0.05688 0.01885 0.01756 0.005115 22.54 16.67 152.20 1575.0 0.1374 0.2050 0.4000 0.1625 0.2364 0.07678 0

Keras/Tensorflow Examples

As a part of this section, we'll explain how we can use dice-ml to generate counterfactual examples for Keras/Tensorflow models. We'll be explaining both regression and classification models.

Regression

We'll start by dividing the Boston housing regression dataset into the train (90%) and test (10%) sets.

In [4]:
from sklearn.model_selection import train_test_split

print("Dataset Size : ", boston.data.shape, boston.target.shape)

X_train, X_test, Y_train, Y_test = train_test_split(boston.data, boston.target,
                                                    train_size=0.90,
                                                    random_state=123)

print("Train/Test Sizes : ",X_train.shape, X_test.shape, Y_train.shape, Y_test.shape)
Dataset Size :  (506, 13) (506,)
Train/Test Sizes :  (455, 13) (51, 13) (455,) (51,)

We have then created a simple regression model using Keras which has 3 hidden layers with all of them having 50 units. We have used relu as the activation function in all hidden layers. Our output layer is a single unit which will be predicting the price of a house.

In [5]:
import tensorflow.keras

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
In [6]:
model = Sequential([
            Dense(50, activation="relu", input_shape=(len(boston.feature_names), )),
            Dense(50, activation="relu"),
            Dense(50, activation="relu"),
            Dense(1),
           ])

model.summary()
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
dense (Dense)                (None, 50)                700
_________________________________________________________________
dense_1 (Dense)              (None, 50)                2550
_________________________________________________________________
dense_2 (Dense)              (None, 50)                2550
_________________________________________________________________
dense_3 (Dense)              (None, 1)                 51
=================================================================
Total params: 5,851
Trainable params: 5,851
Non-trainable params: 0
_________________________________________________________________

We have now complied our model with adam as an optimizer and mean_squared_error as loss function which it'll be trying to reduce. We'll also be calculating mean absolute error metrics for each iteration through data.

In [7]:
model.compile(optimizer="adam", loss="mean_squared_error", metrics=["mae"])

Below we are fitting our model to train data with a batch size of 8 for 100 epochs.

In [8]:
%%time

history = model.fit(X_train, Y_train, batch_size=8, epochs=100, verbose=0)
CPU times: user 4.57 s, sys: 385 ms, total: 4.95 s
Wall time: 4.2 s

We have now printed our model performance by evaluating the r2 score and mean squared error on train and test datasets.

If you are interested in learning about various metrics available with sklearn then please feel free to check our tutorial on the same.

In [9]:
from sklearn.metrics import mean_squared_error, r2_score

print("Train MSE : %.2f"%mean_squared_error(Y_train, model.predict(X_train)))
print("Test  MSE : %.2f"%mean_squared_error(Y_test, model.predict(X_test)))

print("Train R2 Score : %.2f"%r2_score(Y_train, model.predict(X_train)))
print("Test  R2 Score : %.2f"%r2_score(Y_test, model.predict(X_test)))
Train MSE : 13.07
Test  MSE : 30.17
Train R2 Score : 0.84
Test  R2 Score : 0.74

The process for creating a counterfactual explanation using dice-ml consists of a few simple steps as mentioned below.

  • Create dice_ml.Data() instance with background data.
  • Create dice_ml.Model() instance with actual trained model.
  • Create dice_ml.Dice() instance with data and model instances created in the previous two steps.
  • Create explanations using the generate_counterfactuals() method of Dice instance by giving it sample as a dictionary for which we want to generate counterfactual examples.

Below we have created an instance of dice_ml.Data() by giving it our Boston dataframe, list of column continuous feature column names, and target column name. We have then created an instance of dice_ml.Model() by giving it our Keras model as input. As of last we have created a Dice instance by giving data and model as input.

In [10]:
d = dice_ml.Data(dataframe=boston_df, continuous_features=boston.feature_names.tolist(), outcome_name='Price')
In [11]:
m = dice_ml.Model(model=model, backend="TF2")
In [12]:
# initiate DiCE
exp = dice_ml.Dice(d, m)
exp
Out[12]:
<dice_ml.explainer_interfaces.dice_tensorflow2.DiceTensorFlow2 at 0x7fbe4afb0e48>

Below we have taken a random sample from the test set and generated a dictionary of sample for which counterfactual explanations will be generated.

In [13]:
import random

idx = random.randint(1, len(X_test))

print("Actual Price : %.2f"%Y_test[idx])

sample = dict(zip(boston.feature_names, X_test[idx]))
sample
Actual Price : 24.10
Out[13]:
{'CRIM': 0.03445,
 'ZN': 82.5,
 'INDUS': 2.03,
 'CHAS': 0.0,
 'NOX': 0.415,
 'RM': 6.162,
 'AGE': 38.4,
 'DIS': 6.27,
 'RAD': 2.0,
 'TAX': 348.0,
 'PTRATIO': 14.7,
 'B': 393.77,
 'LSTAT': 7.43}

We can generate counterfactual examples by using the generate_counterfactuals() method of Dice instance as explained below. we have called method by giving it sample from the previous step, a number of counterfactual to generate and desired class of outcome. We are trying to generate other same samples as our given example for this regression problem. We'll be generating the opposite class when trying classification problems. Below we have given some of the important parameters of the generate_counterfactuals() method.

  • generate_counterfactuals - It accepts a dictionary of our test sample where keys are feature names and values are feature values.
  • total_CFs - It accepts integer specifying how many counterfactual examples, we want to generate.
  • desired_class -It accepts integer 0 or 1 and string opposite. The 0 will generate the opposite class and 1 will generate the same class counterfactual examples. The default is opposite.
  • proximity_weight - It accepts float value specifying how close counterfactuals are to the original query sample. The higher value means more close. The default is 0.5.
  • diversity_weight - It accepts float value specifying how diverse counterfactuals are from the original query sample. The higher value means more diversity. The default is 1.0.
  • algorithm - It accepts integer specifying algorithm of finding counterfactuals.
    • DiverseCF
    • RandomInitCF
  • yloss_type - It accepts a string specifying a loss that will be optimized (reduced).
    • l2_loss
    • log_loss
    • hinge_loss - Default

Below we have generated an explanation instance with 4 counterfactual explanations with the same features as that of our original query sample.

In [14]:
dice_exp = exp.generate_counterfactuals(sample, total_CFs=4, desired_class=1)
WARNING:root: MAD for feature ZN is 0, so replacing it with 1.0 to avoid error.
WARNING:root: MAD for feature CHAS is 0, so replacing it with 1.0 to avoid error.
Diverse Counterfactuals found! total time taken: 02 min 04 sec

The explanation instance has a list of methods that can be used to visualize counterfactual examples generated. Below we have called visualize_as_dataframe() to print counterfactual examples. It prints the original sample and counterfactual examples generated so that we can compare them.

In the next cell, we have again called the same method with parameter show_only_changes set to True which will only show the difference in feature values from the original sample.

The explanation instance also has a method named visualize_as_list() which will frame results as Python lists.

In [15]:
dice_exp.visualize_as_dataframe()
Query instance (original outcome : 1)
CRIM ZN INDUS CHAS NOX RM AGE DIS RAD TAX PTRATIO B LSTAT Price
0 0.03445 82.5 2.0 0.0 0.415 6.162 38.4 6.27 2.0 348.0 14.7 393.8 7.43 0.739785
Diverse Counterfactual set (new outcome : 1)
CRIM ZN INDUS CHAS NOX RM AGE DIS RAD TAX PTRATIO B LSTAT Price
0 0.39550 99.8 2.6 0.0 0.415 6.605 32.4 6.7100 2.9 347.5 14.7 393.4 4.24 0.856
1 0.33600 61.0 4.1 0.0 0.415 5.970 38.4 6.2591 1.0 348.5 13.7 396.5 5.25 0.823
2 0.31266 88.3 1.3 0.8 0.451 6.368 46.5 5.3340 2.0 283.2 15.2 396.9 7.46 2.276
3 0.03993 82.5 2.1 0.0 0.454 6.081 38.4 6.2809 2.8 381.1 14.7 388.2 9.86 0.689
In [16]:
dice_exp.visualize_as_dataframe(show_only_changes=True)
Query instance (original outcome : 1)
CRIM ZN INDUS CHAS NOX RM AGE DIS RAD TAX PTRATIO B LSTAT Price
0 0.03445 82.5 2.0 0.0 0.415 6.162 38.4 6.27 2.0 348.0 14.7 393.8 7.43 0.739785
Diverse Counterfactual set (new outcome : 1)
CRIM ZN INDUS CHAS NOX RM AGE DIS RAD TAX PTRATIO B LSTAT Price
0 0.3955 99.8 2.6 - - 6.605 32.4 6.71 2.9 347.5 - 393.4 4.24 0.856
1 0.336 61.0 4.1 - - 5.97 - 6.2591 1.0 348.5 13.7 396.5 5.25 0.823
2 0.31266 88.3 1.3 0.8 0.451 6.368 46.5 5.334 - 283.2 15.2 396.9 7.46 2.276
3 0.03993 - 2.1 - 0.454 6.081 - 6.2809 2.8 381.1 - 388.2 9.86 0.689
In [17]:
dice_exp.visualize_as_list()
Query instance (original outcome : 1)
[0.03445, 82.5, 2.0, 0.0, 0.415, 6.162, 38.4, 6.27, 2.0, 348.0, 14.7, 393.8, 7.43, 0.7397846579551697]

Diverse Counterfactual set (new outcome : 1)
[0.3955, 99.8, 2.6, 0.0, 0.415, 6.605, 32.4, 6.71, 2.9, 347.5, 14.7, 393.4, 4.24, 0.856]
[0.336, 61.0, 4.1, 0.0, 0.415, 5.97, 38.4, 6.2591, 1.0, 348.5, 13.7, 396.5, 5.25, 0.823]
[0.31266, 88.3, 1.3, 0.8, 0.451, 6.368, 46.5, 5.334, 2.0, 283.2, 15.2, 396.9, 7.46, 2.276]
[0.03993, 82.5, 2.1, 0.0, 0.454, 6.081, 38.4, 6.2809, 2.8, 381.1, 14.7, 388.2, 9.86, 0.689]

Classification

Our classification example starts by dividing the breast cancer dataset into the train (90%) and test (10%) sets.

In [18]:
from sklearn.model_selection import train_test_split

print("Dataset Size : ", breast_cancer.data.shape, breast_cancer.target.shape)

X_train, X_test, Y_train, Y_test = train_test_split(breast_cancer.data, breast_cancer.target,
                                                    train_size=0.90,
                                                    stratify=breast_cancer.target,
                                                    random_state=123)

print("Train/Test Sizes : ",X_train.shape, X_test.shape, Y_train.shape, Y_test.shape)
Dataset Size :  (569, 30) (569,)
Train/Test Sizes :  (512, 30) (57, 30) (512,) (57,)

We have now generated a simple Keras model with 3 hidden layers of 50 units each and relu activation. We have kept the output layer as sigmoid with 1 unit as it'll generate a probability of the tumor being malignant or benign. We have then compiled the model and trained it on train data with 10 epochs.

In [19]:
model = Sequential([
            Dense(50, activation="relu", input_shape=(len(breast_cancer.feature_names), )),
            Dense(50, activation="relu"),
            Dense(50, activation="relu"),
            Dense(1, activation="sigmoid"),
           ])

model.summary()
Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
dense_4 (Dense)              (None, 50)                1550
_________________________________________________________________
dense_5 (Dense)              (None, 50)                2550
_________________________________________________________________
dense_6 (Dense)              (None, 50)                2550
_________________________________________________________________
dense_7 (Dense)              (None, 1)                 51
=================================================================
Total params: 6,701
Trainable params: 6,701
Non-trainable params: 0
_________________________________________________________________
In [20]:
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
In [21]:
%%time

history = model.fit(X_train, Y_train, batch_size=8, epochs=10, verbose=0)
CPU times: user 938 ms, sys: 76.7 ms, total: 1.01 s
Wall time: 830 ms

Below we have calculated train and test accuracy as well as classification report of test data.

In [22]:
from sklearn.metrics import accuracy_score, classification_report

test_preds = [0 if pred< 0.5 else 1 for pred in model.predict(X_test).flatten()]
train_preds = [0 if pred< 0.5 else 1 for pred in model.predict(X_train).flatten()]

print("Train Accuracy : %.2f"%accuracy_score(Y_train, train_preds))
print("Test  Accuracy : %.2f"%accuracy_score(Y_test, test_preds))

print("\nTest  Classification Report : ")
print(classification_report(Y_test, test_preds))
Train Accuracy : 0.93
Test  Accuracy : 0.93

Test  Classification Report :
              precision    recall  f1-score   support

           0       0.95      0.86      0.90        21
           1       0.92      0.97      0.95        36

    accuracy                           0.93        57
   macro avg       0.93      0.91      0.92        57
weighted avg       0.93      0.93      0.93        57

We have now followed the same process which we followed in our previous example to create a Dice instance. We have first created an instance of Data and Model. We have then created an instance of Dice using that Data and Model instances which will be used to generate counter factual explanations.

In [23]:
d = dice_ml.Data(dataframe=breast_cancer_df,
                 continuous_features=breast_cancer.feature_names.tolist(),
                 outcome_name='TumorType')

m = dice_ml.Model(model=model, backend="TF2")

# initiate DiCE
exp = dice_ml.Dice(d, m)

Below we have taken a random sample from the test set and generated a query instance dictionary from it.

In [24]:
import random

idx = random.randint(1, len(X_test))

sample = dict(zip(breast_cancer.feature_names, X_test[idx]))

print("Actual Class : %d"%Y_test[idx])

sample
Actual Class : 0
Out[24]:
{'mean radius': 13.11,
 'mean texture': 15.56,
 'mean perimeter': 87.21,
 'mean area': 530.2,
 'mean smoothness': 0.1398,
 'mean compactness': 0.1765,
 'mean concavity': 0.2071,
 'mean concave points': 0.09601,
 'mean symmetry': 0.1925,
 'mean fractal dimension': 0.07692,
 'radius error': 0.3908,
 'texture error': 0.9238,
 'perimeter error': 2.41,
 'area error': 34.66,
 'smoothness error': 0.007162,
 'compactness error': 0.02912,
 'concavity error': 0.05473,
 'concave points error': 0.01388,
 'symmetry error': 0.01547,
 'fractal dimension error': 0.007098,
 'worst radius': 16.31,
 'worst texture': 22.4,
 'worst perimeter': 106.4,
 'worst area': 827.2,
 'worst smoothness': 0.1862,
 'worst compactness': 0.4099,
 'worst concavity': 0.6376,
 'worst concave points': 0.1986,
 'worst symmetry': 0.3147,
 'worst fractal dimension': 0.1405}

We have now generated 4 counterfactual examples using the generate_counterfactuals() method. We have then printed them using visualize_as_dataframe(). Please make a note that it might not be possible for a model to get the same number of counter factual explanations all the time.

In [25]:
dice_exp = exp.generate_counterfactuals(sample, total_CFs=4)
Only 1 (required 4) Diverse Counterfactuals found for the given configuation, perhaps try with different values of proximity (or diversity) weights or learning rate... ; total time taken: 02 min 08 sec
In [26]:
dice_exp.visualize_as_dataframe()
Query instance (original outcome : 1)
mean radius mean texture mean perimeter mean area mean smoothness mean compactness mean concavity mean concave points mean symmetry mean fractal dimension radius error texture error perimeter error area error smoothness error compactness error concavity error concave points error symmetry error fractal dimension error worst radius worst texture worst perimeter worst area worst smoothness worst compactness worst concavity worst concave points worst symmetry worst fractal dimension TumorType
0 13.11 15.56 87.21 530.2 0.1398 0.1765 0.2 0.1 0.1925 0.07692 0.3908 0.9238 2.41 34.66 0.007162 0.02912 0.1 0.0 0.01547 0.007098 16.31 22.4 106.4 827.2 0.1862 0.4099 0.6 0.2 0.3147 0.1405 0.546322
Diverse Counterfactual set (new outcome : 0)
mean radius mean texture mean perimeter mean area mean smoothness mean compactness mean concavity mean concave points mean symmetry mean fractal dimension radius error texture error perimeter error area error smoothness error compactness error concavity error concave points error symmetry error fractal dimension error worst radius worst texture worst perimeter worst area worst smoothness worst compactness worst concavity worst concave points worst symmetry worst fractal dimension TumorType
0 13.14 16.16 88.02 573.6 0.1402 0.1768 0.2 0.1 0.1666 0.07623 0.3836 0.9966 2.518 36.90 0.007439 0.05372 0.1 0.0 0.01549 0.007079 16.42 20.06 103.1 921.2 0.1852 0.4203 1.0 0.2 0.3127 0.13828 0.518
1 13.09 15.54 87.07 532.5 0.1397 0.1763 0.3 0.2 0.1926 0.07570 0.3935 0.8623 2.431 32.25 0.007738 0.02922 0.1 0.0 0.01552 0.007070 16.33 22.37 106.2 831.2 0.1861 0.4109 1.2 0.0 0.3142 0.14065 0.491
2 13.12 15.38 87.27 532.7 0.1359 0.1807 0.2 0.1 0.1838 0.07674 0.3149 0.9456 2.542 32.63 0.006974 0.03089 0.0 0.0 0.01328 0.006815 16.25 21.77 104.9 1184.8 0.1850 0.4030 1.2 0.2 0.3084 0.15441 0.514
3 13.19 15.39 84.26 462.5 0.1406 0.1650 0.2 0.1 0.1916 0.07656 0.4261 0.9266 2.468 30.07 0.007775 0.03200 0.1 0.0 0.01539 0.007013 16.54 12.02 107.2 837.6 0.1837 0.4119 1.2 0.2 0.3197 0.14179 0.516

Please feel free to compare how changes in particular feature values are changing the prediction class.

In [27]:
dice_exp.visualize_as_dataframe(show_only_changes=True)
Query instance (original outcome : 1)
mean radius mean texture mean perimeter mean area mean smoothness mean compactness mean concavity mean concave points mean symmetry mean fractal dimension radius error texture error perimeter error area error smoothness error compactness error concavity error concave points error symmetry error fractal dimension error worst radius worst texture worst perimeter worst area worst smoothness worst compactness worst concavity worst concave points worst symmetry worst fractal dimension TumorType
0 13.11 15.56 87.21 530.2 0.1398 0.1765 0.2 0.1 0.1925 0.07692 0.3908 0.9238 2.41 34.66 0.007162 0.02912 0.1 0.0 0.01547 0.007098 16.31 22.4 106.4 827.2 0.1862 0.4099 0.6 0.2 0.3147 0.1405 0.546322
Diverse Counterfactual set (new outcome : 0)
mean radius mean texture mean perimeter mean area mean smoothness mean compactness mean concavity mean concave points mean symmetry mean fractal dimension radius error texture error perimeter error area error smoothness error compactness error concavity error concave points error symmetry error fractal dimension error worst radius worst texture worst perimeter worst area worst smoothness worst compactness worst concavity worst concave points worst symmetry worst fractal dimension TumorType
0 13.14 16.16 88.02 573.6 0.1402 0.1768 - - 0.1666 0.07623 0.3836 0.9966 2.518 36.9 0.007439 0.05372 - - 0.01549 0.007079 16.42 20.06 103.1 921.2 0.1852 0.4203 1.0 - 0.3127 0.13828 0.518
1 13.09 15.54 87.07 532.5 0.1397 0.1763 0.3 0.2 0.1926 0.0757 0.3935 0.8623 2.431 32.25 0.007738 0.02922 - - 0.01552 0.00707 16.33 22.37 106.2 831.2 0.1861 0.4109 1.2 0.0 0.3142 0.14065 0.491
2 13.12 15.38 87.27 532.7 0.1359 0.1807 - - 0.1838 0.07674 0.3149 0.9456 2.542 32.63 0.006974 0.03089 0.0 - 0.01328 0.006815 16.25 21.77 104.9 1184.8 0.185 0.403 1.2 - 0.3084 0.15441 0.514
3 13.19 15.39 84.26 462.5 0.1406 0.165 - - 0.1916 0.07656 0.4261 0.9266 2.468 30.07 0.007775 0.032 - - 0.01539 0.007013 16.54 12.02 107.2 837.6 0.1837 0.4119 1.2 - 0.3197 0.14179 0.516

Pytorch Examples

As a part of this section, we'll explain how we can use dice-ml to generate counterfactual explanations for Pytorch models.

Regression

As usual, we have started by dividing the dataset into the train (90%) and test (10%) sets.

In [28]:
from sklearn.model_selection import train_test_split

print("Dataset Size : ", boston.data.shape, boston.target.shape)

X_train, X_test, Y_train, Y_test = train_test_split(boston.data, boston.target,
                                                    train_size=0.90,
                                                    random_state=123)

print("Train/Test Sizes : ",X_train.shape, X_test.shape, Y_train.shape, Y_test.shape)
Dataset Size :  (506, 13) (506,)
Train/Test Sizes :  (455, 13) (51, 13) (455,) (51,)

We have now created a simple Pytorch model with three hidden layers each having 50 units. We are using relu as the activation function in all hidden layers.

In [29]:
import torch
import torch.nn as neural_net
from torch import optim
In [61]:
class  MultiLayerPerceptron(neural_net.Module):
    def __init__(self):
        super().__init__()
        self.layer1 = neural_net.Linear(len(boston.feature_names), 50)
        self.layer2 = neural_net.Linear(50, 50)
        self.layer3 = neural_net.Linear(50, 50)
        self.output_layer = neural_net.Linear(50, 1)
        self.relu = neural_net.ReLU()

    def forward(self, input_data):
        x = self.relu(self.layer1(input_data))
        x = self.relu(self.layer2(x))
        x = self.relu(self.layer3(x))
        output = self.output_layer(x)

        return output.flatten()

Below we have initialized the model, Adam optimizer, and mean squared error loss.

In [76]:
model = MultiLayerPerceptron()
optimizer = optim.Adam(params=model.parameters())
mse_loss = neural_net.MSELoss()

Below we have included logic for training our model on train data. We are training whole train data for 20 epochs with a batch size of 1 sample.

In [77]:
%%time

X_train, Y_train = torch.tensor(X_train, dtype=torch.float32), torch.tensor(Y_train, dtype=torch.float32)

print("MSE Loss Before Training : %.2f"% mse_loss(model(X_train).flatten(), Y_train))

n, c = X_train.shape
batch_size = 1

for epoch in range(20):
    for i in range((n - 1) // batch_size + 1):
        start_i = i * batch_size
        end_i = start_i + batch_size
        xb = X_train[start_i:end_i]
        yb = Y_train[start_i:end_i]
        pred = model(xb)
        loss = mse_loss(pred, yb)

        loss.backward()
        optimizer.step()
        optimizer.zero_grad()

print("MSE Loss After  Training : %.2f"% mse_loss(model(X_train).flatten(), Y_train))
MSE Loss Before Training : 538.71
MSE Loss After  Training : 18.77
CPU times: user 7.43 s, sys: 376 ms, total: 7.8 s
Wall time: 7.3 s

We have now evaluated mean squared error and r2 score on test and train sets.

In [78]:
from sklearn.metrics import mean_squared_error, r2_score

with torch.no_grad():
    X_test = torch.tensor(X_test, dtype=torch.float32)

    test_preds = model(X_test)
    train_preds = model(X_train)

    print("Train MSE : %.2f"%mean_squared_error(Y_train, train_preds.numpy()))
    print("Test  MSE : %.2f"%mean_squared_error(Y_test, test_preds.numpy()))

    print("Train R2 Score : %.2f"%r2_score(Y_train, train_preds.numpy()))
    print("Test  R2 Score : %.2f"%r2_score(Y_test, test_preds.numpy()))
Train MSE : 18.77
Test  MSE : 28.01
Train R2 Score : 0.77
Test  R2 Score : 0.76

We have now created a Dice instance which using a Data instance created using the Boston data frame and a Model instance generated using the Pytorch model. This instance will be used to generate counter factual explanations.

In [79]:
d = dice_ml.Data(dataframe=boston_df, continuous_features=boston.feature_names.tolist(), outcome_name='Price')

m = dice_ml.Model(model=model, backend="PYT")

# initiate DiCE
exp = dice_ml.Dice(d, m)

Below we have taken a random test sample and generated a query instance dictionary for which counter factual explanations will be generated.

In [80]:
import random

idx = random.randint(1, len(X_test))

print("Actual Price : %.2f"%Y_test[idx])

sample = dict(zip(boston.feature_names, X_test[idx].numpy()))
sample
Actual Price : 7.20
Out[80]:
{'CRIM': 18.0846,
 'ZN': 0.0,
 'INDUS': 18.1,
 'CHAS': 0.0,
 'NOX': 0.679,
 'RM': 6.434,
 'AGE': 100.0,
 'DIS': 1.8347,
 'RAD': 24.0,
 'TAX': 666.0,
 'PTRATIO': 20.2,
 'B': 27.25,
 'LSTAT': 29.05}

Now we have generated 4 counterfactual examples for our sample test instance dictionary using the Dice instance. We have generated counterfactual explanations which are almost the same as our sample as its regression problem.

In [87]:
dice_exp = exp.generate_counterfactuals(sample, total_CFs=4)
WARNING:root: MAD for feature ZN is 0, so replacing it with 1.0 to avoid error.
WARNING:root: MAD for feature CHAS is 0, so replacing it with 1.0 to avoid error.
Only 0 (required 4) Diverse Counterfactuals found for the given configuation, perhaps try with different values of proximity (or diversity) weights or learning rate... ; total time taken: 00 min 38 sec
In [88]:
dice_exp.visualize_as_dataframe()
Query instance (original outcome : 1)
CRIM ZN INDUS CHAS NOX RM AGE DIS RAD TAX PTRATIO B LSTAT Price
0 18.0846 0.0 18.1 0.0 0.679 6.434 100.0 1.8347 24.0 666.0 20.2 27.3 29.05 0.801476
Diverse Counterfactual set (new outcome : 0)
CRIM ZN INDUS CHAS NOX RM AGE DIS RAD TAX PTRATIO B LSTAT Price
0 19.72351 70.4 16.7 0.0 0.688 6.197 100.0 1.8308 24.0 658.9 20.0 40.9 30.34 0.776
1 16.11318 99.2 17.7 0.0 0.677 6.365 99.6 3.3593 24.0 672.2 20.2 35.3 29.36 0.776
2 18.68579 79.5 18.3 0.0 0.679 6.409 96.8 2.0663 24.0 629.1 20.6 18.9 31.02 0.777
3 17.93752 100.0 17.8 0.0 0.659 6.520 95.4 2.0637 24.0 684.3 21.3 22.5 29.90 0.774
In [89]:
dice_exp.visualize_as_dataframe(show_only_changes=True)
Query instance (original outcome : 1)
CRIM ZN INDUS CHAS NOX RM AGE DIS RAD TAX PTRATIO B LSTAT Price
0 18.0846 0.0 18.1 0.0 0.679 6.434 100.0 1.8347 24.0 666.0 20.2 27.3 29.05 0.801476
Diverse Counterfactual set (new outcome : 0)
CRIM ZN INDUS CHAS NOX RM AGE DIS RAD TAX PTRATIO B LSTAT Price
0 19.72351 70.4 16.7 - 0.688 6.197 - 1.8308 - 658.9 20.0 40.9 30.34 0.776
1 16.11318 99.2 17.7 - 0.677 6.365 99.6 3.3593 - 672.2 - 35.3 29.36 0.776
2 18.68579 79.5 18.3 - - 6.409 96.8 2.0663 - 629.1 20.6 18.9 31.02 0.777
3 17.93752 100.0 17.8 - 0.659 6.52 95.4 2.0637 - 684.3 21.3 22.5 29.9 0.774

Classification

For explaining the classification task Pytorch model with dice-ml, we'll start by dividing the breast cancer dataset into the train (90%) and test (10%) sets.

In [90]:
from sklearn.model_selection import train_test_split

print("Dataset Size : ", breast_cancer.data.shape, breast_cancer.target.shape)

X_train, X_test, Y_train, Y_test = train_test_split(breast_cancer.data, breast_cancer.target,
                                                    train_size=0.90,
                                                    stratify=breast_cancer.target,
                                                    random_state=123)

print("Train/Test Sizes : ",X_train.shape, X_test.shape, Y_train.shape, Y_test.shape)
Dataset Size :  (569, 30) (569,)
Train/Test Sizes :  (512, 30) (57, 30) (512,) (57,)

We have now created a simple Pytorch model of 3 hidden layers with each having 50 units. We have used relu as activation for each layer. Our last layer is sigmoid which will output probability between 0-1 predicting tumor type (malignant or benign).

In [102]:
class  MultiLayerPerceptron(neural_net.Module):
    def __init__(self):
        super().__init__()
        self.layer1 = neural_net.Linear(len(breast_cancer.feature_names), 50)
        self.layer2 = neural_net.Linear(50, 50)
        self.layer3 = neural_net.Linear(50, 50)
        self.output_layer = neural_net.Linear(50, 1)
        self.relu = neural_net.ReLU()
        self.sigmoid = neural_net.Sigmoid()

    def forward(self, input_data):
        x = self.relu(self.layer1(input_data))
        x = self.relu(self.layer2(x))
        x = self.relu(self.layer3(x))
        output = self.sigmoid(self.output_layer(x))

        return output.flatten()

Below we have initialized our model, Adam optimizer, and binary cross-entropy loss.

In [147]:
model = MultiLayerPerceptron()
optimizer = optim.Adam(params=model.parameters())
bce_loss = neural_net.BCELoss()

We have now written logic for training. We are looping through train data for 10 epochs with a batch size of 1.

In [148]:
%%time

X_train, Y_train = torch.tensor(X_train, dtype=torch.float32), torch.tensor(Y_train, dtype=torch.float32)

print("Binary Cross Entropy Loss Before Training : %.2f"% bce_loss(model(X_train).flatten(), Y_train))

n, c = X_train.shape
batch_size = 1

for epoch in range(20):
    for i in range((n - 1) // batch_size + 1):
        start_i = i * batch_size
        end_i = start_i + batch_size
        xb = X_train[start_i:end_i]
        yb = Y_train[start_i:end_i]
        preds = model(xb)
        loss = bce_loss(preds, yb)

        loss.backward()
        optimizer.step()
        optimizer.zero_grad()

print("Binary Cross Entropy Loss After  Training : %.2f"% mse_loss(model(X_train).flatten(), Y_train))
Binary Cross Entropy Loss Before Training : 4.20
Binary Cross Entropy Loss After  Training : 0.06
CPU times: user 4.54 s, sys: 249 ms, total: 4.79 s
Wall time: 4.34 s

Below we have generated the accuracy of the model on train and test data. We have even printed a classification report of model performance on test data.

In [149]:
from sklearn.metrics import accuracy_score, classification_report

with torch.no_grad():
    X_test = torch.tensor(X_test, dtype=torch.float32)

    test_preds = [0 if pred< 0.5 else 1 for pred in model(X_test)]
    train_preds = [0 if pred< 0.5 else 1 for pred in model(X_train)]

    print("Train Accuracy : %.2f"%accuracy_score(Y_train, train_preds))
    print("Test  Accuracy : %.2f"%accuracy_score(Y_test, test_preds))

    print("\nTest  Classification Report : ")
    print(classification_report(Y_test, test_preds))
Train Accuracy : 0.91
Test  Accuracy : 0.91

Test  Classification Report :
              precision    recall  f1-score   support

           0       0.94      0.81      0.87        21
           1       0.90      0.97      0.93        36

    accuracy                           0.91        57
   macro avg       0.92      0.89      0.90        57
weighted avg       0.91      0.91      0.91        57

We have now created an instance of Dice exactly the same way as explained in previous examples.

In [150]:
d = dice_ml.Data(dataframe=breast_cancer_df,
                 continuous_features=breast_cancer.feature_names.tolist(), outcome_name='TumorType')

m = dice_ml.Model(model=model, backend="PYT")

# initiate DiCE
exp = dice_ml.Dice(d, m)

We have now taken a random sample from the test set as a query instance for which we'll generate counterfactual explanations.

In [151]:
import random

idx = random.randint(1, len(X_test))

sample = dict(zip(breast_cancer.feature_names, X_test[idx].numpy()))

print("Actual Class : %d"%Y_test[idx])

sample
Actual Class : 1
Out[151]:
{'mean radius': 9.847,
 'mean texture': 15.68,
 'mean perimeter': 63.0,
 'mean area': 293.2,
 'mean smoothness': 0.09492,
 'mean compactness': 0.08419,
 'mean concavity': 0.0233,
 'mean concave points': 0.02416,
 'mean symmetry': 0.1387,
 'mean fractal dimension': 0.06891,
 'radius error': 0.2498,
 'texture error': 1.216,
 'perimeter error': 1.976,
 'area error': 15.24,
 'smoothness error': 0.008732,
 'compactness error': 0.02042,
 'concavity error': 0.01062,
 'concave points error': 0.006801,
 'symmetry error': 0.01824,
 'fractal dimension error': 0.003494,
 'worst radius': 11.24,
 'worst texture': 22.99,
 'worst perimeter': 74.32,
 'worst area': 376.5,
 'worst smoothness': 0.1419,
 'worst compactness': 0.2243,
 'worst concavity': 0.08434,
 'worst concave points': 0.06528,
 'worst symmetry': 0.2502,
 'worst fractal dimension': 0.09209}

We have now tried to generate 4 counterfactual explanations for our query instance.

In [152]:
dice_exp = exp.generate_counterfactuals(sample, total_CFs=4)
Only 0 (required 4) Diverse Counterfactuals found for the given configuation, perhaps try with different values of proximity (or diversity) weights or learning rate... ; total time taken: 00 min 50 sec
In [153]:
dice_exp.visualize_as_dataframe()
Query instance (original outcome : 1)
mean radius mean texture mean perimeter mean area mean smoothness mean compactness mean concavity mean concave points mean symmetry mean fractal dimension radius error texture error perimeter error area error smoothness error compactness error concavity error concave points error symmetry error fractal dimension error worst radius worst texture worst perimeter worst area worst smoothness worst compactness worst concavity worst concave points worst symmetry worst fractal dimension TumorType
0 9.85 15.68 63.0 293.2 0.0949 0.0842 0.0 0.0 0.1387 0.06891 0.2498 1.216 1.976 15.24 0.008732 0.02042 0.0 0.0 0.01824 0.003494 11.24 22.99 74.3 376.5 0.1419 0.2243 0.1 0.1 0.2502 0.09209 0.738846
Diverse Counterfactual set (new outcome : 0)
mean radius mean texture mean perimeter mean area mean smoothness mean compactness mean concavity mean concave points mean symmetry mean fractal dimension radius error texture error perimeter error area error smoothness error compactness error concavity error concave points error symmetry error fractal dimension error worst radius worst texture worst perimeter worst area worst smoothness worst compactness worst concavity worst concave points worst symmetry worst fractal dimension TumorType
0 10.03 14.73 63.56 297.8 0.0951 0.0820 0.0 0.0 0.1386 0.06904 0.2705 1.1938 1.562 14.87 0.008686 0.01898 0.0 0.0 0.01832 0.003568 11.12 22.19 73.9 325.1 0.1377 0.2178 0.7 0.2 0.2444 0.09342 0.716
1 9.76 15.18 65.34 297.2 0.0926 0.0844 0.1 0.1 0.1382 0.06801 0.2209 1.2838 1.851 11.64 0.008622 0.02045 0.0 0.0 0.01869 0.003807 11.38 23.39 77.4 403.3 0.1406 0.2299 0.1 0.3 0.2438 0.08979 0.695
2 9.86 15.80 62.30 308.1 0.0957 0.0868 0.4 0.2 0.1373 0.06894 0.2561 1.1911 1.853 16.75 0.008658 0.02103 0.0 0.0 0.01819 0.003495 11.53 22.54 73.2 441.0 0.1427 0.2393 0.1 0.3 0.2498 0.08874 0.651
3 10.00 15.41 60.12 263.4 0.0925 0.0862 0.3 0.0 0.1364 0.06877 0.2652 1.1465 1.971 22.85 0.008647 0.01925 0.0 0.0 0.01706 0.003496 11.14 23.35 74.9 328.2 0.1427 0.2352 0.1 0.1 0.2511 0.09361 0.711
In [154]:
dice_exp.visualize_as_dataframe(show_only_changes=True)
Query instance (original outcome : 1)
mean radius mean texture mean perimeter mean area mean smoothness mean compactness mean concavity mean concave points mean symmetry mean fractal dimension radius error texture error perimeter error area error smoothness error compactness error concavity error concave points error symmetry error fractal dimension error worst radius worst texture worst perimeter worst area worst smoothness worst compactness worst concavity worst concave points worst symmetry worst fractal dimension TumorType
0 9.85 15.68 63.0 293.2 0.0949 0.0842 0.0 0.0 0.1387 0.06891 0.2498 1.216 1.976 15.24 0.008732 0.02042 0.0 0.0 0.01824 0.003494 11.24 22.99 74.3 376.5 0.1419 0.2243 0.1 0.1 0.2502 0.09209 0.738846
Diverse Counterfactual set (new outcome : 0)
mean radius mean texture mean perimeter mean area mean smoothness mean compactness mean concavity mean concave points mean symmetry mean fractal dimension radius error texture error perimeter error area error smoothness error compactness error concavity error concave points error symmetry error fractal dimension error worst radius worst texture worst perimeter worst area worst smoothness worst compactness worst concavity worst concave points worst symmetry worst fractal dimension TumorType
0 10.03 14.73 63.56 297.8 0.0951 0.082 - - 0.1386 0.06904 0.2705 1.1938 1.562 14.87 0.008686 0.01898 - - 0.01832 0.003568 11.12 22.19 73.9 325.1 0.1377 0.2178 0.7 0.2 0.2444 0.09342 0.716
1 9.76 15.18 65.34 297.2 0.0926 0.0844 0.1 0.1 0.1382 0.06801 0.2209 1.2838 1.851 11.64 0.008622 0.02045 - - 0.01869 0.003807 11.38 23.39 77.4 403.3 0.1406 0.2299 - 0.3 0.2438 0.08979 0.695
2 9.86 15.8 62.3 308.1 0.0957 0.0868 0.4 0.2 0.1373 0.06894 0.2561 1.1911 1.853 16.75 0.008658 0.02103 - - 0.01819 0.003495 11.53 22.54 73.2 441.0 0.1427 0.2393 - 0.3 0.2498 0.08874 0.651
3 10.0 15.41 60.12 263.4 0.0925 0.0862 0.3 - 0.1364 0.06877 0.2652 1.1465 1.971 22.85 0.008647 0.01925 - - 0.01706 0.003496 11.14 23.35 74.9 328.2 0.1427 0.2352 - - 0.2511 0.09361 0.711


Sunny Solanki  Sunny Solanki