Updated On : May-31,2020 Time Investment : ~30 mins

Scikit-Learn - Ensemble Learning: Boosting¶

Table of Contents¶

Introduction
GradientBoostingRegressor
GradientBoostingClassifier
AdaBoostRegressor
AdaBoostClassifier
References

Introduction ¶

Boosting is a type of ensemble learning in which estimators are trained sequentially rather than in parallel. We build a series of fast, simple models (weak learners that are only slightly better than a random guess), each one trained to correct the mistakes of those before it, and then combine the results of all weak estimators to make the final prediction. We have already discussed another ensemble learning method in our tutorial on bagging & random forests. Please feel free to go through it if you want to learn about it.

Scikit-learn provides two different boosting algorithms for classification and regression problems:

Gradient Tree Boosting (Gradient Boosted Decision Trees) - It builds the ensemble iteratively: it starts with a single learner and then adds one tree at a time, with each new tree trained on the errors (residuals) left by the trees before it, so that the overall loss decreases. It uses decision trees as weak estimators. Scikit-learn provides two classes that implement Gradient Tree Boosting for classification and regression problems.
- GradientBoostingClassifier
- GradientBoostingRegressor
Adaptive Boost (AdaBoost) - It fits a sequence of weak estimators iteratively on repeatedly re-weighted data. It then combines the results of all estimators through a weighted vote to generate the final result. Initially, all samples are assigned the same weight (1/n_samples). At each iteration, weights are increased for samples that were predicted wrong in the previous iteration and decreased for samples that were predicted right. This makes later estimators concentrate on the samples that are going wrong. It lets us specify which estimator to use as the weak learner. Scikit-learn provides two classes that implement Adaptive Boosting for classification and regression problems.
- AdaBoostClassifier
- AdaBoostRegressor
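
The re-weighting step described for Adaptive Boost can be sketched in a few lines of plain Python. This is the textbook binary AdaBoost update, not scikit-learn's internal implementation, and the function name `adaboost_round` and the list of correct/incorrect flags are hypothetical, chosen only for illustration.

```python
import math

def adaboost_round(weights, is_correct):
    """One round of the textbook binary AdaBoost weight update."""
    # Weighted error of the current weak learner.
    err = sum(w for w, ok in zip(weights, is_correct) if not ok)
    # Vote strength of this learner (assumes 0 < err < 0.5).
    alpha = 0.5 * math.log((1.0 - err) / err)
    # Increase the weight of misclassified samples, decrease the rest.
    updated = [w * math.exp(-alpha if ok else alpha)
               for w, ok in zip(weights, is_correct)]
    total = sum(updated)
    return alpha, [w / total for w in updated]

n_samples = 5
weights = [1.0 / n_samples] * n_samples        # uniform start: 1 / n_samples
is_correct = [True, True, False, True, True]   # hypothetical learner results

alpha, new_weights = adaboost_round(weights, is_correct)
print(round(alpha, 3), [round(w, 3) for w in new_weights])
```

After one round, the single misclassified sample carries half of the total weight, so the next weak learner is pushed to get it right.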

This ends our small introduction to the Boosting process. We'll now start with the coding part.

We'll start by importing necessary libraries.

import numpy as np
import pandas as pd

import sklearn
import warnings

warnings.filterwarnings("ignore")

np.set_printoptions(precision=3)
%matplotlib inline

Load Dataset¶

We'll load the two datasets described below.

Digits Dataset: It has 8x8 images of the digits 0-9. We'll use the digits data for the classification tasks below.
Boston Housing Dataset: It has information about various house properties such as the average number of rooms, per capita crime rate in town, etc. We'll use it for the regression tasks.

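Scikit-learn provides both of these datasets as part of its datasets module. We can load them by calling the load_digits() and load_boston() functions. Each returns a dictionary-like Bunch object from which the features and target can be retrieved. Note that load_boston() was removed in scikit-learn 1.2, so the code below requires an older release.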

from sklearn.datasets import load_boston, load_digits

digits = load_digits()
X_digits, Y_digits = digits.data, digits.target

print('Dataset Size : ',X_digits.shape, Y_digits.shape)

Dataset Size :  (1797, 64) (1797,)

boston = load_boston()
X_boston, Y_boston = boston.data, boston.target
print('Dataset Size : ',X_boston.shape, Y_boston.shape)

Dataset Size :  (506, 13) (506,)
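
The Bunch objects returned by the loaders can be read either through attributes or through dictionary keys. The short sketch below uses only the digits dataset and shows both access styles, plus the `return_X_y=True` shortcut; the variable names are illustrative.

```python
from sklearn.datasets import load_digits

digits = load_digits()

# A Bunch supports both attribute access and dictionary-style access.
print(sorted(digits.keys()))
same_array = digits["data"] is digits.data

# return_X_y=True skips the Bunch and returns (features, target) directly.
X, y = load_digits(return_X_y=True)
print(X.shape, y.shape)

# Each 64-value row is a flattened 8x8 image.
first_image = X[0].reshape(8, 8)
print(first_image.shape)
```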

GradientBoostingRegressor ¶

The GradientBoostingRegressor is available as a part of the ensemble module of sklearn. We'll train the default model on the Boston housing data and then tune it by trying various hyperparameter settings to improve its performance. We'll also compare it with other regression estimators to check its performance relative to other machine learning models.

from sklearn.model_selection import train_test_split

X_train, X_test, Y_train, Y_test = train_test_split(X_boston, Y_boston, train_size=0.80, test_size=0.20, random_state=123)
print('Train/Test Sizes : ', X_train.shape, X_test.shape, Y_train.shape, Y_test.shape)

Train/Test Sizes :  (404, 13) (102, 13) (404,) (102,)

from sklearn.ensemble import GradientBoostingRegressor

grad_boosting_regressor = GradientBoostingRegressor()
grad_boosting_regressor.fit(X_train, Y_train)

GradientBoostingRegressor(alpha=0.9, criterion='friedman_mse', init=None,
                          learning_rate=0.1, loss='ls', max_depth=3,
                          max_features=None, max_leaf_nodes=None,
                          min_impurity_decrease=0.0, min_impurity_split=None,
                          min_samples_leaf=1, min_samples_split=2,
                          min_weight_fraction_leaf=0.0, n_estimators=100,
                          n_iter_no_change=None, presort='auto',
                          random_state=None, subsample=1.0, tol=0.0001,
                          validation_fraction=0.1, verbose=0, warm_start=False)

Y_preds = grad_boosting_regressor.predict(X_test)

print(Y_preds[:15])
print(Y_test[:15])

print('Test R^2 Score : %.3f'%grad_boosting_regressor.score(X_test, Y_test)) ## For regressors, score() returns the R^2 score (for classifiers it returns accuracy).
print('Training R^2 Score : %.3f'%grad_boosting_regressor.score(X_train, Y_train))

[33.731 26.108 48.711 18.784 31.065 43.077 25.474  9.03  18.201 29.294
 22.577 19.06  15.871 24.611 19.605]
[15.  26.6 45.4 20.8 34.9 21.9 28.7  7.2 20.  32.2 24.1 18.5 13.5 27.
 23.1]
Test R^2 Score : 0.812
Training R^2 Score : 0.979
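
For regressors, the value returned by score() is the coefficient of determination, R^2 = 1 - SS_res / SS_tot. The pure-Python sketch below shows how it behaves; the helper name `r2_score_manual` and the toy numbers are illustrative and are not taken from the Boston data.

```python
def r2_score_manual(y_true, y_pred):
    """R^2 = 1 - (residual sum of squares / total sum of squares)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

y_true = [10.0, 20.0, 30.0, 40.0]  # toy targets

perfect = r2_score_manual(y_true, y_true)                    # exact predictions
baseline = r2_score_manual(y_true, [25.0] * 4)               # always predict the mean
close = r2_score_manual(y_true, [12.0, 18.0, 33.0, 39.0])    # small errors

print(perfect, baseline, round(close, 3))
```

A model that always predicts the mean of the targets scores 0, and a perfect model scores 1, which is why a large gap between the train and test R^2 scores hints at overfitting.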

Important Attributes of `GradientBoostingRegressor`¶

Below are some of the important attributes of GradientBoostingRegressor which provide useful information once the model is trained.

feature_importances_ - An array of floats representing the importance of each feature in the dataset.
estimators_ - The trained estimators, as an array of shape (n_estimators, 1).
oob_improvement_ - An array of shape (n_estimators,). Each value is the improvement in loss on the out-of-bag samples relative to the previous iteration. It is available only when subsample is less than 1.0.
loss_ - The loss function as an object.

print("Feature Importances : ", grad_boosting_regressor.feature_importances_)

Feature Importances :  [1.551e-02 3.064e-04 1.025e-03 4.960e-05 3.208e-02 4.883e-01 1.295e-02
 5.908e-02 1.602e-03 1.187e-02 2.461e-02 5.472e-03 3.472e-01]

print("Estimators Shape: ", grad_boosting_regressor.estimators_.shape)

grad_boosting_regressor.estimators_[:2]

Estimators Shape:  (100, 1)

array([[DecisionTreeRegressor(criterion='friedman_mse', max_depth=3, max_features=None,
                      max_leaf_nodes=None, min_impurity_decrease=0.0,
                      min_impurity_split=None, min_samples_leaf=1,
                      min_samples_split=2, min_weight_fraction_leaf=0.0,
                      presort='auto',
                      random_state=RandomState(MT19937) at 0x7F0EF00E7780,
                      splitter='best')],
       [DecisionTreeRegressor(criterion='friedman_mse', max_depth=3, max_features=None,
                      max_leaf_nodes=None, min_impurity_decrease=0.0,
                      min_impurity_split=None, min_samples_leaf=1,
                      min_samples_split=2, min_weight_fraction_leaf=0.0,
                      presort='auto',
                      random_state=RandomState(MT19937) at 0x7F0EF00E7780,
                      splitter='best')]], dtype=object)

print("Loss : ", grad_boosting_regressor.loss_)

Loss :  <sklearn.ensemble._gb_losses.LeastSquaresError object at 0x7f0eac9d4b00>
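
The model trained above uses the default subsample=1.0, so it has no out-of-bag samples and therefore no oob_improvement_ attribute. The sketch below, on a synthetic dataset from make_regression (an assumption made only to keep it self-contained), shows the attribute appearing once subsample is set below 1.0.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor

# Synthetic data, used here only for illustration.
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

# Default subsample=1.0: every tree sees all samples, so nothing is out-of-bag.
full = GradientBoostingRegressor(n_estimators=20, random_state=0).fit(X, y)
print(hasattr(full, "oob_improvement_"))

# subsample < 1.0: each tree is fit on a random 80% of the rows,
# and the remaining 20% are used to track the out-of-bag improvement.
stochastic = GradientBoostingRegressor(n_estimators=20, subsample=0.8,
                                       random_state=0).fit(X, y)
print(stochastic.oob_improvement_.shape)
```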

Finetuning Model By Doing Grid Search On Various Hyperparameters¶

Below is a list of common hyperparameters that need tuning to get the best fit for our data. We'll try various hyperparameter settings on various splits of the train/test data to find the best fit, that is, a model whose scores on the train and test datasets are almost the same or differ only slightly.

learning_rate - It shrinks the contribution of each tree. There is a trade-off between learning_rate and n_estimators: a smaller learning rate generally needs more trees. default=0.1
n_estimators - Number of boosting stages (trees) whose results are combined to produce the final prediction. default=100
max_depth - Maximum depth of the individual trees. The best value depends on the data. default=3
min_samples_split - Minimum number of samples required to split an internal node. It accepts an int (at least 2) or a float in (0.0, 1.0]. A float is treated as a fraction, so ceil(min_samples_split * n_samples) samples are required. default=2
min_samples_leaf - Minimum number of samples required at a leaf node. It accepts an int (at least 1) or a float in (0.0, 0.5]. A float is treated as a fraction, so ceil(min_samples_leaf * n_samples) samples are required. default=1
criterion - Function used to measure the quality of a split in the individual trees. The version used here supports friedman_mse, mse (mean squared error) and mae (mean absolute error). default=friedman_mse
max_features - Number of features to consider when looking for the best split. It accepts an int (1 to n_features), a float in (0.0, 1.0] (a fraction of n_features), a string (sqrt, log2, auto) or None. default=None
- None - All n_features are used.
- sqrt - sqrt(n_features) features are used for each split.
- auto - n_features are used by the regressor (the classifier uses sqrt(n_features)).
- log2 - log2(n_features) features are used for each split.
validation_fraction - Proportion of the training data set aside as a validation set for early stopping. It accepts a float in (0.0, 1.0) and is used only when n_iter_no_change is set. default=0.1
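
To make the learning_rate and n_estimators trade-off concrete, the sketch below hand-rolls a simplified gradient boosting loop for squared loss, where each tree is fit to the current residuals and its prediction is scaled by the learning rate. It is an illustration under those assumptions, not scikit-learn's implementation; the function `boost` and the synthetic data are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.tree import DecisionTreeRegressor

# Synthetic data, used here only for illustration.
X, y = make_regression(n_samples=300, n_features=5, noise=5.0, random_state=0)

def boost(X, y, n_estimators, learning_rate, max_depth=3):
    """Simplified boosting for squared loss; returns training MSE per stage."""
    prediction = np.full(len(y), y.mean())   # start from a constant model
    mse_per_stage = []
    for _ in range(n_estimators):
        residuals = y - prediction           # errors of the current ensemble
        tree = DecisionTreeRegressor(max_depth=max_depth, random_state=0)
        tree.fit(X, residuals)
        # learning_rate shrinks how much each new tree is allowed to correct.
        prediction = prediction + learning_rate * tree.predict(X)
        mse_per_stage.append(float(np.mean((y - prediction) ** 2)))
    return mse_per_stage

fast = boost(X, y, n_estimators=50, learning_rate=0.5)
slow = boost(X, y, n_estimators=50, learning_rate=0.05)

print("MSE after first tree: fast=%.1f slow=%.1f" % (fast[0], slow[0]))
print("MSE after last tree : fast=%.1f slow=%.1f" % (fast[-1], slow[-1]))
```

With the smaller learning rate each tree removes less of the training error, so more trees are needed to reach the same fit. The larger rate fits the training data faster but is usually more prone to overfitting.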

Below, we'll try various values for the above-mentioned hyperparameters to find the best estimator for our dataset using 3-fold cross-validation.

%%time

from sklearn.model_selection import GridSearchCV

n_samples = X_boston.shape[0]
n_features = X_boston.shape[1]

params = {'n_estimators': np.arange(100, 301, 50),
          'max_depth': [None, 3, 5,],
          'min_samples_split': [2, 0.3, 0.5, n_samples//2, ],
          'min_samples_leaf': [1, 0.3, 0.5, n_samples//2, ],
          'criterion': ['friedman_mse', 'mae'],
          'max_features': [None, 'sqrt', 'auto', 'log2', 0.3, 0.7, n_features//2, ],
         }

grad_boost_regressor_grid = GridSearchCV(GradientBoostingRegressor(random_state=1), param_grid=params, n_jobs=-1, cv=3, verbose=5)
grad_boost_regressor_grid.fit(X_train,Y_train)

print('Train R^2 Score : %.3f'%grad_boost_regressor_grid.best_estimator_.score(X_train, Y_train))
print('Test R^2 Score : %.3f'%grad_boost_regressor_grid.best_estimator_.score(X_test, Y_test))
print('Best R^2 Score Through Grid Search : %.3f'%grad_boost_regressor_grid.best_score_)
print('Best Parameters : ',grad_boost_regressor_grid.best_params_)

Fitting 3 folds for each of 3360 candidates, totalling 10080 fits

[Parallel(n_jobs=-1)]: Using backend LokyBackend with 4 concurrent workers.
[Parallel(n_jobs=-1)]: Done  10 tasks      | elapsed:   27.6s
[Parallel(n_jobs=-1)]: Done 134 tasks      | elapsed:   29.9s
[Parallel(n_jobs=-1)]: Done 854 tasks      | elapsed:   38.2s
[Parallel(n_jobs=-1)]: Done 1662 tasks      | elapsed:   44.9s
[Parallel(n_jobs=-1)]: Done 2958 tasks      | elapsed:   55.1s
[Parallel(n_jobs=-1)]: Done 4542 tasks      | elapsed:  1.1min
[Parallel(n_jobs=-1)]: Done 5503 tasks      | elapsed:  1.8min
[Parallel(n_jobs=-1)]: Done 6281 tasks      | elapsed:  2.8min
[Parallel(n_jobs=-1)]: Done 7186 tasks      | elapsed:  3.5min
[Parallel(n_jobs=-1)]: Done 8124 tasks      | elapsed:  4.2min
[Parallel(n_jobs=-1)]: Done 9301 tasks      | elapsed:  5.3min
[Parallel(n_jobs=-1)]: Done 10080 out of 10080 | elapsed:  5.9min finished

Train R^2 Score : 0.997
Test R^2 Score : 0.776
Best R^2 Score Through Grid Search : 0.891
Best Parameters :  {'criterion': 'friedman_mse', 'max_depth': None, 'max_features': None, 'min_samples_leaf': 1, 'min_samples_split': 0.3, 'n_estimators': 150}
CPU times: user 5.14 s, sys: 254 ms, total: 5.39 s
Wall time: 5min 53s
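
The 3360 candidates reported in the log are simply the product of the number of values listed for each hyperparameter, and each candidate is fit once per fold. The sketch below recomputes both numbers from the grid defined above, using plain lists in place of the NumPy range.

```python
import math

n_samples, n_features = 506, 13  # Boston dataset shape reported earlier

params = {
    "n_estimators": list(range(100, 301, 50)),  # same values as np.arange(100, 301, 50)
    "max_depth": [None, 3, 5],
    "min_samples_split": [2, 0.3, 0.5, n_samples // 2],
    "min_samples_leaf": [1, 0.3, 0.5, n_samples // 2],
    "criterion": ["friedman_mse", "mae"],
    "max_features": [None, "sqrt", "auto", "log2", 0.3, 0.7, n_features // 2],
}

# One candidate per combination of values; one fit per candidate per fold.
n_candidates = math.prod(len(values) for values in params.values())
n_fits = n_candidates * 3  # cv=3

print(n_candidates, n_fits)
```

Because the count grows multiplicatively, adding even one more value to a single hyperparameter noticeably increases the search time.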

Printing First Few Cross Validation Results¶

cross_val_results = pd.DataFrame(grad_boost_regressor_grid.cv_results_)
print('Number of Various Combinations of Parameters Tried : %d'%len(cross_val_results))

cross_val_results.head() ## Printing first few results.

Number of Various Combinations of Parameters Tried : 3360

	mean_fit_time	std_fit_time	mean_score_time	std_score_time	param_criterion	param_max_depth	param_max_features	param_min_samples_leaf	param_min_samples_split	param_n_estimators	params	split0_test_score	split1_test_score	split2_test_score	mean_test_score	std_test_score	rank_test_score
0	0.446993	0.001296	0.001833	0.000558	friedman_mse	None	None	1	2	100	{'criterion': 'friedman_mse', 'max_depth': Non...	0.670962	0.863044	0.680220	0.738218	0.088509	829
1	0.230005	0.156945	0.001789	0.000547	friedman_mse	None	None	1	2	150	{'criterion': 'friedman_mse', 'max_depth': Non...	0.670966	0.863046	0.680222	0.738221	0.088509	821
2	0.123556	0.003804	0.001345	0.000026	friedman_mse	None	None	1	2	200	{'criterion': 'friedman_mse', 'max_depth': Non...	0.670966	0.863046	0.680222	0.738221	0.088509	821
3	0.127053	0.001129	0.001377	0.000047	friedman_mse	None	None	1	2	250	{'criterion': 'friedman_mse', 'max_depth': Non...	0.670966	0.863046	0.680222	0.738221	0.088509	821
4	0.130700	0.001741	0.001365	0.000030	friedman_mse	None	None	1	2	300	{'criterion': 'friedman_mse', 'max_depth': Non...	0.670966	0.863046	0.680222	0.738221	0.088509	821

Comparing Performance Of Gradient Boosting With Bagging, Random Forest, Extra Trees, Decision Tree and Extra Tree¶

from sklearn import ensemble, tree
## Gradient Boosting Regressor with Default Params
gb_regressor = ensemble.GradientBoostingRegressor(random_state=1)
gb_regressor.fit(X_train, Y_train)
print("%s : Train Accuracy : %.2f, Test Accuracy : %.2f"%(gb_regressor.__class__.__name__,
                                                     gb_regressor.score(X_train, Y_train),gb_regressor.score(X_test, Y_test)))

## Above Hyper-parameter tuned Gradient Boosting Regressor
gb_regressor = ensemble.GradientBoostingRegressor(random_state=1, **grad_boost_regressor_grid.best_params_)
gb_regressor.fit(X_train, Y_train)
print("%s : Train Accuracy : %.2f, Test Accuracy : %.2f"%(gb_regressor.__class__.__name__,
                                                     gb_regressor.score(X_train, Y_train),gb_regressor.score(X_test, Y_test)))

## Random Forest Regressor with Default Params
rforest_regressor = ensemble.RandomForestRegressor(random_state=1)
rforest_regressor.fit(X_train, Y_train)
print("%s : Train Accuracy : %.2f, Test Accuracy : %.2f"%(rforest_regressor.__class__.__name__,
                                                     rforest_regressor.score(X_train, Y_train),rforest_regressor.score(X_test, Y_test)))


## Extra Trees Regressor with Default Params
extra_forest_regressor = ensemble.ExtraTreesRegressor(random_state=1)
extra_forest_regressor.fit(X_train, Y_train)
print("%s : Train Accuracy : %.2f, Test Accuracy : %.2f"%(extra_forest_regressor.__class__.__name__,
                                                     extra_forest_regressor.score(X_train, Y_train),extra_forest_regressor.score(X_test, Y_test)))

## Bagging Regressor with Default Params
bag_regressor = ensemble.BaggingRegressor(random_state=1)
bag_regressor.fit(X_train, Y_train)
print("%s : Train Accuracy : %.2f, Test Accuracy : %.2f"%(bag_regressor.__class__.__name__,
                                                     bag_regressor.score(X_train, Y_train),bag_regressor.score(X_test, Y_test)))


## Decision Tree with Default Parameters
dtree_regressor = tree.DecisionTreeRegressor(random_state=1)
dtree_regressor.fit(X_train, Y_train)
print("%s : Train Accuracy : %.2f, Test Accuracy : %.2f"%(dtree_regressor.__class__.__name__,
                                                     dtree_regressor.score(X_train, Y_train),dtree_regressor.score(X_test, Y_test)))

## Extra Tree with Default Parameters
extra_tree_regressor = tree.ExtraTreeRegressor(random_state=1)
extra_tree_regressor.fit(X_train, Y_train)
print("%s : Train Accuracy : %.2f, Test Accuracy : %.2f"%(extra_tree_regressor.__class__.__name__,
                                                     extra_tree_regressor.score(X_train, Y_train),extra_tree_regressor.score(X_test, Y_test)))

GradientBoostingRegressor : Train Accuracy : 0.98, Test Accuracy : 0.81
GradientBoostingRegressor : Train Accuracy : 1.00, Test Accuracy : 0.78
RandomForestRegressor : Train Accuracy : 0.98, Test Accuracy : 0.81
ExtraTreesRegressor : Train Accuracy : 1.00, Test Accuracy : 0.83
BaggingRegressor : Train Accuracy : 0.98, Test Accuracy : 0.81
DecisionTreeRegressor : Train Accuracy : 1.00, Test Accuracy : 0.44
ExtraTreeRegressor : Train Accuracy : 1.00, Test Accuracy : 0.51

GradientBoostingClassifier ¶

The GradientBoostingClassifier is available as a part of the ensemble module of sklearn. We'll train the default model on the digits data and then tune it by trying various hyperparameter settings to improve its performance. We'll also compare it with other classification estimators to check its performance relative to other machine learning models.

X_train, X_test, Y_train, Y_test = train_test_split(X_digits, Y_digits, train_size=0.80, test_size=0.20, random_state=123)
print('Train/Test Sizes : ', X_train.shape, X_test.shape, Y_train.shape, Y_test.shape)

Train/Test Sizes :  (1437, 64) (360, 64) (1437,) (360,)

from sklearn.ensemble import GradientBoostingClassifier

grad_boosting_classif = GradientBoostingClassifier()
grad_boosting_classif.fit(X_train, Y_train)

GradientBoostingClassifier(criterion='friedman_mse', init=None,
                           learning_rate=0.1, loss='deviance', max_depth=3,
                           max_features=None, max_leaf_nodes=None,
                           min_impurity_decrease=0.0, min_impurity_split=None,
                           min_samples_leaf=1, min_samples_split=2,
                           min_weight_fraction_leaf=0.0, n_estimators=100,
                           n_iter_no_change=None, presort='auto',
                           random_state=None, subsample=1.0, tol=0.0001,
                           validation_fraction=0.1, verbose=0,
                           warm_start=False)

Y_preds = grad_boosting_classif.predict(X_test)

print(Y_preds[:15])
print(Y_test[:15])

print('Test Accuracy : %.3f'%(Y_preds == Y_test).mean())
print('Test Accuracy : %.3f'%grad_boosting_classif.score(X_test, Y_test)) ## Score method also evaluates accuracy for classification models.
print('Training Accuracy : %.3f'%grad_boosting_classif.score(X_train, Y_train))

[3 3 4 4 1 3 1 0 7 4 0 6 5 1 6]
[3 3 4 4 1 3 1 0 7 4 0 0 5 1 6]
Test Accuracy : 0.956
Test Accuracy : 0.956
Training Accuracy : 1.000
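
Accuracy is just the fraction of predictions that match the true labels. As a small worked example, the sketch below applies that definition to the 15 predictions and labels printed above and locates the one misclassified sample among them.

```python
# First 15 predictions and true labels, copied from the output above.
y_pred = [3, 3, 4, 4, 1, 3, 1, 0, 7, 4, 0, 6, 5, 1, 6]
y_true = [3, 3, 4, 4, 1, 3, 1, 0, 7, 4, 0, 0, 5, 1, 6]

matches = [p == t for p, t in zip(y_pred, y_true)]
accuracy = sum(matches) / len(matches)

# Positions (0-based) where the prediction differs from the label.
wrong_positions = [i for i, ok in enumerate(matches) if not ok]

print('%.3f' % accuracy, wrong_positions)
```

On this small slice, 14 of 15 predictions are correct. The only miss is at position 11, where a 0 was predicted as 6.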

Important Attributes of `GradientBoostingClassifier`¶

The GradientBoostingClassifier has the same set of attributes as that of GradientBoostingRegressor.

print("Feature Importances Shape: ", grad_boosting_classif.feature_importances_.shape)

grad_boosting_classif.feature_importances_[:10]

Feature Importances Shape:  (64,)

array([0.   , 0.001, 0.011, 0.003, 0.003, 0.059, 0.004, 0.001, 0.001,
       0.002])

print("Estimators Shape : ", grad_boosting_classif.estimators_.shape)

Estimators Shape :  (100, 10)
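
The second dimension of estimators_ is 10 here because, for multi-class problems, the classifier fits one regression tree per class at every boosting stage. The sketch below checks this on synthetic data from make_classification (an assumption made for illustration) and compares it with the binary case.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

# Three-class synthetic problem.
X3, y3 = make_classification(n_samples=150, n_features=6, n_informative=3,
                             n_redundant=0, n_classes=3,
                             n_clusters_per_class=1, random_state=0)
multi = GradientBoostingClassifier(n_estimators=10, random_state=0).fit(X3, y3)

# Two-class synthetic problem.
X2, y2 = make_classification(n_samples=150, n_features=6, n_informative=3,
                             n_redundant=0, n_classes=2, random_state=0)
binary = GradientBoostingClassifier(n_estimators=10, random_state=0).fit(X2, y2)

print("3 classes:", multi.estimators_.shape)   # one tree per class per stage
print("2 classes:", binary.estimators_.shape)  # a single tree per stage
```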

print("Loss : ", grad_boosting_classif.loss_)

Loss :  <sklearn.ensemble._gb_losses.MultinomialDeviance object at 0x7f0eac97d4a8>

Finetuning Model By Doing Grid Search On Various Hyperparameters¶

GradientBoostingClassifier has almost the same parameters as GradientBoostingRegressor.

%%time

n_samples = X_digits.shape[0]
n_features = X_digits.shape[1]

params = {'n_estimators': [100, 200],
          'max_depth': [None, 2,5,],
          'min_samples_split': [2,0.5, n_samples//2, ],
          'min_samples_leaf': [1, 0.5, n_samples//2, ],
          'criterion': ['friedman_mse', 'mae'],
          'max_features': [None, 'sqrt', 'log2', 0.5, n_features//2,],
         }

grad_boost_classif_grid = GridSearchCV(GradientBoostingClassifier(random_state=1), param_grid=params, n_jobs=-1, cv=3, verbose=5)
grad_boost_classif_grid.fit(X_train,Y_train)

print('Train Accuracy : %.3f'%grad_boost_classif_grid.best_estimator_.score(X_train, Y_train))
print('Test Accuracy : %.3f'%grad_boost_classif_grid.best_estimator_.score(X_test, Y_test))
print('Best Accuracy Through Grid Search : %.3f'%grad_boost_classif_grid.best_score_)
print('Best Parameters : ',grad_boost_classif_grid.best_params_)

Fitting 3 folds for each of 540 candidates, totalling 1620 fits

[Parallel(n_jobs=-1)]: Using backend LokyBackend with 4 concurrent workers.
[Parallel(n_jobs=-1)]: Done  10 tasks      | elapsed:    7.6s
[Parallel(n_jobs=-1)]: Done  64 tasks      | elapsed:   21.5s
[Parallel(n_jobs=-1)]: Done 154 tasks      | elapsed:   40.8s
[Parallel(n_jobs=-1)]: Done 280 tasks      | elapsed:  1.3min
[Parallel(n_jobs=-1)]: Done 442 tasks      | elapsed:  1.9min
[Parallel(n_jobs=-1)]: Done 640 tasks      | elapsed:  2.7min
[Parallel(n_jobs=-1)]: Done 874 tasks      | elapsed: 13.0min
[Parallel(n_jobs=-1)]: Done 1144 tasks      | elapsed: 30.8min
[Parallel(n_jobs=-1)]: Done 1450 tasks      | elapsed: 48.8min
[Parallel(n_jobs=-1)]: Done 1620 out of 1620 | elapsed: 59.1min finished

Train Accuracy : 1.000
Test Accuracy : 0.972
Best Accuracy Through Grid Search : 0.978
Best Parameters :  {'criterion': 'mae', 'max_depth': 5, 'max_features': 'log2', 'min_samples_leaf': 1, 'min_samples_split': 2, 'n_estimators': 200}
CPU times: user 50.9 s, sys: 236 ms, total: 51.2 s
Wall time: 59min 56s
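
The exhaustive search above took about an hour. One common alternative, which this tutorial does not use, is to sample a fixed number of combinations with RandomizedSearchCV. The sketch below is a minimal illustration on synthetic data; the grid values and n_iter=5 are arbitrary choices, not recommendations.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import RandomizedSearchCV

# Synthetic data, used here only for illustration.
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

param_distributions = {
    "n_estimators": [20, 40, 60],
    "max_depth": [2, 3, 5],
    "max_features": [None, "sqrt", "log2"],
}

# Only n_iter combinations are sampled instead of all 27.
search = RandomizedSearchCV(GradientBoostingClassifier(random_state=1),
                            param_distributions=param_distributions,
                            n_iter=5, cv=3, random_state=1)
search.fit(X, y)

print("Candidates tried :", len(search.cv_results_["params"]))
print("Best parameters  :", search.best_params_)
```

The sampled search may miss the best combination, so it trades some accuracy of the search for a large saving in time.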

Printing First Few Cross Validation Results¶

cross_val_results = pd.DataFrame(grad_boost_classif_grid.cv_results_)
print('Number of Various Combinations of Parameters Tried : %d'%len(cross_val_results))

cross_val_results.head() ## Printing first few results.

Number of Various Combinations of Parameters Tried : 540

	mean_fit_time	std_fit_time	mean_score_time	std_score_time	param_criterion	param_max_depth	param_max_features	param_min_samples_leaf	param_min_samples_split	param_n_estimators	params	split0_test_score	split1_test_score	split2_test_score	mean_test_score	std_test_score	rank_test_score
0	1.755920	0.019092	0.005332	0.001424	friedman_mse	None	None	1	2	100	{'criterion': 'friedman_mse', 'max_depth': Non...	0.898551	0.897490	0.907563	0.901183	0.004511	116
1	2.450977	0.141080	0.004983	0.000073	friedman_mse	None	None	1	2	200	{'criterion': 'friedman_mse', 'max_depth': Non...	0.898551	0.897490	0.907563	0.901183	0.004511	116
2	2.336374	0.033164	0.007230	0.000163	friedman_mse	None	None	1	0.5	100	{'criterion': 'friedman_mse', 'max_depth': Non...	0.954451	0.947699	0.953782	0.951983	0.003037	63
3	4.306882	0.083603	0.013569	0.000535	friedman_mse	None	None	1	0.5	200	{'criterion': 'friedman_mse', 'max_depth': Non...	0.958592	0.956067	0.953782	0.956159	0.001966	48
4	1.050215	0.109802	0.003766	0.000060	friedman_mse	None	None	1	898	100	{'criterion': 'friedman_mse', 'max_depth': Non...	0.931677	0.918410	0.915966	0.922060	0.006915	108

Comparing Performance Of Gradient Boosting With Bagging, Random Forest, Extra Trees, Decision Tree and Extra Tree¶

from sklearn import ensemble, tree

## Gradient Boosting Classifier with Default Params
gb_classifier = ensemble.GradientBoostingClassifier(random_state=1)
gb_classifier.fit(X_train, Y_train)
print("%s : Train Accuracy : %.2f, Test Accuracy : %.2f"%(gb_classifier.__class__.__name__,
                                                     gb_classifier.score(X_train, Y_train),gb_classifier.score(X_test, Y_test)))

## Above Hyper-parameter tuned Gradient Boosting Classifier
gb_classifier = ensemble.GradientBoostingClassifier(random_state=1, **grad_boost_classif_grid.best_params_)
gb_classifier.fit(X_train, Y_train)
print("%s : Train Accuracy : %.2f, Test Accuracy : %.2f"%(gb_classifier.__class__.__name__,
                                                     gb_classifier.score(X_train, Y_train),gb_classifier.score(X_test, Y_test)))

## Random Forest Classifier with Default Params
rforest_classif = ensemble.RandomForestClassifier(random_state=1)
rforest_classif.fit(X_train, Y_train)
print("%s : Train Accuracy : %.2f, Test Accuracy : %.2f"%(rforest_classif.__class__.__name__,
                                                     rforest_classif.score(X_train, Y_train),rforest_classif.score(X_test, Y_test)))


## Extra Trees Classifier with Default Params
extra_forest_classif = ensemble.ExtraTreesClassifier(random_state=1)
extra_forest_classif.fit(X_train, Y_train)
print("%s : Train Accuracy : %.2f, Test Accuracy : %.2f"%(extra_forest_classif.__class__.__name__,
                                                     extra_forest_classif.score(X_train, Y_train),extra_forest_classif.score(X_test, Y_test)))

## Bagging Classifier with Default Params
bag_classif = ensemble.BaggingClassifier(random_state=1)
bag_classif.fit(X_train, Y_train)
print("%s : Train Accuracy : %.2f, Test Accuracy : %.2f"%(bag_classif.__class__.__name__,
                                                     bag_classif.score(X_train, Y_train),bag_classif.score(X_test, Y_test)))


## Decision Tree with Default Parameters
dtree_classif = tree.DecisionTreeClassifier(random_state=1)
dtree_classif.fit(X_train, Y_train)
print("%s : Train Accuracy : %.2f, Test Accuracy : %.2f"%(dtree_classif.__class__.__name__,
                                                     dtree_classif.score(X_train, Y_train),dtree_classif.score(X_test, Y_test)))

## Extra Tree with Default Parameters
extra_tree_classif = tree.ExtraTreeClassifier(random_state=1)
extra_tree_classif.fit(X_train, Y_train)
print("%s : Train Accuracy : %.2f, Test Accuracy : %.2f"%(extra_tree_classif.__class__.__name__,
                                                     extra_tree_classif.score(X_train, Y_train),extra_tree_classif.score(X_test, Y_test)))

GradientBoostingClassifier : Train Accuracy : 1.00, Test Accuracy : 0.96
GradientBoostingClassifier : Train Accuracy : 1.00, Test Accuracy : 0.97
RandomForestClassifier : Train Accuracy : 1.00, Test Accuracy : 0.94
ExtraTreesClassifier : Train Accuracy : 1.00, Test Accuracy : 0.95
BaggingClassifier : Train Accuracy : 1.00, Test Accuracy : 0.94
DecisionTreeClassifier : Train Accuracy : 1.00, Test Accuracy : 0.83
ExtraTreeClassifier : Train Accuracy : 1.00, Test Accuracy : 0.83

AdaBoostRegressor ¶

The AdaBoostRegressor is available as a part of the ensemble module of sklearn. We'll train the default model on the Boston housing data and then tune it by trying various hyperparameter settings to improve its performance. We'll also compare it with other regression estimators to check its performance relative to other machine learning models.

X_train, X_test, Y_train, Y_test = train_test_split(X_boston, Y_boston, train_size=0.80, test_size=0.20, random_state=123)
print('Train/Test Sizes : ', X_train.shape, X_test.shape, Y_train.shape, Y_test.shape)

Train/Test Sizes :  (404, 13) (102, 13) (404,) (102,)

from sklearn.ensemble import AdaBoostRegressor

ada_boost_regressor = AdaBoostRegressor()
ada_boost_regressor.fit(X_train, Y_train)

AdaBoostRegressor(base_estimator=None, learning_rate=1.0, loss='linear',
                  n_estimators=50, random_state=None)

Y_preds = ada_boost_regressor.predict(X_test)

print(Y_preds[:15])
print(Y_test[:15])

print('Test R^2 Score : %.3f'%ada_boost_regressor.score(X_test, Y_test)) ## For regressors, score() returns the R^2 score (for classifiers it returns accuracy).
print('Training R^2 Score : %.3f'%ada_boost_regressor.score(X_train, Y_train))

[18.221 27.061 47.212 18.138 31.677 39.125 27.313 11.922 17.933 26.785
 26.249 20.919 17.408 27.167 18.951]
[15.  26.6 45.4 20.8 34.9 21.9 28.7  7.2 20.  32.2 24.1 18.5 13.5 27.
 23.1]
Test R^2 Score : 0.834
Training R^2 Score : 0.912

Important Attributes of `AdaBoostRegressor`¶

Below are some of the important attributes of AdaBoostRegressor which provide useful information once the model is trained.

base_estimator_ - The base estimator from which the ensemble of weak estimators is grown.
feature_importances_ - An array of floats representing the importance of each feature in the dataset.
estimators_ - The list of trained estimators.

print("Base Estimator : ", ada_boost_regressor.base_estimator_)

Base Estimator :  DecisionTreeRegressor(criterion='mse', max_depth=3, max_features=None,
                      max_leaf_nodes=None, min_impurity_decrease=0.0,
                      min_impurity_split=None, min_samples_leaf=1,
                      min_samples_split=2, min_weight_fraction_leaf=0.0,
                      presort=False, random_state=None, splitter='best')

print("Feature Importances : ", ada_boost_regressor.feature_importances_)

Feature Importances :  [3.205e-02 0.000e+00 4.921e-03 3.564e-04 4.383e-02 2.726e-01 8.314e-03
 1.300e-01 1.609e-02 5.684e-02 2.638e-02 4.679e-03 4.040e-01]

print("Estimators Shape : ", len(ada_boost_regressor.estimators_))

ada_boost_regressor.estimators_[:2]

Estimators Shape :  50

[DecisionTreeRegressor(criterion='mse', max_depth=3, max_features=None,
                       max_leaf_nodes=None, min_impurity_decrease=0.0,
                       min_impurity_split=None, min_samples_leaf=1,
                       min_samples_split=2, min_weight_fraction_leaf=0.0,
                       presort=False, random_state=280424452, splitter='best'),
 DecisionTreeRegressor(criterion='mse', max_depth=3, max_features=None,
                       max_leaf_nodes=None, min_impurity_decrease=0.0,
                       min_impurity_split=None, min_samples_leaf=1,
                       min_samples_split=2, min_weight_fraction_leaf=0.0,
                       presort=False, random_state=1540040155, splitter='best')]
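
Besides the attributes shown above, a fitted AdaBoostRegressor also exposes estimator_weights_ and estimator_errors_, which hold the vote weight and the error of each boosting round. The sketch below prints the first few on synthetic data from make_regression, which is an assumption made only to keep the example self-contained.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import AdaBoostRegressor

# Synthetic data, used here only for illustration.
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

model = AdaBoostRegressor(n_estimators=10, random_state=0).fit(X, y)

# Each boosting round has an error and the weight its estimator gets in the vote.
for weight, error in zip(model.estimator_weights_[:3], model.estimator_errors_[:3]):
    print("weight=%.3f error=%.3f" % (weight, error))

print("Fitted estimators:", len(model.estimators_))
```

Rounds with a lower error receive a larger weight, which is how the final weighted combination favours the more reliable weak estimators.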

Finetuning Model By Doing Grid Search On Various Hyperparameters¶

Below is a list of common hyperparameters that need tuning to get the best fit for our data. We'll try various hyperparameter settings on various splits of the train/test data to find the best fit, that is, a model whose scores on the train and test datasets are almost the same or differ only slightly.

base_estimator - It lets us specify the base estimator from which the ensemble will be created. It can be any other regression estimator, such as KNeighborsRegressor or DecisionTreeRegressor. The default is a decision tree with a max depth of 3.
learning_rate - It shrinks the contribution of each estimator. There is a trade-off between learning_rate and n_estimators. default=1.0
n_estimators - Maximum number of base estimators whose results will be combined to produce the final prediction. default=50
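
As a small illustration of the base estimator option, the sketch below boosts decision stumps (trees of depth 1) and compares them with the default depth-3 trees on synthetic data. The base estimator is passed positionally because, as far as I know, the keyword was renamed from base_estimator to estimator in later scikit-learn releases; the data and variable names are illustrative.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import AdaBoostRegressor
from sklearn.tree import DecisionTreeRegressor

# Synthetic data, used here only for illustration.
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

# First positional argument is the base estimator (works across versions).
stumps = AdaBoostRegressor(DecisionTreeRegressor(max_depth=1),
                           n_estimators=30, random_state=0).fit(X, y)

# No base estimator given: a decision tree with max_depth=3 is used.
default = AdaBoostRegressor(n_estimators=30, random_state=0).fit(X, y)

print("Stumps  train R^2 : %.3f" % stumps.score(X, y))
print("Default train R^2 : %.3f" % default.score(X, y))
print("Depths :", stumps.estimators_[0].max_depth, default.estimators_[0].max_depth)
```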

Below, we'll try various values for the above-mentioned hyperparameters to find the best estimator for our dataset using 3-fold cross-validation.

%%time

from sklearn.tree import DecisionTreeRegressor
from sklearn.neighbors import KNeighborsRegressor
from sklearn.linear_model import LinearRegression

n_samples = X_boston.shape[0]
n_features = X_boston.shape[1]

params = {
            'base_estimator':[None, DecisionTreeRegressor(), KNeighborsRegressor(), LinearRegression()],
            'n_estimators': np.arange(100, 350, 50),
            'learning_rate': [0.5, 0.8, 1.0, 2.0, ]
         }

ada_boost_regressor_grid = GridSearchCV(AdaBoostRegressor(random_state=1), param_grid=params, n_jobs=-1, cv=3, verbose=5)
ada_boost_regressor_grid.fit(X_train,Y_train)

print('Train R^2 Score : %.3f'%ada_boost_regressor_grid.best_estimator_.score(X_train, Y_train))
print('Test R^2 Score : %.3f'%ada_boost_regressor_grid.best_estimator_.score(X_test, Y_test))
print('Best R^2 Score Through Grid Search : %.3f'%ada_boost_regressor_grid.best_score_)
print('Best Parameters : ',ada_boost_regressor_grid.best_params_)

Fitting 3 folds for each of 80 candidates, totalling 240 fits

[Parallel(n_jobs=-1)]: Using backend LokyBackend with 4 concurrent workers.
[Parallel(n_jobs=-1)]: Done  12 tasks      | elapsed:    0.9s
[Parallel(n_jobs=-1)]: Done 120 tasks      | elapsed:    8.2s
[Parallel(n_jobs=-1)]: Done 240 out of 240 | elapsed:   15.0s finished

Train R^2 Score : 1.000
Test R^2 Score : 0.848
Best R^2 Score Through Grid Search : 0.878
Best Parameters :  {'base_estimator': DecisionTreeRegressor(criterion='mse', max_depth=None, max_features=None,
                      max_leaf_nodes=None, min_impurity_decrease=0.0,
                      min_impurity_split=None, min_samples_leaf=1,
                      min_samples_split=2, min_weight_fraction_leaf=0.0,
                      presort=False, random_state=None, splitter='best'), 'learning_rate': 1.0, 'n_estimators': 150}
CPU times: user 512 ms, sys: 11.9 ms, total: 524 ms
Wall time: 15.3 s
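
The candidate and fit counts reported in the log can be checked without running the search. The sketch below rebuilds the same parameter dictionary and counts its combinations with ParameterGrid. The key name base_estimator simply mirrors the dictionary used in this tutorial; here it is only a dictionary key and no model is fitted.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import ParameterGrid
from sklearn.neighbors import KNeighborsRegressor
from sklearn.tree import DecisionTreeRegressor

params = {
    "base_estimator": [None, DecisionTreeRegressor(), KNeighborsRegressor(), LinearRegression()],
    "n_estimators": np.arange(100, 350, 50),   # 100, 150, 200, 250, 300
    "learning_rate": [0.5, 0.8, 1.0, 2.0],
}

# Every combination of the three lists is one candidate: 4 * 5 * 4.
n_candidates = len(ParameterGrid(params))
# Each candidate is fitted once per cross-validation fold (cv=3).
n_fits = n_candidates * 3
print("candidates:", n_candidates, "fits:", n_fits)
```

Counting candidates this way is a cheap sanity check before launching a long grid search.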

Printing First Few Cross Validation Results¶

cross_val_results = pd.DataFrame(ada_boost_regressor_grid.cv_results_)
print('Number of Various Combinations of Parameters Tried : %d'%len(cross_val_results))

cross_val_results.head() ## Printing first few results.

Number of Various Combinations of Parameters Tried : 80

	mean_fit_time	std_fit_time	mean_score_time	std_score_time	param_base_estimator	param_learning_rate	param_n_estimators	params	split0_test_score	split1_test_score	split2_test_score	mean_test_score	std_test_score	rank_test_score
0	0.154460	0.004609	0.005228	0.000052	None	0.5	100	{'base_estimator': None, 'learning_rate': 0.5,...	0.851541	0.820585	0.793769	0.822035	0.023593	31
1	0.202743	0.011192	0.008456	0.000807	None	0.5	150	{'base_estimator': None, 'learning_rate': 0.5,...	0.853543	0.829347	0.781863	0.821683	0.029745	33
2	0.232278	0.013604	0.010420	0.000402	None	0.5	200	{'base_estimator': None, 'learning_rate': 0.5,...	0.850087	0.832786	0.779422	0.820867	0.030041	36
3	0.265798	0.001404	0.013722	0.001798	None	0.5	250	{'base_estimator': None, 'learning_rate': 0.5,...	0.850051	0.835823	0.778000	0.821399	0.031122	34
4	0.310391	0.001247	0.015124	0.000192	None	0.5	300	{'base_estimator': None, 'learning_rate': 0.5,...	0.851892	0.835819	0.779556	0.822528	0.030978	30
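
The first rows of cv_results_ are in grid order, not in order of quality. A minimal sketch of ranking the results is shown below; it uses a deliberately small grid on synthetic data (both are assumptions for illustration) and sorts by the rank_test_score column that GridSearchCV provides.

```python
import pandas as pd
from sklearn.datasets import make_regression
from sklearn.ensemble import AdaBoostRegressor
from sklearn.model_selection import GridSearchCV

# Synthetic data and a tiny grid: assumptions chosen to keep the example fast.
X, y = make_regression(n_samples=200, n_features=6, noise=10.0, random_state=1)

small_grid = GridSearchCV(
    AdaBoostRegressor(random_state=1),
    param_grid={"n_estimators": [20, 40], "learning_rate": [0.5, 1.0]},
    cv=3,
)
small_grid.fit(X, y)

results = pd.DataFrame(small_grid.cv_results_)
columns = ["param_n_estimators", "param_learning_rate",
           "mean_test_score", "std_test_score", "rank_test_score"]
# Best candidate first (rank 1 has the highest mean test score).
ranked = results.sort_values("rank_test_score")[columns].reset_index(drop=True)
print(ranked)
```

The same two lines (sort_values plus column selection) can be applied to the cross_val_results DataFrame from this tutorial.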

Comparing Performance Of Ada Boost With Gradient Boosting, Bagging, Random Forest, Extra Trees, Decision Tree and Extra Tree¶

from sklearn import ensemble, tree

## Note: for regressors, score() returns R^2 (coefficient of determination), not accuracy.

## Ada Boosting Regressor with Default Params
ada_regressor = ensemble.AdaBoostRegressor(random_state=1)
ada_regressor.fit(X_train, Y_train)
print("%s : Train R^2 : %.2f, Test R^2 : %.2f"%(ada_regressor.__class__.__name__,
                                                     ada_regressor.score(X_train, Y_train),ada_regressor.score(X_test, Y_test)))

## Above Hyperparameter-tuned Ada Boosting Regressor
ada_regressor = ensemble.AdaBoostRegressor(random_state=1, **ada_boost_regressor_grid.best_params_)
ada_regressor.fit(X_train, Y_train)
print("%s : Train R^2 : %.2f, Test R^2 : %.2f"%(ada_regressor.__class__.__name__,
                                                     ada_regressor.score(X_train, Y_train),ada_regressor.score(X_test, Y_test)))

## Gradient Boosting Regressor with Default Params
gb_regressor = ensemble.GradientBoostingRegressor(random_state=1)
gb_regressor.fit(X_train, Y_train)
print("%s : Train R^2 : %.2f, Test R^2 : %.2f"%(gb_regressor.__class__.__name__,
                                                     gb_regressor.score(X_train, Y_train),gb_regressor.score(X_test, Y_test)))

## Random Forest Regressor with Default Params
rforest_regressor = ensemble.RandomForestRegressor(random_state=1)
rforest_regressor.fit(X_train, Y_train)
print("%s : Train R^2 : %.2f, Test R^2 : %.2f"%(rforest_regressor.__class__.__name__,
                                                     rforest_regressor.score(X_train, Y_train),rforest_regressor.score(X_test, Y_test)))


## Extra Trees Regressor with Default Params
extra_forest_regressor = ensemble.ExtraTreesRegressor(random_state=1)
extra_forest_regressor.fit(X_train, Y_train)
print("%s : Train R^2 : %.2f, Test R^2 : %.2f"%(extra_forest_regressor.__class__.__name__,
                                                     extra_forest_regressor.score(X_train, Y_train),extra_forest_regressor.score(X_test, Y_test)))

## Bagging Regressor with Default Params
bag_regressor = ensemble.BaggingRegressor(random_state=1)
bag_regressor.fit(X_train, Y_train)
print("%s : Train R^2 : %.2f, Test R^2 : %.2f"%(bag_regressor.__class__.__name__,
                                                     bag_regressor.score(X_train, Y_train),bag_regressor.score(X_test, Y_test)))


## Decision Tree with Default Parameters
dtree_regressor = tree.DecisionTreeRegressor(random_state=1)
dtree_regressor.fit(X_train, Y_train)
print("%s : Train R^2 : %.2f, Test R^2 : %.2f"%(dtree_regressor.__class__.__name__,
                                                     dtree_regressor.score(X_train, Y_train),dtree_regressor.score(X_test, Y_test)))

## Extra Tree with Default Parameters
extra_tree_regressor = tree.ExtraTreeRegressor(random_state=1)
extra_tree_regressor.fit(X_train, Y_train)
print("%s : Train R^2 : %.2f, Test R^2 : %.2f"%(extra_tree_regressor.__class__.__name__,
                                                     extra_tree_regressor.score(X_train, Y_train),extra_tree_regressor.score(X_test, Y_test)))

AdaBoostRegressor : Train R^2 : 0.92, Test R^2 : 0.79
AdaBoostRegressor : Train R^2 : 1.00, Test R^2 : 0.85
GradientBoostingRegressor : Train R^2 : 0.98, Test R^2 : 0.81
RandomForestRegressor : Train R^2 : 0.98, Test R^2 : 0.81
ExtraTreesRegressor : Train R^2 : 1.00, Test R^2 : 0.83
BaggingRegressor : Train R^2 : 0.98, Test R^2 : 0.81
DecisionTreeRegressor : Train R^2 : 1.00, Test R^2 : 0.44
ExtraTreeRegressor : Train R^2 : 1.00, Test R^2 : 0.51
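
For regression estimators, the score method returns the coefficient of determination (R^2), so the numbers in the comparison above are R^2 values rather than classification accuracy. The short sketch below checks this on synthetic data (an assumption for illustration) by comparing score with sklearn.metrics.r2_score.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import AdaBoostRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic data: an assumption for illustration, not the tutorial's dataset.
X, y = make_regression(n_samples=300, n_features=8, noise=15.0, random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, train_size=0.8, random_state=123)

reg = AdaBoostRegressor(random_state=1).fit(X_tr, y_tr)

# Both lines compute R^2 on the held-out data.
score_from_method = reg.score(X_te, y_te)
score_from_metric = r2_score(y_te, reg.predict(X_te))
print("score(): %.4f  r2_score(): %.4f" % (score_from_method, score_from_metric))
```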

AdaBoostClassifier ¶

The AdaBoostClassifier is available as part of the ensemble module of sklearn. We'll train the default model on the digits data and then tune it by trying various hyperparameter settings to improve its performance. We'll also compare it with other classification estimators to check its performance relative to other machine learning models.

X_train, X_test, Y_train, Y_test = train_test_split(X_digits, Y_digits, train_size=0.80, test_size=0.20, random_state=123)
print('Train/Test Sizes : ', X_train.shape, X_test.shape, Y_train.shape, Y_test.shape)

Train/Test Sizes :  (1437, 64) (360, 64) (1437,) (360,)

from sklearn.ensemble import AdaBoostClassifier

ada_boosting_classif = AdaBoostClassifier()
ada_boosting_classif.fit(X_train, Y_train)

AdaBoostClassifier(algorithm='SAMME.R', base_estimator=None, learning_rate=1.0,
                   n_estimators=50, random_state=None)

Y_preds = ada_boosting_classif.predict(X_test)

print(Y_preds[:15])
print(Y_test[:15])

print('Test Accuracy : %.3f'%(Y_preds == Y_test).mean())
print('Test Accuracy : %.3f'%ada_boosting_classif.score(X_test, Y_test)) ## Score method also evaluates accuracy for classification models.
print('Training Accuracy : %.3f'%ada_boosting_classif.score(X_train, Y_train))

[8 8 8 8 8 8 8 9 8 8 0 9 6 6 6]
[3 3 4 4 1 3 1 0 7 4 0 0 5 1 6]
Test Accuracy : 0.369
Test Accuracy : 0.369
Training Accuracy : 0.337
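
The first 15 predictions above are dominated by a few digits, which suggests looking at how predictions are spread across classes. A confusion matrix is one way to do that. The sketch below reloads the digits data with load_digits and uses the same split settings as this tutorial; the counts you see may differ from the run shown here, because the default AdaBoost algorithm varies between scikit-learn releases.

```python
from sklearn.datasets import load_digits
from sklearn.ensemble import AdaBoostClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

X_digits, Y_digits = load_digits(return_X_y=True)
X_train, X_test, Y_train, Y_test = train_test_split(
    X_digits, Y_digits, train_size=0.80, test_size=0.20, random_state=123)

clf = AdaBoostClassifier(random_state=1).fit(X_train, Y_train)
Y_preds = clf.predict(X_test)

# Rows are true digits, columns are predicted digits.
cm = confusion_matrix(Y_test, Y_preds, labels=clf.classes_)
print(cm)

# Column sums show how often each digit was predicted on the test set.
print("Predictions per class :", cm.sum(axis=0))
```

A column with a very large sum points to a class that the weak default ensemble predicts far too often.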

Important Attributes of `AdaBoostClassifier`¶

The AdaBoostClassifier has the same attributes as AdaBoostRegressor. Below we print base_estimator_, feature_importances_, and estimators_.

print("Base Estimator : ", ada_boosting_classif.base_estimator_)

Base Estimator :  DecisionTreeClassifier(class_weight=None, criterion='gini', max_depth=1,
                       max_features=None, max_leaf_nodes=None,
                       min_impurity_decrease=0.0, min_impurity_split=None,
                       min_samples_leaf=1, min_samples_split=2,
                       min_weight_fraction_leaf=0.0, presort=False,
                       random_state=None, splitter='best')

print("Feature Importances : ", ada_boosting_classif.feature_importances_)

Feature Importances :  [0.   0.   0.   0.   0.   0.   0.   0.   0.   0.   0.   0.   0.   0.
 0.   0.   0.   0.   0.   0.   0.   0.02 0.   0.   0.   0.   0.   0.
 0.48 0.   0.   0.   0.   0.   0.   0.   0.5  0.   0.   0.   0.   0.
 0.   0.   0.   0.   0.   0.   0.   0.   0.   0.   0.   0.   0.   0.
 0.   0.   0.   0.   0.   0.   0.   0.  ]
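
Each digit is an 8x8 image flattened into 64 features, so the importance vector can be reshaped back into the image grid. In the output above, the nonzero entries sit at indices 21, 28 and 36, which correspond to pixels (row 2, col 5), (row 3, col 4) and (row 4, col 4). The sketch below shows the reshaping on a freshly fitted default model; the values it prints depend on the scikit-learn version and are not claimed to match the run above.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.ensemble import AdaBoostClassifier

X_digits, Y_digits = load_digits(return_X_y=True)
clf = AdaBoostClassifier(random_state=1).fit(X_digits, Y_digits)

# 64 features -> 8 rows x 8 columns, matching the original image layout.
importance_grid = clf.feature_importances_.reshape(8, 8)
np.set_printoptions(precision=2, suppress=True)
print(importance_grid)

# Three most important features, mapped back to pixel coordinates.
top = np.argsort(clf.feature_importances_)[::-1][:3]
for idx in top:
    print("feature %2d -> pixel (row %d, col %d), importance %.2f"
          % (idx, idx // 8, idx % 8, clf.feature_importances_[idx]))
```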

print("Estimators Shape : ", len(ada_boosting_classif.estimators_))

ada_boosting_classif.estimators_[:2]

Estimators Shape :  50

[DecisionTreeClassifier(class_weight=None, criterion='gini', max_depth=1,
                        max_features=None, max_leaf_nodes=None,
                        min_impurity_decrease=0.0, min_impurity_split=None,
                        min_samples_leaf=1, min_samples_split=2,
                        min_weight_fraction_leaf=0.0, presort=False,
                        random_state=777627829, splitter='best'),
 DecisionTreeClassifier(class_weight=None, criterion='gini', max_depth=1,
                        max_features=None, max_leaf_nodes=None,
                        min_impurity_decrease=0.0, min_impurity_split=None,
                        min_samples_leaf=1, min_samples_split=2,
                        min_weight_fraction_leaf=0.0, presort=False,
                        random_state=429440265, splitter='best')]
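
Besides the fitted estimators themselves, the ensemble also stores a weight and an error for each boosting round in estimator_weights_ and estimator_errors_, and staged_score reports the accuracy after each round. The sketch below is a minimal illustration on the digits data with default settings; the printed values depend on the scikit-learn version.

```python
from sklearn.datasets import load_digits
from sklearn.ensemble import AdaBoostClassifier

X_digits, Y_digits = load_digits(return_X_y=True)
clf = AdaBoostClassifier(n_estimators=50, random_state=1).fit(X_digits, Y_digits)

# Boosting can stop early, so count the estimators that were actually fitted.
n_fitted = len(clf.estimators_)
print("Estimators fitted :", n_fitted)
print("First 5 weights   :", clf.estimator_weights_[:5])
print("First 5 errors    :", clf.estimator_errors_[:5])

# Accuracy of the partial ensemble after each boosting round (on the training data here).
staged = list(clf.staged_score(X_digits, Y_digits))
print("Accuracy after 1 round : %.3f, after %d rounds : %.3f"
      % (staged[0], len(staged), staged[-1]))
```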

Finetuning Model By Doing Grid Search On Various Hyperparameters¶

AdaBoostClassifier has almost all of the same parameters as AdaBoostRegressor. The grid search below additionally fixes algorithm='SAMME' on the estimator.

%%time

from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression

n_samples = X_digits.shape[0]
n_features = X_digits.shape[1]

params = {
            'base_estimator':[None, DecisionTreeClassifier(), SVC(), LogisticRegression()],
            'n_estimators': np.arange(100, 350, 100),
            'learning_rate': [0.5, 1.0, 2.0, ]
         }


ada_boost_classif_grid = GridSearchCV(AdaBoostClassifier(random_state=1, algorithm='SAMME'), param_grid=params, n_jobs=-1, cv=3, verbose=5)
ada_boost_classif_grid.fit(X_train,Y_train)

print('Train Accuracy : %.3f'%ada_boost_classif_grid.best_estimator_.score(X_train, Y_train))
print('Test Accuracy : %.3f'%ada_boost_classif_grid.best_estimator_.score(X_test, Y_test))
print('Best Accuracy Through Grid Search : %.3f'%ada_boost_classif_grid.best_score_)
print('Best Parameters : ',ada_boost_classif_grid.best_params_)

Fitting 3 folds for each of 36 candidates, totalling 108 fits

[Parallel(n_jobs=-1)]: Using backend LokyBackend with 4 concurrent workers.
[Parallel(n_jobs=-1)]: Done  10 tasks      | elapsed:    1.3s
[Parallel(n_jobs=-1)]: Done 108 out of 108 | elapsed: 11.2min finished

Train Accuracy : 0.985
Test Accuracy : 0.956
Best Accuracy Through Grid Search : 0.953
Best Parameters :  {'base_estimator': LogisticRegression(C=1.0, class_weight=None, dual=False, fit_intercept=True,
                   intercept_scaling=1, l1_ratio=None, max_iter=100,
                   multi_class='warn', n_jobs=None, penalty='l2',
                   random_state=None, solver='warn', tol=0.0001, verbose=0,
                   warm_start=False), 'learning_rate': 2.0, 'n_estimators': 300}
CPU times: user 32.7 s, sys: 17.6 s, total: 50.3 s
Wall time: 11min 33s
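
The grid search above fixes algorithm='SAMME' on the classifier. One plausible reason (our assumption; the tutorial does not state it) is that the SAMME.R variant needs class probabilities from the base estimator, and SVC() with default settings does not expose predict_proba. The sketch below only checks which of the candidate base estimators offer predict_proba; it does not fit anything.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

candidates = {
    "DecisionTreeClassifier()": DecisionTreeClassifier(),
    "SVC()": SVC(),
    "SVC(probability=True)": SVC(probability=True),
    "LogisticRegression()": LogisticRegression(),
}

# hasattr is False when an estimator does not make predict_proba available.
supports_proba = {name: hasattr(est, "predict_proba") for name, est in candidates.items()}
for name, ok in supports_proba.items():
    print("%-26s predict_proba available : %s" % (name, ok))
```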

Printing First Few Cross Validation Results¶

cross_val_results = pd.DataFrame(ada_boost_classif_grid.cv_results_)
print('Number of Various Combinations of Parameters Tried : %d'%len(cross_val_results))

cross_val_results.head() ## Printing first few results.

Number of Various Combinations of Parameters Tried : 36

	mean_fit_time	std_fit_time	mean_score_time	std_score_time	param_base_estimator	param_learning_rate	param_n_estimators	params	split0_test_score	split1_test_score	split2_test_score	mean_test_score	std_test_score	rank_test_score
0	0.288682	0.006750	0.012123	0.000486	None	0.5	100	{'base_estimator': None, 'learning_rate': 0.5,...	0.778468	0.815900	0.792017	0.795407	0.015490	23
1	0.487868	0.042866	0.022856	0.000148	None	0.5	200	{'base_estimator': None, 'learning_rate': 0.5,...	0.815735	0.834728	0.810924	0.820459	0.010264	21
2	0.689390	0.005973	0.049050	0.009829	None	0.5	300	{'base_estimator': None, 'learning_rate': 0.5,...	0.811594	0.836820	0.836134	0.828114	0.011758	20
3	0.230102	0.005932	0.012194	0.000801	None	1	100	{'base_estimator': None, 'learning_rate': 1.0,...	0.853002	0.832636	0.815126	0.833681	0.015488	18
4	0.476351	0.002038	0.023662	0.000702	None	1	200	{'base_estimator': None, 'learning_rate': 1.0,...	0.824017	0.811715	0.861345	0.832289	0.021058	19
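
When a grid has two numeric hyperparameters, a pivot table of mean_test_score makes their interaction easier to read than the long results table. The sketch below uses a small grid over n_estimators and learning_rate on the digits data (the grid values are assumptions chosen to keep the run short) and leaves the base estimator at its default.

```python
import pandas as pd
from sklearn.datasets import load_digits
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import GridSearchCV

X_digits, Y_digits = load_digits(return_X_y=True)

small_grid = GridSearchCV(
    AdaBoostClassifier(random_state=1),
    param_grid={"n_estimators": [25, 50], "learning_rate": [0.5, 1.0]},
    cv=3,
)
small_grid.fit(X_digits, Y_digits)

results = pd.DataFrame(small_grid.cv_results_)
# The param_* columns can be of object dtype, so cast them before pivoting.
results["learning_rate"] = results["param_learning_rate"].astype(float)
results["n_estimators"] = results["param_n_estimators"].astype(int)

# One row per learning rate, one column per ensemble size.
score_table = results.pivot_table(index="learning_rate", columns="n_estimators",
                                  values="mean_test_score")
print(score_table.round(3))
```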

Comparing Performance Of Ada Boost With Gradient Boosting, Bagging, Random Forest, Extra Trees, Decision Tree and Extra Tree¶

from sklearn import ensemble, tree

## AdaBoost Classifier with Default Params
ada_classifier = ensemble.AdaBoostClassifier(random_state=1)
ada_classifier.fit(X_train, Y_train)
print("%s : Train Accuracy : %.2f, Test Accuracy : %.2f"%(ada_classifier.__class__.__name__,
                                                     ada_classifier.score(X_train, Y_train),ada_classifier.score(X_test, Y_test)))

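## Above Hyperparameter-tuned AdaBoost Classifier
## NOTE: best_params_ holds only the searched keys, so algorithm='SAMME' from the grid search is not passed here
## and the default algorithm is used. This likely explains why this model scores differently from best_estimator_ above.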
ada_classifier = ensemble.AdaBoostClassifier(random_state=1, **ada_boost_classif_grid.best_params_)
ada_classifier.fit(X_train, Y_train)
print("%s : Train Accuracy : %.2f, Test Accuracy : %.2f"%(ada_classifier.__class__.__name__,
                                                     ada_classifier.score(X_train, Y_train),ada_classifier.score(X_test, Y_test)))

## Gradient Boosting Classifier with Default Params
gb_classifier = ensemble.GradientBoostingClassifier(random_state=1)
gb_classifier.fit(X_train, Y_train)
print("%s : Train Accuracy : %.2f, Test Accuracy : %.2f"%(gb_classifier.__class__.__name__,
                                                     gb_classifier.score(X_train, Y_train),gb_classifier.score(X_test, Y_test)))


## Random Forest Classifier with Default Params
rforest_classif = ensemble.RandomForestClassifier(random_state=1)
rforest_classif.fit(X_train, Y_train)
print("%s : Train Accuracy : %.2f, Test Accuracy : %.2f"%(rforest_classif.__class__.__name__,
                                                     rforest_classif.score(X_train, Y_train),rforest_classif.score(X_test, Y_test)))


## Extra Trees Classifier with Default Params
extra_forest_classif = ensemble.ExtraTreesClassifier(random_state=1)
extra_forest_classif.fit(X_train, Y_train)
print("%s : Train Accuracy : %.2f, Test Accuracy : %.2f"%(extra_forest_classif.__class__.__name__,
                                                     extra_forest_classif.score(X_train, Y_train),extra_forest_classif.score(X_test, Y_test)))

## Bagging Classifier with Default Params
bag_classif = ensemble.BaggingClassifier(random_state=1)
bag_classif.fit(X_train, Y_train)
print("%s : Train Accuracy : %.2f, Test Accuracy : %.2f"%(bag_classif.__class__.__name__,
                                                     bag_classif.score(X_train, Y_train),bag_classif.score(X_test, Y_test)))


## Decision Tree with Default Parameters
dtree_classif = tree.DecisionTreeClassifier(random_state=1)
dtree_classif.fit(X_train, Y_train)
print("%s : Train Accuracy : %.2f, Test Accuracy : %.2f"%(dtree_classif.__class__.__name__,
                                                     dtree_classif.score(X_train, Y_train),dtree_classif.score(X_test, Y_test)))

## Extra Tree with Default Parameters
extra_tree_classif = tree.ExtraTreeClassifier(random_state=1)
extra_tree_classif.fit(X_train, Y_train)
print("%s : Train Accuracy : %.2f, Test Accuracy : %.2f"%(extra_tree_classif.__class__.__name__,
                                                     extra_tree_classif.score(X_train, Y_train),extra_tree_classif.score(X_test, Y_test)))

AdaBoostClassifier : Train Accuracy : 0.34, Test Accuracy : 0.37
AdaBoostClassifier : Train Accuracy : 0.75, Test Accuracy : 0.74
GradientBoostingClassifier : Train Accuracy : 1.00, Test Accuracy : 0.96
RandomForestClassifier : Train Accuracy : 1.00, Test Accuracy : 0.94
ExtraTreesClassifier : Train Accuracy : 1.00, Test Accuracy : 0.95
BaggingClassifier : Train Accuracy : 1.00, Test Accuracy : 0.94
DecisionTreeClassifier : Train Accuracy : 1.00, Test Accuracy : 0.83
ExtraTreeClassifier : Train Accuracy : 1.00, Test Accuracy : 0.83
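
The tuned AdaBoostClassifier in this comparison is rebuilt from best_params_, which contains only the keys that were searched. Settings fixed on the estimator passed to GridSearchCV (such as algorithm='SAMME' in the grid search above) are not part of it. The sketch below illustrates the pitfall with a different fixed setting, learning_rate=0.5, chosen as a stand-in that works across scikit-learn releases, and shows one way around it: cloning the searched estimator and applying best_params_ to the clone.

```python
from sklearn.base import clone
from sklearn.datasets import load_digits
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

X_digits, Y_digits = load_digits(return_X_y=True)
X_train, X_test, Y_train, Y_test = train_test_split(
    X_digits, Y_digits, train_size=0.80, test_size=0.20, random_state=123)

# learning_rate=0.5 is fixed on the estimator and is NOT part of the grid.
grid = GridSearchCV(AdaBoostClassifier(learning_rate=0.5, random_state=1),
                    param_grid={"n_estimators": [25, 50]}, cv=3)
grid.fit(X_train, Y_train)

# Rebuilding from best_params_ alone silently falls back to the default learning_rate.
naive = AdaBoostClassifier(random_state=1, **grid.best_params_)

# Cloning the searched estimator keeps every fixed setting, then applies the winning values.
rebuilt = clone(grid.estimator).set_params(**grid.best_params_)
rebuilt.fit(X_train, Y_train)

rebuilt_acc = rebuilt.score(X_test, Y_test)
best_acc = grid.best_estimator_.score(X_test, Y_test)
print("naive learning_rate   :", naive.learning_rate)
print("rebuilt learning_rate :", rebuilt.learning_rate)
print("rebuilt test accuracy : %.3f, best_estimator_ test accuracy : %.3f" % (rebuilt_acc, best_acc))
```

Using grid.best_estimator_ directly is the simplest option when the refitted model is all that is needed; the clone approach helps when the model has to be retrained on other data with identical settings.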

This ends our small tutorial on boosting, an ensemble learning method, using scikit-learn. Please feel free to let us know your views in the comments section.

Sunny Solanki
