Updated On : Jun-30,2020 Time Investment : ~30 mins

Scikit-Learn - Support Vector Machine¶

Table of Contents¶

Introduction
LinearSVR
LinearSVC
SVR
SVC
NuSVR
NuSVC
References

Introduction ¶

A Support Vector Machine (SVM) constructs a hyperplane, or a set of hyperplanes, in a high-dimensional space. These hyperplanes are then used for classification, regression, or other tasks such as outlier detection.

Below is a list of SVM versions provided by sklearn.

Classification Tasks
- LinearSVC
- SVC
- NuSVC
Regression Tasks
- LinearSVR
- SVR
- NuSVR

import numpy as np
import pandas as pd

import sklearn

import warnings

warnings.filterwarnings('ignore')

np.set_printoptions(precision=2)

%matplotlib inline

Load Dataset¶

We'll load the two datasets described below.

Digits Dataset: It contains 8x8 images of the digits 0-9. We'll use it for the classification tasks below.
Boston Housing Dataset: It contains information about various house properties, such as the average number of rooms and the per capita crime rate in the town. We'll use it for the regression tasks.

Scikit-learn provides both datasets through its datasets module. We can load them by calling the load_digits() and load_boston() functions. Each returns a dictionary-like Bunch object from which the features and target can be retrieved. Note that load_boston() was deprecated in scikit-learn 1.0 and removed in 1.2, so the Boston examples in this tutorial require an older release.
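As a small illustration of the Bunch object described above, the sketch below uses only the digits dataset. The variable names are ours, and `return_X_y=True` is shown as an optional shortcut rather than something the tutorial itself relies on.

```python
from sklearn.datasets import load_digits

digits = load_digits()

# A Bunch behaves like a dict whose keys are also available as attributes.
print(sorted(digits.keys()))
print(digits["data"] is digits.data)  # both spellings return the same array

X, y = digits.data, digits.target
print(X.shape, y.shape)

# return_X_y=True skips the Bunch and returns the two arrays directly.
X2, y2 = load_digits(return_X_y=True)
print(X2.shape == X.shape and y2.shape == y.shape)
```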

from sklearn.datasets import load_boston, load_digits

digits = load_digits()
X_digits, Y_digits = digits.data, digits.target

print('Dataset Size : ',X_digits.shape, Y_digits.shape)

Dataset Size :  (1797, 64) (1797,)

boston = load_boston()
X_boston, Y_boston = boston.data, boston.target

print('Dataset Size : ',X_boston.shape, Y_boston.shape)

Dataset Size :  (506, 13) (506,)

LinearSVR ¶

The first support vector machine model that we'll introduce is LinearSVR. It is available as a part of the svm module of sklearn. We'll divide the regression dataset into train/test sets, train LinearSVR with default parameters (apart from a larger max_iter), evaluate its performance on the test set, and then tune the model by trying various hyperparameters to improve performance further. We'll also introduce important attributes of the trained model that can give useful insights.

Splitting Dataset into Train & Test sets¶

We'll split the dataset into two parts:

Training data, which will be used for training the model.
Test data, against which the accuracy of the trained model will be checked.

The train_test_split function of sklearn's model_selection module splits the data into two sets, with 80% for training and 20% for testing. We also pass a seed (random_state=123) to train_test_split so that we always get the same split and can reproduce the results later.
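The effect of the seed can be checked with a minimal sketch on toy data. The arrays here are made up for illustration and are not part of the tutorial's datasets.

```python
import numpy as np
from sklearn.model_selection import train_test_split

X = np.arange(40).reshape(20, 2)  # 20 toy samples with 2 features each
y = np.arange(20)

# Two calls with the same random_state produce exactly the same split.
a_train, a_test, _, _ = train_test_split(X, y, train_size=0.80, test_size=0.20, random_state=123)
b_train, b_test, _, _ = train_test_split(X, y, train_size=0.80, test_size=0.20, random_state=123)

print(np.array_equal(a_train, b_train), np.array_equal(a_test, b_test))
print(a_train.shape, a_test.shape)
```

Leaving random_state unset would instead give a different shuffle on each run, which makes scores hard to compare between runs.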

from sklearn.model_selection import train_test_split

X_train, X_test, Y_train, Y_test = train_test_split(X_boston, Y_boston, train_size=0.80, test_size=0.20, random_state=123)
print('Train/Test Sizes : ', X_train.shape, X_test.shape, Y_train.shape, Y_test.shape)

Train/Test Sizes :  (404, 13) (102, 13) (404,) (102,)

Fitting Default LinearSVR to Train Data¶

from sklearn.svm import LinearSVR


linear_svr = LinearSVR(max_iter=1000000)
linear_svr.fit(X_train, Y_train)

LinearSVR(max_iter=1000000)

Evaluate Model Accuracy on Test Set¶

Y_preds = linear_svr.predict(X_test)

print(Y_preds[:10])
print(Y_test[:10])

print('Test R^2 Score : %.3f'%linear_svr.score(X_test, Y_test)) ## For regression models, score() returns the R^2 score (it returns accuracy for classification models).
print('Training R^2 Score : %.3f'%linear_svr.score(X_train, Y_train))

[ 6.41 26.36 37.17 13.61 30.4  38.01 24.36  9.36 13.6  32.05]
[15.  26.6 45.4 20.8 34.9 21.9 28.7  7.2 20.  32.2]
Test R^2 Score : 0.579
Training R^2 Score : 0.709
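To make the meaning of the score concrete, the sketch below compares score() with sklearn.metrics.r2_score and with the R^2 formula written out by hand. It uses synthetic data from make_regression (our assumption, chosen so the example runs on any scikit-learn version) rather than the Boston data.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.metrics import r2_score
from sklearn.svm import LinearSVR

# Synthetic regression data, used only for illustration.
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)
model = LinearSVR(max_iter=10000, random_state=0).fit(X, y)

preds = model.predict(X)

# R^2 = 1 - (residual sum of squares) / (total sum of squares)
manual_r2 = 1 - ((y - preds) ** 2).sum() / ((y - y.mean()) ** 2).sum()

print(model.score(X, y), r2_score(y, preds), manual_r2)
```

All three printed values should agree, since score() on a regressor is defined as the R^2 of its predictions.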

Important Attributes of Estimator¶

LinearSVR provides attributes that offer useful insights once the model is trained. Below is a list of attributes available through LinearSVR.

coef_ - It returns an array representing weights assigned to each feature by model. It represents the importance of each feature as per model trained.
intercept_ - It represents intercept of linear kernel function.

print("Feature Importances :", linear_svr.coef_)

Feature Importances : [-0.14  0.04  0.03  0.87 -0.72  6.01 -0.03 -0.75  0.13 -0.01 -0.57  0.01
 -0.3 ]

print("Model Intercept :", linear_svr.intercept_)

Model Intercept : [1.49]
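The two attributes fully describe the fitted linear model. As a rough sketch on synthetic data (make_regression is our choice here, not the tutorial's dataset), the predictions can be rebuilt from coef_ and intercept_ directly:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.svm import LinearSVR

X, y = make_regression(n_samples=200, n_features=4, noise=5.0, random_state=0)
model = LinearSVR(max_iter=10000, random_state=0).fit(X, y)

# A linear model predicts with a weighted sum of the features plus the intercept.
manual = X @ model.coef_ + model.intercept_

print(model.coef_.shape, model.intercept_.shape)
print(np.allclose(manual, model.predict(X)))
```

Because each weight multiplies a raw feature value, the magnitudes are only comparable as importances when the features are on similar scales.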

Finetuning Model By Doing Grid Search On Various Hyperparameters¶

Below is a list of common hyperparameters that need tuning to get the best fit for our data. We'll try various hyperparameter settings on several cross-validation splits of the training data to find the setting that generalizes best, ideally with similar scores on the train and test sets.

C - It controls the regularization applied to the linear model. The strength of regularization is inversely proportional to C, which means that a low C results in strong regularization and vice versa. The default value is 1.0.
max_iter - It specifies the maximum number of iterations to run before stopping the algorithm. The default value is 1000. The value of -1 (no limit) applies to the libsvm-based estimators such as SVR and SVC, not to LinearSVR.

Below we'll try various values of C to find the best estimator for our dataset by doing 5-fold cross-validation on the training data.

%%time

from sklearn.model_selection import GridSearchCV

params = {
            'C': [0.1, 0.5, 1.0, 10.0],
         }

linear_svr_regressor_grid = GridSearchCV(LinearSVR(random_state=1, max_iter=1000000), param_grid=params, n_jobs=-1, cv=5, verbose=5)
linear_svr_regressor_grid.fit(X_train,Y_train)

print('Train R^2 Score : %.3f'%linear_svr_regressor_grid.best_estimator_.score(X_train, Y_train))
print('Test R^2 Score : %.3f'%linear_svr_regressor_grid.best_estimator_.score(X_test, Y_test))
print('Best R^2 Score Through Grid Search : %.3f'%linear_svr_regressor_grid.best_score_)
print('Best Parameters : ',linear_svr_regressor_grid.best_params_)

Fitting 5 folds for each of 4 candidates, totalling 20 fits

[Parallel(n_jobs=-1)]: Using backend LokyBackend with 4 concurrent workers.
[Parallel(n_jobs=-1)]: Done  10 tasks      | elapsed:   18.5s
[Parallel(n_jobs=-1)]: Done  18 out of  20 | elapsed:  1.1min remaining:    7.3s
[Parallel(n_jobs=-1)]: Done  20 out of  20 | elapsed:  1.4min finished

Train R^2 Score : 0.712
Test R^2 Score : 0.584
Best R^2 Score Through Grid Search : 0.708
Best Parameters :  {'C': 1.0}
CPU times: user 11.6 s, sys: 62.7 ms, total: 11.6 s
Wall time: 1min 37s

Printing First Few Cross Validation Results¶

cross_val_results = pd.DataFrame(linear_svr_regressor_grid.cv_results_)
print('Number of Various Combinations of Parameters Tried : %d'%len(cross_val_results))

cross_val_results.head() ## Printing first few results.

Number of Various Combinations of Parameters Tried : 4

	mean_fit_time	std_fit_time	mean_score_time	std_score_time	param_C	params	split0_test_score	split1_test_score	split2_test_score	split3_test_score	split4_test_score	mean_test_score	std_test_score	rank_test_score
0	2.174029	0.053218	0.001128	0.000120	0.1	{'C': 0.1}	0.651777	0.778885	0.700807	0.762146	0.600923	0.698908	0.066665	3
1	7.785498	0.342932	0.000914	0.000020	0.5	{'C': 0.5}	0.662025	0.798491	0.704975	0.762723	0.610880	0.707819	0.067436	2
2	12.054649	0.589841	0.000992	0.000131	1	{'C': 1.0}	0.653940	0.809422	0.704639	0.766378	0.605480	0.707972	0.073673	1
3	33.045799	2.934728	0.000853	0.000116	10	{'C': 10.0}	0.656110	0.756510	0.691831	0.663300	0.615075	0.676565	0.046903	4
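The cv_results_ table is easier to read once it is sorted by rank. The sketch below repeats the same kind of grid search on a small synthetic dataset (an assumption made so that it runs quickly) and shows that best_score_ is simply the highest mean_test_score in the table.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import make_regression
from sklearn.model_selection import GridSearchCV
from sklearn.svm import LinearSVR

X, y = make_regression(n_samples=150, n_features=5, noise=10.0, random_state=0)

grid = GridSearchCV(LinearSVR(random_state=1, max_iter=10000),
                    param_grid={"C": [0.1, 0.5, 1.0, 10.0]}, cv=5)
grid.fit(X, y)

results = pd.DataFrame(grid.cv_results_)

# Keep only the columns needed to compare candidates, best first.
ranked = results.sort_values("rank_test_score")[["param_C", "mean_test_score", "std_test_score"]]
print(ranked)
print(grid.best_params_, grid.best_score_)
```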

LinearSVC ¶

The next support vector machine model that we'll introduce is LinearSVC. It is available as a part of the svm module of sklearn. We'll divide the classification dataset into train/test sets, train LinearSVC with default parameters (apart from a larger max_iter), evaluate its performance on the test set, and then tune the model by trying various hyperparameters to improve performance further. We'll also introduce important attributes of the trained model that can give useful insights.

Splitting Dataset into Train & Test sets¶

NOTE

Please note that train_test_split also accepts a stratify parameter (for example, stratify=Y_digits), which prevents an unequal distribution of classes between the train and test sets. With it, each class contributes 80% of its samples to the train set and 20% to the test set, so no class dominates either set. The train_test_split call for the digits data in this section does not pass stratify, so its split is purely random.
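As a minimal sketch of what stratify does, the example below splits the digits data with stratify set to the target array and compares the share of each class in the two sets. The variable names are ours.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)

# stratify=y asks for the same class proportions in both parts of the split.
_, _, y_train, y_test = train_test_split(
    X, y, train_size=0.80, test_size=0.20, random_state=123, stratify=y)

train_share = np.bincount(y_train) / len(y_train)
test_share = np.bincount(y_test) / len(y_test)

print(np.round(train_share, 3))
print(np.round(test_share, 3))
```

The two printed rows should be nearly identical, with each digit making up roughly a tenth of both sets.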

X_train, X_test, Y_train, Y_test = train_test_split(X_digits, Y_digits, train_size=0.80, test_size=0.20, random_state=123)
print('Train/Test Sizes : ', X_train.shape, X_test.shape, Y_train.shape, Y_test.shape)

Train/Test Sizes :  (1437, 64) (360, 64) (1437,) (360,)

Fitting Default LinearSVC to Train Data¶

from sklearn.svm import LinearSVC

linear_svc = LinearSVC(max_iter=1000000)
linear_svc.fit(X_train, Y_train)

LinearSVC(max_iter=1000000)

Evaluate Model Accuracy on Test Set¶

Y_preds = linear_svc.predict(X_test)

print(Y_preds[:15])
print(Y_test[:15])

print('Test Accuracy : %.3f'%(Y_preds == Y_test).mean())
print('Test Accuracy : %.3f'%linear_svc.score(X_test, Y_test)) ## Score method also evaluates accuracy for classification models.
print('Training Accuracy : %.3f'%linear_svc.score(X_train, Y_train))

[3 3 4 4 1 3 1 0 7 4 0 0 5 1 6]
[3 3 4 4 1 3 1 0 7 4 0 0 5 1 6]
Test Accuracy : 0.961
Test Accuracy : 0.961
Training Accuracy : 0.998

Important Attributes of Estimator¶

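LinearSVC has the same attributes as LinearSVR. For a multi-class problem, coef_ holds one row of weights per class and intercept_ holds one intercept per class.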

print("Feature Importances Shape :", linear_svc.coef_.shape)

Feature Importances Shape : (10, 64)

print("Model Intercept :", linear_svc.intercept_)

Model Intercept : [-0.   -4.13 -0.01 -0.49  0.01 -0.05 -0.02 -0.02 -2.72 -2.74]
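The (10, 64) weight matrix and the 10 intercepts can be read as one linear scorer per digit. The sketch below rebuilds the class scores by hand and picks the highest one. Dividing the pixel values by 16 is our own assumption to help the solver converge quickly; it is not done in the tutorial's code.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.svm import LinearSVC

X, y = load_digits(return_X_y=True)
X = X / 16.0  # pixel values range from 0 to 16; rescaling is an assumption for speed

clf = LinearSVC(max_iter=5000, random_state=0).fit(X, y)

# One score per class: weighted sum of the pixels plus that class's intercept.
scores = X @ clf.coef_.T + clf.intercept_
manual_preds = clf.classes_[scores.argmax(axis=1)]

print(scores.shape)
print(np.allclose(scores, clf.decision_function(X)))
print(np.array_equal(manual_preds, clf.predict(X)))
```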

Finetuning Model By Doing Grid Search On Various Hyperparameters¶

Below is a list of common hyperparameters that need tuning to get the best fit for our data. We'll try various hyperparameter settings on several cross-validation splits of the training data to find the setting that generalizes best, ideally with similar accuracy on the train and test sets.

C - It controls the regularization applied to the linear model. The strength of regularization is inversely proportional to C, which means that a low C results in strong regularization and vice versa. The default value is 1.0.
penalty - It accepts one of the two string values below and specifies the norm used in the penalty term, which helps prevent the model from overfitting the data.
- l1 Penalty
- l2 Penalty(default)
max_iter - It specifies the maximum number of iterations to run before stopping the algorithm. The default value is 1000. The value of -1 (no limit) applies to the libsvm-based estimators such as SVR and SVC, not to LinearSVC.

Below we'll try various values of C to find the best estimator for our dataset by doing 5-fold cross-validation on the training data.

%%time

params = {
            'C': [0.1, 0.5, 1.0, 10.0],
         }

linear_svc_classifier_grid = GridSearchCV(LinearSVC(random_state=1, max_iter=1000000), param_grid=params, n_jobs=-1, cv=5, verbose=5)
linear_svc_classifier_grid.fit(X_train,Y_train)

print('Train Accuracy : %.3f'%linear_svc_classifier_grid.best_estimator_.score(X_train, Y_train))
print('Test Accuracy : %.3f'%linear_svc_classifier_grid.best_estimator_.score(X_test, Y_test))
print('Best Accuracy Through Grid Search : %.3f'%linear_svc_classifier_grid.best_score_)
print('Best Parameters : ',linear_svc_classifier_grid.best_params_)

Fitting 5 folds for each of 4 candidates, totalling 20 fits

[Parallel(n_jobs=-1)]: Using backend LokyBackend with 4 concurrent workers.
[Parallel(n_jobs=-1)]: Done  10 tasks      | elapsed:   14.5s
[Parallel(n_jobs=-1)]: Done  18 out of  20 | elapsed:  1.3min remaining:    8.6s
[Parallel(n_jobs=-1)]: Done  20 out of  20 | elapsed:  1.6min finished

Train Accuracy : 0.997
Test Accuracy : 0.969
Best Accuracy Through Grid Search : 0.953
Best Parameters :  {'C': 0.1}
CPU times: user 2.13 s, sys: 23.2 ms, total: 2.15 s
Wall time: 1min 37s

Printing First Few Cross Validation Results¶

cross_val_results = pd.DataFrame(linear_svc_classifier_grid.cv_results_)
print('Number of Various Combinations of Parameters Tried : %d'%len(cross_val_results))

cross_val_results.head() ## Printing first few results.

Number of Various Combinations of Parameters Tried : 4

	mean_fit_time	std_fit_time	mean_score_time	std_score_time	param_C	params	split0_test_score	split1_test_score	split2_test_score	split3_test_score	split4_test_score	mean_test_score	std_test_score	rank_test_score
0	1.597615	0.072143	0.025226	0.014656	0.1	{'C': 0.1}	0.951389	0.961806	0.940767	0.951220	0.958188	0.952674	0.007202	1
1	6.911928	0.794496	0.001103	0.000042	0.5	{'C': 0.5}	0.930556	0.947917	0.926829	0.954704	0.944251	0.940851	0.010545	2
2	13.447654	2.130439	0.001011	0.000039	1	{'C': 1.0}	0.930556	0.940972	0.923345	0.951220	0.937282	0.936675	0.009439	3
3	46.044674	8.834060	0.000922	0.000089	10	{'C': 10.0}	0.923611	0.923611	0.909408	0.937282	0.926829	0.924148	0.008917	4

SVR ¶

The next support vector machine model that we'll introduce is SVR. It is available as a part of the svm module of sklearn. We'll divide the regression dataset into train/test sets, train SVR with default parameters (apart from a larger cache_size), evaluate its performance on the test set, and then tune the model by trying various hyperparameters to improve performance further. We'll also introduce important attributes of the trained model that can give useful insights.

Splitting Dataset into Train & Test sets¶

X_train, X_test, Y_train, Y_test = train_test_split(X_boston, Y_boston, train_size=0.80, test_size=0.20, random_state=123)
print('Train/Test Sizes : ', X_train.shape, X_test.shape, Y_train.shape, Y_test.shape)

Train/Test Sizes :  (404, 13) (102, 13) (404,) (102,)

Fitting Default SVR to Train Data¶

from sklearn.svm import SVR

svr = SVR(cache_size=1000)
svr.fit(X_train, Y_train)

SVR(cache_size=1000)

Evaluate Model Accuracy on Test Set¶

Y_preds = svr.predict(X_test)

print(Y_preds[:15])
print(Y_test[:15])

print('Test R^2 Score : %.3f'%svr.score(X_test, Y_test)) ## For regression models, score() returns the R^2 score (it returns accuracy for classification models).
print('Training R^2 Score : %.3f'%svr.score(X_train, Y_train))

[13.24 23.85 24.48 14.8  22.27 15.57 24.27 13.18 22.49 24.57 22.96 24.18
 20.36 19.44 21.17]
[15.  26.6 45.4 20.8 34.9 21.9 28.7  7.2 20.  32.2 24.1 18.5 13.5 27.
 23.1]
Test R^2 Score : 0.103
Training R^2 Score : 0.227

Important Attributes of Estimator¶

SVR provides attributes that offer useful insights once the model is trained. Below is a list of attributes available through SVR.

support_vectors_ - It represents support vectors of the trained model.
intercept_ - It represents the constant term (intercept) of the decision function.

print("Support Vectors Shape:", svr.support_vectors_.shape)

Support Vectors Shape: (389, 13)

print("Model Intercept :", svr.intercept_)

Model Intercept : [19.43]
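Support vectors are simply a subset of the training rows. The following sketch, on synthetic data of our choosing, shows how support_vectors_ relates to the related support_ (row indices) and dual_coef_ (one coefficient per support vector) attributes of a fitted SVR.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.svm import SVR

X, y = make_regression(n_samples=200, n_features=4, noise=5.0, random_state=0)
svr = SVR().fit(X, y)

print(svr.support_.shape)          # indices of the support vectors within X
print(svr.support_vectors_.shape)  # the corresponding rows of X
print(svr.dual_coef_.shape)        # one dual coefficient per support vector

# Indexing the training data with support_ gives back support_vectors_.
print(np.array_equal(svr.support_vectors_, X[svr.support_]))
```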

Finetuning Model By Doing Grid Search On Various Hyperparameters¶

C - It controls the regularization applied to the model. The strength of regularization is inversely proportional to C, which means that a low C results in strong regularization and vice versa. The default value is 1.0.
kernel - It specifies the kernel type to be used in the SVM. It accepts one of the strings below or a callable.
- linear
- poly
- rbf (default)
- sigmoid
- precomputed
degree - It accepts integer values and represents a degree for poly kernel. It's ignored when other kernels are used.
gamma - It represents the kernel coefficient for the rbf, poly and sigmoid kernels. It accepts one of the strings below or a float value.
- scale (default) - 1 / (n_features * X.var())
- auto - (1/ n_features)
cache_size - It accepts float values representing the kernel cache size in MB. The default value is 200 MB. It's suggested to increase cache_size based on the available RAM of the computer to speed up training.
max_iter - It specifies the maximum number of iterations to run before stopping the algorithm. The default value of -1 represents no limit, so the algorithm runs until convergence.
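The two string options for gamma are shorthand for numbers computed from the training data. As a sketch under the formulas listed above (and using synthetic data as an assumption), passing the computed float for scale should give the same model as passing the string:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.svm import SVR

X, y = make_regression(n_samples=150, n_features=6, noise=5.0, random_state=0)

# Values that the two string options stand for, per the formulas above.
gamma_scale = 1.0 / (X.shape[1] * X.var())
gamma_auto = 1.0 / X.shape[1]
print(gamma_scale, gamma_auto)

by_name = SVR(kernel="rbf", gamma="scale").fit(X, y)
by_value = SVR(kernel="rbf", gamma=gamma_scale).fit(X, y)

print(np.allclose(by_name.predict(X), by_value.predict(X)))
```

Since scale depends on the variance of the features while auto does not, the two options can behave very differently on unscaled data, which is consistent with the gap between the auto and scale rows for the rbf kernel in the grid search results.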

Below we'll try various values for the above-mentioned hyperparameters to find the best estimator for our dataset by doing 5-fold cross-validation on the training data.

%%time

from sklearn.model_selection import GridSearchCV

params = {
            'C': [0.1, 1.0,],
            'kernel': ['linear','rbf', 'sigmoid', ],
            'gamma': ['auto', 'scale']
         }

svr_regressor_grid = GridSearchCV(SVR(cache_size=1000), param_grid=params, n_jobs=-1, cv=5, verbose=5)
svr_regressor_grid.fit(X_train,Y_train)

print('Train R^2 Score : %.3f'%svr_regressor_grid.best_estimator_.score(X_train, Y_train))
print('Test R^2 Score : %.3f'%svr_regressor_grid.best_estimator_.score(X_test, Y_test))
print('Best R^2 Score Through Grid Search : %.3f'%svr_regressor_grid.best_score_)
print('Best Parameters : ',svr_regressor_grid.best_params_)

Fitting 5 folds for each of 12 candidates, totalling 60 fits

[Parallel(n_jobs=-1)]: Using backend LokyBackend with 4 concurrent workers.
[Parallel(n_jobs=-1)]: Done  10 tasks      | elapsed:    0.5s
[Parallel(n_jobs=-1)]: Done  60 out of  60 | elapsed:   12.7s finished

Train R^2 Score : 0.723
Test R^2 Score : 0.605
Best R^2 Score Through Grid Search : 0.714
Best Parameters :  {'C': 1.0, 'gamma': 'auto', 'kernel': 'linear'}
CPU times: user 4.59 s, sys: 4.72 ms, total: 4.59 s
Wall time: 17.1 s

Printing First Few Cross Validation Results¶

cross_val_results = pd.DataFrame(svr_regressor_grid.cv_results_)
print('Number of Various Combinations of Parameters Tried : %d'%len(cross_val_results))

cross_val_results.head() ## Printing first few results.

Number of Various Combinations of Parameters Tried : 12

	mean_fit_time	std_fit_time	mean_score_time	std_score_time	param_C	param_gamma	param_kernel	params	split0_test_score	split1_test_score	split2_test_score	split3_test_score	split4_test_score	mean_test_score	std_test_score	rank_test_score
0	0.516886	0.149781	0.001687	0.000075	0.1	auto	linear	{'C': 0.1, 'gamma': 'auto', 'kernel': 'linear'}	0.702459	0.768035	0.727951	0.734547	0.576907	0.701980	0.065942	3
1	0.013797	0.000271	0.002895	0.000187	0.1	auto	rbf	{'C': 0.1, 'gamma': 'auto', 'kernel': 'rbf'}	-0.013447	0.001137	-0.011739	-0.048919	-0.021288	-0.018851	0.016669	10
2	0.009250	0.001254	0.001722	0.000120	0.1	auto	sigmoid	{'C': 0.1, 'gamma': 'auto', 'kernel': 'sigmoid'}	-0.016672	-0.001444	-0.013125	-0.053354	-0.026054	-0.022130	0.017488	11
3	0.510559	0.135548	0.001627	0.000090	0.1	scale	linear	{'C': 0.1, 'gamma': 'scale', 'kernel': 'linear'}	0.702459	0.768035	0.727951	0.734547	0.576907	0.701980	0.065942	3
4	0.013602	0.000387	0.002603	0.000111	0.1	scale	rbf	{'C': 0.1, 'gamma': 'scale', 'kernel': 'rbf'}	0.113661	0.169402	0.108838	0.089386	0.083959	0.113049	0.030331	6

SVC ¶

The next support vector machine model that we'll introduce is SVC. It is available as a part of the svm module of sklearn. We'll divide the classification dataset into train/test sets, train SVC with default parameters (apart from a larger cache_size), evaluate its performance on the test set, and then tune the model by trying various hyperparameters to improve performance further. We'll also introduce important attributes of the trained model that can give useful insights.

Splitting Dataset into Train & Test sets¶

X_train, X_test, Y_train, Y_test = train_test_split(X_digits, Y_digits, train_size=0.80, test_size=0.20, random_state=123)

print('Train/Test Sizes : ', X_train.shape, X_test.shape, Y_train.shape, Y_test.shape)

Train/Test Sizes :  (1437, 64) (360, 64) (1437,) (360,)

Fitting Default SVC to Train Data¶

from sklearn.svm import SVC

svc = SVC(cache_size=1000)
svc.fit(X_train, Y_train)

SVC(cache_size=1000)

Evaluate Model Accuracy on Test Set¶

Y_preds = svc.predict(X_test)

print(Y_preds[:15])
print(Y_test[:15])

print('Test Accuracy : %.3f'%(Y_preds == Y_test).mean())
print('Test Accuracy : %.3f'%svc.score(X_test, Y_test)) ## Score method also evaluates accuracy for classification models.
print('Training Accuracy : %.3f'%svc.score(X_train, Y_train))

[3 3 4 4 1 3 1 0 7 4 0 0 5 1 6]
[3 3 4 4 1 3 1 0 7 4 0 0 5 1 6]
Test Accuracy : 0.989
Test Accuracy : 0.989
Training Accuracy : 0.997

Important Attributes of Estimator¶

SVC has the same set of attributes as SVR.

print("Support Vectors :", svc.support_vectors_.shape)

Support Vectors : (660, 64)

#print("Feature Importances :", svc.coef_) ## Only for Linear Kernel

print("Model Intercept :", svc.intercept_)

Model Intercept : [-0.59 -0.33 -0.22 -0.49 -0.47 -0.06 -0.43 -0.31 -0.32  0.4   0.55  0.11
  0.43  0.55  0.07  0.7   0.43  0.07 -0.1   0.01  0.3  -0.13  0.14  0.05
 -0.29  0.09  0.22 -0.27 -0.17  0.02  0.29  0.6   0.06  0.18 -0.03  0.33
 -0.26  0.01 -0.38 -0.35 -0.4  -0.34  0.31 -0.05 -0.32]
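The intercept array above has 45 entries rather than 10 because SVC trains one binary classifier for every pair of classes (one-vs-one), and 10 classes give 10 * 9 / 2 = 45 pairs. A short sketch on the digits data illustrates this. The decision_function_shape="ovo" setting is used only to expose the pairwise scores.

```python
from sklearn.datasets import load_digits
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)
svc = SVC().fit(X, y)

n_classes = len(svc.classes_)
n_pairs = n_classes * (n_classes - 1) // 2  # number of one-vs-one classifiers
print(n_classes, n_pairs, svc.intercept_.shape)

# With "ovo", decision_function returns one score per class pair.
ovo = SVC(decision_function_shape="ovo").fit(X, y)
pairwise_scores = ovo.decision_function(X[:3])
print(pairwise_scores.shape)
```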

Finetuning Model By Doing Grid Search On Various Hyperparameters¶

SVC has the same parameters as SVR.

%%time

params = {
            'C': [0.1, 1.0, ],
            'kernel': ['linear', 'rbf', 'sigmoid',],
            'gamma': ['auto', 'scale']
         }

svc_classifier_grid = GridSearchCV(SVC(cache_size=1000), param_grid=params, n_jobs=-1, cv=5, verbose=5)
svc_classifier_grid.fit(X_train,Y_train)

print('Train Accuracy : %.3f'%svc_classifier_grid.best_estimator_.score(X_train, Y_train))
print('Test Accuracy : %.3f'%svc_classifier_grid.best_estimator_.score(X_test, Y_test))
print('Best Accuracy Through Grid Search : %.3f'%svc_classifier_grid.best_score_)
print('Best Parameters : ',svc_classifier_grid.best_params_)

Fitting 5 folds for each of 12 candidates, totalling 60 fits

[Parallel(n_jobs=-1)]: Using backend LokyBackend with 4 concurrent workers.
[Parallel(n_jobs=-1)]: Done  12 tasks      | elapsed:    1.1s
[Parallel(n_jobs=-1)]: Done  60 out of  60 | elapsed:    3.9s finished

Train Accuracy : 0.997
Test Accuracy : 0.989
Best Accuracy Through Grid Search : 0.988
Best Parameters :  {'C': 1.0, 'gamma': 'scale', 'kernel': 'rbf'}
CPU times: user 411 ms, sys: 16.1 ms, total: 428 ms
Wall time: 4.19 s

Printing First Few Cross Validation Results¶

cross_val_results = pd.DataFrame(svc_classifier_grid.cv_results_)
print('Number of Various Combinations of Parameters Tried : %d'%len(cross_val_results))

cross_val_results.head() ## Printing first few results.

Number of Various Combinations of Parameters Tried : 12

	mean_fit_time	std_fit_time	mean_score_time	std_score_time	param_C	param_gamma	param_kernel	params	split0_test_score	split1_test_score	split2_test_score	split3_test_score	split4_test_score	mean_test_score	std_test_score	rank_test_score
0	0.052571	0.002183	0.014920	0.000300	0.1	auto	linear	{'C': 0.1, 'gamma': 'auto', 'kernel': 'linear'}	0.975694	0.986111	0.975610	0.982578	0.975610	0.979121	0.004409	2
1	0.417765	0.015139	0.058806	0.001837	0.1	auto	rbf	{'C': 0.1, 'gamma': 'auto', 'kernel': 'rbf'}	0.104167	0.104167	0.108014	0.108014	0.104530	0.105778	0.001830	10
2	0.301676	0.010570	0.046517	0.000632	0.1	auto	sigmoid	{'C': 0.1, 'gamma': 'auto', 'kernel': 'sigmoid'}	0.104167	0.104167	0.108014	0.108014	0.104530	0.105778	0.001830	10
3	0.052814	0.000838	0.015528	0.000516	0.1	scale	linear	{'C': 0.1, 'gamma': 'scale', 'kernel': 'linear'}	0.975694	0.986111	0.975610	0.982578	0.975610	0.979121	0.004409	2
4	0.228706	0.003448	0.050628	0.000324	0.1	scale	rbf	{'C': 0.1, 'gamma': 'scale', 'kernel': 'rbf'}	0.923611	0.972222	0.937282	0.954704	0.958188	0.949202	0.016958	6

NuSVR ¶

The next support vector machine model that we'll introduce is NuSVR. It is available as a part of the svm module of sklearn. We'll divide the regression dataset into train/test sets, train NuSVR with default parameters (apart from a larger cache_size), evaluate its performance on the test set, and then tune the model by trying various hyperparameters to improve performance further. We'll also introduce important attributes of the trained model that can give useful insights.

Splitting Dataset into Train & Test sets¶

X_train, X_test, Y_train, Y_test = train_test_split(X_boston, Y_boston, train_size=0.80, test_size=0.20, random_state=123)
print('Train/Test Sizes : ', X_train.shape, X_test.shape, Y_train.shape, Y_test.shape)

Train/Test Sizes :  (404, 13) (102, 13) (404,) (102,)

Fitting Default NuSVR to Train Data¶

from sklearn.svm import NuSVR

nu_svr = NuSVR(cache_size=1000)
nu_svr.fit(X_train, Y_train)

NuSVR(cache_size=1000)

Evaluate Model Accuracy on Test Set¶

Y_preds = nu_svr.predict(X_test)

print(Y_preds[:15])
print(Y_test[:15])

print('Test R^2 Score : %.3f'%nu_svr.score(X_test, Y_test)) ## For regression models, score() returns the R^2 score (it returns accuracy for classification models).
print('Training R^2 Score : %.3f'%nu_svr.score(X_train, Y_train))

[13.73 24.76 25.28 15.44 23.1  16.22 25.11 13.7  23.71 25.25 23.87 25.04
 21.54 20.78 22.26]
[15.  26.6 45.4 20.8 34.9 21.9 28.7  7.2 20.  32.2 24.1 18.5 13.5 27.
 23.1]
Test R^2 Score : 0.148
Training R^2 Score : 0.255

Important Attributes of Estimator¶

NuSVR provides attributes that offer useful insights once the model is trained. Below is a list of attributes available through NuSVR.

support_vectors_ - It represents support vectors of the trained model.
coef_ - It returns an array representing weights assigned to each feature by the model. It is available only when the linear kernel is used.
intercept_ - It represents the constant term (intercept) of the decision function.

print("Support Vectors :", nu_svr.support_vectors_.shape)

Support Vectors : (204, 13)

#print("Feature Importances :", nu_svr.coef_) ## Only for Linear Kernel

print("Model Intercept :", nu_svr.intercept_)

Model Intercept : [19.65]

Finetuning Model By Doing Grid Search On Various Hyperparameters¶

nu - It accepts a float value in the range (0, 1]. It represents an upper bound on the fraction of margin errors and a lower bound on the fraction of support vectors.
kernel - It specifies the kernel type to be used in the SVM. It accepts one of the strings below or a callable.
- linear
- poly
- rbf (default)
- sigmoid
- precomputed
degree - It accepts integer values and represents a degree for poly kernel. It's ignored when other kernels are used.
gamma - It represents the kernel coefficient for the rbf, poly and sigmoid kernels. It accepts one of the strings below or a float value.
- scale (default) - 1 / (n_features * X.var())
- auto - (1/ n_features)
cache_size - It accepts float values representing the kernel cache size in MB. The default value is 200 MB. It's suggested to increase cache_size based on the available RAM of the computer to speed up training.
max_iter - It specifies the maximum number of iterations to run before stopping the algorithm. The default value of -1 represents no limit, so the algorithm runs until convergence.
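To get a feel for nu, the sketch below fits NuSVR with a few nu values on synthetic data (our assumption, chosen to keep the run short) and reports the fraction of training samples that end up as support vectors. Since nu acts as a lower bound on that fraction, the reported values should sit roughly at or above nu and grow with it.

```python
from sklearn.datasets import make_regression
from sklearn.svm import NuSVR

X, y = make_regression(n_samples=300, n_features=5, noise=10.0, random_state=0)

fractions = {}
for nu in (0.2, 0.5, 0.8):
    model = NuSVR(nu=nu).fit(X, y)
    # support_ holds the indices of the training samples kept as support vectors.
    fractions[nu] = len(model.support_) / len(X)
    print(f"nu={nu:.1f}  fraction of support vectors={fractions[nu]:.2f}")
```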

Below we'll try various values for the above-mentioned hyperparameters to find the best estimator for our dataset by doing 5-fold cross-validation on the training data.

%%time

from sklearn.model_selection import GridSearchCV

params = {
            'nu': [0.1, 1.0,],
            'kernel': ['linear', 'rbf', 'sigmoid',],
            'gamma': ['auto', 'scale']
         }

svr_regressor_grid = GridSearchCV(NuSVR(cache_size=1000), param_grid=params, n_jobs=-1, cv=5, verbose=5)
svr_regressor_grid.fit(X_train,Y_train)

print('Train R^2 Score : %.3f'%svr_regressor_grid.best_estimator_.score(X_train, Y_train))
print('Test R^2 Score : %.3f'%svr_regressor_grid.best_estimator_.score(X_test, Y_test))
print('Best R^2 Score Through Grid Search : %.3f'%svr_regressor_grid.best_score_)
print('Best Parameters : ',svr_regressor_grid.best_params_)

Fitting 5 folds for each of 12 candidates, totalling 60 fits

[Parallel(n_jobs=-1)]: Using backend LokyBackend with 4 concurrent workers.
[Parallel(n_jobs=-1)]: Done  10 tasks      | elapsed:    9.0s
[Parallel(n_jobs=-1)]: Done  53 out of  60 | elapsed:   25.7s remaining:    3.4s
[Parallel(n_jobs=-1)]: Done  60 out of  60 | elapsed:   46.9s finished

Train R^2 Score : 0.725
Test R^2 Score : 0.610
Best R^2 Score Through Grid Search : 0.713
Best Parameters :  {'gamma': 'auto', 'kernel': 'linear', 'nu': 1.0}
CPU times: user 8.76 s, sys: 13.3 ms, total: 8.77 s
Wall time: 55.5 s

Printing First Few Cross Validation Results¶

cross_val_results = pd.DataFrame(svr_regressor_grid.cv_results_)
print('Number of Various Combinations of Parameters Tried : %d'%len(cross_val_results))

cross_val_results.head() ## Printing first few results.

Number of Various Combinations of Parameters Tried : 12

	mean_fit_time	std_fit_time	mean_score_time	std_score_time	param_gamma	param_kernel	param_nu	params	split0_test_score	split1_test_score	split2_test_score	split3_test_score	split4_test_score	mean_test_score	std_test_score	rank_test_score
0	1.903638	0.435652	0.001060	0.000065	auto	linear	0.1	{'gamma': 'auto', 'kernel': 'linear', 'nu': 0.1}	0.682853	0.687668	0.717526	0.655335	0.605221	0.669721	0.037807	3
1	13.480651	5.694576	0.001730	0.000147	auto	linear	1	{'gamma': 'auto', 'kernel': 'linear', 'nu': 1.0}	0.679955	0.799852	0.716207	0.757667	0.613447	0.713426	0.064110	1
2	0.003685	0.000221	0.000962	0.000036	auto	rbf	0.1	{'gamma': 'auto', 'kernel': 'rbf', 'nu': 0.1}	-0.191299	-0.429605	-0.142359	-0.123709	-0.200381	-0.217471	0.109919	12
3	0.017416	0.000186	0.002699	0.000028	auto	rbf	1	{'gamma': 'auto', 'kernel': 'rbf', 'nu': 1.0}	0.005197	0.021444	-0.003875	-0.020708	-0.005375	-0.000664	0.013838	7
4	0.002699	0.000031	0.000793	0.000030	auto	sigmoid	0.1	{'gamma': 'auto', 'kernel': 'sigmoid', 'nu': 0.1}	-0.192127	-0.432815	-0.133293	-0.111075	-0.207635	-0.215389	0.114452	11

NuSVC ¶

The last support vector machine model that we'll introduce is NuSVC. It is available as a part of the svm module of sklearn. We'll divide the classification dataset into train/test sets, train NuSVC with default parameters (apart from a larger cache_size), evaluate its performance on the test set, and then tune the model by trying various hyperparameters to improve performance further. We'll also introduce important attributes of the trained model that can give useful insights.

Splitting Dataset into Train & Test sets¶

X_train, X_test, Y_train, Y_test = train_test_split(X_digits, Y_digits, train_size=0.80, test_size=0.20, random_state=123)

print('Train/Test Sizes : ', X_train.shape, X_test.shape, Y_train.shape, Y_test.shape)

Train/Test Sizes :  (1437, 64) (360, 64) (1437,) (360,)

Fitting Default NuSVC to Train Data¶

from sklearn.svm import NuSVC

nu_svc = NuSVC(cache_size=1000)
nu_svc.fit(X_train, Y_train)

NuSVC(cache_size=1000)

Evaluate Model Accuracy on Test Set¶

Y_preds = nu_svc.predict(X_test)

print(Y_preds[:15])
print(Y_test[:15])

print('Test Accuracy : %.3f'%(Y_preds == Y_test).mean())
print('Test Accuracy : %.3f'%nu_svc.score(X_test, Y_test)) ## Score method also evaluates accuracy for classification models.
print('Training Accuracy : %.3f'%nu_svc.score(X_train, Y_train))

[3 3 4 4 1 3 1 0 7 4 0 0 5 1 6]
[3 3 4 4 1 3 1 0 7 4 0 0 5 1 6]
Test Accuracy : 0.958
Test Accuracy : 0.958
Training Accuracy : 0.972

Important Attributes of Estimator¶

NuSVC has the same set of attributes as NuSVR.

print("Support Vectors :", nu_svc.support_vectors_.shape)

Support Vectors : (1215, 64)

#print("Feature Importances :", nu_svc.coef_) ## Only for Linear Kernel

print("Model Intercept :", nu_svc.intercept_)

Model Intercept : [-0.41 -0.28 -0.24 -0.38 -0.31 -0.09 -0.31 -0.36 -0.44  0.19  0.37  0.28
  0.24  0.46  0.11  0.48  0.16  0.14 -0.03  0.07  0.22 -0.08  0.02 -0.04
 -0.11  0.01  0.1  -0.18 -0.13 -0.12  0.07  0.34 -0.02  0.05 -0.01  0.18
 -0.06  0.02 -0.27 -0.2  -0.27 -0.26  0.12  0.03 -0.14]

Finetuning Model By Doing Grid Search On Various Hyperparameters¶

NuSVC has the same parameters as NuSVR.

%%time

params = {
            'nu': [0.1, 1.0, ],
            'kernel': ['linear', 'rbf', 'sigmoid',],
            'gamma': ['auto', 'scale']
         }

svc_classifier_grid = GridSearchCV(NuSVC(cache_size=1000), param_grid=params, n_jobs=-1, cv=5, verbose=5)
svc_classifier_grid.fit(X_train,Y_train)

print('Train Accuracy : %.3f'%svc_classifier_grid.best_estimator_.score(X_train, Y_train))
print('Test Accuracy : %.3f'%svc_classifier_grid.best_estimator_.score(X_test, Y_test))
print('Best Accuracy Through Grid Search : %.3f'%svc_classifier_grid.best_score_)
print('Best Parameters : ',svc_classifier_grid.best_params_)

Fitting 5 folds for each of 12 candidates, totalling 60 fits

[Parallel(n_jobs=-1)]: Using backend LokyBackend with 4 concurrent workers.
[Parallel(n_jobs=-1)]: Done  12 tasks      | elapsed:    0.3s
[Parallel(n_jobs=-1)]: Done  60 out of  60 | elapsed:    1.7s finished

Train Accuracy : 0.997
Test Accuracy : 0.992
Best Accuracy Through Grid Search : 0.989
Best Parameters :  {'gamma': 'scale', 'kernel': 'rbf', 'nu': 0.1}
CPU times: user 422 ms, sys: 16.1 ms, total: 438 ms
Wall time: 1.98 s
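
Candidates with nu=1.0 come back with NaN scores in this search because that value is infeasible for the digits data. The sketch below is a rough illustration, assuming the pairwise feasibility condition used by LIBSVM (nu <= 2 * min(n_i, n_j) / (n_i + n_j) for every pair of classes). It estimates the largest usable nu from the class counts and confirms that fitting with nu=1.0 raises an error. The names `max_feasible_nu` and `nu_one_failed` are illustrative, and the full dataset is used instead of the train split to keep the example short.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.svm import NuSVC

X, Y = load_digits(return_X_y=True)

# Samples per class; the digit classes are nearly balanced but not equal.
counts = np.bincount(Y)

# Every pair of classes imposes its own upper bound on nu;
# the tightest pair decides the limit for the whole model.
max_feasible_nu = min(
    2 * min(n_i, n_j) / (n_i + n_j)
    for a, n_i in enumerate(counts)
    for n_j in counts[a + 1:]
)
print('Largest Feasible nu (estimate) : %.3f' % max_feasible_nu)

# nu=1.0 exceeds that bound, so the underlying solver rejects it.
try:
    NuSVC(nu=1.0).fit(X, Y)
    nu_one_failed = False
except ValueError as err:
    nu_one_failed = True
    print('nu=1.0 failed :', err)
```

Keeping the grid below this bound (for example nu values such as 0.1, 0.3 and 0.5) avoids spending fits on candidates that cannot succeed.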

Printing First Few Cross Validation Results¶

cross_val_results = pd.DataFrame(svc_classifier_grid.cv_results_)
print('Number of Various Combinations of Parameters Tried : %d'%len(cross_val_results))

cross_val_results.head() ## Printing first few results.

Number of Various Combinations of Parameters Tried : 12

	mean_fit_time	std_fit_time	mean_score_time	std_score_time	param_gamma	param_kernel	param_nu	params	split0_test_score	split1_test_score	split2_test_score	split3_test_score	split4_test_score	mean_test_score	std_test_score	rank_test_score
0	0.100268	0.000486	0.019382	0.000251	auto	linear	0.1	{'gamma': 'auto', 'kernel': 'linear', 'nu': 0.1}	0.968750	0.996528	0.975610	0.989547	0.986063	0.983299	0.009924	2
1	0.001678	0.000147	0.000000	0.000000	auto	linear	1	{'gamma': 'auto', 'kernel': 'linear', 'nu': 1.0}	NaN	NaN	NaN	NaN	NaN	NaN	NaN	7
2	0.520766	0.032240	0.063453	0.011014	auto	rbf	0.1	{'gamma': 'auto', 'kernel': 'rbf', 'nu': 0.1}	0.479167	0.475694	0.466899	0.554007	0.456446	0.486443	0.034685	4
3	0.002256	0.001235	0.000000	0.000000	auto	rbf	1	{'gamma': 'auto', 'kernel': 'rbf', 'nu': 1.0}	NaN	NaN	NaN	NaN	NaN	NaN	NaN	8
4	0.039665	0.003305	0.006151	0.000455	auto	sigmoid	0.1	{'gamma': 'auto', 'kernel': 'sigmoid', 'nu': 0.1}	0.104167	0.104167	0.108014	0.108014	0.104530	0.105778	0.001830	6
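
The first few rows of cv_results_ are in grid order, not in order of quality. A minimal sketch, assuming a smaller grid than the one above so that it runs quickly, that keeps only the most useful columns and sorts candidates by rank_test_score. The `dropna` call guards against failed candidates such as the NaN rows shown above, and the names `grid` and `ranked` are illustrative.

```python
import pandas as pd
from sklearn.datasets import load_digits
from sklearn.model_selection import GridSearchCV
from sklearn.svm import NuSVC

X, Y = load_digits(return_X_y=True)

# A reduced grid with feasible nu values only (an assumption for this sketch).
params = {'nu': [0.1, 0.5], 'kernel': ['linear', 'rbf']}
grid = GridSearchCV(NuSVC(), param_grid=params, cv=3)
grid.fit(X, Y)

results = pd.DataFrame(grid.cv_results_)
columns = ['param_nu', 'param_kernel', 'mean_test_score',
           'std_test_score', 'rank_test_score']

# Drop candidates whose fits failed, then put the best candidate first.
ranked = (results[columns]
          .dropna(subset=['mean_test_score'])
          .sort_values('rank_test_score')
          .reset_index(drop=True))
print(ranked)
```

The first row of `ranked` corresponds to `grid.best_params_`, and its mean_test_score equals `grid.best_score_`.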

This concludes our short tutorial on the SVM estimators available in sklearn. Please feel free to share your views in the comments section.

Sunny Solanki

