Updated On : Oct-27,2021 Tags bayesian-optimization, hyperparameters-tuning
bayes_opt: Bayesian Optimization for Hyperparameters Tuning

Machine learning is one of the most active branches of computer science at the time of writing this tutorial. New deep neural networks are developed all the time to solve hard tasks (image classification, object detection, speech recognition, etc.) that were once out of reach for computers. The datasets needed to train complicated ML models also keep growing. Because tuning a model means training it with many different combinations of settings, building a model on a large dataset can take a lot of time. The usual approach to finding the best hyperparameters is grid search, which tries every combination of the candidate values and becomes very slow when the dataset is large. Grid search also wastes time on ranges of hyperparameter values that do not give good results, because it still tries every combination in those ranges. What we want instead is a tuning process that concentrates on combinations likely to give good results rather than trying every combination in the given range.

Python has a library named bayes_opt which helps us with this. It uses Bayesian inference and a Gaussian process to find hyperparameter values that give good results in fewer trials. It can take any black-box function as input and maximize the value that function returns. The library starts by constructing a posterior distribution over functions (a Gaussian process) that describes the function we want to maximize. It then tries different combinations of the parameters that the function takes as input, and the posterior improves with every combination tried. The optimizer learns which regions of the parameter space give good results and keeps exploring there rather than searching the whole space. At each trial, the Gaussian process is fitted, and the posterior together with an exploration strategy (UCB (Upper Confidence Bound) or EI (Expected Improvement)) determines which combination of parameters to try next. The goal is to find the best parameter setting in as few trials as possible.
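
To make the role of the exploration strategy concrete, the sketch below scores three candidate points with UCB and EI using only the standard library. It is not the library's implementation: the posterior means and standard deviations are made-up illustrative numbers, and kappa and xi are assumed trade-off parameters.

```python
import math

def normal_pdf(z):
    # Density of the standard normal distribution
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def normal_cdf(z):
    # Cumulative distribution of the standard normal distribution
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def upper_confidence_bound(mean, std, kappa=2.0):
    # Optimistic estimate: posterior mean plus kappa standard deviations
    return mean + kappa * std

def expected_improvement(mean, std, best, xi=0.0):
    # Expected amount by which a point beats the best target seen so far
    gain = mean - best - xi
    if std == 0.0:
        return max(gain, 0.0)
    z = gain / std
    return gain * normal_cdf(z) + std * normal_pdf(z)

# Hypothetical posterior at three candidate values of x: (x, mean, std)
candidates = [(1.0, -15.0, 0.5), (4.0, -1.0, 0.2), (4.5, -2.0, 3.0)]
best_so_far = -1.0

ucb_choice = max(candidates, key=lambda c: upper_confidence_bound(c[1], c[2]))
ei_choice = max(candidates, key=lambda c: expected_improvement(c[1], c[2], best_so_far))
print("UCB picks x =", ucb_choice[0])
print("EI picks x  =", ei_choice[0])
```

With these numbers both strategies prefer the candidate whose outcome is most uncertain, which is the exploration half of the trade-off between trying unknown regions and refining known good ones.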

As a part of this tutorial, we'll explain with simple examples how we can use bayes_opt to optimize hyperparameters of scikit-learn estimators to get good results. We'll try to explain the API of the library with simple examples. Below we have listed important sections of the tutorial to give an overview of the material covered.

Important Sections of Tutorial

  1. Simple Line Formula
    • Define Objective Function
    • Define Search Space
    • Maximize Objective Function
    • Printing Best Results
    • Rerunning Maximization Process for Few More Steps
    • Changing Parameter Bounds
  2. Regression Scikit-Learn
    • Load Dataset
    • Define Objective Function
    • Define Search Space
    • Maximize Objective Function
    • Printing Best Results
    • Maximize Objective Function for Few More Iterations
    • Printing Best Results Again
    • Create ML Model with Best Hyperparameters and Evaluate
  3. Classification Scikit-Learn
  4. Manual Optimization Loop (Suggest Parameters, Evaluate Objective Function and Register Results)
  5. Guided Optimization
  6. Saving and Resuming Optimization Process
    • Load Dataset
    • Define Objective Function
    • Define Search Space
    • Define Optimizer
    • Define Logger and Subscribe Logged with Optimizer
    • Maximize Objective Function
    • Print Best Results
    • Create New Optimizer
    • Load New Optimizer using Logs of Previous Optimizer
    • Print Results using Optimizer Loaded from File
    • Maximize New Optimizer
    • Print Best Results

Steps for Hyperparameters Optimization

We'll be following the steps mentioned below in all our sections to explain how to perform hyperparameters optimization.

  1. Define Objective Function
    • This is the function that holds the actual logic: it creates the ML model, fits it on the train data, evaluates a metric (R^2 score for a regression task, accuracy for a classification task, etc.) on the test dataset, and returns the metric value that we want to maximize.
  2. Define Hyperparameters Search Space
    • Here, we define a dictionary with the hyperparameter name as key and the range of values to try for that hyperparameter as value.
  3. Maximize objective function by trying different hyperparameters combinations from hyperparameters search space.
    • Here, we create an instance of BayesianOptimization and use it to maximize the return value of the objective function by trying different combinations of hyperparameters. The process concentrates on the ranges of hyperparameter values where the objective function returns its highest values.
  4. Print best results after optimization

Installation

We can easily install bayes_opt by using pip or conda commands.

  • pip install bayesian-optimization
  • conda install -c conda-forge bayesian-optimization

We'll start by importing bayes_opt library.

In [1]:
import bayes_opt


import warnings
warnings.filterwarnings("ignore")

1. Simple Line Formula

In this section, we'll explain how we can use bayes_opt to maximize a simple line formula. We'll be optimizing the formula -1 * abs(5x-21). It has a maximum value of 0, reached when 5x-21 evaluates to 0, i.e. at x = 21/5 = 4.2. The abs() function makes sure that the value of 5x-21 is never negative, and multiplying the output of abs() by -1 makes it never positive. The output is therefore 0 when abs(5x-21) returns 0 and less than 0 everywhere else. We'll provide a range for the parameter x, and the Bayesian optimizer will try different values in that range to find the maximum of the function in few trials.
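
As a quick sanity check on where the maximum lies, this small sketch evaluates the formula on a coarse grid from -1 to 5 (the step of 0.1 is an arbitrary choice for illustration) and looks up the grid point with the largest value.

```python
def line_formula(x):
    return -1 * abs(5 * x - 21)

# Grid -1.0, -0.9, ..., 5.0 built from integers to avoid accumulated float error
grid = [i / 10 for i in range(-10, 51)]

best_x = max(grid, key=line_formula)
print("Grid point with the largest value:", best_x)
print("Value at that point:", line_formula(best_x))
```

The grid scan needs 61 evaluations to locate x = 4.2; the optimizer in the following cells gets close with far fewer.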

Define Objective Function

In this section, we have defined the objective function, which takes a single parameter x as input and returns the value of the formula -1 * abs(5x - 21). Our Bayesian optimizer will pass different values of x to this function, looking for the value that maximizes its return value in as few trials as possible.

In [2]:
def objective(x):
    return -1 * abs(5*x - 21)

Define Search Space

In this section, we have defined parameters search space dictionary. As we have only one parameter to optimize in our function, our dictionary consists of only 1 entry. We have provided range (-1,5) so that optimizer will try values in that range for x.

In [3]:
search_space = {"x" : (-1, 5)}

Maximize Objective Function

In this section, we have actually defined an instance of BayesianOptimization optimizer and performed objective function maximization using it. In order to perform parameter search, we need to create an instance of BayesianOptimization first. Below we have given the definition of class.


  • BayesianOptimization(f,pbounds,random_state=None,verbose=2) - This constructor takes the objective function as its first parameter and the parameters search space dictionary as its second parameter. It returns an instance of BayesianOptimization which we can use to perform parameter optimization by calling the maximize() method on it.
    • The f parameter accepts a reference to the objective function.
    • The pbounds parameter accepts the parameters search space dictionary.
    • The random_state parameter accepts an integer value and is used for reproducibility.
    • The verbose parameter accepts integer values 0, 1, or 2. 0 silences the output, 1 prints a trial only when a new maximum is found, and 2 prints all trials.

Important Methods of BayesianOptimization

  • maximize(init_points=5,n_iter=25,acq='ucb') - This method tries different values of hyperparameters on the objective function and stores the results. It chooses values in a way that maximizes the function in fewer trials.
    • The init_points parameter accepts an integer specifying how many random parameter combinations to try first. After this random exploration completes, more trials are performed based on the value of the n_iter parameter. The random trials help the optimizer decide in which ranges to try values so that the output is maximized faster. The default value is 5, which means it'll try 5 random parameter combinations first.
    • The n_iter parameter accepts an integer specifying how many parameter combinations to try after the initial random trials are completed. These trials concentrate on the areas that look most promising given the results so far. The default value is 25.
    • The acq parameter accepts one of the three strings listed below, specifying which exploration strategy to use when choosing the next parameter combination. The default is 'ucb'.
      • 'ucb' - Upper Confidence Bound
      • 'ei' - Expected Improvement
      • 'poi' - Probability of Improvement
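
To illustrate what the random exploration phase means, here is a minimal sketch that draws init_points parameter combinations uniformly from a pbounds-style dictionary. The helper sample_initial_points is hypothetical and only mimics the idea; the library uses its own random number generator, so these values will not match the output shown in this tutorial.

```python
import random

def sample_initial_points(pbounds, init_points, seed=None):
    # Draw each parameter independently and uniformly from its (low, high) range
    rng = random.Random(seed)
    return [
        {name: rng.uniform(low, high) for name, (low, high) in pbounds.items()}
        for _ in range(init_points)
    ]

pbounds = {"x": (-1, 5)}
for point in sample_initial_points(pbounds, init_points=5, seed=123):
    print(point)
```

Passing the same seed twice returns the same points, which is the role random_state plays in the constructor described above.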

Below, we have first created an instance of BayesianOptimization with our objective function and parameters search space. We have then called the maximize() method on it with n_iter set to 15 to maximize the value returned by the objective function.

We can notice from the output that it prints results for 20 trials (the 5 default random trials followed by the 15 requested ones). The best value is reached at trial 20, where the value of x tried is 4.203, which brings our function -1 * abs(5x-21) very close to zero.

In [4]:
optimizer = bayes_opt.BayesianOptimization(
                                f=objective,
                                pbounds=search_space,
                                random_state=123
                              )
In [8]:
optimizer.maximize(n_iter=15)
|   iter    |  target   |     x     |
-------------------------------------
|  1        | -5.106    |  3.179    |
|  2        | -17.42    |  0.7168   |
|  3        | -19.19    |  0.3611   |
|  4        | -9.461    |  2.308    |
|  5        | -4.416    |  3.317    |
|  6        | -4.0      |  5.0      |
|  7        | -0.1811   |  4.236    |
|  8        | -1.127    |  4.425    |
|  9        | -1.078    |  3.984    |
|  10       | -26.0     | -1.0      |
|  11       | -0.1448   |  4.171    |
|  12       | -0.03653  |  4.193    |
|  13       | -13.09    |  1.583    |
|  14       | -2.611    |  3.678    |
|  15       | -2.578    |  4.716    |
|  16       | -0.1449   |  4.171    |
|  17       | -7.308    |  2.738    |
|  18       | -1.767    |  3.847    |
|  19       | -1.764    |  4.553    |
|  20       | -0.01521  |  4.203    |
=====================================

Printing Best Results

In this section, we are printing the results of our bayesian optimization process. The BayesianOptimization optimizer object has some attributes which we can explore to retrieve the results of our optimization process.

The max attribute holds a dictionary with the parameter setting that produced the maximum value of our objective function, together with that maximum value. Below we have printed which parameter setting gave the maximum value.

We can access every setting tried and its objective value through the res attribute of the BayesianOptimization optimizer. In the next cell, we have printed the last 10 optimization settings tried.

In [9]:
print("Best Parameter Setting : {}".format(optimizer.max["params"]))
print("Best Target Value      : {}".format(optimizer.max["target"]))
Best Parameter Setting : {'x': 4.203041946978986}
Best Target Value      : -0.015209734894927607
In [10]:
results = optimizer.res

results[-10:]
Out[10]:
[{'target': -0.14476294903148101, 'params': {'x': 4.171047410193704}},
 {'target': -0.03652678561840972, 'params': {'x': 4.192694642876318}},
 {'target': -13.085615967093661, 'params': {'x': 1.5828768065812677}},
 {'target': -2.610947786025889, 'params': {'x': 3.677810442794822}},
 {'target': -2.5780171943304673, 'params': {'x': 4.7156034388660935}},
 {'target': -0.1449457566183341, 'params': {'x': 4.1710108486763335}},
 {'target': -7.307649075358398, 'params': {'x': 2.73847018492832}},
 {'target': -1.767221827875474, 'params': {'x': 3.846555634424905}},
 {'target': -1.7638687020022559, 'params': {'x': 4.552773740400451}},
 {'target': -0.015209734894927607, 'params': {'x': 4.203041946978986}}]
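
Since res is a plain list of dictionaries, ordinary Python tools work on it. The sketch below uses a few entries copied (and rounded) from the output above in place of a live optimizer and picks out the best trials.

```python
# Entries copied from the output above, rounded for brevity
results = [
    {"target": -0.1448, "params": {"x": 4.171}},
    {"target": -0.03653, "params": {"x": 4.193}},
    {"target": -13.09, "params": {"x": 1.583}},
    {"target": -2.611, "params": {"x": 3.678}},
    {"target": -0.01521, "params": {"x": 4.203}},
]

# Single best trial, equivalent in spirit to the max attribute
best = max(results, key=lambda r: r["target"])

# Three best trials, highest target first
top3 = sorted(results, key=lambda r: r["target"], reverse=True)[:3]

print("Best:", best)
for rank, trial in enumerate(top3, start=1):
    print(rank, trial["params"]["x"], trial["target"])
```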

Rerunning Maximization Process for Few More Steps

We can also call maximize() function more than once with different values of n_iter and init_points asking it to try more parameter settings if we are not satisfied with the results from the previous call to it.

Below we have called maximize() again with init_points set to 0 and n_iter set to 10. This will ask it to not try any random trials but try 10 more trials based on previous trials.

We can notice from the results that, after one exploratory trial at x = -0.2659, the optimizer keeps trying values of x around 4.203, as this is where the objective function returns its maximum value.

In [73]:
optimizer.maximize(init_points=0, n_iter=10)
|   iter    |  target   |     x     |
-------------------------------------
|  21       | -22.33    | -0.2659   |
|  22       | -0.01649  |  4.203    |
|  23       | -0.01625  |  4.203    |
|  24       | -0.01594  |  4.203    |
|  25       | -0.01568  |  4.203    |
|  26       | -0.0155   |  4.203    |
|  27       | -0.01536  |  4.203    |
|  28       | -0.01526  |  4.203    |
|  29       | -0.01518  |  4.203    |
|  30       | -0.01512  |  4.203    |
=====================================

Printing Results

Below we have printed the best results after trying 10 more parameters combinations.

In [74]:
print("Best Parameter Setting : {}".format(optimizer.max["params"]))
print("Best Target Value      : {}".format(optimizer.max["target"]))
Best Parameter Setting : {'x': 4.203024332590308}
Best Target Value      : -0.015121662951543158

Changing Parameter Bounds

We can also change the range of any parameter if we are not getting good results from the current ranges. BayesianOptimization provides a method named set_bounds() which accepts a dictionary with a parameter search space. It replaces the search space of the parameters mentioned in this dictionary.

Below we have replaced the search space of parameter x with the new range (-5,-1). We know that we won't get good results by trying values in this range, but this is for explanation purposes.

In [58]:
optimizer.set_bounds(new_bounds={"x": (-5, -1)})

Maximize Objective Function with New Bounds

Below we have called maximize() again to run for 10 trials. It'll now try values in the new range and record results. We can notice from the results that they are not good compared to our previous trials.

In [59]:
optimizer.maximize(
    init_points=0,
    n_iter=10,
)
|   iter    |  target   |     x     |
-------------------------------------
|  31       | -46.0     | -5.0      |
|  32       | -35.33    | -2.866    |
|  33       | -30.32    | -1.865    |
|  34       | -40.3     | -3.86     |
|  35       | -28.04    | -1.408    |
|  36       | -32.76    | -2.352    |
|  37       | -37.74    | -3.349    |
|  38       | -26.87    | -1.174    |
|  39       | -42.98    | -4.395    |
|  40       | -26.28    | -1.055    |
=====================================

Below we have printed the best results for all trials. We can notice that it has still maintained the value of x at 4.2 which gave the best results. All trials which we performed after setting a new bound to x were not able to improve results further.

In [60]:
print("Best Parameter Setting : {}".format(optimizer.max["params"]))
print("Best Target Value      : {}".format(optimizer.max["target"]))
Best Parameter Setting : {'x': 4.203024332590308}
Best Target Value      : -0.015121662951543158
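
The reason no trial in the new range can beat the earlier result is simple arithmetic: the formula increases with x everywhere to the left of x = 4.2, so within (-5, -1) the best it can reach is its value at the upper bound. A small check, reusing the objective from this section:

```python
def objective(x):
    return -1 * abs(5 * x - 21)

lower_bound_value = objective(-5)   # value at the lower end of the new range
upper_bound_value = objective(-1)   # best reachable value inside (-5, -1)
previous_best = -0.015121662951543158  # best target printed above

print("Value at x = -5:", lower_bound_value)
print("Value at x = -1:", upper_bound_value)
print("Can the new range beat the previous best?", upper_bound_value > previous_best)
```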

2. Regression Scikit-Learn

In this section, we'll explain how we can use bayes_opt with scikit-learn to find the best hyperparameters for a linear model (Ridge) on a regression task. We'll be using the Boston housing dataset available from scikit-learn for our example. We'll be following the same steps that we listed earlier for using bayes_opt.

If you are interested in learning about regression using scikit-learn then please feel free to check our tutorial on the same which explains it with simple examples.

Load Dataset

In this section, we have loaded the Boston housing dataset available from scikit-learn. It has information about houses in Boston like the average number of rooms, the crime rate in the area, the tax rate, etc. The target variable of the dataset is the median value of homes in thousands of dollars. As the target variable is continuous, this is a regression problem.

Below we have loaded our Boston housing dataset as variables X and Y. The variable X has the data for each feature and the variable Y has the target values. We have then divided the dataset into train (80%) and test (20%) sets.

In [115]:
from sklearn import datasets
from sklearn.model_selection import train_test_split

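# load_boston was deprecated in scikit-learn 1.0 and removed in 1.2,
# so this cell needs an older scikit-learn release to run as written.
X, Y = datasets.load_boston(return_X_y=True)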

X_train, X_test, Y_train, Y_test = train_test_split(X, Y, train_size=0.8, random_state=123)

X_train.shape, X_test.shape, Y_train.shape, Y_test.shape
Out[115]:
((404, 13), (102, 13), (404,), (102,))

Define Objective Function

In this section, we have created the objective function that we'll be using for our regression task. We'll be using the Ridge regression model available from scikit-learn and fitting it on our training dataset. We'll be tuning 3 hyperparameters of the Ridge model.

  • alpha
  • fit_intercept
  • solver

Our objective function takes as input these 3 hyperparameters. It then creates an instance of the Ridge model using the values of these 3 hyperparameters.

The bayes_opt library can provide only float values as hyperparameters, so we have included a little logic which converts these floats to actual hyperparameter values. We have created a list of the values accepted by each of the hyperparameters fit_intercept and solver. For fit_intercept, a float above 0.5 selects the second entry of its list and any other value selects the first. For solver, the float is truncated to an integer with int() and that integer is used as an index into the list.

After creating the Ridge regression model, we train it on the train data. At last, we return the R^2 score calculated on the test data using the trained model. The R^2 score is at most 1, which is the best possible score, and it can be negative for a model that performs worse than always predicting the mean. We want our optimizer to push the score as high as possible.

If you are interested in learning about R^2 score and other metrics available from scikit-learn then please feel free to check our tutorial on the same. It covers the majority of ML metrics.
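
For reference, the R^2 score that score() returns for regressors is 1 - SS_res / SS_tot. The sketch below computes it by hand on tiny made-up numbers (not the housing data) to show three typical situations: perfect predictions, always predicting the mean, and badly wrong predictions.

```python
def r2_score_by_hand(y_true, y_pred):
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error of the predictions
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

y_true = [1.0, 2.0, 3.0]
perfect = r2_score_by_hand(y_true, [1.0, 2.0, 3.0])
mean_only = r2_score_by_hand(y_true, [2.0, 2.0, 2.0])
reversed_order = r2_score_by_hand(y_true, [3.0, 2.0, 1.0])

print("Perfect predictions  :", perfect)
print("Always the mean      :", mean_only)
print("Reversed predictions :", reversed_order)
```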

In [116]:
from sklearn.linear_model import Ridge

fi_range = [True, False]
solvers = ["svd", "cholesky", "lsqr", "sparse_cg", "sag", "saga"]

def objective(alpha, fit_intercept, solver):
    regressor = Ridge(alpha=alpha,
                      fit_intercept=fi_range[1 if fit_intercept > 0.5 else 0],
                      solver=solvers[int(solver)],
                      random_state=123)

    regressor.fit(X_train, Y_train)

    return regressor.score(X_test, Y_test)

Define Search Space

Below we have declared the hyperparameters search space for our regression model. We have set the range of alpha to (0.5,5), so different float values in this range will be tried for it. The ranges for fit_intercept and solver are set to (0,1) and (0,5) respectively. The float values tried for these two parameters are converted to list entries inside our objective function above: fit_intercept by comparing the value with 0.5, and solver by truncating the value to an integer index. Because int() truncates, the last index 5 ("saga") is selected only when the optimizer tries exactly 5.0.

In [117]:
search_space = {
    "alpha": (0.5, 5),
    "fit_intercept": (0, 1),
    "solver": (0, 5)
}
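
The conversion from optimizer floats to list entries deserves a closer look. The sketch below reproduces the int() truncation used for solver and, as one possible alternative that is not part of the original tutorial, a hypothetical helper pick_evenly that gives every entry an equal share of the range.

```python
solvers = ["svd", "cholesky", "lsqr", "sparse_cg", "sag", "saga"]

def pick_by_truncation(value, choices):
    # Same logic as the objective function: truncate the float to an index
    return choices[int(value)]

def pick_evenly(value, low, high, choices):
    # Hypothetical alternative: split [low, high] into len(choices) equal bins
    fraction = (value - low) / (high - low)
    index = min(int(fraction * len(choices)), len(choices) - 1)
    return choices[index]

for value in (0.0, 1.757, 4.999, 5.0):
    print(value, pick_by_truncation(value, solvers), pick_evenly(value, 0, 5, solvers))
```

With truncation over (0, 5), "saga" is reachable only at exactly 5.0, whereas the binned version gives each solver one sixth of the range. Whichever mapping is chosen, the same one has to be used when decoding the best parameters afterwards.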

Maximize Objective Function

In this section, we have first created an optimizer using BayesianOptimization constructor by giving it an objective function and hyperparameters search space. We have then called maximize() method with default parameters on the optimizer instance. This will run the optimization process for a total of 30 (5 random + 25 normal) iterations. It'll try 30 different combinations of hyperparameters on the objective function.

In [118]:
optimizer = bayes_opt.BayesianOptimization(
                                f=objective,
                                pbounds=search_space,
                                random_state=123
                              )
In [119]:
optimizer.maximize()
|   iter    |  target   |   alpha   | fit_in... |  solver   |
-------------------------------------------------------------
|  1        |  0.6428   |  3.634    |  0.2861   |  1.134    |
|  2        |  0.4664   |  2.981    |  0.7195   |  2.116    |
|  3        |  0.4665   |  4.913    |  0.6848   |  2.405    |
|  4        |  0.6278   |  2.265    |  0.3432   |  3.645    |
|  5        |  0.6448   |  2.474    |  0.05968  |  1.99     |
|  6        |  0.6278   |  2.273    |  0.2888   |  3.676    |
|  7        |  0.6466   |  1.846    |  0.0      |  1.878    |
|  8        |  0.6449   |  2.419    |  0.0      |  1.112    |
|  9        |  0.6432   |  3.317    |  0.0      |  0.2696   |
|  10       |  0.6419   |  4.433    |  0.0      |  0.3416   |
|  11       |  0.605    |  3.888    |  1.0      |  0.05943  |
|  12       |  0.6272   |  1.183    |  0.0      |  3.048    |
|  13       |  0.6466   |  1.828    |  0.0      |  0.0      |
|  14       |  0.6505   |  0.963    |  0.0      |  0.7647   |
|  15       |  0.6057   |  1.1      |  0.9904   |  0.156    |
|  16       |  0.6539   |  0.5      |  0.0      |  1.757    |
|  17       |  0.6061   |  0.5      |  1.0      |  1.822    |
|  18       |  0.5398   |  1.234    |  0.8002   |  4.552    |
|  19       |  0.6053   |  2.45     |  0.7578   |  0.1352   |
|  20       |  0.6453   |  3.64     |  0.0      |  5.0      |
|  21       |  0.5155   |  4.018    |  1.0      |  5.0      |
|  22       |  0.6475   |  1.575    |  0.375    |  1.091    |
|  23       |  0.6453   |  2.736    |  0.0      |  5.0      |
|  24       |  0.6499   |  3.278    |  0.0      |  4.149    |
|  25       |  0.6535   |  0.5498   |  0.01677  |  0.07451  |
|  26       |  0.605    |  5.0      |  1.0      |  0.0      |
|  27       |  0.6437   |  0.5001   |  0.03359  |  2.682    |
|  28       |  0.4389   |  0.5      |  1.0      |  3.125    |
|  29       |  0.6498   |  5.0      |  0.0      |  4.604    |
|  30       |  0.6289   |  4.168    |  0.000473 |  3.858    |
=============================================================

Printing Best Results

Below we have retrieved the best performing hyperparameters setting using the max attribute of the optimizer. We have then converted the values of fit_intercept and solver to their actual values with the same logic that the objective function uses. Finally, we have printed the best parameter combination and the value of the objective function.

In [120]:
alpha = optimizer.max["params"]["alpha"]
# Use the same 0.5 threshold as objective() so the reported value matches what was trained
fit_intercept = fi_range[1 if optimizer.max["params"]["fit_intercept"] > 0.5 else 0]
solver = solvers[int(optimizer.max["params"]["solver"])]

print("Best Parameter Setting : {}".format({"alpha": alpha, "fit_intercept": fit_intercept, "solver": solver}))
print("Best R^2               : {}".format(optimizer.max["target"]))
Best Parameter Setting : {'alpha': 0.5, 'fit_intercept': True, 'solver': 'cholesky'}
Best R^2               : 0.653876172658708

Maximize Objective Function for Few More Iterations

Below we have called maximize() function again on optimizer for 2 random iterations and 5 normal iterations. We have tried 7 more iterations to check whether it can further improve results.

In [121]:
optimizer.maximize(init_points=2, n_iter=5)
|   iter    |  target   |   alpha   | fit_in... |  solver   |
-------------------------------------------------------------
|  31       |  0.6425   |  3.821    |  0.1825   |  0.8773   |
|  32       |  0.4389   |  2.892    |  0.5318   |  3.172    |
|  33       |  0.606    |  0.5217   |  0.6437   |  0.9112   |
|  34       |  0.5397   |  2.531    |  0.8454   |  4.563    |
|  35       |  0.6438   |  1.063    |  0.03629  |  2.269    |
|  36       |  0.6293   |  5.0      |  0.0      |  3.853    |
|  37       |  0.6414   |  5.0      |  0.0      |  0.9283   |
=============================================================

Printing Best Results Again

Below we have printed the best hyperparameters combination and objective function value again to verify whether there is any change from the previously printed results. We can notice from the output that it is exactly the same as after our previous call to maximize(). We can stop the optimization process if it's not improving model performance further.

In [122]:
alpha = optimizer.max["params"]["alpha"]
# Use the same 0.5 threshold as objective() so the reported value matches what was trained
fit_intercept = fi_range[1 if optimizer.max["params"]["fit_intercept"] > 0.5 else 0]
solver = solvers[int(optimizer.max["params"]["solver"])]

print("Best Parameter Setting : {}".format({"alpha": alpha, "fit_intercept": fit_intercept, "solver": solver}))
print("Best R^2               : {}".format(optimizer.max["target"]))
Best Parameter Setting : {'alpha': 0.5, 'fit_intercept': True, 'solver': 'cholesky'}
Best R^2               : 0.653876172658708
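
The stopping decision described above can be automated. This is a minimal sketch with a hypothetical helper has_stalled that looks at a list of target values (for example collected from the res attribute) and reports whether the last few trials failed to beat the earlier best; the patience value is an arbitrary illustration.

```python
def has_stalled(targets, patience=7):
    """Return True if none of the last `patience` targets beat the earlier best."""
    if len(targets) <= patience:
        return False
    best_before = max(targets[:-patience])
    return max(targets[-patience:]) <= best_before

# Earlier best target followed by the targets of trials 31-37 from the output above
targets = [0.6539, 0.6425, 0.4389, 0.606, 0.5397, 0.6438, 0.6293, 0.6414]
print("Optimization stalled:", has_stalled(targets, patience=7))
```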

Create ML Model with Best Hyperparameters and Evaluate

In this section, we have created an instance of Ridge regression using the best hyperparameters values that we got using the bayesian optimization process. We have then fit the model with train data and evaluated R^2 score on both train/test datasets.

In [123]:
regressor = Ridge(alpha=alpha,
                  fit_intercept=fit_intercept,
                  solver=solver,
                  random_state=123)

regressor.fit(X_train, Y_train)

print("Train R^2 : {:.2f}".format(regressor.score(X_train, Y_train)))
print("Test  R^2 : {:.2f}".format(regressor.score(X_test, Y_test)))
Train R^2 : 0.76
Test  R^2 : 0.65

3. Classification using Scikit-Learn

In this section, we'll explain how we can use bayes_opt for optimizing hyperparameters for the classification task. We'll be using the wine dataset available from scikit-learn for this example. We'll be creating a logistic regression classification model and trying different hyperparameters combinations using the bayesian process to improve the performance of the model.

If you are interested in learning about classification using scikit-learn then please feel free to check our tutorial on the same which explains it with simple examples.

Load Dataset

In this section, we have loaded the wine dataset available from scikit-learn. The wine dataset has measurements of the chemical constituents of three different types of wine. These measurements are the features of our dataset and the wine type is the target variable.

Below we have loaded the wine dataset from scikit-learn and divided it into the train (80%) and test (20%) sets.

In [22]:
from sklearn import datasets
from sklearn.model_selection import train_test_split

X, Y = datasets.load_wine(return_X_y=True)

X_train, X_test, Y_train, Y_test = train_test_split(X, Y, train_size=0.8, stratify=Y, random_state=123)

X_train.shape, X_test.shape, Y_train.shape, Y_test.shape
Out[22]:
((142, 13), (36, 13), (142,), (36,))

Define Objective Function

In this section, we have declared an objective function for our classification task. Our objective function takes values for 4 hyperparameters as input.

  • C
  • fit_intercept
  • solver
  • penalty

Our objective function first creates an instance of LogisticRegression with the hyperparameter values provided to it. As we explained in the regression section, the optimizer can only provide float values, while some of our hyperparameters take values of other data types. To solve this, we maintain a list of the possible values of each such hyperparameter. Each list here has two entries, so a float above 0.5 selects the second entry and any other value selects the first.

After creating the model, we have fit it with train data and returned model accuracy on the test dataset. We'll be trying to maximize this test accuracy returned by the objective function.

In [23]:
from sklearn.linear_model import LogisticRegression

fi_range = [True, False]
solvers = ["newton-cg", "lbfgs"]
penalties = ["l2", "none"]

def objective(C, fit_intercept, solver, penalty):
    classifier = LogisticRegression(C=C,
                                    fit_intercept=fi_range[1 if fit_intercept > 0.5 else 0],
                                    solver=solvers[1 if solver > 0.5 else 0],
                                    penalty=penalties[1 if penalty > 0.5 else 0],
                                    max_iter=1000,
                                    random_state=123)

    classifier.fit(X_train, Y_train)

    return classifier.score(X_test, Y_test)

Define Search Space

In this section, we have declared the hyperparameters search space as a Python dictionary. The keys of the dictionary are hyperparameter names and the values are the ranges from which to try values of those hyperparameters. The range for hyperparameter C is set as (0.5,5), which means float values in this range will be tried for it. The range for the other three hyperparameters is set as (0,1); the optimizer will try different float values in this range and our objective function will convert them to hyperparameter values by comparing them with 0.5.

In [24]:
search_space = {
    "C": (0.5, 5),
    "fit_intercept": (0, 1),
    "solver": (0, 1),
    "penalty": (0, 1)
}

Maximize Objective Function

In this section, we have first created an optimizer instance (BayesianOptimization) with objective function and hyperparameters search space. We have then called maximize() method on the optimizer to perform hyperparameters optimization. This call will try 30 different hyperparameters combinations (5 random & 25 normal) on an objective function to maximize its output.

In [25]:
optimizer = bayes_opt.BayesianOptimization(
                                f=objective,
                                pbounds=search_space,
                                random_state=123
                              )
In [26]:
optimizer.maximize()
|   iter    |  target   |     C     | fit_in... |  penalty  |  solver   |
-------------------------------------------------------------------------
|  1        |  1.0      |  3.634    |  0.2861   |  0.2269   |  0.5513   |
|  2        |  0.9722   |  3.738    |  0.4231   |  0.9808   |  0.6848   |
|  3        |  1.0      |  2.664    |  0.3921   |  0.3432   |  0.729    |
|  4        |  1.0      |  2.474    |  0.05968  |  0.398    |  0.738    |
|  5        |  0.9722   |  1.321    |  0.1755   |  0.5316   |  0.5318   |
|  6        |  0.9722   |  3.606    |  0.335    |  0.2265   |  0.4817   |
|  7        |  0.9167   |  4.42     |  0.2176   |  0.9687   |  0.4987   |
|  8        |  1.0      |  3.26     |  0.7578   |  0.4993   |  0.9653   |
|  9        |  1.0      |  1.942    |  0.3435   |  0.2603   |  0.7113   |
|  10       |  0.9722   |  1.542    |  0.1625   |  0.6846   |  0.7381   |
|  11       |  1.0      |  1.697    |  0.2227   |  0.437    |  0.7365   |
|  12       |  1.0      |  2.188    |  0.2167   |  0.3378   |  0.797    |
|  13       |  0.9722   |  1.537    |  0.9108   |  0.8714   |  0.5372   |
|  14       |  0.9167   |  2.315    |  0.03605  |  0.8236   |  0.1951   |
|  15       |  1.0      |  1.47     |  0.8772   |  0.03958  |  0.4581   |
|  16       |  0.9167   |  4.741    |  0.8155   |  0.5035   |  0.4211   |
|  17       |  1.0      |  3.702    |  0.1645   |  0.2279   |  0.7312   |
|  18       |  1.0      |  4.771    |  0.01461  |  0.2439   |  0.5287   |
|  19       |  1.0      |  2.418    |  0.2862   |  0.1897   |  0.6593   |
|  20       |  1.0      |  2.483    |  0.2439   |  0.2071   |  0.9688   |
|  21       |  0.9167   |  2.452    |  0.3194   |  0.974    |  0.3718   |
|  22       |  1.0      |  3.314    |  0.3361   |  0.3055   |  0.8988   |
|  23       |  1.0      |  2.949    |  0.5992   |  0.1868   |  1.0      |
|  24       |  0.9722   |  0.9259   |  0.7947   |  0.7214   |  0.9917   |
|  25       |  0.9167   |  3.557    |  0.1829   |  0.5801   |  0.02207  |
|  26       |  0.9722   |  2.274    |  0.1631   |  0.08408  |  0.01162  |
|  27       |  1.0      |  1.318    |  0.4761   |  0.1736   |  0.5302   |
|  28       |  1.0      |  1.575    |  0.5546   |  0.05297  |  0.9072   |
|  29       |  1.0      |  3.111    |  0.5719   |  0.3557   |  0.3084   |
|  30       |  1.0      |  2.944    |  0.8494   |  0.2959   |  0.591    |
=========================================================================

In this section, we have retrieved the hyperparameters combination that maximized the output of our objective function. Because the optimizer only works with floats, we have converted the float values of fit_intercept, solver, and penalty back to the actual settings that were tried. We have then printed the best hyperparameters combination along with the objective function output.

In [27]:
C = optimizer.max["params"]["C"]
fit_intercept = fi_range[1 if optimizer.max["params"]["fit_intercept"] > 0.5 else 0]
solver = solvers[1 if optimizer.max["params"]["solver"] > 0.5 else 0]
penalty = penalties[1 if optimizer.max["params"]["penalty"] > 0.5 else 0]

print("Best Parameter Setting : {}".format({"C": C, "fit_intercept": fit_intercept, "solver": solver, "penalty":penalty}))
print("Best Accuracy          : {:.2f}".format(optimizer.max["target"]))
Best Parameter Setting : {'C': 3.6341113351903775, 'fit_intercept': True, 'solver': 'lbfgs', 'penalty': 'l2'}
Best Accuracy          : 1.00
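
The float-to-category conversion in the cell above is repeated in several sections of this tutorial. As a minimal sketch, it could be wrapped in a small helper. The name decode_params is hypothetical (it is not part of bayes_opt), and the three lists are copied from the tutorial's objective function.

```python
# Category lists copied from the tutorial's objective function.
fi_range = [True, False]
solvers = ["newton-cg", "lbfgs"]
penalties = ["l2", "none"]


def decode_params(raw):
    """Map the optimizer's float values to the settings actually tried.

    decode_params is a hypothetical helper, not part of bayes_opt.
    A value above 0.5 selects the second entry of a list, otherwise the first.
    """
    def pick(options, value):
        return options[1 if value > 0.5 else 0]

    return {
        "C": raw["C"],
        "fit_intercept": pick(fi_range, raw["fit_intercept"]),
        "solver": pick(solvers, raw["solver"]),
        "penalty": pick(penalties, raw["penalty"]),
    }


# Rounded raw values of the first trial in the table above.
raw_best = {"C": 3.634, "fit_intercept": 0.2861, "penalty": 0.2269, "solver": 0.5513}
decoded = decode_params(raw_best)
print(decoded)
```

With such a helper, the same decoding could be applied to optimizer.max["params"] or to any entry of the optimizer history.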

Create ML Model with Best Hyperparameters and Evaluate

In this section, we have created an instance of LogisticRegression using the best hyperparameters combination found by our Bayesian optimization process. We have then fit it on the training dataset. Finally, we have printed the accuracy of the model on the train and test datasets.

In [28]:
classifier = LogisticRegression(C=C,
                                fit_intercept=fit_intercept,
                                solver=solver,
                                penalty=penalty,
                                max_iter=1000,
                                random_state=123)

classifier.fit(X_train, Y_train)

print("Train Accuracy : {:.2f}".format(classifier.score(X_train, Y_train)))
print("Test  Accuracy : {:.2f}".format(classifier.score(X_test, Y_test)))
Train Accuracy : 0.99
Test  Accuracy : 1.00

4. Manual Optimization Loop (Suggest Parameters, Evaluate Objective Function and Register Results)

In this section, we'll explain how we can loop through the hyperparameters settings suggested by the Bayesian optimization process ourselves. Until now, all our examples have called the maximize() method of the BayesianOptimization optimizer, which loops through different hyperparameters combinations on our behalf. But there are situations where we want more control and want to perform extra operations during the trials, such as saving model weights when training is costly. In these situations, we can use other methods of the BayesianOptimization object (suggest() and register()) which let us drive the loop manually.

We'll be using the wine classification dataset from scikit-learn in our example, so we'll be reusing code from the classification section of our tutorial.

Load Dataset

In this section, we have loaded the wine dataset and divided it into train/test sets. The code is almost the same as that of the classification section.

In [29]:
from sklearn import datasets
from sklearn.model_selection import train_test_split

X, Y = datasets.load_wine(return_X_y=True)

X_train, X_test, Y_train, Y_test = train_test_split(X, Y, train_size=0.8, stratify=Y, random_state=123)

X_train.shape, X_test.shape, Y_train.shape, Y_test.shape
Out[29]:
((142, 13), (36, 13), (142,), (36,))

Define Objective Function

In this section, we have defined the objective function that we have used for our classification problem. We have reused the objective function from the classification section which tries to optimize 4 hyperparameters of a logistic regression model. Please feel free to check the classification section if you want to understand this function in detail.

In [30]:
from sklearn.linear_model import LogisticRegression

fi_range = [True, False]
solvers = ["newton-cg", "lbfgs"]
penalties = ["l2", "none"]  # scikit-learn 1.2+ expects None in place of the string "none"

def objective(C, fit_intercept, solver, penalty):
    classifier = LogisticRegression(C=C,
                                    fit_intercept=fi_range[1 if fit_intercept > 0.5 else 0],
                                    solver=solvers[1 if solver > 0.5 else 0],
                                    penalty=penalties[1 if penalty > 0.5 else 0],
                                    max_iter=1000,
                                    random_state=123)

    classifier.fit(X_train, Y_train)

    return classifier.score(X_test, Y_test)

Define Hyperparameters Search Space

Below we have defined hyperparameters search space for our problem. It is the same as that of the classification section.

In [31]:
search_space = {
    "C": (0.5, 5),
    "fit_intercept": (0, 1),
    "solver": (0, 1),
    "penalty": (0, 1)
}
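
Every entry of the search space is a continuous (low, high) range, including the three categorical hyperparameters that the objective function later thresholds at 0.5. As a rough illustration of what a random initial trial looks like, the sketch below draws points uniformly within these bounds. It uses Python's stdlib random module rather than the library's internals, so the helper name sample_point is hypothetical and the values will not match the ones bayes_opt draws.

```python
import random

search_space = {
    "C": (0.5, 5),
    "fit_intercept": (0, 1),
    "solver": (0, 1),
    "penalty": (0, 1),
}


def sample_point(bounds, rng):
    """Draw one value per hyperparameter uniformly within its (low, high) bounds."""
    return {name: rng.uniform(low, high) for name, (low, high) in bounds.items()}


rng = random.Random(123)  # stdlib generator; it will not reproduce bayes_opt's draws
initial_points = [sample_point(search_space, rng) for _ in range(5)]

for point in initial_points:
    print(point)
```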

Maximize Objective Function

In this section, we have first defined our optimizer BayesianOptimization using the objective function and hyperparameters search space. For this example, we'll maximize the objective function with a manual loop instead of calling maximize().

In [32]:
optimizer = bayes_opt.BayesianOptimization(
                                f=objective,
                                pbounds=search_space,
                                random_state=123
                              )

Steps to Maximize Objective Function using Loop

In order to maximize the objective function using a loop, we need to follow the steps below.

  1. Create an instance of UtilityFunction, which selects the acquisition strategy (for example UCB or EI) used to choose the next point. It takes almost the same acquisition-related parameters as the maximize() method of the optimizer which we used in our previous sections.
  2. Call suggest() method on the optimizer instance, passing the UtilityFunction instance created in step 1. This call returns a dictionary with one combination of hyperparameters.
  3. Evaluate the objective function using the hyperparameters combination returned by suggest() method.
  4. Register the hyperparameters combination and the result of the objective function by calling register() method on the optimizer instance.
  5. Repeat steps 2-4 for as many iterations as you want.

Below we have first created an instance of UtilityFunction.

In [33]:
from bayes_opt import UtilityFunction

utility = UtilityFunction(kind="ucb", kappa=2.5, xi=0.0)
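
With kind="ucb", a candidate point is scored as its predicted mean plus kappa times its predicted standard deviation, so a larger kappa favors exploring uncertain regions. The xi parameter only matters for the improvement-based strategies such as EI. The sketch below illustrates the UCB score on made-up numbers: the candidate names and their (mean, std) predictions are hypothetical and do not come from a fitted Gaussian process.

```python
def ucb(mean, std, kappa):
    """Upper Confidence Bound score: predicted mean plus kappa times uncertainty."""
    return mean + kappa * std


# Hypothetical posterior predictions (mean, std) for three candidate points.
candidates = {
    "well_explored": (0.97, 0.01),
    "uncertain": (0.93, 0.05),
    "poor": (0.80, 0.02),
}


def best_candidate(kappa):
    """Return the candidate name with the highest UCB score for a given kappa."""
    return max(candidates, key=lambda name: ucb(*candidates[name], kappa))


print(best_candidate(kappa=0.0))  # pure exploitation: highest predicted mean wins
print(best_candidate(kappa=2.5))  # a larger kappa rewards uncertainty
```

With kappa=0.0 the well-explored point wins, while kappa=2.5 shifts the choice to the uncertain point even though its predicted mean is lower.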

In this cell, we have called suggest() method of the optimizer with the UtilityFunction instance. It returns a single hyperparameters combination, which we have printed as well.

In [34]:
next_point_to_probe = optimizer.suggest(utility)
print("Next point to probe is:", next_point_to_probe)
Next point to probe is: {'C': 3.6341113351903775, 'fit_intercept': 0.28613933495037946, 'penalty': 0.2268514535642031, 'solver': 0.5513147690828912}

Now, we have evaluated objective function by using hyperparameters combination from the previous cell. We have also printed the result of the objective function using this setting.

In [35]:
target = objective(**next_point_to_probe)
print("Found the target value to be:", target)
Found the target value to be: 1.0

Finally, we have registered the hyperparameters combination that we tried and the result of the objective function by calling register() method on the optimizer. This information helps the optimizer decide which settings to suggest next.

In [36]:
optimizer.register(
    params=next_point_to_probe,
    target=target,
)

Here, we have put all the steps explained earlier into a loop. The loop tries 5 more combinations of hyperparameters on the objective function and registers the result of each one.

In [37]:
for _ in range(5):
    next_point = optimizer.suggest(utility)
    target = objective(**next_point)
    optimizer.register(params=next_point, target=target)

    print("Hyperparameters Setting : {}".format(next_point))
    print("Objective Value : {}\n".format(target))
Hyperparameters Setting : {'C': 0.5, 'fit_intercept': 1.0, 'penalty': 1.0, 'solver': 0.0}
Objective Value : 0.9166666666666666

Hyperparameters Setting : {'C': 4.419939330745741, 'fit_intercept': 0.21763375959672426, 'penalty': 0.9686630903040423, 'solver': 0.49872902617414683}
Objective Value : 0.9166666666666666

Hyperparameters Setting : {'C': 3.2600465731609907, 'fit_intercept': 0.7578140735743687, 'penalty': 0.49928794791785935, 'solver': 0.9652544519865794}
Objective Value : 1.0

Hyperparameters Setting : {'C': 2.986120906004026, 'fit_intercept': 0.4057189878869718, 'penalty': 0.0, 'solver': 0.6691695161316752}
Objective Value : 1.0

Hyperparameters Setting : {'C': 3.350543368370126, 'fit_intercept': 1.0, 'penalty': 0.0, 'solver': 0.0}
Objective Value : 1.0
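
The reason for writing the loop ourselves is that we can hook extra work into each trial, such as saving a model whenever the result improves. The sketch below shows one possible shape for such a loop. To stay runnable without bayes_opt or a dataset, it uses toy stand-ins: run_trials, toy_suggest, toy_objective and toy_register are all hypothetical names, and toy_suggest samples at random rather than performing Bayesian optimization. With the real library, the three callables would wrap optimizer.suggest(utility), the objective function and optimizer.register().

```python
import random


def run_trials(suggest, evaluate, register, n_trials, on_improvement=None):
    """Suggest, evaluate and register n_trials points, tracking the best one."""
    best_target, best_params = float("-inf"), None
    for _ in range(n_trials):
        params = suggest()
        target = evaluate(**params)
        register(params=params, target=target)
        if target > best_target:
            best_target, best_params = target, params
            if on_improvement is not None:
                on_improvement(params, target)  # e.g. save model weights here
    return best_params, best_target


# Toy stand-ins so the sketch runs on its own.
rng = random.Random(0)
history = []      # what a real optimizer would store internally
checkpoints = []  # stands in for files written when a trial improves


def toy_suggest():
    return {"x": rng.uniform(-2, 2)}


def toy_objective(x):
    return -(x - 1) ** 2  # maximum of 0 at x = 1


def toy_register(params, target):
    history.append((params, target))


best_params, best_target = run_trials(
    toy_suggest, toy_objective, toy_register, n_trials=20,
    on_improvement=lambda params, target: checkpoints.append(target),
)
print(best_params, best_target, len(checkpoints))
```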

In this section, we have retrieved the best results and hyperparameters setting from max property of the optimizer instance and printed them.

In [38]:
C = optimizer.max["params"]["C"]
fit_intercept = fi_range[1 if optimizer.max["params"]["fit_intercept"] > 0.5 else 0]
solver = solvers[1 if optimizer.max["params"]["solver"] > 0.5 else 0]
penalty = penalties[1 if optimizer.max["params"]["penalty"] > 0.5 else 0]

print("Best Parameter Setting : {}".format({"C": C, "fit_intercept": fit_intercept, "solver": solver, "penalty":penalty}))
print("Best Accuracy          : {:.2f}".format(optimizer.max["target"]))
Best Parameter Setting : {'C': 3.6341113351903775, 'fit_intercept': True, 'solver': 'lbfgs', 'penalty': 'l2'}
Best Accuracy          : 1.00

Create ML Model with Best Hyperparameters and Evaluate

In this section, we have created an instance of logistic regression using the best hyperparameters setting that we got through our optimization process. We have then trained the model on train data and evaluated accuracy on both train/test sets.

In [39]:
classifier = LogisticRegression(C=C,
                                fit_intercept=fit_intercept,
                                solver=solver,
                                penalty=penalty,
                                max_iter=1000,
                                random_state=123)

classifier.fit(X_train, Y_train)

print("Train Accuracy : {:.2f}".format(classifier.score(X_train, Y_train)))
print("Test  Accuracy : {:.2f}".format(classifier.score(X_test, Y_test)))
Train Accuracy : 0.99
Test  Accuracy : 1.00

5. Guided Optimization

Until now, in all our examples, the hyperparameters combinations were suggested by the Bayesian optimizer to maximize the objective function. We just needed to give it a range for each hyperparameter and it tried different values in that range. But there are situations where we already know which hyperparameter values are likely to give good results. For those situations, the bayes_opt library lets us suggest a hyperparameters combination ourselves through the probe() method of the BayesianOptimization instance. This is referred to as guided optimization, as we manually tell the process which hyperparameters combinations to try.

We'll be using the wine dataset from scikit-learn to explain guided optimization in this section. The code in this section reuses much of the code from the classification section hence there won't be a detailed explanation here. Please check the classification section above if you have come to this section directly but want to understand the code in detail.

Load Dataset

In this section, we have loaded the wine dataset from scikit-learn and divided it into train/test sets.

In [2]:
from sklearn import datasets
from sklearn.model_selection import train_test_split

X, Y = datasets.load_wine(return_X_y=True)

X_train, X_test, Y_train, Y_test = train_test_split(X, Y, train_size=0.8, stratify=Y, random_state=123)

X_train.shape, X_test.shape, Y_train.shape, Y_test.shape
Out[2]:
((142, 13), (36, 13), (142,), (36,))

Define Objective Function

In this section, we have defined an objective function that we'll be using for our classification problem. We have reused the function from the classification section.

In [3]:
from sklearn.linear_model import LogisticRegression

fi_range = [True, False]
solvers = ["newton-cg", "lbfgs"]
penalties = ["l2", "none"]  # scikit-learn 1.2+ expects None in place of the string "none"

def objective(C, fit_intercept, solver, penalty):
    classifier = LogisticRegression(C=C,
                                    fit_intercept=fi_range[1 if fit_intercept > 0.5 else 0],
                                    solver=solvers[1 if solver > 0.5 else 0],
                                    penalty=penalties[1 if penalty > 0.5 else 0],
                                    max_iter=1000,
                                    random_state=123)

    classifier.fit(X_train, Y_train)

    return classifier.score(X_test, Y_test)

Define Hyperparameters Search Space

In this section, we have defined hyperparameters’ search space for our problem.

In [4]:
search_space = {
    "C": (0.5, 5),
    "fit_intercept": (0, 1),
    "solver": (0, 1),
    "penalty": (0, 1)
}

Maximize Objective Function

In order to maximize objective function using guided optimization, we first need to create an optimizer as usual. Below we have created our optimizer BayesianOptimization using objective function and hyperparameters search space.

In [14]:
optimizer = bayes_opt.BayesianOptimization(
                                f=objective,
                                pbounds=search_space,
                                random_state=123,
                                verbose=2
                              )

We can suggest hyperparameters combination using probe() method of the optimizer. We need to give hyperparameters combination to params parameter of the method.

Below we have suggested one combination of hyperparameters (as python dictionary) using probe() method. We can also suggest hyperparameters combination as a plain python list if we know the order of hyperparameters.

In [15]:
optimizer.probe(
    params={"C": 0.5, "fit_intercept": 0.7, "solver": 0.3, "penalty": 0.2},
    lazy=True,
)

We can retrieve the order of hyperparameters using space.keys attribute of optimizer instance.

In [16]:
print(optimizer.space.keys)
['C', 'fit_intercept', 'penalty', 'solver']
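
When a combination is given as a plain list, each position is matched to a hyperparameter by this order, not by the order in which the search space was declared. The printed order coincides with Python's default sort of the names; that the library always sorts them this way is an assumption here. The helper list_to_params below is hypothetical and only illustrates which name each position of a list such as [0.5, 0.3, 0.7, 0.6] refers to.

```python
# Hyperparameter names in the order the tutorial's search space was declared.
names = ["C", "fit_intercept", "solver", "penalty"]

# Assumption: parameters are ordered by sorted name, which matches the
# printed output above: ['C', 'fit_intercept', 'penalty', 'solver'].
ordered_names = sorted(names)


def list_to_params(values):
    """Pair each positional value with its hyperparameter name."""
    if len(values) != len(ordered_names):
        raise ValueError("expected one value per hyperparameter")
    return dict(zip(ordered_names, values))


print(ordered_names)
print(list_to_params([0.5, 0.3, 0.7, 0.6]))
```

Note that the third position is penalty and the fourth is solver, which is easy to get wrong when the search space dictionary lists solver first.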

Below, we have suggested another three combinations using probe() method: two as Python lists and one as a Python dictionary.

In [17]:
optimizer.probe(
    params=[0.5, 0.3, 0.7, 0.6],
    lazy=True,
)
In [18]:
optimizer.probe(
    params=[0.5, 0.7, 0.7, 0.6],
    lazy=True,
)
In [19]:
optimizer.probe(
    params={"C": 3.63, "fit_intercept": 0.7, "solver": 0.3, "penalty": 0.2},
    lazy=True,
)

Finally, we need to call maximize() method of the optimizer to try the combinations that we suggested manually. Because the probes were queued with lazy=True, they are only evaluated at this point. In order to try only our suggested combinations, we need to set the init_points and n_iter parameters to 0.

If we provide non-zero values for these parameters, the maximize() method will first try all combinations we suggested through probe() method calls and then try further combinations based on the init_points and n_iter values.

In [20]:
optimizer.maximize(init_points=0, n_iter=0)
|   iter    |  target   |     C     | fit_in... |  penalty  |  solver   |
-------------------------------------------------------------------------
|  1        |  1.0      |  0.5      |  0.7      |  0.2      |  0.3      |
|  2        |  0.9722   |  0.5      |  0.3      |  0.7      |  0.6      |
|  3        |  0.9722   |  0.5      |  0.7      |  0.7      |  0.6      |
|  4        |  1.0      |  3.63     |  0.7      |  0.2      |  0.3      |
=========================================================================

In this section, we have printed the results of our optimization process as usual.

In [21]:
C = optimizer.max["params"]["C"]
fit_intercept = fi_range[1 if optimizer.max["params"]["fit_intercept"] > 0.5 else 0]
solver = solvers[1 if optimizer.max["params"]["solver"] > 0.5 else 0]
penalty = penalties[1 if optimizer.max["params"]["penalty"] > 0.5 else 0]

print("Best Parameter Setting : {}".format({"C": C, "fit_intercept": fit_intercept, "solver": solver, "penalty":penalty}))
print("Best Accuracy          : {:.2f}".format(optimizer.max["target"]))
Best Parameter Setting : {'C': 0.5, 'fit_intercept': False, 'solver': 'newton-cg', 'penalty': 'l2'}
Best Accuracy          : 1.00

Create ML Model with Best Hyperparameters and Evaluate

In this section, we have created an instance of logistic regression using the best parameters found by the guided optimization process. We have then trained the model on train data and evaluated the accuracy on train and test datasets.

In [63]:
classifier = LogisticRegression(C=C,
                                fit_intercept=fit_intercept,
                                solver=solver,
                                penalty=penalty,
                                max_iter=1000,
                                random_state=123)

classifier.fit(X_train, Y_train)

print("Train Accuracy : {:.2f}".format(classifier.score(X_train, Y_train)))
print("Test  Accuracy : {:.2f}".format(classifier.score(X_test, Y_test)))
Train Accuracy : 0.97
Test  Accuracy : 1.00

6. Saving and Resuming Optimization Process

There can be situations where we need to log the results of our optimization process so that we can later resume it from where we left off. The bayes_opt library provides this functionality. It lets us define a JSON logger which logs details about each optimization step to a JSON file. We can then reload the optimizer history from this file so that the optimizer knows about all trials performed previously. The optimization process then resumes, taking all previously tried steps into account.

We'll be using the wine dataset available from scikit-learn in this section to explain how we can save optimization results and reload them to resume the optimization process from where we left off. We'll be reusing much of the code from our classification section, so parts that are already explained there are not described in detail here. Please feel free to check that section if you want to understand them better.

Load Dataset

In this section, we have loaded the wine dataset from scikit-learn and divided it into train/test sets.

In [23]:
from sklearn import datasets
from sklearn.model_selection import train_test_split

X, Y = datasets.load_wine(return_X_y=True)

X_train, X_test, Y_train, Y_test = train_test_split(X, Y, train_size=0.8, stratify=Y, random_state=123)

X_train.shape, X_test.shape, Y_train.shape, Y_test.shape
Out[23]:
((142, 13), (36, 13), (142,), (36,))

Define Objective Function

In this section, we have defined the objective function that we'll use in this example. We have reused the objective function from the classification section again.

In [24]:
from sklearn.linear_model import LogisticRegression

fi_range = [True, False]
solvers = ["newton-cg", "lbfgs"]
penalties = ["l2", "none"]  # scikit-learn 1.2+ expects None in place of the string "none"

def objective(C, fit_intercept, solver, penalty):
    classifier = LogisticRegression(C=C,
                                    fit_intercept=fi_range[1 if fit_intercept > 0.5 else 0],
                                    solver=solvers[1 if solver > 0.5 else 0],
                                    penalty=penalties[1 if penalty > 0.5 else 0],
                                    max_iter=1000,
                                    random_state=123)

    classifier.fit(X_train, Y_train)

    return classifier.score(X_test, Y_test)

Define Search Space

In this section, we have defined hyperparameters’ search space over which to search for values.

In [25]:
search_space = {
    "C": (0.5, 5),
    "fit_intercept": (0, 1),
    "solver": (0, 1),
    "penalty": (0, 1)
}

Define Optimizer

In this section, we have defined our optimizer BayesianOptimization using the objective function and hyperparameters search space.

In [26]:
optimizer = bayes_opt.BayesianOptimization(
                                f=objective,
                                pbounds=search_space,
                                random_state=123,
                                verbose=2
                              )

Define Logger and Subscribe Logger with Optimizer

In order to log information about the optimization process, we need to create an instance of JSONLogger and subscribe it with optimizer so that it logs information about events to the logger.

Below we have first created an instance of JSONLogger with the JSON file name classifier_opt.json. This is the file in which information about each optimization step will be stored.

We have then subscribed this logger to the optimization process by calling subscribe() method of the optimizer. We need to give two values to subscribe() method.

  1. Event to log
  2. Logger

There are three types of events available from bayes_opt.

  1. OPTIMIZATION_START - Fired when the optimization process starts.
  2. OPTIMIZATION_STEP - Fired after each optimization step (trial).
  3. OPTIMIZATION_END - Fired when the optimization process ends.

In [27]:
from bayes_opt.logger import JSONLogger
from bayes_opt.event import Events

logger = JSONLogger(path="./classifier_opt.json")
optimizer.subscribe(Events.OPTIMIZATION_STEP, logger)

Maximize Objective Function

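Now we have called maximize() method on the optimizer, asking it to run the optimization process for 7 trials (2 random initial points followed by 5 guided steps). Note that no progress table is printed for this call: once we subscribe our own logger, the library's default console logger does not appear to be attached.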

In [28]:
optimizer.maximize(
    init_points=2,
    n_iter=5,
)

Below we have printed the results of the optimization process which we performed in the previous step.

In [29]:
C = optimizer.max["params"]["C"]
fit_intercept = fi_range[1 if optimizer.max["params"]["fit_intercept"] > 0.5 else 0]
solver = solvers[1 if optimizer.max["params"]["solver"] > 0.5 else 0]
penalty = penalties[1 if optimizer.max["params"]["penalty"] > 0.5 else 0]

print("Best Parameter Setting : {}".format({"C": C, "fit_intercept": fit_intercept, "solver": solver, "penalty":penalty}))
print("Best Accuracy          : {:.2f}".format(optimizer.max["target"]))
Best Parameter Setting : {'C': 3.6341113351903775, 'fit_intercept': True, 'solver': 'lbfgs', 'penalty': 'l2'}
Best Accuracy          : 1.00

Create New Optimizer

In this section, we have created another optimizer using our objective function and hyperparameters search space. We'll be loading optimization steps logged from our main optimizer into this optimizer.

In [30]:
optimizer2 = bayes_opt.BayesianOptimization(
                                f=objective,
                                pbounds=search_space,
                                random_state=123
                              )

Load New Optimizer using Logs of Previous Optimizer

We can load an optimizer with the steps stored in a log file using the load_logs method available from the bayes_opt.util module. We need to give it an optimizer instance and a list of log files to load.

Below we have loaded the logs of our first optimizer into our second optimizer. We have not called maximize() on the second optimizer yet, so before loading it has no history. After loading, we have printed the history to check whether all steps of the first optimizer were loaded properly. We can see that it has loaded all 7 trials that we ran with our first optimizer.

In [31]:
from bayes_opt.util import load_logs

load_logs(optimizer2, logs=["./classifier_opt.json"]);
In [32]:
print("Loaded optimizer is now aware of {} points.".format(len(optimizer2.space)))

optimizer2.res
Loaded optimizer is now aware of 7 points.
Out[32]:
[{'target': 1.0,
  'params': {'C': 3.6341113351903775,
   'fit_intercept': 0.28613933495037946,
   'penalty': 0.2268514535642031,
   'solver': 0.5513147690828912}},
 {'target': 0.9722222222222222,
  'params': {'C': 3.7376103640350338,
   'fit_intercept': 0.42310646012446096,
   'penalty': 0.9807641983846155,
   'solver': 0.6848297385848633}},
 {'target': 0.9722222222222222,
  'params': {'C': 3.605733296787431,
   'fit_intercept': 0.33502834622040767,
   'penalty': 0.22646595798020908,
   'solver': 0.4817241754544501}},
 {'target': 0.9166666666666666,
  'params': {'C': 4.419939330745741,
   'fit_intercept': 0.21763375959672426,
   'penalty': 0.9686630903040423,
   'solver': 0.49872902617414683}},
 {'target': 1.0,
  'params': {'C': 3.6686220533003135,
   'fit_intercept': 0.22680377881594277,
   'penalty': 0.22731792201305523,
   'solver': 0.6357764847401262}},
 {'target': 1.0,
  'params': {'C': 3.8009105762021753,
   'fit_intercept': 0.36516281518269383,
   'penalty': 0.1677279324972666,
   'solver': 0.636278384826435}},
 {'target': 1.0,
  'params': {'C': 3.6097102799770973,
   'fit_intercept': 0.3510625142730247,
   'penalty': 0.06277675572068292,
   'solver': 0.7179928554138929}}]
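
Conceptually, resuming amounts to reading each logged (params, target) pair back from the file and registering it with the new optimizer. The sketch below illustrates that idea in plain Python without calling bayes_opt. The file layout (one JSON object per line with target and params keys) is an assumption about the log format, and write_log, read_log and the file name toy_opt.json are hypothetical.

```python
import json
import os
import tempfile

# Two sample records shaped like the history above. The one-object-per-line
# layout and the key names are assumptions about the log file format.
sample_records = [
    {"target": 1.0, "params": {"C": 3.634, "fit_intercept": 0.286,
                               "penalty": 0.227, "solver": 0.551}},
    {"target": 0.9722, "params": {"C": 3.738, "fit_intercept": 0.423,
                                  "penalty": 0.981, "solver": 0.685}},
]


def write_log(path, records):
    """Write one JSON object per line."""
    with open(path, "w") as f:
        for record in records:
            f.write(json.dumps(record) + "\n")


def read_log(path):
    """Read the file back into a list of {'target', 'params'} dicts."""
    with open(path) as f:
        return [json.loads(line) for line in f if line.strip()]


with tempfile.TemporaryDirectory() as tmp_dir:
    log_path = os.path.join(tmp_dir, "toy_opt.json")
    write_log(log_path, sample_records)
    loaded = read_log(log_path)

# A real resume step would now register each loaded point with the optimizer.
best = max(loaded, key=lambda record: record["target"])
print(len(loaded), best["target"])
```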

In this section, we have printed the best results using the second optimizer to compare them with those of the first optimizer. We can notice that the results are the same as those of the first optimizer.

In [33]:
C = optimizer2.max["params"]["C"]
fit_intercept = fi_range[1 if optimizer2.max["params"]["fit_intercept"] > 0.5 else 0]
solver = solvers[1 if optimizer2.max["params"]["solver"] > 0.5 else 0]
penalty = penalties[1 if optimizer2.max["params"]["penalty"] > 0.5 else 0]

print("Best Parameter Setting : {}".format({"C": C, "fit_intercept": fit_intercept, "solver": solver, "penalty":penalty}))
print("Best Accuracy          : {:.2f}".format(optimizer2.max["target"]))
Best Parameter Setting : {'C': 3.6341113351903775, 'fit_intercept': True, 'solver': 'lbfgs', 'penalty': 'l2'}
Best Accuracy          : 1.00

Maximize New Optimizer

In this section, we have called maximize() method on our second optimizer to let it try more trials to check whether it can further improve results or not.

In [34]:
optimizer2.maximize()
|   iter    |  target   |     C     | fit_in... |  penalty  |  solver   |
-------------------------------------------------------------------------
|  1        |  1.0      |  3.61     |  0.3511   |  0.06278  |  0.718    |
|  2        |  1.0      |  3.61     |  0.3511   |  0.06278  |  0.718    |
|  3        |  1.0      |  2.664    |  0.3921   |  0.3432   |  0.729    |
|  4        |  1.0      |  2.474    |  0.05968  |  0.398    |  0.738    |
|  5        |  0.9722   |  1.321    |  0.1755   |  0.5316   |  0.5318   |
|  6        |  0.9722   |  1.153    |  0.4254   |  0.03152  |  0.429    |
|  7        |  0.9722   |  4.427    |  0.2272   |  0.969    |  0.5049   |
|  8        |  1.0      |  3.26     |  0.7578   |  0.4993   |  0.9653   |
|  9        |  1.0      |  2.461    |  0.02631  |  0.3782   |  0.751    |
|  10       |  0.9722   |  1.542    |  0.1625   |  0.6846   |  0.7381   |
|  11       |  1.0      |  3.785    |  0.3495   |  0.2052   |  0.6251   |
|  12       |  1.0      |  3.758    |  0.3998   |  0.2414   |  0.6519   |
|  13       |  0.9722   |  1.537    |  0.9108   |  0.8714   |  0.5372   |
|  14       |  1.0      |  2.409    |  0.06187  |  0.3992   |  0.7151   |
|  15       |  1.0      |  1.47     |  0.8772   |  0.03958  |  0.4581   |
|  16       |  0.9167   |  4.741    |  0.8155   |  0.5035   |  0.4211   |
|  17       |  0.9722   |  1.172    |  0.1189   |  0.4298   |  0.3037   |
|  18       |  1.0      |  3.646    |  0.3756   |  0.1018   |  0.7082   |
|  19       |  0.9722   |  3.245    |  0.07758  |  0.7472   |  0.9069   |
|  20       |  1.0      |  3.741    |  0.5936   |  0.4116   |  0.8685   |
|  21       |  1.0      |  3.597    |  0.3459   |  0.08042  |  0.7304   |
|  22       |  1.0      |  2.688    |  0.3519   |  0.3351   |  0.7552   |
|  23       |  0.9722   |  1.078    |  0.4672   |  0.1969   |  0.4418   |
|  24       |  1.0      |  3.265    |  0.792    |  0.4666   |  0.9276   |
|  25       |  1.0      |  3.715    |  0.4215   |  0.2338   |  0.6874   |
|  26       |  0.9722   |  2.274    |  0.1631   |  0.08408  |  0.01162  |
|  27       |  1.0      |  3.791    |  0.3529   |  0.1497   |  0.6344   |
|  28       |  0.9722   |  1.636    |  0.223    |  0.1516   |  0.223    |
|  29       |  1.0      |  3.695    |  0.4242   |  0.2555   |  0.6847   |
|  30       |  0.9722   |  3.345    |  0.892    |  0.643    |  0.9787   |
=========================================================================

Finally, we have printed the best results again to check whether the extra trials performed on the second optimizer have improved them. We can notice that the results are the same as before, so the last call to maximize() on the second optimizer did not find a better setting (the best accuracy was already 1.00). We can end the optimization process if we are satisfied with the results, or we can change the ranges of the hyperparameters and try again.

In [35]:
C = optimizer2.max["params"]["C"]
fit_intercept = fi_range[1 if optimizer2.max["params"]["fit_intercept"] > 0.5 else 0]
solver = solvers[1 if optimizer2.max["params"]["solver"] > 0.5 else 0]
penalty = penalties[1 if optimizer2.max["params"]["penalty"] > 0.5 else 0]

print("Best Parameter Setting : {}".format({"C": C, "fit_intercept": fit_intercept, "solver": solver, "penalty":penalty}))
print("Best Accuracy          : {:.2f}".format(optimizer2.max["target"]))
Best Parameter Setting : {'C': 3.6341113351903775, 'fit_intercept': True, 'solver': 'lbfgs', 'penalty': 'l2'}
Best Accuracy          : 1.00

This ends our small tutorial explaining how we can use bayes_opt (Bayesian Optimization) library for maximizing any objective function. Please feel free to let us know your views in the comments section.


Sunny Solanki