Machine Learning Tutorials


The term 'machine learning' (ML) describes a system's capacity to gather and synthesize knowledge through observation and to improve itself by learning from new information rather than having that knowledge preprogrammed into it. At CoderzColumn, you get a glimpse of the vast machine learning field. We cover various concepts through tutorials:

  • Visualize ML Metrics
  • Gradient Boosted Decision Trees
  • Interpret Predictions Of ML Models
  • Hyperparameters Tuning / Optimization

For an in-depth understanding of the above concepts, check out the sections below.

Recent Machine Learning Tutorials


Tags: optuna, hyperparameters-tuning

Optuna: Simple Guide to Hyperparameters Tuning / Optimization

A comprehensive guide on how to use the Python library "optuna" to perform hyperparameter tuning / optimization of ML models. The tutorial explains the usage of Optuna with scikit-learn regression and classification models and covers Optuna's data visualization and logging functionality in detail. Optuna also lets us prune underperforming hyperparameter combinations.

Sunny Solanki
Tags: hyperopt, hyperparameters-optimization

Hyperopt - Complete Guide to Hyperparameters Tuning / Optimization

A comprehensive guide on how to use the Python library 'hyperopt' for hyperparameter tuning, with simple examples. The tutorial explains how to fine-tune scikit-learn models solving regression and classification tasks, making it a complete guide to hyperparameter optimization of ML models in Python using 'hyperopt'.

Sunny Solanki
Tags: sklearn, naive-bayes

Scikit-Learn - Naive Bayes Classifiers

A simple guide to using the naive Bayes classifiers available from scikit-learn to solve classification tasks. All five naive Bayes classifiers available from scikit-learn are covered in detail. The tutorial first trains classifiers with default parameters on the digits dataset and then performs hyperparameter tuning to improve performance. Various ML metrics are also evaluated to check the performance of the models.
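A quick sketch paralleling the tutorial's first step: `GaussianNB` with default parameters on the digits dataset.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, random_state=0, stratify=y
)

# Gaussian naive Bayes assumes each feature is normally distributed per class.
model = GaussianNB().fit(X_train, y_train)
acc = model.score(X_test, y_test)
print(f"test accuracy: {acc:.3f}")
```

The other variants (`MultinomialNB`, `BernoulliNB`, `ComplementNB`, `CategoricalNB`) follow the same fit/score API.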

Sunny Solanki
Tags: metrics, visualizations

Scikit-Plot: Visualize ML Model Performance Evaluation Metrics

The tutorial explains how to use the Python library scikit-plot to create visualizations of various ML metrics. It is built on top of matplotlib and provides charts for the most commonly used ML metrics, like the confusion matrix, ROC AUC curve, precision-recall curve, elbow method, silhouette analysis, feature importance, PCA projection, etc.

Sunny Solanki
Tags: xgboost

XGBoost - An In-Depth Guide [Python API]

An in-depth guide on how to use the Python ML library XGBoost, which provides an implementation of the gradient-boosted decision trees algorithm. The tutorial covers the majority of the library's features with simple and easy-to-understand examples. Apart from training models and making predictions, topics like cross-validation, saving and loading models, early stopping to prevent overfitting, and creating custom loss functions and evaluation metrics are covered in detail.

Sunny Solanki
Tags: catboost

CatBoost - An In-Depth Guide [Python API]

An in-depth guide on how to use the Python ML library catboost, which provides an implementation of the gradient-boosted decision trees algorithm. The tutorial covers the majority of the library's features with simple and easy-to-understand examples. Apart from training models and making predictions, topics like hyperparameter tuning, cross-validation, saving and loading models, plotting training loss/metric values, early stopping to prevent overfitting, and creating custom loss functions and evaluation metrics are covered in detail.

Sunny Solanki
Tags: lightgbm, boosting-decision-trees

LightGBM - An In-Depth Guide [Python API]

An in-depth guide on how to use the Python ML library LightGBM, which provides an implementation of the gradient-boosted decision trees algorithm. The tutorial covers the majority of the library's features with simple and easy-to-understand examples. Apart from training models and making predictions, topics like cross-validation, saving and loading models, plotting feature importances, early stopping to prevent overfitting, and creating custom loss functions and evaluation metrics are covered in detail.

Sunny Solanki
Tags: scikit-learn, incremental-learning

Scikit-Learn - Incremental Learning for Large Datasets

The tutorial explains how to use scikit-learn models/estimators with large datasets that do not fit into the computer's main memory. The majority of sklearn estimators can only work with datasets that fit into main memory, but a few can work with data in batches. All these models provide a "partial_fit()" method that can be called more than once to update model weights. The tutorial explains these kinds of models for ML tasks such as regression, classification, clustering, dimensionality reduction, and preprocessing.
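A sketch of the batch-wise pattern with "partial_fit()" (the digits dataset stands in for a stream that is too large for memory):

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.linear_model import SGDClassifier

X, y = load_digits(return_X_y=True)
classes = np.unique(y)  # partial_fit must be told all classes up front

clf = SGDClassifier(random_state=0)
for epoch in range(5):                    # a few passes over the "stream"
    for start in range(0, len(X), 200):   # feed the data 200 rows at a time
        clf.partial_fit(X[start:start + 200], y[start:start + 200],
                        classes=classes)

acc = clf.score(X, y)
print(f"training accuracy: {acc:.3f}")
```

In a real out-of-core setting, each batch would be read from disk (e.g., chunked CSV reads) instead of sliced from an in-memory array.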

Sunny Solanki
Tags: scoring_metrics, scikit-learn

Scikit-Learn - Model Evaluation & Scoring Metrics

A brief guide on how to use the various ML metrics/scoring functions available from the "metrics" module of scikit-learn to evaluate model performance. It covers using metrics for different ML tasks like classification, regression, and clustering, and even explains how to create custom metrics and use them with the scikit-learn API.
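As a sketch of the custom-metric part: any callable mapping `(y_true, y_pred)` to a float can be wrapped with `make_scorer` and plugged into the scikit-learn API.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import make_scorer
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)
model = LogisticRegression(max_iter=5000)

def fraction_correct(y_true, y_pred):
    # A toy custom metric (equivalent to accuracy) for illustration.
    return (y_true == y_pred).mean()

scores = cross_val_score(model, X, y, cv=3,
                         scoring=make_scorer(fraction_correct))
print("CV scores:", scores)
```

The same `scoring=` hook works in `GridSearchCV` and other model-selection utilities.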

Sunny Solanki
Tags: eli5, interpret-ml-models

How to Use eli5 to Interpret ML Models and their Predictions [Python]?

A detailed guide on how to use the Python library "eli5" to interpret/explain ML models and their predictions. The tutorial trains simple sklearn models on toy datasets solving regression and classification tasks and explains how to interpret the predictions they make on individual data examples. The usage of the library is explained with structured (tabular) as well as unstructured (text) data.

Sunny Solanki
Visualize Machine Learning Metrics

Once our machine learning model is trained, we need some way to evaluate its performance, i.e., to know whether the model has generalized or not.

For this, various metrics (confusion matrix, ROC AUC curve, precision-recall curve, silhouette analysis, elbow method, etc.) have been designed over time. These metrics help us understand the performance of models trained on various tasks like classification, regression, clustering, etc.

Python has various libraries (scikit-learn, scikit-plot, yellowbrick, interpret-ml, interpret-text, etc.) to calculate and visualize these metrics.
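As a small library-agnostic sketch, one such metric (the confusion matrix) can be computed and rendered with plain scikit-learn and matplotlib; the off-screen backend here is just for environments without a display:

```python
import matplotlib
matplotlib.use("Agg")            # render off-screen (no display needed)
import matplotlib.pyplot as plt
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import ConfusionMatrixDisplay, confusion_matrix
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LogisticRegression(max_iter=5000).fit(X_train, y_train)
cm = confusion_matrix(y_test, model.predict(X_test))

ConfusionMatrixDisplay(cm).plot()     # draws the matrix as a chart
plt.savefig("confusion_matrix.png")   # save instead of opening a window
```

Libraries like scikit-plot and yellowbrick wrap this kind of plotting into one-line calls.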

Interpret Predictions Of ML Models

After training an ML model, we generally evaluate its performance by calculating and visualizing various ML metrics (confusion matrix, ROC AUC curve, precision-recall curve, silhouette analysis, elbow method, etc.).

These metrics are normally a good starting point, but in many situations they don't give a complete picture of model performance. For example, a simple cat-vs-dog image classifier can be using background pixels to classify images instead of the pixels of the actual object (cat or dog).

In these situations, our ML metrics will still report good results, so we should always be a little skeptical of model performance.

We can dive deeper and try to understand how our model performs on individual examples by interpreting its predictions. Various algorithms have been developed over time to interpret the predictions of ML models, and many Python libraries (lime, eli5, treeinterpreter, shap, etc.) provide implementations of them.
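One simple interpretation technique is available directly in scikit-learn: permutation importance, which shuffles each feature and measures how much the model's score drops. It is a rougher tool than the libraries above, shown here only as a sketch of the idea:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

data = load_breast_cancer()
model = RandomForestClassifier(random_state=0).fit(data.data, data.target)

# Shuffle each feature n_repeats times and record the score drop.
result = permutation_importance(model, data.data, data.target,
                                n_repeats=5, random_state=0)
top = data.feature_names[result.importances_mean.argmax()]
print("most important feature:", top)
```

Libraries like shap and lime go further, attributing each individual prediction to feature contributions.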

Hyperparameters Tuning / Optimization

Machine learning models generally have many parameters that need to be tuned to get the best-performing model. For example, a decision tree has parameters like tree depth, minimum samples per leaf, maximum leaf nodes, and the criterion used to evaluate splits, and different values of these can be tried to find the best-performing decision tree model.

These parameters of ML models are generally referred to as hyperparameters. Over the years, various approaches have been developed to find the best-performing hyperparameters for an ML model; this process is referred to as hyperparameter tuning or hyperparameter optimization.

Python has many libraries (optuna, hyperopt, scikit-optimize, scikit-learn, etc.) that let us perform hyperparameter tuning to find the best settings for our model.
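The simplest approach is scikit-learn's exhaustive grid search. A sketch over two of the decision-tree hyperparameters mentioned above:

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

X, y = load_digits(return_X_y=True)

# Every combination in the grid is evaluated with 3-fold cross-validation.
param_grid = {"max_depth": [3, 5, 10, None], "min_samples_leaf": [1, 5, 10]}
search = GridSearchCV(DecisionTreeClassifier(random_state=0), param_grid, cv=3)
search.fit(X, y)

print("best params:", search.best_params_)
print(f"best CV accuracy: {search.best_score_:.3f}")
```

Libraries like optuna and hyperopt replace the exhaustive grid with smarter sampling strategies that scale to larger search spaces.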

Gradient Boosting

Gradient boosting is a machine learning algorithm based on an ensemble of estimators and is used for regression and classification tasks. The ensemble consists of a list of weak predictors/estimators whose predictions are combined to make the final model's predictions.

The majority of the time, these weak predictors are decision trees, and the algorithm is then referred to as gradient-boosted trees or gradient-boosted decision trees. These models are best suited for structured, tabular datasets.

Python has many libraries (XGBoost, CatBoost, LightGBM, scikit-learn, etc.) that provide an implementation of gradient boosting. Apart from the core implementation, these libraries provide many extra features like parallelization, GPU training, distributed training, and command-line training/evaluation.
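The ensemble idea can be sketched with scikit-learn's implementation: each added tree is a weak learner, and `staged_predict` shows the ensemble improving as trees accumulate.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# max_depth=2 keeps each individual tree a weak learner.
model = GradientBoostingClassifier(n_estimators=100, max_depth=2,
                                   random_state=0)
model.fit(X_train, y_train)

# Test accuracy after each boosting stage (1 tree, 2 trees, ..., 100 trees).
staged = [accuracy_score(y_test, p) for p in model.staged_predict(X_test)]
print(f"after 1 tree: {staged[0]:.3f}, after 100 trees: {staged[-1]:.3f}")
```

XGBoost, CatBoost, and LightGBM implement the same core algorithm with the extra features listed above.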