PyTorch Lightning is a framework built on top of PyTorch that simplifies training neural networks and making predictions with them. It removes the hand-written loops that iterate over training data in batches to fit a network, over validation data in batches to evaluate model performance during training, and over test data in batches to make predictions. It also takes care of moving models and data between CPUs and GPUs/TPUs, so explicit '.to(device)' calls can be dropped from the code. Finally, it can run training on multiple GPUs/TPUs in parallel with only a few settings, without the developer writing any parallelization code.
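For contrast, a hand-written PyTorch training loop of the kind Lightning replaces might look like the minimal sketch below. The synthetic data, the single-layer model, and the hyperparameters are assumptions chosen only for illustration; they are not part of the tutorial's code.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Synthetic stand-in data (assumed shapes): 128 samples, 64 features, 10 classes.
X = torch.rand(128, 64)
Y = torch.randint(0, 10, (128,))
loader = DataLoader(TensorDataset(X, Y), batch_size=32)

# Manual device placement: this is the '.to(device)' code Lightning handles for us.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = nn.Linear(64, 10).to(device)
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

epoch_losses = []
for epoch in range(3):                      # manual epoch loop
    running = 0.0
    for X_batch, Y_batch in loader:         # manual batch loop
        X_batch, Y_batch = X_batch.to(device), Y_batch.to(device)
        optimizer.zero_grad()
        loss = loss_fn(model(X_batch), Y_batch)
        loss.backward()
        optimizer.step()
        running += loss.item()
    epoch_losses.append(running / len(loader))   # mean loss over the epoch's batches

print(epoch_losses)
```

With Lightning, the two loops, the device handling, and the optimizer calls move into the framework, and only the per-batch logic remains in user code.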
In this tutorial, we'll explain how to create simple networks using PyTorch Lightning and how to automate the training and prediction processes. The main aim of the tutorial is to get readers started with PyTorch Lightning. As PyTorch Lightning is built on PyTorch, it requires PyTorch knowledge, in particular of how to design neural networks with it.
If you are a fan of scikit-learn, there is a package named Skorch that gives a scikit-learn-like API to your PyTorch models.
import pytorch_lightning as pl
print("PyTorch Lightning Version : {}".format(pl.__version__))
import torch
print("PyTorch Version : {}".format(torch.__version__))
In this section, we'll load the digits dataset available from scikit-learn as a PyTorch dataset. PyTorch lets us load data in batches using two classes: Dataset and DataLoader.
Both classes are available from the torch.utils.data sub-module of PyTorch.
In order to bring our digits data into PyTorch and load it in batches, we have first created a class that extends the Dataset class and holds our data. When we implement a custom Dataset class, we need to provide an implementation of three methods: __init__(), __len__(), and __getitem__().
In our case, as our dataset is small, we have loaded it into main memory inside the __init__() method. We have then divided the dataset into train (80%) and test (20%) sets. Our implementation of __init__() takes a few arguments. The first argument is a string that specifies whether the dataset holds train data or test data. The second and third arguments are transformations to be applied to the features and target values.
The __len__() method returns the number of samples in the train data for the training dataset and the number of samples in the test data for the test dataset. The __getitem__() method returns the sample at a specified index from the train or test data. It returns a tuple of two values, where the first value is the features tensor and the second value is the target tensor, both after applying the transformations. The transformations in our case convert arrays to torch tensors. Other commonly applied transformations are cropping images, normalizing images, etc.
After defining the dataset class, we have initialized train and test Dataset objects. We have then wrapped both datasets inside DataLoader objects. We have also retrieved a single batch of data from each DataLoader object and printed its shape for verification purposes.
from sklearn import datasets
from sklearn.model_selection import train_test_split
from torch.utils.data import Dataset

class DigitsDataset(Dataset):
    def __init__(self, train_or_test="train", feat_transform=torch.tensor, target_transform=torch.tensor):
        self.typ = train_or_test
        X, Y = datasets.load_digits(return_X_y=True)
        self.X_train, self.X_test, self.Y_train, self.Y_test = train_test_split(X, Y,
                                                                                train_size=0.8,
                                                                                stratify=Y,
                                                                                random_state=123)
        self.feat_transform = feat_transform
        self.target_transform = target_transform

    def __len__(self):
        return len(self.Y_train) if self.typ == "train" else len(self.Y_test)

    def __getitem__(self, idx):
        if self.typ == "train":
            x, y = self.X_train[idx], self.Y_train[idx]
        else:
            x, y = self.X_test[idx], self.Y_test[idx]
        return self.feat_transform(x), self.target_transform(y)

train_dataset = DigitsDataset("train")
test_dataset = DigitsDataset("test")

from torch.utils.data import DataLoader

train_loader = DataLoader(train_dataset, batch_size=32)
test_loader = DataLoader(test_dataset, batch_size=32)

for X_batch, Y_batch in train_loader:
    print(X_batch.shape, Y_batch.shape)
    break

for X_batch, Y_batch in test_loader:
    print(X_batch.shape, Y_batch.shape)
    break
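To make the batching behavior of DataLoader more concrete, here is a small sketch on synthetic data. The dataset size of 100 samples is an assumption for illustration, and shuffle=True and drop_last=True are optional DataLoader arguments that the loaders above do not use.

```python
import math
import torch
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Synthetic stand-in data (assumed sizes): 100 samples with 64 features each.
dataset = TensorDataset(torch.rand(100, 64), torch.randint(0, 10, (100,)))

# shuffle=True reorders the samples every epoch; batch sizes are unaffected.
loader = DataLoader(dataset, batch_size=32, shuffle=True)
batch_sizes = [len(X_batch) for X_batch, _ in loader]
print(batch_sizes)                           # [32, 32, 32, 4]
print(len(loader) == math.ceil(100 / 32))    # True

# drop_last=True discards the final partial batch instead of returning it.
full_only = DataLoader(dataset, batch_size=32, drop_last=True)
print(len(full_only))                        # 3
```

The last batch is smaller whenever the dataset size is not a multiple of the batch size, which is why code inside the step methods should not assume a fixed batch dimension.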
In this section, we have explained how we can create a model using PyTorch Lightning so that we can avoid writing training loops.
In order to create a network using PyTorch Lightning, we need to create a class that extends the LightningModule class of PyTorch Lightning. Then, we need to implement a few methods in this class that will be used for training and making predictions. The important ones used in this tutorial are forward(), training_step(), validation_step(), test_step(), predict_step(), and configure_optimizers(). Some of the methods are optional and need to be implemented in special cases only. The superclass LightningModule has a default implementation for the majority of methods.
There are other methods in LightningModule for advanced tasks which we have not covered here. Please refer to the PyTorch Lightning documentation on LightningModule for details.
Below, we have created a simple neural network by extending the LightningModule class. We have implemented the network using the Sequential API of PyTorch and kept the model definition inside the __init__() method. We have then implemented the forward() method, which takes a single batch of data as input and performs a forward pass through the network to make predictions.
Our network has 3 fully connected layers. The first layer takes data of shape (n_samples,64) and outputs (n_samples,16). The second linear layer takes input of shape (n_samples,16) and outputs (n_samples,32). The third and final linear layer takes data of shape (n_samples,32) and outputs (n_samples,10). The first two layers use ReLU (Rectified Linear Unit) as the activation function and the last layer uses softmax.
Next, we have initialized the classifier and printed it. After that, we have passed random data of the expected input shape to the initialized network to make predictions. We can verify from the output shape that the network works as expected.
from torch import nn
from torch.optim import Adam

class DigitsClassifier(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.model = nn.Sequential(
            nn.Linear(64,16),
            nn.ReLU(),
            nn.Linear(16,32),
            nn.ReLU(),
            nn.Linear(32,10),
            nn.Softmax(dim=-1),
        )

    def forward(self, X_batch):
        preds = self.model(X_batch)
        return preds

classifier = DigitsClassifier()
classifier

preds = classifier(torch.rand(50,64))
preds.shape

preds[:5]
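Beyond the output shape, one more sanity check is that each row of a softmax output sums to 1. The sketch below rebuilds the same layer stack as a plain nn.Sequential (an assumption made so that the snippet runs without Lightning) and checks both properties.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Same layer sizes as the network above, written as a plain nn.Sequential.
net = nn.Sequential(
    nn.Linear(64, 16), nn.ReLU(),
    nn.Linear(16, 32), nn.ReLU(),
    nn.Linear(32, 10), nn.Softmax(dim=-1),
)

with torch.no_grad():                    # no gradients needed for a sanity check
    probs = net(torch.rand(50, 64))

row_sums = probs.sum(dim=-1)
print(probs.shape)                                           # torch.Size([50, 10])
print(torch.allclose(row_sums, torch.ones(50), atol=1e-5))   # True
```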
Below, we have again implemented our neural network by extending the LightningModule class, but this time we have implemented the majority of the necessary methods. We have defined the network in the __init__() method as before and implemented the forward pass in the forward() method. The network code is the same as in the class defined above. This time, we have also defined the cross-entropy loss in the __init__() method.
The training_step() method takes a single batch of data as input. It makes predictions using our neural network, calculates the cross-entropy loss, logs it, and returns it. The implementations of validation_step() and test_step() are almost exactly the same. The predict_step() method takes a batch of data as input, makes predictions on it, and returns them.
The optimizer for our network is defined in the configure_optimizers() method, which returns an optimizer initialized with the network's parameters. We have used the Adam optimizer with a learning rate of 0.001.
from torch import nn
from torch.optim import Adam

class DigitsClassifier(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.model = nn.Sequential(
            nn.Linear(64,16),
            nn.ReLU(),
            nn.Linear(16,32),
            nn.ReLU(),
            nn.Linear(32,10),
            nn.Softmax(dim=-1),
        )
        self.crossentropy_loss = nn.CrossEntropyLoss()

    def forward(self, X_batch):
        preds = self.model(X_batch)
        return preds

    def training_step(self, batch, batch_idx):
        X_batch, Y_batch = batch
        preds = self.model(X_batch.float())
        loss_val = self.crossentropy_loss(preds, Y_batch.long())
        self.log("Train Loss : ", loss_val)
        return loss_val

    def validation_step(self, batch, batch_idx):
        X_batch, Y_batch = batch
        preds = self.model(X_batch.float())
        loss_val = self.crossentropy_loss(preds, Y_batch.long())
        self.log("Validation Loss : ", loss_val)
        return loss_val

    def test_step(self, batch, batch_idx):
        X_batch, Y_batch = batch
        preds = self.model(X_batch.float())
        loss_val = self.crossentropy_loss(preds, Y_batch.long())
        self.log("Test Loss : ", loss_val)
        return loss_val

    def predict_step(self, batch, batch_idx):
        X_batch, Y_batch = batch
        preds = self.model(X_batch.float())
        return preds

    def configure_optimizers(self):
        optimizer = Adam(self.model.parameters(), lr=1e-3)
        return optimizer
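One detail worth knowing about the loss used above: PyTorch's cross-entropy loss applies log-softmax internally, so it is normally given raw scores (logits) rather than softmax outputs. The classifier above still trains with softmax outputs as input, but the loss value then differs from the usual cross-entropy. The sketch below illustrates this on random tensors; the tensor sizes and the scale factor are assumptions for illustration.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

logits = torch.randn(8, 10) * 5            # raw, unnormalized scores (assumed scale)
targets = torch.randint(0, 10, (8,))

# cross_entropy applies log-softmax internally ...
ce = F.cross_entropy(logits, targets)
# ... so it matches the negative log-likelihood of explicit log-softmax outputs.
manual = F.nll_loss(F.log_softmax(logits, dim=-1), targets)
print(torch.allclose(ce, manual))           # True

# Feeding probabilities instead means softmax is effectively applied twice.
ce_on_probs = F.cross_entropy(F.softmax(logits, dim=-1), targets)
print(ce.item(), ce_on_probs.item())
```

Because probabilities lie between 0 and 1, the loss computed on them is confined to a narrow range (roughly 1.46 to 2.46 for 10 classes). A common alternative, not used in this tutorial, is to drop the final softmax layer during training and apply it only when probabilities are needed.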
In this section, we'll train the neural network that we created in the previous section. We have first initialized the train and test dataset objects and wrapped them inside data loader objects. We have set the batch size to 64, so the step methods of the network receive batches of up to 64 samples.
In order to train our neural network, we need to create an instance of the Trainer class. Its constructor accepts a number of parameters that control training, such as max_epochs, accelerator, and log_every_n_steps.
In our case, we have initialized the Trainer object with max_epochs set to 30, accelerator set to 'cpu', and logging every 20 steps.
To train the network, we call the fit() method on the Trainer object, passing the model, the train data loader, and the validation data loader. The validation data loader is optional. The call to fit() starts training and displays a progress bar that shows the progress of a single epoch. The loss is reported at the end of an epoch.
We can call the validate() and test() methods separately if we want the loss and other metrics on the validation and test sets. They work like the fit() method: each takes the model followed by a data loader object. We have called both with our test data loader for testing purposes.
train_dataset = DigitsDataset("train")
test_dataset = DigitsDataset("test")
from torch.utils.data import DataLoader
train_loader = DataLoader(train_dataset, batch_size=64, num_workers=4)
test_loader = DataLoader(test_dataset, batch_size=64, num_workers=4)
classifier = DigitsClassifier()
#pl.seed_everything(42, workers=True)
trainer = pl.Trainer(max_epochs=30, accelerator="cpu", log_every_n_steps=20) #, deterministic=True)
trainer.fit(classifier, train_loader, test_loader)
trainer.validate(classifier, test_loader)
trainer.test(classifier, test_loader)
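As a rough worked example of what these settings imply, the arithmetic below estimates the number of optimizer steps in the run. The training-set size of 1,437 is an assumption (80% of the 1,797 samples in the digits dataset, rounded down), and the count of logged points is only approximate.

```python
import math

n_train = 1437            # assumed: 80% of the 1,797 digits samples
batch_size = 64
max_epochs = 30
log_every_n_steps = 20

steps_per_epoch = math.ceil(n_train / batch_size)    # the last batch is smaller
total_steps = steps_per_epoch * max_epochs
logged_points = total_steps // log_every_n_steps     # approximate number of logged train-loss values

print(steps_per_epoch, total_steps, logged_points)   # 23 690 34
```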
We can make predictions with PyTorch Lightning by calling the predict() method on the Trainer object with the model and a data loader. It returns a list of predictions, one entry per batch, which we can combine afterwards. Below, we have made predictions on our test dataset by passing the model and the test loader to the predict() method.
preds = trainer.predict(classifier, test_loader)
preds = torch.concat(preds)
preds = preds.argmax(axis=1)
preds[:5]
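The combine-then-argmax step can be seen in isolation in the sketch below. The list of random tensors is a stand-in for the per-batch outputs returned by predict(), and the batch sizes are assumptions for illustration.

```python
import torch

torch.manual_seed(0)

# Stand-in for the list returned by predict(): one tensor of scores per batch.
# Assumed sizes: two full batches of 64 and a final partial batch of 40.
batch_preds = [torch.rand(64, 10), torch.rand(64, 10), torch.rand(40, 10)]

all_preds = torch.cat(batch_preds)     # stack batches along dim 0 -> (168, 10)
labels = all_preds.argmax(dim=1)       # index of the highest score in each row

print(all_preds.shape, labels.shape)   # torch.Size([168, 10]) torch.Size([168])
```

torch.concat, used in the cell above, is an alias of torch.cat.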
In this section, we have evaluated the performance of our neural network by calculating the accuracy of test predictions. We have also printed a classification report of test predictions that has information like precision, recall, and f1-score per target class.
Y_test = []
for x, y in test_loader:
    Y_test.append(y)
Y_test = torch.concat(Y_test)
Y_test[:5]

from sklearn.metrics import accuracy_score

# scikit-learn metrics expect the arguments in the order (y_true, y_pred).
print("Test Accuracy : {:.3f}".format(accuracy_score(Y_test, preds)))

from sklearn.metrics import classification_report

print("Classification Report : ")
print(classification_report(Y_test, preds))
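To show what these metrics measure, and why the argument order matters for per-class scores, here is a small hand-checkable sketch in plain PyTorch. The label tensors and the helper functions recall() and precision() are hypothetical and exist only for this illustration.

```python
import torch

# Hypothetical labels: 6 samples, 3 classes, one mistake (a true 0 predicted as 1).
y_true = torch.tensor([0, 0, 0, 1, 1, 2])
y_pred = torch.tensor([0, 0, 1, 1, 1, 2])

accuracy = (y_pred == y_true).float().mean().item()   # 5 of 6 correct

def recall(y_true, y_pred, cls):
    """Share of samples whose true label is cls that were predicted as cls."""
    mask = y_true == cls
    return (y_pred[mask] == cls).float().mean().item()

def precision(y_true, y_pred, cls):
    """Share of samples predicted as cls whose true label is cls."""
    mask = y_pred == cls
    return (y_true[mask] == cls).float().mean().item()

print(round(accuracy, 3))                        # 0.833
print(round(recall(y_true, y_pred, 0), 3))       # 0.667
print(round(precision(y_true, y_pred, 0), 3))    # 1.0
# Swapping the two arguments swaps precision and recall:
print(round(recall(y_pred, y_true, 0), 3))       # 1.0
```

Accuracy is symmetric in its two arguments, but precision and recall are not, so passing predictions and true labels in the wrong order leaves the accuracy unchanged while swapping the precision and recall columns.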
Below, we have shown one more example demonstrating how we can create neural networks using PyTorch Lightning. The majority of the methods are almost the same as in our previous example. We have added implementations of two extra methods, training_epoch_end() and validation_epoch_end(). These two hooks belong to the 1.x API of PyTorch Lightning; release 2.0 replaced them with on_train_epoch_end() and on_validation_epoch_end(). Also, we haven't defined our neural network using the Sequential API this time. Instead, we have defined the layers of the network in the __init__() method and called them inside the forward() method to perform a forward pass through the data.
from torch import nn
from torch.optim import Adam
import torch.nn.functional as F

class DigitsClassifier(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.lin1 = nn.Linear(64,16)
        self.lin2 = nn.Linear(16,32)
        self.lin3 = nn.Linear(32,10)

    def forward(self, X_batch):
        x = F.relu(self.lin1(X_batch))
        x = F.relu(self.lin2(x))
        x = F.relu(self.lin3(x))
        return F.softmax(x, dim=-1)

    def training_step(self, batch, batch_idx):
        X_batch, Y_batch = batch
        preds = self(X_batch.float())
        loss_val = F.cross_entropy(preds, Y_batch.long())
        self.log("Train Loss : ", loss_val)
        return {"loss": loss_val}

    def training_epoch_end(self,losses):
        print(len(losses)) ## This will be same as number of training batches

    def validation_step(self, batch, batch_idx):
        X_batch, Y_batch = batch
        preds = self(X_batch.float())
        loss_val = F.cross_entropy(preds, Y_batch.long())
        self.log("Validation Loss : ", loss_val)
        return {"loss": loss_val}

    def validation_epoch_end(self,losses):
        print(len(losses)) ## This will be same as number of validation batches

    def test_step(self, batch, batch_idx):
        X_batch, Y_batch = batch
        preds = self(X_batch.float())
        loss_val = F.cross_entropy(preds, Y_batch.long())
        self.log("Test Loss : ", loss_val)
        return {"loss": loss_val}

    def predict_step(self, batch, batch_idx):
        X_batch, Y_batch = batch
        preds = self(X_batch.float())
        return preds

    def configure_optimizers(self):
        optimizer = Adam(self.parameters(), lr=1e-3, eps=1e-6)
        return optimizer
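The epoch-end methods above only print the number of collected outputs. A common use of that list is to average the per-batch losses into one epoch-level value, which the sketch below shows in isolation. The list of dictionaries is a stand-in for what training_epoch_end() receives in the 1.x API, and the loss values are made up for illustration.

```python
import torch

# Stand-in for the list passed to training_epoch_end(): one dict per batch,
# each holding the value returned by training_step() (assumed loss values).
outputs = [
    {"loss": torch.tensor(2.30)},
    {"loss": torch.tensor(2.10)},
    {"loss": torch.tensor(1.90)},
]

# Stack the scalar losses into one tensor and average them.
epoch_loss = torch.stack([out["loss"] for out in outputs]).mean()
print(len(outputs), round(epoch_loss.item(), 2))   # 3 2.1
```

Note that a plain mean over batches weights a smaller final batch the same as a full one, so it is only an approximation of the per-sample average.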
This ends our small tutorial explaining how we can create a neural network using PyTorch Lightning. This will get individuals started with Lightning framework. Please feel free to let us know your views in the comments section.