Updated On : Jan-17,2022 Tags sonnet, tensorflow, cnn
Sonnet: Convolutional Neural Networks (CNNs)

Sonnet is a deep learning library built on top of TensorFlow by Google DeepMind to simplify the development of deep neural networks. Sonnet lets us design neural networks in the style of Keras (Sequential API) or PyTorch (by extending the sonnet.Module class). It simplifies and speeds up neural network design and lets developers and researchers run more experiment cycles. We have already covered how to create fully connected networks with Sonnet in a separate tutorial, which is useful background for this one.

As a part of this tutorial, we'll explain how to create convolutional neural networks (CNNs) using Sonnet. We'll cover different ways of building CNNs and train them with different optimizers. This tutorial won't go into the details of neural network components like layers, activation functions, and optimizers. We expect the reader to have enough background on these topics to follow along. The tutorial is designed so that individuals can start using CNNs for their own tasks with Sonnet. If you want to know the theory behind CNNs and their pros and cons, please check our separate blog post on that topic.

Below, we have highlighted important sections of our tutorial to give an overview of the material covered.

Important Sections of Tutorial

  1. Simple Convolutional Neural Network
    • Load Dataset
    • Create CNN
      • CNN using Sequential API
      • CNN by Extending sonnet.Module Class
    • Train Model (SGD)
    • Make Predictions
    • Evaluate Model Performance
    • Train Model (Adam)
    • Make Predictions
    • Evaluate Model Performance
  2. Guide to Handle Channels First vs Channels Last

Installation

  • pip install dm-sonnet

Below, we have imported Sonnet and tensorflow libraries. We have also printed the version of both that we'll be using in our tutorial.

In [2]:
import sonnet as snt

print("Sonnet Version : {}".format(snt.__version__))
Sonnet Version : 2.0.0
In [3]:
import tensorflow as tf

print("Tensorflow Version : {}".format(tf.__version__))
Tensorflow Version : 2.6.2

1. Simple Convolutional Neural Network

In this section, we'll explain how we can create a simple convolutional neural network of 2 convolution layers to solve multi-class classification tasks. We'll be using the fashion MNIST dataset available from keras for our purpose which has images for 10 different fashion items.

Load Dataset

In this section, we have loaded the Fashion MNIST dataset available from keras. It has grayscale images of shape (28,28) for 10 different fashion items. The dataset is already divided into train (60k images) and test (10k images) sets. After loading the datasets, we have converted them to TensorFlow tensors, as Sonnet networks require tensors as input.

We have then reshaped the images, adding one extra dimension at the end to transform them from shape (28,28) to (28,28,1). Convolution layers operate on the channels of input images. Color (RGB) images have 3 channels (red, green, and blue), whereas grayscale images have a single channel holding the intensity of each pixel, which the raw arrays leave implicit. The extra dimension makes that single channel explicit. After adding it, we have divided the images by the float value 255 to bring all pixel values from the range [0,255] into the range [0,1]. This scaling helps the optimization algorithm converge faster during training.

In [4]:
from tensorflow import keras
from sklearn.model_selection import train_test_split

(X_train, Y_train), (X_test, Y_test) = keras.datasets.fashion_mnist.load_data()

X_train, X_test, Y_train, Y_test = tf.convert_to_tensor(X_train, dtype=tf.float32),\
                                   tf.convert_to_tensor(X_test, dtype=tf.float32),\
                                   tf.convert_to_tensor(Y_train, dtype=tf.float32),\
                                   tf.convert_to_tensor(Y_test, dtype=tf.float32)

X_train, X_test = tf.reshape(X_train, (-1,28,28,1)), tf.reshape(X_test, (-1,28,28,1))

X_train, X_test = X_train/255.0, X_test/255.0

classes =  tf.unique(Y_train)

X_train.shape, X_test.shape, Y_train.shape, Y_test.shape
Out[4]:
(TensorShape([60000, 28, 28, 1]),
 TensorShape([10000, 28, 28, 1]),
 TensorShape([60000]),
 TensorShape([10000]))
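
The scaling and reshaping steps can be illustrated on a tiny example. The sketch below is plain Python and does not use TensorFlow; the helper name scale_and_add_channel and the 2x2 image are made up for illustration only.

```python
def scale_and_add_channel(image, max_value=255.0):
    """Scale pixel values to [0, 1] and wrap each pixel in a 1-element channel list."""
    return [[[pixel / max_value] for pixel in row] for row in image]


# A made-up 2x2 grayscale image with values in [0, 255].
image = [[0, 51],
         [102, 255]]

scaled = scale_and_add_channel(image)

# The shape goes from (2, 2) to (2, 2, 1), mirroring (28, 28) -> (28, 28, 1).
shape = (len(scaled), len(scaled[0]), len(scaled[0][0]))
print(shape)
print(scaled)
```

In the tutorial itself, the same effect is achieved with tf.reshape() followed by a division by 255.0.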

Create CNN

In this section, we have created a CNN for our multi-class classification task. We have explained two different ways of creating a CNN using Sonnet. We'll explain how both work as well.

CNN using Sequential API

In this section, we have created a CNN using the Sequential API of Sonnet. Sonnet has a class named Sequential that accepts a list of layers as input and creates a neural network from it. It applies the layers in the order in which they were given. This way of creating a neural network is almost the same as the Keras Sequential API.

Our CNN consists of 2 convolution layers. The first convolution layer has 32 output channels and a kernel size of (3,3). The second convolution layer has 16 output channels and a kernel size of (3,3). Both convolution layers have padding set to 'SAME', which pads the input with zeros so that the height and width of the image stay the same after the convolution operation. We have used the ReLU (Rectified Linear Unit) activation function after both convolution layers. After both convolution layers, we have flattened the output using a flatten layer so that it can be given to a linear/dense layer with 10 output units. Our dataset has 10 different categories of fashion images, hence the last layer has 10 output units. At last, we have applied the softmax activation function to the output of the linear layer. For each sample, softmax maps the 10 output values to probabilities in the range [0,1] that sum to 1.
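
To make the softmax step concrete, here is a minimal plain-Python sketch. It is not the TensorFlow implementation; the function and the example scores are illustrative assumptions.

```python
import math


def softmax(logits):
    """Map raw scores to probabilities in [0, 1] that sum to 1."""
    shift = max(logits)  # subtracting the maximum keeps exp() numerically stable
    exps = [math.exp(value - shift) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


logits = [2.0, 1.0, 0.1]  # made-up scores for 3 classes
probs = softmax(logits)
print(probs)
print(sum(probs))

# Equal scores for 10 classes give a probability of about 0.1 per class.
uniform = softmax([0.0] * 10)
print(uniform)
```

An untrained network produces scores that are close to each other, which is why its softmax outputs tend to sit near 0.1 for each of the 10 classes.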

Our input data has shape (n_samples,28,28,1). The first convolution layer will transform data from shape (n_samples,28,28,1) to (n_samples,28,28,32). The second convolution layer will transform data from shape (n_samples,28,28,32) to (n_samples,28,28,16). The flatten layer will flatten data from shape (n_samples,28,28,16) to (n_samples,28 x 28 x 16) = (n_samples,12544). Finally, the linear layer will transform the shape from (n_samples,12544) to (n_samples,10).
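
The shape bookkeeping above, together with the number of parameters in each layer, can be checked with a few lines of arithmetic. The sketch below is plain Python with hypothetical helper names; it assumes stride 1 and 'SAME' padding, as used in this tutorial.

```python
def conv2d_same(input_shape, output_channels, kernel_shape=(3, 3)):
    """Return (output_shape, n_params) for a stride-1 'SAME' convolution on NHWC input."""
    n, height, width, in_channels = input_shape
    kernel_h, kernel_w = kernel_shape
    # One kernel per (input channel, output channel) pair plus one bias per output channel.
    n_params = kernel_h * kernel_w * in_channels * output_channels + output_channels
    return (n, height, width, output_channels), n_params


def linear(input_shape, output_size):
    """Return (output_shape, n_params) for a dense layer."""
    n, in_features = input_shape
    return (n, output_size), in_features * output_size + output_size


shape = ("n_samples", 28, 28, 1)
shape, conv1_params = conv2d_same(shape, 32)
shape, conv2_params = conv2d_same(shape, 16)
flat_shape = (shape[0], shape[1] * shape[2] * shape[3])
out_shape, linear_params = linear(flat_shape, 10)

print(flat_shape, out_shape)
print(conv1_params, conv2_params, linear_params)
print(conv1_params + conv2_params + linear_params)
```

The weight shapes implied by this arithmetic, (3,3,1,32), (3,3,32,16), and (12544,10), match the shapes reported by trainable_variables.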

After creating a CNN with Sequential API, we can simply call it by providing input data to perform a forward pass through it to make predictions. We have performed a forward pass through the network by giving a few data samples and printed output in the next cell below. The model parameters are available through attribute trainable_variables. We have printed the shapes of model parameters as well in the cell below.

In [5]:
cnn = snt.Sequential([
                        snt.Conv2D(output_channels=32, kernel_shape=(3,3), padding="SAME"),
                        tf.nn.relu,
                        snt.Conv2D(output_channels=16, kernel_shape=(3,3), padding="SAME"),
                        tf.nn.relu,

                        snt.Flatten(),
                        snt.Linear(10),
                        tf.nn.softmax,
                    ])

cnn
Out[5]:
Sequential(
    layers=[Conv2D(output_channels=32, kernel_shape=(3, 3)),
            <function relu at 0x7f41709eff80>,
            Conv2D(output_channels=16, kernel_shape=(3, 3)),
            <function relu at 0x7f41709eff80>,
            Flatten(),
            Linear(output_size=10),
            <function softmax_v2 at 0x7f412dbee0e0>],
)
In [6]:
cnn(X_train[:5])
2022-01-20 11:21:45.443828: I tensorflow/stream_executor/cuda/cuda_dnn.cc:369] Loaded cuDNN version 8005
Out[6]:
<tf.Tensor: shape=(5, 10), dtype=float32, numpy=
array([[0.10708603, 0.12154393, 0.11793932, 0.07805917, 0.09452628,
        0.0881447 , 0.0973983 , 0.08624125, 0.08782636, 0.12123458],
       [0.11654177, 0.0949267 , 0.11432538, 0.09900108, 0.09545408,
        0.07769915, 0.08637305, 0.09383278, 0.07463612, 0.14720987],
       [0.10034525, 0.10150975, 0.10733461, 0.10217309, 0.10231724,
        0.09843194, 0.09513237, 0.09644447, 0.09133261, 0.1049786 ],
       [0.09450637, 0.09789043, 0.11658692, 0.09337962, 0.10274986,
        0.09639522, 0.100873  , 0.08864614, 0.09153051, 0.11744192],
       [0.09605057, 0.10369902, 0.11335815, 0.09795629, 0.10283177,
        0.0888365 , 0.09710021, 0.09670954, 0.0829699 , 0.12048808]],
      dtype=float32)>
In [7]:
for tensor in cnn.trainable_variables:
    print("{} : {}".format(tensor.name, tensor.shape))
conv2_d/b:0 : (32,)
conv2_d/w:0 : (3, 3, 1, 32)
conv2_d/b:0 : (16,)
conv2_d/w:0 : (3, 3, 32, 16)
linear/b:0 : (10,)
linear/w:0 : (12544, 10)
2022-01-20 11:21:51.067816: W tensorflow/python/util/util.cc:348] Sets are not currently considered sequences, but this may change in the future, so consider avoiding using them.

CNN by Extending sonnet.Module Class

In this section, we have explained the second way of creating a CNN using Sonnet. Here, we'll create a CNN by extending the sonnet.Module class. This approach closely resembles the PyTorch way of creating neural networks. We have created a CNN with the same layers as the one created with the Sequential API in the previous section.

In order to create a CNN this way, we need to implement two methods.

  1. __init__() - In this method, we initialize the layers of our neural network.
  2. __call__() - In this method, we perform a forward pass through the network using the layers defined in the __init__() method. The method takes data as input and returns predictions.

After defining CNN, we have also performed a forward pass through it with a few data samples for verification purposes. We have also printed the shapes of network parameters later.

In [8]:
class CNN(snt.Module):
    def __init__(self,name="CNN"):
        super(CNN, self).__init__(name=name)
        self.conv1 = snt.Conv2D(output_channels=32, kernel_shape=(3,3), padding="SAME")
        self.conv2 = snt.Conv2D(output_channels=16, kernel_shape=(3,3), padding="SAME")
        self.flatten = snt.Flatten()
        self.linear = snt.Linear(10)

    def __call__(self, X_batch):
        x = tf.nn.relu(self.conv1(X_batch))
        x = tf.nn.relu(self.conv2(x))

        x = self.flatten(x)
        x = self.linear(x)
        return tf.nn.softmax(x)
In [9]:
cnn = CNN()

cnn(X_train[:5])
Out[9]:
<tf.Tensor: shape=(5, 10), dtype=float32, numpy=
array([[0.11357409, 0.08918905, 0.13917989, 0.08398306, 0.09884588,
        0.08988895, 0.09523971, 0.10949329, 0.11008118, 0.07052492],
       [0.10054702, 0.0931831 , 0.12344179, 0.09434835, 0.08868913,
        0.07975365, 0.11883506, 0.11021546, 0.1263581 , 0.06462835],
       [0.10483281, 0.09424844, 0.11472621, 0.09259024, 0.0998148 ,
        0.09449553, 0.10488506, 0.10251755, 0.10972098, 0.08216829],
       [0.10319267, 0.09225941, 0.11982124, 0.1019851 , 0.09488805,
        0.0898211 , 0.11230278, 0.10527758, 0.10919022, 0.07126177],
       [0.11829449, 0.08447925, 0.1335475 , 0.0958213 , 0.09391752,
        0.08763787, 0.11239003, 0.10256291, 0.10419867, 0.0671504 ]],
      dtype=float32)>
In [10]:
for tensor in cnn.trainable_variables:
    print("{} : {}".format(tensor.name, tensor.shape))
CNN/conv2_d/b:0 : (32,)
CNN/conv2_d/w:0 : (3, 3, 1, 32)
CNN/conv2_d/b:0 : (16,)
CNN/conv2_d/w:0 : (3, 3, 32, 16)
CNN/linear/b:0 : (10,)
CNN/linear/w:0 : (12544, 10)

Train Model (SGD)

In this section, we are training our CNN. In order to train the network, we have defined a function that will be used to train the neural network.

The function takes data features (X), target values (Y), number of epochs, and batch size as input. It then executes the training loop once per epoch. In each epoch, it calculates the start and end indexes of the batches, divides the data into batches using these indexes, and loops through them. For each batch of data, it performs a forward pass through the network to make predictions. It then calculates the loss using the predictions and actual target values. Both of these operations are done inside the tf.GradientTape() context manager, which records the operations needed to compute gradients of the loss with respect to model parameters. We then retrieve the model parameters using the trainable_variables attribute. We give the loss value and model parameters to the gradient() method of GradientTape, which calculates the gradients of the loss with respect to the parameters and returns them. We then call the apply() method of the optimizer to update the model parameters with the gradients. We record the loss for each batch and print the average loss for each epoch.
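
The batch index calculation described above can be sketched on its own. The helper below is plain Python and its name is hypothetical; it uses a stepped range, which has the side benefit of never producing an empty final batch when the number of samples is an exact multiple of the batch size.

```python
def batch_slices(n_samples, batch_size):
    """Return (start, end) index pairs that cover n_samples; the last batch may be smaller."""
    return [(start, min(start + batch_size, n_samples))
            for start in range(0, n_samples, batch_size)]


print(batch_slices(10, 4))   # last batch holds the 2 leftover samples
print(batch_slices(8, 4))    # exact multiple: no empty trailing batch

# The training set of this tutorial: 60000 samples with a batch size of 256.
slices = batch_slices(60000, 256)
print(len(slices), slices[-1])
```

Each pair can be used directly as X[start:end] and Y[start:end] inside the training loop.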

In [11]:
def TrainModelInBatches(X, Y, epochs, batch_size=32):
    for i in range(1, epochs+1):
        batches = tf.range((X.shape[0]//batch_size)+1) ### Batch Indices

        losses = [] ## Record loss of each batch
        for batch in batches:
            if batch != batches[-1]:
                start, end = int(batch*batch_size), int(batch*batch_size+batch_size)
            else:
                start, end = int(batch*batch_size), None

            X_batch, Y_batch = X[start:end], Y[start:end] ## Single batch of data
            if X_batch.shape[0] == 0: ## Skip empty last batch (sample count divisible by batch size)
                continue

            with tf.GradientTape() as tape:
                preds = cnn(X_batch) ## Make Predictions on Batch of Data
                loss = loss_func(Y_batch, preds) ## Calculate Loss

            ## Gradients and weight update happen outside the tape context
            params = cnn.trainable_variables ## Retrieve Model Parameters
            grads = tape.gradient(loss, params) ## Calculate Gradients

            optimizer.apply(grads, params) ## Update Weights

            losses.append(loss) ## Record Loss

        print("CrossEntropyLoss : {:.3f}".format(tf.math.reduce_mean(tf.convert_to_tensor(losses))))

Now, we train our CNN by initializing the necessary variables and calling the function defined in the previous cell. We have set the learning rate to 0.001, epochs to 25, and batch size to 256. Then, we have initialized the SGD optimizer from the sonnet.optimizers module with this learning rate. We have also initialized the CategoricalCrossentropy loss, a commonly used loss function for multi-class classification tasks. At last, we have called our training function with the necessary parameters to train the CNN. Please note that we give the target values one-hot encoded using the to_categorical() function of keras, because this loss function requires one-hot encoded targets. The loss printed after each epoch decreases steadily, from 1.507 after the first epoch to 0.472 after the last.
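
To show what one-hot encoding and categorical cross entropy compute for a single sample, here is a minimal plain-Python sketch. It is not the keras implementation; the helper names, the clipping constant eps, and the example probabilities are assumptions made for illustration.

```python
import math


def one_hot(label, n_classes):
    """Encode an integer class label as a vector with a single 1."""
    return [1.0 if index == label else 0.0 for index in range(n_classes)]


def categorical_cross_entropy(target, probs, eps=1e-7):
    """Negative log-probability assigned to the true class (eps avoids log(0))."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(target, probs))


target = one_hot(2, 4)
confident = [0.05, 0.05, 0.85, 0.05]  # made-up prediction favouring the true class
uniform = [0.25, 0.25, 0.25, 0.25]    # made-up prediction with no preference

print(target)
print(categorical_cross_entropy(target, confident))
print(categorical_cross_entropy(target, uniform))
```

A prediction that is uniform over k classes has a loss of ln(k), which is about 2.303 for the 10 classes of Fashion MNIST. This gives a useful reference point when reading the per-epoch loss values.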

In [12]:
learning_rate = 1/1e3
epochs = 25
batch_size=256

optimizer = snt.optimizers.SGD(learning_rate=learning_rate)
loss_func = tf.losses.CategoricalCrossentropy()

TrainModelInBatches(X_train, tf.keras.utils.to_categorical(Y_train), epochs, batch_size)
CrossEntropyLoss : 1.507
CrossEntropyLoss : 0.816
CrossEntropyLoss : 0.702
CrossEntropyLoss : 0.650
CrossEntropyLoss : 0.617
CrossEntropyLoss : 0.595
CrossEntropyLoss : 0.578
CrossEntropyLoss : 0.564
CrossEntropyLoss : 0.553
CrossEntropyLoss : 0.543
CrossEntropyLoss : 0.535
CrossEntropyLoss : 0.527
CrossEntropyLoss : 0.521
CrossEntropyLoss : 0.515
CrossEntropyLoss : 0.509
CrossEntropyLoss : 0.504
CrossEntropyLoss : 0.499
CrossEntropyLoss : 0.495
CrossEntropyLoss : 0.491
CrossEntropyLoss : 0.487
CrossEntropyLoss : 0.484
CrossEntropyLoss : 0.481
CrossEntropyLoss : 0.477
CrossEntropyLoss : 0.475
CrossEntropyLoss : 0.472

Make Predictions

In this section, we are making predictions on the train and test datasets using our trained CNN model. The function below loops through the input data in batches and makes predictions for each batch. It returns the list of per-batch predictions, which we then combine into a single tensor.

As the output of our neural network is 10 probabilities per sample, we need to include logic to retrieve the target class from these 10 probabilities. To do that, we'll retrieve the index of highest probability from 10 probabilities and predict that index as the target class. We'll need to do that for each data sample. To execute this logic, we have used argmax() method on the output of the neural network to predict the actual target class for each data sample.
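
The argmax logic described above can be sketched in plain Python. The helper and the probability rows below are illustrative; the tutorial itself uses tf.argmax() on the network output.

```python
def argmax(values):
    """Return the index of the largest value."""
    return max(range(len(values)), key=lambda index: values[index])


# Made-up probabilities for 2 samples and 3 classes.
probs = [[0.1, 0.7, 0.2],
         [0.5, 0.3, 0.2]]

predicted_classes = [argmax(row) for row in probs]
print(predicted_classes)
```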

In [13]:
def MakePredictions(input_data, batch_size=32):
    batches = tf.range((input_data.shape[0]//batch_size)+1) ### Batch Indices

    preds = []
    for batch in batches:
        if batch != batches[-1]:
            start, end = int(batch*batch_size), int(batch*batch_size+batch_size)
        else:
            start, end = int(batch*batch_size), None

        X_batch = input_data[start:end]

        if X_batch.shape[0] != 0:
            preds.append(cnn(X_batch))

    return preds
In [14]:
test_preds = MakePredictions(X_test, batch_size=batch_size)

test_preds = tf.concat(test_preds, axis=0) ## Combine predictions of all batches

test_preds = tf.argmax(test_preds, axis=1)

train_preds = MakePredictions(X_train, batch_size=batch_size)

train_preds = tf.concat(train_preds, axis=0) ## Combine predictions of all batches

train_preds = tf.argmax(train_preds, axis=1)

test_preds[:5], train_preds[:5]
Out[14]:
(<tf.Tensor: shape=(5,), dtype=int64, numpy=array([9, 2, 1, 1, 6])>,
 <tf.Tensor: shape=(5,), dtype=int64, numpy=array([9, 0, 1, 0, 3])>)

Evaluate Model Performance

In this section, we are evaluating the performance of our neural network by calculating the accuracy of train and test predictions. We have also calculated a classification report on test data which has information like precision, recall, and f1-score for each target class. To calculate performance metrics, we have used functions available from scikit-learn.
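
As a rough illustration of what these metrics measure, the plain-Python sketch below computes accuracy and per-class precision and recall on a small made-up example. It is not the scikit-learn implementation, and the helper names are hypothetical.

```python
def accuracy(actual, predicted):
    """Fraction of samples whose predicted class equals the actual class."""
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    return correct / len(actual)


def precision_recall(actual, predicted, label):
    """Precision and recall for a single class label."""
    true_pos = sum(1 for a, p in zip(actual, predicted) if a == label and p == label)
    predicted_pos = sum(1 for p in predicted if p == label)
    actual_pos = sum(1 for a in actual if a == label)
    precision = true_pos / predicted_pos if predicted_pos else 0.0
    recall = true_pos / actual_pos if actual_pos else 0.0
    return precision, recall


# Made-up labels for 6 samples and 3 classes.
actual = [0, 0, 1, 1, 2, 2]
predicted = [0, 1, 1, 1, 2, 0]

print(accuracy(actual, predicted))
print(precision_recall(actual, predicted, label=1))
```

Precision answers "of the samples predicted as this class, how many were right", while recall answers "of the samples that really belong to this class, how many were found".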

If you want to learn about the functions available in scikit-learn for various ML metrics, please check our separate tutorial that covers the majority of them in detail.

In [15]:
from sklearn.metrics import accuracy_score

print("Train Accuracy : {:.3f}".format(accuracy_score(Y_train, train_preds)))
print("Test  Accuracy : {:.3f}".format(accuracy_score(Y_test, test_preds)))
Train Accuracy : 0.837
Test  Accuracy : 0.822
In [16]:
from sklearn.metrics import classification_report

print("Test Classification Report ")
print(classification_report(Y_test, test_preds))
Test Classification Report
              precision    recall  f1-score   support

         0.0       0.80      0.80      0.80      1000
         1.0       0.96      0.94      0.95      1000
         2.0       0.81      0.55      0.65      1000
         3.0       0.80      0.87      0.83      1000
         4.0       0.66      0.81      0.73      1000
         5.0       0.95      0.90      0.92      1000
         6.0       0.56      0.54      0.55      1000
         7.0       0.90      0.92      0.91      1000
         8.0       0.91      0.95      0.93      1000
         9.0       0.92      0.95      0.93      1000

    accuracy                           0.82     10000
   macro avg       0.83      0.82      0.82     10000
weighted avg       0.83      0.82      0.82     10000

Train Model (Adam)

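In this section, we are training our CNN again, but this time we have used the Adam optimizer instead of the SGD optimizer. The number of epochs and the batch size are the same as in the SGD run, but two settings differ: the learning rate is 0.0001 instead of 0.001, and we use SparseCategoricalCrossentropy, which accepts integer class labels directly, so the targets are not one-hot encoded. Please also note that the code below does not re-create the network, so Adam continues from the weights already learned with SGD. This is why the loss of the first epoch (0.450) is already below the final SGD loss (0.472). The comparison therefore shows whether further training with Adam improves performance, not how the two optimizers compare from the same starting point.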
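
For readers who want to see how the two update rules differ, here is a minimal plain-Python sketch on a single scalar parameter. It follows the standard published Adam update and is not taken from Sonnet's source; the default values of beta1, beta2, and eps, the toy objective, and the step size are assumptions chosen for illustration.

```python
import math


def sgd_step(w, grad, lr):
    """Plain gradient descent: move against the gradient."""
    return w - lr * grad


def adam_step(w, grad, m, v, t, lr, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update; m and v are running averages of the gradient and its square."""
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad * grad
    m_hat = m / (1 - beta1 ** t)  # bias correction for the first steps
    v_hat = v / (1 - beta2 ** t)
    w = w - lr * m_hat / (math.sqrt(v_hat) + eps)
    return w, m, v


# Toy objective f(w) = w^2 with gradient 2w, starting from w = 1.0.
w_sgd = 1.0
w_adam, m, v = 1.0, 0.0, 0.0
for t in range(1, 51):
    w_sgd = sgd_step(w_sgd, 2 * w_sgd, lr=0.01)
    w_adam, m, v = adam_step(w_adam, 2 * w_adam, m, v, t, lr=0.01)

print(w_sgd, w_adam)
```

The key difference is that Adam scales each step by running statistics of the gradient, so the size of its first steps is close to the learning rate regardless of the gradient magnitude.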

In [17]:
learning_rate = 1/1e4
epochs = 25
batch_size=256

optimizer = snt.optimizers.Adam(learning_rate=learning_rate)
loss_func = tf.losses.SparseCategoricalCrossentropy()

TrainModelInBatches(X_train, Y_train, epochs, batch_size)
CrossEntropyLoss : 0.450
CrossEntropyLoss : 0.412
CrossEntropyLoss : 0.384
CrossEntropyLoss : 0.365
CrossEntropyLoss : 0.349
CrossEntropyLoss : 0.337
CrossEntropyLoss : 0.326
CrossEntropyLoss : 0.317
CrossEntropyLoss : 0.308
CrossEntropyLoss : 0.300
CrossEntropyLoss : 0.293
CrossEntropyLoss : 0.287
CrossEntropyLoss : 0.281
CrossEntropyLoss : 0.275
CrossEntropyLoss : 0.270
CrossEntropyLoss : 0.265
CrossEntropyLoss : 0.260
CrossEntropyLoss : 0.256
CrossEntropyLoss : 0.252
CrossEntropyLoss : 0.248
CrossEntropyLoss : 0.244
CrossEntropyLoss : 0.240
CrossEntropyLoss : 0.237
CrossEntropyLoss : 0.234
CrossEntropyLoss : 0.230

Make Predictions

In this section, we have made predictions on our train and test datasets using CNN trained with Adam optimizer.

In [18]:
test_preds = MakePredictions(X_test, batch_size=batch_size)

test_preds = tf.concat(test_preds, axis=0) ## Combine predictions of all batches

test_preds = tf.argmax(test_preds, axis=1)

train_preds = MakePredictions(X_train, batch_size=batch_size)

train_preds = tf.concat(train_preds, axis=0) ## Combine predictions of all batches

train_preds = tf.argmax(train_preds, axis=1)

test_preds[:5], train_preds[:5]
Out[18]:
(<tf.Tensor: shape=(5,), dtype=int64, numpy=array([9, 2, 1, 1, 6])>,
 <tf.Tensor: shape=(5,), dtype=int64, numpy=array([9, 0, 0, 3, 1])>)

Evaluate Model Performance

In this section, we have evaluated the performance of our CNN by calculating the accuracy of train and test predictions. We have also calculated the classification report for test predictions. From the results, we can see that performance has improved: train accuracy rose from 0.837 to 0.920 and test accuracy from 0.822 to 0.894.

In [19]:
from sklearn.metrics import accuracy_score

print("Train Accuracy : {:.3f}".format(accuracy_score(Y_train, train_preds)))
print("Test  Accuracy : {:.3f}".format(accuracy_score(Y_test, test_preds)))
Train Accuracy : 0.920
Test  Accuracy : 0.894
In [20]:
from sklearn.metrics import classification_report

print("Test Classification Report ")
print(classification_report(Y_test, test_preds))
Test Classification Report
              precision    recall  f1-score   support

         0.0       0.80      0.89      0.84      1000
         1.0       0.99      0.98      0.98      1000
         2.0       0.82      0.85      0.84      1000
         3.0       0.87      0.92      0.89      1000
         4.0       0.80      0.87      0.84      1000
         5.0       0.97      0.97      0.97      1000
         6.0       0.80      0.59      0.68      1000
         7.0       0.95      0.94      0.95      1000
         8.0       0.98      0.96      0.97      1000
         9.0       0.95      0.96      0.96      1000

    accuracy                           0.89     10000
   macro avg       0.89      0.89      0.89     10000
weighted avg       0.89      0.89      0.89     10000

2. Guide to Handle Channels First vs Channels Last

In our example above, we used grayscale images and introduced the channel dimension at the end to train the CNN. As we said earlier, RGB or color images have 3 channels. There are two different ways to place the channel dimension in the multi-dimensional array that represents an image.

  1. Channels First - Here, we represent color image of (28,28) pixels as (3,28,28) dimension array.
  2. Channels Last - Here, we represent color image of (28,28) pixels as (28,28,3) dimension array.

In our example, we kept the channel dimension last. But developers can face situations where the data is in channels-first format. To handle those situations, the Conv2D layer of Sonnet has a parameter named data_format. The default value of this parameter is NHWC.

  • N - Number of data samples.
  • H - Height of Image
  • W - Width of Image
  • C - Number of channels.

If we have data where channel details are present at the beginning then we can specify the value of data_format parameter as NCHW and Conv2D layer will work fine with that format.

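Below, we show with examples how the different data formats behave. The first cell passes channels-last data to layers with the default format. The second cell passes channels-first data of shape (50,1,28,28) to layers that still use the default NHWC format: no error is raised, but the first layer treats the last dimension (28) as the channel dimension, which gives weights of shape (3,3,28,16) and an output of shape (50,1,28,16). The third cell sets data_format to NCHW and produces the expected shapes.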

In [21]:
conv1 = snt.Conv2D(output_channels=16, kernel_shape=(3,3), padding="SAME")
conv2 = snt.Conv2D(output_channels=32, kernel_shape=(3,3), padding="SAME")

preds1 = conv1(tf.random.normal((50,28,28,1)))
preds2 = conv2(preds1)

print("Weights of First Conv Layer : {}".format(conv1.trainable_variables[1].shape))
print("Weights of Second Conv Layer : {}".format(conv2.trainable_variables[1].shape))

print("\nInput Shape               : {}".format((50,28,28,1)))
print("Conv Layer 1 Output Shape : {}".format(preds1.shape))
print("Conv Layer 2 Output Shape : {}".format(preds2.shape))
Weights of First Conv Layer : (3, 3, 1, 16)
Weights of Second Conv Layer : (3, 3, 16, 32)

Input Shape               : (50, 28, 28, 1)
Conv Layer 1 Output Shape : (50, 28, 28, 16)
Conv Layer 2 Output Shape : (50, 28, 28, 32)
In [22]:
conv1 = snt.Conv2D(output_channels=16, kernel_shape=(3,3), padding="SAME")
conv2 = snt.Conv2D(output_channels=32, kernel_shape=(3,3), padding="SAME")

preds1 = conv1(tf.random.normal((50,1,28,28)))
preds2 = conv2(preds1)

print("Weights of First Conv Layer : {}".format(conv1.trainable_variables[1].shape))
print("Weights of Second Conv Layer : {}".format(conv2.trainable_variables[1].shape))

print("\nInput Shape               : {}".format((50,1,28,28)))
print("Conv Layer 1 Output Shape : {}".format(preds1.shape))
print("Conv Layer 2 Output Shape : {}".format(preds2.shape))
Weights of First Conv Layer : (3, 3, 28, 16)
Weights of Second Conv Layer : (3, 3, 16, 32)

Input Shape               : (50, 1, 28, 28)
Conv Layer 1 Output Shape : (50, 1, 28, 16)
Conv Layer 2 Output Shape : (50, 1, 28, 32)
In [23]:
conv1 = snt.Conv2D(output_channels=16, kernel_shape=(3,3), padding="SAME", data_format="NCHW")
conv2 = snt.Conv2D(output_channels=32, kernel_shape=(3,3), padding="SAME", data_format="NCHW")

preds1 = conv1(tf.random.normal((50,1,28,28)))
preds2 = conv2(preds1)

print("Weights of First Conv Layer : {}".format(conv1.trainable_variables[1].shape))
print("Weights of Second Conv Layer : {}".format(conv2.trainable_variables[1].shape))

print("\nInput Shape               : {}".format((50,1,28,28)))
print("Conv Layer 1 Output Shape : {}".format(preds1.shape))
print("Conv Layer 2 Output Shape : {}".format(preds2.shape))
Weights of First Conv Layer : (3, 3, 1, 16)
Weights of Second Conv Layer : (3, 3, 16, 32)

Input Shape               : (50, 1, 28, 28)
Conv Layer 1 Output Shape : (50, 16, 28, 28)
Conv Layer 2 Output Shape : (50, 32, 28, 28)
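
An alternative to setting data_format is to rearrange channels-first data into channels-last layout before giving it to the network. The plain-Python sketch below shows the rearrangement for a single image stored as nested lists; the helper name and the tiny image are made up for illustration.

```python
def chw_to_hwc(image):
    """Convert one image from channels-first (C, H, W) to channels-last (H, W, C)."""
    channels, height, width = len(image), len(image[0]), len(image[0][0])
    return [[[image[c][h][w] for c in range(channels)]
             for w in range(width)]
            for h in range(height)]


# A made-up 3-channel image of 2x2 pixels in channels-first layout.
image_chw = [
    [[1, 2], [3, 4]],          # channel 0
    [[10, 20], [30, 40]],      # channel 1
    [[100, 200], [300, 400]],  # channel 2
]

image_hwc = chw_to_hwc(image_chw)
shape = (len(image_hwc), len(image_hwc[0]), len(image_hwc[0][0]))
print(shape)
print(image_hwc[0][1])  # all channel values of the pixel at row 0, column 1
```

For a batch of TensorFlow tensors, the same rearrangement from NCHW to NHWC can be done with tf.transpose(x, perm=[0, 2, 3, 1]).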

This ends our small tutorial explaining how we can create convolutional neural networks (CNN) using Sonnet. Please feel free to let us know your views in the comments section.

Sunny Solanki