Updated On: Dec-02, 2019 | Time Investment: ~20 mins

PyTorch Basics

PyTorch is an open-source machine learning/deep learning library written in Python, C++, and CUDA by Facebook's artificial intelligence research group. It is based on the Torch library, which was written in Lua.

PyTorch provides linear algebra functionality similar to NumPy's, but it can run on the GPU as well as the CPU. It also provides implementations of various neural network layers, optimization algorithms, a few pre-built neural network models, and many other utilities. PyTorch has supporting packages like torchvision and torchtext for handling image and text data.

PyTorch provides tensors, which are analogous to NumPy's multidimensional arrays but can be easily moved to GPUs. In this tutorial, we'll be learning the basics of tensors.

import torch
import sys
import numpy as np
print('Python Version : '+sys.version)
print('Pytorch Version : '+torch.__version__)
Python Version : 3.6.6 |Anaconda, Inc.| (default, Oct  9 2018, 12:34:16)
[GCC 7.3.0]
Pytorch Version : 1.0.0

Various ways to create torch tensors

empty_tensor = torch.empty(2,5) ## Please make a note that it creates a tensor with uninitialized (garbage) values
print(empty_tensor, empty_tensor.dtype)
empty_tensor = torch.empty(2,5, dtype=torch.int32) ## One can provide a data type, the same as with numpy array creation
print(empty_tensor)
tensor([[-9.0035e+23,  3.0782e-41,  5.6052e-45,  0.0000e+00,         nan],
        [ 0.0000e+00,  1.1578e+27,  4.1666e+34,  5.3853e+08,  9.3150e-39]]) torch.float32
tensor([[1050516344,      32567, -398768224,      21967, 2030068992],
        [1308648816,    6647407, 1864397423,         48,          0]],
       dtype=torch.int32)
rand_tensor = torch.rand(3,3) ## Returns values uniformly distributed in [0, 1)
print(rand_tensor, rand_tensor.dtype)
rand_tensor = torch.rand(3,3, dtype=torch.float64)
print(rand_tensor)
rand_tensor = torch.randn(2,4) ## Returns values sampled from the standard normal distribution (mean 0, std 1)
print(rand_tensor)
rand_tensor = torch.randint(0,50, (2,4))
print(rand_tensor, rand_tensor.dtype)
tensor([[0.2726, 0.5872, 0.9229],
        [0.4583, 0.3478, 0.7291],
        [0.2445, 0.2438, 0.6744]]) torch.float32
tensor([[0.5062, 0.4037, 0.5659],
        [0.9345, 0.6080, 0.8464],
        [0.5945, 0.4958, 0.0328]], dtype=torch.float64)
tensor([[ 0.3224, -0.2225,  1.4369,  0.6437],
        [-0.8881, -0.0233,  1.9804,  0.6136]])
tensor([[ 5, 30, 26, 32],
        [ 4, 17, 26,  3]]) torch.int64
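
The values above will differ from run to run. If reproducible random tensors are needed, the global random seed can be fixed with torch.manual_seed() before calling any of the random creation functions (a quick aside, not part of the original outputs).

torch.manual_seed(123) ## Fix the seed so the random functions produce repeatable values
t1 = torch.rand(2,2)
torch.manual_seed(123) ## Resetting the same seed reproduces the same values
t2 = torch.rand(2,2)
print(torch.equal(t1, t2)) ## True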
zero_tensor = torch.zeros(2,4)
print(zero_tensor, zero_tensor.dtype)
zero_tensor = torch.zeros(2,4, dtype=torch.int32)
print(zero_tensor)
tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.]]) torch.float32
tensor([[0, 0, 0, 0],
        [0, 0, 0, 0]], dtype=torch.int32)
range_tensor = torch.arange(0,10,1)
print(range_tensor)
range_tensor = torch.arange(1,10,2)
print(range_tensor)
tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
tensor([1, 3, 5, 7, 9])
eye_tensor = torch.eye(3)
print(eye_tensor, eye_tensor.dtype)
eye_tensor = torch.eye(3,2)
print(eye_tensor, eye_tensor.dtype)
eye_tensor = torch.eye(3, dtype=torch.int32)
print(eye_tensor)
tensor([[1., 0., 0.],
        [0., 1., 0.],
        [0., 0., 1.]]) torch.float32
tensor([[1., 0.],
        [0., 1.],
        [0., 0.]]) torch.float32
tensor([[1, 0, 0],
        [0, 1, 0],
        [0, 0, 1]], dtype=torch.int32)
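
A couple of other creation helpers worth knowing (not shown in the original run) are torch.full(), which fills a tensor with a constant value, and torch.linspace(), which creates evenly spaced values over an interval.

full_tensor = torch.full((2,3), 7.0) ## 2x3 tensor filled with 7.0
print(full_tensor)
lin_tensor = torch.linspace(0, 1, steps=5) ## 5 evenly spaced values from 0 to 1
print(lin_tensor)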
tensor = torch.Tensor([1,2,3,4])
print(tensor)
tensor([1., 2., 3., 4.])
one_tensor = torch.ones(2,3)
print(one_tensor)
tensor([[1., 1., 1.],
        [1., 1., 1.]])
rand_tensor = torch.rand(2,3)
like_tensor = torch.rand_like(rand_tensor) ## *_like functions take another tensor as input and return a new tensor of the same size as the input
print(like_tensor)
tensor([[0.0544, 0.5047, 0.2185],
        [0.2285, 0.8365, 0.5912]])
torch.from_numpy(np.eye(5))
tensor([[1., 0., 0., 0., 0.],
        [0., 1., 0., 0., 0.],
        [0., 0., 1., 0., 0.],
        [0., 0., 0., 1., 0.],
        [0., 0., 0., 0., 1.]], dtype=torch.float64)
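
One detail worth noting about torch.from_numpy(): the resulting tensor shares memory with the source numpy array, so modifying one also modifies the other. torch.tensor() can be used instead when a copy is desired (a small sketch, not part of the original run).

arr = np.ones(3)
shared_tensor = torch.from_numpy(arr) ## Shares memory with arr
copied_tensor = torch.tensor(arr)     ## Copies the data
arr[0] = 5.0
print(shared_tensor) ## tensor([5., 1., 1.], dtype=torch.float64)
print(copied_tensor) ## tensor([1., 1., 1.], dtype=torch.float64)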

Properties of Tensor & few basic useful methods

like_tensor.shape, like_tensor.size(),
(torch.Size([2, 3]), torch.Size([2, 3]))
tensor.dim() ## Number of dimensions
1
tensor.requires_grad
False
tensor.device
device(type='cpu')
tensor.diag() ## Creates a diagonal matrix with the tensor's elements on the diagonal
tensor([[1., 0., 0., 0.],
        [0., 2., 0., 0.],
        [0., 0., 3., 0.],
        [0., 0., 0., 4.]])
tensor.dtype
torch.float32
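
A tensor's dtype can be changed with to() or with convenience methods like int(), float(), and double(); each returns a new tensor and leaves the original unchanged (a quick aside, not in the original outputs).

print(tensor.to(torch.float64).dtype) ## torch.float64
print(tensor.int().dtype)             ## torch.int32
print(tensor.double().dtype)          ## torch.float64
print(tensor.dtype)                   ## The original tensor is still torch.float32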
tensor = torch.rand(2,5)
print(tensor)
print(tensor.argmax()) ## Returns the index of the max element as if the tensor were flattened
print(tensor.argmax(0)) ## Index of the max element along dimension 0 (per column)
print(tensor.argmax(1)) ## Index of the max element along dimension 1 (per row)
tensor([[0.5269, 0.2705, 0.5362, 0.5909, 0.5753],
        [0.3483, 0.3009, 0.8224, 0.5473, 0.9958]])
tensor(9)
tensor([0, 1, 1, 0, 1])
tensor([3, 4])
tensor.min(), tensor.max()
(tensor(0.2705), tensor(0.9958))
tensor.max(dim=1) ## Returns the max elements along dimension 1 together with their indices within that dimension
(tensor([0.5909, 0.9958]), tensor([3, 4]))
tensor.numpy()
array([[0.5269252 , 0.27052522, 0.53622794, 0.59089255, 0.57531863],
       [0.34834433, 0.30090606, 0.8224293 , 0.547343  , 0.9957636 ]],
      dtype=float32)
tensor = tensor.unsqueeze(dim=0) ## Adds a dimension of size 1 at the specified index
print(tensor.shape)
tensor = tensor.squeeze(dim=0) ## Removes the dimension of size 1 at the specified index
print(tensor.shape)
torch.Size([1, 2, 5])
torch.Size([2, 5])
tensor.reshape(5,2), tensor.view(5,2)
(tensor([[0.5269, 0.2705],
         [0.5362, 0.5909],
         [0.5753, 0.3483],
         [0.3009, 0.8224],
         [0.5473, 0.9958]]), tensor([[0.5269, 0.2705],
         [0.5362, 0.5909],
         [0.5753, 0.3483],
         [0.3009, 0.8224],
         [0.5473, 0.9958]]))
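
reshape() and view() give the same result above, but they differ in one respect: view() requires the tensor's memory to be contiguous, while reshape() will copy the data if necessary. The sketch below (not part of the original run) shows this with a transposed tensor, which is non-contiguous.

transposed = tensor.t()                 ## Transposing makes the tensor non-contiguous
print(transposed.is_contiguous())       ## False
print(transposed.reshape(10))           ## Works; copies the data if needed
## transposed.view(10)                  ## Would raise a RuntimeError here
print(transposed.contiguous().view(10)) ## Works after making a contiguous copy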

Basic Operations

tensor = torch.rand(2,4)
tensor + 5
tensor([[5.5429, 5.8203, 5.7927, 5.5110],
        [5.1027, 5.4699, 5.4129, 5.6300]])
torch.add(tensor, tensor)
tensor([[1.0858, 1.6407, 1.5853, 1.0219],
        [0.2055, 0.9398, 0.8259, 1.2600]])
tensor.add(5)
tensor([[5.5429, 5.8203, 5.7927, 5.5110],
        [5.1027, 5.4699, 5.4129, 5.6300]])
tensor.add(tensor) ## Adds the given tensor element-wise; same result as torch.add(tensor, tensor) above
tensor([[1.0858, 1.6407, 1.5853, 1.0219],
        [0.2055, 0.9398, 0.8259, 1.2600]])
out_tensor = torch.empty(2,4)
torch.add(tensor, 7, out=out_tensor) ## Stores the addition result in the tensor provided to the out parameter
print(out_tensor)
out_tensor = torch.empty(2,4)
torch.add(tensor, tensor, out=out_tensor)
print(out_tensor)
tensor([[7.5429, 7.8203, 7.7927, 7.5110],
        [7.1027, 7.4699, 7.4129, 7.6300]])
tensor([[1.0858, 1.6407, 1.5853, 1.0219],
        [0.2055, 0.9398, 0.8259, 1.2600]])
summation = torch.sum(tensor)
print(summation, summation.item())
tensor(4.2825) 4.282470226287842
tensor * 10
tensor([[5.4292, 8.2035, 7.9266, 5.1097],
        [1.0273, 4.6990, 4.1293, 6.3001]])

Note: Please make a note that all tensor methods whose names end with '_' perform the operation in-place, modifying the tensor itself.

tensor = torch.rand(2,4)
tensor.add_(10)
tensor
tensor([[10.9791, 10.6864, 10.9307, 10.6719],
        [10.0180, 10.1153, 10.4360, 10.0266]])
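
To contrast the in-place add_() above with the out-of-place add(): add() returns a new tensor and leaves the original untouched, while add_() modifies the tensor itself (a small sketch, not part of the original run).

tensor = torch.zeros(2,2)
new_tensor = tensor.add(5) ## Out-of-place: returns a new tensor
print(tensor)              ## Still all zeros
tensor.add_(5)             ## In-place: modifies tensor itself
print(tensor)              ## Now all fives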
tensor = torch.rand(2,3)
print(tensor.t())
tensor = torch.mul(tensor, tensor)  ## Element-wise multiplication
print(tensor)
tensor = torch.matmul(tensor, tensor.t()) ## Matrix multiplication
print(tensor)
tensor([[0.8868, 0.8224],
        [0.8373, 0.6509],
        [0.2621, 0.0664]])
tensor([[0.7864, 0.7011, 0.0687],
        [0.6763, 0.4237, 0.0044]])
tensor([[1.1148, 0.8293],
        [0.8293, 0.6370]])
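
As an aside, Python's @ operator maps to matmul(), so the matrix multiplication above could equivalently be written with @ (not shown in the original run).

a = torch.rand(2,3)
print(torch.equal(a @ a.t(), torch.matmul(a, a.t()))) ## True - both compute the same product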
tensor.mean(), tensor.std(), tensor.var()
(tensor(0.8526), tensor(0.1969), tensor(0.0388))
tensor = torch.rand(2,4)
tensor.clamp(0.2,0.4) ## It's the same as numpy's clip function
tensor([[0.4000, 0.4000, 0.4000, 0.4000],
        [0.4000, 0.4000, 0.4000, 0.2000]])

Moving tensors between CPU and GPU

device = 'cuda' if torch.cuda.is_available() else 'cpu'
device
'cuda'
torch.device(device)
device(type='cuda')
tensor = torch.rand(2,4,device='cuda')
print(tensor.device)
cuda:0
tensor1 = torch.rand(2,5, device = torch.device(device))
print(tensor1.device)
tensor2 = torch.rand(2,5)
print(tensor2.device)
cuda:0
cpu
tensor3 = tensor2.to(device)
tensor3.device
device(type='cuda', index=0)

Operations cannot be performed between a tensor on the GPU and a tensor on the CPU. Both tensors must be on the same device.

tensor1 + tensor2
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-39-1dccf545ca40> in <module>()
----> 1 tensor1 + tensor2

RuntimeError: expected type torch.cuda.FloatTensor but got torch.FloatTensor

Tensors can be easily moved from GPU to CPU.

tensor4 = tensor1.cpu()
print(tensor4.device)
cpu
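
One related detail: numpy() only works on CPU tensors, so a tensor living on the GPU has to be moved back (with cpu() or to('cpu')) before converting it to a numpy array (a quick sketch, not part of the original run).

## tensor1.numpy()            ## Would raise an error because tensor1 lives on the GPU
print(tensor1.cpu().numpy())  ## Works after moving the tensor back to the CPU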