PyTorch Autograd

Introduction to PyTorch Autograd

The automatic differentiation package, autograd, provides classes and functions that compute derivatives of scalar-valued functions automatically. Autograd is supported only for floating-point tensors, and a tensor must be created with the requires_grad=True keyword for its operations to be tracked. Autograd is essentially the automatic differentiation engine that powers neural network training in PyTorch, applying the chain rule through the nested functions that make up a computation.
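
As a minimal sketch of this idea, calling backward() on the result of a simple scalar function fills in the derivative (the variable names and values below are only illustrative):

import torch

# y = x**2, so dy/dx = 2*x = 6 at x = 3
x = torch.tensor(3.0, requires_grad=True)
y = x ** 2
y.backward()
print(x.grad)   # tensor(6.)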

What is PyTorch Autograd?

Neural networks are trained in two phases: forward propagation and backward propagation. In the forward pass, the input is run through the network to produce a prediction, which is compared against the correct output. In the backward pass, the error derivatives are collected as gradients of the loss with respect to the parameters, and the parameters are then adjusted using gradient descent. torch.autograd computes the gradients of the outputs with respect to the inputs, either directly for a scalar output or as a vector-Jacobian product when the output is itself a tensor.
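
A minimal sketch of a single training step, using a hypothetical one-layer model and made-up data, could look like this:

import torch
from torch import nn, optim

# Hypothetical toy data and a single linear layer
inputs = torch.randn(8, 4)
targets = torch.randn(8, 1)
model = nn.Linear(4, 1)
optimizer = optim.SGD(model.parameters(), lr=0.01)

prediction = model(inputs)                          # forward pass
loss = nn.functional.mse_loss(prediction, targets)  # error against the correct output
loss.backward()                                     # backward pass: autograd collects gradients
optimizer.step()                                    # gradient descent update
optimizer.zero_grad()                               # clear gradients for the next step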

Create PyTorch Autograd

As the first step, create two tensors with requires_grad=True. This makes autograd track every operation performed on them.

import torch
x = torch.tensor([1., 2.], requires_grad=True)
y = torch.tensor([5., 3.], requires_grad=True)


Another tensor is then created from the two tensors above:

W = 3x³ − y²

Now, treating x and y as the parameters and W as the error, we write the gradients of the error with respect to the parameters:


∂W/∂x = 9x²
∂W/∂y = −2y

When W.backward() is called, these gradients are calculated and stored in each tensor's .grad attribute. Because W is a vector rather than a scalar, an explicit gradient argument of the same shape as W must be passed to backward(); it represents the gradient of W with respect to itself:

dW/dW = 1

Alternatively, W can first be reduced to a scalar (for example by summing it), in which case no gradient argument is needed.

# W is built from x and y, so autograd records the operations
W = 3 * x ** 3 - y ** 2
# gradient of W with respect to itself: a tensor of ones, same shape as W
ext_grad = torch.tensor([1., 1.])
W.backward(gradient=ext_grad)
print(9 * x ** 2 == x.grad)
print(-2 * y == y.grad)

# requires_grad propagates: an output tracks gradients if any input does
a = torch.rand(3, 3)
b = torch.rand(3, 3)
c = torch.rand((3, 3), requires_grad=True)
x = a + b
print(f"Does `x` need gradients?: {x.requires_grad}")
y = a + c
print(f"Does `y` need gradients?: {y.requires_grad}")

Freezing parameters is done like this:

import torchvision
from torch import nn, optim

model_new = torchvision.models.resnet18(pretrained=True)
for parameter in model_new.parameters():
    parameter.requires_grad = False

A new linear layer is needed now; replacing the final fully connected layer gives the model a fresh, trainable classifier head:

model_new.fc = nn.Linear(model_new.fc.in_features, 5)   # resnet18's fc takes 512 input features

The next step is to construct the optimizer. All parameters are registered with it, but only the new classifier's weights and bias compute gradients and get updated:

Optimizer_req = optim.SGD(model_new.parameters(), lr=1e-5, momentum=0.5)

PyTorch Autograd Explained

Autograd records the input data and every executed operation in a directed acyclic graph (DAG) whose nodes are Function objects. Input tensors are the leaves of this graph and output tensors are the roots, so all gradients can be computed by applying the chain rule from the roots back to the leaves. During the forward pass, the requested operation is run to compute the result tensor, and the operation's gradient function is recorded in the DAG.
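
As a quick sketch, the gradient function recorded for each operation can be inspected through a tensor's grad_fn attribute (the tensors below are only illustrative):

import torch

u = torch.tensor([1., 2.], requires_grad=True)   # a leaf of the DAG
v = u * 3                                        # recorded with a MulBackward0 node
w = v.sum()                                      # recorded with a SumBackward0 node
print(u.is_leaf, u.grad_fn)   # True None
print(v.is_leaf, v.grad_fn)   # False <MulBackward0 ...>
print(w.grad_fn)              # <SumBackward0 ...>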

When backward() is called on the DAG root, the backward pass starts: the gradients are computed from each node's gradient function, accumulated into the respective tensors' .grad attributes, and propagated all the way to the leaf tensors using the chain rule. The graph is recreated from scratch on every iteration; once the backward call happens it is discarded, and the next forward pass populates a new one. This is what allows control flow statements in the model, since the shape and size of the graph can change from one iteration to the next.
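
A small sketch of this dynamic behaviour, with a made-up data-dependent loop, shows that the graph built on each call depends on the actual values flowing through it:

import torch

def repeated_double(x):
    # The number of recorded operations depends on the data itself
    while x.norm() < 10:
        x = x * 2
    return x.sum()

t = torch.tensor([0.5, 1.0], requires_grad=True)
out = repeated_double(t)
out.backward()
print(t.grad)   # reflects however many doublings actually ran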

torch.autograd tracks all operations on a tensor when requires_grad=True is set on it, and uses this record to build the DAG. When requires_grad=False, operations on that tensor are not tracked and no DAG is built for it. The output of an operation requires gradients only if at least one of its inputs has requires_grad=True.
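
Tracking can also be switched off locally, for example inside torch.no_grad() or by detaching a tensor; a brief sketch:

import torch

s = torch.ones(2, requires_grad=True)
with torch.no_grad():
    frozen = s * 2            # computed without recording graph nodes
print(frozen.requires_grad)   # False

detached = (s * 2).detach()   # same values, cut off from the graph
print(detached.requires_grad) # False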

Frozen parameters are those that do not compute gradients. It is therefore useful to freeze parameters when we know beforehand that their gradients will not be needed. In finetuning, for example, we freeze most of the model and compute gradients only for the layers that make predictions on the new labels; although every parameter is registered in the optimizer, the only ones that actually receive gradients and updates are the weights and bias of the classifier.
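
Continuing the resnet18 sketch above, a quick check confirms that only the new classifier layer remains trainable (the variable names mirror the earlier snippet):

trainable = [p for p in model_new.parameters() if p.requires_grad]
print([name for name, p in model_new.named_parameters() if p.requires_grad])   # ['fc.weight', 'fc.bias']
Optimizer_req = optim.SGD(trainable, lr=1e-5, momentum=0.5)   # equivalent: pass only the trainable parameters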

PyTorch Autograd Examples

The forward pass defines a computational graph in which the nodes are tensors and the edges are the functions that produced them from their inputs. Backpropagation through this graph then computes the gradients of all the tensors. The example here fits a third-order polynomial to a sine wave.

import torch
import math

datatype = torch.float
device = torch.device("cpu")

a = torch.linspace(-math.pi, math.pi, 1500, device=device, dtype=datatype)
b = torch.sin(a)

m = torch.randn((), device=device, dtype=datatype, requires_grad=True)
n = torch.randn((), device=device, dtype=datatype, requires_grad=True)
o = torch.randn((), device=device, dtype=datatype, requires_grad=True)
p = torch.randn((), device=device, dtype=datatype, requires_grad=True)

lrning_rate = 1e-6  # a small step size keeps the fit stable
for k in range(1500):
    # Forward pass: third-order polynomial prediction and squared-error loss
    b_pred = m + n * a + o * a ** 2 + p * a ** 3
    loss_fn = (b_pred - b).pow(2).sum()
    if k % 100 == 99:
        print(k, loss_fn.item())
    # Backward pass: autograd fills m.grad, n.grad, o.grad and p.grad
    loss_fn.backward()
    with torch.no_grad():
        m -= lrning_rate * m.grad
        n -= lrning_rate * n.grad
        o -= lrning_rate * o.grad
        p -= lrning_rate * p.grad
        # Reset the gradients before the next iteration
        m.grad = None
        n.grad = None
        o.grad = None
        p.grad = None
print(f'Result: b = {m.item()} + {n.item()} a + {o.item()} a^2 + {p.item()} a^3')

class Legend(torch.autograd.Function):
    @staticmethod
    def forward(ctx, ins):
        ctx.save_for_backward(ins)
        return 0.7 * (7 * ins ** 3 - 3 * ins)

    @staticmethod
    def backward(ctx, grad_outs):
        ins, = ctx.saved_tensors
        # Derivative of 0.7 * (7x^3 - 3x) is 2.1 * (7x^2 - 1)
        return grad_outs * 2.1 * (7 * ins ** 2 - 1)

datatype = torch.float
device = torch.device("cpu")

a = torch.linspace(-math.pi, math.pi, 1500, device=device, dtype=datatype)
b = torch.sin(a)

m = torch.full((), 0.0, device=device, dtype=datatype, requires_grad=True)
n = torch.full((), -1.0, device=device, dtype=datatype, requires_grad=True)
o = torch.full((), 0.0, device=device, dtype=datatype, requires_grad=True)
p = torch.full((), 0.3, device=device, dtype=datatype, requires_grad=True)

learning_rate = 1e-6
for k in range(1500):
    # Forward pass through the custom Function, then the same squared-error loss
    b_pred = m + n * Legend.apply(o + p * a)
    loss_fn = (b_pred - b).pow(2).sum()
    loss_fn.backward()
    with torch.no_grad():
        m -= learning_rate * m.grad
        n -= learning_rate * n.grad
        o -= learning_rate * o.grad
        p -= learning_rate * p.grad
        m.grad = None
        n.grad = None
        o.grad = None
        p.grad = None
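
The backward of a custom Function can also be verified numerically. As a small optional sketch, torch.autograd.gradcheck compares the analytical gradient defined above against finite differences; it expects double-precision inputs:

check_in = torch.randn(5, dtype=torch.double, requires_grad=True)
print(torch.autograd.gradcheck(Legend.apply, (check_in,)))   # True if backward matches the numerical gradient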

Conclusion

Autograd requires only small changes to existing PyTorch code, so gradients can be computed with very little effort. Ordinary Python code operating on tensors can be differentiated, and because the graph is built dynamically, almost any Python control flow is handled naturally. Higher-order derivatives can also be taken from the gradients and tensors in the code.

This is a guide to PyTorch Autograd. Here we discussed the introduction, what PyTorch Autograd is, how to create it, how it works, and examples.
