
PyTorch Autograd

Updated April 7, 2023

Introduction to PyTorch Autograd

torch.autograd is PyTorch's automatic differentiation package. It implements automatic differentiation through classes and functions, where the differentiation is done on scalar-valued functions, and it is supported only for floating-point tensors. A tensor must be created with requires_grad=True for autograd to track it. Autograd is essentially the automatic differentiation engine that powers neural network training, and it handles nested functions by applying the chain rule.
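As a minimal sketch of what this looks like in practice (a generic example, not tied to any particular model), a tensor created with requires_grad=True has every operation on it recorded, and calling backward() on a scalar result fills in the tensor's .grad attribute:

import torch

# requires_grad=True asks autograd to record operations on t
t = torch.tensor([2.0, 3.0], requires_grad=True)

# A scalar-valued function of t
out = (t ** 2).sum()

# Automatic differentiation: d(out)/dt = 2 * t
out.backward()
print(t.grad)   # tensor([4., 6.])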

What is PyTorch Autograd?

Neural network training happens in two phases: forward propagation and backward propagation. In forward propagation, the input is run through the network to produce a prediction, the network's best guess at the correct output. In backward propagation, the parameters are adjusted based on the error of that guess: error derivatives (gradients) are collected with respect to the parameters, and the parameters are then optimized using gradient descent. The autograd package exposes this in two forms: torch.autograd.backward computes the sum of gradients of the given tensors with respect to the graph leaves, while torch.autograd.grad computes and returns the sum of gradients of outputs with respect to the inputs.
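The two entry points mentioned above can be sketched as follows; a minimal example, not tied to any particular network:

import torch

x = torch.tensor([1.0, 2.0], requires_grad=True)
out = (x ** 3).sum()

# torch.autograd.grad returns the sum of gradients of the output w.r.t. the inputs
(g,) = torch.autograd.grad(out, x, retain_graph=True)
print(g)        # tensor([ 3., 12.])  i.e. 3 * x**2

# torch.autograd.backward accumulates gradients into the .grad attribute of the leaves
torch.autograd.backward(out)
print(x.grad)   # tensor([ 3., 12.])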

Create PyTorch Autograd

As the first step, create two tensors with requires_grad=True. This makes autograd track all operations performed on them.

import torch
x = torch.tensor([1., 2.], requires_grad=True)
y = torch.tensor([5., 3.], requires_grad=True)

Another tensor must be created from the above tensors.

W = 3x³ – y²

Now, treating x and y as the parameters and W as the error, we can write the gradients of the error with respect to the parameters:

∂W/∂x = 9x²
∂W/∂y = -2y

Calling backward() on W computes these gradients and stores them in each tensor's .grad attribute. Because W is a vector rather than a scalar, a gradient argument has to be passed to backward(); it represents the gradient of W with respect to itself and must have the same shape as W:

dW/dW = 1

Alternatively, W can be reduced to a scalar first (for example with W.sum()), in which case no gradient argument is needed.

# W is a vector, so backward() needs an external gradient of the same shape
W = 3 * x ** 3 - y ** 2
ext_grad = torch.tensor([1., 1.])   # dW/dW = 1
W.backward(gradient=ext_grad)

# Check that the collected gradients match the analytical ones
print(9 * x ** 2 == x.grad)
print(-2 * y == y.grad)

# An output requires gradients only if at least one input has requires_grad=True
a = torch.rand(3, 3)
b = torch.rand(3, 3)
c = torch.rand((3, 3), requires_grad=True)
x = a + b
print(f"Does `x` need gradients?: {x.requires_grad}")
y = a + c
print(f"Does `y` need gradients?: {y.requires_grad}")

Freezing parameters is done like this:

import torchvision
from torch import nn, optim

model_new = torchvision.models.resnet18(pretrained=True)
for parameter in model_new.parameters():
    parameter.requires_grad = False

A new linear layer is needed now as the classifier; ResNet-18's final layer takes 512 input features.

model_new.fc = nn.Linear(512, 5)

The next step is to define the optimizer.

optimizer_req = optim.SGD(model_new.parameters(), lr=1e-5, momentum=0.5)
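A single fine-tuning step could then look roughly like this; a sketch in which the images and labels tensors are hypothetical stand-ins for a real data batch:

criterion = nn.CrossEntropyLoss()

images = torch.rand(8, 3, 224, 224)     # stand-in batch of images
labels = torch.randint(0, 5, (8,))      # stand-in class labels

optimizer_req.zero_grad()               # clear any old gradients
outputs = model_new(images)             # forward pass
loss = criterion(outputs, labels)       # compute the loss
loss.backward()                         # backward pass: only the new fc layer gets gradients
optimizer_req.step()                    # update the unfrozen parameters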

Explanation of PyTorch Autograd

Autograd keeps a record of the data (tensors) and of all executed operations in a directed acyclic graph (DAG) made up of Function objects. In this graph, the input tensors are the leaves and the output tensors are the roots, and the gradients can be computed by tracing the graph from roots to leaves with the chain rule. During the forward pass, autograd runs the requested operation to compute the result tensor and, at the same time, stores the operation's gradient function in the DAG.
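The gradient functions stored in the DAG can be inspected through a tensor's grad_fn attribute; a small sketch:

import torch

x = torch.ones(2, requires_grad=True)   # leaf tensor
y = x * 3                               # intermediate node
z = y.sum()                             # root of the DAG

print(x.is_leaf, x.grad_fn)   # True None   (leaves have no grad_fn)
print(y.grad_fn)              # <MulBackward0 object ...>
print(z.grad_fn)              # <SumBackward0 object ...>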

The backward pass starts when backward() is called on the root of the DAG. Autograd then computes the gradients from each operation's gradient function, accumulates them in the respective tensors' .grad attributes, and propagates them all the way to the leaf tensors using the chain rule. The graph is rebuilt from scratch on every iteration: after each backward() call it is freed, and a new graph is populated during the next forward pass. This is what allows Python control flow statements in the model, since the shape and size of the graph can change from one iteration to the next.
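Because the graph is rebuilt on each forward pass, ordinary Python control flow can change its shape from one iteration to the next; a small sketch with a hypothetical forward helper:

import torch

def forward(x, n_repeats):
    # The number of multiplications recorded in the graph depends on n_repeats
    out = x
    for _ in range(n_repeats):
        out = out * x
    return out.sum()

x = torch.tensor([1.0, 2.0], requires_grad=True)

for n in (1, 3):
    loss = forward(x, n)    # a differently shaped graph each time
    loss.backward()
    print(n, x.grad)
    x.grad = None           # reset the accumulated gradients between runs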

torch.autograd tracks all operations performed on a tensor when the tensor is created with requires_grad=True, and this tracking is what builds the DAG. When requires_grad=False, operations on the tensor are not tracked and no DAG is recorded for them. The output of an operation requires gradients only if at least one of its inputs has requires_grad=True.

Frozen parameters are parameters that do not compute gradients. It is useful to freeze parameters when we know beforehand that their gradients will not be needed. In fine-tuning, for instance, most of the pretrained model is frozen and gradients are computed only for the layers that make predictions on the new labels; in the example above, the only parameters the optimizer actually updates are the weights and bias of the new classifier layer.

PyTorch Autograd Examples

The forward pass defines a computational graph in which the nodes are tensors and the edges are the functions that produce output tensors from input tensors. Backpropagation through this graph then lets us compute gradients for all the tensors easily. The example below fits a third-order polynomial to a sine wave.

Code:

import torch
import math

datatype = torch.float
device = torch.device("cpu")

# Input values and target: a sine wave sampled on [-pi, pi]
a = torch.linspace(-math.pi, math.pi, 1500, device=device, dtype=datatype)
b = torch.sin(a)

# Polynomial coefficients, initialized randomly; autograd tracks them
m = torch.randn((), device=device, dtype=datatype, requires_grad=True)
n = torch.randn((), device=device, dtype=datatype, requires_grad=True)
o = torch.randn((), device=device, dtype=datatype, requires_grad=True)
p = torch.randn((), device=device, dtype=datatype, requires_grad=True)

# A small learning rate keeps plain gradient descent stable here
learning_rate = 1e-6
for k in range(1500):
    # Forward pass: predict b with a third-order polynomial in a
    b_pred = m + n * a + o * a ** 2 + p * a ** 3
    loss_fn = (b_pred - b).pow(2).sum()
    if k % 100 == 99:
        print(k, loss_fn.item())

    # Backward pass: compute gradients of the loss w.r.t. m, n, o, p
    loss_fn.backward()

    # Update the coefficients with gradient descent, then reset the gradients
    with torch.no_grad():
        m -= learning_rate * m.grad
        n -= learning_rate * n.grad
        o -= learning_rate * o.grad
        p -= learning_rate * p.grad
        m.grad = None
        n.grad = None
        o.grad = None
        p.grad = None

print(f'Result: b = {m.item()} + {n.item()} a + {o.item()} a^2 + {p.item()} a^3')

class Legend(torch.autograd.Function):
    @staticmethod
    def forward(ctx, ins):
        # Save the input for use in the backward pass
        ctx.save_for_backward(ins)
        return 0.7 * (7 * ins ** 3 - 3 * ins)

    @staticmethod
    def backward(ctx, grad_outs):
        # Gradient of 0.7 * (7 * ins ** 3 - 3 * ins) with respect to ins
        ins, = ctx.saved_tensors
        return grad_outs * 0.7 * (21 * ins ** 2 - 3)
datatype = torch.float
device = torch.device("cpu")

a = torch.linspace(-math.pi, math.pi, 1500, device=device, dtype=datatype)
b = torch.sin(a)

# Coefficients initialized close to the expected solution
m = torch.full((), 0.0, device=device, dtype=datatype, requires_grad=True)
n = torch.full((), -1.0, device=device, dtype=datatype, requires_grad=True)
o = torch.full((), 0.0, device=device, dtype=datatype, requires_grad=True)
p = torch.full((), 0.3, device=device, dtype=datatype, requires_grad=True)

learning_rate = 1e-6
for k in range(1500):
    # Forward pass uses the custom autograd Function via .apply
    b_pred = m + n * Legend.apply(o + p * a)
    loss_fn = (b_pred - b).pow(2).sum()
    if k % 100 == 99:
        print(k, loss_fn.item())

    loss_fn.backward()

    with torch.no_grad():
        m -= learning_rate * m.grad
        n -= learning_rate * n.grad
        o -= learning_rate * o.grad
        p -= learning_rate * p.grad

        # Reset gradients before the next iteration
        m.grad = None
        n.grad = None
        o.grad = None
        p.grad = None

Conclusion

Autograd requires only small changes to existing PyTorch code, so gradients can be computed easily. Tensor operations written in ordinary Python are differentiated automatically, almost all Python control flow can be handled, and even derivatives of derivatives (higher-order gradients) can be taken from the gradients and tensors in the code.
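As a small illustration of higher-order gradients, torch.autograd.grad with create_graph=True keeps the graph of the first derivative so it can be differentiated again; a minimal sketch:

import torch

x = torch.tensor(2.0, requires_grad=True)
y = x ** 3

# First derivative: dy/dx = 3x^2 = 12; keep its graph for a second pass
(dy_dx,) = torch.autograd.grad(y, x, create_graph=True)
print(dy_dx)      # tensor(12., grad_fn=...)

# Second derivative: d2y/dx2 = 6x = 12
(d2y_dx2,) = torch.autograd.grad(dy_dx, x)
print(d2y_dx2)    # tensor(12.)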

Recommended Articles

We hope that this EDUCBA information on “PyTorch Autograd” was beneficial to you. You can view EDUCBA’s recommended articles for more information.

  1. What is PyTorch?
  2. PyTorch Versions
  3. Tensorflow vs Pytorch
  4. TypeScript while loop