PyTorch Detach

Introduction to PyTorch Detach

PyTorch detach() creates a new tensor that shares its storage with the original tensor but has no gradient involved, so the returned tensor has no attachment to the current gradients. A gradient is not required here, and hence the result carries neither forward-mode gradients nor any other kind of gradient. The output is disconnected from the computational graph, and therefore the result has no gradient.
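
As a minimal sketch of this behavior (the tensor names here are only illustrative), detaching a tensor gives a view with requires_grad set to False while the underlying storage stays shared:

Code:

import torch

x = torch.ones(3, requires_grad=True)
y = x.detach()                        # new view, cut off from the autograd graph

print(y.requires_grad)                # False: no gradient will flow through y
print(x.data_ptr() == y.data_ptr())   # True: both tensors share the same storage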

PyTorch Detach Overview

  • A variable is detached from the gradient computational graph when fewer variables and functions need to be tracked. Mostly this is done when loss and accuracy have to be displayed after an epoch ends in a neural network, where the values are only consumed and the gradients no longer affect the results. Without detaching, all the intermediate results are stored, and hence more memory is required. Every operation performed within the detached statement is excluded from gradient tracking, and tracking does not resume for those values unless the detach call is removed.
  • When a tensor has to be removed from the computational graph, detach can be used. PyTorch performs automatic differentiation by tracking every operation so that gradients can be computed, which means a graph is built for all the operations and more memory is required. If we use detach, the returned tensor view is separated from the operations that follow, and all tracking stops for it. If tracking is needed again later, it has to be re-enabled, for example by calling requires_grad_(True) on the detached tensor.
  • We can also use detach().numpy(), which breaks the link to the computational graph directly while the rest of the program can still compute gradients with PyTorch. Here, however, the tensor is converted to a NumPy array, so gradient tracking is lost completely for that value (a short sketch follows this list).
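
The following is a small sketch of the detach().numpy() case mentioned above (variable names are illustrative):

Code:

import torch

t = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
# t.numpy() would raise a RuntimeError because t is still part of the graph;
# detaching first returns a NumPy array that shares memory but is not tracked.
arr = t.detach().numpy()
print(arr)    # [1. 2. 3.]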

How does Detach Work?

Let us see examples where detach is not used and where it is used.

Code:

import torch

a = torch.full((20,), 2.0, requires_grad=True)
b = a**4
c = a**6
i = (b + c).sum()
i.backward()
print(a.grad)

Here b equals a^4 and c equals a^6, so i equals a^4 + a^6. The derivative with respect to a is 4a^3 + 6a^5. With every element of a equal to 2, each gradient value is 4*2^3 + 6*2^5 = 224, so a.grad is a vector of 20 elements where each element has the value 224.

Here is another example, this time using detach.

Code:

import torch

a = torch.full((20,), 2.0, requires_grad=True)
b = a**3
c = a.detach()**6
i = (b + c).sum()
i.backward()
print(a.grad)

Here c does not contribute to the gradient because it is detached from the graph. The derivative is therefore just 3a^2, which is 3*2^2 = 12, so a.grad is a vector of 20 elements where all the elements have the value 12.

Code:

import torch

m = torch.arange(5., requires_grad=True)
n = m**2
o = m.detach()
o.zero_()
n.sum().backward()
print(m.grad)

An error is thrown here because o shares its storage with m, and the in-place o.zero_() call changes data that the backward pass of n = m**2 still needs; autograd detects the modification and refuses to compute the gradient. If we remove the o.zero_() call, we get the gradient value. The detach method does not create a copy of the data: when the detached tensor is modified in place, the original tensor is updated as well, because the two share storage. Gradients are simply blocked from flowing, so the data can be shared without gradients. Detach is useful when the tensor values are not needed in the computational graph.
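
For comparison, here is a sketch of the same snippet with the in-place write removed; the gradient of m**2 is simply 2*m:

Code:

import torch

m = torch.arange(5., requires_grad=True)
n = m**2
o = m.detach()        # shares storage with m, but nothing is modified in place
n.sum().backward()
print(m.grad)         # tensor([0., 2., 4., 6., 8.])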

PyTorch Detach Method

PyTorch keeps track of all the information and operations related to tensors so that it can compute gradients. This tracking takes the form of a graph, and the detach method creates a new view of a tensor for which gradients are not needed. Operations on the detached view are removed from tracking, so the graph no longer records results derived from it. We can use the torchviz package to visualize how the gradient is computed for a given tensor.

Code:

import torch
from torchviz import make_dot

a = torch.ones(5, requires_grad=True)
b = a**4
c = a**6
i = (b + c).sum()
make_dot(i).render("attached", format="jpg")

If detach is used instead, the operations on the detached view can no longer be tracked, and the program looks like this.

Code:

b=a**4
c=a.detach()**6
i=(b+c).sum()
make_dot(i).render("detached", format="jpg")

Here a.detach()**6 is no longer tracked, which is how the detach method works in PyTorch.

The detached tensor uses the same storage as the original one, so any modification made to it is visible in the original tensor as well. Forward-mode AD gradients are not recorded either, and the result will never carry forward gradients.
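
A short sketch of this storage sharing (the names used here are only illustrative):

Code:

import torch

a = torch.ones(3, requires_grad=True)
d = a.detach()
d += 1                                  # in-place change on the detached view
print(a)                                # tensor([2., 2., 2.], requires_grad=True)
print(a.data_ptr() == d.data_ptr())     # True: one underlying storage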

Example of PyTorch Detach

Given below is the example mentioned:

Code:

import torch

def samestorage(a, b):
    if a.storage().data_ptr() == b.storage().data_ptr():
        print("it is the same storage space")
    else:
        print("it is different storage space")

p = torch.ones((4, 5), requires_grad=True)
print(p)
q = p
r = p.data
s = p.detach()
t = p.data.clone()
u = p.clone()
v = p.detach().clone()
w = torch.empty_like(p).copy_(p)
x = torch.tensor(p)   # works, but warns that clone().detach() is preferred

If we need to copy-construct from a tensor, we can use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True). torch.tensor(sourceTensor) will not always work well because of the gradient issues mentioned above.

Code:

print("p:",end='');samestorage(p,p)
print("q:",end='');samestorage(p,q)
print("r:",end='');samestorage(p,r)
print("s:",end='');samestorage(p,s)
print("t:",end='');samestorage(p,t)
print("u:",end='');samestorage(p,u)
print("v:",end='');samestorage(p,v)
print("w:",end='');samestorage(p,w)

The output shows whether each result shares storage with p or uses different storage. PyTorch has close to a hundred tensor constructors, so a copy can be created in many ways, but some of them copy the gradient history along with the data. It is therefore better to use clone and detach together in the code, like this:

Code:

b = a.clone().detach()

Code:

import torch
import perfplot

perfplot.show(
    setup=lambda n: torch.randn(n),
    kernels=[
        lambda x: x.new_tensor(x),
        lambda x: x.clone().detach(),
        lambda x: torch.empty_like(x).copy_(x),
        lambda x: torch.tensor(x),
        lambda x: x.detach().clone(),
    ],
    labels=["new_tensor()", "clone().detach()", "empty_like().copy_()", "tensor()", "detach().clone()"],
    n_range=[2 ** k for k in range(20)],
    xlabel="len(x)",
    logx=False,
    logy=False,
    title="Timing comparison of PyTorch tensor copy methods",
)

We should not use the clone method alone, because the cloned tensor stays in the graph and any gradient computed through it propagates back to the original tensor. This can lead to errors that are hard to trace. Adding detach() disconnects the copy from the graph, so such errors do not occur.
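
The difference can be sketched as follows (illustrative names, not part of the example above):

Code:

import torch

a = torch.ones(3, requires_grad=True)

# clone() alone keeps the copy inside the graph, so gradients flow back to a
b = a.clone()
b.sum().backward()
print(a.grad)             # tensor([1., 1., 1.])

# clone().detach() cuts the copy off, so using it cannot affect a.grad
a.grad = None
c = a.clone().detach()
print(c.requires_grad)    # False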

Conclusion

If we want to copy a tensor and then cut it off from the computational graph, clone should be used together with detach. The code for detach is not complicated, but we should be clear about what is being detached from the graph and why.

Recommended Articles

This is a guide to PyTorch Detach. Here we discuss the introduction, an overview, how detach works, the detach method, and an example, respectively. You may also have a look at the following articles to learn more –

  1. PyTorch Versions
  2. torch.nn Module
  3. Tensorflow Basics
  4. Introduction to Tensorflow