Updated April 7, 2023

Introduction to PyTorch MaxPool2d

PyTorch MaxPool2d is a class in the torch.nn module that applies 2D max pooling over an input signal composed of several input planes (channels). Its constructor accepts the parameters kernel_size, stride, padding, dilation, return_indices, and ceil_mode.

In this article, we will look at what PyTorch MaxPool2d is, how to use it, its class definition, some examples, and a conclusion.

What is PyTorch MaxPool2d?

PyTorch MaxPool2d is a class in the torch.nn module, and its complete definition is:

class torch.nn.MaxPool2d(kernel_size, stride=None, padding=0, dilation=1, return_indices=False, ceil_mode=False)

In the simplest case, suppose the input has size (N, C, H, W), the output has size (N, C, H_out, W_out), and the kernel has size (kH, kW). The output value at position (h, w) for sample N_i and channel C_j is then described by the equation below –

$$
\text{out}(N_i, C_j, h, w) = \max_{m=0,\ldots,kH-1} \; \max_{n=0,\ldots,kW-1} \text{input}(N_i, C_j, \text{stride}[0] \times h + m, \text{stride}[1] \times w + n)
$$
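The equation can be checked against a direct loop implementation. The following is a minimal sketch that assumes no padding and no dilation; the helper name naive_max_pool2d is hypothetical and exists only for illustration.

```python
import torch
import torch.nn as nn


def naive_max_pool2d(x, kernel_size, stride):
    """Loop-based max pooling without padding or dilation (illustrative only)."""
    kH, kW = kernel_size
    sH, sW = stride
    N, C, H, W = x.shape
    H_out = (H - kH) // sH + 1
    W_out = (W - kW) // sW + 1
    out = torch.empty(N, C, H_out, W_out, dtype=x.dtype)
    for h in range(H_out):
        for w in range(W_out):
            # Window rows stride[0]*h .. stride[0]*h + kH - 1, and likewise for columns
            window = x[:, :, sH * h : sH * h + kH, sW * w : sW * w + kW]
            out[:, :, h, w] = window.amax(dim=(2, 3))
    return out


torch.manual_seed(0)
x = torch.randn(2, 3, 7, 6)
expected = nn.MaxPool2d(kernel_size=(3, 2), stride=(2, 1))(x)
result = naive_max_pool2d(x, (3, 2), (2, 1))
matches = torch.equal(result, expected)
print(result.shape, matches)
```

The loop is far slower than the built-in layer and is only meant to make the indexing in the equation concrete.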

When padding is non-zero, the input is implicitly padded with negative infinity on both sides for the given number of points, so a padded position can never be selected as the maximum. The dilation parameter controls the spacing between the points of the kernel: with dilation = 1 the window covers adjacent elements, while larger values skip elements between the sampled points.
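Both effects can be observed on small tensors. The sketch below uses illustrative inputs chosen for this article: an all-negative tensor to show that the padding does not behave like zeros, and a 5 x 5 ramp to show which elements a dilated window samples.

```python
import torch
import torch.nn as nn

# All-negative input: if the padding were zeros, the border outputs would be 0.
x = -torch.ones(1, 1, 4, 4)
padded_pool = nn.MaxPool2d(kernel_size=3, stride=1, padding=1)
y = padded_pool(x)
print(y.shape)         # spatial size is preserved: 4 x 4
print(y.unique())      # only -1 appears, so the padding never wins the max

# Dilation: a 2 x 2 kernel with dilation=2 samples rows h, h+2 and columns w, w+2.
z = torch.arange(25, dtype=torch.float32).reshape(1, 1, 5, 5)
dilated_pool = nn.MaxPool2d(kernel_size=2, stride=1, dilation=2)
d = dilated_pool(z)
print(d.shape)         # 3 x 3 output
print(d[0, 0, 0, 0])   # max of z at (0,0), (0,2), (2,0), (2,2), which is 12
```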

When ceil_mode is set to True, sliding windows are allowed to go off the bounds of the input as long as they start within the input or the left padding. Sliding windows that would start in the right padded region are ignored.
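A small comparison makes the effect of ceil mode visible. This is a sketch on a 6 x 6 input, with sizes chosen so that the last window does not fit completely.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(1, 1, 6, 6)

floor_pool = nn.MaxPool2d(kernel_size=3, stride=2)                 # default: floor
ceil_pool = nn.MaxPool2d(kernel_size=3, stride=2, ceil_mode=True)  # allow a partial window

floor_out = floor_pool(x)
ceil_out = ceil_pool(x)
print(floor_out.shape)  # 2 x 2: windows start at rows/columns 0 and 2
print(ceil_out.shape)   # 3 x 3: an extra window starts at row/column 4

# The extra window hangs off the edge, so it only sees rows 4-5 and columns 4-5.
partial_max = x[0, 0, 4:6, 4:6].max()
print(torch.equal(ceil_out[0, 0, 2, 2], partial_max))
```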

How do we use PyTorch MaxPool2d?

While using PyTorch MaxPool2d, you need to specify certain parameters as specified in its class definition. Let’s discuss how you are supposed to pass the parameters and what each one of them does to understand how we can use MaxPool2d.

The parameters kernel_size, stride, padding, and dilation can each take either of the following kinds of value –

A single integer (int) – The same value is used for both the height and the width dimension.

A tuple of two integers (int) – The first integer is used for the height dimension and the second one for the width dimension.
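The two forms can be compared directly. In this minimal sketch the layer sizes and the input shape are arbitrary values picked for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(4, 8, 28, 28)

# A single int applies to both height and width...
pool_int = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
# ...so it is equivalent to repeating the value in a (height, width) tuple.
pool_tuple = nn.MaxPool2d(kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))

out_int = pool_int(x)
out_tuple = pool_tuple(x)
same = torch.equal(out_int, out_tuple)
print(out_int.shape, same)
```

The tuple form is only needed when height and width should be treated differently, for example a (3, 2) window.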

Let us describe what each of the parameters does –

kernel_size – The size of the window to take the maximum over.
stride – The step by which the window moves. When not specified, it defaults to kernel_size.
padding – The amount of implicit negative infinity padding added on both sides of the input.
dilation – Controls the stride between the elements inside the window.
return_indices – When set to True, the layer returns the indices of the maximum values along with the outputs. These indices are needed later by torch.nn.MaxUnpool2d.
ceil_mode – When set to True, ceil is used instead of floor to compute the output shape.
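The link between the returned indices and torch.nn.MaxUnpool2d can be sketched on a 4 x 4 tensor. The input values below are chosen only so that the result is easy to read.

```python
import torch
import torch.nn as nn

pool = nn.MaxPool2d(kernel_size=2, stride=2, return_indices=True)
unpool = nn.MaxUnpool2d(kernel_size=2, stride=2)

x = torch.tensor([[[[1.0, 2.0, 3.0, 4.0],
                    [5.0, 6.0, 7.0, 8.0],
                    [9.0, 10.0, 11.0, 12.0],
                    [13.0, 14.0, 15.0, 16.0]]]])

# With return_indices=True the layer returns a (values, indices) pair.
pooled, indices = pool(x)
print(pooled)    # maxima of each 2 x 2 block: 6, 8, 14, 16
print(indices)   # flat positions of those maxima within each 4 x 4 plane

# MaxUnpool2d puts each maximum back at its recorded position and fills the rest with zeros.
restored = unpool(pooled, indices)
print(restored)
```

The unpooled tensor is not the original input: every value that was not a maximum is lost and replaced by zero.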

PyTorch MaxPool2d Class

The PyTorch MaxPool2d class has the following definition –

class torch.nn.MaxPool2d(kernel_size, stride=None, padding=0, dilation=1, return_indices=False, ceil_mode=False)

Where the parameters used are already described above. The working of MaxPool2d requires input and output whose shapes can be defined as –

The shape of the input is (N, C, H_in, W_in) or (C, H_in, W_in)

The shape of the output is (N, C, H_out, W_out) or (C, H_out, W_out)

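H_out and W_out are calculated with the formulas below, where the brackets denote rounding down (floor) –

$$
H_{out} = \left\lfloor \frac{H_{in} + 2 \times \text{padding}[0] - \text{dilation}[0] \times (\text{kernel\_size}[0] - 1) - 1}{\text{stride}[0]} + 1 \right\rfloor
$$

$$
W_{out} = \left\lfloor \frac{W_{in} + 2 \times \text{padding}[1] - \text{dilation}[1] \times (\text{kernel\_size}[1] - 1) - 1}{\text{stride}[1]} + 1 \right\rfloor
$$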
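The output size along one dimension can be computed in plain Python without creating a layer. The helper below is a hypothetical sketch, not part of PyTorch; the extra ceil mode correction reflects the rule that a window may not start in the right padded region.

```python
def pool_output_size(size_in, kernel_size, stride=None, padding=0,
                     dilation=1, ceil_mode=False):
    """Output length along one spatial dimension (illustrative helper)."""
    stride = kernel_size if stride is None else stride
    numerator = size_in + 2 * padding - dilation * (kernel_size - 1) - 1
    if ceil_mode:
        out = -(-numerator // stride) + 1  # integer ceiling division
        # Drop the last window if it would start beyond the input plus left padding.
        if (out - 1) * stride >= size_in + padding:
            out -= 1
    else:
        out = numerator // stride + 1      # integer floor division
    return out


# A 50 x 32 input with a (3, 2) kernel and a (2, 1) stride
h_out = pool_output_size(50, kernel_size=3, stride=2)
w_out = pool_output_size(32, kernel_size=2, stride=1)
print(h_out, w_out)  # 24 31
```

Calling the helper once per dimension with the height and width entries of each parameter gives the full output shape.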

PyTorch MaxPool2d Examples

Let us consider certain examples that will help us understand the implementation of MaxPool2d in PyTorch –

Code:

import torch
import torch.nn as nn

# Square 3 x 3 window with a stride of 2
sampleEducbaSquarePool = nn.MaxPool2d(3, stride=2)
# Non-square 3 x 2 window with a stride of (2, 1)
sampleEducbaMatrix = nn.MaxPool2d((3, 2), stride=(2, 1))
sampleEducbaInput = torch.randn(20, 16, 50, 32)
sampleEducbaOutput = sampleEducbaMatrix(sampleEducbaInput)
print(sampleEducbaOutput.shape)  # torch.Size([20, 16, 24, 31])

The output of the above program is a scaled-down version of the input. The batch and channel dimensions are unchanged, while the 50 x 32 spatial dimensions are reduced to 24 x 31 by the (3, 2) window moving with a stride of (2, 1), so the output shape is (20, 16, 24, 31).

The size of the kernel determines the small window that is pooled over, while the stride determines the step by which that window moves. In short, the amount of downscaling is mostly determined by the stride value that we specify.

Conclusion

We can use the MaxPool2d class in PyTorch to scale the input layer down to a smaller output. Its behavior is controlled by parameters such as kernel_size, stride, padding, and dilation. To go in the opposite direction, torch.nn.MaxUnpool2d takes a small pooled input together with the indices of the maximum values and produces a scaled-up output.