Pooling
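max_pool2d down-samples a spatial feature map by taking the maximum value within each pooling window. With the default stride (equal to kernel_size) the windows do not overlap; a smaller stride makes them overlap. Pooling reduces spatial resolution while preserving the strongest activations, making the representation more compact and less sensitive to small translations of the input. During backpropagation, the gradient is routed only to the position that held the maximum in the forward pass; every other position in the window receives zero gradient.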
import simplegrad as sg
x = sg.normal((1, 16, 28, 28), requires_grad=True)
out = sg.max_pool2d(x, kernel_size=2, stride=2)
# out.shape == (1, 16, 14, 14)
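To make the gradient routing described above concrete, here is a minimal sketch in plain Python on a single 2D feature map, with no padding. It is not simplegrad's implementation: the helper names max_pool2d_forward and max_pool2d_backward are hypothetical, and breaking ties in favour of the first maximum is an assumption of this sketch.

```python
def max_pool2d_forward(fmap, k, s):
    """Max-pool a 2D list of lists; return (pooled values, argmax positions)."""
    h, w = len(fmap), len(fmap[0])
    out_h = (h - k) // s + 1
    out_w = (w - k) // s + 1
    out, argmax = [], []
    for i in range(out_h):
        out_row, arg_row = [], []
        for j in range(out_w):
            # All (row, col) input positions covered by this window.
            window = [(r, c)
                      for r in range(i * s, i * s + k)
                      for c in range(j * s, j * s + k)]
            # max() returns the first maximum, so ties go to the earliest position.
            best = max(window, key=lambda rc: fmap[rc[0]][rc[1]])
            out_row.append(fmap[best[0]][best[1]])
            arg_row.append(best)
        out.append(out_row)
        argmax.append(arg_row)
    return out, argmax


def max_pool2d_backward(grad_out, argmax, in_shape):
    """Route each output gradient to the input position that won its window."""
    h, w = in_shape
    grad_in = [[0.0] * w for _ in range(h)]
    for i, row in enumerate(argmax):
        for j, (r, c) in enumerate(row):
            # Accumulate: with overlapping windows one input can win more than once.
            grad_in[r][c] += grad_out[i][j]
    return grad_in


fmap = [
    [1.0, 3.0, 2.0, 0.0],
    [4.0, 2.0, 1.0, 5.0],
    [0.0, 1.0, 7.0, 2.0],
    [6.0, 2.0, 3.0, 1.0],
]
pooled, argmax = max_pool2d_forward(fmap, k=2, s=2)
grad_in = max_pool2d_backward([[1.0, 1.0], [1.0, 1.0]], argmax, (4, 4))
print(pooled)   # [[4.0, 5.0], [6.0, 7.0]]
print(grad_in)  # non-zero only at the four winning positions
```

Only the four positions that held a window maximum receive gradient; the other twelve entries of grad_in stay at zero.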
max_pool2d(x: Tensor, kernel_size: int | tuple[int, int], stride: int | tuple[int, int] = None, pad_width: int | tuple[int, int, int] = 0, pad_mode: str = 'constant', pad_value: int = 0) -> Tensor
Apply 2D max pooling over the input tensor.
Parameters:
- x (Tensor) – Input tensor of shape (batch, channels, H, W) or (channels, H, W).
- kernel_size (int | tuple[int, int]) – Pooling window size. Int or (kH, kW).
- stride (int | tuple[int, int], default: None) – Step between pooling windows. Int or (sH, sW). Defaults to kernel_size if not specified.
- pad_width (int | tuple[int, int, int], default: 0) – Padding before pooling. Int (all sides) or (top, bottom, left, right).
- pad_mode (str, default: 'constant') – Padding mode. Defaults to "constant".
- pad_value (int, default: 0) – Fill value for constant padding. Defaults to 0.
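The way kernel_size, stride, and pad_width combine to set the output height and width can be sketched with the usual pooling size formula. The helper pooled_size below is hypothetical and not part of simplegrad. It assumes incomplete windows at the border are dropped (floor rounding), which this page does not state, so the result is worth checking against out.shape.

```python
def pooled_size(size, kernel, stride=None, pad_before=0, pad_after=0):
    """Output length along one spatial axis, assuming floor rounding."""
    if stride is None:
        # Mirrors the documented default: stride falls back to kernel_size.
        stride = kernel
    padded = size + pad_before + pad_after
    if padded < kernel:
        raise ValueError("pooling window is larger than the padded input")
    return (padded - kernel) // stride + 1


# 28x28 input, 2x2 window, default stride: matches the (1, 16, 14, 14) example.
print(pooled_size(28, 2))                                      # 14
# Overlapping 3x3 windows with stride 2 and one pixel of padding per side.
print(pooled_size(28, 3, stride=2, pad_before=1, pad_after=1)) # 14
# Stride 1 with no padding trims kernel - 1 positions.
print(pooled_size(28, 3, stride=1))                            # 26
```

Apply the helper once per axis, using the (kH, sH, top, bottom) values for height and (kW, sW, left, right) for width.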