PyTorch’s autograd capability provides a way to compute partial derivatives, which are at the core of backpropagation. Whenever a tensor’s gradient needs to be tracked, you indicate this by setting requires_grad=True on the tensor.
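As a minimal sketch (the tensors a and b here are illustrative and not used in the worked example below), gradient tracking can be enabled either at construction time or afterwards with the in-place requires_grad_() method.

:

import torch

# Enable gradient tracking when the tensor is created.
a = torch.ones(2, 2, requires_grad=True)

# Or enable it later, in place, on an existing tensor.
b = torch.zeros(3)
b.requires_grad_(True)

print(a.requires_grad, b.requires_grad)  # True True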

Let’s take the simple equations below.

• $$x = \begin{bmatrix}1 & 1 \\ 1 & 1\end{bmatrix}$$

• $$y_i = x_i + 2$$

• $$z_i = 3y_i^2 = 3(x_i + 2)^2$$

• $$o = \frac{1}{4} \sum_i z_i$$

:

import torch

# A 2x2 matrix of ones, matching x above; requires_grad=True turns on gradient tracking.
x = torch.ones(2, 2, requires_grad=True)
y = x + 2
z = y * y * 3
o = z.mean()

:

x

:

tensor([[1., 1.],
        [1., 1.]], requires_grad=True)

:

y

:

tensor([[3., 3.],
        [3., 3.]], grad_fn=<AddBackward0>)

:

z

:

tensor([[27., 27.],
        [27., 27.]], grad_fn=<MulBackward0>)

:

o

:

tensor(27., grad_fn=<MeanBackward0>)
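The grad_fn attribute in this output records the operation that produced the tensor; these records are what autograd walks backwards through. As a rough sketch (the attributes are standard PyTorch, though the exact printed strings vary by version), you can peek one step into that graph.

:

# o was produced by mean(), so its grad_fn is a MeanBackward0 node.
print(o.grad_fn)
# next_functions references the MulBackward0 node that produced z.
print(o.grad_fn.next_functions)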


Note that the partial derivative of $$o$$ with respect to $$x_i$$ can be derived analytically as shown below.

• $$\frac{\partial o}{\partial x_i} = \frac{1}{4} \cdot \frac{\partial z_i}{\partial x_i} = \frac{1}{4} \cdot 6(x_i + 2) = \frac{3}{2} (x_i + 2)$$

This partial derivative evaluated at $$x_i=1$$ becomes 4.5.

• $$\frac{\partial o}{\partial x_i} |_{x_i=1} = \frac{3}{2}(1 + 2) = \frac{3}{2}(3) = \frac{9}{2} = 4.5$$

:

o.backward()

:

x.grad

:

tensor([[4.5000, 4.5000],
[4.5000, 4.5000]])
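As a sanity check (this snippet is illustrative and not part of the original example), the autograd result can be compared against a central finite-difference approximation of the derivative with respect to one element of $$x$$, which should also come out to roughly 4.5.

:

import torch

def forward(x):
    # Same computation as above: o = mean(3 * (x + 2)^2)
    y = x + 2
    z = y * y * 3
    return z.mean()

eps = 1e-3
x0 = torch.ones(2, 2)
x_plus, x_minus = x0.clone(), x0.clone()
x_plus[0, 0] += eps
x_minus[0, 0] -= eps

numeric = (forward(x_plus) - forward(x_minus)) / (2 * eps)
print(numeric)  # approximately 4.5, matching x.grad[0, 0]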