3. Autograd

PyTorch’s autograd capability provides a way to compute partial derivatives, which are the core of backpropagation. Whenever a tensor’s gradient (the result of differentiation) needs to be tracked, you indicate this by setting requires_grad=True on that tensor.
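
For example, the snippet below (a minimal sketch, not part of the worked example that follows; the tensor name a is arbitrary) shows the flag being turned on after creation with requires_grad_():

import torch

a = torch.randn(2, 2)     # requires_grad defaults to False
print(a.requires_grad)    # False
a.requires_grad_(True)    # enable gradient tracking in place
print(a.requires_grad)    # True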

Let’s take the simple equations below.

  • \(x = \begin{bmatrix}1 & 1 \\ 1 & 1\end{bmatrix}\)

  • \(y_i = x_i + 2\)

  • \(z_i = 3y_i^2 = 3(x_i + 2)^2\)

  • \(o = \frac{1}{4} \sum_i z_i\)

[1]:
import torch

x = torch.ones(2, 2, requires_grad=True)  # 2x2 matrix of ones, gradient tracking enabled
y = x + 2                                 # y_i = x_i + 2
z = y * y * 3                             # z_i = 3 * y_i^2
o = z.mean()                              # o = (1/4) * sum_i z_i
[2]:
x
[2]:
tensor([[1., 1.],
        [1., 1.]], requires_grad=True)
[3]:
y
[3]:
tensor([[3., 3.],
        [3., 3.]], grad_fn=<AddBackward0>)
[4]:
z
[4]:
tensor([[27., 27.],
        [27., 27.]], grad_fn=<MulBackward0>)
[5]:
o
[5]:
tensor(27., grad_fn=<MeanBackward0>)

Note that the partial derivative of \(o\) with respect to \(x_i\) can be derived analytically: since \(o = \frac{1}{4} \sum_i z_i\) and \(z_i = 3(x_i + 2)^2\), the chain rule gives

  • \(\frac{\partial o}{\partial x_i} = \frac{1}{4} \cdot 6(x_i + 2) = \frac{3}{2} (x_i + 2)\)

This partial derivative evaluated at \(x_i=1\) becomes 4.5.

  • \(\frac{\partial o}{\partial x_i} |_{x_i=1} = \frac{3}{2}(1 + 2) = \frac{3}{2}(3) = \frac{9}{2} = 4.5\)

[6]:
o.backward()
[7]:
x.grad
[7]:
tensor([[4.5000, 4.5000],
        [4.5000, 4.5000]])