3. Autograd
This matters because training only works when gradients are correct. If you do not understand how PyTorch builds and traverses computation graphs, debugging exploding losses, detached tensors, or broken custom operations becomes guesswork.
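Before the worked example, here is a minimal sketch (not one of the original cells; the names a, b, c are illustrative) of the mechanism that paragraph refers to: every tensor produced by an operation on a requires_grad tensor records its creating operation in grad_fn, and detach() cuts that record, which is the usual culprit when gradients silently stop flowing.

import torch
a = torch.ones(3, requires_grad=True)
b = a * 2          # b.grad_fn is <MulBackward0>: b is attached to the graph
c = b.detach()     # c shares b's data but has no grad_fn: the graph is cut here
print(b.grad_fn)   # <MulBackward0 object at ...>
print(c.grad_fn)   # None -- a backward() through c would never reach a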
[1]:
import torch
x = torch.ones(2, 2, requires_grad=True)
y = x + 2
z = y * y * 3
o = z.mean()
[2]:
x
[2]:
tensor([[1., 1.],
        [1., 1.]], requires_grad=True)
[3]:
y
[3]:
tensor([[3., 3.],
        [3., 3.]], grad_fn=<AddBackward0>)
[4]:
z
[4]:
tensor([[27., 27.],
        [27., 27.]], grad_fn=<MulBackward0>)
[5]:
o
[5]:
tensor(27., grad_fn=<MeanBackward0>)
Note that the partial derivative of \(o\) with respect to \(x_i\) can be derived analytically. Since \(o = \frac{1}{4}\sum_i z_i\) and \(z_i = 3(x_i + 2)^2\),
\(\frac{\partial o}{\partial x_i} = \frac{1}{4} \cdot 6(x_i + 2) = \frac{3}{2} (x_i + 2)\)
Evaluated at \(x_i = 1\), this partial derivative becomes 4.5.
\(\frac{\partial o}{\partial x_i} \big|_{x_i=1} = \frac{3}{2}(1 + 2) = \frac{3}{2}(3) = \frac{9}{2} = 4.5\)
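As a sanity check (a sketch, not part of the original notebook; f and x64 are illustrative names), torch.autograd.gradcheck compares the analytic gradient against a finite-difference estimate. It expects double-precision inputs:

def f(t):
    # same computation as the cells above: o = mean(3 * (t + 2)^2)
    return (3 * (t + 2) ** 2).mean()

x64 = torch.ones(2, 2, dtype=torch.double, requires_grad=True)
torch.autograd.gradcheck(f, (x64,))  # returns True when the gradients agree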
[6]:
o.backward()
[7]:
x.grad
[7]:
tensor([[4.5000, 4.5000],
        [4.5000, 4.5000]])
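One caveat worth remembering (a sketch assuming the cells above have just run; o2 is an illustrative name): .grad accumulates across backward calls rather than being overwritten, which is why training loops zero gradients each step.

o2 = (3 * (x + 2) ** 2).mean()  # rebuild the graph; the first one was freed by backward()
o2.backward()
print(x.grad)      # 9.0 everywhere: 4.5 from before plus 4.5 from this call
x.grad.zero_()     # optimizers do the equivalent via optimizer.zero_grad()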