3. Autograd
PyTorch’s autograd capability provides a way to compute partial derivatives automatically, which is the core of backpropagation. Whenever you need a tensor’s gradient to be tracked through the operations applied to it, you specify so with requires_grad on that tensor.
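As a side note, gradient tracking can also be turned on after a tensor has already been created by calling requires_grad_() on it in place; the minimal sketch below uses a throwaway tensor a purely for illustration and is not one of the numbered cells that follow.

import torch

a = torch.ones(2, 2)      # created without gradient tracking
print(a.requires_grad)    # False
a.requires_grad_()        # enable tracking in place (defaults to True)
print(a.requires_grad)    # True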
Let’s take the simple equations below.
\(x = \begin{bmatrix}1 & 1 \\ 1 & 1\end{bmatrix}\)
\(y_i = x_i + 2\)
\(z_i = 3y_i^2 = 3(x_i + 2)^2\)
\(o = \frac{1}{4} \sum_i z_i\)
[1]:
import torch

x = torch.ones(2, 2, requires_grad=True)  # 2x2 matrix of ones, tracked by autograd
y = x + 2                                 # y_i = x_i + 2
z = y * y * 3                             # z_i = 3 * y_i^2
o = z.mean()                              # o = (1/4) * sum_i z_i
[2]:
x
[2]:
tensor([[1., 1.],
[1., 1.]], requires_grad=True)
[3]:
y
[3]:
tensor([[3., 3.],
[3., 3.]], grad_fn=<AddBackward0>)
[4]:
z
[4]:
tensor([[27., 27.],
[27., 27.]], grad_fn=<MulBackward0>)
[5]:
o
[5]:
tensor(27., grad_fn=<MeanBackward0>)
Note that the partial derivative of \(o\) with respect to \(x_i\) can be solved analytically with the chain rule: since \(o = \frac{1}{4} \sum_i z_i\) and \(z_i = 3(x_i + 2)^2\),
\(\frac{\partial o}{\partial x_i} = \frac{1}{4} \cdot 6(x_i + 2) = \frac{3}{2} (x_i + 2)\)
This partial derivative evaluated at \(x_i=1\) becomes 4.5.
\(\frac{\partial o}{\partial x_i} |_{x_i=1} = \frac{3}{2}(1 + 2) = \frac{3}{2}(3) = \frac{9}{2} = 4.5\)
[6]:
o.backward()  # backpropagate from the scalar o; gradients accumulate into x.grad
[7]:
x.grad
[7]:
tensor([[4.5000, 4.5000],
[4.5000, 4.5000]])