2. Tensor

A tensor is the core object used in PyTorch. To understand what a tensor is, we first have to understand what vectors and matrices are. A vector is simply an array of elements. A vector may be a row vector (its elements laid out left to right).

\(\begin{bmatrix}1 & 2 & 3\end{bmatrix}\)

A vector may also be a column vector (its elements laid out top to bottom).

\(\begin{bmatrix}1 \\ 2 \\ 3\end{bmatrix}\)

A matrix generalizes a vector and has both rows and columns. A two-dimensional matrix looks like the following. Note that this matrix has 3 rows and 3 columns. The rows and columns are called dimensions, and since this matrix has 2 dimensions, it is a two-dimensional matrix.

\(\begin{bmatrix}1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9\end{bmatrix}\)

A tensor generalizes a matrix to 3, 4, or more dimensions; indeed, scalars, vectors, and matrices are all tensors of dimension 0, 1, and 2, respectively. It is difficult, if not impossible, to write down and visualize arrays with 3 or more dimensions.

2.1. Creation

[1]:
import torch
import numpy as np

# one element
a = torch.tensor([1])

# two elements
b = torch.tensor([1, 2])

# two float elements
c = torch.tensor([1., 2.])

# matrix
d = torch.tensor([[1., 2.], [3., 4.]])

# from a python list
e = torch.as_tensor([1., 2., 3.], dtype=torch.float32)

# from numpy array
f = torch.from_numpy(np.array([1.0, 2.0]))

# make tensor of zeros
g = torch.zeros(1)
h = torch.zeros(1, 2)
i = torch.zeros(2, 2)

# uninitialized tensor (elements take whatever values are in memory)
j = torch.empty(2, 2)

# make tensor of ones
k = torch.ones(2, 2)

# make tensor from range
l = torch.arange(start=0, end=5, step=1)

# make tensor of linearly spaced values
m = torch.linspace(start=0, end=1, steps=5)

# make tensor of logarithmically spaced values
n = torch.logspace(start=0, end=1, steps=5)

# make identity matrix
o = torch.eye(n=3, m=3)

# matrix whose elements are all the same value
p = torch.full(size=(3, 3), fill_value=7)

results = [a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p]
for c, r in zip(list('abcdefghijklmnopqrstuvwxyz'), results):
    print(f'{c}: {r}')
a: tensor([1])
b: tensor([1, 2])
c: tensor([1., 2.])
d: tensor([[1., 2.],
        [3., 4.]])
e: tensor([1., 2., 3.])
f: tensor([1., 2.], dtype=torch.float64)
g: tensor([0.])
h: tensor([[0., 0.]])
i: tensor([[0., 0.],
        [0., 0.]])
j: tensor([[4.2039e-45, 4.4645e-42],
        [9.5925e-40, 1.1652e-32]])
k: tensor([[1., 1.],
        [1., 1.]])
l: tensor([0, 1, 2, 3, 4])
m: tensor([0.0000, 0.2500, 0.5000, 0.7500, 1.0000])
n: tensor([ 1.0000,  1.7783,  3.1623,  5.6234, 10.0000])
o: tensor([[1., 0., 0.],
        [0., 1., 0.],
        [0., 0., 1.]])
p: tensor([[7., 7., 7.],
        [7., 7., 7.],
        [7., 7., 7.]])
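
There are also _like variants that borrow the shape and dtype from an existing tensor. A minimal sketch:

# zeros, ones, or random values with the same shape and dtype as d
q = torch.zeros_like(d)
r = torch.ones_like(d)
s = torch.rand_like(d)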

2.2. Numpy bridge

Tensors may be converted to numpy arrays.

[2]:
a = torch.tensor([1., 2.])
b = a.numpy()
b
[2]:
array([1., 2.], dtype=float32)

Numpy arrays may also be converted back to Tensors.

[3]:
a = np.array([1, 2], dtype=np.float64)  # np.float is removed in newer numpy versions
b = torch.from_numpy(a)
b
[3]:
tensor([1., 2.], dtype=torch.float64)
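
Note that torch.from_numpy shares memory with the source array (and .numpy() shares memory with a CPU tensor), so an in-place change on one side is visible on the other. A minimal sketch:

arr = np.array([1., 2.])
t = torch.from_numpy(arr)
arr[0] = 9.  # in-place change to the numpy array
# t now reads tensor([9., 2.], dtype=torch.float64) because memory is shared
# use torch.tensor(arr) to make an independent copy instead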

2.3. Indexing, slicing, joining and mutating

2.3.1. Concatenating

[4]:
a = torch.tensor([1., 2.])
b = torch.tensor([2., 3.])
torch.cat((a, b))
[4]:
tensor([1., 2., 2., 3.])

2.3.2. Flattening

[5]:
a = torch.tensor([[1., 2.], [3., 4.]])
b = torch.flatten(a)
b
[5]:
tensor([1., 2., 3., 4.])
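
torch.flatten also takes a start_dim argument, which is handy for flattening everything except the batch dimension before a linear layer. A small sketch:

x = torch.rand(32, 1, 28, 28)        # a batch of 32 single-channel images
torch.flatten(x, start_dim=1).shape  # torch.Size([32, 784])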

2.3.3. Flipping

[6]:
a = torch.tensor([[1., 2.], [3., 4.]])
b = torch.flip(a, dims=[0, 1])
b
[6]:
tensor([[4., 3.],
        [2., 1.]])

2.3.4. Index selecting

[7]:
a = torch.tensor([[1, 2], [3, 4]])
i = torch.tensor([1])
torch.index_select(a, dim=1, index=i)
[7]:
tensor([[2],
        [4]])
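
The same selection can be written with Python-style indexing; index_select is mainly useful when the index tensor is computed at runtime. A sketch:

a[:, [1]]  # selects column 1, same result as the index_select call above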

2.3.5. Permuting

Some data come in HWC (height, width, channel) format, but PyTorch’s image operations expect CHW (channel, height, width) format. Here’s how to permute the data.

[8]:
hwc = torch.rand(640, 480, 3)
chw = hwc.permute(2, 0, 1)

print(hwc.shape)
print(chw.shape)
torch.Size([640, 480, 3])
torch.Size([3, 640, 480])
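
Note that permute returns a view over the same storage rather than a copy, and the view is typically non-contiguous. Call .contiguous() if a downstream operation needs contiguous memory. A sketch:

chw.is_contiguous()          # False: permute only rearranges strides
chw_copy = chw.contiguous()  # materializes a contiguous copy when needed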

2.3.6. Reshaping

Some data come as a single flat array, but you may reshape them into a tensor with different dimensions.

[9]:
a = torch.rand(784)
b = a.view(1, 28, 28)
c = a.reshape(1, 28, 28)

print(a.shape)
print(b.shape)
print(c.shape)
torch.Size([784])
torch.Size([1, 28, 28])
torch.Size([1, 28, 28])
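
view requires the tensor’s memory to be contiguous and never copies, while reshape falls back to copying when a view is not possible. A minimal sketch:

x = torch.rand(2, 3)
y = x.t()          # transposing creates a non-contiguous view
# y.view(6) would raise a RuntimeError here
z = y.reshape(6)   # reshape copies the data instead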

2.3.7. Squeezing

Get rid of all the dimensions with size 1.

[10]:
# a 2x1x2x1x2 tensor
a = torch.zeros(2, 1, 2, 1, 2)
b = torch.squeeze(a)

print('a: ', a.shape)
print('b: ', b.shape)
a:  torch.Size([2, 1, 2, 1, 2])
b:  torch.Size([2, 2, 2])
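
Passing a dim argument squeezes only that dimension. A sketch:

c = torch.squeeze(a, dim=1)  # only removes dimension 1
# c.shape is torch.Size([2, 2, 1, 2])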

2.3.8. Stacking

[11]:
a = torch.zeros(1, 2)
b = torch.zeros(1, 2)
c = torch.stack([a, b], dim=0)
d = torch.stack([a, b], dim=1)

print('c: ', c.shape, ': ', c)
print('d: ', d.shape, ': ', d)
c:  torch.Size([2, 1, 2]) :  tensor([[[0., 0.]],

        [[0., 0.]]])
d:  torch.Size([1, 2, 2]) :  tensor([[[0., 0.],
         [0., 0.]]])
2.3.9. Transposing

[12]:
a = torch.zeros(2, 3)
b = torch.t(a)

print('a: ', a.shape)
print('b: ', b.shape)
a:  torch.Size([2, 3])
b:  torch.Size([3, 2])

2.3.10. Type conversion

[13]:
# torch.LongTensor
a = torch.tensor([[0, 1], [2, 3]])

# torch.FloatTensor
b = a.to(dtype=torch.float32)
b.type()
[13]:
'torch.FloatTensor'
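
Shorthand methods exist for the common dtypes. A sketch:

c = a.float()   # same as a.to(dtype=torch.float32)
d = a.long()    # same as a.to(dtype=torch.int64)
e = a.double()  # same as a.to(dtype=torch.float64)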

2.3.11. Unbinding

[14]:
a = torch.rand(3, 3)
b = torch.unbind(a)
b
[14]:
(tensor([0.6910, 0.8778, 0.3101]),
 tensor([0.4176, 0.3738, 0.3886]),
 tensor([0.8491, 0.4955, 0.3924]))
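
By default unbind slices along dimension 0 (the rows); passing dim=1 returns the columns instead. A sketch:

torch.unbind(a, dim=1)  # a tuple of the three columns of a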

2.3.12. Unsqueeze

[15]:
a = torch.tensor([1, 2, 3])
b = torch.unsqueeze(a, 0)
c = torch.unsqueeze(a, 1)

print('a: ', a.shape)
print('b: ', b.shape)
print('c: ', c.shape)
a:  torch.Size([3])
b:  torch.Size([1, 3])
c:  torch.Size([3, 1])
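
A common use of unsqueeze is adding a batch dimension so a single example can be fed to a model that expects batched input. A sketch:

img = torch.rand(3, 28, 28)  # one CHW image
batch = img.unsqueeze(0)     # shape (1, 3, 28, 28): a batch of one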

2.3.13. Where

[16]:
a = torch.randn(3, 3)
b = torch.ones(3, 3)
torch.where(a > 0, a, b)
[16]:
tensor([[0.0879, 1.0000, 0.2121],
        [1.0000, 1.0000, 1.0000],
        [0.6462, 1.0000, 0.6818]])

2.4. Random sampling

[17]:
# generates 2 random numbers, normal(0, 1)
a = torch.randn(2)

# generates 2x2 matrix of random numbers, normal(0, 1)
b = torch.randn(2, 2)

# generates 2 random numbers, uniform on [0, 1)
c = torch.rand(2)

# generates 2x2 matrix of random numbers, uniform on [0, 1)
d = torch.rand(2, 2)

# samples integers uniformly from [0, 10)
e = torch.randint(low=0, high=10, size=(1, 5))

# samples a random permutation of [0, n)
f = torch.randperm(5)

# samples from a Bernoulli distribution
g = torch.bernoulli(torch.rand(3))

# samples from a multinomial distribution
weights = torch.tensor([5, 10], dtype=torch.float)
h = torch.multinomial(input=weights, num_samples=10, replacement=True)

# samples from a normal distribution
i = torch.normal(0, 1, size=(1, 5))

results = [a,b,c,d,e,f,g,h,i]
for c, s in zip(list('abcdefghijklmnopqrstuvwxyz'), results):
    print(f'{c}: {s}')
a: tensor([1.1063, 0.0471])
b: tensor([[ 0.6093,  0.8607],
        [-1.0625, -0.9297]])
c: tensor([0.6415, 0.2973])
d: tensor([[0.9616, 0.8872],
        [0.5437, 0.6504]])
e: tensor([[0, 7, 5, 1, 3]])
f: tensor([2, 1, 3, 4, 0])
g: tensor([1., 1., 1.])
h: tensor([1, 1, 1, 1, 1, 0, 1, 0, 1, 1])
i: tensor([[-1.7465, -0.5181,  0.8286,  0.1691, -0.0902]])
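
For reproducible runs, seed the global generator before sampling. A minimal sketch (the seed value is arbitrary):

torch.manual_seed(37)
a = torch.randn(2)
torch.manual_seed(37)
b = torch.randn(2)
# a and b are identical because the generator was reseeded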

2.5. Operations

2.5.1. Pointwise operations

[18]:
# absolute value
a = torch.abs(torch.tensor([-1., -2.]))

# square root
b = torch.sqrt(torch.tensor([4., 9., 16.]))

# add a constant to a tensor
c = torch.add(torch.tensor([0, 1]), 10)

# multiply a tensor by a constant
d = torch.mul(torch.tensor([0, 1]), 10)

# trig functions: sin, cos, tan, etc. (inputs are in radians)
e = torch.sin(torch.tensor([45.]))
f = torch.cos(torch.tensor([45.]))
g = torch.tan(torch.tensor([45.]))
h = torch.asin(torch.tensor([.4]))
i = torch.acos(torch.tensor([.4]))
j = torch.atan(torch.tensor([.4]))
k = torch.atan2(torch.tensor([.4]), torch.tensor([.4]))

# bitwise not
l = torch.bitwise_not(torch.tensor([-1, -2, 4]))

# rounding
m = torch.ceil(torch.tensor([-0.5, -1.2, 1.2, 0.5]))
n = torch.floor(torch.tensor([-0.5, -1.2, 1.2, 0.5]))

# restricting values
o = torch.clamp(
    torch.tensor([-8., -0.3, 0.0, 0.3, 8.]),
    min=-0.5, max=0.5)

# exponentiation
p = torch.exp(torch.tensor([2.0]))

# fmod follows the sign of the dividend; remainder follows the sign of the divisor
q = torch.fmod(torch.tensor([-2, -1, 1, 2]), 2)
r = torch.remainder(torch.tensor([-2, -1, 1, 2]), 2)

# natural logarithm
s = torch.log(torch.tensor([1., 2., 2.]))

# logical NOT and XOR
t = torch.logical_not(torch.tensor([True, False]))
u = torch.logical_xor(
    torch.tensor([True, False, True]),
    torch.tensor([True, False, False]))

# negation (flips the sign)
v = torch.neg(torch.tensor([-1, -2]))

# power
w = torch.pow(torch.tensor([1., 2., 3.]), 2)

# reciprocal
x = torch.reciprocal(torch.tensor([1., 2., 3.]))

# sigmoid
y = torch.sigmoid(torch.randn(4))

# get the sign
z = torch.sign(torch.tensor([-0.5, 0.5]))

results = [a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z]
for c, r in zip(list('abcdefghijklmnopqrstuvwxyz'), results):
    print(f'{c}: {r}')
a: tensor([1., 2.])
b: tensor([2., 3., 4.])
c: tensor([10, 11])
d: tensor([ 0, 10])
e: tensor([0.8509])
f: tensor([0.5253])
g: tensor([1.6198])
h: tensor([0.4115])
i: tensor([1.1593])
j: tensor([0.3805])
k: tensor([0.7854])
l: tensor([ 0,  1, -5])
m: tensor([-0., -1.,  2.,  1.])
n: tensor([-1., -2.,  1.,  0.])
o: tensor([-0.5000, -0.3000,  0.0000,  0.3000,  0.5000])
p: tensor([7.3891])
q: tensor([ 0, -1,  1,  0])
r: tensor([0, 1, 1, 0])
s: tensor([0.0000, 0.6931, 0.6931])
t: tensor([False,  True])
u: tensor([False, False,  True])
v: tensor([1, 2])
w: tensor([1., 4., 9.])
x: tensor([1.0000, 0.5000, 0.3333])
y: tensor([0.7613, 0.1120, 0.4234, 0.5516])
z: tensor([-1.,  1.])

2.5.2. Reduction operations

[19]:
# min and max
a = torch.tensor([-1., 0, -2., 1]).min()
b = torch.tensor([-1., 0, -2., 1]).max()

# argmax and argmin (index of the max/min element)
c = torch.tensor([-1., 0, -2., 1]).argmax()
d = torch.tensor([-1., 0, -2., 1]).argmin()

# sum and product of elements
e = torch.sum(torch.tensor([-1., 1, -2., 1]))
f = torch.prod(torch.tensor([-1., 1, -2., 1]))

# cumulative sum and product
g = torch.cumsum(torch.tensor([-1., 1, -2., 1]), dim=0)
h = torch.cumprod(torch.tensor([-1., 1, -2., 1]), dim=0)

# p-norm distance between 2 tensors
i = torch.dist(torch.tensor([0., 0.]), torch.tensor([1., 1.]), p=2)

# log of the sum of exponentials
j = torch.logsumexp(torch.tensor([-1., 0, -2., 1]), dim=0)

# mean, median, mode, standard deviation, variance
k = torch.mean(torch.tensor([-1., 0, -2., 1]), dim=0)
l = torch.median(torch.tensor([-1., 0, -2., 1]))
m = torch.mode(torch.tensor([-1., 1, -2., 1]))
n = torch.std(torch.tensor([-1., 1, -2., 1]))
o = torch.std_mean(torch.tensor([-1., 1, -2., 1]))
p = torch.var(torch.tensor([-1., 1, -2., 1]))
q = torch.var_mean(torch.tensor([-1., 1, -2., 1]))

# unique
r = torch.unique(torch.tensor([-1., 1, -2., 1]))

results = [a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r]
for c, r in zip(list('abcdefghijklmnopqrstuvwxyz'), results):
    print(f'{c}: {r}')
a: -2.0
b: 1.0
c: 3
d: 2
e: -1.0
f: 2.0
g: tensor([-1.,  0., -2., -1.])
h: tensor([-1., -1.,  2.,  2.])
i: 1.4142135381698608
j: 1.4401897192001343
k: -0.5
l: -1.0
m: torch.return_types.mode(
values=tensor(1.),
indices=tensor(3))
n: 1.5
o: (tensor(1.5000), tensor(-0.2500))
p: 2.25
q: (tensor(2.2500), tensor(-0.2500))
r: tensor([-2., -1.,  1.])
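
The reduction functions also accept a dim argument to reduce along a single dimension, and keepdim=True to keep the reduced dimension with size 1. A sketch:

x = torch.tensor([[1., 2.], [3., 4.]])
torch.sum(x, dim=0)                # tensor([4., 6.]): collapse the rows
torch.sum(x, dim=1)                # tensor([3., 7.]): collapse the columns
torch.sum(x, dim=1, keepdim=True)  # tensor([[3.], [7.]])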

2.5.3. Comparison operations

[20]:
# are all elements approximately equal (within tolerance)?
a = torch.allclose(
    torch.tensor([1.09, 2.08, float('nan')]),
    torch.tensor([1.08, 2.07, float('nan')]),
    rtol=1e-02, equal_nan=True)

# indices that would sort the tensor
b = torch.argsort(torch.tensor([5., 3., 1., -1., -3., -5., 6]))

# element-wise equality
c = torch.eq(torch.tensor([1., 2.]), torch.tensor([1., 2.]))

# are two tensors exactly equal (same shape and elements)?
d = torch.equal(torch.tensor([1., 2.]), torch.tensor([1., 2.]))

# equal also works on higher-dimensional tensors
e = torch.equal(torch.ones(2, 2), torch.ones(2, 2))

# element-wise >, <, >=, <=
f = torch.gt(torch.tensor([1., 2.]), torch.tensor([1., 2.]))
g = torch.lt(torch.tensor([1., 2.]), torch.tensor([1., 2.]))
h = torch.ge(torch.tensor([1., 2.]), torch.tensor([1., 2.]))
i = torch.le(torch.tensor([1., 2.]), torch.tensor([1., 2.]))

# check element-wise for finite, infinite, and NaN values
j = torch.isfinite(
    torch.tensor([1, float('inf'), float('-inf'), float('nan')]))
k = torch.isinf(
    torch.tensor([1, float('inf'), float('-inf'), float('nan')]))
l = torch.isnan(
    torch.tensor([1, float('inf'), float('-inf'), float('nan')]))

# kthvalue finds the k-th smallest value; topk finds the k largest values
m = torch.kthvalue(torch.tensor([[-1, 0], [0, -1]]), k=1, dim=0)
n = torch.topk(torch.tensor([[-1, 0], [0, -1]]), k=1, dim=0)

# finds min/max
o = torch.min(torch.tensor([5, -1, 10]))
p = torch.max(torch.tensor([5, -1, 10]))

# sorting
q = torch.sort(torch.tensor([[10, 9, 8]]))

results = [a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q]
for c, r in zip(list('abcdefghijklmnopqrstuvwxyz'), results):
    print(f'{c}: {r}')
a: True
b: tensor([5, 4, 3, 2, 1, 0, 6])
c: tensor([True, True])
d: True
e: True
f: tensor([False, False])
g: tensor([False, False])
h: tensor([True, True])
i: tensor([True, True])
j: tensor([ True, False, False, False])
k: tensor([False,  True,  True, False])
l: tensor([False, False, False,  True])
m: torch.return_types.kthvalue(
values=tensor([-1, -1]),
indices=tensor([0, 1]))
n: torch.return_types.topk(
values=tensor([[0, 0]]),
indices=tensor([[1, 0]]))
o: -1
p: 10
q: torch.return_types.sort(
values=tensor([[ 8,  9, 10]]),
indices=tensor([[2, 1, 0]]))

2.6. Basic math

2.6.1. Vectors

[21]:
a = torch.tensor([1., 2.])
b = torch.tensor([2., 3.])

# element-wise addition, subtraction, multiplication, and division
c = a + b
d = a - b
e = a * b
f = a / b

# dot (inner) product
g = a.dot(b)

results = [c,d,e,f,g]
for c, r in zip(list('cdefghijklmnopqrstuvwxyz'), results):
    print(f'{c}: {r}')
c: tensor([3., 5.])
d: tensor([-1., -1.])
e: tensor([2., 6.])
f: tensor([0.5000, 0.6667])
g: 8.0

2.6.2. Matrices

[22]:
a = torch.tensor([[1, 2], [3, 4]], dtype=torch.float)
b = torch.tensor([[1, 2], [3, 4]], dtype=torch.float)

# element-wise operations
c = a + b
d = a - b
e = a * b  # element-wise (Hadamard) product, not matrix multiplication
f = a / b

# matrix multiplication and matrix power
g = torch.matmul(a, b)
h = torch.matrix_power(a, 2)

# rank and LU decomposition (newer PyTorch: torch.linalg.matrix_rank, torch.linalg.lu_factor)
i = torch.matrix_rank(a)
j = torch.lu(a)

# determinant, inverse, and matrix-vector product
k = torch.det(a)
l = torch.inverse(a)
m = torch.mv(a, torch.tensor([6., 7.]))

results = [c,d,e,f,g,h,i,j,k,l,m]
for c, r in zip(list('cdefghijklmnopqrstuvwxyz'), results):
    print(f'{c}: {r}')
c: tensor([[2., 4.],
        [6., 8.]])
d: tensor([[0., 0.],
        [0., 0.]])
e: tensor([[ 1.,  4.],
        [ 9., 16.]])
f: tensor([[1., 1.],
        [1., 1.]])
g: tensor([[ 7., 10.],
        [15., 22.]])
h: tensor([[ 7., 10.],
        [15., 22.]])
i: 2
j: (tensor([[3.0000, 4.0000],
        [0.3333, 0.6667]]), tensor([2, 2], dtype=torch.int32))
k: -1.9999998807907104
l: tensor([[-2.0000,  1.0000],
        [ 1.5000, -0.5000]])
m: tensor([20., 46.])
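
As a quick sanity check, multiplying a by its inverse should recover the identity matrix (up to floating-point error). A sketch using the variables from the cell above:

torch.matmul(a, l)  # approximately the 2x2 identity matrix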

2.7. Broadcasting

Two tensors are broadcastable if

  • each tensor has at least one dimension, and

  • from the trailing dimension to the leading one, the dimension sizes are

    • equal,

    • one of them is 1, or

    • one of them does not exist.

[23]:
# can be broadcasted
a = torch.empty(5, 7, 3)
b = torch.empty(5, 7, 3)

# cannot be broadcasted
a = torch.empty((0,))
b = torch.empty(2, 2)

# can line up trailing dimensions
a = torch.empty(5,3,4,1)
b = torch.empty(  3,1,1)

# cannot be broadcasted
a = torch.empty(5,2,4,1)
b = torch.empty(  3,1,1)
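
When the shapes line up, the size-1 and missing dimensions are expanded automatically. A sketch of the resulting shape:

x = torch.empty(5, 3, 4, 1)
y = torch.empty(   3, 1, 1)
(x + y).shape  # torch.Size([5, 3, 4, 1])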

2.8. Device-agnostic tensors

[24]:
device = torch.device('cuda') if torch.cuda.is_available() else torch.device('cpu')
print(f'device = {device}')

a = torch.tensor(10, device=device)
b = torch.rand(2, device=device)

print(a)
print(b)
device = cuda
tensor(10, device='cuda:0')
tensor([0.0472, 0.9254], device='cuda:0')

You may move tensors between devices (CPU or GPU) as follows.

[25]:
# a is on the GPU
a = torch.tensor(10, device='cuda:0')
# b is on the CPU
b = torch.tensor(10, device='cpu')

# moving a to the CPU can be done in two ways
c = a.cpu()
d = a.to('cpu')

# moving b to the GPU is done as follows
e = b.to('cuda:0')

print('a', a)
print('b', b)
print('c', c)
print('d', d)
print('e', e)
a tensor(10, device='cuda:0')
b tensor(10)
c tensor(10)
d tensor(10)
e tensor(10, device='cuda:0')
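
Keep in mind that .cpu() and .to() return new tensors; the original stays on its device. Moving to the CPU is also required before converting to numpy. A sketch:

t = torch.tensor([1., 2.], device=device)
arr = t.cpu().numpy()  # .numpy() only works on CPU tensors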

2.9. Serialization

[26]:
import os
os.makedirs('./output', exist_ok=True)  # torch.save does not create directories

a = torch.tensor([0, 1, 2, 3])
torch.save(a, './output/mytensor.pt')

b = torch.load('./output/mytensor.pt')
b
[26]:
tensor([0, 1, 2, 3])
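
torch.save accepts any picklable object, so several tensors can be saved together in a dict. A minimal sketch (the file name is illustrative):

state = {'weights': torch.rand(2, 2), 'bias': torch.rand(2)}
torch.save(state, './output/state.pt')
restored = torch.load('./output/state.pt')
restored['bias']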