Basics
Prerequisites
- Deep Learning Setup: Set up your workspace and install the required Python libraries
Learning Objectives
- What is a Tensor?
- Basic Operations
- Matrix Multiplication
- Tensor Manipulation
- Accessing Elements in a Tensor
- Tensor Statistics
- Working with GPUs
- Random Values and Reproducibility
- Tensor and NumPy Integration
What is a Tensor?
Machine learning is all about manipulating numbers, and we need ways to store those numbers, typically in structures called tensors. A tensor can be a single number (scalar), a list or vector of numbers (1D tensor), a matrix (2D tensor), a list of matrices (3D tensor), and so on:
Now let's see how we make a tensor in PyTorch!
import torch

scalar = torch.tensor(7) # Scalar
vector = torch.tensor([1, 2, 3]) # Vector
matrix = torch.tensor([[1, 2], [3, 4]]) # Matrix
tensor = torch.tensor([[[1, 2, 3], [4, 5, 6], [7, 8, 9]]]) # 3D Tensor
Machine learning papers always include a lot of math jargon, but don't be afraid! Let's go through a few of the symbols for tensors!
- Scalar: A single number \(s \in \mathbb{R}\).
- Vector: A 1D array or vector is shown as: \(\mathbf{v} \in \mathbb{R}^n\).
- Matrix: A 2D array or matrix with m rows and n columns is shown as: \(\mathbf{M} \in \mathbb{R}^{m \times n}\).
- 3D Tensor: A stack of \(o\) matrices, each with m rows and n columns, is shown as: \(\mathbf{T} \in \mathbb{R}^{o \times m \times n}\).
In Python, these tensors are written using different numbers of brackets:
Tips and Tricks
- Dimensions as brackets:
  - [] → 1D (vector)
  - [[]] → 2D (matrix)
  - [[[]]] → 3D tensor
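You can check the bracket-counting rule with the ndim attribute, which reports how many dimensions a tensor has. A quick sketch using the tensors from above:

```python
import torch

scalar = torch.tensor(7)                 # no brackets  -> 0D
vector = torch.tensor([1, 2, 3])         # one pair of brackets  -> 1D
matrix = torch.tensor([[1, 2], [3, 4]])  # two pairs of brackets -> 2D

print(scalar.ndim, vector.ndim, matrix.ndim)  # 0 1 2
```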
Basic Operations
Tensors are just collections of numbers, so we can perform basic operations on them, like addition, subtraction, multiplication, and division. Let's go through how we would do that!
tensor = torch.tensor([1, 2, 3])
print(tensor + 10) # Addition
print(tensor * 10) # Multiplication
output
tensor([11, 12, 13])
tensor([10, 20, 30])
Now in math speak we represent the operations above (and others like division and subtraction) like so:
- Element-wise addition: \(\mathbf{A} + c\), where \(c\) is added to each element in \(\mathbf{A}\).
- Element-wise multiplication: \(\mathbf{A} \times c\), where \(c\) multiplies each element of \(\mathbf{A}\).
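Subtraction and division work the same element-wise way; here is a short sketch:

```python
import torch

tensor = torch.tensor([10, 20, 30])
print(tensor - 5)   # element-wise subtraction: tensor([ 5, 15, 25])
print(tensor / 10)  # element-wise division; the result is a float tensor
```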
Tips and Tricks
- Reassign to modify: Operations don't change the tensor unless you reassign the result back to a variable.
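A small sketch of what that tip means in practice:

```python
import torch

tensor = torch.tensor([1, 2, 3])
tensor + 10           # computes a result, but tensor itself is unchanged
print(tensor)         # tensor([1, 2, 3])

tensor = tensor + 10  # reassign to actually keep the result
print(tensor)         # tensor([11, 12, 13])
```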
Matrix Multiplication
A lot of machine learning relies heavily on matrix operations, and often that means multiplying matrices together. When we multiply matrices, the inner dimensions must match, and the result has the outer dimensions:
This Works!
\(A(4 \times 2) \cdot B(2 \times 3) \rightarrow C(4 \times 3)\) This works because the inner dimensions are 2 and 2
This Doesn't Work!
\(A(4 \times 3) \cdot B(2 \times 3) \rightarrow X\) This does not work because the inner dimensions are 3 and 2. They are different!
Let's do this in PyTorch!
# Define A as a 2x3 tensor
A = torch.tensor([[1, 2, 3], [4, 5, 6]])
# Define B as a 3x2 tensor
B = torch.tensor([[7, 8], [9, 10], [11, 12]])
# Perform matrix multiplication
result = torch.matmul(A, B)
print(result)
output
tensor([[ 58, 64],
[139, 154]])
Tips and Tricks
- Instead of writing out torch.matmul(A, B), you can use the shorthand A @ B for the same result!
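A quick sketch confirming the two spellings give the same answer:

```python
import torch

A = torch.tensor([[1, 2, 3], [4, 5, 6]])
B = torch.tensor([[7, 8], [9, 10], [11, 12]])

# The @ operator is shorthand for torch.matmul
print(torch.equal(A @ B, torch.matmul(A, B)))  # True
```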
Now what would this look like in math speak?
- Matrix multiplication:
- \(\mathbf{A} \in \mathbb{R}^{m \times n}\)
- \(\mathbf{B} \in \mathbb{R}^{n \times p}\)
- the matrix product is:
- \(\mathbf{C} = \mathbf{A} \cdot \mathbf{B}\)
- and each value is calculated by
- \(c_{ij} = \sum_{k=1}^{n} a_{ik} \cdot b_{kj}\)
Now that's a lot, let's just look at a visualization of how you multiply matrices:
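To make the formula concrete, here is a sketch that computes each \(c_{ij}\) by hand with the summation above, then prints the result (which matches torch.matmul):

```python
import torch

A = torch.tensor([[1, 2, 3], [4, 5, 6]])       # m x n (2 x 3)
B = torch.tensor([[7, 8], [9, 10], [11, 12]])  # n x p (3 x 2)

m, n = A.shape
p = B.shape[1]
C = torch.zeros(m, p, dtype=A.dtype)
for i in range(m):
    for j in range(p):
        # c_ij = sum over k of a_ik * b_kj
        for k in range(n):
            C[i, j] += A[i, k] * B[k, j]

print(C)  # same result as torch.matmul(A, B)
```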
Tensor Manipulation
Ok so when working with tensors and especially performing matrix operations, we will often be trying to match dimensions. We can do that using a number of different functions in PyTorch. Let's start out by making a vector:
x = torch.tensor([1,2,3,4,5,6])
x
output
tensor([1, 2, 3, 4, 5, 6])
Great, now let's add an extra dimension so that it is no longer a vector but one row of a matrix. We can do this using the view or reshape function:
x.view(1,6),x.reshape(1,6)
output
(tensor([[1, 2, 3, 4, 5, 6]]), tensor([[1, 2, 3, 4, 5, 6]]))
Now what about making it a column?
x.view(6,1),x.reshape(6,1)
output
(tensor([[1],
[2],
[3],
[4],
[5],
[6]]),
tensor([[1],
[2],
[3],
[4],
[5],
[6]]))
Now what if we wanted to convert it back to just that list of numbers?
x.view(1,6).squeeze()
output
tensor([1, 2, 3, 4, 5, 6])
Hmmm, I want my column back; let's reverse with unsqueeze! Here we set dim=1 so that it unsqueezes back to a column (dim=0 would make it a row):
x.view(1,6).squeeze().unsqueeze(dim=1)
output
tensor([[1],
[2],
[3],
[4],
[5],
[6]])
Now what if we needed to stack our tensors? Well, the stack function will do just that. Here we stack by row:
torch.stack([x,x],dim=0)
output
tensor([[1, 2, 3, 4, 5, 6],
[1, 2, 3, 4, 5, 6]])
Let's stack by column as well!
torch.stack([x,x],dim=1)
output
tensor([[1, 1],
[2, 2],
[3, 3],
[4, 4],
[5, 5],
[6, 6]])
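One way to see what dim is doing: dim=0 inserts the new dimension first (rows), while dim=1 inserts it second (columns). A quick shape check as a sketch:

```python
import torch

x = torch.tensor([1, 2, 3, 4, 5, 6])
print(torch.stack([x, x], dim=0).shape)  # torch.Size([2, 6]) -> two rows
print(torch.stack([x, x], dim=1).shape)  # torch.Size([6, 2]) -> two columns
```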
Accessing Elements in a Tensor
To access specific elements in a tensor, let's start with an example tensor:
one_mat = torch.tensor([[[1,2,3],
[4,5,6],
[7,8,9]]])
one_mat
output
tensor([[[1, 2, 3],
[4, 5, 6],
[7, 8, 9]]])
Let's see its shape:
one_mat.shape
output
torch.Size([1, 3, 3])
Now let's see what is the first element in the tensor:
one_mat[0]
output
tensor([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
Hey, look at that: the first element is the matrix itself. Let's look at the first element of this first element:
one_mat[0][0]
output
tensor([1, 2, 3])
Now how about the first element of this element:
one_mat[0][0][0]
output
tensor(1)
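Chained brackets work, but PyTorch also accepts a single bracket with comma-separated indices, one per dimension. A small sketch:

```python
import torch

one_mat = torch.tensor([[[1, 2, 3],
                         [4, 5, 6],
                         [7, 8, 9]]])

# One index per dimension; equivalent to one_mat[0][0][0]
print(one_mat[0, 0, 0])  # tensor(1)
```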
Tensor Statistics
Oftentimes we want to summarize the elements in our tensors. What is the maximum or minimum value? How about the sum? The mean? We can compute these using the following functions:
# here we specify that the data type is a float so that we can get these summary statistics!
x = torch.tensor([1, 2, 3, 4], dtype=torch.float)
print(x.min(), x.max(), x.mean(), x.sum())
output
tensor(1.) tensor(4.) tensor(2.5000) tensor(10.)
- Minimum: \(\text{min}(x)\) returns the smallest element.
- Maximum: \(\text{max}(x)\) returns the largest element.
- Mean: \(\text{mean}(x) = \frac{1}{n} \sum_{i=1}^{n} x_i\), where \(n\) is the number of elements.
- Sum: \(\text{sum}(x) = \sum_{i=1}^{n} x_i\), where \(n\) is the number of elements.
It may also be important to know where in the tensor the minimum or maximum sits:
x = torch.tensor([10, 20, 30])
print(x.argmax(), x.argmin()) # Index of max and min
output
tensor(2) tensor(0)
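The returned index can then be used to look up the value itself; a small sketch:

```python
import torch

x = torch.tensor([10, 20, 30])
idx = x.argmax()
print(x[idx])  # tensor(30) -- the maximum value lives at that index
```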
Working with GPUs
Machine learning can be computationally demanding, and one way to speed it up is with graphics processing units, or GPUs. These are hardware devices that excel at certain tasks, like matrix math, so moving our tensors to a GPU can make our AI models run faster! But first, let's check whether one is available:
torch.cuda.is_available() # True if GPU is available
output
True
Now to move a tensor to GPU:
device = 'cuda' if torch.cuda.is_available() else 'cpu'
tensor = torch.tensor([1, 2, 3]).to(device)
print(tensor)
tensor_cpu = tensor.to('cpu')
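You can confirm where a tensor currently lives with its device attribute. This sketch works whether or not a GPU is present, since it falls back to the CPU:

```python
import torch

device = 'cuda' if torch.cuda.is_available() else 'cpu'
tensor = torch.tensor([1, 2, 3]).to(device)
print(tensor.device)  # cuda:0 if a GPU was found, otherwise cpu
```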
Random Values and Reproducibility
We often initialize model values with random numbers. To ensure those numbers are consistent from run to run, you can set a "seed":
torch.manual_seed(42)
random_tensor = torch.rand(3, 4)
print(random_tensor)
output
tensor([[0.8823, 0.9150, 0.3829, 0.9593],
[0.3904, 0.6009, 0.2566, 0.7936],
[0.9408, 0.1332, 0.9346, 0.5936]])
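A sketch showing what reproducibility buys us: resetting the seed before generating again produces exactly the same random tensor.

```python
import torch

torch.manual_seed(42)
first = torch.rand(3, 4)

torch.manual_seed(42)  # reset the seed before generating again
second = torch.rand(3, 4)

print(torch.equal(first, second))  # True -- same seed, same numbers
```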
Tensor and NumPy Integration
Tensors are special objects designed for complex operations. However, we may want to convert them back to NumPy arrays so that we can do things like plot model results. First we will show you how to take a NumPy array and make a tensor:
# NumPy to PyTorch
import numpy as np
np_array = np.array([1, 2, 3])
tensor = torch.from_numpy(np_array)
print(tensor)
output
tensor([1, 2, 3], dtype=torch.int32)
Now let's convert that back to numpy!
# PyTorch to NumPy
numpy_array = tensor.numpy()
print(numpy_array)
output
[1 2 3]
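One caveat worth knowing: torch.from_numpy shares memory with the source array, so modifying one also modifies the other. A small sketch:

```python
import numpy as np
import torch

np_array = np.array([1, 2, 3])
tensor = torch.from_numpy(np_array)  # shares memory with np_array

np_array[0] = 99
print(tensor[0])  # the change shows up in the tensor too
```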
Key Points
- Tensors store data across different dimensions: scalars (0D), vectors (1D), matrices (2D), and higher.
- Basic element-wise operations can be used to modify tensors (e.g. addition, subtraction, multiplication, division).
- For matrix multiplication, the inner dimensions must match!
- Tensor shapes can be manipulated which can be necessary for things like matrix multiplication.
- Tensor elements are accessed with bracket indexing, one index per dimension.
- GPUs can be used for faster computation.
- Set random seeds with torch.manual_seed() for reproducibility.
- Converting between PyTorch tensors and NumPy arrays can be useful when plotting model metrics.