Notes on Linear Algebra

Several of my classes, especially Recommender Systems, make use of linear algebra concepts and notation. This document collects some basic notes on linear algebra concepts and what they mean, to act as a quick crash course or refresher, along with pointers for further study.
This document is relatively terse; for a more in-depth treatment, see the resources at the end of the page, particularly Linear Algebra Done Right.
Core Terms
The core structures in linear algebra are vectors and matrices.
Vectors
A vector is an ordered sequence of values, typically real numbers. In programming terms, we would represent it as an array (i.e., a 1-“dimensional” NumPy `ndarray`). We denote vectors either with the vector symbol $\vec{x}$ or as bold-face lowercase letters $\mathbf{x}$; to write out the content of a vector, we usually use angle brackets, e.g. $\langle 1, 5, 3 \rangle$ or $\langle x_1, x_2, \dots, x_n \rangle$.
For a vector $\vec{x}$, we denote individual elements of the vector with a subscripted variable $x_i$, where $i$ is the position in the vector (starting from 1).
The number of values is called the dimension of the vector. To say that $\vec{x}$ is a real-valued vector with dimension $n$, we write $\vec{x} \in \mathbb{R}^n$.
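As a quick sketch of how this looks in NumPy (the values here are illustrative):

```python
import numpy as np

# A vector with dimension 4, in the linear-algebra sense.
x = np.array([3.0, 1.0, 4.0, 1.0])

# Math notation is 1-based (x_1 = 3); NumPy indexing is 0-based.
assert x[0] == 3.0

# The mathematical dimension of the vector:
assert len(x) == 4
```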
Matrices
A matrix is a grid of numbers (a 2D array, in programming terms). For example, the following is a $2 \times 3$ matrix $M$:

$$M = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}$$
We conventionally use uppercase letters to denote matrices (and sets). Individual elements of the matrix are denoted with subscripted lowercase, with the row number coming first. For the matrix above, $m_{1,2} = 2$. When we are using variables as subscripts, we may omit the comma, as in $m_{ij}$.
The rows and columns of the matrix form vectors, which we also reference with subscripts while specifying whether we are referring to a row or column vector. For $M$ above, the column vector $\vec{m}_{\cdot,2} = \langle 2, 5 \rangle$, while the row vector $\vec{m}_{1,\cdot} = \langle 1, 2, 3 \rangle$.
As with a vector, we denote the dimension of a matrix by using a superscript on the domain of the matrix values: a matrix with $m$ rows and $n$ columns of real numbers is denoted $M \in \mathbb{R}^{m \times n}$.
There are a few special kinds of matrices. A diagonal matrix is a matrix that is zero everywhere except along the diagonal starting at the top left. That is, $d_{ij} = 0$ for $i \neq j$, for example:

$$D = \begin{bmatrix} d_1 & 0 & 0 \\ 0 & d_2 & 0 \\ 0 & 0 & d_3 \end{bmatrix}$$
Diagonal matrices are usually square.
An identity matrix, usually denoted $I$ or $I_n$, is a square diagonal matrix whose diagonal is all 1.
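NumPy has constructors for both of these special matrices; a minimal sketch (the diagonal values are arbitrary):

```python
import numpy as np

# A 3x3 diagonal matrix with the given values on the diagonal.
D = np.diag([2.0, 5.0, 7.0])

# Off-diagonal entries are zero.
assert D[0, 1] == 0.0 and D[2, 0] == 0.0

# The identity matrix I_3: a square diagonal matrix of ones.
I = np.eye(3)

# Multiplying by the identity leaves a matrix unchanged.
assert np.array_equal(I @ D, D)
```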
Other Domains
Above, we have used the real numbers $\mathbb{R}$ as the domain, or the set of valid values for a matrix or vector. These are not the only valid values, though; in principle, any set can be used.
If we have a vector or matrix of binary values 0 and 1, we may write $\vec{x} \in \{0, 1\}^n$. If our values are known to be in the range 0–1, we may write $\vec{x} \in [0, 1]^n$ ($[a, b]$ is the standard notation for the closed interval from $a$ to $b$: $[a, b] = \{x \in \mathbb{R} : a \le x \le b\}$).
Term Cross-Reference
The terminological overlap between mathematical dimension (number of elements in a vector) and programming dimension (whether you have a 1D array, 2D matrix, 3D, etc.) is confusing. Vectors also have a mathematical length, which is different from the mathematical dimension (see Vector Norms). When programming languages talk about the length of an array or list, e.g. with the Python `len(array)` function, that is the dimension in linear algebra.
- Dimension
    - In linear algebra, the number of elements in a vector or matrix.
    - In programming (including NumPy), the number of dimensions along which we can index an array, such as 1D for a vector, 2D for a matrix, and 3D for a tensor.
- Length
    - In linear algebra, the norm of a vector (see Vector Norms).
    - In programming, the number of elements in an array along a specific dimension (equivalent to the linear algebra dimension).
To reduce confusion, I rarely use the term “length” to refer to mathematical length, preferring “$L_2$ norm” or “Euclidean length”.
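The terminology clash is easy to see in NumPy itself; a small sketch:

```python
import numpy as np

v = np.array([1.0, 2.0, 3.0])   # a vector with mathematical dimension 3
M = np.ones((2, 3))             # a 2x3 matrix

# Programming "dimension": the number of axes you can index along.
assert v.ndim == 1 and M.ndim == 2

# Programming "length" = linear-algebra dimension (for a vector).
assert len(v) == 3

# Mathematical "length" (the Euclidean/L2 norm) is something else entirely.
assert np.isclose(np.linalg.norm(v), np.sqrt(14.0))
```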
Beyond Matrices
A tensor is a generalization of a matrix to more than 2 dimensions.
A scalar is a single value (not a vector or matrix).
Vector Operations
Vectors define several operations:
- Addition
-
Two vectors can be added, denoted by $\vec{x} + \vec{y}$. Both vectors must be the same length, and vector addition adds the corresponding elements of each vector to produce a new vector ($z_i = x_i + y_i$).
Vectors can also be added to a scalar, as in $\vec{x} + a$, in which case $a$ is added to each element of $\vec{x}$.
- Scalar multiplication
-
A vector can be multiplied by a scalar $a$, denoted $a\vec{x}$. This is defined by multiplying each element of the vector by the scalar, so $y_i = a x_i$.
Addition and scalar multiplication combine to support subtraction ($\vec{x} - \vec{y} = \vec{x} + (-1)\vec{y}$).
- Inner product
-
The inner product or dot product of two vectors, denoted $\vec{x} \cdot \vec{y}$, is the sum of the products of the corresponding pairs of elements:

$$\vec{x} \cdot \vec{y} = \sum_{i=1}^{n} x_i y_i$$
Note that “multiplication” is not a supported operation for vectors. Array-oriented programming environments such as NumPy, R, and Matlab implement array multiplication `arr1 * arr2` that multiplies the elements pairwise to produce a new vector of the same length, analogous to vector addition. This pairwise multiplication is a very useful operation in many contexts, but it is not a defined operator in linear algebra.
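The vector operations above can be sketched in NumPy like so (values are illustrative):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, 5.0, 6.0])

# Addition and scalar multiplication are elementwise.
assert np.array_equal(x + y, np.array([5.0, 7.0, 9.0]))
assert np.array_equal(2.0 * x, np.array([2.0, 4.0, 6.0]))

# The inner (dot) product: 1*4 + 2*5 + 3*6 = 32.
assert np.dot(x, y) == 32.0

# NumPy's `*` is pairwise multiplication -- handy in practice,
# but not a defined operator in linear algebra.
assert np.array_equal(x * y, np.array([4.0, 10.0, 18.0]))
```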
Vector Norms
The norm of a vector is a measure of its length, in geometrical terms. The typical length is the L₂ norm, also called the Euclidean norm or Euclidean length (or sometimes Euclidean distance, which is just the length of a vector obtained by subtracting two other vectors), and is a multidimensional generalization of the Pythagorean theorem:

$$\|\vec{x}\|_2 = \sqrt{\sum_{i=1}^{n} x_i^2}$$
If you see a reference to the “norm” of a vector without further qualification, or the length operator $\|\vec{x}\|$, it is usually referring to the L₂ norm. In my writing and teaching, I try to explicitly name the norm to avoid confusion.
We will often see the square of the L₂ norm, denoted $\|\vec{x}\|_2^2$: this is just the sum of the squares of the vector’s elements, as squaring has the effect of removing the square root. Further, $\|\vec{x}\|_2^2 = \vec{x} \cdot \vec{x}$.
A unit vector is a vector whose L₂ norm is 1 ($\|\vec{x}\|_2 = 1$). We can normalize a vector to be a unit vector by dividing it by the norm: $\hat{x} = \vec{x} / \|\vec{x}\|_2$.
Another common norm is the L₁ norm, also called the Manhattan distance. This is denoted $\|\vec{x}\|_1$, and is the sum of the absolute values: $\|\vec{x}\|_1 = \sum_{i=1}^{n} |x_i|$.
You may sometimes see reference to the $L_\infty$ norm, which is the maximum absolute value of the vector: $\|\vec{x}\|_\infty = \max_i |x_i|$.
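All of these norms are available through NumPy's `np.linalg.norm`; a small sketch using a classic 3–4–5 example:

```python
import numpy as np

x = np.array([3.0, -4.0])

# L2 (Euclidean) norm: sqrt(3^2 + (-4)^2) = 5.
assert np.linalg.norm(x) == 5.0

# Its square equals the dot product of the vector with itself.
assert np.isclose(np.linalg.norm(x) ** 2, np.dot(x, x))

# L1 norm: |3| + |-4| = 7.
assert np.linalg.norm(x, ord=1) == 7.0

# L-infinity norm: the maximum absolute value.
assert np.linalg.norm(x, ord=np.inf) == 4.0

# Dividing by the norm yields a unit vector.
x_hat = x / np.linalg.norm(x)
assert np.isclose(np.linalg.norm(x_hat), 1.0)
```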
Matrix Operations
Matrices also support addition and scalar multiplication, identically to vectors. In addition, they support the following operations:
- Transpose
- $M^\mathsf{T}$ denotes the transpose of matrix $M$, which just switches the rows and columns of $M$. That is, if $M \in \mathbb{R}^{m \times n}$ and $N = M^\mathsf{T}$, then $N \in \mathbb{R}^{n \times m}$ and $n_{ij} = m_{ji}$.
- Multiplication
-
Unlike vectors, two matrices can be multiplied, provided their dimensions align. If $A \in \mathbb{R}^{m \times k}$ and $B \in \mathbb{R}^{k \times n}$ (so $A$ has the same number of columns as $B$ has rows), then the multiplication $AB$ yields a matrix $C \in \mathbb{R}^{m \times n}$ such that:

$$c_{ij} = \sum_{\ell=1}^{k} a_{i\ell} b_{\ell j}$$

This is the same as the inner product: $c_{ij}$ is the inner product of row $i$ of $A$ and column $j$ of $B$.
Note that unlike scalar multiplication, matrix multiplication is not commutative ($AB$ is not necessarily the same as $BA$, even if the matrices are square so both multiplications are defined). Multiplication also interacts with transpose, by reversing the matrices: $(AB)^\mathsf{T} = B^\mathsf{T} A^\mathsf{T}$.
- Matrix-vector multiplication
-
You can also multiply a matrix by a vector. If $A \in \mathbb{R}^{m \times n}$ and $\vec{x} \in \mathbb{R}^n$, then the multiplication $A\vec{x}$ is defined, and the result is the same as multiplying $A$ by a matrix with a single column ($\vec{x} \in \mathbb{R}^{n \times 1}$).
If $A \in \mathbb{R}^{m \times n}$ and $\vec{x} \in \mathbb{R}^m$ (the vector matches the number of rows), then instead you can pre-multiply the vector by the matrix, as in $\vec{x}^\mathsf{T} A$: the vector is treated as a matrix with a single row ($\vec{x}^\mathsf{T} \in \mathbb{R}^{1 \times m}$), and $\vec{x}^\mathsf{T} A \in \mathbb{R}^{1 \times n}$.
- Matrix inversion
-
The matrix inverse of a matrix $M$ is a matrix $M^{-1}$ such that $M M^{-1} = I$. Not all matrices have inverses; if the inverse of a matrix $M$ exists, then $M$ is called invertible, and if no inverse exists then it is singular.
While linear algebra packages include routines to compute matrix inverses, we very rarely use them. The primary use for a matrix inverse is to solve a system of equations, and there are usually better ways to solve the system (see Linear Systems).
- Frobenius norm
- The Frobenius norm of a matrix, denoted $\|M\|_F$, is the matrix version of the L₂ norm: the square root of the sum of the squares of the elements. Its square $\|M\|_F^2 = \sum_{i,j} m_{ij}^2$.
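The matrix operations above can be sketched in NumPy (the matrices here are arbitrary examples), including the usual advice to solve linear systems rather than compute inverses:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
B = np.array([[5.0, 6.0],
              [7.0, 8.0]])

# Transpose swaps rows and columns.
assert np.array_equal(A.T, np.array([[1.0, 3.0], [2.0, 4.0]]))

# Matrix multiplication (`@`) is not commutative.
assert not np.array_equal(A @ B, B @ A)

# Transpose reverses the order of a product: (AB)^T = B^T A^T.
assert np.array_equal((A @ B).T, B.T @ A.T)

# Frobenius norm: square root of the sum of squared entries.
assert np.isclose(np.linalg.norm(A, 'fro') ** 2, np.sum(A ** 2))

# Prefer solving the system Ax = b over forming A's inverse explicitly.
b = np.array([5.0, 11.0])
x = np.linalg.solve(A, b)
assert np.allclose(A @ x, b)
```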
Linear Systems
TBD.
Further Reading
- Linear Algebra Done Right by Sheldon Axler — an approachable undergraduate textbook, available free online.
- Handbook of Linear Algebra — a dense but useful reference for linear algebra properties and theorems.