Vector and Matrix Spaces
Fields
Before we talk about vector spaces I want to talk about a similar concept you have already been using a lot called fields. Fields are written using upper-case hollow letters such as $\mathbb{R}$ or $\mathbb{C}$. Seems familiar? A field is a set of elements for which the basic arithmetic operations are defined:
- Addition and subtraction
- Multiplication and division
In relation to vectors these fields are just used to indicate the dimensionality of a vector. For example writing $\boldsymbol{x} \in \mathbb{R}^3$ indicates that the vector $\boldsymbol{x}$ is a 3-dimensional vector with real components such as $\boldsymbol{x} = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}$.
Vector Spaces
On the other hand a vector space, also called a linear space, is a set of elements for which addition and scalar multiplication are defined. The elements of a vector space are then called vectors, which we have already gotten to know. Vector spaces are usually written using italicized upper-case letters such as $V$. More specifically, the elements of a vector space must have the following properties, where $\boldsymbol{0}$ is the null vector:
- Additive inverse: $\boldsymbol{v} + (-\boldsymbol{v}) = \boldsymbol{0}$.
- Additive identity: $\boldsymbol{v} + \boldsymbol{0} = \boldsymbol{v}$.
- Addition is commutative: $\boldsymbol{u} + \boldsymbol{v} = \boldsymbol{v} + \boldsymbol{u}$.
- Addition is associative: $(\boldsymbol{u} + \boldsymbol{v}) + \boldsymbol{w} = \boldsymbol{u} + (\boldsymbol{v} + \boldsymbol{w})$.
- Scalar multiplication identity: $1\boldsymbol{v} = \boldsymbol{v}$.
- Scalar multiplication is distributive over vector addition: $\lambda(\boldsymbol{u} + \boldsymbol{v}) = \lambda\boldsymbol{u} + \lambda\boldsymbol{v}$.
For our "normal" vectors where the elements are real numbers we already know this is the case. Therefore the set of all vectors with our defintions of vector addition and scalar multiplication is a vector space, the so called real vector space. However, with proper defintions of these operations this idea can be extended to create a vector space where the elements of the vectors are complex numbers or functions.
An equivalent definition of a vector space can be given, which is much more concise but uses lots of fancy words from abstract algebra. The first four axioms (related to vector addition) say that a vector space is an abelian/commutative group under addition, and the remaining axioms (related to scalar multiplication) say that this operation defines a ring homomorphism from the field $F$ into the endomorphism ring of this group. Even more concisely, a vector space is a module over a field.
Subspaces and Ambient Spaces
We often don't actually care about vector spaces, but much more about subspaces. A subspace is a subset of a vector space that is itself a vector space. The ambient space is the vector space that contains the subspace. Think of it this way: we can have a 3-dimensional ambient space, so a room. In this room we can then add a wall, i.e. a plane, which is a 2-dimensional subspace. This plane is then a vector space itself, but it is contained in the 3-dimensional ambient space.
Let's look at some examples of subspaces. Think of a vector $\boldsymbol{v}$ in 2-dimensional space. Now we can obtain a set of vectors that are all multiples of the vector $\boldsymbol{v}$. If we then think of all the points we can reach we get an infinitely long line through the origin. This line is a 1-dimensional subspace of the 2-dimensional ambient space. We can then take another vector $\boldsymbol{w}$ and do the same thing. This will give us another line through the origin and another 1-dimensional subspace. If we now combine these two vectors as a linear combination $a\boldsymbol{v} + b\boldsymbol{w}$ we get a plane through the origin. This plane is a 2-dimensional subspace of the 2-dimensional ambient space, so it covers the whole space. We also often say that $\boldsymbol{v}$ and $\boldsymbol{w}$ span the subspace.
So we can formally define a subspace $S$ of a vector space $V$ as a subset of $V$ that satisfies the following properties:

- Closure under addition: if $\boldsymbol{u}, \boldsymbol{v} \in S$ then $\boldsymbol{u} + \boldsymbol{v} \in S$.
- Closure under scalar multiplication: if $\boldsymbol{v} \in S$ then $\lambda\boldsymbol{v} \in S$ for any scalar $\lambda$.
- Contains the null vector: $\boldsymbol{0} \in S$.
So if we take any two vectors from the subspace and add them, or multiply any of them with any scalar, we get a vector that is also in the subspace (closure under addition and scalar multiplication). Additionally the subspace must contain the null vector $\boldsymbol{0}$. This actually follows from closure under scalar multiplication: if we take any vector in the subspace and multiply it with the scalar $0$ we get the null vector.
Think about this question before you read on: do any two vectors in $\mathbb{R}^2$ span a plane? The answer is no. Two vectors can only span a plane if they are linearly independent. This means that they are not multiples of each other. If they are multiples of each other they are linearly dependent and they only span a line.
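As a quick numerical illustration (a sketch using NumPy; as a preview, the rank of a matrix, which comes up again below, equals the dimension of the space its columns span):

```python
import numpy as np

# Two linearly dependent vectors: w is a multiple of v,
# so together they only span a 1-dimensional line.
v = np.array([1.0, 2.0])
w = np.array([2.0, 4.0])  # w = 2 * v
print(np.linalg.matrix_rank(np.column_stack([v, w])))  # 1 -> a line

# Two linearly independent vectors span the whole plane.
u = np.array([0.0, 1.0])
print(np.linalg.matrix_rank(np.column_stack([v, u])))  # 2 -> all of R^2
```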
0-Dimensional Subspace
We saw above that in an ambient space of dimension $n$ we can have subspaces of dimension $1$ up to $n$. However, we can also have subspaces of dimension $0$. This 0-dimensional subspace can be created by taking the null vector $\boldsymbol{0}$. No matter how many times we multiply it with a scalar it will always stay the null vector, just a point at the origin. This is a 0-dimensional subspace.
This means that in an $n$-dimensional vector space we can create subspaces of $n + 1$ different dimensions, one for each dimension from $0$ to $n$.
Therefore all vector spaces have at least two subspaces, the 0-dimensional subspace and the vector space itself.
Span
The span of a set of vectors is the set of all possible linear combinations of the vectors:

$$\operatorname{span}(\boldsymbol{v}_1, \ldots, \boldsymbol{v}_k) = \{a_1\boldsymbol{v}_1 + a_2\boldsymbol{v}_2 + \cdots + a_k\boldsymbol{v}_k \mid a_1, \ldots, a_k \in \mathbb{R}\}$$

This is quite clearly related to subspaces, as subspaces are the set of all possible linear combinations of a set of vectors. The span of a set of vectors is a subspace of the vector space. This is why it is often said that a subspace is spanned by a set of vectors. Some examples:
- $\operatorname{span}\left( \begin{bmatrix} 0 \\ 0 \end{bmatrix} \right) = \{\boldsymbol{0}\}$, a 0-dimensional subspace in a 2-dimensional ambient space.
- $\operatorname{span}\left( \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \end{bmatrix} \right) = \mathbb{R}^2$, a 2-dimensional subspace (plane) in a 2-dimensional ambient space.
- $\operatorname{span}\left( \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} \right)$, a 2-dimensional subspace in a 3-dimensional ambient space.
Basis
We have already seen above that some vectors don't actually increase the dimensionality of the subspace they are in. This has to do with them being a linear combination of the other vectors. So if we have a specific subspace we might want to find the minimal set of vectors that spans this subspace. This is called a basis. More formally, a basis of some subspace is a set of vectors that are linearly independent and span the subspace. The most common example of a basis is the standard basis that spans the real vector space $\mathbb{R}^n$. The standard basis is the set of vectors $\{\boldsymbol{e}_1, \boldsymbol{e}_2, \ldots, \boldsymbol{e}_n\}$ where $\boldsymbol{e}_i$ is the vector with a $1$ at the $i$-th position and $0$ elsewhere.
A subspace can have many different bases, but all bases have the same number of vectors. This number is called the dimension of the subspace. So any linearly independent set of 2 vectors with 2 components will span a 2-dimensional subspace in a 2-dimensional ambient space.
Some bases for the 2-dimensional real vector space $\mathbb{R}^2$:

$$\left\{ \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \end{bmatrix} \right\}, \quad \left\{ \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ -1 \end{bmatrix} \right\}, \quad \left\{ \begin{bmatrix} 2 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 \\ 3 \end{bmatrix} \right\}$$
Orthogonal and Orthonormal Basis
Certain bases are better than others as they make calculations easier and have nice properties.
One of these categories is orthogonal bases. An orthogonal basis is a basis where all the vectors are orthogonal to each other. This means that the inner/dot product of any two distinct vectors in the basis is zero. However, this does require that the vector space is an inner product space, which is a vector space with an inner product defined on it.
Another category is orthonormal bases. An orthonormal basis is a basis where all the vectors are orthogonal to each other and have a length of 1. This means that the inner/dot product of any two distinct vectors in the basis is zero and the inner/dot product of a vector with itself is 1, because the length of a vector is the square root of the inner product of the vector with itself. An example of an orthonormal basis is the standard basis of the real vector space.
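A handy way to check orthonormality is to stack the basis vectors as the columns of a matrix $\boldsymbol{B}$ and verify that $\boldsymbol{B}^T\boldsymbol{B}$ is the identity matrix, since its entries are exactly the pairwise inner products. A small sketch with NumPy, using a normalized version of the basis $\{(1,1), (1,-1)\}$:

```python
import numpy as np

# An orthonormal basis stacked as columns of B satisfies B^T B = I:
# off-diagonal entries are the pairwise dot products (0),
# diagonal entries are the squared lengths (1).
B = np.array([[1.0, 1.0],
              [1.0, -1.0]]) / np.sqrt(2)  # normalized {(1,1), (1,-1)}
print(B.T @ B)                            # ~ identity matrix
print(np.allclose(B.T @ B, np.eye(2)))    # True
```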
Change of Basis
Coordinate Vectors
We know vectors are identified by their magnitude and direction. Most often it is also easiest to think of a vector in its standard position, an arrow pointing from the origin to somewhere in space. In the standard position the point the vector is pointing to in the Cartesian coordinate system is the point that matches the vector's components. This sequence of coordinates is called the coordinate vector of the vector. More formally the coordinate vector of a vector $\boldsymbol{v}$ with respect to a basis $B = \{\boldsymbol{b}_1, \boldsymbol{b}_2, \ldots, \boldsymbol{b}_n\}$ is the sequence of scalars $c_1, c_2, \ldots, c_n$ such that

$$\boldsymbol{v} = c_1\boldsymbol{b}_1 + c_2\boldsymbol{b}_2 + \cdots + c_n\boldsymbol{b}_n$$

for any vector $\boldsymbol{v}$ from the vector space spanned by the basis $B$. In the standard position the vector space is spanned by the standard basis vectors, which is why the coordinates are just the components of the vector and the coordinate vector is the vector itself. However, we have seen that a vector space can be spanned by many different bases, so the coordinate vector of a vector can change depending on the basis.
Say we have the vector $\boldsymbol{v} = \begin{bmatrix} 3 \\ 1 \end{bmatrix}$ in $\mathbb{R}^2$. We then have the standard basis $E = \left\{ \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \end{bmatrix} \right\}$. The coordinate vector of $\boldsymbol{v}$ with respect to the standard basis is then:

$$[\boldsymbol{v}]_E = \begin{bmatrix} 3 \\ 1 \end{bmatrix} \quad \text{since} \quad \boldsymbol{v} = 3\begin{bmatrix} 1 \\ 0 \end{bmatrix} + 1\begin{bmatrix} 0 \\ 1 \end{bmatrix}$$

Now if we have the basis $B = \left\{ \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ -1 \end{bmatrix} \right\}$ the coordinate vector of $\boldsymbol{v}$ with respect to the basis $B$ is:

$$[\boldsymbol{v}]_B = \begin{bmatrix} 2 \\ 1 \end{bmatrix} \quad \text{since} \quad \boldsymbol{v} = 2\begin{bmatrix} 1 \\ 1 \end{bmatrix} + 1\begin{bmatrix} 1 \\ -1 \end{bmatrix} = \begin{bmatrix} 3 \\ 1 \end{bmatrix}$$
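Finding such a coordinate vector is just solving a linear system: with the basis vectors as the columns of a matrix $\boldsymbol{B}$, the coordinates $\boldsymbol{c}$ satisfy $\boldsymbol{B}\boldsymbol{c} = \boldsymbol{v}$. A sketch with NumPy, reproducing the example above:

```python
import numpy as np

# Finding the coordinate vector of v with respect to a basis B means
# solving B @ c = v, where the basis vectors are the columns of B.
v = np.array([3.0, 1.0])
B = np.array([[1.0, 1.0],
              [1.0, -1.0]])  # basis vectors (1,1) and (1,-1) as columns
c = np.linalg.solve(B, v)
print(c)                     # [2. 1.]
print(B @ c)                 # recovers [3. 1.]
```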
Matrix Spaces
In some cases it is useful to think of a matrix as a collection of vectors.
Column Space
A matrix can be thought of as a collection of column vectors. So an $m \times n$ matrix $\boldsymbol{A}$ can be thought of as $n$ column vectors concatenated together. If we then take the span of these column vectors we get a subspace of the vector space. This subspace is called the column space of a matrix. The number of rows in the matrix is the dimensionality of the ambient space. The largest number of linearly independent column vectors is the dimension of the column space; this corresponds to the rank of the matrix. If all the column vectors are linearly independent then the column vectors also form a basis of the column space. The column space of the matrix $\boldsymbol{A}$ is denoted as $C(\boldsymbol{A})$.
Let's look at the following matrices:

$$\boldsymbol{A} = \begin{bmatrix} 1 & 0 & 2 \\ 0 & 1 & 1 \\ 1 & 1 & 3 \\ 2 & 0 & 4 \\ 0 & 2 & 2 \end{bmatrix}, \quad \boldsymbol{B} = \begin{bmatrix} 1 & 2 & 0 & 1 \\ 0 & 1 & 3 & 0 \\ 0 & 0 & 1 & 2 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$
Determining the ambient spaces is easy as we just need to count the number of rows, i.e. the components of the column vectors. The column spaces are then the span of the column vectors.
To find the dimension of the column space we can use the rank of the matrix. The rank of matrix $\boldsymbol{A}$ is $2$ because the third column vector is two times the first column vector plus the second column vector. The rank of matrix $\boldsymbol{B}$ is $4$, i.e. it is full rank.
Therefore the column space of matrix $\boldsymbol{A}$ is a 2-dimensional subspace in a 5-dimensional ambient space and the column space of matrix $\boldsymbol{B}$ is a 4-dimensional subspace in a 4-dimensional ambient space.
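As a sanity check, NumPy's matrix_rank (which computes the rank numerically via the singular value decomposition) confirms these ranks:

```python
import numpy as np

# The matrices from above: A's third column is 2*col1 + col2.
A = np.array([[1, 0, 2],
              [0, 1, 1],
              [1, 1, 3],
              [2, 0, 4],
              [0, 2, 2]], dtype=float)
B = np.array([[1, 2, 0, 1],
              [0, 1, 3, 0],
              [0, 0, 1, 2],
              [0, 0, 0, 1]], dtype=float)

print(np.linalg.matrix_rank(A))  # 2 -> dim of C(A)
print(np.linalg.matrix_rank(B))  # 4 -> full rank
```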
Column Space of AA^T
Interestingly the column space of the matrix $\boldsymbol{A}$ is the same as the column space of the matrix $\boldsymbol{A}\boldsymbol{A}^T$.
Firstly, if $\boldsymbol{A}$ is an $m \times n$ matrix then $\boldsymbol{A}\boldsymbol{A}^T$ is an $m \times m$ matrix. So we can see that the ambient spaces are the same, but the number of column vectors is different. The ambient spaces being the same is however a necessary precondition for the column spaces being the same.
If we then look at what the matrix multiplication $\boldsymbol{A}\boldsymbol{A}^T$ actually is, we can see that each of its columns is just a linear combination of the column vectors of $\boldsymbol{A}$. Because a vector space is defined by all the possible linear combinations of the vectors that span it, a linear combination of the column vectors of $\boldsymbol{A}$ can not leave the column space of $\boldsymbol{A}$, so $C(\boldsymbol{A}\boldsymbol{A}^T)$ must at least be a subset of $C(\boldsymbol{A})$. But because we also know that the rank of $\boldsymbol{A}\boldsymbol{A}^T$ is the same as the rank of $\boldsymbol{A}$ (which follows from the rank-nullity theorem) we know that the column spaces have the same dimensionality, and a subspace contained in another subspace of the same dimension must be equal to it. However, the spanning sets are different: $\boldsymbol{A}\boldsymbol{A}^T$ has different column vectors, and a different number of them, than $\boldsymbol{A}$. Therefore we can say the following:

$$C(\boldsymbol{A}\boldsymbol{A}^T) = C(\boldsymbol{A})$$
If we write out the matrix multiplication column by column we can see the linear combinations of the column vectors of $\boldsymbol{A}$ directly: the $j$-th column of $\boldsymbol{A}\boldsymbol{A}^T$ is $\boldsymbol{A}$ multiplied with the $j$-th column of $\boldsymbol{A}^T$, i.e. a linear combination of the columns of $\boldsymbol{A}$ with the entries of that column of $\boldsymbol{A}^T$ as the weights.
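We can also verify $C(\boldsymbol{A}\boldsymbol{A}^T) = C(\boldsymbol{A})$ numerically: the ranks agree, and appending the columns of $\boldsymbol{A}\boldsymbol{A}^T$ to $\boldsymbol{A}$ adds no new directions. A sketch with NumPy using a random matrix:

```python
import numpy as np

# Check C(A A^T) = C(A) numerically: ranks agree, and appending the
# columns of A A^T to A does not increase the rank (no new directions).
rng = np.random.default_rng(1)
A = rng.standard_normal((5, 3))
G = A @ A.T                          # 5 x 5

r = np.linalg.matrix_rank
print(r(A), r(G))                    # both 3
print(r(np.hstack([A, G])) == r(A))  # True: C(G) adds nothing to C(A)
```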
Membership of a Vector in the Column Space
If we have a vector $\boldsymbol{v}$ and we want to know if it is in the column space of a matrix $\boldsymbol{A}$ we are actually asking the question of whether the vector can be written as a linear combination of the column vectors of $\boldsymbol{A}$. This is the same as asking if the vector is in the span of the column vectors of $\boldsymbol{A}$. This turns into our favorite equation to solve:

$$\boldsymbol{A}\boldsymbol{x} = \boldsymbol{v}$$
Where $\boldsymbol{x}$ is the vector containing the weights of the linear combination. If we can find a solution for $\boldsymbol{x}$ then the vector is in the column space of $\boldsymbol{A}$. This equation can be solved using Gaussian elimination. If the vector is not in the column space of $\boldsymbol{A}$ then the equation has no solution. This can be interpreted geometrically: say the matrix has a column space that is a plane in $\mathbb{R}^3$ and the vector $\boldsymbol{v}$ does not lie in that plane. Then the vector is not in the column space of $\boldsymbol{A}$.
For example, take

$$\boldsymbol{A} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 1 & 1 \end{bmatrix}, \quad \boldsymbol{v} = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}$$

Gaussian elimination gives the solution $\boldsymbol{x} = \begin{bmatrix} 1 \\ 2 \end{bmatrix}$, and indeed $1 \cdot \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix} + 2 \cdot \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}$. Therefore the vector is in the column space of the matrix.
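In code, a least-squares solver gives the candidate weights $\boldsymbol{x}$; checking whether $\boldsymbol{A}\boldsymbol{x}$ actually reproduces $\boldsymbol{v}$ then decides membership (a sketch with NumPy, reusing the example above):

```python
import numpy as np

# Membership test: v is in C(A) iff A x = v has a solution.
# lstsq always returns the best x; we then check whether A x actually hits v.
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
v = np.array([1.0, 2.0, 3.0])

x, *_ = np.linalg.lstsq(A, v, rcond=None)
print(x)                      # [1. 2.]
print(np.allclose(A @ x, v))  # True -> v is in the column space
```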
Augment-rank Algorithm
There is also another way to determine if a vector is in the column space of a matrix: the augment-rank algorithm. The idea is very simple and relies on the fact that the rank of a matrix is equivalent to the dimensionality of the spanned space. We concatenate the matrix $\boldsymbol{A}$ with the vector $\boldsymbol{v}$ and then calculate the rank of the augmented matrix $[\boldsymbol{A} \mid \boldsymbol{v}]$. If the rank of the augmented matrix is the same as the rank of $\boldsymbol{A}$ then the vector is in the column space of the matrix, because it can be written as a linear combination of the column vectors of $\boldsymbol{A}$. If the rank increases then the vector is not in the column space of the matrix and has added a new dimension to the spanned space.
To calculate the rank of the augmented matrix we can use Gaussian elimination to bring the matrix into row-echelon form. The rank of the matrix is then the number of non-zero rows in the row-echelon form.
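This translates directly into code. A minimal sketch with NumPy, where the rank is computed numerically instead of by hand via row-echelon form:

```python
import numpy as np

def in_column_space(A, v, tol=None):
    """Augment-rank test: v is in C(A) iff rank([A | v]) == rank(A)."""
    augmented = np.column_stack([A, v])
    return np.linalg.matrix_rank(augmented, tol) == np.linalg.matrix_rank(A, tol)

A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
print(in_column_space(A, np.array([1.0, 2.0, 3.0])))  # True
print(in_column_space(A, np.array([1.0, 2.0, 0.0])))  # False: rank jumps to 3
```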
Row Space
Just like a matrix can be thought of as a collection of column vectors it can also be thought of as a collection of row vectors. If we then take the span of these row vectors we get a subspace of the vector space. This subspace is called the row space of a matrix. The number of columns in the matrix is the dimensionality of the ambient space. The largest number of linearly independent row vectors is the dimension of the row space; this again corresponds to the rank of the matrix. The row space of the matrix $\boldsymbol{A}$ is denoted as $R(\boldsymbol{A})$.
Therefore by taking the transpose of the matrix we can interchange the row space and the column space. This means that the row space of the matrix $\boldsymbol{A}$ is the same as the column space of the matrix $\boldsymbol{A}^T$ and vice versa:

$$R(\boldsymbol{A}) = C(\boldsymbol{A}^T), \quad C(\boldsymbol{A}) = R(\boldsymbol{A}^T)$$
From the above it is quite easy to see that if a matrix is symmetric then the row space is the same as the column space, because the matrix is the same as its transpose. The same holds if a square $n \times n$ matrix is full rank: both spaces are then $n$-dimensional subspaces of $\mathbb{R}^n$ and must therefore both be the whole of $\mathbb{R}^n$.
Row Space of A^TA
Just like for the column space of $\boldsymbol{A}\boldsymbol{A}^T$, the row space of $\boldsymbol{A}^T\boldsymbol{A}$ is the same as the row space of $\boldsymbol{A}$, because each row of the matrix $\boldsymbol{A}^T\boldsymbol{A}$ is a linear combination of the row vectors of $\boldsymbol{A}$. So we can say the following:

$$R(\boldsymbol{A}^T\boldsymbol{A}) = R(\boldsymbol{A})$$
And because the matrices $\boldsymbol{A}^T\boldsymbol{A}$ and $\boldsymbol{A}\boldsymbol{A}^T$ are symmetric we can also say that their column spaces are the same as their row spaces:

$$C(\boldsymbol{A}^T\boldsymbol{A}) = R(\boldsymbol{A}^T\boldsymbol{A}) = R(\boldsymbol{A}), \quad C(\boldsymbol{A}\boldsymbol{A}^T) = R(\boldsymbol{A}\boldsymbol{A}^T) = C(\boldsymbol{A})$$
However this does not mean that $R(\boldsymbol{A}) = C(\boldsymbol{A})$. As discussed above, that is only guaranteed if the matrix $\boldsymbol{A}$ is symmetric or square and full rank.
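Again this can be checked numerically: stacking the rows of $\boldsymbol{A}^T\boldsymbol{A}$ under the rows of $\boldsymbol{A}$ should not increase the rank. A sketch with NumPy using a random matrix:

```python
import numpy as np

# Verify R(A^T A) = R(A) numerically: stacking the rows of A^T A under
# the rows of A does not increase the rank.
rng = np.random.default_rng(2)
A = rng.standard_normal((5, 3))
M = A.T @ A                          # 3 x 3

r = np.linalg.matrix_rank
print(r(np.vstack([A, M])) == r(A))  # True: rows of A^T A add nothing
print(r(M) == r(A))                  # True: same dimensionality
```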
Null Space of a Matrix
The null space of a matrix $\boldsymbol{A}$, denoted $N(\boldsymbol{A})$, is the set of all vectors $\boldsymbol{x}$ that the matrix maps to the null vector, i.e. all solutions of $\boldsymbol{A}\boldsymbol{x} = \boldsymbol{0}$. We are not interested in the trivial case where $\boldsymbol{x} = \boldsymbol{0}$, which is always a solution; the interesting question is whether there are non-zero vectors in the null space.
Left Null Space
The left null space of $\boldsymbol{A}$ is the null space of the transpose of the matrix, $N(\boldsymbol{A}^T)$. The name comes from the fact that $\boldsymbol{A}^T\boldsymbol{x} = \boldsymbol{0}$ is the same as $\boldsymbol{x}^T\boldsymbol{A} = \boldsymbol{0}^T$, where the vector multiplies the matrix from the left.
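Numerically, a basis of the null space can be read off from the singular value decomposition: the right singular vectors belonging to (numerically) zero singular values span $N(\boldsymbol{A})$. A sketch with NumPy, where the tolerance is an arbitrary choice:

```python
import numpy as np

def null_space_basis(A, tol=1e-10):
    """Orthonormal basis of N(A) from the SVD: right singular vectors
    whose singular values are (numerically) zero."""
    _, s, Vt = np.linalg.svd(A)
    rank = np.sum(s > tol)
    return Vt[rank:].T                # columns span the null space

A = np.array([[1.0, 0.0, 2.0],
              [0.0, 1.0, 1.0]])      # rank 2, so the nullity is 1
N = null_space_basis(A)
print(N)                             # one basis vector, ~ (-2, -1, 1) normalized
print(np.allclose(A @ N, 0))         # True
```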
Orthogonal Subspaces and Complements
Spaces of Functions
We can also define a space of functions. This is a vector space where the elements are functions, with addition and scalar multiplication defined pointwise. The functions must satisfy the properties of a vector space.
For example $C[a, b]$ is the space of continuous functions on the interval $[a, b]$. This is a vector space because the sum of two continuous functions is continuous and the multiplication of a continuous function with a scalar is continuous.
This can also be extended to $C^k[a, b]$, the space of functions on $[a, b]$ that are $k$-times continuously differentiable.
$P_m$ is the space of polynomials of degree at most $m$. This is a vector space because the sum of two such polynomials is again a polynomial of degree at most $m$, and so is the multiplication of such a polynomial with a scalar.
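This is why $P_m$ behaves exactly like $\mathbb{R}^{m+1}$: a polynomial of degree at most $m$ can be represented by its coefficient vector, and polynomial addition and scaling become vector addition and scalar multiplication. A small sketch with NumPy for $P_2$:

```python
import numpy as np

# P_2 behaves like R^3: represent a polynomial of degree <= 2 by its
# coefficient vector (constant term first). Addition and scaling of
# polynomials are then just vector operations on the coefficients.
p = np.array([1.0, 0.0, 2.0])   # 1 + 0x + 2x^2
q = np.array([0.0, 3.0, -2.0])  # 3x - 2x^2

s = p + q                       # (p + q)(x) = 1 + 3x, still in P_2
print(s)                        # [1. 3. 0.] -- note the degree can drop
print(np.polyval(s[::-1], 2.0)) # polyval wants highest degree first; -> 7.0
```

The degree dropping in the example (the $x^2$ terms cancel) is exactly why the space is defined as polynomials of degree *at most* $m$: polynomials of degree exactly $m$ would not be closed under addition.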
Normed Vector Spaces
Vector Spaces with Dot Products
From the inner product you can then also define orthogonality and therefore also orthonormal bases of these spaces.