Linear Algebra

Vectors, matrices, and the geometry of linear spaces.

Official Documentation

May 2026

Contents

Linear Algebra

  • Vector Spaces and Axioms
  • Linear Independence and Span
  • Matrices and Linear Systems
  • Eigenvalues and Eigenvectors
  • Inner Product Spaces
  • Singular Value Decomposition (SVD)
  • Canonical Forms
  • Tensors and Multilinear Algebra
  • Representation Theory

Vector Spaces

  • Basis and Dimension
  • Subspaces and Quotients

Linear Maps

  • Linear Transformations and Matrices

Vector Spaces and Axioms

The Axiomatic Foundation of Vector Spaces

In elementary physics, a vector is often described as a directed line segment. While intuitive, this definition is insufficient for higher mathematics. Modern linear algebra treats a Vector Space as an abstract algebraic structure—a “playground” where elements can be added together and scaled by numbers.

1. Defining the Playground

A vector space $V$ over a field $F$ (typically $\mathbb{R}$) is a set equipped with two operations: vector addition ($+$) and scalar multiplication ($\cdot$). Instead of memorizing axioms as dry rules, we can view them as the “laws of physics” for our data.

Standard Euclidean Space: $\mathbb{R}^n$

The most common example is $\mathbb{R}^n$, where addition and scaling are performed component-wise. Let’s verify the Commutativity ($u + v = v + u$) and Distributivity ($a(u + v) = au + av$) properties using NumPy.

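A minimal NumPy sketch of this verification (the particular vectors and scalar are arbitrary choices; the axioms hold for any of them):

```python
import numpy as np

# Two vectors in R^3 and a scalar
u = np.array([1.0, 2.0, 3.0])
v = np.array([-4.0, 0.5, 2.0])
a = 2.5

# Commutativity: u + v == v + u
print(np.allclose(u + v, v + u))                  # True

# Distributivity: a(u + v) == a*u + a*v
print(np.allclose(a * (u + v), a * u + a * v))    # True
```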

2. The Eight Axioms

For any set $V$ to be a formal Vector Space, these eight rules must hold for all $u, v, w \in V$ and scalars $a, b \in \mathbb{R}$:

  1. Commutativity: $u + v = v + u$.
  2. Associativity: $(u + v) + w = u + (v + w)$.
  3. Additive Identity: There is a $\mathbf{0}$ such that $v + \mathbf{0} = v$.
  4. Additive Inverse: For every $v$, there is a $-v$ such that $v + (-v) = \mathbf{0}$.
  5. Multiplicative Identity: $1 \cdot v = v$.
  6. Compatibility: $a(bv) = (ab)v$.
  7. Distributivity of Scalar: $a(u + v) = au + av$.
  8. Distributivity of Vector: $(a + b)v = av + bv$.

Exercise: The Non-Vector Space

Consider the set of all points in the first quadrant: $Q = \{(x, y) \in \mathbb{R}^2 \mid x, y \ge 0\}$. Why does this fail to be a vector space?

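A small sketch of the failure, using a hypothetical helper `in_first_quadrant` for the membership test:

```python
import numpy as np

def in_first_quadrant(p):
    """Membership test for Q = {(x, y) : x >= 0, y >= 0} (helper for this demo)."""
    return bool(np.all(p >= 0))

v = np.array([3.0, 1.0])
w = -2.0 * v                        # scalar multiplication by a negative number

print(in_first_quadrant(v))        # True
print(w, in_first_quadrant(w))     # [-6. -2.] False: the result leaves Q
```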

The failure shown above violates Closure under Scalar Multiplication. If we scale a “positive” vector by a negative number, we leave the set. Thus, the first quadrant is not a vector space.

3. Abstract Examples: Polynomials

The beauty of these axioms is that “vectors” don’t have to be arrows. They can be functions or polynomials. The set $\mathbb{P}_n$ of polynomials of degree $\le n$ forms a vector space because adding two polynomials yields another polynomial, and the axioms hold.

Why is the set of polynomials of exactly degree 3 not a vector space?

4. Function Spaces

In advanced applications like Fourier analysis, we treat signals (functions) as vectors. If $f(t)$ and $g(t)$ are continuous functions, then $h(t) = f(t) + g(t)$ is also continuous.

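A minimal sketch treating `sin` and `cos` as two “vectors” and adding them pointwise:

```python
import numpy as np

# Treat functions as vectors: the "sum" of f and g is pointwise addition
f = np.sin
g = np.cos
h = lambda t: f(t) + g(t)

t = np.linspace(0.0, 2.0 * np.pi, 5)
print(h(t))                                        # samples of the sum function
print(np.allclose(h(t), np.sin(t) + np.cos(t)))    # True: addition is pointwise
```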

5. Summary Check

Which property ensures that 2(u + v) is the same as 2u + 2v?

Linear Independence and Span

Once we have a vector space, we need a way to describe its contents efficiently. If a vector space is a “playground,” then Linear Independence and Span are the rules for how many “tools” (vectors) you actually need to build everything in that playground.

1. The Concept of Span

The Span of a set of vectors $S = \{v_1, v_2, \dots, v_n\}$ is the set of all possible linear combinations of those vectors.

$$\text{span}(S) = \{ c_1 v_1 + c_2 v_2 + \dots + c_n v_n : c_i \in \mathbb{R} \}$$

Intuition: If you have two non-parallel arrows in a 2D plane, their span is the entire plane because you can reach any point by scaling and adding those two arrows.

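One way to check that a target point lies in the span is to solve for the coefficients directly; a sketch with arbitrarily chosen vectors:

```python
import numpy as np

v1 = np.array([1.0, 0.5])
v2 = np.array([-0.5, 2.0])          # not parallel to v1
target = np.array([3.0, 4.0])

# Find c1, c2 with c1*v1 + c2*v2 = target: stack the vectors as columns
A = np.column_stack([v1, v2])
c = np.linalg.solve(A, target)

print(c)                                             # the coefficients
print(np.allclose(c[0] * v1 + c[1] * v2, target))    # True: target is in the span
```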

2. Linear Independence: Avoiding Redundancy

A set of vectors is linearly independent if none of the vectors can be written as a linear combination of the others. In other words, they are all “original” and provide “new directions.”

Formal Definition

Vectors $\{v_1, \dots, v_n\}$ are linearly independent if the equation $c_1 v_1 + \dots + c_n v_n = \mathbf{0}$ has only the trivial solution $c_i = 0$.

If a vector can be built from the others, the set is Dependent. We can test this by checking the rank of the matrix whose columns are these vectors. If $\text{Rank}(A) < \text{number of vectors}$, they are dependent.

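A sketch of the rank test, with a deliberately redundant third vector:

```python
import numpy as np

v1 = np.array([1.0, 0.0, 1.0])
v2 = np.array([0.0, 1.0, 1.0])
v3 = v1 + 2 * v2                    # deliberately a combination of the others

A = np.column_stack([v1, v2, v3])   # vectors as columns
rank = np.linalg.matrix_rank(A)

print(rank)                                                   # 2
print("dependent" if rank < A.shape[1] else "independent")    # dependent
```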

3. Visualizing Dependency in 3D

Imagine three vectors in 3D space. If one is a combination of the others, they all lie on a single plane (2D), even though there are three vectors.

If a set of vectors contains the zero vector, is it linearly independent?

4. Why Does It Matter?

Linear independence tells us if our data is redundant. If you have 100 sensors measuring the same physical phenomenon (e.g., temperature) and they are perfectly correlated, your “feature matrix” will be rank-deficient. You have 100 numbers, but effectively only 1 “dimension” of information.

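A toy version of this scenario (the sensor model here is an illustrative assumption, not real data):

```python
import numpy as np

rng = np.random.default_rng(0)
temperature = rng.normal(20.0, 2.0, size=50)    # one underlying signal

# 100 "sensors", each a perfect rescaling of the same signal
gains = rng.uniform(0.5, 1.5, size=100)
X = np.outer(temperature, gains)                # 50 samples x 100 features

print(X.shape)                                  # (50, 100)
print(np.linalg.matrix_rank(X))                 # 1: one dimension of information
```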

5. Summary Check

If Rank(A) = n for n vectors in R^n, the vectors are:

Basis and Dimension

The concepts of basis and dimension provide a way to “measure” the size and complexity of a vector space.

Linear Independence and Spanning

A set of vectors $\{\mathbf{v}_1, \dots, \mathbf{v}_n\}$ is linearly independent if the only solution to $c_1\mathbf{v}_1 + \dots + c_n\mathbf{v}_n = \mathbf{0}$ is $c_i = 0$ for all $i$. The span of a set is the set of all possible linear combinations.

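A quick rank check on one candidate basis (the vectors are an arbitrary example):

```python
import numpy as np

# Candidate basis vectors for R^3, stacked as columns
B = np.column_stack([[1.0, 1.0, 0.0],
                     [0.0, 1.0, 1.0],
                     [1.0, 0.0, 1.0]])

rank = np.linalg.matrix_rank(B)
print(rank)          # 3: independent and spanning, hence a basis of R^3
```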

Definition of a Basis

A basis $B$ for a vector space $V$ is a set of vectors that:

  1. Is linearly independent.
  2. Spans $V$.

How many vectors are in any basis of R^3?

Dimension

The dimension of a vector space $V$, denoted $\dim(V)$, is the number of vectors in any basis for $V$.

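A sketch that measures the dimension of a span as the rank of the matrix of spanning vectors:

```python
import numpy as np

# Three spanning vectors for a subspace of R^4, one of them redundant
S = np.array([[1.0, 2.0, 0.0, 1.0],
              [0.0, 1.0, 1.0, 0.0],
              [1.0, 3.0, 1.0, 1.0]])    # third row = first + second

dim = np.linalg.matrix_rank(S)
print(dim)    # 2: any basis of this subspace has exactly 2 vectors
```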

If a subspace of R^5 has dimension 0, what is in the subspace?

Matrices and Linear Systems

Matrices are the numerical engines of Linear Algebra. While a “Vector Space” is a theoretical playground, a Matrix is the specific blueprint that tells us how to manipulate that space.

1. The Matrix as a Map

A matrix $A$ of size $m \times n$ is a grid of numbers that represents a mapping from $\mathbb{R}^n$ to $\mathbb{R}^m$. Each column of the matrix tells us where one of the basis vectors of $\mathbb{R}^n$ “lands” in $\mathbb{R}^m$.

Let’s visualize a Shear Transformation matrix: $A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$. This matrix leaves the $x$-axis fixed and pushes every other point sideways in proportion to its height.

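A minimal sketch showing where the standard basis vectors land under this shear:

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [0.0, 1.0]])       # the shear matrix

e1 = np.array([1.0, 0.0])
e2 = np.array([0.0, 1.0])

print(A @ e1)    # [1. 0.]: e1 is fixed (the first column of A)
print(A @ e2)    # [1. 1.]: e2 is pushed sideways (the second column of A)
```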

2. Linear Systems: $Ax = b$

A system of linear equations asks: “Which vector $x$ lands on $b$ when we apply the transformation $A$?”

If the matrix $A$ squashes space into a lower dimension (i.e., it is Rank-Deficient or Singular), then $b$ might be unreachable, or there might be infinitely many paths to reach it.

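A sketch contrasting a full-rank system with a singular one (the matrices are arbitrary examples):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([5.0, 10.0])

x = np.linalg.solve(A, b)              # A has full rank: a unique solution
print(x, np.allclose(A @ x, b))        # [1. 3.] True

# A singular matrix squashes the plane onto a line
S = np.array([[1.0, 2.0],
              [2.0, 4.0]])             # second row is twice the first
print(np.linalg.matrix_rank(S))        # 1: solve(S, b) would raise LinAlgError
```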

3. Multiplication: Composition of Maps

Multiplying two matrices $AB$ represents applying transformation $B$ first, then $A$. Because the order of geometric transformations (like rotating then shifting) matters, matrix multiplication is not commutative ($AB \neq BA$).
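
A two-line demonstration with a rotation and a shear (arbitrary example matrices):

```python
import numpy as np

R = np.array([[0.0, -1.0],
              [1.0,  0.0]])     # rotate 90 degrees counterclockwise
S = np.array([[1.0, 1.0],
              [0.0, 1.0]])      # shear

print(R @ S)     # shear first, then rotate
print(S @ R)     # rotate first, then shear: a different matrix
```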

What geometric transformation is represented by the matrix [[0, -1], [1, 0]]?

4. Reduced Row Echelon Form (RREF)

RREF is the “simplest” version of a matrix that still represents the same linear system. It allows us to read off the solutions directly. In RREF, the leading entry of each row is 1, and all other entries in that column are 0.
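
SymPy can compute the RREF directly; a sketch with an arbitrary augmented matrix:

```python
import sympy as sp

# Augmented matrix [A | b] for a small linear system
M = sp.Matrix([[1, 2, -1, 3],
               [2, 4,  0, 8],
               [0, 0,  1, 1]])

R, pivot_cols = M.rref()     # reduced row echelon form and pivot columns
sp.pprint(R)                 # leading 1s, zeros elsewhere in pivot columns
print(pivot_cols)            # (0, 2)
```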

If the RREF of a matrix has a row of zeros [0, 0, 0] but the constant vector b is [0, 0, 1], what can we conclude?

5. Summary Check

If A is an n x n matrix and Rank(A) = n, which of the following is true?

Eigenvalues and Eigenvectors

Most vectors change direction when a linear transformation is applied. However, some special vectors keep their direction and are only stretched or shrunk. These are eigenvectors, and their scaling factor is the eigenvalue.

1. The Stability Equation

For a linear operator $A$, a nonzero vector $v$ is an eigenvector if $Av = \lambda v$, where $\lambda$ is a scalar (the eigenvalue).

Intuition: In a generic 2D rotation, no nonzero real vector is mapped to a multiple of itself. But in a scaling transformation, vectors along the axes are eigenvectors because they move only along their own line.

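A minimal sketch using NumPy’s `eig` on an arbitrary example matrix:

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [0.0, 2.0]])

eigenvalues, eigenvectors = np.linalg.eig(A)
print(eigenvalues)                               # [3. 2.]

# Check A v = lambda v for each pair (eigenvectors are the columns)
for lam, v in zip(eigenvalues, eigenvectors.T):
    print(np.allclose(A @ v, lam * v))           # True, True
```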

2. Finding the Spectrum

To find $\lambda$, we must solve $\det(A - \lambda I) = 0$. This gives us the Characteristic Polynomial.

If a matrix is triangular (all zeros below the diagonal), what are its eigenvalues?

3. The Power Method: Finding Eigenvectors Iteratively

In high-dimensional spaces (like Google’s PageRank), we don’t calculate determinants. Instead, we use the Power Method: repeatedly apply $A$ to a random vector until it converges to the dominant eigenvector.

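A sketch of the iteration (the matrix, random seed, and iteration count are arbitrary choices):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])

rng = np.random.default_rng(1)
v = rng.normal(size=2)

# Repeatedly apply A and renormalize
for _ in range(50):
    v = A @ v
    v /= np.linalg.norm(v)

print(v)                              # dominant eigenvector (up to sign)
print(v @ A @ v)                      # Rayleigh quotient: dominant eigenvalue
print(np.linalg.eigvalsh(A).max())    # exact value, for comparison
```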

4. Diagonalization: $A = PDP^{-1}$

If a matrix has enough eigenvectors, we can rotate our coordinate system so the transformation is just axis-aligned scaling. This is the foundation of Principal Component Analysis (PCA).
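
A quick numeric check of the factorization on an arbitrary diagonalizable matrix:

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

evals, P = np.linalg.eig(A)           # columns of P are the eigenvectors
D = np.diag(evals)

print(np.allclose(A, P @ D @ np.linalg.inv(P)))   # True: A = P D P^{-1}
```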

True or False: A matrix must be square to have eigenvalues.

5. Summary Check

If λ = 0 is an eigenvalue of A, what does this imply?

Linear Transformations and Matrices

A linear transformation $T: V \to W$ is a mapping between vector spaces that preserves the operations of addition and scalar multiplication.

Properties of Linear Maps

$T$ is linear if for all $\mathbf{u}, \mathbf{v} \in V$ and $c \in F$:

  1. $T(\mathbf{u} + \mathbf{v}) = T(\mathbf{u}) + T(\mathbf{v})$
  2. $T(c\mathbf{v}) = cT(\mathbf{v})$

Every linear transformation between finite-dimensional vector spaces can be represented as a matrix.

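A sketch verifying both properties for the map $T(v) = Av$, with a rotation matrix as an arbitrary example:

```python
import numpy as np

A = np.array([[0.0, -1.0],
              [1.0,  0.0]])          # a 90-degree rotation
T = lambda v: A @ v                   # the linear map T(v) = Av

u = np.array([1.0, 2.0])
v = np.array([3.0, -1.0])
c = 4.0

print(np.allclose(T(u + v), T(u) + T(v)))    # additivity
print(np.allclose(T(c * u), c * T(u)))       # homogeneity
```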

Composition and Matrix Multiplication

If $T: U \to V$ and $S: V \to W$ are linear maps, their composition $S \circ T: U \to W$ is also linear. The matrix representing $S \circ T$ is the product of the matrices representing $S$ and $T$.

If A is a 3x2 matrix and B is a 2x5 matrix, what is the size of AB?

Change of Basis

The matrix representation of a linear map depends on the choice of bases for $V$ and $W$. If $[T]_B$ is the matrix in basis $B$, and $P$ is the transition matrix from basis $B'$ to $B$, then: $[T]_{B'} = P^{-1} [T]_B P$

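A sketch of the similarity computation (the operator and basis are arbitrary examples):

```python
import numpy as np

T_std = np.diag([2.0, 3.0])           # operator in the standard basis

# Transition matrix P: columns are the new basis vectors in standard coordinates
P = np.array([[1.0, 1.0],
              [0.0, 1.0]])

T_new = np.linalg.inv(P) @ T_std @ P  # same operator, expressed in the new basis
print(T_new)

# Similar matrices share eigenvalues: the underlying map never changed
print(np.linalg.eigvals(T_new))       # [2. 3.]
```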

What type of matrix transforms any vector to itself?

Inner Product Spaces

Up to this point, our vector spaces have been “bare.” We can add vectors and scale them, but we have no notion of how long a vector is, or what the angle between two vectors might be. In bare linear algebra, there is no “perpendicular.”

Inner product spaces equip vector spaces with this geometric structure. This allows us to define limits, identify the “closest” approximation to a signal, and decompose data into independent components.

Geometry from Algebra

An inner product is a function that takes two vectors and returns a scalar. For $\mathbb{R}^n$, the standard inner product is the dot product $u \cdot v = \sum u_i v_i$. To generalize this to any vector space $V$ over $\mathbb{R}$ or $\mathbb{C}$, we define an inner product $\langle u, v \rangle$ as a map satisfying:

  1. Conjugate Symmetry: $\langle u, v \rangle = \overline{\langle v, u \rangle}$.
  2. Linearity: $\langle \alpha u + v, w \rangle = \alpha \langle u, w \rangle + \langle v, w \rangle$.
  3. Positive Definiteness: $\langle v, v \rangle \geq 0$, and $\langle v, v \rangle = 0 \iff v = \mathbf{0}$.

Defining Length and Distance

The norm (length) of a vector is defined as $\|v\| = \sqrt{\langle v, v \rangle}$. This is the generalized Pythagorean theorem. The distance between two points is then simply $d(u, v) = \|u - v\|$.

The Cauchy-Schwarz Inequality

One of the most powerful results in mathematics is the Cauchy-Schwarz inequality: $|\langle u, v \rangle| \leq \|u\| \|v\|$. This ensures that the ratio $\frac{\langle u, v \rangle}{\|u\| \|v\|}$ always lies between $-1$ and $1$ in real spaces, so the “angle” $\theta$ defined by $\cos \theta = \frac{\langle u, v \rangle}{\|u\| \|v\|}$ is well defined.
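
A quick numeric check on random vectors:

```python
import numpy as np

rng = np.random.default_rng(0)
u = rng.normal(size=5)
v = rng.normal(size=5)

lhs = abs(u @ v)
rhs = np.linalg.norm(u) * np.linalg.norm(v)

print(lhs <= rhs)                 # True for any u, v
print((u @ v) / rhs)              # cos(theta): always in [-1, 1]
```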

Orthogonality: The Power of Perpendiculars

Two vectors are orthogonal if $\langle u, v \rangle = 0$. In terms of data, orthogonal vectors are completely “unrelated” or “uncorrelated” in the geometry defined by that inner product.

A set of vectors is orthonormal if they are all orthogonal to each other and all have length 1. Working with an orthonormal basis $\{e_1, \dots, e_n\}$ is trivial compared to a general basis because the coefficients of any vector $v$ are just the inner products: $$v = \langle v, e_1 \rangle e_1 + \dots + \langle v, e_n \rangle e_n$$

Gram-Schmidt: Generating Order from Chaos

The Gram-Schmidt process is a recipe for turning any basis into an orthonormal one. It works by keeping the first vector, then taking the second and subtracting the part that “leaks” into the first, and so on, normalizing each result to unit length.

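A sketch of the classical algorithm (`gram_schmidt` is a helper written for this example, not a library routine):

```python
import numpy as np

def gram_schmidt(vectors):
    """Classical Gram-Schmidt: orthonormalize the rows of `vectors`."""
    basis = []
    for v in vectors:
        w = v.copy()
        for q in basis:
            w = w - (q @ v) * q          # subtract the component of v along q
        basis.append(w / np.linalg.norm(w))
    return np.array(basis)

V = np.array([[1.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])

Q = gram_schmidt(V)
print(np.round(Q @ Q.T, 10))             # identity: the rows are orthonormal
```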

Best Approximations and Projections

The most significant application of inner products is the Orthogonal Projection. Given a subspace $W$ and a vector $v$ not in $W$, the “closest” vector in $W$ to $v$ is the projection $P_W(v)$.

This is the engine behind Least Squares Regression. If we have a system $Ax = b$ that has no solution, we look for the $x$ that minimizes the error $\|Ax - b\|$. This happens when $Ax$ is the projection of $b$ onto the column space of $A$.
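
A minimal least-squares sketch using NumPy’s `lstsq`, fitting a line to made-up points:

```python
import numpy as np

# Overdetermined system: fit y ≈ c0 + c1*x to four data points
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.1, 1.9, 3.2, 3.8])

A = np.column_stack([np.ones_like(x), x])      # columns span the model subspace
coef, *_ = np.linalg.lstsq(A, y, rcond=None)   # minimizes ||A c - y||
print(coef)                                    # intercept and slope

residual = y - A @ coef
print(np.allclose(A.T @ residual, 0))          # residual is ⟂ to col(A)
```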

Fun Application: Fourier Series

Calculus students often see Fourier series as a specialized topic. In reality, it is just linear algebra on an infinite-dimensional inner product space of functions, $L^2$. The functions $\sin(nx)$ and $\cos(nx)$ form an orthogonal basis. Calculating Fourier coefficients is exactly the same as calculating coordinates in $\mathbb{R}^n$ using dot products!

Exercises

If two vectors u and v are orthogonal, what is ||u + v||²?

Why might the classical Gram-Schmidt process lose accuracy in finite-precision numerical computation?

Which of the following is the definition of a Unitary matrix U?

Subspaces and Quotients

A vector space can contain smaller vector spaces called subspaces. We can also “divide” a space by a subspace to create a quotient space.

Subspaces

A subset $W$ of a vector space $V$ is a subspace if:

  1. The zero vector $\mathbf{0} \in W$.
  2. $W$ is closed under addition: $\mathbf{u}, \mathbf{v} \in W \implies \mathbf{u} + \mathbf{v} \in W$.
  3. $W$ is closed under scalar multiplication: $c \in F, \mathbf{v} \in W \implies c\mathbf{v} \in W$.
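
A sketch that spot-checks the three conditions for one particular plane (`in_W` is a helper written for this example; a few sample points are not a proof):

```python
import numpy as np

# Candidate subspace of R^3: the plane W = {(x, y, z) : x + y + z = 0}
def in_W(v, tol=1e-12):
    return abs(np.sum(v)) < tol

u = np.array([1.0, -2.0, 1.0])
v = np.array([3.0, 0.0, -3.0])

print(in_W(np.zeros(3)))        # zero vector is in W
print(in_W(u + v))              # closed under addition
print(in_W(-7.5 * u))           # closed under scalar multiplication
```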

Quotient Spaces

Given a subspace $W \subseteq V$, the quotient space $V/W$ is the set of all cosets $\mathbf{v} + W$. Intuitively, the quotient space “ignores” all differences that lie within $W$.

The dimension of a quotient space is: $\dim(V/W) = \dim(V) - \dim(W)$

If V = R^3 and W is a line through the origin, what is the dimension of V/W?

Kernels and Images

For a linear map $T: V \to W$:

  • The Kernel $\ker(T) = \{ \mathbf{v} \in V \mid T(\mathbf{v}) = \mathbf{0} \}$ is a subspace of $V$.
  • The Image $\operatorname{im}(T) = \{ T(\mathbf{v}) \mid \mathbf{v} \in V \}$ is a subspace of $W$.
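
A sketch using SciPy’s `null_space` helper on an arbitrary rank-1 matrix:

```python
import numpy as np
from scipy.linalg import null_space

# T(v) = Av maps R^3 to R^2; kernel and image are subspaces
A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])          # rank 1: the rows are parallel

K = null_space(A)                        # orthonormal basis for ker(T)
print(K.shape[1])                        # 2 = dim ker(T)
print(np.allclose(A @ K, 0))             # every kernel vector maps to 0

# Rank-nullity: dim im(T) + dim ker(T) = dim of the domain
print(np.linalg.matrix_rank(A) + K.shape[1])    # 3
```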

According to the First Isomorphism Theorem for vector spaces, V/ker(T) is isomorphic to what?

Singular Value Decomposition (SVD)

The Singular Value Decomposition (SVD) is arguably the most important result in applied linear algebra. While eigendecomposition works only for special square matrices, SVD works for any matrix—tall, wide, square, singular, or non-singular. It provides a way to “see” the underlying structure of data by decomposing it into its most significant components.

The Geometric Idea

Every matrix $A$ represents a linear map. SVD says that any such map can be broken down into three simple steps:

  1. A rotation in the input space ($V$).
  2. A scaling along the principal axes ($\Sigma$).
  3. A rotation in the output space ($U$).

Mathematically: $A = U \Sigma V^T$

  • $V$: Columns are “right singular vectors.” They define an orthonormal basis in the input space.
  • $\Sigma$: A diagonal matrix of “singular values” $\sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_n \ge 0$. These tell you the “strength” or “gain” of the matrix in each direction.
  • $U$: Columns are “left singular vectors.” They define an orthonormal basis in the output space.

Data Compression: The Best Low-Rank Approximation

The real magic of SVD is the Eckart-Young Theorem. It states that if you want the best possible “summary” of a matrix $A$ using only $k$ dimensions (where $k$ is less than the rank of $A$), the answer is to keep only the $k$ largest singular values and their corresponding vectors.

$$A_k = \sum_{i=1}^{k} \sigma_i u_i v_i^T$$

This is how image compression and noise reduction work. By throwing away small singular values, we lose “noise” or “unimportant detail” but keep the overall structure.

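A sketch of truncation on a synthetic, nearly rank-2 matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
# A matrix with strong rank-2 structure plus a little noise
A = rng.normal(size=(8, 2)) @ rng.normal(size=(2, 6)) + 0.01 * rng.normal(size=(8, 6))

U, s, Vt = np.linalg.svd(A, full_matrices=False)
print(np.round(s, 3))                    # two large singular values, the rest tiny

k = 2
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]   # best rank-k approximation
print(np.linalg.norm(A - A_k))           # small: little structure was lost
```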

The Pseudoinverse: Solving the Solvable and Unsolvable

When a matrix $A$ is not invertible (e.g., it is not square or is singular), we can still “solve” $Ax = b$ using the Moore-Penrose Pseudoinverse $A^\dagger$.

Using SVD, the pseudoinverse is trivial to compute: $A^\dagger = V \Sigma^\dagger U^T$, where $\Sigma^\dagger$ is formed by transposing $\Sigma$ and replacing every non-zero $\sigma_i$ with $1/\sigma_i$.

The solution $x = A^\dagger b$ is the “best” solution in two senses:

  1. It minimizes the error $\|Ax - b\|^2$ (Least Squares).
  2. If there are many such solutions, it picks the one with the smallest length $\|x\|$.
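
A minimal sketch using NumPy’s `pinv` on an arbitrary rank-deficient example:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [0.0, 1.0]])              # tall, not invertible
b = np.array([1.0, 2.0, 3.0])

x = np.linalg.pinv(A) @ b               # pseudoinverse solution via SVD
print(x)
print(np.linalg.norm(A @ x - b))        # the least-squares residual
```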

Exercises

In an SVD decomposition A = UΣVᵀ, what do the values in Σ represent?

If a matrix has singular values [100, 50, 0.01, 0.0001], which singular values should we keep for a good low-rank approximation?

What is the relationship between singular values and the eigenvalues of AᵀA?

Canonical Forms

Sometimes, a matrix is “ugly”—it is filled with dense numbers that obscure the underlying physics or logic of the system. Canonical forms are the “simplest” possible representations of a linear operator. By changing our basis, we can reveal the true nature of the transformation.

Why Canonical Forms?

If you are studying a physical system, like a vibrating string or a chemical reaction, the equations are often coupled (everything depends on everything else). A canonical form decouples the system.

The most famous canonical form is the Diagonal Form. If a matrix $A$ is diagonalizable, it means there exists a basis where the operator simply scales each axis independently.

The Jordan Normal Form

What happens if a matrix is not diagonalizable? This occurs when there are not enough eigenvectors (the matrix is “defective”).

The Jordan Normal Form (JNF) is the best we can do for any square matrix. It decomposes the operator into Jordan Blocks on the diagonal: $$J = \begin{pmatrix} \lambda & 1 & 0 \\ 0 & \lambda & 1 \\ 0 & 0 & \lambda \end{pmatrix}$$

Everything off the block is zero. Inside the block, we have the eigenvalue on the diagonal and 1s just above it. These 1s represent “coupling” that cannot be removed.

Practical Use: Stability Analysis

In control theory, we look at the JNF to determine if a system will explode or settle.

  • If the eigenvalues have negative real parts, the system is stable.
  • If we have a Jordan block with $\lambda = 0$ and a 1 above it, the system might grow linearly over time (like $t$), which could be problematic!
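
A sketch using SymPy’s `jordan_form` on a small defective matrix:

```python
import sympy as sp

# A defective matrix: eigenvalue 2 repeated, but only one eigenvector
A = sp.Matrix([[2, 1],
               [0, 2]])

P, J = A.jordan_form()
sp.pprint(J)                 # a single 2x2 Jordan block with lambda = 2

# Powers of a Jordan block show the coupling: the off-diagonal entry
# grows like k * lambda**(k - 1) instead of staying a pure power
for k in (2, 3, 4):
    sp.pprint(J**k)
```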

Rational Canonical Form

The Jordan form requires complex numbers (to find all roots of the characteristic polynomial). If we want to stay within the field of rational numbers $\mathbb{Q}$ or real numbers $\mathbb{R}$, we use the Rational Canonical Form (also known as the Frobenius normal form).

Instead of eigenvalues, this form uses Invariant Factors—polynomials arranged so that each divides the next. This is deeply connected to the structure theory of modules over a Principal Ideal Domain (PID).

Exercises

When is a matrix guaranteed to be diagonalizable?

In a Jordan block, what do the 1s above the diagonal represent?

Which canonical form is most useful for solving systems of linear differential equations?

Tensors and Multilinear Algebra

A scalar is a “rank-0” tensor (a single number). A vector is a “rank-1” tensor (an array). A matrix is a “rank-2” tensor (a grid). Beyond these lie higher-rank tensors—multi-dimensional arrays that follow specific transformation rules. Tensors are the natural language of General Relativity, Quantum Mechanics, and modern Deep Learning.

What is a Tensor?

While computer scientists often define a tensor as “a multidimensional array,” mathematicians define it by how it transforms or what it does.

A tensor is a multilinear map. Just as a matrix $A$ represents a linear map $f(v)$, a tensor $T$ takes multiple vectors as input and produces a scalar (or another vector). For example, a rank-2 tensor $T(u, v)$ is a function that is linear in $u$ AND linear in $v$.

Covariance and Contravariance

In physics, we distinguish between:

  • Contravariant vectors ($v^i$): Things like velocity or position that scale “with” the coordinate system.
  • Covariant vectors ($w_i$): Things like gradients or dual vectors that scale “against” the coordinate changes.

A general tensor $T^i_j$ can have both types of indices. This distinction is crucial for ensuring that physical laws remain the same regardless of the units or axes we choose.

The Tensor Product

The tensor product $\otimes$ is a way to combine two vector spaces $V$ and $W$ into a larger space $V \otimes W$. If $v \in V$ and $w \in W$, then $v \otimes w$ is an element of the product space. If $V = \mathbb{R}^m$ and $W = \mathbb{R}^n$, the tensor product $V \otimes W$ has dimension $m \times n$. You can think of this as the space of all possible $m \times n$ matrices.

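A sketch identifying the simple tensor $v \otimes w$ with the outer product of coordinate arrays:

```python
import numpy as np

v = np.array([1.0, 2.0, 3.0])      # an element of R^3
w = np.array([4.0, 5.0])           # an element of R^2

# The simple tensor v ⊗ w, realized concretely as an outer product
T = np.outer(v, w)                 # T[i, j] = v[i] * w[j]
print(T)
print(T.shape)                     # (3, 2): R^3 ⊗ R^2 looks like 3x2 matrices
```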

Multilinear Forms and Determinants

The determinant is the most famous example of a multilinear form. It is an alternating $(n, 0)$-tensor. “Alternating” means that if you swap two input vectors, the sign of the output flips. This property is what allows the determinant to measure signed volume.

Einstein Summation Notation

In tensor calculus, we often drop the summation sign $\sum$. If an index appears twice (once up, once down), it is summed over. Instead of $y^i = \sum_j A^i_j x^j$, we write $y^i = A^i_j x^j$. This compact notation is the standard in engineering and theoretical physics.
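
NumPy’s `einsum` implements exactly this convention; a minimal sketch:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
x = np.array([5.0, 6.0])

# y^i = A^i_j x^j: the repeated index j is summed automatically
y = np.einsum('ij,j->i', A, x)
print(y)
print(np.allclose(y, A @ x))       # matches the ordinary matrix-vector product
```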

Exercises

How is a rank-3 tensor typically visualized in computer science?

What does it mean for a map to be 'multilinear'?

In Einstein notation, what does 'R_ii' (repeated index) imply?

Representation Theory

Group theory studies symmetry in the abstract. Linear algebra studies matrices acting on vectors. Representation Theory is the bridge between them: it studies how abstract groups can be represented as matrices. This allows us to use the powerful tools of linear algebra (trace, determinant, eigenvalues) to solve problems in abstract algebra and physics.

The Basic Idea

A collection of symmetries (a Group $G$) can often be represented by a set of linear transformations on a vector space $V$. A representation is a homomorphism $\rho: G \to GL(V)$. This means that for every group element $g$, there is an invertible matrix $\rho(g)$ such that the group composition is preserved: $\rho(g_1 g_2) = \rho(g_1) \rho(g_2)$

Irreducible Representations (Irreps)

Just as an integer can be broken down into prime factors, a representation can often be broken down into smaller, simpler representations. If a representation cannot be broken down further, it is called irreducible.

Maschke’s Theorem states that for finite groups (over fields like $\mathbb{C}$), every representation is a direct sum of irreducible ones. This is effectively the “fundamental theorem of arithmetic” for representations.

Characters: Data Compression for Symmetries

Working with full matrices for every group element is computationally expensive. Character Theory simplifies this by focusing only on the trace of the matrices. The character of a representation $\rho$ is the function $\chi: G \to \mathbb{C}$ defined by: $\chi(g) = \operatorname{tr}(\rho(g))$

Characters are “class functions”—they are the same for elements in the same conjugacy class. This remarkably compact summary contains almost all the information about the representation.

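A sketch of a 2-dimensional representation of the cyclic group C4 built from rotation matrices (an illustrative construction, not library code):

```python
import numpy as np

def rotation(theta):
    return np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])

# Represent C4 by sending the k-th power of the generator to rotation by 90k degrees
rho = {k: rotation(k * np.pi / 2) for k in range(4)}

# Homomorphism check: rho(g) rho(g^2) == rho(g^3)
print(np.allclose(rho[1] @ rho[2], rho[3]))     # True

# Characters compress each matrix to a single number, its trace
chi = {k: float(np.trace(rho[k])) for k in rho}
print(chi)     # traces 2, 0, -2, 0 (up to floating-point noise)
```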

Symmetry in Physics and Chemistry

In quantum mechanics, particles are described by wavefunctions. If a physical system has a certain symmetry (like rotating a crystal), the wavefunction must transform according to a representation of that symmetry group.

  • Molecular Vibrations: Predicting which vibrations are “infrared active” in a molecule is done by decomposing the representation of the molecule’s symmetry group.
  • Particle Physics: The “Standard Model” is built on the representations of the groups $SU(3) \times SU(2) \times U(1)$. Particles are literally just vectors in the spaces where these groups act!

Exercises

What is a 'representation' of a group?

Why are characters so useful in representation theory?

What does the dimension of a representation correspond to in its character?