Vectors, matrices, and the geometry of linear spaces.
May 2026
In elementary physics, a vector is often described as a directed line segment. While intuitive, this definition is insufficient for higher mathematics. Modern linear algebra treats a Vector Space as an abstract algebraic structure—a “playground” where elements can be added together and scaled by numbers.
A Vector Space over a field $\mathbb{F}$ (typically $\mathbb{R}$ or $\mathbb{C}$) is a set $V$ equipped with two operations: vector addition ($\mathbf{u} + \mathbf{v}$) and scalar multiplication ($c\mathbf{v}$). Instead of memorizing axioms as dry rules, we can view them as the “laws of physics” for our data.
The most common example is $\mathbb{R}^n$, where addition and scaling are performed component-wise. Let’s verify the Commutativity ($\mathbf{u} + \mathbf{v} = \mathbf{v} + \mathbf{u}$) and Distributivity ($c(\mathbf{u} + \mathbf{v}) = c\mathbf{u} + c\mathbf{v}$) properties using NumPy.
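Here is a minimal sketch of that check; the specific vectors and the scalar below are arbitrary choices for illustration.

```python
import numpy as np

u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, -1.0, 0.5])
c = 2.5

# Commutativity of addition: u + v should equal v + u
print(np.allclose(u + v, v + u))                 # True

# Distributivity of scalar multiplication over addition
print(np.allclose(c * (u + v), c * u + c * v))   # True
```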
For any set $V$ to be a formal Vector Space, these eight rules must hold for all vectors $\mathbf{u}, \mathbf{v}, \mathbf{w} \in V$ and scalars $a, b \in \mathbb{F}$: commutativity and associativity of addition, the existence of a zero vector $\mathbf{0}$ and of additive inverses $-\mathbf{v}$, compatibility of scalar multiplication ($a(b\mathbf{v}) = (ab)\mathbf{v}$), the identity $1\mathbf{v} = \mathbf{v}$, and the two distributive laws $a(\mathbf{u} + \mathbf{v}) = a\mathbf{u} + a\mathbf{v}$ and $(a + b)\mathbf{v} = a\mathbf{v} + b\mathbf{v}$.
Consider the set of all points in the first quadrant: $Q = \{(x, y) : x \ge 0, y \ge 0\}$. Why does this fail to be a vector space?
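One way to see the problem numerically; the point and the scalar below are arbitrary examples.

```python
import numpy as np

def in_first_quadrant(p):
    """True if every coordinate is non-negative."""
    return bool(np.all(p >= 0))

p = np.array([3.0, 2.0])       # a point in the first quadrant
q = -1.5 * p                   # scale by a negative number

print(in_first_quadrant(p))    # True
print(q, in_first_quadrant(q)) # [-4.5 -3. ] False -- we have left the set
```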
The failure shown above violates Closure under Scalar Multiplication. If we scale a “positive” vector by a negative number, we leave the set. Thus, the first quadrant is not a vector space.
The beauty of these axioms is that “vectors” don’t have to be arrows. They can be functions or polynomials. The set of polynomials of degree at most $n$ forms a vector space because adding two polynomials yields another polynomial, and the axioms hold.
In advanced applications like Fourier analysis, we treat signals (functions) as vectors. If $f$ and $g$ are continuous functions, then $f + g$ is also continuous.
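A quick sketch of “polynomials as vectors”: each polynomial of degree at most 2 is stored as its coefficient array, and addition and scaling are exactly the component-wise operations from before (the particular coefficients are made up).

```python
import numpy as np

# p(x) = 1 + 2x + 3x^2 and q(x) = 4 - x, stored as coefficient vectors (constant term first)
p = np.array([1.0, 2.0, 3.0])
q = np.array([4.0, -1.0, 0.0])

r = p + 2.0 * q      # the "vector" p + 2q is again a polynomial of degree <= 2
print(r)             # [9. 0. 3.]  ->  9 + 0x + 3x^2
```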
Once we have a vector space, we need a way to describe its contents efficiently. If a vector space is a “playground,” then Linear Independence and Span are the rules for how many “tools” (vectors) you actually need to build everything in that playground.
The Span of a set of vectors is the set of all possible linear combinations of those vectors.
Intuition: If you have two non-parallel arrows in a 2D plane, their span is the entire plane because you can reach any point by scaling and adding those two arrows.
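A small sketch of that intuition: with two non-parallel vectors as the columns of a matrix, np.linalg.solve finds the scaling coefficients that reach any chosen target (the vectors and the target are arbitrary).

```python
import numpy as np

v1 = np.array([1.0, 0.0])
v2 = np.array([1.0, 1.0])       # not parallel to v1
target = np.array([3.0, -2.0])  # any point in the plane

A = np.column_stack([v1, v2])
coeffs = np.linalg.solve(A, target)
print(coeffs)                                            # [ 5. -2.]
print(np.allclose(coeffs[0]*v1 + coeffs[1]*v2, target))  # True
```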
A set of vectors is linearly independent if none of the vectors can be written as a linear combination of the others. In other words, they are all “original” and provide “new directions.”
Vectors $\mathbf{v}_1, \dots, \mathbf{v}_n$ are linearly independent if the equation $c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_n\mathbf{v}_n = \mathbf{0}$ has only the trivial solution $c_1 = c_2 = \cdots = c_n = 0$.
If a vector can be built from others, the set is Dependent. We can test this by checking the Rank of the matrix formed by these vectors. If $\operatorname{rank}(A) < n$ (the number of vectors), they are dependent.
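A sketch of the rank test, using vectors chosen so that the third is deliberately the sum of the first two.

```python
import numpy as np

v1 = np.array([1.0, 0.0, 2.0])
v2 = np.array([0.0, 1.0, 1.0])
v3 = v1 + v2                    # deliberately dependent

A = np.column_stack([v1, v2, v3])
rank = np.linalg.matrix_rank(A)
print(rank)                     # 2 < 3 columns -> the set is linearly dependent
```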
Imagine three vectors in 3D space. If one is a combination of the others, they all lie on a single plane (2D), even though there are three vectors.
Linear independence tells us if our data is redundant. If you have 100 sensors measuring the same physical phenomenon (e.g., temperature) and they are perfectly correlated, your “feature matrix” will be rank-deficient. You have 100 numbers, but effectively only 1 “dimension” of information.
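As a sketch of that scenario, here 100 simulated “sensors” all report a rescaled copy of the same temperature signal, so the feature matrix has rank 1 (the numbers are synthetic).

```python
import numpy as np

rng = np.random.default_rng(0)
temperature = rng.normal(20.0, 2.0, size=50)   # one underlying signal, 50 time steps

# Each sensor is a rescaled copy of that signal (perfect correlation)
gains = rng.uniform(0.5, 1.5, size=100)
readings = np.outer(temperature, gains)        # shape (50, 100): 100 "features"

print(readings.shape)                          # (50, 100)
print(np.linalg.matrix_rank(readings))         # 1 -- only one dimension of information
```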
The concepts of basis and dimension provide a way to “measure” the size and complexity of a vector space.
A set of vectors $\{\mathbf{v}_1, \dots, \mathbf{v}_n\}$ is linearly independent if the only solution to $c_1\mathbf{v}_1 + \cdots + c_n\mathbf{v}_n = \mathbf{0}$ is $c_i = 0$ for all $i$. The span of a set is the set of all possible linear combinations.
A basis for a vector space $V$ is a set of vectors that spans $V$ and is linearly independent.
The dimension of a vector space $V$, denoted $\dim(V)$, is the number of vectors in any basis for $V$.
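A minimal sketch of a basis check in $\mathbb{R}^3$: three candidate vectors form a basis exactly when the matrix having them as columns has rank 3 (the vectors are arbitrary).

```python
import numpy as np

candidates = np.column_stack([
    [1.0, 0.0, 1.0],
    [0.0, 1.0, 1.0],
    [1.0, 1.0, 0.0],
])

rank = np.linalg.matrix_rank(candidates)
print(rank == 3)   # True -> independent and spanning, so a basis; dim(R^3) = 3
```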
Matrices are the numerical engines of Linear Algebra. While a “Vector Space” is a theoretical playground, a Matrix is the specific blueprint that tells us how to manipulate that space.
A matrix of size $m \times n$ is a grid of numbers that represents a mapping from $\mathbb{R}^n$ to $\mathbb{R}^m$. Each column of the matrix tells us where one of the basis vectors of $\mathbb{R}^n$ “lands” in $\mathbb{R}^m$.
Let’s visualize a Shear Transformation matrix, for example $S = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$. This matrix leaves the $x$-axis alone but shifts the $y$-direction.
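A sketch applying that shear to the standard basis vectors; the columns of the matrix are exactly where $\mathbf{e}_1$ and $\mathbf{e}_2$ land.

```python
import numpy as np

S = np.array([[1.0, 1.0],
              [0.0, 1.0]])      # shear matrix

e1 = np.array([1.0, 0.0])
e2 = np.array([0.0, 1.0])

print(S @ e1)   # [1. 0.]  -- the x-axis is untouched
print(S @ e2)   # [1. 1.]  -- the old "up" direction now leans to the right
```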
A system of linear equations $A\mathbf{x} = \mathbf{b}$ asks: “Which vector $\mathbf{x}$ lands on $\mathbf{b}$ when we apply the transformation $A$?”
If the matrix squashes space into a lower dimension (i.e., it is Rank-Deficient or Singular), then $\mathbf{b}$ might be unreachable, or there might be infinitely many paths to reach it.
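A brief sketch: np.linalg.solve finds the preimage when the matrix is invertible, and raises an error for a singular matrix (the matrices and right-hand side are arbitrary examples).

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([5.0, 10.0])
x = np.linalg.solve(A, b)
print(x, np.allclose(A @ x, b))   # the unique vector that lands on b

# A singular matrix squashes the plane onto a line; most b are unreachable
A_singular = np.array([[1.0, 2.0],
                       [2.0, 4.0]])
try:
    np.linalg.solve(A_singular, b)
except np.linalg.LinAlgError as err:
    print("singular:", err)
```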
Multiplying two matrices $AB$ represents applying transformation $B$ first, then $A$. Because the order of geometric transformations (like rotating then shifting) matters, matrix multiplication is not commutative ($AB \neq BA$ in general).
Reduced Row Echelon Form (RREF) is the “simplest” version of a matrix that still represents the same linear system. It allows us to read off the solutions directly. In RREF, the leading entry of each nonzero row is 1, and all other entries in that column are 0.
Most vectors change direction when a linear transformation is applied. However, some special vectors keep their direction and are only stretched or shrunk. These are eigenvectors, and their scaling factor is the eigenvalue.
For a linear operator $T$, a nonzero vector $\mathbf{v}$ is an eigenvector if $T(\mathbf{v}) = \lambda\mathbf{v}$, where $\lambda$ is a scalar (the eigenvalue).
Intuition: In a 2D rotation (by anything other than 0° or 180°), no real vector keeps its direction. But in a scaling transformation, the axes are eigenvectors because points on them move only along the line.
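A sketch comparing the two cases with np.linalg.eig: a diagonal scaling has real eigenvectors along the axes, while a 90° rotation has only complex eigenvalues.

```python
import numpy as np

scale = np.diag([2.0, 0.5])                  # stretch x, shrink y
rotate = np.array([[0.0, -1.0],
                   [1.0,  0.0]])             # rotate by 90 degrees

vals_s, vecs_s = np.linalg.eig(scale)
vals_r, vecs_r = np.linalg.eig(rotate)

print(vals_s)          # [2.  0.5] -- real eigenvalues, eigenvectors along the axes
print(vecs_s)
print(vals_r)          # complex eigenvalues +i and -i: no real direction survives
```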
To find $\lambda$, we must solve $\det(A - \lambda I) = 0$. This gives us the Characteristic Polynomial.
In high-dimensional spaces (like Google’s PageRank), we don’t calculate determinants. Instead, we use the Power Method: repeatedly apply to a random vector until it converges to the dominant eigenvector.
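A minimal power-method sketch on a small symmetric matrix (the matrix and iteration count are arbitrary; PageRank-scale problems use sparse matrices and convergence tests).

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])

rng = np.random.default_rng(1)
v = rng.normal(size=2)

for _ in range(50):
    v = A @ v
    v = v / np.linalg.norm(v)        # renormalize to avoid overflow

eigenvalue = v @ A @ v               # Rayleigh quotient estimate
print(eigenvalue)                    # close to the dominant eigenvalue of A
print(np.linalg.eigvalsh(A).max())   # reference value from the direct solver
```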
If a matrix has enough eigenvectors, we can rotate our coordinate system so the transformation is just axis-aligned scaling. This is the foundation of Principal Component Analysis (PCA).
A linear transformation is a mapping between vector spaces that preserves the operations of addition and scalar multiplication.
$T: V \to W$ is linear if for all $\mathbf{u}, \mathbf{v} \in V$ and scalars $c$: $T(\mathbf{u} + \mathbf{v}) = T(\mathbf{u}) + T(\mathbf{v})$ and $T(c\mathbf{v}) = cT(\mathbf{v})$.
Every linear transformation between finite-dimensional vector spaces can be represented as a matrix.
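As one illustration (not tied to any particular example in the text), differentiation of polynomials of degree at most 3 is linear, so it can be written as a matrix acting on coefficient vectors.

```python
import numpy as np

# Coefficient vectors are ordered [a0, a1, a2, a3] for a0 + a1*x + a2*x^2 + a3*x^3.
# Differentiation sends this to [a1, 2*a2, 3*a3, 0].
D = np.array([[0.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 2.0, 0.0],
              [0.0, 0.0, 0.0, 3.0],
              [0.0, 0.0, 0.0, 0.0]])

p = np.array([5.0, -1.0, 4.0, 2.0])   # 5 - x + 4x^2 + 2x^3
print(D @ p)                          # [-1.  8.  6.  0.]  ->  -1 + 8x + 6x^2
```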
If $S: U \to V$ and $T: V \to W$ are linear maps, their composition $T \circ S$ is also linear. The matrix representing $T \circ S$ is the product of the matrices representing $T$ and $S$.
The matrix representation of a linear map depends on the choice of bases for $V$ and $W$. If $A$ is the matrix in basis $\mathcal{B}$, and $P$ is the transition matrix from basis $\mathcal{B}'$ to $\mathcal{B}$, then: $A' = P^{-1} A P$.
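A sketch verifying the change-of-basis formula numerically: we apply the map either directly in the old coordinates or by converting to the new coordinates, applying $A' = P^{-1}AP$, and converting back (the matrices are arbitrary, with $P$ chosen invertible).

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [0.0, 3.0]])          # matrix of T in the old basis
P = np.array([[1.0, 1.0],
              [0.0, 1.0]])          # columns: new basis vectors written in the old basis

A_prime = np.linalg.inv(P) @ A @ P  # matrix of the same map in the new basis

x_old = np.array([4.0, 2.0])        # a vector in old coordinates
x_new = np.linalg.solve(P, x_old)   # the same vector in new coordinates

# Apply T in each coordinate system; both describe the same output vector
print(A @ x_old)                    # old coordinates of T(x)
print(P @ (A_prime @ x_new))        # new-coordinate answer converted back: identical
```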
Up to this point, our vector spaces have been “bare.” We can add vectors and scale them, but we have no notion of how long a vector is, or what the angle between two vectors might be. In bare linear algebra, there is no “perpendicular.”
Inner product spaces equip vector spaces with this geometric structure. This allows us to define limits, identify the “closest” approximation to a signal, and decompose data into independent components.
An inner product is a function that takes two vectors and returns a scalar. For $\mathbb{R}^n$, the standard inner product is the dot product $\langle \mathbf{x}, \mathbf{y} \rangle = \sum_i x_i y_i$. To generalize this to any vector space $V$ over $\mathbb{R}$ or $\mathbb{C}$, we define an inner product as a map $\langle \cdot, \cdot \rangle : V \times V \to \mathbb{F}$ satisfying conjugate symmetry ($\langle \mathbf{u}, \mathbf{v} \rangle = \overline{\langle \mathbf{v}, \mathbf{u} \rangle}$), linearity in the first argument, and positive-definiteness ($\langle \mathbf{v}, \mathbf{v} \rangle > 0$ for $\mathbf{v} \neq \mathbf{0}$).
The norm (length) of a vector is defined as $\|\mathbf{v}\| = \sqrt{\langle \mathbf{v}, \mathbf{v} \rangle}$. This is the generalized Pythagorean theorem. The distance between two points is then simply $d(\mathbf{u}, \mathbf{v}) = \|\mathbf{u} - \mathbf{v}\|$.
One of the most powerful results in mathematics is the Cauchy-Schwarz inequality: $|\langle \mathbf{u}, \mathbf{v} \rangle| \le \|\mathbf{u}\|\,\|\mathbf{v}\|$. This ensures that the “angle” defined by $\cos\theta = \frac{\langle \mathbf{u}, \mathbf{v} \rangle}{\|\mathbf{u}\|\,\|\mathbf{v}\|}$ is always between -1 and 1 for real spaces.
Two vectors are orthogonal if $\langle \mathbf{u}, \mathbf{v} \rangle = 0$. In terms of data, orthogonal vectors are completely “unrelated” or “uncorrelated” in the geometry defined by that inner product.
A set of vectors is orthonormal if they are all orthogonal to each other and all have length 1. Working with an orthonormal basis $\{\mathbf{e}_1, \dots, \mathbf{e}_n\}$ is trivial compared to a general basis because the coefficients of any vector are just the inner products: $\mathbf{v} = \sum_i \langle \mathbf{v}, \mathbf{e}_i \rangle \mathbf{e}_i$.
The Gram-Schmidt process is a recipe for turning any basis into an orthonormal one. It works by taking the first vector, then taking the second and subtracting the part that “leaks” into the first, and so on.
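A compact Gram-Schmidt sketch (the classical version, with no re-orthogonalization, so it is illustrative rather than numerically robust).

```python
import numpy as np

def gram_schmidt(vectors):
    """Turn a list of linearly independent vectors into an orthonormal list."""
    basis = []
    for v in vectors:
        w = v.astype(float)
        for q in basis:
            w = w - (w @ q) * q       # subtract the part that "leaks" into earlier directions
        basis.append(w / np.linalg.norm(w))
    return np.array(basis)

Q = gram_schmidt([np.array([1.0, 1.0, 0.0]),
                  np.array([1.0, 0.0, 1.0]),
                  np.array([0.0, 1.0, 1.0])])

print(np.round(Q @ Q.T, 10))          # identity matrix: the rows are orthonormal
```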
The most significant application of inner products is the Orthogonal Projection. Given a subspace $W$ and a vector $\mathbf{v}$ not in $W$, the “closest” vector in $W$ to $\mathbf{v}$ is the projection $\operatorname{proj}_W(\mathbf{v})$.
This is the engine behind Least Squares Regression. If we have a system $A\mathbf{x} = \mathbf{b}$ that has no solution, we look for the $\hat{\mathbf{x}}$ that minimizes the error $\|A\mathbf{x} - \mathbf{b}\|$. This happens when $A\hat{\mathbf{x}}$ is the projection of $\mathbf{b}$ onto the column space of $A$.
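A sketch of that idea with np.linalg.lstsq: an overdetermined system has no exact solution, and the returned $\hat{\mathbf{x}}$ makes $A\hat{\mathbf{x}}$ the projection of $\mathbf{b}$ onto the column space (the data points are synthetic).

```python
import numpy as np

# Fit a line y = m*t + c to three points that are not exactly collinear
t = np.array([0.0, 1.0, 2.0])
y = np.array([1.0, 2.0, 2.5])

A = np.column_stack([t, np.ones_like(t)])     # columns span the "reachable" plane in R^3
x_hat, *_ = np.linalg.lstsq(A, y, rcond=None)
residual = y - A @ x_hat

print(x_hat)                                  # best-fit slope and intercept
print(np.round(A.T @ residual, 10))           # ~[0. 0.]: the error is orthogonal to the column space
```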
Calculus students often see Fourier series as a specialized topic. In reality, it is just linear algebra on an infinite-dimensional inner product space of functions, with $\langle f, g \rangle = \int f(x)\,g(x)\,dx$. The functions $\sin(nx)$ and $\cos(nx)$ form an orthogonal basis. Calculating Fourier coefficients is exactly the same as calculating coordinates in $\mathbb{R}^n$ using dot products!
A vector space can contain smaller vector spaces called subspaces. We can also “divide” a space by a subspace to create a quotient space.
A subset $W$ of a vector space $V$ is a subspace if it contains the zero vector and is closed under vector addition and scalar multiplication.
Given a subspace $W \subseteq V$, the quotient space $V/W$ is the set of all cosets $\mathbf{v} + W$. Intuitively, the quotient space “ignores” all differences that lie within $W$.
The dimension of a quotient space is: $\dim(V/W) = \dim(V) - \dim(W)$.
For a linear map $T: V \to W$, the First Isomorphism Theorem gives $V/\ker(T) \cong \operatorname{im}(T)$, and taking dimensions yields the Rank-Nullity Theorem: $\dim(\ker T) + \dim(\operatorname{im} T) = \dim(V)$.
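A numerical sketch of rank-nullity for a rank-deficient matrix (the matrix is an arbitrary example; the nullity is computed as number of columns minus rank).

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],
              [1.0, 0.0, 1.0]])      # second row is twice the first -> rank deficient

n_cols = A.shape[1]
rank = np.linalg.matrix_rank(A)      # dim(im T)
nullity = n_cols - rank              # dim(ker T)

print(rank, nullity, rank + nullity == n_cols)   # 2 1 True
```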
The Singular Value Decomposition (SVD) is arguably the most important result in applied linear algebra. While eigendecomposition works only for special square matrices, SVD works for any matrix—tall, wide, square, singular, or non-singular. It provides a way to “see” the underlying structure of data by decomposing it into its most significant components.
Every matrix represents a linear map. SVD says that any such map can be broken down into three simple steps: a rotation (or reflection) of the input space, an axis-aligned scaling, and a rotation (or reflection) of the output space.
Mathematically: $A = U \Sigma V^T$, where $U$ and $V$ are orthogonal matrices and $\Sigma$ is a diagonal matrix of non-negative singular values.
The real magic of SVD is the Eckart-Young Theorem. It states that if you want the best possible “summary” of a matrix using only $k$ dimensions (where $k$ is less than the rank of $A$), the answer is to keep only the $k$ largest singular values and their corresponding vectors.
This is how image compression and noise reduction work. By throwing away small singular values, we lose “noise” or “unimportant detail” but keep the overall structure.
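A sketch of rank-$k$ truncation with np.linalg.svd on a small synthetic matrix; keeping only the largest singular values reproduces the matrix up to an error determined by the discarded ones.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(8, 6))

U, s, Vt = np.linalg.svd(A, full_matrices=False)

k = 2                                        # keep only the 2 largest singular values
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

print(np.linalg.matrix_rank(A_k))            # 2
print(np.linalg.norm(A - A_k))               # Frobenius error of the approximation...
print(np.sqrt(np.sum(s[k:] ** 2)))           # ...equals the size of the discarded singular values
```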
When a matrix is not invertible (e.g., it is not square or is singular), we can still “solve” $A\mathbf{x} = \mathbf{b}$ using the Moore-Penrose Pseudoinverse $A^{+}$.
Using SVD, the pseudoinverse is trivial to compute: $A^{+} = V \Sigma^{+} U^T$, where $\Sigma^{+}$ is formed by transposing $\Sigma$ and replacing every non-zero $\sigma_i$ with $1/\sigma_i$.
The solution $\hat{\mathbf{x}} = A^{+}\mathbf{b}$ is the “best” solution in two senses: it minimizes the residual $\|A\mathbf{x} - \mathbf{b}\|$, and among all minimizers it has the smallest norm $\|\mathbf{x}\|$.
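A short sketch with np.linalg.pinv on an underdetermined system, where infinitely many exact solutions exist and the pseudoinverse picks the one with the smallest norm.

```python
import numpy as np

A = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0]])               # 2 equations, 3 unknowns
b = np.array([2.0, 3.0])

x_hat = np.linalg.pinv(A) @ b
print(x_hat)                                   # minimum-norm solution
print(np.allclose(A @ x_hat, b))               # True: it is an exact solution here

x_other = x_hat + np.array([1.0, -1.0, 1.0])   # another exact solution (null-space shift)
print(np.allclose(A @ x_other, b),
      np.linalg.norm(x_hat) < np.linalg.norm(x_other))   # True True
```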
Sometimes, a matrix is “ugly”—it is filled with dense numbers that obscure the underlying physics or logic of the system. Canonical forms are the “simplest” possible representations of a linear operator. By changing our basis, we can reveal the true nature of the transformation.
If you are studying a physical system, like a vibrating string or a chemical reaction, the equations are often coupled (everything depends on everything else). A canonical form decouples the system.
The most famous canonical form is the Diagonal Form. If a matrix is diagonalizable, it means there exists a basis where the operator simply scales each axis independently.
What happens if a matrix is not diagonalizable? This occurs when there are not enough eigenvectors (the matrix is “defective”).
The Jordan Normal Form (JNF) is the best we can do for any square matrix. It decomposes the operator into Jordan Blocks on the diagonal, each of the form $J_k(\lambda) = \begin{pmatrix} \lambda & 1 & & \\ & \lambda & \ddots & \\ & & \ddots & 1 \\ & & & \lambda \end{pmatrix}$.
Everything off the block is zero. Inside the block, we have the eigenvalue on the diagonal and 1s just above it. These 1s represent “coupling” that cannot be removed.
In control theory, we look at the JNF to determine if a system will explode or settle.
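A sketch using SymPy's jordan_form on a defective matrix (NumPy has no Jordan routine, since the form is numerically unstable; SymPy works exactly on small rational matrices).

```python
import sympy as sp

# A defective matrix: eigenvalue 2 repeated, but only one independent eigenvector
A = sp.Matrix([[3, 1],
               [-1, 1]])

P, J = A.jordan_form()
sp.pprint(J)                        # [[2, 1], [0, 2]] -- a single 2x2 Jordan block
print(len(A.eigenvects()[0][2]))    # 1 eigenvector for eigenvalue 2 -> not diagonalizable
```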
The Jordan form requires complex numbers (to find all roots of the characteristic polynomial). If we want to stay within the field of rational numbers $\mathbb{Q}$ or real numbers $\mathbb{R}$, we use the Rational Canonical Form (also known as the Frobenius normal form).
Instead of eigenvalues, this form uses Invariant Factors—polynomials that divide each other. This is deeply connected to the structure theory of modules over a Principal Ideal Domain (PID).
A scalar is a “rank-0” tensor (a single number). A vector is a “rank-1” tensor (an array). A matrix is a “rank-2” tensor (a grid). Beyond these lie higher-rank tensors—multi-dimensional arrays that follow specific transformation rules. Tensors are the natural language of General Relativity, Quantum Mechanics, and modern Deep Learning.
While computer scientists often define a tensor as “a multidimensional array,” mathematicians define it by how it transforms or what it does.
A tensor is a multilinear map. Just as a matrix represents a linear map $V \to W$, a tensor takes multiple vectors as input and produces a scalar (or another vector). For example, a rank-2 tensor is a function $T(\mathbf{u}, \mathbf{v})$ that is linear in $\mathbf{u}$ AND linear in $\mathbf{v}$.
In physics, we distinguish between contravariant vectors (written with upper indices, like $v^i$), which transform like displacements, and covariant vectors (written with lower indices, like $w_i$), which transform like gradients.
A general tensor can have both types of indices. This distinction is crucial for ensuring that physical laws remain the same regardless of the units or axes we choose.
The tensor product is a way to combine two vector spaces $V$ and $W$ into a larger space $V \otimes W$. If $\mathbf{v} \in V$ and $\mathbf{w} \in W$, then $\mathbf{v} \otimes \mathbf{w}$ is an element of the product space. If $\dim(V) = m$ and $\dim(W) = n$, the tensor product has dimension $m \cdot n$. You can think of this as the space of all possible $m \times n$ matrices.
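A sketch of the simplest tensor product: the outer product of two coordinate vectors lives in a space of dimension $m \cdot n$ and can be laid out as an $m \times n$ grid.

```python
import numpy as np

v = np.array([1.0, 2.0, 3.0])        # element of a 3-dimensional space
w = np.array([10.0, 20.0])           # element of a 2-dimensional space

vw = np.outer(v, w)                  # v (tensor) w, arranged as a 3x2 grid
print(vw)
print(vw.shape, vw.size)             # (3, 2) 6  -- dimension m*n = 3*2
```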
The determinant is the most famous example of a multilinear form. It is an alternating $n$-tensor. “Alternating” means that if you swap two input vectors, the sign of the output flips. This property is what allows the determinant to measure signed volume.
In tensor calculus, we often drop the summation sign $\sum$. If an index appears twice (once up, once down), it is summed over. Instead of $y^i = \sum_j A^i_{\ j} x^j$, we write $y^i = A^i_{\ j} x^j$. This compact notation is the standard in engineering and theoretical physics.
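NumPy's einsum mirrors this convention: repeated indices in the subscript string are summed over (the matrix and vector below are arbitrary).

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
x = np.array([5.0, 6.0])

y = np.einsum('ij,j->i', A, x)   # the repeated index j is summed: y_i = A_ij x_j
print(y)                          # [17. 39.]
print(np.allclose(y, A @ x))      # True
```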
Group theory studies symmetry in the abstract. Linear algebra studies matrices acting on vectors. Representation Theory is the bridge between them: it studies how abstract groups can be represented as matrices. This allows us to use the powerful tools of linear algebra (trace, determinant, eigenvalues) to solve problems in abstract algebra and physics.
A collection of symmetries (a Group $G$) can often be represented by a set of linear transformations on a vector space $V$. A representation is a homomorphism $\rho: G \to GL(V)$. This means that for every group element $g$, there is an invertible matrix $\rho(g)$ such that the group composition is preserved: $\rho(gh) = \rho(g)\rho(h)$.
Just as an integer can be broken down into prime factors, a representation can often be broken down into smaller, simpler representations. If a representation cannot be broken down further, it is called irreducible.
Maschke’s Theorem states that for finite groups (over fields like $\mathbb{C}$ or $\mathbb{R}$), every representation is a direct sum of irreducible ones. This is effectively the “fundamental theorem of arithmetic” for representations.
Working with full matrices for every group element is computationally expensive. Character Theory simplifies this by focusing only on the trace of the matrices. The character of a representation $\rho$ is the function $\chi: G \to \mathbb{C}$ defined by: $\chi(g) = \operatorname{tr}(\rho(g))$.
Characters are “class functions”—they are the same for elements in the same conjugacy class. This remarkably compact representation contains almost all the information about the representation.
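A small sketch: the permutation representation of $S_3$ sends each permutation to a $3 \times 3$ permutation matrix, and the character is the trace, i.e. the number of fixed points; elements with the same cycle type (conjugacy class) give the same value.

```python
import numpy as np
from itertools import permutations

def perm_matrix(p):
    """3x3 matrix sending basis vector e_i to e_{p(i)}."""
    M = np.zeros((3, 3))
    for i, j in enumerate(p):
        M[j, i] = 1.0
    return M

for p in permutations(range(3)):
    chi = np.trace(perm_matrix(p))   # the character of this group element
    print(p, chi)                    # identity -> 3, transpositions -> 1, 3-cycles -> 0
```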
In quantum mechanics, particles are described by wavefunctions. If a physical system has a certain symmetry (like rotating a crystal), the wavefunction must transform according to a representation of that symmetry group.