Singular Value Decomposition (SVD)
The Singular Value Decomposition (SVD) is arguably the most important result in applied linear algebra. While eigendecomposition works only for special square matrices, SVD works for any matrix—tall, wide, square, singular, or non-singular. It provides a way to “see” the underlying structure of data by decomposing it into its most significant components.
The Geometric Idea
Every matrix represents a linear map. SVD says that any such map can be broken down into three simple steps:
- A rotation in the input space ().
- A scaling along the principal axes ().
- A rotation in the output space ().
Mathematically:
- : Columns are “right singular vectors.” They define an orthonormal basis in the input space.
- : A diagonal matrix of “singular values” . These tell you the “strength” or “gain” of the matrix in each direction.
- : Columns are “left singular vectors.” They define an orthonormal basis in the output space.
Data Compression: The Best Low-Rank Approximation
The real magic of SVD is the Eckart-Young Theorem. It states that if you want the best possible “summary” of a matrix using only dimensions (where is less than the rank of ), the answer is to keep only the largest singular values and their corresponding vectors.
This is how image compression and noise reduction work. By throwing away small singular values, we lose “noise” or “unimportant detail” but keep the overall structure.
Interactive Lab
Read the code, make a small change, then run it and inspect the output. Runtime setup messages stay outside the terminal so the result remains focused on what the program prints.
The Pseudoinverse: Solving the Solvable and Unsolvable
When a matrix is not invertible (e.g., it is not square or is singular), we can still “solve” using the Moore-Penrose Pseudoinverse .
Using SVD, the pseudoinverse is trivial to compute: where is formed by transposing and replacing every non-zero with .
The solution is the “best” solution in two senses:
- It minimizes the error (Least Squares).
- If there are many such solutions, it picks the one with the smallest length .