Week 4 — Numerical Linear Algebra II: Eigenvalue Algorithms and Iterative Methods

Overview

You cannot compute eigenvalues by finding roots of the characteristic polynomial — that problem is hopelessly ill-conditioned and, by Abel’s theorem, has no closed form past degree four. Eigenvalues are computed iteratively. This week covers the two pillars: dense eigenvalue algorithms (power iteration, the shifted QR algorithm) for moderate matrices, and Krylov-subspace methods (Arnoldi, Lanczos, conjugate gradients, GMRES) for the large sparse systems that dominate real workloads — PageRank, PDE solves, and the linear systems inside optimization.

Readings

T&B Lectures 24–29 — eigenvalue problems, reduction to Hessenberg/tridiagonal form, Rayleigh quotient, power iteration, inverse iteration, the QR algorithm with shifts.
T&B Lectures 30–33 — overview of iterative methods, Krylov subspaces, Arnoldi iteration, how Krylov methods estimate eigenvalues.
T&B Lectures 35–40 — GMRES, the Lanczos iteration, conjugate gradients, preconditioning.

Key Concepts

Why iteration, and the Rayleigh quotient

Eigenvalue algorithms produce a sequence converging to the spectrum. The Rayleigh quotient

\[r(x) = \frac{x^\top A x}{x^\top x}\]

is the best scalar approximation to an eigenvalue given an approximate eigenvector \(x\); for symmetric \(A\) it is accurate to second order near an eigenvector, which drives the fast local convergence of the methods below.

Power iteration and friends

Power iteration repeatedly applies \(A\) and normalizes, \(x_{k+1} = A x_k / \|A x_k\|\), converging to the dominant eigenvector at rate \(|\lambda_2/\lambda_1|\). Inverse iteration applies \((A - \mu I)^{-1}\) to converge to the eigenvector whose eigenvalue is nearest the shift \(\mu\). Rayleigh-quotient iteration updates the shift to the current Rayleigh quotient each step and converges cubically for symmetric matrices.

The QR algorithm

The practical workhorse for dense eigenproblems: repeatedly factor \(A_k = Q_k R_k\) and reform \(A_{k+1} = R_k Q_k\). The iterates are orthogonally similar to \(A\) and converge to (block) triangular form, revealing all eigenvalues at once. First reducing \(A\) to upper-Hessenberg form (tridiagonal if symmetric) makes each step cheap, and shifts accelerate convergence dramatically. This is essentially what LAPACK’s eig does.

Krylov subspaces

For large sparse \(A\), you only get to compute matrix-vector products. The order-\(k\) Krylov subspace

\[\mathcal{K}_k(A, b) = \operatorname{span}\{b, Ab, A^2 b, \dots, A^{k-1}b\}\]

is the natural space in which to seek approximate solutions and eigenvectors. Arnoldi iteration builds an orthonormal basis of \(\mathcal{K}_k\) and a small Hessenberg matrix whose eigenvalues (Ritz values) approximate those of \(A\); for symmetric \(A\) it specializes to the cheaper three-term Lanczos recurrence.

CG and GMRES

For solving \(Ax = b\) iteratively:

Conjugate gradients (CG) — for symmetric positive-definite \(A\). It minimizes the \(A\)-norm of the error over the Krylov subspace using a short recurrence, with convergence governed by \(\kappa(A)\): roughly \(\sqrt{\kappa}\) iterations to reduce the error by a fixed factor.
GMRES — for general nonsymmetric \(A\). It minimizes the residual 2-norm over the Krylov subspace, but stores the full basis (cost grows per iteration), so it is usually restarted.

Preconditioning — solving \(M^{-1}A x = M^{-1}b\) with \(M \approx A\) cheaply invertible — is what makes these methods fast in practice by shrinking the effective condition number.

Connections

Forward: CG reappears in Week 9 as an inner solver for Newton steps and large-scale convex problems. Eigenvalue/spectral reasoning underlies the steady-state Markov computation in Week 7.
Across courses: Sparse iterative solvers and spectral methods for large graphs/embeddings (Courses 1, 3); the same matvec-only mindset governs GPU-friendly numerical kernels (Course 1).