Chapter 19: Power Method and Inverse Iteration

Iterative Eigenvalue Algorithms

1. Introduction and Motivation

There are two broad approaches to numerically solving the eigenvalue problem $A\mathbf{v} = \lambda \mathbf{v}$ for an $N \times N$ matrix $A$.

  1. Methods that compute all eigenvalues simultaneously: the QR algorithm (Chapter 18)
  2. Iterative methods that selectively compute specific eigenvalues: the power method and its variants

This chapter covers approach (2) in detail. The power method is the most fundamental technique for finding the eigenvalue of largest absolute value, and inverse iteration is a variant that can efficiently find the eigenvalue closest to any target value.

Historical Note

The power method was systematized by Richard von Mises in 1929. It is also known as the power iteration or Von Mises iteration.

Historical background

The power method was systematized in 1929 by R. von Mises and H. Geiringer in Praktische Verfahren der Gleichungsauflösung. von Mises -- also known for his work in probability and fluid dynamics -- gave one of the first systematic algorithms for eigenvalue computation.

Inverse iteration was proposed by H. Wielandt in 1944 and later refined with error analysis and practical-implementation details by J. H. Wilkinson during 1958-1965. It is the standard eigenvector solver in LAPACK today.

Rayleigh quotient iteration derives from the Rayleigh quotient of J. W. Strutt, 3rd Baron Rayleigh (1842-1919). A. M. Ostrowski rigorously proved its cubic convergence for symmetric matrices in a series of papers in 1958-1959.

2. Power Method

2.1 Prerequisites

Prerequisites for the Power Method

  1. $A$ is diagonalizable (the eigenvectors form a basis for $\mathbb{R}^N$ or $\mathbb{C}^N$).
  2. When the eigenvalues are ordered by absolute value, the dominant eigenvalue is unique: $$|\lambda_1| > |\lambda_2| \geq |\lambda_3| \geq \cdots \geq |\lambda_N|$$
  3. When the initial vector $\mathbf{x}^{(0)}$ is expanded in the eigenvector basis $$\mathbf{x}^{(0)} = c_1 \mathbf{v}_1 + c_2 \mathbf{v}_2 + \cdots + c_N \mathbf{v}_N$$ the coefficient satisfies $c_1 \neq 0$.

Meaning of $c_1 \neq 0$

If the initial vector has no component in the $\mathbf{v}_1$ direction, the power method will not converge to $\lambda_1$. For symmetric matrices, the eigenvectors are orthogonal, so $c_1 = \mathbf{x}^{(0)} \cdot \mathbf{v}_1 / \|\mathbf{v}_1\|^2$, which is equivalent to $\mathbf{x}^{(0)} \cdot \mathbf{v}_1 \neq 0$. For general matrices, however, the eigenvectors are not orthogonal, so $c_1 \neq 0$ is the precise condition. In practice, choosing a random initial vector virtually guarantees $c_1 \neq 0$.

2.2 Algorithm

Power Method Algorithm

Input: $N \times N$ matrix $A$, initial vector $\mathbf{x}^{(0)}$ ($\|\mathbf{x}^{(0)}\| = 1$), convergence threshold $\varepsilon$, maximum iterations $k_{\max}$

  1. For $k = 1, 2, \ldots, k_{\max}$:
    1. $\mathbf{y}^{(k)} = A \mathbf{x}^{(k-1)}$ (matrix-vector product: $O(N^2)$)
    2. $\mathbf{x}^{(k)} = \mathbf{y}^{(k)} / \|\mathbf{y}^{(k)}\|$ (normalization)
    3. $\mu^{(k)} = (\mathbf{x}^{(k)})^T A \mathbf{x}^{(k)}$ (Rayleigh quotient for eigenvalue approximation)
    4. If $|\mu^{(k)} - \mu^{(k-1)}| < \varepsilon$, terminate

Output: Eigenvalue approximation $\mu^{(k)}$, eigenvector approximation $\mathbf{x}^{(k)}$

Convergence of vector directions under the power method
Figure 1: Convergence of vector directions under the power method. Starting from $\mathbf{x}^{(0)} \approx (-0.30,\, 0.95)^T$ (in the second quadrant, close to $\mathbf{v}_2$), repeated multiplication by $A$ drives the vector toward the dominant eigenvector $\mathbf{v}_1$ in the first quadrant. Iterations are colored from light to dark.

3. Convergence Proof

3.1 Basic Property: $A^k \mathbf{v}_n = \lambda_n^k \mathbf{v}_n$

From the eigenvalue equation $A\mathbf{v}_n = \lambda_n \mathbf{v}_n$, the following holds by induction.

\begin{equation} A^k \mathbf{v}_n = \lambda_n^k \mathbf{v}_n \quad (k = 1, 2, 3, \ldots) \label{eq:Akvn} \end{equation}

Proof: By induction on $k$. The case $k = 1$ holds by definition. Assuming the result holds for $k-1$,

$$A^k \mathbf{v}_n = A(A^{k-1} \mathbf{v}_n) = A(\lambda_n^{k-1} \mathbf{v}_n) = \lambda_n^{k-1} (A\mathbf{v}_n) = \lambda_n^{k-1} (\lambda_n \mathbf{v}_n) = \lambda_n^k \mathbf{v}_n$$

3.2 Convergence via Eigenvector Expansion

Convergence Theorem for the Power Method

Suppose $A$ is diagonalizable with $|\lambda_1| > |\lambda_2| \geq \cdots \geq |\lambda_N|$. If $c_1 \neq 0$ in the eigenvector expansion of the initial vector, then the sequence $\{\mathbf{x}^{(k)}\}$ generated by the normalized power method converges to the direction of the eigenvector corresponding to $\lambda_1$.

Proof: Expand the initial vector in the eigenvector basis.

$$\mathbf{x}^{(0)} = c_1 \mathbf{v}_1 + c_2 \mathbf{v}_2 + \cdots + c_N \mathbf{v}_N, \quad c_1 \neq 0$$

Applying $A^k$ and using equation $\eqref{eq:Akvn}$,

\begin{align} A^k \mathbf{x}^{(0)} &= c_1 \lambda_1^k \mathbf{v}_1 + c_2 \lambda_2^k \mathbf{v}_2 + \cdots + c_N \lambda_N^k \mathbf{v}_N \\ &= c_1 \lambda_1^k \left[ \mathbf{v}_1 + \frac{c_2}{c_1}\left(\frac{\lambda_2}{\lambda_1}\right)^k \mathbf{v}_2 + \cdots + \frac{c_N}{c_1}\left(\frac{\lambda_N}{\lambda_1}\right)^k \mathbf{v}_N \right] \label{eq:expansion} \end{align}

Since $|\lambda_1| > |\lambda_i|$ for $i \geq 2$, we have $|\lambda_i / \lambda_1| < 1$, so

$$\lim_{k \to \infty} \left(\dfrac{\lambda_i}{\lambda_1}\right)^k = 0 \quad (i = 2, 3, \ldots, N)$$

Therefore, as $k \to \infty$ the bracketed expression converges to $\mathbf{v}_1$, and after normalization,

$$\lim_{k \to \infty} \dfrac{A^k \mathbf{x}^{(0)}}{\|A^k \mathbf{x}^{(0)}\|} = \dfrac{\mathbf{v}_1}{\|\mathbf{v}_1\|}$$

$\square$

3.3 Eigenvalue Extraction

With the normalization $\|\mathbf{x}^{(k)}\| = 1$, the eigenvalue can be approximated via the Rayleigh quotient.

$$\mu^{(k)} = (\mathbf{x}^{(k)})^T A \mathbf{x}^{(k)}$$

Since $\mathbf{x}^{(k)} \to \mathbf{v}_1 / \|\mathbf{v}_1\|$, we have $A\mathbf{x}^{(k)} \to \lambda_1 \mathbf{x}^{(k)}$, so

$$\lim_{k \to \infty} \mu^{(k)} = \lambda_1 \|\mathbf{x}^{(k)}\|^2 = \lambda_1$$

3.4 Necessity of Normalization

If $|\lambda_1| \neq 1$, iterating without normalization causes overflow ($|\lambda_1| > 1$) or underflow ($|\lambda_1| < 1$) due to the factor $\lambda_1^k$. Normalization at each step prevents this.

4. Convergence Rate Analysis

Convergence Rate of the Power Method

Eigenvector convergence: The error between $\mathbf{x}^{(k)}$ and the true eigenvector is

$$\|\mathbf{x}^{(k)} - (\pm \mathbf{v}_1 / \|\mathbf{v}_1\|)\| = O\!\left(\left|\dfrac{\lambda_2}{\lambda_1}\right|^k\right)$$

Eigenvalue convergence: The error in the Rayleigh quotient $\mu^{(k)}$ is

$$|\mu^{(k)} - \lambda_1| = O\!\left(\left|\dfrac{\lambda_2}{\lambda_1}\right|^{2k}\right)$$

The Rayleigh quotient has second-order accuracy with respect to the eigenvector error, so the eigenvalue converges twice as fast as the eigenvector.

The convergence ratio $r = |\lambda_2/\lambda_1|$ takes values between 0 and 1; the smaller $r$, the faster the convergence. The number of iterations required to reduce the relative eigenvalue error below $\varepsilon$ is

$$k \geq \dfrac{\log \varepsilon}{2 \log r}$$

Iteration Count Estimate

When $\lambda_1 = 10,\ \lambda_2 = 9$, we have $r = 0.9$. To achieve 6-digit accuracy ($\varepsilon = 10^{-6}$),

$$k \geq \dfrac{-6}{2 \log_{10} 0.9} = \dfrac{6}{2 \times 0.0458} \approx 66 \text{ iterations}$$

When $|\lambda_2 / \lambda_1|$ is close to 1, many iterations are required.

5. Inverse Power Method (Inverse Iteration)

5.1 Principle

If $A$ has eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_N$, then $A^{-1}$ has eigenvalues $1/\lambda_1, 1/\lambda_2, \ldots, 1/\lambda_N$ with the same eigenvectors. Applying the power method to $A^{-1}$ therefore converges to the eigenvector corresponding to the largest $|1/\lambda_i|$, i.e., the smallest $|\lambda_i|$.

Inverse Iteration

Apply the power method to $A^{-1}$, i.e., iterate $\mathbf{y}^{(k)} = A^{-1} \mathbf{x}^{(k-1)}$. In practice, instead of computing the inverse explicitly, solve the linear system $A\mathbf{y}^{(k)} = \mathbf{x}^{(k-1)}$.

5.2 Algorithm

Inverse Iteration Algorithm

Preprocessing: Compute $A = LU$ (LU decomposition, once: $O(N^3)$)

  1. For $k = 1, 2, \ldots$:
    1. Solve $L\mathbf{z} = \mathbf{x}^{(k-1)}$ by forward substitution ($O(N^2)$)
    2. Solve $U\mathbf{y}^{(k)} = \mathbf{z}$ by back substitution ($O(N^2)$)
    3. $\mathbf{x}^{(k)} = \mathbf{y}^{(k)} / \|\mathbf{y}^{(k)}\|$
    4. $\mu^{(k)} = (\mathbf{x}^{(k)})^T A \mathbf{x}^{(k)}$ (Rayleigh quotient)

Output: Eigenvalue of smallest absolute value $\mu^{(k)}$ and corresponding eigenvector $\mathbf{x}^{(k)}$

Computational Cost

The LU decomposition is performed once at $O(N^3)$ cost, and each iteration requires only $O(N^2)$ for forward and back substitution. This achieves the same per-iteration cost as the power method while computing the eigenvalue of smallest absolute value.

Power method versus inverse iteration
Figure 2: Power method versus inverse iteration. Applying the power method to $A$ converges to the largest eigenvalue $\lambda_1$, while applying it to $A^{-1}$ (inverse iteration) targets the largest of the inverted eigenvalues $1/\lambda_2$, which corresponds to the smallest eigenvalue $\lambda_2$ of $A$.

6. Shifted Inverse Power Method

6.1 Introducing the Shift

Introduce a target shift value $\sigma$ and apply the power method to $(A - \sigma I)^{-1}$. Since the eigenvalues of $(A - \sigma I)$ are $\lambda_i - \sigma$, the eigenvalues of $(A - \sigma I)^{-1}$ are

$$\dfrac{1}{\lambda_i - \sigma}$$

For the eigenvalue $\lambda_j$ closest to $\sigma$, $|1/(\lambda_j - \sigma)|$ is the largest, so the power method converges to the eigenvector of $\lambda_j$.

Shifted Inverse Iteration

Finds the eigenvalue closest to the shift $\sigma$ and its eigenvector. At each iteration, solve the linear system

$$(A - \sigma I)\mathbf{y}^{(k)} = \mathbf{x}^{(k-1)}$$

and normalize $\mathbf{x}^{(k)} = \mathbf{y}^{(k)} / \|\mathbf{y}^{(k)}\|$.

6.2 Algorithm

Shifted Inverse Iteration Algorithm

Input: $A$, shift $\sigma$, $\mathbf{x}^{(0)}$ ($\|\mathbf{x}^{(0)}\| = 1$), $\varepsilon$

Preprocessing: Compute $A - \sigma I = LU$ (LU decomposition, once)

  1. For $k = 1, 2, \ldots$:
    1. Solve $(A - \sigma I)\mathbf{y}^{(k)} = \mathbf{x}^{(k-1)}$ by forward/back substitution
    2. $\mathbf{x}^{(k)} = \mathbf{y}^{(k)} / \|\mathbf{y}^{(k)}\|$
    3. $\mu^{(k)} = (\mathbf{x}^{(k)})^T A \mathbf{x}^{(k)}$
    4. If $|\mu^{(k)} - \mu^{(k-1)}| < \varepsilon$, terminate

Output: Eigenvalue closest to $\sigma$: $\mu^{(k)}$, and eigenvector $\mathbf{x}^{(k)}$

Conceptual diagram of shifted inverse iteration
Figure 3: Concept of shifted inverse iteration. With eigenvalues laid out on the real line, the method converges to the eigenvalue nearest to the shift $\sigma$. The smaller $|\lambda_i - \sigma|$, the faster the convergence.

6.3 Convergence Rate

Convergence Rate of Shifted Inverse Iteration

Let $\lambda_j$ be the eigenvalue closest to $\sigma$, and $\lambda_{\text{next}}$ the second closest. The eigenvector convergence ratio is

$$r = \left|\dfrac{\lambda_j - \sigma}{\lambda_{\text{next}} - \sigma}\right|$$

The closer $\sigma$ is to $\lambda_j$, the smaller $r$ becomes, accelerating convergence.

Choosing the Shift

  • Use the Gershgorin circle theorem to estimate the approximate range of eigenvalues
  • Use approximate eigenvalues obtained during the QR algorithm
  • Use diagonal elements after tridiagonalization (Householder transformation)

In LAPACK, the standard strategy is to compute eigenvalues with the QR algorithm and then compute eigenvectors using shifted inverse iteration.

Shift Estimation via the Gershgorin Circle Theorem

For $A = \begin{pmatrix}2&1\\1&3\end{pmatrix}$, the Gershgorin discs are:

  • Row 1: center $2$, radius $|1| = 1$ → disc $[1, 3]$
  • Row 2: center $3$, radius $|1| = 1$ → disc $[2, 4]$

All eigenvalues lie in $[1, 4]$. To target the second eigenvalue, choose $\sigma = 1.3$ (near the left end); to target the first, choose $\sigma = 3.5$ (near the right end).

7. Rayleigh Quotient Iteration

7.1 Properties of the Rayleigh Quotient

Rayleigh Quotient

$$R(\mathbf{x}) = \dfrac{\mathbf{x}^T A \mathbf{x}}{\mathbf{x}^T \mathbf{x}}$$

When $\mathbf{x}$ coincides with an eigenvector $\mathbf{v}_j$, $R(\mathbf{v}_j) = \lambda_j$.

Second-Order Approximation Property of the Rayleigh Quotient

When $\mathbf{x} = \mathbf{v}_j + \boldsymbol{\varepsilon}$,

$$R(\mathbf{x}) = \lambda_j + O(\|\boldsymbol{\varepsilon}\|^2)$$

That is, the Rayleigh quotient has second-order accuracy with respect to the eigenvector error.

7.2 Algorithm

Update the shift $\sigma$ in shifted inverse iteration at every step using the Rayleigh quotient.

Rayleigh Quotient Iteration

  1. For $k = 1, 2, \ldots$:
    1. $\sigma^{(k)} = R(\mathbf{x}^{(k-1)}) = (\mathbf{x}^{(k-1)})^T A \mathbf{x}^{(k-1)}$ (update the shift)
    2. Solve $(A - \sigma^{(k)} I)\mathbf{y}^{(k)} = \mathbf{x}^{(k-1)}$ (since $\sigma^{(k)}$ changes each iteration, the LU factorization must be recomputed)
    3. $\mathbf{x}^{(k)} = \mathbf{y}^{(k)} / \|\mathbf{y}^{(k)}\|$
    4. If $|\sigma^{(k)} - \sigma^{(k-1)}| < \varepsilon$, terminate

Convergence Rate of Rayleigh Quotient Iteration

  • General matrices: quadratic convergence
  • Symmetric matrices: cubic convergence

For symmetric matrices, machine precision is typically reached in 3–4 iterations.

Advantages and Disadvantages

  • Advantage: Extremely fast convergence (cubic for symmetric matrices)
  • Disadvantage: Since the shift changes at each step, the LU decomposition must be recomputed every iteration ($O(N^3)$/iteration)
  • Disadvantage: Which eigenvalue it converges to depends on the initial vector and cannot be predicted in advance

In practice, it is effective to perform a few steps of shifted inverse iteration to obtain an approximate eigenvector, then switch to Rayleigh quotient iteration.

8. Numerical Examples

8.1 Power Method

Example 1: Power Method on a 2×2 Matrix

Consider the matrix

$$A = \begin{pmatrix} 2 & 1 \\ 1 & 3 \end{pmatrix}$$

Its eigenvalues are $\lambda_1 = \frac{5+\sqrt{5}}{2} \approx 3.618$ and $\lambda_2 = \frac{5-\sqrt{5}}{2} \approx 1.382$, giving a convergence ratio of $|\lambda_2/\lambda_1| \approx 0.382$.

Starting with $\mathbf{x}^{(0)} = (1/\sqrt{2},\ 1/\sqrt{2})^T$, the iteration produces:

$k$$\mathbf{x}^{(k)}$$\mu^{(k)}$$|\mu^{(k)} - \lambda_1|$
0$(0.7071,\ 0.7071)$$3.5000$$1.2 \times 10^{-1}$
1$(0.6000,\ 0.8000)$$3.6000$$1.8 \times 10^{-2}$
2$(0.5547,\ 0.8321)$$3.6154$$2.7 \times 10^{-3}$
3$(0.5369,\ 0.8436)$$3.6176$$3.8 \times 10^{-4}$
4$(0.5300,\ 0.8480)$$3.6180$$5.5 \times 10^{-5}$

The error in $\mu^{(k)}$ decreases by a factor of approximately $0.382^2 \approx 0.146$ per iteration, confirming the theoretical convergence rate.

Python implementation

import numpy as np

def power_method(A, max_iter=200, tol=1e-12):
    """Return the dominant eigenvalue and eigenvector of A.

    A: (N, N) ndarray, diagonalizable with |λ_1| > |λ_2|.
    Returns: (eigenvalue, eigenvector, iters)
    """
    n = A.shape[0]
    x = np.random.default_rng(0).standard_normal(n)
    x /= np.linalg.norm(x)
    mu_prev = np.inf
    for k in range(1, max_iter + 1):
        y = A @ x
        x = y / np.linalg.norm(y)
        mu = x @ A @ x                      # Rayleigh quotient
        if abs(mu - mu_prev) < tol * max(1.0, abs(mu)):
            return mu, x, k
        mu_prev = mu
    raise RuntimeError(f"did not converge in {max_iter} iterations")

A = np.array([[2.0, 1.0], [1.0, 3.0]])
mu, v, k = power_method(A)
print(mu, v, k)
# 3.6180339887498945 [0.5257311  0.85065081] ~60 iters

To extend this to shifted inverse iteration, compute lu = scipy.linalg.lu_factor(A - sigma*np.eye(n)) once before the loop and replace the matrix-vector product with y = scipy.linalg.lu_solve(lu, x).

8.2 Shifted Inverse Iteration

Example 2: Shifted Inverse Iteration

For the same matrix $A$, with shift $\sigma = 1.3$, we target the second eigenvalue $\lambda_2 \approx 1.382$.

Since $\sigma = 1.3$ is close to $\lambda_2$, the convergence ratio is

$$r = \left|\dfrac{\lambda_2 - \sigma}{\lambda_1 - \sigma}\right| = \left|\dfrac{1.382 - 1.3}{3.618 - 1.3}\right| = \dfrac{0.082}{2.318} \approx 0.035$$

which is very small, so convergence occurs in 2–3 iterations to high accuracy. Compared to the power method's $r = 0.382$, the effect of shift selection on convergence acceleration is dramatic.

8.3 Rayleigh Quotient Iteration

Example 3: Cubic Convergence of Rayleigh Quotient Iteration

Applying Rayleigh quotient iteration to the same matrix $A$ with initial vector $\mathbf{x}^{(0)} = (1/\sqrt{2},\ 1/\sqrt{2})^T$:

$k$$\sigma^{(k)}$$|\sigma^{(k)} - \lambda_1|$
0$3.5000$$1.2 \times 10^{-1}$
1$3.6176$$3.9 \times 10^{-4}$
2$3.618033988738$$1.2 \times 10^{-11}$
3$3.618033988749895$$\approx 10^{-16}$

With cubic convergence, the number of correct digits roughly triples each iteration. Indeed, the correct digits progress as $1 \to 3 \to 11 \to 16$ (machine precision), reaching full accuracy in just 3 iterations.

9. Convergence Visualization

9.1 Power Method Convergence Graph

Log plot of Rayleigh quotient error and eigenvector angle error versus iteration count for the power method Power Method Convergence (Rayleigh Quotient Error) Iteration k log₁₀|μ⁽ᵏ⁾ − λ₁| 0 −2 −4 −6 −8 −10 0 1 2 3 4 5 6 7 8 9 10 Computed Theoretical |λ₂/λ₁|²ᵏ
Figure 4: Convergence of the power method for $A = \begin{pmatrix}2&1\\1&3\end{pmatrix}$. The Rayleigh quotient error decreases by a factor of $|\lambda_2/\lambda_1|^2 \approx 0.146$ per iteration.

9.2 Comparison of Three Methods

Graph comparing eigenvalue error convergence rates of the power method, shifted inverse iteration, and Rayleigh quotient iteration Convergence Comparison of Three Methods Iteration k log₁₀|error| 0 −2 −4 −6 −8 −10 −12 −14 −16 0 1 2 3 4 5 6 7 Power Shifted Inv. Rayleigh Quotient Iter.
Figure 5: Convergence rate comparison of three methods ($\mathbf{x}_0 = (1,1)^T/\sqrt{2}$, shift $\sigma = 3.5$). The power method (blue) shows linear convergence, shifted inverse iteration (green) shows fast linear convergence, and Rayleigh quotient iteration (red) shows cubic convergence. All three start from the same initial vector; at k=1, shifted inverse iteration and Rayleigh quotient iteration reach the same point (since the initial shift is identical).

10. Implementation Notes

10.1 Choice of Initial Vector

Using a random vector as the initial value virtually guarantees $c_1 \neq 0$. In practice, a random initial vector is sufficient, but if prior information is available, it can be exploited to reduce the number of iterations.

10.2 Convergence Criteria

Either of the following can be used:

  • Eigenvalue change: $|\mu^{(k)} - \mu^{(k-1)}| < \varepsilon$
  • Residual: $\|A\mathbf{x}^{(k)} - \mu^{(k)}\mathbf{x}^{(k)}\| < \varepsilon$ (residual norm)

Residual-based criteria are more reliable. When the eigenvalue is very small, absolute error is insufficient, so implementations often use the relative error

$$\dfrac{|\mu^{(k)} - \mu^{(k-1)}|}{\max(1,\, |\mu^{(k)}|)} < \varepsilon$$

10.3 When the Power Method Fails

Non-convergent Cases

  • $|\lambda_1| = |\lambda_2|$: For example, $\lambda_1 = -\lambda_2$ (opposite signs) or $\lambda_1 = \overline{\lambda_2}$ (complex conjugates) cause the power method to oscillate without converging.
  • $A$ is not diagonalizable: When Jordan blocks are present, convergence is not guaranteed.

In these cases, use the QR algorithm (Chapter 18) or the Lanczos method.

Complex Eigenvalues of Real Matrices

When a real matrix $A$ has complex eigenvalues, they always appear as conjugate pairs $\lambda, \overline{\lambda}$ (a consequence of the polynomial-root property for real-coefficient characteristic polynomials). Since $|\lambda| = |\overline{\lambda}|$, a dominant complex eigenvalue automatically triggers the $|\lambda_1| = |\lambda_2|$ condition above, and the real-arithmetic power method fails to converge.

Hence the power method is unsuitable for directly computing complex eigenvalues of a real matrix. Alternatives include:

  • QR algorithm (Chapter 18): localizes complex conjugate pairs into $2 \times 2$ diagonal blocks and extracts complex eigenvalues using real arithmetic only
  • Power iteration extended to complex initial vectors and iterates (with shifting)
  • Aberth-Ehrlich method (Chapter 55): directly finds complex roots of the characteristic polynomial
  • Müller's method (Chapter 58): single-point iteration that handles complex roots

10.4 Computational Cost Comparison

Method Preprocessing Per Iteration Convergence Rate
Power Method None $O(N^2)$ Linear $|\lambda_2/\lambda_1|^k$
Inverse Iteration LU decomp. $O(N^3)$ $O(N^2)$ Linear $|\lambda_{N-1}/\lambda_N|^k$
Shifted Inverse Iter. LU decomp. $O(N^3)$ $O(N^2)$ Linear ($\sigma$-dependent)
Rayleigh Quotient Iter. None $O(N^3)$ Quadratic / Cubic (symmetric)

10.5 Relationship to Other Methods

  • QR Algorithm: Computes all eigenvalues simultaneously. Can be viewed as an implicit parallel execution of the power method.
  • Lanczos Method: Eigenvalue computation for large sparse matrices. An extension of the power method to Krylov subspaces.
  • LAPACK Strategy: Compute eigenvalues with the QR algorithm, then compute eigenvectors using shifted inverse iteration.

Implementation in LAPACK

LAPACK's symmetric / real-symmetric eigensolvers ?SYEV / ?SYEVR / ?SYEVD follow a three-stage pipeline:

  1. Householder tridiagonalization (?SYTRD, $\tfrac{4}{3}N^3$ flops)
  2. Eigenvalues of the tridiagonal matrix: QR algorithm (?STEQR), divide-and-conquer (?STEDC), or MRRR (?STEMR)
  3. Shifted inverse iteration for each eigenvector (?STEIN, $O(N^2)$ per vector)
! Fortran: all eigenvalues + eigenvectors
CALL DSYEV('V', 'U', N, A, LDA, W, WORK, LWORK, INFO)
! 'V' = also compute eigenvectors, 'U' = reference upper triangle

This yields all $N$ eigenpairs in $O(N^3)$ with provable stability. The shifted inverse iteration of this article is the core of stage 3.

11. Summary

  • The power method is the most fundamental technique for finding the eigenvalue of largest absolute value. Its convergence rate is $O(|\lambda_2/\lambda_1|^k)$, and convergence is slow when $|\lambda_2/\lambda_1|$ is close to 1.
  • Inverse iteration applies the power method to $A^{-1}$ to find the eigenvalue of smallest absolute value.
  • Shifted inverse iteration applies the power method to $(A - \sigma I)^{-1}$ to find the eigenvalue closest to $\sigma$. This is the standard LAPACK strategy for computing eigenvectors.
  • Rayleigh quotient iteration dynamically updates the shift using the Rayleigh quotient, achieving cubic convergence for symmetric matrices.

References

  • G. H. Golub, C. F. Van Loan, Matrix Computations, 4th ed., Johns Hopkins University Press, 2013.
  • L. N. Trefethen, D. Bau III, Numerical Linear Algebra, SIAM, 1997.
  • B. N. Parlett, The Symmetric Eigenvalue Problem, SIAM, 1998.
  • Y. Saad, Numerical Methods for Large Eigenvalue Problems, Revised ed., SIAM, 2011.
  • Wikipedia: Power iteration