When are classical iterations (Jacobi, Gauss-Seidel, SOR) used today?

Rarely as standalone solvers. They live on as smoothers in algebraic multigrid (AMG), as building blocks for preconditioners, or for diagonally dominant matrices where convergence is acceptable. Jacobi is highly parallel. Gauss-Seidel converges faster but is inherently sequential. SOR accelerates Gauss-Seidel via a relaxation parameter ω; for the discrete Laplacian the optimal ω is known analytically and approaches 2 - O(h).

Why is conjugate gradient (CG) restricted to symmetric positive definite matrices?

CG solves Ax = b by minimizing the quadratic form φ(x) = (1/2) x^T A x - b^T x. For φ to be a well-posed convex minimization, A must be symmetric positive definite. Without symmetry the notion of conjugate directions breaks down; without definiteness φ is unbounded below. For symmetric indefinite systems use MINRES (minimizes the residual using only symmetry); for nonsymmetric systems use GMRES, BiCGStab, IDR(s).

Why does GMRES need to be restarted?

After k iterations, GMRES stores the full k-dimensional Krylov basis (memory O(nk)) and performs MGS against all previous vectors (cost O(nk) per iteration). Both grow without bound, so practical use restarts GMRES every m iterations with the current iterate as a new initial guess. Restarting can cause stagnation, but with a good preconditioner m = 20-50 is typically sufficient for convergence.

ILU vs IC — when to use each?

Incomplete Cholesky (IC) is for symmetric positive definite matrices and preserves symmetry — used as a preconditioner for CG/PCG. Incomplete LU (ILU) is the asymmetric counterpart, used to precondition GMRES or BiCGStab. Both control quality versus sparsity through fill level p and drop threshold τ. Use IC when CG is applicable; otherwise ILU. sangi exposes both through its sparse iterative solver preconditioner options.

CSR vs CSC — what's the difference?

CSR (Compressed Sparse Row) lists (column index, value) pairs row-by-row and is optimal for row-major access and SpMV y = Ax. CSC (Compressed Sparse Column) lists them column-by-column and is preferred for transposed SpMV y = A^T x and column-wise operations like LU pivoting. Some solvers store the same matrix in both formats. CSR is sangi's default.

Iterative and Sparse Solvers — Jacobi, CG, GMRES, Preconditioners, Sparse Direct

Overview

Iterative methods construct a sequence of approximations $\mathbf{x}_k$ to the solution of $A \mathbf{x} = \mathbf{b}$ without explicitly factoring $A$. For large sparse $A$ at $n = 10^6$ to $10^9$, direct factorizations exceed available memory and iterative methods become the only practical option.

The major classes:

Classical iterations: Jacobi, Gauss-Seidel, SOR — local row updates; slow alone, still useful as smoothers and preconditioners.
Krylov subspace methods: CG, MINRES, GMRES, BiCGStab, IDR(s), QMR — use only matrix-vector products and find optimal iterates in $\mathcal{K}_k(A, \mathbf{r}_0)$.
Multigrid: hierarchical coarsening and smoothing; asymptotically optimal $O(n)$ for discretized PDEs.

This page covers each iterative class, preconditioners, sparse storage, sparse direct methods, and stopping/restart strategies. See iterative API, sparse direct, and sparse iterative in the API reference.

Classical Iterations: Jacobi, Gauss-Seidel, SOR

Write $A = D + L + U$ (diagonal, strictly lower, strictly upper) and update $\mathbf{x}_{k+1}$ from $\mathbf{x}_k$.

Jacobi

$$\mathbf{x}_{k+1} = D^{-1} (\mathbf{b} - (L + U) \mathbf{x}_k).$$

Each component depends only on others, making the update embarrassingly parallel. Convergence requires $\rho(D^{-1}(L+U)) < 1$, satisfied by diagonally dominant matrices. Rate is set by $\rho$; for the discrete Poisson equation $\rho = 1 - O(h^2)$, painfully slow.

Gauss-Seidel

$$(D + L) \mathbf{x}_{k+1} = \mathbf{b} - U \mathbf{x}_k.$$

Uses updated components within a sweep — faster than Jacobi but inherently sequential. Converges for diagonally dominant and SPD matrices; its spectral radius is roughly the square of Jacobi's.

SOR (Successive Over-Relaxation)

With relaxation parameter $\omega$,

$$(D + \omega L) \mathbf{x}_{k+1} = \omega \mathbf{b} + ((1 - \omega) D - \omega U) \mathbf{x}_k.$$

$\omega = 1$ recovers Gauss-Seidel; $\omega \in (1, 2)$ is over-relaxation. For the discrete Laplacian the optimal $\omega$ is

$$\omega_{\mathrm{opt}} = \frac{2}{1 + \sqrt{1 - \rho_J^2}},$$

improving convergence to the square root of the Jacobi rate. Outside structured PDEs the optimal $\omega$ is hard to estimate; Krylov methods supersede SOR as standalone solvers but it remains relevant as a smoother and preconditioner.

SPD Krylov: CG and PCG

The conjugate gradient method (CG) solves $A\mathbf{x} = \mathbf{b}$ for symmetric positive definite $A$ by minimizing $\varphi(\mathbf{x}) = \tfrac{1}{2} \mathbf{x}^\top A \mathbf{x} - \mathbf{b}^\top \mathbf{x}$.

The Method

At each step, $A$-conjugate search directions $\mathbf{p}_k$ are built from the residual $\mathbf{r}_k = \mathbf{b} - A \mathbf{x}_k$, and $\mathbf{x}_k$ moves by the step size $\alpha_k$ that minimizes $\varphi$. Crucially, $\mathbf{x}_k$ is the $A$-norm optimal approximation in $\mathcal{K}_k(A, \mathbf{r}_0)$, and in exact arithmetic CG terminates in at most $n$ steps. In practice rounding spoils finite termination but the goal is convergence in $k \ll n$ iterations.

Convergence

With condition number $\kappa = \kappa_2(A)$,

$$\frac{\|\mathbf{x}_k - \mathbf{x}^*\|_A}{\|\mathbf{x}_0 - \mathbf{x}^*\|_A} \leq 2 \left(\frac{\sqrt{\kappa} - 1}{\sqrt{\kappa} + 1}\right)^k.$$

Ill-conditioning makes convergence slow, so a preconditioner $M \approx A$ is almost always used. Preconditioned CG (PCG) operates on $\mathbf{z}_k = M^{-1} \mathbf{r}_k$ and reduces the effective condition number to $\kappa(M^{-1} A)$.

Where CG Wins

Elliptic PDEs (potential, heat, structural statics) discretized by FEM/FD, graph Laplacians, normal equations of least squares, and quadratic subproblems in machine learning optimizers. Memory is $O(n)$ (three vectors) and per-iteration cost is one SpMV plus a few inner products — $O(\mathrm{nnz})$.

Symmetric Indefinite: MINRES and SYMMLQ

For symmetric but indefinite $A$, CG's foundation (convex quadratic minimization) fails. Paige and Saunders' MINRES and SYMMLQ build on the symmetric Lanczos recurrence without requiring positive definiteness.

MINRES

Minimizes $\|\mathbf{r}_k\|_2$ in the Krylov subspace. It QR-factorizes the Lanczos tridiagonal as the iteration proceeds, keeping memory at $O(n)$, on par with CG. Standard for symmetric saddle-point KKT systems and the Stokes equations.

SYMMLQ

Minimizes $\|\mathbf{x}_k - \mathbf{x}^*\|_2$ rather than the residual, more stable on singular or nearly-singular $A$ but with a trickier stopping test.

Nonsymmetric: GMRES, BiCGStab, IDR(s), QMR

Nonsymmetric $A$ admits no symmetric three-term recurrence; the methods are more involved.

GMRES (Generalized Minimal Residual)

Arnoldi iteration builds an orthonormal basis $Q_k$ of $\mathcal{K}_k$ and an upper Hessenberg $H_k$; GMRES minimizes $\|\mathbf{b} - A \mathbf{x}_k\|_2$ over $\mathcal{K}_k$. Convergence is monotone, but memory grows linearly with $k$ ($O(nk)$) and so does the per-iteration MGS cost. Practical use is restarted GMRES, GMRES(m), with $m = 20$–$50$. Restarts erase convergence history and can stagnate, but with a decent preconditioner convergence is usually quick.

BiCGStab

Short-recurrence method (CG-like 3-term updates) for nonsymmetric $A$. Combines BiCG steps with a GMRES(1)-style smoothing to dampen Bi-Lanczos instability. Memory $O(n)$, two SpMVs per iteration. Less robust than GMRES; convergence can oscillate or stall.

IDR(s) (Induced Dimension Reduction)

Sonneveld and van Gijzen (2008). Uses $s$ shadow vectors to compress the Krylov subspace, achieving smoother convergence than BiCGStab with fewer SpMVs. Typical $s$ is 4-8. Memory-efficient compared to GMRES and more robust than BiCGStab. (Not yet implemented in sangi; planned for a future release.)

QMR (Quasi-Minimal Residual)

Bi-Lanczos based. Requires multiplications by $A^\top$ and quasi-minimizes the residual. Predominant before BiCGStab; TFQMR variants by Freund avoid the transpose. (Not yet implemented in sangi; planned for a future release.)

Preconditioners

A preconditioner $M$ approximates $A$ in the sense that $M^{-1} A \approx I$, transforming $A\mathbf{x} = \mathbf{b}$ into $M^{-1} A \mathbf{x} = M^{-1} \mathbf{b}$ and reducing the effective condition number. $M^{-1}$ must be cheap to apply each iteration ($O(\mathrm{nnz})$ ideally).

Jacobi (Diagonal)

$M = D = \mathrm{diag}(A)$. Simplest, helpful only for diagonally dominant problems, maximally parallel.

SSOR (Symmetric SOR)

Symmetric variant of SOR, usable for SPD systems: $M^{-1} = (D/\omega + U)^{-1} (D/\omega) (D/\omega + L)^{-1}$. Zero fill-in; very cheap setup.

Incomplete LU / Cholesky (ILU / IC)

Approximate factorizations $\tilde{L} \tilde{U} \approx A$ on the sparsity pattern of $A$. Fill level $p$ and drop threshold $\tau$ control the quality–sparsity trade-off. Use IC for SPD systems, ILU otherwise. Setup is $O(\mathrm{nnz})$, application is forward/backward substitution at $O(\mathrm{nnz})$. A Manteuffel shift $\alpha I$ often stabilizes the factorization. See the LU/Cholesky page for details.

AMG (Algebraic Multigrid)

Builds a hierarchy of coarse grids and smoothers from the algebraic structure of $A$ alone, with no geometric input. Gauss-Seidel or Jacobi smoothing combined with strong-connection coarsening gives asymptotically optimal $O(n)$ preconditioning for elliptic PDEs. Implementation complexity drives most users to specialized libraries (HYPRE, Trilinos). (Not yet implemented in sangi; planned for a future release.)

Sparse Storage Formats

Storing only the nonzeros slashes memory and per-iteration cost when nonzero density $\mathrm{nnz}/n^2$ is small.

CSR (Compressed Sparse Row)

values[k]: nonzero values in row-major order, length $\mathrm{nnz}$
col_idx[k]: column index of each nonzero, length $\mathrm{nnz}$
row_ptr[i]: index into values of the first nonzero of row $i$, length $n+1$

SpMV $\mathbf{y} = A \mathbf{x}$ is an inner product of values[row_ptr[i] : row_ptr[i+1]] with x[col_idx[...]] for each row. Default format in sangi.

CSC (Compressed Sparse Column)

Column version of CSR. Optimal for $A^\top \mathbf{x}$ and column-wise operations such as LU pivoting. Some solvers store both CSR and CSC of the same matrix.

COO (Coordinate)

List of $(i, j, v)$ triples. Convenient for input and matrix assembly, inefficient for SpMV; convert to CSR/CSC after construction.

BSR (Block Sparse Row)

Nonzero $b \times b$ blocks stored CSR-style. PDE discretizations on structured meshes often produce small dense blocks, and BSR lets SpMV use BLAS-3.

SpMV is Memory-Bound

SpMV bandwidth is typically limited by random access to $\mathbf{x}$ via col_idx. Reorderings such as Cuthill-McKee or RCM improve cache locality. Roofline-model optimization is common.

Sparse Direct Methods

When iterative methods converge poorly (strongly indefinite, ill-conditioned, or with many right-hand sides), sparse direct solvers remain competitive.

Symbolic and Numeric Phases

Symbolic factorization: determine the nonzero pattern of $L, U$, choose a row/column ordering to minimize fill-in (AMD, COLAMD, METIS), and build the elimination tree.
Numeric factorization: fill values into the predetermined pattern.

When only the right-hand side changes (e.g., time-stepping ODEs, Newton iterations), the symbolic phase is reused and only numeric work is repeated.

Simplicial vs Supernodal/Multifrontal

Simplicial methods update one element at a time (CHOLMOD's simplicial mode). Supernodal/multifrontal methods aggregate rows of similar pattern into supernodes and update with BLAS-3. The latter is faster but more complex. CHOLMOD, SuperLU, MUMPS, and PARDISO are typical implementations.

Ordering

Fill-in depends strongly on the elimination order. AMD (Approximate Minimum Degree), Nested Dissection (METIS), and COLAMD (for unsymmetric matrices) are the standard heuristics. Optimal ordering is NP-hard.

sangi's Sparse Direct

sangi provides simplicial Cholesky and sparse LU through the sparse direct API. These are well suited to medium-sized sparse matrices ($n \lesssim 10^5$); beyond that, iterative methods with preconditioning are the practical route.

Stopping Criteria, Stagnation, Restart

Stopping Criteria

Absolute residual: $\|\mathbf{r}_k\|_2 < \tau_a$
Relative residual (standard): $\|\mathbf{r}_k\|_2 / \|\mathbf{b}\|_2 < \tau_r$
Backward-error based: $\|\mathbf{r}_k\|_2 / (\|A\|_2 \|\mathbf{x}_k\|_2 + \|\mathbf{b}\|_2) < \tau$
Iteration cap: $k < k_{\max}$

Higham (2002) recommends the backward-error form to avoid over-iteration on ill-conditioned problems. The forward relative error is bounded by $\kappa(A) \cdot \|\mathbf{r}_k\| / \|\mathbf{b}\|$, so the relative residual alone does not bound the forward error.

Stagnation Detection

When $\|\mathbf{r}_k\|_2$ stops decreasing over several iterations, options include:

Strengthen the preconditioner (e.g., IC(0) $\to$ IC(1))
Restart: GMRES restart, or CG warm-restart from current iterate
Iterative refinement: fall back to a direct method, recompute residual, refine
Switch solver: the iterative method may not fit the problem

Restart Strategy

GMRES(m) restart trades memory for convergence quality. Too small $m$ causes stagnation; too large erases the advantage. Start with $m = 20$–$50$ and double when convergence is slow.

Advanced variants — adaptive restart, deflated GMRES that retains spectral information — exist for difficult problems.

sangi's Choices

Problem	Recommended solver	Preconditioner
SPD, dense	Cholesky direct	—
SPD, sparse, $n \lesssim 10^5$	Simplicial Cholesky	—
SPD, sparse, $n \gg 10^5$	CG / PCG	IC(0), Jacobi (AMG planned)
Symmetric indefinite, sparse	MINRES / SYMMLQ	SSOR, block-diagonal
Nonsymmetric, sparse, diagonally dominant	BiCGStab (IDR(s) planned)	ILU(0), Jacobi
Nonsymmetric, sparse, ill-conditioned	restarted GMRES(m)	ILU (AMG planned)
Nonsymmetric, $n \lesssim 10^5$	Sparse LU	—
Classical iteration as a preconditioner	Jacobi / Gauss-Seidel / SOR via iterative API	—

References

Saad, Y. (2003). Iterative Methods for Sparse Linear Systems, 2nd ed., SIAM.
Hestenes, M. R. and Stiefel, E. (1952). "Methods of conjugate gradients for solving linear systems". J. Res. Nat. Bur. Standards, 49, 409–436.
Saad, Y. and Schultz, M. H. (1986). "GMRES: A generalized minimal residual algorithm for solving nonsymmetric linear systems". SIAM J. Sci. Stat. Comput., 7(3), 856–869.
van der Vorst, H. A. (1992). "BiCGSTAB: A fast and smoothly converging variant of Bi-CG". SIAM J. Sci. Stat. Comput., 13(2), 631–644.
Sonneveld, P. and van Gijzen, M. B. (2008). "IDR(s): A family of simple and fast algorithms for solving large nonsymmetric systems of linear equations". SIAM J. Sci. Comput., 31(2), 1035–1062.
Davis, T. A. (2006). Direct Methods for Sparse Linear Systems, SIAM.
Briggs, W. L., Henson, V. E., and McCormick, S. F. (2000). A Multigrid Tutorial, 2nd ed., SIAM.