LinAlg — Linear Algebra

Overview

Provides matrix decompositions, eigenvalue problems, and linear system solvers.

  • Decompositions — LU, Cholesky, QR, SVD, LDL
  • Eigenvalues — symmetric / general / generalized / Hermitian, power iteration
  • Solvers — direct (LU/Cholesky/QR/SVD/LDL), iterative (CG, Jacobi, Gauss-Seidel, SOR)
  • Least squares — ordinary / weighted / regularized (Ridge)
  • Special matrices — tridiagonal, banded, sparse (CSR)
  • Vector space — Gram-Schmidt, basis completion, linear regression

All functions reside in the sangi::algorithms namespace (vector space utilities are in sangi).

Complex module: The Complex<T> class is documented in detail on the Complex API page. In linear algebra, it can be passed to functions as Matrix<Complex<double>>. std::complex<double> is also supported as Matrix<std::complex<double>>.

Expression templates: Element-wise matrix operations (addition, subtraction, scalar multiplication/division) are lazily evaluated using expression templates. No intermediate buffers are created when chaining multiple operations; evaluation occurs only at the point of assignment. However, matrix multiplication (operator*) is not lazily evaluated (it computes immediately).

Why operator* is eager: Element-wise operations have the form $C(i,j) = f(A(i,j), B(i,j))$, where each output element depends only on the corresponding inputs — a perfect fit for expression templates. Matrix multiplication is fundamentally different: $C(i,j) = \sum_k A(i,k)\,B(k,j)$, so computing one output element requires reading an entire row of $A$ and an entire column of $B$. Lazy element-wise evaluation here breaks down for three reasons. (1) In chained products such as A*B*D, omitting the intermediate A*B turns $O(n^3)$ into $O(n^4)$ because (A*B)(i,k) would be recomputed for every column access of $D$. (2) Cache blocking, packing, and other BLAS Level 3 optimisations cannot be expressed at single-element granularity. (3) High-performance kernels like dgemm in OpenBLAS or MKL require contiguous memory with an explicit leading dimension and cannot operate on lazy expression nodes. sangi therefore evaluates operator* eagerly and dispatches to its SIMD / BLAS backend.

Comparison with Eigen: Eigen behaves the same way. auto C = A * B; returns an Eigen::Product<A, B> expression node, but the actual computation runs in one shot via a GEMM-equivalent routine when the result is materialised (Matrix C = A * B;). No per-element lazy access. Eigen exposes .lazyProduct() for explicit small-matrix lazy products and .noalias() to skip the alias-safe temporary; sangi does not currently expose these fine-grained controls.

Dimension checking: Dimension mismatches in dynamic Matrix<T> are reported at runtime via DimensionError exceptions. In Debug builds (NDEBUG not defined), assert fires in addition to the exception, allowing the debugger to stop immediately at the point of failure.

Compile-time checking: StaticMatrix<T, R, C> / StaticVector<T, N> carry R / C / N as compile-time constants. The base classes BaseMatrix<T> / BaseVector<T> expose std::ptrdiff_t traits static_rows / static_cols / static_size (returning -1 for dynamic types and the matching template argument for static derived types). The supported LinAlg functions consult these traits and use static_assert to catch dimension mismatches at build time whenever both arguments carry static sizes. For example, lu_solve(StaticMatrix<double, 3, 3>{}, StaticVector<double, 5>{}) is rejected by the compiler before the program ever runs. Passing dynamic Matrix / Vector still goes through the runtime DimensionError path (the trait is -1 so static_assert is skipped) — existing code keeps its current behaviour.

Functions with the check applied (50+):

  • Linear solvers: lu_solve, cholesky_solve, qr_solve, qr_col_pivoting_solve, householder_solve, ldl_solve, gaussian_elimination, col_piv_qr_solve, full_piv_lu_solve, cod_solve, svd_solve, band_solve
  • Determinant / rank / predicate: lu_inverse, lu_determinant, bareiss_determinant, is_positive_definite, rank
  • Iterative solvers: conjugate_gradient, jacobi_method, gauss_seidel_method, sor_method
  • Least squares: least_squares, weighted_least_squares, regularized_least_squares
  • Eigenvalue routines: eigen, eigen_symmetric, eigen_generalized, eigen_generalized_symmetric, power_method, inverse_power_method, rayleigh_quotient_iteration, gershgorin_discs, gershgorin_bounds
  • Decompositions: lu_decomposition, cholesky_decomposition, ldl_decomposition, bidiagonalize, tridiagonalize, tridiagonalize_compact, hessenberg_decomposition, schur_decomposition, polar_decomposition, smith_normal_form, jordan_normal_form, qz_decomposition, iwasawa_decomposition, col_piv_qr, full_piv_lu, complete_orthogonal_decomposition, real_qz

Not in scope:

  • BLAS Level 1/2 (dot / nrm2 / asum / iamax / axpy / copy / gemv / ger / syr): the mutable output arguments (Vector<T>& / BaseMatrix<T>&) would force a large-scale signature refactor that ripples through every caller. At the same time these are simple loops / SIMD kernels where the runtime check is sufficient, and StaticVector arguments are rare in this layer — the ROI is too low to justify the work
  • tridiagonal_solve: takes four BaseVector arguments (main diagonal, two off-diagonals, RHS) whose interdependent sizes (a.size = n-1, b.size = n, c.size = n-1, d.size = n) cannot be expressed with the current helper macros. Deferred for a dedicated macro design
  • eigen_hermitian: takes BaseMatrix<Complex<T>>, so element_t<MA> resolves to Complex<T>; extracting the real type T needed for the internal Vector<T> is non-trivial. Deferred

StaticMatrix / StaticVector support: Matrix<T> and StaticMatrix<T, Rows, Cols> share the common base BaseMatrix<T>; Vector<T> and StaticVector<T, N> share the common base BaseVector<T>. All public LinAlg functions take const BaseMatrix<T>& or const BaseVector<T>&, so Matrix / StaticMatrix / Vector / StaticVector can all be passed directly without conversion (implicit upcast, no temporary needed). BaseMatrix<T> and BaseVector<T> have no virtual functions; element access a(i, j) / v[i] is inlinable and does not block SIMD (see BaseMatrix.hpp / BaseVector.hpp). Return values remain Matrix<T> / Vector<T> since their dimensions are determined at runtime.

Copy count at the call site (Eigen-parity minimum): Public functions take const BaseMatrix<T>& / const BaseVector<T>&, so argument binding itself is always 0-copy (passing Matrix or StaticMatrix incurs only an implicit upcast). Only the eigen family makes a single mandatory 1-copy at the body entry (the input is destructively reduced to Hessenberg / Jacobi form) — this is the minimum copy required by the numerical algorithm itself (Eigen's EigenSolver incurs the same 1-copy). All other routines that don't mutate the input are fully 0-copy:

  • Predicates / rank / condition: is_positive_definite, bareiss_determinant, rank, conditionNumber, condition_number_1norm, condition_number_inf, rcond_1norm
  • Decompositions: hessenberg_decomposition, schur_decomposition, polar_decomposition, bidiagonalize, tridiagonalize, tridiagonalize_compact, smith_normal_form, jordan_normal_form, qz_decomposition, iwasawa_decomposition, col_piv_qr, full_piv_lu, complete_orthogonal_decomposition, real_qz
  • Solve family: lu_solve, gaussian_elimination, qr_solve, qr_col_pivoting_solve, ldl_solve, col_piv_qr_solve, full_piv_lu_solve, cod_solve, svd_solve, householder_solve, lu_inverse, lu_determinant (only cholesky_solve keeps a 1-copy on the b side for the SIMD load path)
  • Iterative solvers: conjugate_gradient, jacobi_method, gauss_seidel_method, sor_method, iterativeRefinement, sparse_cg, sparse_bicgstab
  • Eigen routines (non-destructive): power_method, inverse_power_method, rayleigh_quotient_iteration, gershgorin_discs, gershgorin_bounds
  • Least squares / tri- / penta- / banded solvers: least_squares, weighted_least_squares, regularized_least_squares, tridiagonal_solve, pentadiagonal_solve, band_solve
  • Level 1/2 BLAS: dot, nrm2, asum, iamax, axpy, copy, gemv, ger, syr
  • Other: HNF, randomizedSVD

The BLAS SIMD path uses BaseVector::is_contiguous() to invoke pointer-based SIMD only when stride == 1, and falls back to scalar correctly when stride != 1 (e.g. a matrix column view).

Functions that genuinely need matrix arithmetic (A*A etc.) inside the body and therefore keep a body 1-copy identical to Eigen's: expm (needs for Padé approximation), solve_sylvester (only the C argument, for U^T C V), and the internal helper mat_power (result * M). These copies cannot be removed at the algorithm level. sangi's LinAlg API copy overhead is now at the Eigen-parity minimum.

Pivot Strategies

PivotStrategy is an enumeration used by both LU decomposition and Gaussian elimination. Since it appears as an argument to many decomposition APIs below, we document it once here, up front.

enum class PivotStrategy {
    None,       // No pivoting (use diagonal element as-is)
    Partial,    // Partial pivoting (max absolute value in column; min height for Rational)
    Full        // Full pivoting (search entire remaining submatrix)
};
StrategySearch RangeFactorizationStabilityCostNotes
NoneNone (uses diagonal element)$A = LU$LowMinimalOnly when the matrix structure is known and diagonal elements are sufficiently large
PartialColumn-wise (rows $k \ldots n{-}1$)$PA = LU$Good$O(n)$/stepSufficient for most practical cases. Default
FullEntire submatrix$PAQ = LU$Best$O(n^2)$/stepFor ill-conditioned matrices. Adds column permutation inversion

For Rational type: The selection criterion for Partial and Full differs from floating-point types. Instead of maximum absolute value, the pivot with minimum height is selected. The height is defined as $\max(|\text{numerator}|, |\text{denominator}|)$; a smaller height suppresses numerator/denominator growth in subsequent operations.

How Pivot Strategy Affects Results

For ill-conditioned matrices, omitting pivoting can severely degrade numerical accuracy. The following example shows a case where the (1,1) entry is extremely small.

using namespace sangi::algorithms;

// Ill-conditioned: (1,1) entry is extremely small
Matrix<float> A({{1e-8f, 1.0f}, {1.0f, 1.0f}});
Vector<float> b({1.0f, 2.0f});
// True solution: x = {1.00000001..., 0.99999999...} ≈ {1.0, 1.0}

// No pivoting → divides by 1e-8, rounding error explodes
auto x_none = gaussian_elimination(A, b, PivotStrategy::None);
// x_none = {0.000000, 1.000000}  ← x_none[0] is completely wrong

// Partial pivoting → swaps rows to process the larger entry first
auto x_partial = gaussian_elimination(A, b, PivotStrategy::Partial);
// x_partial = {1.000000, 1.000000}  ← accurate

Without pivoting, dividing row 1 by $10^{-8}$ introduces a scaling factor of $10^8$. Since float has only 23 mantissa bits (~7 digits), $1.0$ is rounded away when subtracted from $10^8$ during elimination of row 2. Partial pivoting swaps the rows so that $|a_{11}| = 1$, avoiding this catastrophic cancellation.

Further Reading

LU Decomposition

LU decomposition. The PivotStrategy argument selects from None / Partial (default) / Full (see the Pivot Strategies section for details).

  • Complexity: $O(n^3)$
  • Use cases: General-purpose computation of linear systems, determinants, and inverses
  • Constraint: Square matrix

lu_decomposition

template<typename T>
std::pair<Matrix<T>, std::vector<size_type>>
    lu_decomposition(const Matrix<T>& a,
                     PivotStrategy pivot = PivotStrategy::Partial)
ParameterTypeDescription
aconst Matrix<T>&Square matrix to decompose
pivotPivotStrategyPivot strategy (default: Partial)

Return value: std::pair — the first element is the combined LU matrix (below the diagonal is L, the diagonal and above is U; L's diagonal is implicitly 1), the second element is the pivot array perm. perm follows the LAPACK ipiv convention as a swap-trace: perm[k] records that row k and row perm[k] were swapped at step k (it is not a direct permutation of the original row order). Under full pivoting, the first n entries record row swaps and the next n record column swaps (2n entries in total).

Exception: DimensionError — if the matrix is not square.

using namespace sangi::algorithms;

Matrix<double> A = {{2, 1, -1}, {-3, -1, 2}, {-2, 1, 2}};
auto [LU, perm] = lu_decomposition(A);
// LU = {{-3,                  -1,                  2                 },
//       { 0.6666666666666667,  1.6666666666666667,  0.6666666666666667},
//       {-0.6666666666666667,  0.2,                 0.2               }}
//   diagonal + upper = U; below diagonal = L (L's diagonal is implicit 1)
// perm = {1, 2, 2}
//   records the swap at step k: row k ↔ row perm[k] (LAPACK ipiv convention)
//   for this A: step 0: row 0 ↔ row 1, step 1: row 1 ↔ row 2, step 2: self

lu_solve

template<typename T>
Vector<T> lu_solve(const Matrix<T>& a, const Vector<T>& b,
                   PivotStrategy pivot = PivotStrategy::Partial)
ParameterTypeDescription
aconst Matrix<T>&Coefficient matrix (square)
bconst Vector<T>&Right-hand side vector
pivotPivotStrategyPivot strategy (default: Partial)

Return value: Solution vector $\mathbf{x}$ satisfying $A\mathbf{x} = \mathbf{b}$.

Exception: DimensionError — dimension mismatch.

Matrix<double> A = {{2, 1, -1}, {-3, -1, 2}, {-2, 1, 2}};
Vector<double> b = {8, -11, -3};
auto x = lu_solve(A, b);
// x = {2, 3, -1}

lu_inverse

template<typename T>
Matrix<T> lu_inverse(const Matrix<T>& a)
ParameterTypeDescription
aconst Matrix<T>&Square matrix to invert

Return value: Inverse matrix $A^{-1}$.

Matrix<double> A = {{2, 1}, {5, 3}};
auto Ainv = lu_inverse(A);
// Ainv = {{3, -1}, {-5, 2}}
// A * Ainv = I

lu_determinant

template<typename T>
T lu_determinant(const Matrix<T>& a)
ParameterTypeDescription
aconst Matrix<T>&Square matrix

Return value: Determinant $\det(A)$. Computed as the product of diagonal elements of the LU decomposition, multiplied by the sign of the pivot permutation.

Matrix<double> A = {{2, 1}, {5, 3}};
auto det = lu_determinant(A);
// det = 1.0  (2*3 - 1*5)

bareiss_determinant — Bareiss Algorithm

Computes the determinant of an integer matrix without division. By exploiting Sylvester's identity, the divisions arising during elimination are always exact, so all intermediate results remain integers while computing the determinant in $O(n^3)$.

  • Complexity: $O(n^3)$
  • Algorithm: At each step, computes $M_{ij}^{(k)} = \frac{M_{ij}^{(k-1)} \cdot M_{kk}^{(k-1)} - M_{ik}^{(k-1)} \cdot M_{kj}^{(k-1)}}{M_{k-1,k-1}^{(k-2)}}$. This division is always exact by Sylvester's identity
  • Use cases: Integer types such as Matrix<Int> or Matrix<int> where division is undefined or should be avoided
  • Auto-selection: The Matrix::determinant() member function automatically uses Bareiss for integer types (numeric_traits<T>::is_integer)
template<typename T>
T bareiss_determinant(const Matrix<T>& a)
ParameterTypeDescription
aconst Matrix<T>&Square matrix (integer type recommended)

Return value: Determinant $\det(A)$. All intermediate results are kept as integers.

// Determinant of a multi-precision integer matrix
Matrix<Int> A({{Int(100), Int(200), Int(300)},
               {Int(400), Int(500), Int(600)},
               {Int(700), Int(800), Int(1000)}});
auto det = bareiss_determinant(A);  // -3000000 (exact)

// Also works with int
Matrix<int> B({{0, 1, 2}, {1, 0, 3}, {4, 5, 6}});
auto det2 = bareiss_determinant(B);  // 16

// The determinant() member function automatically selects Bareiss for integer types
auto det3 = A.determinant();  // same result as bareiss_determinant

Cholesky Decomposition

Symmetric positive definite matrix $A = LL^\top$. About twice as fast as LU decomposition.

  • Complexity: $O(n^3/3)$ (half of LU)
  • Use cases: Least squares, Kalman filters, Gaussian processes, normal equations
  • Constraint: Symmetric positive definite matrices only

cholesky_decomposition

template<typename T>
Matrix<T> cholesky_decomposition(const Matrix<T>& a)
ParameterTypeDescription
aconst Matrix<T>&Symmetric positive definite matrix

Return value: Lower triangular matrix $L$ satisfying $A = LL^\top$.

Exception: Decomposition fails if the matrix is not positive definite.

Matrix<double> A = {{4, 2}, {2, 3}};
auto L = cholesky_decomposition(A);
// L = {{2, 0}, {1, sqrt(2)}}
// L * L^T = A

cholesky_solve

template<typename T>
Vector<T> cholesky_solve(const Matrix<T>& a, const Vector<T>& b)
ParameterTypeDescription
aconst Matrix<T>&Symmetric positive definite coefficient matrix
bconst Vector<T>&Right-hand side vector

Return value: Solution vector $\mathbf{x}$. Internally performs Cholesky decomposition followed by forward and back substitution.

Matrix<double> A = {{4, 2}, {2, 3}};
Vector<double> b = {1, 2};
auto x = cholesky_solve(A, b);
// x ≈ {-0.125, 0.75}

QR Decomposition

QR decomposition using Householder reflections. $A = QR$.

  • Complexity: $O(mn^2)$ (for an $m \times n$ matrix)
  • Algorithm: Householder reflections (more efficient than Givens rotations)
  • Use cases: Least squares problems, preprocessing for eigenvalue computation, numerically stable linear systems
  • Properties: More numerically stable than LU. Also handles rectangular matrices

qr_decomposition

template<typename T>
std::pair<Matrix<T>, Matrix<T>> qr_decomposition(const Matrix<T>& a)
ParameterTypeDescription
aconst Matrix<T>&$m \times n$ matrix

Return value: std::pair — the first element is the orthogonal matrix $Q$ ($m \times m$), the second element is the upper triangular matrix $R$ ($m \times n$).

Matrix<double> A = {{1, 1}, {1, 2}, {1, 3}};
auto [Q, R] = qr_decomposition(A);
// Q: 3x3 orthogonal matrix (Q^T Q = I)
// R: 3x2 upper triangular matrix
// Q * R = A

qr_solve

template<typename T>
Vector<T> qr_solve(const Matrix<T>& a, const Vector<T>& b)
ParameterTypeDescription
aconst Matrix<T>&Coefficient matrix (square or overdetermined)
bconst Vector<T>&Right-hand side vector

Return value: Solution vector $\mathbf{x}$. For overdetermined systems ($m > n$), returns the least squares solution $\min\|A\mathbf{x} - \mathbf{b}\|_2$.

// Overdetermined system (3x2): least squares solution
Matrix<double> A = {{1, 1}, {1, 2}, {1, 3}};
Vector<double> b = {1, 2, 2};
auto x = qr_solve(A, b);
// x ≈ {0.667, 0.5}

SVD (Singular Value Decomposition)

Singular value decomposition using the Golub-Kahan algorithm. $A = U\Sigma V^\top$.

  • Complexity: $O(mn \cdot \min(m,n))$
  • Algorithm: Householder bidiagonalization followed by implicit-shift QR iteration (ref: Golub & Van Loan "Matrix Computations" §8.6)
  • Use cases: Rank determination, condition number estimation, least squares, low-rank approximation, principal component analysis
  • Properties: The most stable decomposition. Handles ill-conditioned matrices

svd_decomposition

template<typename T>
std::tuple<Matrix<T>, Vector<T>, Matrix<T>>
    svd_decomposition(const Matrix<T>& A, bool transV = true)
ParameterTypeDefaultDescription
Aconst Matrix<T>&$m \times n$ matrix
transVbooltruetrue: returns $V^\top$, false: returns $V$

Return value: std::tuple<U, σ, V> — $U$ is the left singular vector matrix ($m \times k$), $\sigma$ is the singular value vector ($k$ elements, in descending order), $V^\top$ is the right singular vector matrix ($k \times n$). $k = \min(m, n)$.

Matrix<double> A = {{1, 2}, {3, 4}, {5, 6}};
auto [U, sigma, Vt] = svd_decomposition(A);

// Reconstruction: A = U * diag(sigma) * Vt
auto A_rec = reconstruct_matrix_from_svd(U, sigma, Vt);

svd_solve

template<typename T>
Vector<T> svd_solve(const Matrix<T>& a, const Vector<T>& b,
    T rcond = std::numeric_limits<T>::epsilon() * T(100))
ParameterTypeDefaultDescription
aconst Matrix<T>&Coefficient matrix
bconst Vector<T>&Right-hand side vector
rcondT$\varepsilon \times 100$Reciprocal condition number threshold. Singular values smaller than this are damped

Return value: Solution vector $\mathbf{x}$. Returns a numerically stable solution via damping, even for ill-conditioned matrices.

When to use: If the condition number is good, lu_solve is the fastest. For ill-conditioned or near-singular matrices, use svd_solve. Increasing rcond strengthens the damping and improves stability, at the cost of accuracy.

Matrix<double> A = {{1, 2}, {3, 4}, {5, 6}};
Vector<double> b = {1, 2, 3};
auto x = svd_solve(A, b);
// x ≈ {0, 0.5}

reconstruct_matrix_from_svd

template<typename T>
Matrix<T> reconstruct_matrix_from_svd(
    const Matrix<T>& U, const Vector<T>& s, const Matrix<T>& Vt)
ParameterTypeDescription
Uconst Matrix<T>&Left singular vector matrix ($m \times k$)
sconst Vector<T>&Singular value vector ($k$ elements)
Vtconst Matrix<T>&Right singular vector matrix ($k \times n$)

Return value: Reconstructed matrix $A = U \cdot \mathrm{diag}(s) \cdot V^\top$ ($m \times n$).

Matrix<double> A = {{3, 0}, {0, 5}};
auto [U, s, Vt] = svd_decomposition(A);
auto A2 = reconstruct_matrix_from_svd(U, s, Vt);
// A2 ≈ A (reconstruction error ≈ 0)

svd_determinant

template<typename T>
T svd_determinant(const Matrix<T>& a)

Computes the determinant using SVD. Returns the product of singular values $\prod \sigma_i$ multiplied by the sign. More numerically stable than lu_determinant for ill-conditioned matrices.

Matrix<double> A = {{1, 2}, {3, 4}};
auto det = svd_determinant(A);
// det ≈ -2.0

LDL Decomposition

Symmetric matrix $A = LDL^\top$. Also handles indefinite matrices.

  • Complexity: $O(n^3/3)$
  • Use cases: Symmetric but not necessarily positive definite matrices, such as Hessians in optimization problems
  • Properties: Generalization of Cholesky (which requires positive definiteness)

ldl_decomposition

template<typename T>
std::pair<Matrix<T>, Vector<T>> ldl_decomposition(const Matrix<T>& a)
ParameterTypeDescription
aconst Matrix<T>&Symmetric matrix

Return value: std::pair — the first element is the lower triangular matrix $L$ (with unit diagonal), the second element is the diagonal element vector $D$.

Matrix<double> A = {{4, 2, -1}, {2, 5, 3}, {-1, 3, 6}};
auto [L, D] = ldl_decomposition(A);
// A = L * diag(D) * L^T

ldl_solve

template<typename T>
Vector<T> ldl_solve(const Matrix<T>& a, const Vector<T>& b)
ParameterTypeDescription
aconst Matrix<T>&Symmetric coefficient matrix
bconst Vector<T>&Right-hand side vector

Return value: Solution vector $\mathbf{x}$.

Matrix<double> A = {{4, 2}, {2, 3}};
Vector<double> b = {1, 2};
auto x = ldl_solve(A, b);
// x ≈ {-0.125, 0.75}

Choosing a Decomposition

DecompositionMatrix RequirementsComplexityStabilityRecommended Use
LUSquare$O(n^3)$GoodGeneral-purpose solver, determinant, inverse
CholeskySymmetric positive definite$O(n^3/3)$GoodFast solver for positive definite systems
QRAny ($m \ge n$)$O(mn^2)$Very goodLeast squares, when numerical stability matters
SVDAny$O(mn \cdot \min)$BestIll-conditioned matrices, rank determination, low-rank approximation
LDLSymmetric (indefinite OK)$O(n^3/3)$GoodSymmetric indefinite systems

Gaussian Elimination

Performs forward elimination and back substitution on the augmented matrix $[A|\mathbf{b}]$ to directly solve the linear system $A\mathbf{x} = \mathbf{b}$. Mathematically equivalent to LU decomposition, but operates directly on the augmented matrix, making it more concise when the decomposition will not be reused.

gaussian_elimination

template<typename T>
Vector<T> gaussian_elimination(const Matrix<T>& a, const Vector<T>& b,
                               PivotStrategy pivot = PivotStrategy::Partial)
ParameterTypeDefaultDescription
aconst Matrix<T>&Coefficient matrix (square)
bconst Vector<T>&Right-hand side vector
pivotPivotStrategyPartialPivot selection strategy

Return value: Solution vector $\mathbf{x}$.

Exceptions:

  • DimensionError — matrix is not square, or dimension mismatch
  • MathError — singular matrix detected

Special behavior for Rational type: When T = Rational, pivot selection chooses the element with the minimum height (the one whose numerator and denominator are smallest in absolute value) rather than the element with maximum absolute value. This suppresses numerator/denominator growth in Rational arithmetic and maintains computational efficiency.

using namespace sangi::algorithms;

// double with partial pivoting (default)
Matrix<double> A = {{2, 1, -1}, {-3, -1, 2}, {-2, 1, 2}};
Vector<double> b = {8, -11, -3};
auto x = gaussian_elimination(A, b);
// x = {2, 3, -1}
using namespace sangi::algorithms;

// Rational for exact solution
Matrix<Rational> A = {
    {Rational(2), Rational(1), Rational(-1)},
    {Rational(-3), Rational(-1), Rational(2)},
    {Rational(-2), Rational(1), Rational(2)}
};
Vector<Rational> b = {Rational(8), Rational(-11), Rational(-3)};
auto x = gaussian_elimination(A, b);
// x = {2, 3, -1} (exact rational solution)

Eigenvalues

eigen — General Eigenvalue Decomposition

Computes all eigenvalues and eigenvectors of a general (including non-symmetric) matrix.

  • Algorithm: Automatically detects symmetry; if symmetric, delegates to eigen_symmetric. For non-symmetric matrices, uses Hessenberg reduction + Givens rotation-based QR iteration. Uses LAPACK routines when MKL is available
  • Complexity: $O(n^3)$
template<typename T>
std::pair<Vector<std::complex<T>>, Matrix<std::complex<T>>>
    eigen(const Matrix<T>& a)
ParameterTypeDescription
aconst Matrix<T>&Square matrix

Return value: std::pair — the first element is the eigenvalue vector (Vector<complex<T>>), the second element is the eigenvector matrix (Matrix<complex<T>>, where each column is an eigenvector).

Matrix<double> A = {{0, -1}, {1, 0}};  // rotation matrix
auto [eigenvalues, eigenvectors] = eigen(A);
// eigenvalues = {i, -i} (purely imaginary)

eigen_symmetric — Symmetric Matrix Eigenvalues

Computes all eigenvalues and eigenvectors of a real symmetric matrix. Eigenvalues are real.

  • Algorithm: Jacobi method (diagonalization by rotations). Uses LAPACK when MKL is available
  • Complexity: $O(n^3)$
  • Properties: All eigenvalues are guaranteed real. Eigenvectors are orthogonal
template<typename T>
std::pair<Vector<T>, Matrix<T>>
    eigen_symmetric(const Matrix<T>& a)
Matrix<double> A = {{2, 1}, {1, 2}};
auto [vals, vecs] = eigen_symmetric(A);
// vals = {1, 3}
// each column of vecs is the corresponding eigenvector

eigen_hermitian — Hermitian Matrix Eigenvalues

Computes all eigenvalues and eigenvectors of a Hermitian matrix ($A^H = A$). Eigenvalues are real.

  • Algorithm: Uses LAPACKE_zheev / LAPACKE_cheev when MKL is available. Without MKL, the $n \times n$ Hermitian matrix is converted to a $2n \times 2n$ real symmetric matrix and solved via eigen_symmetric
  • Works without MKL: A fallback implementation is available in all environments
template<typename T>
std::pair<Vector<T>, Matrix<std::complex<T>>>
    eigen_hermitian(const Matrix<std::complex<T>>& a)
// Eigenvalues of a Hermitian matrix are real
Matrix<std::complex<double>> H = {{{2,0}, {1,-1}}, {{1,1}, {3,0}}};
auto [vals, vecs] = eigen_hermitian(H);
// vals ≈ {1.268, 3.732} (real)

eigen_generalized — Generalized Eigenvalue Problem

Solves $A\mathbf{x} = \lambda B\mathbf{x}$. $A$ and $B$ are general square matrices.

  • Use cases: Vibration analysis, buckling problems, generalized PCA
template<typename T>
std::pair<Vector<std::complex<T>>, Matrix<std::complex<T>>>
    eigen_generalized(const Matrix<T>& a, const Matrix<T>& b)
Matrix<double> A = {{1, 2}, {3, 4}};
Matrix<double> B = {{5, 6}, {7, 8}};
auto [vals, vecs] = eigen_generalized(A, B);
// A * vecs = B * vecs * diag(vals)

eigen_generalized_symmetric — Symmetric Generalized Eigenvalue Problem

Solves $A\mathbf{x} = \lambda B\mathbf{x}$. $A$ is symmetric and $B$ is symmetric positive definite.

  • Algorithm: Decomposes $B = LL^\top$ via Cholesky, then solves the symmetric eigenvalue problem for $C = L^{-1}AL^{-\top}$
template<typename T>
std::pair<Vector<T>, Matrix<T>>
    eigen_generalized_symmetric(const Matrix<T>& a, const Matrix<T>& b)
Matrix<double> A = {{4, 1}, {1, 3}};
Matrix<double> B = {{2, 0}, {0, 1}};
auto [vals, vecs] = eigen_generalized_symmetric(A, B);
// A * v = λ * B * v

eigen_complex — Complex Matrix Eigenvalues

template<typename T>
std::pair<Vector<Complex<T>>, Matrix<Complex<T>>>
    eigen_complex(const Matrix<T>& a)

Returns the eigenvalues and eigenvectors of a general real matrix as complex numbers. Equivalent to eigen, but returns eigenvectors as Complex<T> type.

Matrix<double> A = {{0, -1}, {1, 0}};  // 90° rotation
auto [vals, vecs] = eigen_complex(A);
// vals = {i, -i}

Choosing an Eigenvalue Solver

FunctionMatrix RequirementsEigenvalue TypeEigenvector Type
eigenGeneral squarecomplex<T>complex<T>
eigen_symmetricReal symmetricT (real)T (real, orthogonal)
eigen_hermitianHermitianT (real)complex<T>
eigen_generalizedGeneral paircomplex<T>complex<T>
eigen_generalized_symmetricSymmetric + SPD pairT (real)T (real)

Power Iteration

Iterative methods for computing specific eigenvalues. Useful when not all eigenvalues are needed for large matrices.

power_method

Computes the dominant eigenvalue (largest in absolute value) and its eigenvector.

  • Convergence rate: the error after $k$ iterations is $O\!\left(|\lambda_2/\lambda_1|^k\right)$ — depends on the ratio of the second-largest to the largest eigenvalue (smaller ratio = faster)
  • Complexity: $O(n^2)$ per iteration (matrix-vector product)

Convergence requirements and caveats: A strictly unique dominant eigenvalue $\lambda_1$ (i.e. $|\lambda_1| > |\lambda_2|$ holds in the strict sense) is required. In the following situations the method either fails to converge or converges arbitrarily slowly:

  • Complex conjugate pair is dominant — for a real matrix, if $\lambda = a \pm bi$ is the largest pair in absolute value, the iteration oscillates within the real vector space and does not converge (there is no single real eigenvector to converge to).
  • Multiple eigenvalues share the largest absolute value — e.g. $\lambda_1 = 5, \lambda_2 = -5$, or any spectrum where the outermost eigenvalues are symmetrically placed on a circle. Orthogonal matrices (all $|\lambda_i| = 1$) fall into this category too.
  • $|\lambda_2| \approx |\lambda_1|$ — the convergence ratio approaches 1 and the iteration count explodes. In this case, switch to shifted inverse_power_method, rayleigh_quotient_iteration, or the full QR-based eigen.

In practice power_method is most reliable on symmetric positive-definite matrices and on systems where a dominant real eigenvalue is guaranteed (e.g. stochastic matrices for Markov chains and PageRank). For general non-symmetric real matrices, prefer eigen as the default and use power_method only when the uniqueness of the dominant eigenvalue is known in advance.

template<typename T>
std::pair<T, Vector<T>> power_method(const Matrix<T>& a,
    T tolerance = T(1e-10),
    std::size_t max_iterations = 1000)
Matrix<double> A = {{2, 1}, {1, 3}};
auto [lambda, v] = power_method(A);
// lambda ~ 3.618

inverse_power_method

Computes the eigenvalue closest to a given shift $\sigma$.

template<typename T>
std::pair<T, Vector<T>> inverse_power_method(const Matrix<T>& a,
    T sigma = T(0),
    T tolerance = T(1e-10),
    std::size_t max_iterations = 1000)
Matrix<double> A = {{2, 1}, {1, 3}};
auto [lambda, v] = inverse_power_method(A);
// lambda ≈ 1.382 (smallest eigenvalue)

rayleigh_quotient_iteration

Rayleigh quotient iteration. Updates the shift at each iteration, achieving cubic convergence.

template<typename T>
std::pair<T, Vector<T>> rayleigh_quotient_iteration(const Matrix<T>& a,
    const Vector<T>& initial_guess,
    T tolerance = T(1e-10),
    std::size_t max_iterations = 100)
Matrix<double> A = {{2, 1}, {1, 3}};
Vector<double> v0 = {1, 1};
auto [lambda, v] = rayleigh_quotient_iteration(A, v0);
// lambda ≈ 3.618 (converges rapidly to the eigenvalue nearest to v0)

schur_decomposition — Schur Decomposition

template<typename T>
std::pair<Matrix<T>, Matrix<T>> schur_decomposition(const Matrix<T>& a)

Computes the real Schur decomposition $A = QTQ^\top$. $T$ is a quasi-upper-triangular matrix (real eigenvalues on the diagonal, complex eigenvalues as $2 \times 2$ blocks), and $Q$ is an orthogonal matrix. Used internally by matrix functions (expm, logm, sqrtm).

Matrix<double> A = {{1, 2}, {0, 3}};
auto [T, Q] = schur_decomposition(A);
// T ≈ {{1, *}, {0, 3}}  (upper triangular)
// Q * T * Q^T ≈ A

Solvers

SolverType

enum class SolverType {
    Auto,       // Automatically selects the best solver based on matrix properties
    LU,         // LU decomposition (general-purpose)
    Cholesky,   // Cholesky decomposition (symmetric positive definite)
    QR,         // QR decomposition (numerical stability, overdetermined systems)
    SVD,        // SVD (ill-conditioned, rank-deficient)
    LDL,        // LDL decomposition (symmetric indefinite)
    Iterative   // Iterative method (CG)
};

Auto selection logic:

  1. Symmetric and positive definite → Cholesky
  2. Symmetric → LDL
  3. Square → LU
  4. Overdetermined → QR

solve

template<typename T>
Vector<T> solve(const Matrix<T>& a, const Vector<T>& b,
    SolverType solver_type = SolverType::Auto)
ParameterTypeDefaultDescription
aconst Matrix<T>&Coefficient matrix
bconst Vector<T>&Right-hand side vector
solver_typeSolverTypeAutoSolver selection

Return value: Solution vector $\mathbf{x}$.

using namespace sangi::algorithms;

Matrix<double> A = {{2, 1, -1}, {-3, -1, 2}, {-2, 1, 2}};
Vector<double> b = {8, -11, -3};

auto x1 = solve(A, b);                      // auto-selection
auto x2 = solve(A, b, SolverType::LU);      // explicit LU
auto x3 = solve(A, b, "cholesky");           // string specification also available

A string overload is also provided. Available strings: "auto", "lu", "cholesky", "qr", "svd", "ldl", "iterative".

solve_multi — Multiple Right-Hand Sides

template<typename T>
Matrix<T> solve_multi(const Matrix<T>& a, const Matrix<T>& b,
    SolverType solver_type = SolverType::Auto)
ParameterTypeDefaultDescription
aconst Matrix<T>&Coefficient matrix ($n \times n$)
bconst Matrix<T>&Right-hand side matrix ($n \times m$; each column is one right-hand side)
solver_typeSolverTypeAutoSolver selection

Return value: Solution matrix $X$ ($n \times m$). Solves each column of $AX = B$ individually via solve.

Matrix<double> A = {{2, 1}, {5, 3}};
Matrix<double> B = {{1, 0}, {0, 1}};  // identity → computes the inverse
auto X = solve_multi(A, B);
// X = {{3, -1}, {-5, 2}} = A^{-1}

Iterative Solvers

Iterative methods that are effective when direct solvers are too costly for large matrices, or when the matrix is sparse.

conjugate_gradient — Conjugate Gradient (CG)

  • Convergence rate: $O(\sqrt{\kappa})$ iterations ($\kappa$: condition number)
  • Constraint: Symmetric positive definite matrices only
  • Complexity: $O(n^2)$ per iteration (dense matrix) or $O(\mathrm{nnz})$ (sparse matrix)
template<typename T>
Vector<T> conjugate_gradient(const Matrix<T>& a, const Vector<T>& b,
    T tolerance = std::numeric_limits<T>::epsilon() * 10,
    typename Matrix<T>::size_type max_iterations = 1000)
ParameterTypeDefaultDescription
aconst Matrix<T>&Symmetric positive definite matrix
bconst Vector<T>&Right-hand side vector
toleranceT$\varepsilon \times 10$Convergence criterion: $\|\mathbf{r}\| \le \text{tolerance} \cdot \|\mathbf{b}\|$
max_iterationssize_type1000Maximum number of iterations
Matrix<double> A = {{4, 1}, {1, 3}};
Vector<double> b = {1, 2};
auto x = conjugate_gradient(A, b, 1e-10, 1000);

jacobi_method — Jacobi Iteration

  • Convergence condition: Guaranteed for diagonally dominant matrices ($|a_{ii}| > \sum_{j \ne i} |a_{ij}|$)
  • Convergence rate: Slower than CG but simple to implement. Well-suited for parallelization
template<typename T>
Vector<T> jacobi_method(const Matrix<T>& a, const Vector<T>& b,
    T tolerance = std::numeric_limits<T>::epsilon() * 10,
    typename Matrix<T>::size_type max_iterations = 1000)
Matrix<double> A = {{10, -1}, {-1, 10}};
Vector<double> b = {9, 9};
auto x = jacobi_method(A, b);
// x ≈ {1, 1}

gauss_seidel_method — Gauss-Seidel Iteration

  • Convergence rate: About twice as fast as Jacobi (under the same conditions)
  • Convergence condition: Guaranteed for diagonally dominant or symmetric positive definite matrices
template<typename T>
Vector<T> gauss_seidel_method(const Matrix<T>& a, const Vector<T>& b,
    T tolerance = std::numeric_limits<T>::epsilon() * 10,
    typename Matrix<T>::size_type max_iterations = 1000)
Matrix<double> A = {{10, -1}, {-1, 10}};
Vector<double> b = {9, 9};
auto x = gauss_seidel_method(A, b);
// x ≈ {1, 1} (converges faster than Jacobi)

sor_method — SOR (Successive Over-Relaxation)

  • Convergence condition: Guaranteed for diagonally dominant matrices when $0 < \omega < 2$. Reduces to Gauss-Seidel when $\omega = 1$
template<typename T>
Vector<T> sor_method(const Matrix<T>& a, const Vector<T>& b,
    T omega = static_cast<T>(1.5),
    T tolerance = std::numeric_limits<T>::epsilon() * 10,
    typename Matrix<T>::size_type max_iterations = 1000)
Matrix<double> A = {{10, -1}, {-1, 10}};
Vector<double> b = {9, 9};
auto x = sor_method(A, b, 1.2);  // omega = 1.2
// x ≈ {1, 1}
ParameterTypeDefaultDescription
omegaT1.5Relaxation parameter ($0 < \omega < 2$)

Choosing an Iterative Method

MethodMatrix RequirementsRelative SpeedNotes
CGSymmetric positive definiteFastFirst choice for large-scale problems
Gauss-SeidelDiagonally dominantMediumSimple and easy to implement
SORDiagonally dominantMedium–fastFaster than GS with tuned $\omega$
JacobiDiagonally dominantSlowWell-suited for parallelization

Least Squares

least_squares

template<typename T>
Vector<T> least_squares(const Matrix<T>& a, const Vector<T>& b,
    SolverType solver_type = SolverType::Auto)
ParameterTypeDefaultDescription
aconst Matrix<T>&Coefficient matrix ($m \times n$, typically $m \ge n$)
bconst Vector<T>&Right-hand side vector ($m$ elements)
solver_typeSolverTypeAutoSolver selection

Return value: Least squares solution $\mathbf{x} = \arg\min \|A\mathbf{x} - \mathbf{b}\|_2$ ($n$ elements).

Auto selection logic: Uses QR decomposition for overdetermined systems ($m \ge n$), and SVD for underdetermined systems ($m < n$). Solving the normal equations $A^\top A \mathbf{x} = A^\top \mathbf{b}$ directly is not used due to numerical instability.

Matrix<double> A = {{1, 1}, {1, 2}, {1, 3}};
Vector<double> b = {1, 2, 2};
auto x = least_squares(A, b);
// x ≈ {0.667, 0.5}  (least squares fit y = 0.667 + 0.5*x)

weighted_least_squares

template<typename T>
Vector<T> weighted_least_squares(const Matrix<T>& a, const Vector<T>& b,
    const Vector<T>& weights)
ParameterTypeDescription
aconst Matrix<T>&Coefficient matrix ($m \times n$)
bconst Vector<T>&Right-hand side vector ($m$ elements)
weightsconst Vector<T>&Weight vector ($m$ elements, confidence of each observation)

Return value: Weighted least squares solution $\mathbf{x} = \arg\min \sum w_i (A_i\mathbf{x} - b_i)^2$. Internally multiplies each row of $A$ and each element of $b$ by $\sqrt{w_i}$ and reduces to an ordinary least squares problem.

Matrix<double> A = {{1, 1}, {1, 2}, {1, 3}};
Vector<double> b = {1, 2, 2};
Vector<double> w = {1, 1, 2};  // emphasize the 3rd observation
auto x = weighted_least_squares(A, b, w);

regularized_least_squares — Ridge Regression

template<typename T>
Vector<T> regularized_least_squares(const Matrix<T>& a, const Vector<T>& b,
    T lambda = static_cast<T>(0.1))
ParameterTypeDefaultDescription
lambdaT0.1Regularization parameter ($\lambda > 0$)

Return value: Regularized least squares solution $\mathbf{x} = (A^\top A + \lambda I)^{-1} A^\top \mathbf{b}$.

Larger $\lambda$ reduces the solution norm and improves stability, but increases bias. It is common practice to select the optimal $\lambda$ via cross-validation.

Matrix<double> A = {{1, 1}, {1, 2}, {1, 3}};
Vector<double> b = {1, 2, 2};
auto x = regularized_least_squares(A, b, 0.1);
// lambda = 0.1: the norm of solution x is suppressed

Special Matrix Solvers

tridiagonal_solve — Thomas Algorithm

Fast solver for tridiagonal matrices.

  • Complexity: $O(n)$ (dramatically faster than LU's $O(n^3)$)
  • Use cases: Spline interpolation, finite difference methods (FDM), implicit time integration
template<typename T>
Vector<T> tridiagonal_solve(const Vector<T>& a, const Vector<T>& b,
    const Vector<T>& c, const Vector<T>& d)
ParameterTypeDescription
aconst Vector<T>&Sub-diagonal ($n{-}1$ elements)
bconst Vector<T>&Main diagonal ($n$ elements)
cconst Vector<T>&Super-diagonal ($n{-}1$ elements)
dconst Vector<T>&Right-hand side vector ($n$ elements)
//  [2 1 0] [x0]   [1]
//  [1 3 1] [x1] = [2]
//  [0 1 2] [x2]   [3]
Vector<double> a = {1, 1};      // sub-diagonal
Vector<double> b = {2, 3, 2};   // main diagonal
Vector<double> c = {1, 1};      // super-diagonal
Vector<double> d = {1, 2, 3};   // right-hand side
auto x = tridiagonal_solve(a, b, c, d);

band_solve — Banded Matrix Solver

Banded matrix LU solver with partial pivoting.

  • Complexity: $O(n \cdot (k_l + k_u)^2)$
  • Optimization: Automatically delegates to the Thomas algorithm when $k_l = k_u = 1$ (tridiagonal)
template<typename T>
Vector<T> band_solve(const Matrix<T>& a, const Vector<T>& b,
    typename Matrix<T>::size_type kl,
    typename Matrix<T>::size_type ku)
ParameterTypeDescription
aconst Matrix<T>&Banded matrix ($n \times n$, stored in dense format)
bconst Vector<T>&Right-hand side vector
klsize_typeLower bandwidth
kusize_typeUpper bandwidth
// Tridiagonal matrix (bandwidth 1)
Matrix<double> A = {{2,-1, 0}, {-1, 2,-1}, {0,-1, 2}};
Vector<double> b = {1, 0, 1};
auto x = band_solve(A, b, 1, 1);  // lower=1, upper=1
// x = {1, 1, 1}

Sparse Matrix Solvers (CSR Format)

Solvers for sparse matrices stored in CSR (Compressed Sparse Row) format.

sparse_solve

template<typename T>
Vector<T> sparse_solve(
    const std::vector<T>& values,
    const std::vector<size_type>& row_ptr,
    const std::vector<size_type>& col_idx,
    const Vector<T>& b,
    T tolerance = std::numeric_limits<T>::epsilon() * T(100),
    typename Matrix<T>::size_type max_iterations = 0)

Internally delegates to sparse_bicgstab (supports non-symmetric matrices).

SparseMatrix<double> A(3, 3);
A.insert(0,0, 4); A.insert(0,1,-1);
A.insert(1,0,-1); A.insert(1,1, 4); A.insert(1,2,-1);
A.insert(2,1,-1); A.insert(2,2, 4);
Vector<double> b = {1, 2, 1};
auto x = sparse_solve(A, b);

sparse_cg — Sparse CG

template<typename T>
Vector<T> sparse_cg(
    const std::vector<T>& values,
    const std::vector<size_type>& row_ptr,
    const std::vector<size_type>& col_idx,
    const Vector<T>& b,
    T tolerance = std::numeric_limits<T>::epsilon() * 10,
    typename Matrix<T>::size_type max_iterations = 0)

Constraint: Symmetric positive definite matrices only. Use sparse_bicgstab for non-symmetric matrices.

// SPD sparse matrix
SparseMatrix<double> A(2, 2);
A.insert(0,0, 4); A.insert(0,1, 1);
A.insert(1,0, 1); A.insert(1,1, 3);
Vector<double> b = {1, 2};
auto x = sparse_cg(A, b);
// x ≈ {0.091, 0.636}

sparse_bicgstab — Sparse BiCGSTAB

template<typename T>
Vector<T> sparse_bicgstab(
    const std::vector<T>& values,
    const std::vector<size_type>& row_ptr,
    const std::vector<size_type>& col_idx,
    const Vector<T>& b,
    T tolerance = std::numeric_limits<T>::epsilon() * 10,
    typename Matrix<T>::size_type max_iterations = 0)

Unlike CG, supports non-symmetric matrices. More memory-efficient than GMRES.

SparseMatrix<double> A(2, 2);
A.insert(0,0, 3); A.insert(0,1, 1);
A.insert(1,0, 2); A.insert(1,1, 4);
Vector<double> b = {1, 2};
auto x = sparse_bicgstab(A, b);
// x ≈ {0.2, 0.4}

Matrix Functions

Analytic functions applied to matrices. Internally uses Schur decomposition and Parlett recurrence.

expm — Matrix Exponential

template<typename T>
Matrix<T> expm(const Matrix<T>& A)

Computes $e^A$ using the scaling-and-squaring method with Padé approximation (Higham 2005). The Padé order is automatically selected from [3/3] to [13/13] based on the 1-norm of the matrix. For matrices with a small norm, a low-order approximation is sufficient and fast; for large norms, scaling (halving) combined with the maximum order [13/13] ensures high accuracy. There is currently no parameter to manually specify the order.

Matrix<double> A = {{0, -1}, {1, 0}};
auto E = expm(A);
// E ≈ {{cos(1), -sin(1)}, {sin(1), cos(1)}}

sqrtm — Matrix Square Root

template<typename T>
Matrix<T> sqrtm(const Matrix<T>& A)

Computes the principal square root $A^{1/2}$. $\text{sqrtm}(A) \cdot \text{sqrtm}(A) = A$.

Matrix<double> A = {{4, 1}, {1, 3}};
auto S = sqrtm(A);
// S * S ≈ A

logm — Matrix Logarithm

template<typename T>
Matrix<T> logm(const Matrix<T>& A)

Computes the principal logarithm $\log(A)$. $e^{\log(A)} = A$.

Matrix<double> A = {{2, 1}, {0, 2}};
auto L = logm(A);
auto check = expm(L);
// check ≈ A

Gershgorin Discs

Estimates the region where eigenvalues lie. For each row, returns a disc with center $a_{ii}$ and radius $\sum_{j \ne i}|a_{ij}|$.

gershgorin_discs

template<typename T>
std::vector<GershgorinDisc<T>> gershgorin_discs(const Matrix<T>& a)

// GershgorinDisc<T> { T center; T radius; }
Matrix<double> A = {{5, 0.5, 0.2}, {0.5, 3, 0.3}, {0.2, 0.3, 4}};
auto discs = gershgorin_discs(A);
// discs[0]: center=5.0, radius=0.7
// discs[1]: center=3.0, radius=0.8
// discs[2]: center=4.0, radius=0.5

gershgorin_bounds

template<typename T>
std::pair<T, T> gershgorin_bounds(const Matrix<T>& a)

Returns the interval $[\lambda_{\min}, \lambda_{\max}]$ that contains all eigenvalues.

auto [lo, hi] = gershgorin_bounds(A);
// all eigenvalues lie within [lo, hi]

Matrices with Complex Eigenvalues

gershgorin_bounds uses std::abs() for radius computation, so it works correctly with matrices that have complex-valued elements. However, the returned bounds represent an interval on the real axis and do not reflect the imaginary parts of eigenvalues.

To locate complex eigenvalues precisely, inspect the individual Gershgorin discs via gershgorin_discs (center: $a_{ii}$, radius: $R_i = \sum_{j \ne i}|a_{ij}|$). Each eigenvalue lies within at least one disc.

Vector Space Utilities

Provides fundamental vector space operations: Gram-Schmidt orthonormalization to convert a set of vectors into an orthogonal basis, projection of one vector onto another, linear independence testing, and linear regression via least squares.

Typical use cases: constructing and orthogonalizing bases (post-processing eigenvectors, subspace projection), linear fitting of data, and computing angles between vectors.

gram_schmidt

template<typename T, typename V>
std::vector<V> gram_schmidt(std::vector<V> vectors)
std::vector<Vector<double>> basis = {{1,0,1}, {0,1,1}, {1,1,0}};
auto ortho = gram_schmidt<double, Vector<double>>(basis);
// ortho is the orthogonalized set of vectors

projection

template<typename T, typename V>
V projection(const V& v, const V& onto)
Vector<double> v = {3, 4};
Vector<double> u = {1, 0};
auto proj = projection<double>(v, u);
// proj = {3, 0}

normalize

template<typename T, typename V>
V normalize(const V& v)

Divides $\mathbf{v}$ by its Euclidean norm to return the unit vector $\mathbf{v}/\|\mathbf{v}\|$. Throws std::invalid_argument when the norm is below std::numeric_limits<T>::epsilon() (i.e. effectively zero).

Vector<double> v = {3.0, 4.0};
auto u = normalize<double>(v);
// u = {0.6, 0.8}, ||u|| = 1

linear_combination

template<typename T, typename V>
V linear_combination(const std::vector<V>& vectors, const std::vector<T>& coefficients)

Computes the linear combination $\sum_{i=1}^{n} c_i \mathbf{v}_i$ of vectors $\mathbf{v}_1, \ldots, \mathbf{v}_n$ with coefficients $c_1, \ldots, c_n$. Throws std::invalid_argument if the number of vectors and coefficients mismatch, the vector list is empty, or the vectors have inconsistent sizes.

std::vector<Vector<double>> vs = {{1, 0, 0}, {0, 1, 0}, {0, 0, 1}};
std::vector<double> cs = {2.0, 3.0, -1.0};
auto w = linear_combination<double>(vs, cs);
// w = {2, 3, -1}

is_linearly_independent

template<typename T, typename V, typename M>
bool is_linearly_independent(const std::vector<V>& vectors)

Tests whether the given vector list is linearly independent. The implementation forms the Gram matrix $G_{ij} = \langle \mathbf{v}_i, \mathbf{v}_j \rangle$ and checks whether its determinant exceeds std::numeric_limits<T>::epsilon() (independence). An empty list is treated as independent and returns true; if the number of vectors exceeds the vector dimension, false is returned immediately.

std::vector<Vector<double>> indep = {{1, 0, 0}, {0, 1, 0}, {0, 0, 1}};
bool b1 = is_linearly_independent<double, Vector<double>, Matrix<double>>(indep);
// b1 = true

std::vector<Vector<double>> dep = {{1, 2, 3}, {2, 4, 6}, {0, 1, 1}};
bool b2 = is_linearly_independent<double, Vector<double>, Matrix<double>>(dep);
// b2 = false  (the second vector is 2 * the first)

Note: for large vector sets (n >= a few hundred) the Gram determinant easily overflows or underflows. For better numerical reliability, count the non-zero vectors left by gram_schmidt, or check the singular values of the matrix formed by stacking the vectors.

linear_regression

template<typename T, typename V, typename M>
V linear_regression(const M& X, const V& y)

Linear regression via least squares: $\beta = (X^\top X)^{-1} X^\top y$.

// Fit y = a + b*x
Matrix<double> X = {{1,1}, {1,2}, {1,3}, {1,4}};
Vector<double> y = {2.1, 3.9, 6.2, 7.8};
auto beta = linear_regression<double, Vector<double>, Matrix<double>>(X, y);
// beta ≈ {0.05, 1.97} → y ≈ 0.05 + 1.97*x

angle

template<typename T, typename V>
T angle(const V& u, const V& v)

Returns the angle (in radians) between two vectors.

BLAS (Basic Linear Algebra Subprograms)

BLAS (Basic Linear Algebra Subprograms) is a standard interface defining fundamental linear algebra operations. sangi implements the key Level 1 through Level 3 BLAS functions in the sangi::blas namespace.

These functions are used internally by decomposition algorithms and solvers, and can also be called directly when you need explicit control over matrix-vector operations (e.g., performing update operations like $y \leftarrow \alpha A x + \beta y$ without buffer allocation).

Level 1: Vector-Vector Operations

dot — Inner Product

template<typename T> T dot(const Vector<T>& x, const Vector<T>& y)
Vector<double> x = {1, 2, 3}, y = {4, 5, 6};
double d = blas::dot(x, y);  // 32.0

nrm2 — 2-Norm

Computes the Euclidean norm (2-norm) of a vector: $\|x\|_2 = \sqrt{\sum x_i^2}$.

template<typename T> T nrm2(const Vector<T>& x)

asum — 1-Norm

Computes the sum of absolute values (1-norm) of a vector: $\|x\|_1 = \sum |x_i|$.

template<typename T> T asum(const Vector<T>& x)

scal — Scaling

Scales a vector by a scalar: $x \leftarrow \alpha x$. In-place operation.

template<typename T> void scal(T alpha, Vector<T>& x)  // x ← α·x

axpy — AXPY

$y \leftarrow \alpha x + y$. Performs vector scaling and addition in a single pass. One of the most frequently used BLAS functions.

template<typename T> void axpy(T alpha, const Vector<T>& x, Vector<T>& y)  // y ← α·x + y
Vector<double> x = {1, 2, 3}, y = {4, 5, 6};
blas::axpy(2.5, x, y);  // y = {6.5, 10.0, 13.5}

Level 2: Matrix-Vector Operations

gemv — General Matrix-Vector Product

General matrix-vector product: $y \leftarrow \alpha \cdot \mathrm{op}(A) \cdot x + \beta \cdot y$. Set trans=true for transpose. Buffer reuse eliminates temporary objects.

template<typename T>
void gemv(bool trans, T alpha, const Matrix<T>& A,
          const Vector<T>& x, T beta, Vector<T>& y)  // y ← α·op(A)·x + β·y

trsv — Triangular Solve

Solves the triangular system $\mathrm{op}(A) x = b$ via forward/back substitution. The upper/trans/unit_diag flags specify the triangular matrix type.

template<typename T>
void trsv(bool upper, bool trans, bool unit_diag,
          const Matrix<T>& A, Vector<T>& x)  // x ← op(A)⁻¹·x

Level 3: Matrix-Matrix Operations

gemm — General Matrix Multiply

General matrix multiplication: $C \leftarrow \alpha \cdot \mathrm{op}(A) \cdot \mathrm{op}(B) + \beta \cdot C$. The core Level 3 BLAS routine. For large matrices, memory access patterns dominate performance.

template<typename T>
void gemm(bool transA, bool transB, T alpha,
          const Matrix<T>& A, const Matrix<T>& B,
          T beta, Matrix<T>& C)  // C ← α·op(A)·op(B) + β·C

trsm — Triangular Solve (Matrix Version)

Solves a triangular matrix equation: $B \leftarrow \alpha \cdot \mathrm{op}(A)^{-1} \cdot B$ or $B \cdot \mathrm{op}(A)^{-1}$.

template<typename T>
void trsm(bool side_left, bool upper, bool trans, bool unit_diag,
          T alpha, const Matrix<T>& A, Matrix<T>& B)  // B ← α·op(A)⁻¹·B or B·op(A)⁻¹

Randomized SVD

Halko-Martinsson-Tropp (2011) algorithm. Rapidly computes the top $k$ singular values and vectors of a large matrix.

randomizedSVD

Parameters: k = number of singular values to compute, oversampling = oversampling count (improves accuracy), powerIter = number of power iterations (effective when singular value decay is slow), seed = random seed.

template<typename T>
RandomizedSVDResult<T> randomizedSVD(const Matrix<T>& A, size_t k,
    size_t oversampling = 5, size_t powerIter = 1, unsigned seed = 42)

// RandomizedSVDResult<T> { Matrix<T> U; std::vector<T> S; Matrix<T> Vt; }
Matrix<double> A(1000, 500);  // large matrix
// ... fill A ...
auto result = randomizedSVD(A, 10);  // top 10 singular values
// result.U: (1000×10), result.S: 10 values, result.Vt: (10×500)

Sparse Direct Solvers

Direct decomposition solvers for sparse matrices. #include <math/linalg/simplicial_cholesky.hpp> etc.

SimplicialLLT — Sparse Cholesky

template<typename T, typename IndexType = std::size_t>
class SimplicialLLT {
public:
    explicit SimplicialLLT(const SparseMatrix<T, IndexType>& A);
    void compute(const SparseMatrix<T, IndexType>& A);
    Vector<T> solve(const Vector<T>& b) const;
    bool success() const;
};

Cholesky decomposition $A = LL^\top$ for symmetric positive definite sparse matrices.

SparseMatrix<double> A(100, 100);
// ... build symmetric positive definite matrix ...
SimplicialLLT<double> chol(A);
auto x = chol.solve(b);

SimplicialLDLT — Sparse LDL

template<typename T, typename IndexType = std::size_t>
class SimplicialLDLT {
public:
    explicit SimplicialLDLT(const SparseMatrix<T, IndexType>& A);
    void compute(const SparseMatrix<T, IndexType>& A);
    Vector<T> solve(const Vector<T>& b) const;
    bool success() const;
};

$A = LDL^\top$ decomposition for symmetric sparse matrices. No sqrt needed; numerically stable.

SparseLU — Sparse LU

template<typename T, typename IndexType = std::size_t>
class SparseLU {
public:
    void compute(const SparseMatrix<T, IndexType>& A);
    Vector<T> solve(const Vector<T>& b) const;
    bool success() const;
};

LU decomposition $PA = LU$ for non-symmetric sparse matrices.

SparseQR — Sparse QR

template<typename T, typename IndexType = std::size_t>
class SparseQR {
public:
    void compute(const SparseMatrix<T, IndexType>& A);
    Vector<T> solve(const Vector<T>& b) const;
    bool success() const;
};

QR decomposition for sparse matrices. Also usable for least squares problems.

Sparse Iterative Solvers

Namespace sangi::sparse_algorithms. Iterative solvers for large-scale sparse matrices.

IterativeSolver — Builder Pattern

template<typename T, typename IndexType = std::size_t>
class IterativeSolver {
public:
    IterativeSolver(const SparseMatrix<T, IndexType>& A, const Vector<T>& b);
    IterativeSolver& solver(SolverType type);          // CG, BiCGSTAB, GMRES, or MINRES
    IterativeSolver& preconditioner(PreconditionerType type); // None, Jacobi, ILU0, or IC0
    IterativeSolver& tolerance(T abs_tol, T rel_tol);
    IterativeSolver& maxIterations(std::size_t n);
    IterativeSolver& restart(std::size_t m);            // GMRES restart interval
    IterativeSolver& initialGuess(Vector<T> x0);
    IterativeSolver& enableLog();                       // Record residual at each iteration
    SolverResult<T> solve();
};
auto result = IterativeSolver<double>(A, b)
    .solver(SolverType::BiCGSTAB)
    .preconditioner(PreconditionerType::ILU0)
    .tolerance(1e-10, 1e-8)
    .maxIterations(500)
    .solve();
// result.success, result.x, result.final_residual

minres — MINRES

For symmetric indefinite matrices. Uses Lanczos tridiagonalization + Givens QR.

template<typename T, typename IndexType>
SolverResult<T> minres(const SparseMatrix<T, IndexType>& A, const Vector<T>& b,
    const ConvergenceCriteria<T>& criteria, ...)

fgmres — Flexible GMRES

Allows different preconditioners at each iteration. For non-symmetric matrices.

template<typename T, typename IndexType>
SolverResult<T> fgmres(const SparseMatrix<T, IndexType>& A, const Vector<T>& b,
    const ConvergenceCriteria<T>& criteria, std::size_t restart = 30, ...)

bicgstab — BiCGSTAB (Sparse Version)

For non-symmetric matrices. More memory-efficient than GMRES.

template<typename T, typename IndexType>
SolverResult<T> bicgstab(const SparseMatrix<T, IndexType>& A, const Vector<T>& b,
    const ConvergenceCriteria<T>& criteria, ...)

conjugate_gradient — CG (Sparse Version)

For symmetric positive definite matrices.

template<typename T, typename IndexType>
SolverResult<T> conjugate_gradient(const SparseMatrix<T, IndexType>& A, const Vector<T>& b,
    const ConvergenceCriteria<T>& criteria, ...)

Examples

1. Solving a Linear System with LU Decomposition

#include <math/linalg/decomposition.hpp>
#include <math/linalg/solvers.hpp>
#include <iostream>
using namespace sangi;
using namespace sangi::algorithms;

int main() {
    // 3x3 linear system Ax = b
    Matrix<double> A = {{2, 1, -1}, {-3, -1, 2}, {-2, 1, 2}};
    Vector<double> b = {8, -11, -3};

    // Method 1: auto-selection
    auto x = solve(A, b);
    std::cout << "x = [" << x[0] << ", " << x[1] << ", " << x[2] << "]" << std::endl;
    // x = [2, 3, -1]

    // Method 2: direct LU decomposition (when reusing the decomposition)
    auto [LU, perm] = lu_decomposition(A);
    auto det = lu_determinant(A);
    auto inv = lu_inverse(A);
    std::cout << "det(A) = " << det << std::endl;
}
// Run output (build & run verified):
//   x = [2, 3, -1]                     (matches the exact solution)
//   det(A) = -0.999999999999999         (machine-precision error around -1)

2. Low-Rank Approximation with SVD

#include <math/linalg/decomposition.hpp>
#include <iostream>
using namespace sangi;
using namespace sangi::algorithms;

int main() {
    // 3x3 matrix
    Matrix<double> A = {{1, 2, 3}, {4, 5, 6}, {7, 8, 9}};
    auto [U, sigma, Vt] = svd_decomposition(A);

    std::cout << "Singular values: ";
    for (std::size_t i = 0; i < sigma.size(); ++i)
        std::cout << sigma[i] << " ";
    std::cout << std::endl;

    // Rank-1 approximation: use only the largest singular value
    Matrix<double> U1(U.rows(), 1), Vt1(1, Vt.cols());
    for (std::size_t i = 0; i < U.rows(); ++i) U1(i, 0) = U(i, 0);
    for (std::size_t j = 0; j < Vt.cols(); ++j) Vt1(0, j) = Vt(0, j);
    Vector<double> s1 = {sigma[0]};
    auto A1 = reconstruct_matrix_from_svd(U1, s1, Vt1);
    std::cout << "Rank-1 approximation:" << std::endl;
    for (std::size_t i = 0; i < A1.rows(); ++i) {
        for (std::size_t j = 0; j < A1.cols(); ++j)
            std::cout << A1(i, j) << " ";
        std::cout << std::endl;
    }
}
// Run output (build & run verified):
//   Singular values = 16.8481, 1.0684, 8.3e-16
//   (reflects mathematical rank 2; σ_3 is the floating-point representation of zero)
//   rank-1 approx (0,0) ≈ 1.7362; deviation from A(0,0)=1 is the σ_2 contribution

3. gaussian_elimination + PivotStrategy::Full

#include <math/linalg/decomposition.hpp>
#include <iostream>
using namespace sangi;
using namespace sangi::algorithms;

int main() {
    // Full pivoting for maximum numerical stability
    Matrix<double> A = {
        {1e-18, 1.0,  1.0},
        {1.0,   1.0,  2.0},
        {1.0,   2.0,  1.0}
    };
    Vector<double> b = {2.0, 4.0, 4.0};

    // Partial pivoting (default)
    auto x_partial = gaussian_elimination(A, b, PivotStrategy::Partial);

    // Full pivoting (maximum stability)
    auto x_full = gaussian_elimination(A, b, PivotStrategy::Full);

    std::cout << "Partial pivot: [" << x_partial[0] << ", "
              << x_partial[1] << ", " << x_partial[2] << "]" << std::endl;
    std::cout << "Full pivot:    [" << x_full[0] << ", "
              << x_full[1] << ", " << x_full[2] << "]" << std::endl;
}
// Run output (build & run verified):
//   Partial: [1, 1, 1]
//   Full:    [1, 1, 1]
// (sangi's partial pivot already stabilizes the 1e-18 pivot internally,
//  so both strategies agree on this system.)

4. Exact Solution of a Hilbert Matrix with Matrix<Rational>

#include <math/linalg/decomposition.hpp>
#include <math/core/mp/Rational.hpp>
#include <iostream>
using namespace sangi;
using namespace sangi::algorithms;

int main() {
    // 4x4 Hilbert matrix: H(i,j) = 1/(i+j+1)
    // A classic example of an ill-conditioned matrix (cond ~ 15514)
    const std::size_t n = 4;
    Matrix<Rational> H(n, n);
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j)
            H(i, j) = Rational(1, static_cast<int64_t>(i + j + 1));

    // Right-hand side: b = H * [1, 1, 1, 1]^T (so the solution is all ones)
    Vector<Rational> b(n);
    for (std::size_t i = 0; i < n; ++i) {
        Rational sum(0);
        for (std::size_t j = 0; j < n; ++j)
            sum = sum + H(i, j);
        b[i] = sum;
    }

    // Exact solution using Rational (minimum-height pivot selection)
    auto x = gaussian_elimination(H, b);

    std::cout << "Exact solution of Hilbert system:" << std::endl;
    for (std::size_t i = 0; i < n; ++i)
        std::cout << "  x[" << i << "] = " << x[i] << std::endl;
    // x = {1, 1, 1, 1} (exact)
}
// Run output (build & run verified): x = [1, 1, 1, 1]  (Rational gives the exact solution; floating-point would lose digits at cond ~ 15514)

Related Mathematical Background

The following articles explain the mathematical concepts underlying the linear algebra module.