Matrix Rank
Goals of this chapter
Understand the rank of a matrix as "the maximum number of linearly independent rows/columns = the essential dimension of the matrix." Connect this with row reduction, the rank-nullity theorem, and the structure of solutions to linear systems.
Prerequisites
- Chapter 1: Vector Space Basics — subspaces, bases, dimension ($\dim$), and linear maps.
- Chapter 2: Linear Independence — rank is defined as the maximum number of linearly independent rows/columns.
- Chapter 3: Determinant — determinants ($\det$), minors, and invertibility ($A^{-1}$).
- Elementary row operations (three kinds) — same as Gaussian elimination for linear systems.
Contents
§1. Overview
The rank of a matrix is the integer expressing the maximum number of linearly independent rows (or columns); it measures the "essential dimension" of the matrix. For instance, a matrix with three rows may have rank 1 if the second row is twice the first and the third row is the sum of the first two: there is effectively only one row's worth of independent information.
Rank plays a central role throughout linear algebra — in the solvability of linear systems, the dimensions of the image and kernel of a linear map, the effective dimension of data, and so on. In particular, the following are equivalent for an $n \times n$ square matrix $A$, a key fact that threads through the basic material:
- $A$ is full rank ($\text{rank}(A) = n$)
- $A$ is invertible (i.e., $A^{-1}$ exists)
- $\det(A) \neq 0$
- $A\mathbf{x} = \mathbf{0}$ has only $\mathbf{x} = \mathbf{0}$ as a solution
§2. Definition of Rank
Definition: Rank of a matrix
The rank $\text{rank}(A)$ of an $m \times n$ matrix $A$ is defined by any of the following (all equivalent):
- Row rank: the maximum number of linearly independent row vectors of $A$
- Column rank: the maximum number of linearly independent column vectors of $A$
- Minor: the size of the largest square submatrix with nonzero determinant
Upper bound on rank and full rank
For any $m \times n$ matrix $A$, $$\text{rank}(A) \leq \min(m, n)$$ since there are only $m$ rows and $n$ columns. When equality $\text{rank}(A) = \min(m, n)$ holds, $A$ is said to be full rank. For a square matrix, full rank is equivalent to invertibility.
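The bound is easy to confirm numerically with NumPy's `matrix_rank` (a quick sketch; the example matrix is arbitrary):

```python
import numpy as np

# A 2x3 matrix has at most min(2, 3) = 2 independent rows/columns.
A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])
r = np.linalg.matrix_rank(A)
assert r <= min(A.shape)
print(r)  # 2 (the two rows are independent, so the bound is attained)
```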
§3. Theorem: Row Rank = Column Rank
Theorem: Row rank = column rank
For any matrix, the row rank and the column rank are equal. We therefore speak simply of "the rank" without distinguishing between rows and columns.
This theorem ensures that the three definitions in §2 — row rank, column rank, and minor — all yield the same value. Geometrically, it says that the dimension of the subspace spanned by the rows always equals the dimension of the subspace spanned by the columns. This is not obvious, but the proof is short.
The proof uses the rank factorization $A = CB$ (where $C$ is $m \times r$ and $B$ is $r \times n$), combined with a symmetric argument applied to $A^T$ to extract inequalities in both directions and merge them. The factorization is also the stepping stone to SVD and low-rank approximation.
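The rank factorization is not unique; one convenient numerical construction, shown here as a sketch, builds $C$ and $B$ from the truncated SVD (anticipating §4.3 — the threshold `1e-10` is an illustrative choice, not a canonical value):

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [7.0, 8.0, 9.0]])

U, s, Vt = np.linalg.svd(A)
r = int(np.sum(s > 1e-10))   # numerical rank (here r = 2)

C = U[:, :r] * s[:r]         # m x r: left singular vectors scaled by singular values
B = Vt[:r, :]                # r x n: leading right singular vectors

assert C.shape == (3, r) and B.shape == (r, 3)
assert np.allclose(C @ B, A) # A = CB, a rank factorization
```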
The full proof is on an appendix page
The full Strang-style proof (with diagrams and steps 1-6) is separated into an appendix page for readability:
Appendix: Proof of row rank = column rank (Strang style)
Readers who only want the main line (computing rank, applications, and correspondence with linear systems) may skip the appendix and continue with §4 below.
§4. Computing Rank
4.1 Row reduction (the standard method)
Transform the matrix into row echelon form using elementary row operations (swap rows, multiply a row by a nonzero scalar, add a multiple of one row to another). The leading nonzero entry of each row is called a pivot, and the number of pivots = number of nonzero rows = rank.
Elementary row operations preserve the linear independence of rows, so the rank does not change during reduction. Because independence is immediate to read off in echelon form, this procedure is the default way to compute rank.
Example: $$A = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{pmatrix}$$ Subtract 4 × (row 1) from row 2, and 7 × (row 1) from row 3: $$\begin{pmatrix} 1 & 2 & 3 \\ 0 & -3 & -6 \\ 0 & -6 & -12 \end{pmatrix}$$ Then subtract 2 × (row 2) from row 3: $$\begin{pmatrix} 1 & 2 & 3 \\ 0 & -3 & -6 \\ 0 & 0 & 0 \end{pmatrix}$$ There are 2 nonzero rows, so $\text{rank}(A) = 2$.
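The pivot-counting procedure above can be sketched as a small forward-elimination routine in NumPy (the function name and tolerance are illustrative, not a library API; partial pivoting is added for numerical safety):

```python
import numpy as np

def rank_by_elimination(A, tol=1e-10):
    """Count pivots after forward elimination with partial pivoting."""
    M = A.astype(float).copy()
    m, n = M.shape
    row = 0
    for col in range(n):
        # choose the largest-magnitude entry in this column as the pivot
        p = row + np.argmax(np.abs(M[row:, col]))
        if abs(M[p, col]) < tol:
            continue                       # no pivot in this column
        M[[row, p]] = M[[p, row]]          # swap rows
        # eliminate entries below the pivot
        M[row+1:] -= np.outer(M[row+1:, col] / M[row, col], M[row])
        row += 1
        if row == m:
            break
    return row                             # number of pivots = rank

A = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])
print(rank_by_elimination(A))  # 2
```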
4.2 Minors
If every $k \times k$ minor vanishes but some $(k{-}1) \times (k{-}1)$ minor is nonzero, the rank is $k - 1$. Faithful to the theoretical definition, but combinatorially heavy for hand calculation.
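For small matrices, the minor definition can be checked directly by brute force; the helper below is a hypothetical name for illustration and is far too slow for anything but tiny examples:

```python
import itertools
import numpy as np

def rank_by_minors(A, tol=1e-10):
    """Largest k such that some k x k minor is nonzero (brute force)."""
    m, n = A.shape
    for k in range(min(m, n), 0, -1):
        for rows in itertools.combinations(range(m), k):
            for cols in itertools.combinations(range(n), k):
                if abs(np.linalg.det(A[np.ix_(rows, cols)])) > tol:
                    return k
    return 0

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [7.0, 8.0, 9.0]])
print(rank_by_minors(A))  # 2 (the 3x3 determinant vanishes, but e.g. det[[1,2],[4,5]] = -3)
```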
4.3 Singular value decomposition (SVD) ★advanced
Factor $A = U\Sigma V^T$; the number of nonzero singular values on the diagonal of $\Sigma$ equals the rank. Floating-point arithmetic forces a choice of threshold for "effectively zero," and with such a threshold in place the SVD is the most numerically stable way to determine rank (see §11).
§5. Basic Properties
Terminology: image ($\text{Im}$) and kernel ($\ker$)
We define the image and kernel of $A$ here since they are used repeatedly below. For an $m \times n$ matrix $A$:
- Image: $\text{Im}(A) = \{A\mathbf{x} : \mathbf{x} \in \mathbb{R}^n\} \subset \mathbb{R}^m$ — the set of all images under $A$. Coincides with the subspace spanned by the columns of $A$ (the column space).
- Kernel (null space): $\ker A = \{\mathbf{x} \in \mathbb{R}^n : A\mathbf{x} = \mathbf{0}\} \subset \mathbb{R}^n$ — the set of vectors mapped to $\mathbf{0}$ by $A$. The solution space of the homogeneous equation $A\mathbf{x} = \mathbf{0}$.
Viewing $A$ as a linear map $\mathbb{R}^n \to \mathbb{R}^m$, the image is the set of all outputs of the map and the kernel is the preimage of $\mathbf{0}$; their dimensions are the rank and nullity respectively (see §7.1, rank-nullity theorem).
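Both dimensions can be read off numerically from the SVD (a sketch; the threshold `1e-10` is an illustrative choice):

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])   # 2 x 3, rank 1

U, s, Vt = np.linalg.svd(A)
r = int(np.sum(s > 1e-10))

dim_im = r                   # dim Im(A) = rank
dim_ker = A.shape[1] - r     # dim ker(A) = n - rank (rank-nullity, §7.1)
print(dim_im, dim_ker)       # 1 2

# the rows of Vt beyond the rank span ker(A): A v = 0 for each such v
for v in Vt[r:]:
    assert np.allclose(A @ v, 0)
```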
5.1 Transpose and rank
$$\text{rank}(A) = \text{rank}(A^T)$$ A restatement of "row rank = column rank." Transposition does not change the number of independent rows or columns.
5.2 Product and rank
$$\text{rank}(AB) \leq \min(\text{rank}(A), \text{rank}(B))$$ Since $\text{Im}(AB) \subseteq \text{Im}(A)$, we have $\text{rank}(AB) \leq \text{rank}(A)$. Also $\text{Im}(AB) = A(\text{Im}(B))$, and applying $A$ cannot raise the dimension of a subspace, so $\text{rank}(AB) \leq \text{rank}(B)$.
5.3 Sum and rank
$$\text{rank}(A + B) \leq \text{rank}(A) + \text{rank}(B)$$ An inequality obtained from the observation that $\text{Im}(A+B) \subset \text{Im}(A) + \text{Im}(B)$.
5.4 Sylvester's inequality ★advanced
If $A$ is $m \times n$ and $B$ is $n \times p$, then $$\text{rank}(A) + \text{rank}(B) - n \leq \text{rank}(AB).$$ A quantitative bound: the rank lost in passing from $A$ to $AB$ is at most $n - \text{rank}(B)$, and symmetrically the loss from $B$ is at most $n - \text{rank}(A)$.
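Both the product bound of 5.2 and Sylvester's inequality can be spot-checked on random low-rank matrices (a sketch; the shapes, ranks, and seed are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)

# random low-rank matrices: A is 4x5 with rank 2, B is 5x3 with rank 2
A = rng.standard_normal((4, 2)) @ rng.standard_normal((2, 5))
B = rng.standard_normal((5, 2)) @ rng.standard_normal((2, 3))

rA = np.linalg.matrix_rank(A)
rB = np.linalg.matrix_rank(B)
rAB = np.linalg.matrix_rank(A @ B)
n = A.shape[1]                   # inner dimension, here 5

assert rAB <= min(rA, rB)        # product bound (5.2)
assert rA + rB - n <= rAB        # Sylvester's inequality (5.4)
print(rA, rB, rAB)               # 2 2 2
```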
5.5 Equivalent conditions for full rank
For an $n \times n$ square matrix $A$, the following are all equivalent:
- $\text{rank}(A) = n$ (full rank)
- $A$ is invertible (the inverse $A^{-1}$ exists)
- $\det(A) \neq 0$
- The columns (rows) of $A$ are linearly independent
- $A\mathbf{x} = \mathbf{0}$ has only $\mathbf{x} = \mathbf{0}$ as a solution
- For every $\mathbf{b} \in \mathbb{R}^n$, the equation $A\mathbf{x} = \mathbf{b}$ has a unique solution
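The equivalences can be spot-checked numerically for a concrete full-rank matrix (a sketch using standard NumPy routines; the matrix and right-hand side are arbitrary):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
n = A.shape[0]

assert np.linalg.matrix_rank(A) == n       # full rank
assert abs(np.linalg.det(A)) > 1e-12       # det(A) != 0 (here det = -2)
Ainv = np.linalg.inv(A)                    # the inverse exists
assert np.allclose(A @ Ainv, np.eye(n))

# Ax = b has a unique solution for any b
b = np.array([5.0, 6.0])
x = np.linalg.solve(A, b)
assert np.allclose(A @ x, b)
print("all equivalent conditions hold")
```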
§6. Examples
Example 1: Full-rank matrix
$$A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}$$ The column vectors $(1, 3)^T$ and $(2, 4)^T$ are linearly independent (neither is a scalar multiple of the other). Hence $\text{rank}(A) = 2$ (full rank).
Example 2: Rank-deficient matrix
$$B = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 4 & 6 \end{pmatrix}$$ The second row is twice the first, so only one independent row. $\text{rank}(B) = 1$.
Example 3: Zero matrix
$$O = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}$$ All rows and columns are the zero vector, so $\text{rank}(O) = 0$.
Example 4: Identity matrix
$$I_n = \begin{pmatrix} 1 & & \\ & \ddots & \\ & & 1 \end{pmatrix}$$ All rows and columns are linearly independent, so $\text{rank}(I_n) = n$ (full rank).
§7. Rank and Linear Systems
7.1 Homogeneous equation $A\mathbf{x} = \mathbf{0}$
The dimension of the solution space (kernel, null space) is called the nullity of $A$, written $\dim(\ker A)$. For an $m \times n$ matrix $A$, $$\dim(\ker A) = n - \text{rank}(A).$$ This identity is the rank-nullity theorem.
Intuition: in echelon form, the pivot columns ($\text{rank}(A)$ of them) correspond to variables whose values are determined, while the non-pivot columns ($n - \text{rank}(A)$ of them) become free parameters. The count of free parameters equals the dimension of the solution space — which is the nullity — and that is what this theorem encodes.
Special case — square invertible $A$: if $A$ is an $n \times n$ square matrix and invertible (equivalently $\text{rank}(A) = n$), then $$\dim(\ker A) = n - n = 0.$$ A subspace of dimension $0$ is the zero-vector subspace $\{\mathbf{0}\}$, so $A\mathbf{x} = \mathbf{0}$ has only $\mathbf{x} = \mathbf{0}$ as a solution. This recovers the equivalence "$A$ invertible ⇔ $A\mathbf{x} = \mathbf{0}$ has only $\mathbf{x} = \mathbf{0}$" listed in §5.5 Equivalent conditions for full rank, now obtained by a dimension-arithmetic argument from the rank-nullity theorem.
7.2 Inhomogeneous equation $A\mathbf{x} = \mathbf{b}$
Existence of solution: $A\mathbf{x} = \mathbf{b}$ has a solution $\Leftrightarrow$ $\text{rank}(A) = \text{rank}(A \mid \mathbf{b})$ (the augmented matrix has the same rank). Equivalently, $\mathbf{b}$ lies in the column space of $A$.
Uniqueness of solution:
- $\text{rank}(A) = n$ (equal to the number of columns) $\Rightarrow$ the solution is unique (if one exists)
- $\text{rank}(A) < n$ $\Rightarrow$ if a solution exists, there are infinitely many (with $n - \text{rank}(A)$ degrees of freedom)
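The existence criterion can be sketched in NumPy by comparing the rank of $A$ with that of the augmented matrix (`is_solvable` is an illustrative helper, not a library function):

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])   # rank 1

def is_solvable(A, b):
    """Ax = b is solvable iff rank(A) == rank([A | b])."""
    aug = np.column_stack([A, b])
    return np.linalg.matrix_rank(A) == np.linalg.matrix_rank(aug)

b1 = np.array([1.0, 2.0])   # = first column of A, so inside the column space
b2 = np.array([1.0, 0.0])   # not a multiple of (1, 2), so outside it
print(is_solvable(A, b1), is_solvable(A, b2))  # True False
```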
§8. Rank Normal Form
Any $m \times n$ matrix $A$ can be transformed, by a finite sequence of elementary row and column operations, into the form: $$A \sim \begin{pmatrix} I_r & O \\ O & O \end{pmatrix}$$ where $r = \text{rank}(A)$, $I_r$ is the $r \times r$ identity, and $O$ is the zero block. The symbol $A \sim B$ means "$A$ can be transformed to $B$ by a finite sequence of elementary row/column operations (i.e., $A$ and $B$ are equivalent)." This special form is called the rank normal form.
The rank normal form makes it transparent that the rank is the "essential dimension" of the matrix.
§9. Applications
9.1 Image and kernel of a linear map
For a linear map $T: \mathbb{R}^n \to \mathbb{R}^m$ with matrix representation $A$:
- $\dim(\text{Im } T) = \text{rank}(A)$ (dimension of the image)
- $\dim(\ker T) = n - \text{rank}(A)$ (dimension of the kernel)
9.2 Data analysis and image processing ★advanced
The rank of a data matrix represents its effective dimension (number of independent features). In principal component analysis (PCA) and image compression, SVD is used to form a low-rank approximation that keeps the essential components and discards noise and redundancy.
9.3 Control theory ★advanced
The ranks of the controllability and observability matrices determine the controllability and observability of a dynamical system (Kalman rank condition).
9.4 Machine learning ★advanced
The matrix completion problem estimates missing entries under a low-rank assumption (e.g., recommender systems).
§10. Implementation Examples
10.1 Python (NumPy)
```python
import numpy as np

# Define a matrix
A = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])

# Compute the rank
rank_A = np.linalg.matrix_rank(A)
print(f"rank(A) = {rank_A}")  # 2

# Full-rank matrix
B = np.array([[1, 2],
              [3, 4]])
rank_B = np.linalg.matrix_rank(B)
print(f"rank(B) = {rank_B}")  # 2

# Rank-deficient matrix
C = np.array([[1, 2, 3],
              [2, 4, 6]])
rank_C = np.linalg.matrix_rank(C)
print(f"rank(C) = {rank_C}")  # 1

# Rank via SVD
U, s, Vt = np.linalg.svd(A)
print(f"singular values: {s}")
# Count of nonzero singular values = rank
rank_svd = np.sum(s > 1e-10)
print(f"rank via SVD: {rank_svd}")

# Verify the rank-nullity theorem
n = A.shape[1]  # number of columns
nullity = n - rank_A
print(f"n - rank(A) = {nullity}")  # dim of kernel
```
10.2 MATLAB
```matlab
% Define a matrix
A = [1 2 3; 4 5 6; 7 8 9];

% Compute the rank
rank_A = rank(A);
fprintf('rank(A) = %d\n', rank_A);  % 2

% Cross-check via SVD
[U, S, V] = svd(A);
singular_values = diag(S);
disp('singular values:');
disp(singular_values');

% Rank-nullity theorem
n = size(A, 2);
nullity = n - rank_A;
fprintf('n - rank(A) = %d\n', nullity);
```
§11. Numerical Considerations
11.1 Floating-point error
Under floating-point arithmetic, rounding can leave theoretically zero rows or singular values as "very small" values, forcing a judgment about what counts as zero. Because of this, SVD-based rank determination is more stable than pure row reduction.
11.2 Tolerance
NumPy's matrix_rank uses a default tolerance of $\sigma_{\max} \cdot \max(m, n) \cdot \varepsilon$ ($\sigma_{\max}$: largest singular value, $\varepsilon \approx 10^{-16}$: machine epsilon) and treats singular values below this threshold as zero. The tol parameter allows tuning for specific applications.
11.3 Condition number and rank
For ill-conditioned matrices (large condition number), small input perturbations can push small singular values across the zero threshold and make rank determination unstable. In such cases, the notion of "numerical rank" (effective rank cut at a threshold) is used alongside the theoretical rank.
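The effect of the threshold can be seen on a nearly rank-deficient matrix (a sketch; the perturbation size `1e-10` and the coarser `tol` are illustrative choices):

```python
import numpy as np

# A nearly rank-1 matrix: the second row is 2x(first row) up to noise ~1e-10
A = np.array([[1.0, 2.0],
              [2.0, 4.0 + 1e-10]])

s = np.linalg.svd(A, compute_uv=False)
print(s)  # roughly [5.0, 2e-11]: one large value, one tiny one

# the default tolerance (~2e-15 here) keeps the tiny singular value
print(np.linalg.matrix_rank(A))            # 2
# a coarser tolerance discards it as noise: "numerical rank" 1
print(np.linalg.matrix_rank(A, tol=1e-8))  # 1
```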
§12. References
- Strang, G. (2016). Introduction to Linear Algebra (5th ed.). Wellesley-Cambridge Press.
- Axler, S. (2015). Linear Algebra Done Right (3rd ed.). Springer.
- Hoffman, K., & Kunze, R. (1971). Linear Algebra (2nd ed.). Prentice-Hall.
- Golub, G. H., & Van Loan, C. F. (2013). Matrix Computations (4th ed.). Johns Hopkins University Press.
- Horn, R. A., & Johnson, C. R. (2012). Matrix Analysis (2nd ed.). Cambridge University Press.
- Weisstein, E. W. "Matrix Rank." From MathWorld--A Wolfram Web Resource.