Multivariate Root Finding
Overview
Solving the nonlinear system $\mathbf{F}(\mathbf{x}) = \mathbf{0}$ with $\mathbf{F}: \mathbb{R}^n \to \mathbb{R}^n$. Methods split into direct generalizations of 1D Newton/secant and globally convergent homotopy continuation.
Related API: newton_raphson_system, broyden_method.
Multivariate Newton
Linearize $\mathbf{F}$ around $\mathbf{x}_k$:
$$\mathbf{F}(\mathbf{x}_k + \Delta) \approx \mathbf{F}(\mathbf{x}_k) + J(\mathbf{x}_k) \Delta.$$
Setting the linearization to zero gives the update
$$\mathbf{x}_{k+1} = \mathbf{x}_k - J(\mathbf{x}_k)^{-1} \mathbf{F}(\mathbf{x}_k),$$
where $J(\mathbf{x}) = \partial \mathbf{F}/\partial \mathbf{x}$ is the Jacobian matrix ($n \times n$):
$$J_{ij}(\mathbf{x}) = \frac{\partial F_i}{\partial x_j}(\mathbf{x}).$$
Convergence
At a simple zero (non-singular Jacobian) with smooth $\mathbf{F}$, the iteration converges quadratically:
$$\|\mathbf{x}_{k+1} - \boldsymbol{\alpha}\| \leq C \|\mathbf{x}_k - \boldsymbol{\alpha}\|^2.$$
Starting outside the basin of attraction may cause divergence; line search or homotopy continuation provides safety nets.
sangi's newton_raphson_system(F, J, x0) takes $\mathbf{F}$ and $J$ as callables. When $J$ is not available in closed form, a finite-difference Jacobian (central differences) is the standard substitute.
Related article: Multivariate Newton method
Linear-System Solver (LU vs QR)
Forming $J^{-1}$ explicitly is numerically unstable. Instead, solve the linear system $J \Delta = -\mathbf{F}$.
LU factorization
Factor $J = L U$ with partial pivoting, then perform forward and back substitution. $O(n^3)$ for the factorization plus $O(n^2)$ for the substitutions.
Fastest when the Jacobian is well-conditioned. sangi's newton_raphson_system uses partial-pivot LU by default.
QR factorization
Factor $J = QR$ with orthogonal $Q$ and upper-triangular $R$. About twice the cost of LU but more stable when the Jacobian is near-singular. Switching to QR is the safer choice if Newton iterations approach degeneracy late in the run.
Iterative solvers
When $n$ is large and $J$ is sparse, Krylov-subspace methods (GMRES, BiCGSTAB) solve $J \Delta = -\mathbf{F}$ matrix-free (Newton-Krylov). They only need the action $J \mathbf{v}$, saving both memory and computation.
Broyden's Quasi-Newton Method
If evaluating the Jacobian dominates cost, Newton becomes inefficient. Broyden (1965) replaces $J$ with an approximation $B_k$ and updates it by a rank-1 modification each iteration:
$$B_{k+1} = B_k + \frac{(\mathbf{y}_k - B_k \mathbf{s}_k) \mathbf{s}_k^\top}{\mathbf{s}_k^\top \mathbf{s}_k},$$
where $\mathbf{s}_k = \mathbf{x}_{k+1} - \mathbf{x}_k$ and $\mathbf{y}_k = \mathbf{F}(\mathbf{x}_{k+1}) - \mathbf{F}(\mathbf{x}_k)$. This is the minimum-norm modification satisfying the secant condition $B_{k+1} \mathbf{s}_k = \mathbf{y}_k$.
Convergence
Convergence order: superlinear ($\approx 1.62$).
Slower than Newton's quadratic but with no Jacobian evaluation. Broyden wins overall when computing $J$ costs several times $\mathbf{F}$.
Standard initial $B_0$ is the identity matrix or a one-time finite-difference Jacobian estimate.
Related article: Broyden's method
Sherman-Morrison Inverse Update
The rank-1 nature of Broyden's update allows the inverse to be updated in $O(n^2)$. Sherman-Morrison:
$$(A + \mathbf{u} \mathbf{v}^\top)^{-1} = A^{-1} - \frac{A^{-1} \mathbf{u} \mathbf{v}^\top A^{-1}}{1 + \mathbf{v}^\top A^{-1} \mathbf{u}}.$$
So $B_{k+1}^{-1}$ derives from $B_k^{-1}$ in $O(n^2)$, replacing the $O(n^3)$ LU factorization Newton would need each step. The gap grows quickly with $n$.
Caveat: Sherman-Morrison is numerically less stable than direct LU/QR. Repeated rank-1 updates degrade conditioning, so periodic Jacobian restarts may be needed; sangi's broyden_method supports this strategy.
Combination with Line Search
Taking the full Newton step $\Delta = -J^{-1} \mathbf{F}$ can diverge outside the basin of attraction. A line search sets $\mathbf{x}_{k+1} = \mathbf{x}_k + \lambda \Delta$ ($\lambda \in (0, 1]$) and chooses $\lambda$ to ensure the merit function $\|\mathbf{F}(\mathbf{x}_{k+1})\|^2 / 2$ decreases sufficiently.
Armijo condition
$$\|\mathbf{F}(\mathbf{x}_k + \lambda \Delta)\|^2 \leq (1 - 2 c \lambda) \|\mathbf{F}(\mathbf{x}_k)\|^2.$$
$c \in (0, 1/2)$ is the sufficient-decrease parameter (typically $10^{-4}$). Backtracking tries $\lambda = 1, 1/2, 1/4, \ldots$ and keeps the largest $\lambda$ satisfying the condition.
With line search, multivariate Newton gains global convergence for smooth $\mathbf{F}$: it never diverges and gradually enters the local convergence region.
Line search support is on sangi's roadmap as an option for newton_raphson_system.
Homotopy Continuation (Global Convergence)
For problems where line search alone is insufficient (multiple deep local minima of $\|\mathbf{F}\|$), use homotopy continuation.
Idea
Connect an easy system to the target via a parameter $t \in [0, 1]$:
$$H(\mathbf{x}, t) = (1 - t) G(\mathbf{x}) + t \mathbf{F}(\mathbf{x}).$$
$G$ is a simple system whose solution is known (e.g. $G(\mathbf{x}) = \mathbf{x} - \mathbf{x}_0$ has solution $\mathbf{x}_0$). Move $t$ from $0$ to $1$ in small steps, tracking solutions with a predictor-corrector loop:
- Predictor: take an Euler step along the tangent direction from $\partial H/\partial t$ to predict the next solution.
- Corrector: from that prediction, perform Newton iterations at fixed $t$ to solve $H(\mathbf{x}, t) = \mathbf{0}$.
Reaching $t = 1$ yields a solution of the target system. For polynomial systems and enumeration of all real solutions, this is the standard approach (Bertini, HomotopyContinuation.jl).
sangi does not currently expose homotopy as a top-level API, but a wrapper around newton_raphson_system can implement it.
Basins of Attraction and Initial Values
Multivariate Newton and Broyden guarantee only local convergence; starting outside the basin of attraction can diverge. Kantorovich's theorem gives a sufficient condition: if at $\mathbf{x}_0$
$$\|J(\mathbf{x}_0)^{-1}\| \cdot \|\mathbf{F}(\mathbf{x}_0)\| \cdot L \leq \frac{1}{2},$$
(where $L$ is the Lipschitz constant of the Jacobian) Newton converges quadratically to a root near $\mathbf{x}_0$. In practice, combine the following strategies:
- Domain knowledge: if the solution range is known a priori, center $\mathbf{x}_0$ on it.
- Continuation: gradually deform an easy problem into the target (a simple form of homotopy).
- Line search / trust region: damp the iteration to avoid divergence and slowly enter the basin.
- Multistart: run from several initial values in parallel and keep the converged result.
With a good initial value, Newton reaches machine precision in 5–10 iterations. Exceeding 50 iterations without convergence usually signals a bad initial value or a wrong algorithm choice.
Design Decisions
Two separate entry points: Newton and Broyden
Choice between methods depends strongly on the Jacobian's evaluation cost:
- Analytic Jacobian, comparable cost to $\mathbf{F}$ → Newton (quadratic convergence wins).
- Complex or unknown Jacobian → Broyden (no Jacobian needed).
sangi exposes newton_raphson_system and broyden_method separately so users can match the algorithm to the problem.
Future additions such as Newton-Krylov or trust region methods will follow the same naming pattern.
Consistency with 1D Newton
1D newton_raphson and multivariate newton_raphson_system are the same idea at different dimensions.
sangi keeps the API shape uniform (function + derivative/Jacobian + initial value), minimizing the effort to move between scalar and vector problems.
References
- Ortega, J. M. and Rheinboldt, W. C. (1970). Iterative Solution of Nonlinear Equations in Several Variables. Academic Press.
- Broyden, C. G. (1965). "A class of methods for solving nonlinear simultaneous equations". Mathematics of Computation, 19(92), 577–593.
- Allgower, E. L. and Georg, K. (2003). Introduction to Numerical Continuation Methods. SIAM.
- Dennis, J. E. and Schnabel, R. B. (1996). Numerical Methods for Unconstrained Optimization and Nonlinear Equations. SIAM.