How is the multivariate Newton iteration written?

For F: R^n → R^n, the Jacobian J(x) = ∂F/∂x gives the update x_{k+1} = x_k - J(x_k)^{-1} F(x_k). Implementations never form J^{-1} explicitly; instead they solve the linear system J(x_k) Δ = -F(x_k) using LU (or QR for poorly conditioned Jacobians) and update x_{k+1} = x_k + Δ.

What makes Broyden's method useful?

Broyden maintains an approximate Jacobian B_k and updates it by a rank-1 modification each step. It works without an analytic Jacobian, and both the matrix and its inverse can be updated in O(n^2) via Sherman-Morrison rather than the O(n^3) factorization required by Newton. Convergence is superlinear (~1.62) — slower than Newton's quadratic — but if computing the Jacobian is expensive, Broyden wins overall.

What does the Sherman-Morrison formula update?

(A + uv^T)^{-1} = A^{-1} - (A^{-1} u v^T A^{-1}) / (1 + v^T A^{-1} u). It expresses the inverse of a rank-1 modification of A in terms of A^{-1} and the rank-1 factors. Because Broyden's update is rank-1, we can update B_k^{-1} directly in O(n^2), avoiding repeated factorizations.

What is homotopy continuation for?

When x_0 is far from a root α, Newton may diverge. Homotopy continuation builds H(x, t) = (1-t) G(x) + t F(x) from a simple system G(x) = 0 to the target F(x) = 0 with parameter t ∈ [0, 1], and tracks the solution along t with predictor-corrector steps. It removes initial-value sensitivity and supports enumerating all real solutions; for polynomial systems it is the standard approach (Bertini, HomotopyContinuation.jl, etc.).

Multivariate Root Finding

Overview

Solving the nonlinear system $\mathbf{F}(\mathbf{x}) = \mathbf{0}$ with $\mathbf{F}: \mathbb{R}^n \to \mathbb{R}^n$. Methods split into direct generalizations of 1D Newton/secant and globally convergent homotopy continuation.

Multivariate Newton

Linearize $\mathbf{F}$ around $\mathbf{x}_k$:

$$\mathbf{F}(\mathbf{x}_k + \Delta) \approx \mathbf{F}(\mathbf{x}_k) + J(\mathbf{x}_k) \Delta.$$

Setting the linearization to zero gives the update

$$\mathbf{x}_{k+1} = \mathbf{x}_k - J(\mathbf{x}_k)^{-1} \mathbf{F}(\mathbf{x}_k),$$

where $J(\mathbf{x}) = \partial \mathbf{F}/\partial \mathbf{x}$ is the Jacobian matrix ($n \times n$):

$$J_{ij}(\mathbf{x}) = \frac{\partial F_i}{\partial x_j}(\mathbf{x}).$$

Convergence

At a simple zero (non-singular Jacobian) with smooth $\mathbf{F}$, the iteration converges quadratically:

$$\|\mathbf{x}_{k+1} - \boldsymbol{\alpha}\| \leq C \|\mathbf{x}_k - \boldsymbol{\alpha}\|^2.$$

Starting outside the basin of attraction may cause divergence; line search or homotopy continuation provides safety nets.

sangi's newton_raphson_system(F, J, x0) takes $\mathbf{F}$ and $J$ as callables. When $J$ is not available in closed form, a finite-difference Jacobian (central differences) is the standard substitute.

Related article: Multivariate Newton method

Linear-System Solver (LU vs QR)

Forming $J^{-1}$ explicitly is numerically unstable. Instead, solve the linear system $J \Delta = -\mathbf{F}$.

LU factorization

Factor $J = L U$ with partial pivoting, then perform forward and back substitution. $O(n^3)$ for the factorization plus $O(n^2)$ for the substitutions. Fastest when the Jacobian is well-conditioned. sangi's newton_raphson_system uses partial-pivot LU by default.

QR factorization

Factor $J = QR$ with orthogonal $Q$ and upper-triangular $R$. About twice the cost of LU but more stable when the Jacobian is near-singular. Switching to QR is the safer choice if Newton iterations approach degeneracy late in the run.

Iterative solvers

When $n$ is large and $J$ is sparse, Krylov-subspace methods (GMRES, BiCGSTAB) solve $J \Delta = -\mathbf{F}$ matrix-free (Newton-Krylov). They only need the action $J \mathbf{v}$, saving both memory and computation.

Broyden's Quasi-Newton Method

If evaluating the Jacobian dominates cost, Newton becomes inefficient. Broyden (1965) replaces $J$ with an approximation $B_k$ and updates it by a rank-1 modification each iteration:

$$B_{k+1} = B_k + \frac{(\mathbf{y}_k - B_k \mathbf{s}_k) \mathbf{s}_k^\top}{\mathbf{s}_k^\top \mathbf{s}_k},$$

where $\mathbf{s}_k = \mathbf{x}_{k+1} - \mathbf{x}_k$ and $\mathbf{y}_k = \mathbf{F}(\mathbf{x}_{k+1}) - \mathbf{F}(\mathbf{x}_k)$. This is the minimum-norm modification satisfying the secant condition $B_{k+1} \mathbf{s}_k = \mathbf{y}_k$.

Convergence

Convergence order: superlinear ($\approx 1.62$).

Slower than Newton's quadratic but with no Jacobian evaluation. Broyden wins overall when computing $J$ costs several times $\mathbf{F}$.

Standard initial $B_0$ is the identity matrix or a one-time finite-difference Jacobian estimate.

Sherman-Morrison Inverse Update

The rank-1 nature of Broyden's update allows the inverse to be updated in $O(n^2)$. Sherman-Morrison:

$$(A + \mathbf{u} \mathbf{v}^\top)^{-1} = A^{-1} - \frac{A^{-1} \mathbf{u} \mathbf{v}^\top A^{-1}}{1 + \mathbf{v}^\top A^{-1} \mathbf{u}}.$$

So $B_{k+1}^{-1}$ derives from $B_k^{-1}$ in $O(n^2)$, replacing the $O(n^3)$ LU factorization Newton would need each step. The gap grows quickly with $n$.

Caveat: Sherman-Morrison is numerically less stable than direct LU/QR. Repeated rank-1 updates degrade conditioning, so periodic Jacobian restarts may be needed; sangi's broyden_method supports this strategy.

Combination with Line Search

Taking the full Newton step $\Delta = -J^{-1} \mathbf{F}$ can diverge outside the basin of attraction. A line search sets $\mathbf{x}_{k+1} = \mathbf{x}_k + \lambda \Delta$ ($\lambda \in (0, 1]$) and chooses $\lambda$ to ensure the merit function $\|\mathbf{F}(\mathbf{x}_{k+1})\|^2 / 2$ decreases sufficiently.

Armijo condition

$$\|\mathbf{F}(\mathbf{x}_k + \lambda \Delta)\|^2 \leq (1 - 2 c \lambda) \|\mathbf{F}(\mathbf{x}_k)\|^2.$$

$c \in (0, 1/2)$ is the sufficient-decrease parameter (typically $10^{-4}$). Backtracking tries $\lambda = 1, 1/2, 1/4, \ldots$ and keeps the largest $\lambda$ satisfying the condition.

With line search, multivariate Newton gains global convergence for smooth $\mathbf{F}$: it never diverges and gradually enters the local convergence region. Line search support is on sangi's roadmap as an option for newton_raphson_system.

Homotopy Continuation (Global Convergence)

For problems where line search alone is insufficient (multiple deep local minima of $\|\mathbf{F}\|$), use homotopy continuation.

Idea

Connect an easy system to the target via a parameter $t \in [0, 1]$:

$$H(\mathbf{x}, t) = (1 - t) G(\mathbf{x}) + t \mathbf{F}(\mathbf{x}).$$

$G$ is a simple system whose solution is known (e.g. $G(\mathbf{x}) = \mathbf{x} - \mathbf{x}_0$ has solution $\mathbf{x}_0$). Move $t$ from $0$ to $1$ in small steps, tracking solutions with a predictor-corrector loop:

Predictor: take an Euler step along the tangent direction from $\partial H/\partial t$ to predict the next solution.
Corrector: from that prediction, perform Newton iterations at fixed $t$ to solve $H(\mathbf{x}, t) = \mathbf{0}$.

Reaching $t = 1$ yields a solution of the target system. For polynomial systems and enumeration of all real solutions, this is the standard approach (Bertini, HomotopyContinuation.jl).

sangi does not currently expose homotopy as a top-level API, but a wrapper around newton_raphson_system can implement it.

Basins of Attraction and Initial Values

Multivariate Newton and Broyden guarantee only local convergence; starting outside the basin of attraction can diverge. Kantorovich's theorem gives a sufficient condition: if at $\mathbf{x}_0$

$$\|J(\mathbf{x}_0)^{-1}\| \cdot \|\mathbf{F}(\mathbf{x}_0)\| \cdot L \leq \frac{1}{2},$$

(where $L$ is the Lipschitz constant of the Jacobian) Newton converges quadratically to a root near $\mathbf{x}_0$. In practice, combine the following strategies:

Domain knowledge: if the solution range is known a priori, center $\mathbf{x}_0$ on it.
Continuation: gradually deform an easy problem into the target (a simple form of homotopy).
Line search / trust region: damp the iteration to avoid divergence and slowly enter the basin.
Multistart: run from several initial values in parallel and keep the converged result.

With a good initial value, Newton reaches machine precision in 5–10 iterations. Exceeding 50 iterations without convergence usually signals a bad initial value or a wrong algorithm choice.

Design Decisions

Two separate entry points: Newton and Broyden

Choice between methods depends strongly on the Jacobian's evaluation cost:

Analytic Jacobian, comparable cost to $\mathbf{F}$ → Newton (quadratic convergence wins).
Complex or unknown Jacobian → Broyden (no Jacobian needed).

sangi exposes newton_raphson_system and broyden_method separately so users can match the algorithm to the problem. Future additions such as Newton-Krylov or trust region methods will follow the same naming pattern.

Consistency with 1D Newton

1D newton_raphson and multivariate newton_raphson_system are the same idea at different dimensions. sangi keeps the API shape uniform (function + derivative/Jacobian + initial value), minimizing the effort to move between scalar and vector problems.

References

Ortega, J. M. and Rheinboldt, W. C. (1970). Iterative Solution of Nonlinear Equations in Several Variables. Academic Press.
Broyden, C. G. (1965). "A class of methods for solving nonlinear simultaneous equations". Mathematics of Computation, 19(92), 577–593.
Allgower, E. L. and Georg, K. (2003). Introduction to Numerical Continuation Methods. SIAM.
Dennis, J. E. and Schnabel, R. B. (1996). Numerical Methods for Unconstrained Optimization and Nonlinear Equations. SIAM.