Chapter 2: Newton's Method (Newton-Raphson Method)


Published: 2026-03-27

1. Overview

Newton's method is an iterative technique for numerically finding roots of a nonlinear equation $f(x) = 0$, and is one of the most fundamental and powerful methods in numerical analysis. It is named after Isaac Newton and Joseph Raphson, who published early forms of the idea, and is therefore also known as the Newton-Raphson method.

The method approximates $f$ at each point by its tangent line (first-order Taylor expansion) and takes the intersection of that tangent line with the $x$-axis as the next approximation. With an appropriate initial value, it exhibits quadratic convergence, meaning the number of correct digits approximately doubles with each iteration.

2. Algorithm

Newton's Method Iteration Formula

When $f$ is differentiable and $f'(x_n) \neq 0$,

$$x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}, \qquad n = 0, 1, 2, \ldots$$

The procedure is as follows.

  1. Choose an initial value $x_0$.
  2. Compute $x_1, x_2, \ldots$ successively using the iteration formula.
  3. Stop when $|x_{n+1} - x_n| < \varepsilon$ or $|f(x_n)| < \delta$ ($\varepsilon$: tolerance for position, $\delta$: tolerance for function value).
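The three steps above can be sketched in Python; the function name, default tolerances, and iteration cap are illustrative choices, not prescribed by the text:

```python
def newton(f, df, x0, eps=1e-12, delta=1e-12, max_iter=50):
    """Find a root of f by Newton's method.
    Stops when |x_{n+1} - x_n| < eps or |f(x_n)| < delta."""
    x = x0
    for _ in range(max_iter):
        fx = f(x)
        if abs(fx) < delta:        # function-value tolerance met
            return x
        x_new = x - fx / df(x)     # Newton update
        if abs(x_new - x) < eps:   # step-size tolerance met
            return x_new
        x = x_new
    raise RuntimeError("no convergence within max_iter iterations")

# Root of f(x) = x^3 - 2x - 5 (the function used in the figures below)
root = newton(lambda x: x**3 - 2*x - 5, lambda x: 3*x**2 - 2, x0=3.5)
```

Note that both stopping criteria are checked: the function-value test catches the case where the iterate already solves the equation, and the step-size test catches stagnation.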

3. Graphical Interpretation

The equation of the tangent line at the point $(x_n,\, f(x_n))$ is

$$y = f(x_n) + f'(x_n)(x - x_n)$$

and the point where this tangent line intersects the $x$-axis ($y=0$) is $x_{n+1}$.

Figure 1. Graphical interpretation of Newton's method ($f(x) = x^3 - 2x - 5$). The intersection of the tangent line at the point $(x_n,\, f(x_n))$ on the curve with the $x$-axis gives the next approximation $x_{n+1}$. Starting from $x_0 = 3.5$, the iterates $x_1 \approx 2.61 \to x_2 \approx 2.20 \to x_3 \approx 2.10$ converge rapidly to the root $x^* \approx 2.095$.

Derivation of the Formula

The equation of the tangent line at $(x_n, f(x_n))$ can be written as:

$$y - f(x_n) = f'(x_n)(x - x_n)$$

Setting $y = 0$ to find where this tangent line intersects the $x$-axis:

$$0 - f(x_n) = f'(x_n)(x - x_n)$$

$$x = x_n - \frac{f(x_n)}{f'(x_n)}$$

This is the next approximation $x_{n+1}$. That is, the iteration formula of Newton's method arises naturally as the $x$-intercept of the tangent line.

Figure 2. Derivation of the formula. The tangent line (slope $f'(x_n)$) at the point $(x_n, f(x_n))$ on the curve intersects the $x$-axis at the next approximation $x_{n+1}$.

4. Quadratic Convergence

Theorem (Quadratic Convergence)

If $f$ is twice continuously differentiable and $x_0$ is taken sufficiently close to a simple root $x^*$ ($f(x^*)=0$, $f'(x^*)\neq 0$), then Newton's method converges quadratically. That is, for the constant $C = \dfrac{|f''(x^*)|}{2|f'(x^*)|}$,

$$|x_{n+1} - x^*| \approx C \cdot |x_n - x^*|^2$$

holds.

This means that the next error is proportional to the square of the current error, multiplied by the constant $C$. For example, if the current error is $10^{-3}$, the next error is approximately $C \times (10^{-3})^2 = C \times 10^{-6}$. The number of correct digits roughly doubles with each iteration, which is why this is called "quadratic" convergence.

Proof

Step 1 (Taylor expansion): Expand $f(x^*)$ in a Taylor series about $x = x_n$ up to first order. Since $f$ is twice continuously differentiable, with the Lagrange form of the remainder as the last term,

$$f(x^*) = f(x_n) + f'(x_n)(x^* - x_n) + \frac{f''(\xi_n)}{2}(x^* - x_n)^2$$

where $\xi_n$ is some point between $x_n$ and $x^*$. Since $x^*$ is a root ($f(x^*) = 0$),

$$f(x_n) + f'(x_n)(x^* - x_n) + \frac{f''(\xi_n)}{2}(x^* - x_n)^2 = 0 \tag{*}$$

Step 2 (eliminate $f(x_n)$): Rearranging Newton's iteration formula $x_{n+1} = x_n - \dfrac{f(x_n)}{f'(x_n)}$ gives

$$f(x_n) = f'(x_n)(x_n - x_{n+1})$$

Substituting into $(*)$:

$$f'(x_n)(x_n - x_{n+1}) + f'(x_n)(x^* - x_n) + \frac{f''(\xi_n)}{2}(x^* - x_n)^2 = 0$$

Step 3 (simplify): Factoring $f'(x_n)$ from the first two terms,

$$f'(x_n)\bigl[(x_n - x_{n+1}) + (x^* - x_n)\bigr] + \frac{f''(\xi_n)}{2}(x^* - x_n)^2 = 0$$

The bracketed expression simplifies to $(x_n - x_{n+1}) + (x^* - x_n) = x^* - x_{n+1}$, so

$$f'(x_n)(x^* - x_{n+1}) + \frac{f''(\xi_n)}{2}(x^* - x_n)^2 = 0$$

Step 4 (error recurrence): Since $f'(x_n) \neq 0$ (guaranteed when $x^*$ is a simple root and $x_n$ is sufficiently close), dividing both sides by $f'(x_n)$ gives

$$x^* - x_{n+1} = -\frac{f''(\xi_n)}{2f'(x_n)}(x^* - x_n)^2$$

Reversing signs,

$$x_{n+1} - x^* = \frac{f''(\xi_n)}{2f'(x_n)}(x_n - x^*)^2$$

As $n \to \infty$, $x_n \to x^*$ and $\xi_n \to x^*$, so with $C = \dfrac{|f''(x^*)|}{2|f'(x^*)|}$,

$$|x_{n+1} - x^*| \approx C \cdot |x_n - x^*|^2 \quad \blacksquare$$

Meaning of Quadratic Convergence

Since the error decreases quadratically, the number of correct digits approximately doubles with each iteration.

Example: $10^{-2} \to 10^{-4} \to 10^{-8} \to 10^{-16}$
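The error recurrence can be checked numerically. A short sketch using the example function from the figures; the reference root value is hard-coded for the comparison:

```python
# Check the error recurrence for f(x) = x^3 - 2x - 5,
# starting from x0 = 3.5 as in Figure 1.
f  = lambda x: x**3 - 2*x - 5
df = lambda x: 3*x**2 - 2
x_star = 2.0945514815423265   # reference root

x, errors = 3.5, []
for _ in range(6):
    errors.append(abs(x - x_star))
    x = x - f(x) / df(x)

# Near the root, e_{n+1} / e_n^2 should approach
# C = |f''(x*)| / (2 |f'(x*)|) ~ 0.56.
for e0, e1 in zip(errors, errors[1:]):
    print(f"{e0:.3e} -> {e1:.3e}   ratio {e1 / e0**2:.3f}")
```

Far from the root the ratio wanders, but as the iterates close in it settles near the constant $C$, exactly as the theorem predicts.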

Figure 3. Convergence rate comparison ($f(x) = x^3 - 2x - 5$, logarithmic scale). Red: Newton's method error $|x_k - x^*|$ ($x_0 = 3.5$). Blue dashed: bisection method interval width $|b_k - a_k|$ (initial interval $[1, 3.5]$). Newton's method reaches machine precision $10^{-15}$ in 6 iterations. The bisection method's interval halves each time, forming a straight line on the log scale, reaching only about $10^{-7}$ after 25 iterations.

5. Example: Computing $\sqrt{2}$

Applying Newton's Method to $f(x) = x^2 - 2$

Since $f'(x) = 2x$, the iteration formula becomes:

$$x_{n+1} = x_n - \frac{x_n^2 - 2}{2x_n} = \frac{1}{2}\!\left(x_n + \frac{2}{x_n}\right)$$

Starting from the initial value $x_0 = 1$:

| $n$ | $x_n$ | Error $\lvert x_n - \sqrt{2}\rvert$ | Correct digits |
|---|---|---|---|
| 0 | 1.000000000 | 4.14 × 10⁻¹ | 0 |
| 1 | 1.500000000 | 8.58 × 10⁻² | 1 |
| 2 | 1.416666667 | 2.45 × 10⁻³ | 2 |
| 3 | 1.414215686 | 2.12 × 10⁻⁶ | 5 |
| 4 | 1.414213562 | 1.59 × 10⁻¹² | 11 |
In just 4 iterations, 11 digits of accuracy are obtained. The progression of correct digits 0 → 1 → 2 → 5 → 11, approximately doubling each time, demonstrates the power of quadratic convergence.
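The table can be reproduced in a few lines of Python (the iteration is the classical Babylonian square-root recurrence):

```python
# Newton update for f(x) = x^2 - 2: x <- (x + 2/x) / 2
x = 1.0
iterates = [x]
for _ in range(4):
    x = 0.5 * (x + 2.0 / x)
    iterates.append(x)
    print(f"{x:.9f}  error {abs(x - 2**0.5):.2e}")
```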

6. Multivariable Newton's Method

For a system of nonlinear equations $\mathbf{f}(\mathbf{x}) = \mathbf{0}$ ($\mathbf{x} \in \mathbb{R}^n$), the iteration formula becomes

$$\mathbf{x}_{n+1} = \mathbf{x}_n - J(\mathbf{x}_n)^{-1}\,\mathbf{f}(\mathbf{x}_n)$$

where $J(\mathbf{x})$ is the Jacobian matrix

$$J(\mathbf{x}) = \begin{pmatrix} \dfrac{\partial f_1}{\partial x_1} & \cdots & \dfrac{\partial f_1}{\partial x_n} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial f_n}{\partial x_1} & \cdots & \dfrac{\partial f_n}{\partial x_n} \end{pmatrix}$$

Although the iteration formula involves $J^{-1}$, in practice one does not compute the inverse matrix explicitly. Instead, one solves the linear system

$$J(\mathbf{x}_n)\,\Delta\mathbf{x} = -\mathbf{f}(\mathbf{x}_n)$$

via LU decomposition or similar methods, and updates $\mathbf{x}_{n+1} = \mathbf{x}_n + \Delta\mathbf{x}$. There are two main reasons for avoiding the explicit inverse:

  • Accuracy and cost: explicitly forming $J^{-1}$ requires roughly three times the arithmetic of an LU factorization and amplifies rounding error, whereas solving the linear system directly from the LU factors is both cheaper and more accurate.
  • Memory efficiency: the inverse is an additional dense $n \times n$ matrix that must be stored, whereas the LU factorization can overwrite $J$ in place and yields $\Delta\mathbf{x}$ directly.

Since $J(\mathbf{x}_n)$ changes at every iteration, the LU factorization must be recomputed each time.
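The solve-then-update loop can be sketched for a $2 \times 2$ system. To keep the sketch dependency-free, Cramer's rule stands in for the LU solve, and the example system ($x^2 + y^2 = 4$, $xy = 1$) is illustrative; in practice one would call a linear solver such as `numpy.linalg.solve`:

```python
def newton_system(F, J, x, iters=20, tol=1e-12):
    """Multivariable Newton for a 2x2 system: solve J(x) dx = -F(x),
    then update x <- x + dx."""
    for _ in range(iters):
        (f1, f2), ((a, b), (c, d)) = F(x), J(x)
        det = a * d - b * c                 # assume J is nonsingular
        dx1 = (-f1 * d + f2 * b) / det      # components of dx solving
        dx2 = (-f2 * a + f1 * c) / det      #   J dx = -F by Cramer's rule
        x = (x[0] + dx1, x[1] + dx2)
        if abs(dx1) + abs(dx2) < tol:       # step-size stopping test
            break
    return x

# Illustrative system: x^2 + y^2 = 4, x*y = 1
F = lambda p: (p[0]**2 + p[1]**2 - 4, p[0] * p[1] - 1)
J = lambda p: ((2 * p[0], 2 * p[1]), (p[1], p[0]))
x, y = newton_system(F, J, (2.0, 0.5))
```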

7. Convergence Conditions and Failure Cases

Sufficient Conditions for Convergence

  • $f(x^*) = 0$ and $f'(x^*) \neq 0$ (simple root).
  • $f$ is twice continuously differentiable in a neighborhood of the root.
  • The initial value $x_0$ is sufficiently close to the root.

Failure Cases

  • Multiple roots: When $f'(x^*)=0$, convergence degrades to linear. If the multiplicity $m$ is known, the modified Newton's method $x_{n+1} = x_n - m\,f(x_n)/f'(x_n)$ restores quadratic convergence. When $m$ is unknown, one can apply Newton's method to $u(x) = f(x)/f'(x)$ instead, whose roots are all simple.
  • Zero derivative: If $f'(x_n) \approx 0$ during iteration, the step $f(x_n)/f'(x_n)$ becomes enormous and the next iterate can be flung far from the root, often causing divergence.
  • Periodic oscillation: For example, with $f(x) = x^3 - 2x + 2$ and $x_0 = 0$, the iterates oscillate as $x_0 \to 1 \to 0 \to 1 \to \cdots$.
Figure 4. Three failure patterns of Newton's method. (a) Multiple root $f(x)=x^2$: convergence degrades to linear (error only halves each step). (b) Divergence $f(x)=\arctan x$: the small $f'(x_n)$ causes the tangent line to be nearly flat, flinging the next iterate far away. (c) Oscillation $f(x)=x^3-2x+2$, $x_0=0$: the iterates cycle $0 \to 1 \to 0 \to 1 \to \cdots$ forever.
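The 2-cycle in panel (c) of Figure 4 can be verified directly:

```python
# f(x) = x^3 - 2x + 2 with x0 = 0: Newton's method enters a 2-cycle.
f  = lambda x: x**3 - 2*x + 2
df = lambda x: 3*x**2 - 2
x, seq = 0.0, [0.0]
for _ in range(6):
    x = x - f(x) / df(x)   # 0 -> 0 - 2/(-2) = 1, then 1 -> 1 - 1/1 = 0, ...
    seq.append(x)
# seq is [0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0]: the iterates never converge
```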

Improved Methods and Practical Remedies

Global convergence can be ensured by damped Newton's method ($x_{n+1} = x_n - \alpha_n\, f(x_n)/f'(x_n)$, where $\alpha_n$ is determined by line search) or by combining with the bisection method.

Practical Remedies

  • Use the bisection method to narrow down an approximate range, then apply Newton's method.
  • Limit the step size when $|f'(x_n)|$ is too small.
  • Set an upper bound on the number of iterations.
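A common way to combine these remedies is to keep a bracketing interval alongside the Newton iterate and fall back to a bisection step whenever the Newton step would leave the bracket. A minimal sketch, with illustrative function name and tolerances:

```python
def safeguarded_newton(f, df, a, b, tol=1e-12, max_iter=100):
    """Newton's method kept inside a bracket [a, b] with f(a)*f(b) < 0.
    Falls back to bisection when the Newton step would leave the
    bracket or the derivative vanishes."""
    x = 0.5 * (a + b)
    for _ in range(max_iter):
        fx = f(x)
        if abs(fx) < tol:
            return x
        if f(a) * fx < 0:          # keep the sign change inside [a, b]
            b = x
        else:
            a = x
        d = df(x)
        if d != 0 and a < x - fx / d < b:
            x = x - fx / d         # safe Newton step
        else:
            x = 0.5 * (a + b)      # bisection fallback
    return x

root = safeguarded_newton(lambda x: x**3 - 2*x - 5,
                          lambda x: 3*x**2 - 2, 1.0, 3.5)
```

The bracket guarantees progress even when a Newton step misbehaves, while the Newton steps near the root preserve quadratic convergence.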

8. Comparison with the Bisection Method

| | Bisection Method | Newton's Method |
|---|---|---|
| Convergence rate | First-order (linear) | Second-order (quadratic) |
| Required information | $f(x)$ only | $f(x)$ and $f'(x)$ |
| Convergence guarantee | Always converges (given a bracketing interval with a sign change) | Depends on the initial value |
| Typical iterations | 20–50 | 5–10 |
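The iteration counts can be checked empirically, using the interval $[1, 3.5]$ and starting point $x_0 = 3.5$ from Figure 3; the target accuracy of $10^{-10}$ is chosen here for illustration:

```python
# Count iterations to reach accuracy 1e-10 for f(x) = x^3 - 2x - 5.
f = lambda x: x**3 - 2*x - 5
x_star = 2.0945514815423265     # reference root

a, b, n_bis = 1.0, 3.5, 0       # bisection on [1, 3.5]
while b - a > 1e-10:            # interval width as the error bound
    m = 0.5 * (a + b)
    a, b = (m, b) if f(m) < 0 else (a, m)
    n_bis += 1

x, n_newt = 3.5, 0              # Newton from x0 = 3.5
while abs(x - x_star) > 1e-10:
    x = x - f(x) / (3*x**2 - 2)
    n_newt += 1

print(f"bisection: {n_bis} iterations, Newton: {n_newt} iterations")
```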

9. Summary

  • Newton's method iteration formula: $x_{n+1} = x_n - \dfrac{f(x_n)}{f'(x_n)}$
  • It is a graphical method that uses the intersection of the tangent line with the $x$-axis as the next approximation.
  • Quadratic convergence: the number of correct digits approximately doubles with each iteration.
  • Computation of the derivative $f'(x)$ is required.
  • The choice of initial value is critical, and convergence is not guaranteed in all cases.
  • Extension to multiple variables requires the Jacobian matrix and solving a linear system.

10. Frequently Asked Questions

Q1. What is Newton's method?

It is a technique for iteratively finding roots of a nonlinear equation $f(x)=0$ using tangent-line approximation. The iteration formula is $x_{n+1} = x_n - f(x_n)/f'(x_n)$.

Q2. How fast does Newton's method converge?

Near a simple root, it converges quadratically. That is, the number of correct digits approximately doubles with each iteration. For multiple roots, convergence degrades to linear.

Q3. Can Newton's method fail?

It can fail when the derivative is close to zero, when the initial value is too far from the root, or when periodic oscillation occurs. Damped Newton's method or combining with the bisection method can address these issues.

11. References

  • Wikipedia, "Newton's method" (English)
  • Wikipedia, "Newton's method" (Japanese)
  • R. L. Burden & J. D. Faires, Numerical Analysis, 10th ed., Cengage, 2016.
  • J. Nocedal & S. J. Wright, Numerical Optimization, 2nd ed., Springer, 2006.