Float Square Root Algorithm
Overview
Float::sqrt computes arbitrary-precision floating-point square roots by
reducing to integer square root (sqrtRem).
A sticky bit is constructed from the remainder,
achieving correct rounding compliant with IEEE 754 for all rounding modes.
$$\sqrt{m \cdot 2^e} = \sqrt{m'} \cdot 2^{e'/2}$$
Here $m'$ is the mantissa scaled to $2(p+1)$ bits, and $e'$ is the exponent adjusted to be even.
From the root and remainder returned by integer sqrtRem,
a correctly rounded floating-point square root is constructed.
Dispatch is performed in three stages depending on input precision:
| Precision | Mantissa | Function | Method |
|---|---|---|---|
| $\leq 64$ bit | 1 word | sqrt_newton_1 | std::sqrt + 1 Newton iteration + intrinsics |
| $\leq 128$ bit | $\leq 2$ words | sqrt_small_2 | Direct dc_sqrtrem (n=2) call |
| $> 128$ bit | Arbitrary | sqrt_core | Reduction to Int::sqrtRem |
For usage, see the Float API Reference. For details on integer square root (sqrtrem1, sqrtrem2, Zimmermann's dc_sqrtrem), see Int Square Root Algorithm.
Reduction to Integer Square Root
The following describes the procedure for reducing a Float square root to integer operations. This principle is common to all dispatch paths.
Scaling
Given a floating-point number $x = m \cdot 2^e$ (mantissa $m$, exponent $e$), we compute the square root to a target precision of $p$ bits.
- Determining root bit-length: $\text{target\_root\_bits} = p + 1$ (one extra bit for rounding)
- Mantissa scaling: Scale $m$ to a $2(p+1)$-bit integer $I$. Left-shift $m$ by $\text{base\_shift} = 2(p+1) - \text{bitLength}(m)$ bits.
- Making the exponent even: If the adjusted exponent $e' = e - \text{base\_shift}$ is odd, apply one additional bit of shift to make it even: $$e' \bmod 2 = 0$$
- Integer square root: Compute $s = \lfloor \sqrt{I} \rfloor$ and $r = I - s^2$
- Result exponent: $e_{\text{result}} = e' / 2$
$$\sqrt{m \cdot 2^{e'}} = \sqrt{I} \cdot 2^{e'/2} \approx s \cdot 2^{e'/2}$$
Why the Exponent Must Be Even
$\sqrt{2^{2k}} = 2^k$ is an integer, but $\sqrt{2^{2k+1}} = 2^k \sqrt{2}$ is irrational. By making the exponent even, the square root can be completed entirely in the integer domain. For an odd exponent, the mantissa is doubled ($+1$ bit shift) and the exponent is decremented by 1.
Correct Rounding
To perform exact rounding of a floating-point square root, one must know whether the true square root lies strictly on one side of $s$ or $s+1$. calx achieves this using the sticky bit method.
How the Sticky Bit Works
The remainder $r = I - s^2$ returned by integer sqrtRem is used:
- $r = 0$: $\sqrt{I} = s$ (exact). The LSB of the root is left unchanged
- $r \neq 0$: $s < \sqrt{I} < s + 1$. The LSB of the root is set to 1 (sticky bit)
$$s_{\text{sticky}} = \begin{cases} s & \text{if } r = 0 \\ s \mathbin{|} 1 & \text{if } r \neq 0 \end{cases}$$
This information propagates to the subsequent setResultPrecision() (rounding routine),
enabling correct identification of midpoints (values exactly on the rounding boundary).
When rounding the $(p+1)$-bit root to $p$ bits,
a sticky bit of 1 indicates that the true value is above the midpoint,
allowing Round-To-Nearest-Even to round in the correct direction.
Relationship with Rounding Modes
The sticky bit guarantees correct rounding for all of the following rounding modes:
- Round-To-Nearest-Even (default): rounds toward the even value at midpoint. The sticky bit is needed to identify midpoints
- Truncation: always rounds toward zero. The sticky bit has no effect
- Round-Up / Round-Down: the rounding direction is determined by whether the remainder is nonzero
This method is the same approach adopted by MPFR and satisfies the requirements of IEEE 754.
sqrt_newton_1 ($\leq 64$ bit Precision)
Computes the square root of a 1-word mantissa, completely bypassing the Float abstraction. Avoids Int object construction and destruction, performing all computation using CPU intrinsics only.
- Scaling to 128 bits:
From the mantissa $m$ and exponent $e$, construct a 128-bit integer $M$ (
M_hi:M_lo). Setdesired_bits = 126so that even after parity adjustment of $+1$, it fits within 128 bits. - Initial approximation:
Obtain a $\sim$53-bit initial approximation $s_0$ via
std::sqrt(double(M)). Since $M$ is 128 bits but the double mantissa is 53 bits, $s_0$ approximates only the leading bits. - One Newton iteration:
$$s_1 = \frac{s_0 + \lfloor M / s_0 \rfloor}{2}$$
Execute a 128 ÷ 64 bit division using
_udiv128(M_hi, M_lo, s0, &rem). Overflow-safe average:(s0 >> 1) + (q >> 1) + ((s0 & q) & 1). This single iteration doubles the precision from $\sim$53 to $\sim$64 bits. - ±1 correction:
Compute $s_1^2$ using
_umul128(s1, s1, &sq_hi)and verify that $s_1^2 \leq M < (s_1 + 1)^2$. If the condition is not met, adjust $s_1$ by $\pm 1$. - Sticky bit: If $s_1^2 \neq M$, set the LSB of $s_1$ to 1.
The result is constructed as a Float directly from Int(s1),
then rounded by setResultPrecision().
Thanks to SBO (Small Buffer Optimization), a 1-word Int is constructed without heap allocation.
sqrt_small_2 ($\leq 128$ bit Precision)
Computes the square root of a 1–2 word mantissa directly using mpn-level dc_sqrtrem.
Avoids the overhead of Int-level division and multiplication, completing entirely in fixed-size stack buffers.
- Scaling to a 4-word buffer:
Place the mantissa into a 256-bit (4-word) stack buffer
scaled[6]. Two extra words are allocated becausedc_sqrtremuses the regionnp[n..n+2l-1]. - Normalization:
dc_sqrtremrequiresscaled[3] >= 2^{62}. Measure leading zeros withstd::countl_zero()and left-shift by an even number of bits only (an odd shift would produce a half-word offset in the square root result). - dc_sqrtrem call:
Call
dc_sqrtrem(sp, scaled, n=2, scratch)directly. The resulting root is stored insp[0..1](2 words), and the remainder inscaled[0..1]. The carry flag is also returned as part of the remainder. - Denormalization: If normalization shifted left by $t$ bits, right-shift the root by $t/2$ bits to restore it ($\sqrt{x \cdot 2^t} = \sqrt{x} \cdot 2^{t/2}$).
- Sticky bit:
If the carry or remainder (
scaled[0], scaled[1]) is nonzero, setsp[0] |= 1.
An Int is constructed from the 2 words using Int::fromRawWordsPreNormalized (SBO: no heap allocation),
a Float is built from it, and setResultPrecision() performs the rounding.
sqrt_core (General Purpose)
The general routine for arbitrary precision beyond 128 bits.
It faithfully follows the integer reduction procedure
and computes the integer square root using IntSqrt::sqrtRem().
precision_bits = precisionToBits(precision),target_root_bits = precision_bits + 1- Left-shift the mantissa $m$ to
desired_bits = 2 * target_root_bitsbits - Adjust the exponent to be even
- Compute $s = \lfloor\sqrt{I}\rfloor$ and the remainder using
IntSqrt::sqrtRem(scaled, remainder) - If the remainder is nonzero, set
s |= Int(1)(sticky bit) - Construct
Float(s, result_exponent, false)and round withsetResultPrecision
Internally, Zimmermann's dc_sqrtrem is called recursively depending on input size.
The integer square root of $n$ words is computed in $O(M(n))$ ($M(n)$ is the cost of $n$-word multiplication),
the same asymptotic complexity as division via Newton iteration.
Related article: Int Square Root Algorithm — Details of sqrtrem1, sqrtrem2, Zimmermann's dc_sqrtrem, and isSquare
References
- Zimmermann, P. (1999). "Karatsuba Square Root". INRIA Research Report RR-3805.
- Brent, R. P. and Zimmermann, P. (2010). Modern Computer Arithmetic. Cambridge University Press. Chapter 3.5: Square Root.
- Muller, J.-M. et al. (2018). Handbook of Floating-Point Arithmetic. 2nd ed. Birkhäuser. Chapter 8: Square Root.
- MPFR Documentation. https://www.mpfr.org/mpfr-current/mpfr.html