Float Square Root Algorithm

Overview

Float::sqrt computes arbitrary-precision floating-point square roots by reducing to integer square root (sqrtRem). A sticky bit is constructed from the remainder, achieving correct rounding compliant with IEEE 754 for all rounding modes.

$$\sqrt{m \cdot 2^e} = \sqrt{m'} \cdot 2^{e'/2}$$

Here $m'$ is the mantissa scaled to $2(p+1)$ bits, and $e'$ is the exponent adjusted to be even. From the root and remainder returned by integer sqrtRem, a correctly rounded floating-point square root is constructed.

Dispatch is performed in three stages depending on input precision:

PrecisionMantissaFunctionMethod
$\leq 64$ bit1 wordsqrt_newton_1std::sqrt + 1 Newton iteration + intrinsics
$\leq 128$ bit$\leq 2$ wordssqrt_small_2Direct dc_sqrtrem (n=2) call
$> 128$ bitArbitrarysqrt_coreReduction to Int::sqrtRem

For usage, see the Float API Reference. For details on integer square root (sqrtrem1, sqrtrem2, Zimmermann's dc_sqrtrem), see Int Square Root Algorithm.

Reduction to Integer Square Root

The following describes the procedure for reducing a Float square root to integer operations. This principle is common to all dispatch paths.

Scaling

Given a floating-point number $x = m \cdot 2^e$ (mantissa $m$, exponent $e$), we compute the square root to a target precision of $p$ bits.

  1. Determining root bit-length: $\text{target\_root\_bits} = p + 1$ (one extra bit for rounding)
  2. Mantissa scaling: Scale $m$ to a $2(p+1)$-bit integer $I$. Left-shift $m$ by $\text{base\_shift} = 2(p+1) - \text{bitLength}(m)$ bits.
  3. Making the exponent even: If the adjusted exponent $e' = e - \text{base\_shift}$ is odd, apply one additional bit of shift to make it even: $$e' \bmod 2 = 0$$
  4. Integer square root: Compute $s = \lfloor \sqrt{I} \rfloor$ and $r = I - s^2$
  5. Result exponent: $e_{\text{result}} = e' / 2$

$$\sqrt{m \cdot 2^{e'}} = \sqrt{I} \cdot 2^{e'/2} \approx s \cdot 2^{e'/2}$$

Why the Exponent Must Be Even

$\sqrt{2^{2k}} = 2^k$ is an integer, but $\sqrt{2^{2k+1}} = 2^k \sqrt{2}$ is irrational. By making the exponent even, the square root can be completed entirely in the integer domain. For an odd exponent, the mantissa is doubled ($+1$ bit shift) and the exponent is decremented by 1.

Correct Rounding

To perform exact rounding of a floating-point square root, one must know whether the true square root lies strictly on one side of $s$ or $s+1$. calx achieves this using the sticky bit method.

How the Sticky Bit Works

The remainder $r = I - s^2$ returned by integer sqrtRem is used:

  • $r = 0$: $\sqrt{I} = s$ (exact). The LSB of the root is left unchanged
  • $r \neq 0$: $s < \sqrt{I} < s + 1$. The LSB of the root is set to 1 (sticky bit)

$$s_{\text{sticky}} = \begin{cases} s & \text{if } r = 0 \\ s \mathbin{|} 1 & \text{if } r \neq 0 \end{cases}$$

This information propagates to the subsequent setResultPrecision() (rounding routine), enabling correct identification of midpoints (values exactly on the rounding boundary). When rounding the $(p+1)$-bit root to $p$ bits, a sticky bit of 1 indicates that the true value is above the midpoint, allowing Round-To-Nearest-Even to round in the correct direction.

Relationship with Rounding Modes

The sticky bit guarantees correct rounding for all of the following rounding modes:

  • Round-To-Nearest-Even (default): rounds toward the even value at midpoint. The sticky bit is needed to identify midpoints
  • Truncation: always rounds toward zero. The sticky bit has no effect
  • Round-Up / Round-Down: the rounding direction is determined by whether the remainder is nonzero

This method is the same approach adopted by MPFR and satisfies the requirements of IEEE 754.

sqrt_newton_1 ($\leq 64$ bit Precision)

Computes the square root of a 1-word mantissa, completely bypassing the Float abstraction. Avoids Int object construction and destruction, performing all computation using CPU intrinsics only.

  1. Scaling to 128 bits: From the mantissa $m$ and exponent $e$, construct a 128-bit integer $M$ (M_hi:M_lo). Set desired_bits = 126 so that even after parity adjustment of $+1$, it fits within 128 bits.
  2. Initial approximation: Obtain a $\sim$53-bit initial approximation $s_0$ via std::sqrt(double(M)). Since $M$ is 128 bits but the double mantissa is 53 bits, $s_0$ approximates only the leading bits.
  3. One Newton iteration: $$s_1 = \frac{s_0 + \lfloor M / s_0 \rfloor}{2}$$ Execute a 128 ÷ 64 bit division using _udiv128(M_hi, M_lo, s0, &rem). Overflow-safe average: (s0 >> 1) + (q >> 1) + ((s0 & q) & 1). This single iteration doubles the precision from $\sim$53 to $\sim$64 bits.
  4. ±1 correction: Compute $s_1^2$ using _umul128(s1, s1, &sq_hi) and verify that $s_1^2 \leq M < (s_1 + 1)^2$. If the condition is not met, adjust $s_1$ by $\pm 1$.
  5. Sticky bit: If $s_1^2 \neq M$, set the LSB of $s_1$ to 1.

The result is constructed as a Float directly from Int(s1), then rounded by setResultPrecision(). Thanks to SBO (Small Buffer Optimization), a 1-word Int is constructed without heap allocation.

sqrt_small_2 ($\leq 128$ bit Precision)

Computes the square root of a 1–2 word mantissa directly using mpn-level dc_sqrtrem. Avoids the overhead of Int-level division and multiplication, completing entirely in fixed-size stack buffers.

  1. Scaling to a 4-word buffer: Place the mantissa into a 256-bit (4-word) stack buffer scaled[6]. Two extra words are allocated because dc_sqrtrem uses the region np[n..n+2l-1].
  2. Normalization: dc_sqrtrem requires scaled[3] >= 2^{62}. Measure leading zeros with std::countl_zero() and left-shift by an even number of bits only (an odd shift would produce a half-word offset in the square root result).
  3. dc_sqrtrem call: Call dc_sqrtrem(sp, scaled, n=2, scratch) directly. The resulting root is stored in sp[0..1] (2 words), and the remainder in scaled[0..1]. The carry flag is also returned as part of the remainder.
  4. Denormalization: If normalization shifted left by $t$ bits, right-shift the root by $t/2$ bits to restore it ($\sqrt{x \cdot 2^t} = \sqrt{x} \cdot 2^{t/2}$).
  5. Sticky bit: If the carry or remainder (scaled[0], scaled[1]) is nonzero, set sp[0] |= 1.

An Int is constructed from the 2 words using Int::fromRawWordsPreNormalized (SBO: no heap allocation), a Float is built from it, and setResultPrecision() performs the rounding.

sqrt_core (General Purpose)

The general routine for arbitrary precision beyond 128 bits. It faithfully follows the integer reduction procedure and computes the integer square root using IntSqrt::sqrtRem().

  1. precision_bits = precisionToBits(precision), target_root_bits = precision_bits + 1
  2. Left-shift the mantissa $m$ to desired_bits = 2 * target_root_bits bits
  3. Adjust the exponent to be even
  4. Compute $s = \lfloor\sqrt{I}\rfloor$ and the remainder using IntSqrt::sqrtRem(scaled, remainder)
  5. If the remainder is nonzero, set s |= Int(1) (sticky bit)
  6. Construct Float(s, result_exponent, false) and round with setResultPrecision

Internally, Zimmermann's dc_sqrtrem is called recursively depending on input size. The integer square root of $n$ words is computed in $O(M(n))$ ($M(n)$ is the cost of $n$-word multiplication), the same asymptotic complexity as division via Newton iteration.

Related article: Int Square Root Algorithm — Details of sqrtrem1, sqrtrem2, Zimmermann's dc_sqrtrem, and isSquare

References

  • Zimmermann, P. (1999). "Karatsuba Square Root". INRIA Research Report RR-3805.
  • Brent, R. P. and Zimmermann, P. (2010). Modern Computer Arithmetic. Cambridge University Press. Chapter 3.5: Square Root.
  • Muller, J.-M. et al. (2018). Handbook of Floating-Point Arithmetic. 2nd ed. Birkhäuser. Chapter 8: Square Root.
  • MPFR Documentation. https://www.mpfr.org/mpfr-current/mpfr.html