Float — Arbitrary-Precision Floating Point

Overview

Float is the arbitrary-precision floating-point type in calx. Internally, a value is stored as a triple (sign, Int mantissa, int64_t exponent) that encodes $x$ as:

$$x = (-1)^{s} \times m \times 2^{e}$$

where $s$ is the sign bit, $m$ is the multiple-precision integer (Int) mantissa, and $e$ is a 64-bit integer exponent.
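As a rough illustration of this encoding, the sketch below evaluates such a triple with a 64-bit mantissa standing in for Int and a double as the result. The names Triple and tripleToDouble are hypothetical and are not part of calx.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical stand-in for Float's internal triple. calx stores the
// mantissa as an arbitrary-precision Int; here it is a plain uint64_t.
struct Triple {
    bool negative;      // sign bit s
    std::uint64_t m;    // mantissa m
    int exponent;       // binary exponent e (an int64_t in calx)
};

// Evaluates x = (-1)^s * m * 2^e as a double. The result is exact only
// while the mantissa fits in 53 bits and the value is in double range.
double tripleToDouble(const Triple& t) {
    double magnitude = std::ldexp(static_cast<double>(t.m), t.exponent);
    return t.negative ? -magnitude : magnitude;
}
```

For example, the triple (s = 0, m = 3, e = -1) evaluates to 1.5, and (s = 1, m = 5, e = 2) evaluates to -20.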

Precision is specified in decimal digits and internally converted to bit precision. IEEE 754-compatible rounding modes (RoundingMode) are supported, and NaN and Infinity propagate safely through all operations.

  • Arbitrary precision — No limit on the number of digits. Calculations with hundreds of thousands of digits are possible
  • NaN/Infinity safe — Special values propagate with IEEE 754-like semantics
  • thread_local constant cache — Mathematical constants such as $\pi, e, \log 2$ are cached per thread, avoiding recomputation at the same precision
  • Precision tracking — Effective bits (effectiveBits) and requested bits (requestedBits) are automatically managed
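The per-thread constant cache can be pictured with the following minimal sketch. It is not the calx implementation: cachedPi, computePiExpensive, and g_computeCount are hypothetical names, and a double stands in for the cached Float.

```cpp
#include <cmath>
#include <map>

// Counts how many times the "expensive" computation ran on this thread.
thread_local int g_computeCount = 0;

// Placeholder for an expensive constant computation. A real implementation
// would evaluate a series to `precision` digits; a double stands in here.
double computePiExpensive(int /*precision*/) {
    ++g_computeCount;
    return std::acos(-1.0);
}

// Returns pi for the requested precision, computing it at most once per
// thread and precision. Each thread owns its own map, so no lock is needed.
double cachedPi(int precision) {
    thread_local std::map<int, double> cache;
    auto it = cache.find(precision);
    if (it == cache.end()) {
        it = cache.emplace(precision, computePiExpensive(precision)).first;
    }
    return it->second;
}
```

Calling cachedPi(500) twice on one thread runs the computation once; a later cachedPi(1000) misses the cache and computes again.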

Constructors

| Signature | Description |
| --- | --- |
| Float() | Default construction. Value is 0 |
| Float(int value) | Construct from int |
| Float(int64_t value) | Construct from 64-bit integer |
| Float(double value) | Construct from double (53-bit precision) |
| Float(std::string_view str) | Construct from string (decimal) |
| Float(const Int& value) | Construct from arbitrary-precision integer (exact) |
| Float(const Int& mantissa, int64_t exponent, bool is_negative = false) | Construct from mantissa + exponent |
| Float(Int&& mantissa, int64_t exponent, bool is_negative = false) | Construct from mantissa (move) + exponent |
| Float(int64_t mantissa, int64_t exponent, bool is_negative = false) | Construct from integer mantissa + exponent |

Copy/move constructors and copy/move assignment operators are all defaulted. Assignment operators from int64_t and double are also available.

Special Value Factory

All are static member functions.

| Signature | Description |
| --- | --- |
| Float positiveInfinity() | Returns $+\infty$ |
| Float negativeInfinity() | Returns $-\infty$ |
| Float nan() | Returns NaN |
| Float epsilon(int precision) | Machine epsilon for the given precision |
| Float zero(int precision) | Zero with the given precision |
| Float one(int precision) | One with the given precision |

If precision is omitted, defaultPrecision() (thread-local) is used.

Mathematical Constants

All are static member functions. A thread_local cache avoids recomputation at the same precision.

| Signature | Description | Algorithm |
| --- | --- | --- |
| Float pi(int precision) | Pi $\pi \approx 3.1416$ | Chudnovsky (binary splitting) |
| Float e(int precision) | Euler's number $e \approx 2.7183$ | $\sum 1/n!$ (binary splitting) |
| Float log2(int precision) | $\ln 2 \approx 0.6931$ | atanh series |
| Float log10(int precision) | $\ln 10 \approx 2.3026$ | atanh series |
| Float euler(int precision) | Euler-Mascheroni constant $\gamma \approx 0.5772$ | Brent-McMillan |
| Float catalan(int precision) | Catalan's constant $G \approx 0.9160$ | Euler series |
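To give a feel for the "atanh series" entries, here is a double-precision sketch based on the identities $\ln 2 = 2\,\mathrm{atanh}(1/3)$ and $\ln 10 = 3\ln 2 + 2\,\mathrm{atanh}(1/9)$. Which atanh formula calx actually evaluates is not stated above, so this particular choice and the function names are assumptions.

```cpp
#include <cmath>

// atanh(x) = x + x^3/3 + x^5/5 + ... for |x| < 1, summed until the next
// term no longer changes the double-precision result.
double atanhSeries(double x) {
    double sum = 0.0;
    double power = x;                      // x^(2k+1)
    for (int k = 0; k < 200; ++k) {
        double term = power / (2 * k + 1);
        if (sum + term == sum) break;      // converged in double precision
        sum += term;
        power *= x * x;
    }
    return sum;
}

// ln 2 = 2 * atanh(1/3), because atanh(x) = ln((1 + x) / (1 - x)) / 2.
double ln2FromAtanh() { return 2.0 * atanhSeries(1.0 / 3.0); }

// ln 10 = ln 8 + ln(5/4) = 3 ln 2 + 2 * atanh(1/9).
double ln10FromAtanh() { return 3.0 * ln2FromAtanh() + 2.0 * atanhSeries(1.0 / 9.0); }
```

Small arguments such as 1/3 and 1/9 make each term shrink geometrically, which is what makes series of this kind attractive at high precision.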

Trigonometric / Circle Constants

| Signature | Description |
| --- | --- |
| Float::half_pi(int precision) | $\pi/2 \approx 1.5708$ |
| Float::quarter_pi(int precision) | $\pi/4 \approx 0.7854$ |
| Float::two_pi(int precision) | $2\pi \approx 6.2832$ |
| Float::inv_pi(int precision) | $1/\pi \approx 0.3183$ |
| Float::two_inv_pi(int precision) | $2/\pi \approx 0.6366$ |
| Float::pi_squared_over_6(int precision) | $\pi^2/6 = \zeta(2) \approx 1.6449$ |
| Float::pi_squared_over_12(int precision) | $\pi^2/12 \approx 0.8225$ |
| Float::inv_sqrt_pi(int precision) | $1/\sqrt{\pi} \approx 0.5642$ |
| Float::two_inv_sqrt_pi(int precision) | $2/\sqrt{\pi} \approx 1.1284$ |
| Float::degree(int precision) | $\pi/180 \approx 0.01745$ |

Square Root Constants

| Signature | Description |
| --- | --- |
| Float::sqrt2(int precision) | $\sqrt{2} \approx 1.4142$ |
| Float::sqrt3(int precision) | $\sqrt{3} \approx 1.7321$ |
| Float::sqrt5(int precision) | $\sqrt{5} \approx 2.2361$ |
| Float::inv_sqrt2(int precision) | $1/\sqrt{2} \approx 0.7071$ |
| Float::cbrt2(int precision) | $\sqrt[3]{2} \approx 1.2599$ |

Logarithmic Constants

| Signature | Description |
| --- | --- |
| Float::ln3(int precision) | $\ln 3 \approx 1.0986$ |
| Float::ln5(int precision) | $\ln 5 \approx 1.6094$ |
| Float::log2e(int precision) | $\log_2 e \approx 1.4427$ |
| Float::log10e(int precision) | $\log_{10} e \approx 0.4343$ |

Notable Constants

| Signature | Description |
| --- | --- |
| Float::phi(int precision) | Golden ratio $(1+\sqrt{5})/2 \approx 1.6180$ |
| Float::lemniscate(int precision) | Lemniscate constant $\varpi \approx 2.6221$ |
| Float::gamma14(int precision) | $\Gamma(1/4) \approx 3.6256$ |
| Float::zeta3(int precision) | Apéry's constant $\zeta(3) \approx 1.2021$ |
| Float::zeta5(int precision) | $\zeta(5) \approx 1.0369$ |
| Float::zeta7(int precision) | $\zeta(7) \approx 1.0083$ |
| Float::glaisher(int precision) | Glaisher-Kinkelin constant $A \approx 1.2824$ |
| Float::khinchin(int precision) | Khinchin's constant $K \approx 2.6854$ |
| Float::omega(int precision) | $\Omega$ (Lambert $W(1)$) $\approx 0.5671$ |
| Float::sin1(int precision) | $\sin 1 \approx 0.8415$ |
| Float::cos1(int precision) | $\cos 1 \approx 0.5403$ |

Rare Mathematical Constants

| Signature | Description |
| --- | --- |
| Float::plastic(int precision) | Plastic number $\approx 1.3247$ |
| Float::twin_prime(int precision) | Twin prime constant $C_2 \approx 0.6602$ |
| Float::landau_ramanujan(int precision) | Landau-Ramanujan constant $\approx 0.7642$ |
| Float::meissel_mertens(int precision) | Meissel-Mertens constant $\approx 0.2615$ |
| Float::bernstein(int precision) | Bernstein's constant $\approx 0.2801$ |
| Float::gauss_kuzmin(int precision) | Gauss-Kuzmin-Wirsing constant $\approx 0.3037$ |
| Float::feigenbaum_delta(int precision) | Feigenbaum $\delta \approx 4.6692$ |
| Float::feigenbaum_alpha(int precision) | Feigenbaum $\alpha \approx 2.5029$ |
| Float::erdos_borwein(int precision) | Erdős-Borwein constant $\approx 1.6066$ |
| Float::laplace_limit(int precision) | Laplace limit constant $\approx 0.6627$ |
| Float::soldner(int precision) | Ramanujan-Soldner constant $\mu \approx 1.4513$ |
| Float::backhouse(int precision) | Backhouse's constant $\approx 1.4560$ |
| Float::porter(int precision) | Porter's constant $\approx 1.4670$ |
| Float::lieb_square_ice(int precision) | Lieb's square ice constant $\approx 1.5396$ |
| Float::niven(int precision) | Niven's constant $\approx 1.7052$ |
| Float::reciprocal_fibonacci(int precision) | Reciprocal Fibonacci constant $\approx 3.3599$ |
| Float::sierpinski(int precision) | Sierpiński's constant $\approx 2.5849$ |
| Float::mills(int precision) | Mills' constant $\approx 1.3064$ |
| Float::dottie(int precision) | Dottie number $\approx 0.7391$ |
| Float::golomb_dickman(int precision) | Golomb-Dickman constant $\approx 0.6243$ |
| Float::salem(int precision) | Salem constant $\approx 1.1762$ |
| Float::cahen(int precision) | Cahen's constant $\approx 0.6434$ |
| Float::levy(int precision) | Lévy's constant $\approx 3.2758$ |
| Float::copeland_erdos(int precision) | Copeland-Erdős constant $\approx 0.2357$ |
| Float::egamma_exp(int precision) | $e^{\gamma} \approx 1.7811$ |

Code Example

Float::setDefaultPrecision(100);
Float pi = Float::pi();       // pi = 3.14159265...
Float e  = Float::e();        // e = 2.71828182...
Float g  = Float::euler();    // gamma = 0.57721566... (Euler-Mascheroni)
Float s2 = Float::sqrt2();    // sqrt(2) = 1.41421356...

Precision Control

| Signature | Description |
| --- | --- |
| Float& setPrecision(int precision) | Set the number of significant digits (with rounding). Returns a reference to self |
| int precision() const | Get the current number of significant digits |
| int effectiveBits() const | Get the number of reliable bits |
| int requestedBits() const | Get the target number of bits |
| void truncateToApprox(int precision) | Fast word-level approximate truncation (for intermediate computations) |
| static int precisionToBits(int precision) | Decimal digits → bits |
| static int bitsToPrecision(int bits) | Bits → decimal digits |
| static int defaultPrecision() | Get the thread-local default precision |
| static void setDefaultPrecision(int precision) | Set the thread-local default precision |
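The digit/bit conversion rests on the ratio $\log_2 10 \approx 3.32$ bits per decimal digit. The sketch below shows one plausible pair of conversions; the exact rounding and any extra guard bits used by precisionToBits and bitsToPrecision are not documented here, so digitsToBits and bitsToDigits are hypothetical helpers.

```cpp
#include <cmath>

// Bits needed to carry `digits` decimal digits: ceil(digits * log2(10)).
int digitsToBits(int digits) {
    return static_cast<int>(std::ceil(digits * std::log2(10.0)));
}

// Decimal digits covered by `bits` binary digits: floor(bits * log10(2)).
int bitsToDigits(int bits) {
    return static_cast<int>(std::floor(bits * std::log10(2.0)));
}
```

Under these assumptions 100 digits need 333 bits, 1000 digits need 3322 bits, and the 53-bit significand of a double covers 15 full decimal digits.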

Rounding Modes

enum class RoundingMode {
    ToNearest,      // Round to nearest (default)
    TowardZero,     // Round toward zero (truncation)
    TowardPositive, // Round toward positive infinity (ceiling)
    TowardNegative, // Round toward negative infinity (floor)
    AwayFromZero    // Round away from zero
};

| Signature | Description |
| --- | --- |
| static RoundingMode roundingMode() | Get the current rounding mode |
| static void setRoundingMode(RoundingMode mode) | Set the rounding mode |
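The effect of each mode can be seen on a small integer significand. The sketch below drops low-order bits from a 64-bit magnitude and decides whether to round up from the first discarded bit (guard) and the OR of the rest (sticky). It redeclares the enum to stay self-contained, roundShift is a hypothetical helper, and resolving ToNearest ties to even is an assumption.

```cpp
#include <cstdint>

enum class RoundingMode { ToNearest, TowardZero, TowardPositive, TowardNegative, AwayFromZero };

// Drops the low `shift` bits (1..63) of the magnitude `m` and rounds the
// result according to `mode`. `negative` is the sign of the value.
// ToNearest resolves ties to even here; that tie rule is an assumption.
std::uint64_t roundShift(std::uint64_t m, unsigned shift, bool negative, RoundingMode mode) {
    std::uint64_t q = m >> shift;                                      // truncated magnitude
    bool guard = ((m >> (shift - 1)) & 1u) != 0;                       // first discarded bit
    bool sticky = (m & ((std::uint64_t{1} << (shift - 1)) - 1)) != 0;  // any lower bit set
    bool inexact = guard || sticky;
    bool up = false;
    switch (mode) {
        case RoundingMode::ToNearest:      up = guard && (sticky || (q & 1u) != 0); break;
        case RoundingMode::TowardZero:     up = false; break;
        case RoundingMode::TowardPositive: up = inexact && !negative; break;
        case RoundingMode::TowardNegative: up = inexact && negative; break;
        case RoundingMode::AwayFromZero:   up = inexact; break;
    }
    return up ? q + 1 : q;                                             // rounded magnitude
}
```

With m = 11 and shift = 2 (the value 2.75), ToNearest gives 3 and TowardZero gives 2. The tie 10/4 = 2.5 goes to 2 and the tie 14/4 = 3.5 goes to 4 under ties-to-even.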

State Inspection

| Signature | Description |
| --- | --- |
| bool isNaN() const | true if NaN |
| bool isInfinity() const | true if $\pm\infty$ |
| bool isZero() const | true if zero |
| bool isNegative() const | true if negative |
| bool isPositive() const | true if non-negative (!isNegative()) |
| bool isExact() const | true if the value is exact (derived from an integer) |
| bool isInteger() const | true if the value is an integer (NaN/Infinity returns false) |
| bool fitsInt() const | Whether the value fits in int range |
| bool fitsInt64() const | Whether the value fits in int64_t range |
| bool fitsDouble() const | Whether the value fits in double range |

Arithmetic Operators

| Signature | Description |
| --- | --- |
| Float operator+(const Float&, const Float&) | Addition |
| Float operator-(const Float&, const Float&) | Subtraction |
| Float operator*(const Float&, const Float&) | Multiplication |
| Float operator/(const Float&, const Float&) | Division |
| Float operator/(const Float&, int64_t) | Integer division (fast path) |
| Float& operator+=(const Float&) | Addition assignment |
| Float& operator-=(const Float&) | Subtraction assignment |
| Float& operator*=(const Float&) | Multiplication assignment |
| Float& operator/=(const Float&) | Division assignment |
| Float operator-(const Float&) | Unary minus (sign flip) |
| Float operator<<(const Float&, int) | Left shift ($\times 2^n$) |
| Float operator>>(const Float&, int) | Right shift ($\div 2^n$) |

Under the default precision propagation policy (MAX_PROPAGATION), the result's requestedBits is the maximum of the operands' requestedBits, and its effectiveBits is the minimum of the operands' effectiveBits.
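The propagation rule for a binary operation can be written down in a few lines. PrecisionInfo and propagate are hypothetical names used only for this sketch.

```cpp
#include <algorithm>

// Hypothetical precision metadata carried by each value.
struct PrecisionInfo {
    int requestedBits;   // target precision
    int effectiveBits;   // bits that can be trusted
};

// Metadata of the result of a binary operation under the max-propagation
// rule: request the larger target, trust only the less reliable operand.
PrecisionInfo propagate(const PrecisionInfo& lhs, const PrecisionInfo& rhs) {
    return {std::max(lhs.requestedBits, rhs.requestedBits),
            std::min(lhs.effectiveBits, rhs.effectiveBits)};
}
```

Combining a value with (requested 333, effective 333) and one with (requested 3322, effective 53) yields requested 3322 and effective 53: the result aims for the higher precision but is only as reliable as its weakest input.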

Comparison Operators

| Signature | Description |
| --- | --- |
| bool operator==(const Float&, const Float&) | Equality comparison |
| std::partial_ordering operator<=>(const Float&, const Float&) | Three-way comparison (C++20) |

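In C++20 the compiler derives <, >, <=, and >= from operator<=>, and != from operator==. Any three-way comparison involving NaN returns std::partial_ordering::unordered, so every relational operator and == yield false for NaN while != yields true (IEEE 754 compliant).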
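The consequences of an unordered result are easy to check with plain doubles. The sketch below hand-rolls the four outcomes of std::partial_ordering so that it also compiles before C++20; Ordering, compare3, and the derived helpers are hypothetical names.

```cpp
#include <cmath>

// Hand-rolled equivalent of the four std::partial_ordering outcomes.
enum class Ordering { Less, Equivalent, Greater, Unordered };

// Three-way comparison for double with IEEE 754 NaN semantics.
Ordering compare3(double a, double b) {
    if (std::isnan(a) || std::isnan(b)) return Ordering::Unordered;
    if (a < b) return Ordering::Less;
    if (a > b) return Ordering::Greater;
    return Ordering::Equivalent;
}

// Relational operators derived from the three-way result. Each is false
// for Unordered, which is why NaN < x, NaN <= x and NaN == x all fail.
bool lessThan(double a, double b) { return compare3(a, b) == Ordering::Less; }
bool lessOrEqual(double a, double b) {
    Ordering o = compare3(a, b);
    return o == Ordering::Less || o == Ordering::Equivalent;
}
bool equalTo(double a, double b) { return compare3(a, b) == Ordering::Equivalent; }
```

One practical consequence: a NaN never compares equal to itself, so a test such as x == x doubles as a NaN check.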

Basic Mathematical Functions

All are free functions in the calx namespace. The precision argument specifies the working precision in decimal digits. Move overloads are also provided but omitted from the tables.

Exponential / Logarithmic

| Signature | Description |
| --- | --- |
| Float exp(const Float& x, int precision) | $e^x$ |
| Float exp2(const Float& x, int precision) | $2^x$ |
| Float exp10(const Float& x, int precision) | $10^x$ |
| Float expm1(const Float& x, int precision) | $e^x - 1$ (high accuracy near $x \approx 0$) |
| Float log(const Float& x, int precision) | $\ln x$ (AGM method) |
| Float log2(const Float& x, int precision) | $\log_2 x$ |
| Float log10(const Float& x, int precision) | $\log_{10} x$ |
| Float log1p(const Float& x, int precision) | $\ln(1+x)$ (high accuracy near $x \approx 0$) |
| Float logUi(unsigned long long n, int precision) | Logarithm of an integer (high accuracy via prime factorization) |
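Why expm1 and log1p exist as separate functions is easiest to see in double precision, where the same cancellation problem appears. The sketch below uses the standard-library std::expm1 and std::log1p as references; expm1Naive and log1pNaive are hypothetical names for the straightforward formulas.

```cpp
#include <cmath>

// e^x - 1 computed the naive way. For tiny x, exp(x) rounds to exactly 1,
// so the subtraction returns 0 and every significant digit is lost.
double expm1Naive(double x) { return std::exp(x) - 1.0; }

// ln(1 + x) computed the naive way. For tiny x, 1 + x rounds to 1,
// so the logarithm returns 0.
double log1pNaive(double x) { return std::log(1.0 + x); }
```

For x = 1e-18 both naive versions return 0, whereas std::expm1(x) and std::log1p(x) return values very close to x. The same effect occurs at any finite working precision once x is small relative to that precision.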

Trigonometric Functions

| Signature | Description |
| --- | --- |
| Float sin(const Float& x, int precision) | $\sin x$ |
| Float cos(const Float& x, int precision) | $\cos x$ |
| Float tan(const Float& x, int precision) | $\tan x$ |
| Float sinPi(const Float& x, int precision) | $\sin(\pi x)$ (exactly 0 at integer points) |
| Float cosPi(const Float& x, int precision) | $\cos(\pi x)$ |
| Float tanPi(const Float& x, int precision) | $\tan(\pi x)$ |
| Float sec(const Float& x, int precision) | $\sec x = 1/\cos x$ |
| Float csc(const Float& x, int precision) | $\csc x = 1/\sin x$ |
| Float cot(const Float& x, int precision) | $\cot x = \cos x/\sin x$ |
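The "exactly 0 at integer points" property of sinPi comes from reducing the argument before multiplying by $\pi$. A double-precision sketch of that idea follows; sinPiSketch is a hypothetical name and the reduction shown is an assumption about the general technique, not a description of the calx code.

```cpp
#include <cmath>

// sin(pi * x) with exact zeros at integers. std::fmod is exact, so the
// reduced argument r = x mod 2 carries no rounding error from pi.
double sinPiSketch(double x) {
    const double pi = 3.141592653589793238462643383279502884;
    double r = std::fmod(x, 2.0);          // r in (-2, 2), same sign as x
    if (r == std::trunc(r)) return 0.0;    // x is an integer: exact zero
    return std::sin(pi * r);
}
```

Here sinPiSketch(1e6) is exactly 0. Evaluating std::sin on the product of a rounded $\pi$ and 1e6 instead typically gives a small nonzero value, because that product is not an exact multiple of $\pi$.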

Code Example

int prec = 50;
Float x("0.5", prec);
Float s = sin(x, prec);   // sin(0.5)
Float c = cos(x, prec);   // cos(0.5)
Float t = tan(x, prec);   // tan(0.5)

Inverse Trigonometric Functions

| Signature | Description |
| --- | --- |
| Float asin(const Float& x, int precision) | $\arcsin x$ |
| Float acos(const Float& x, int precision) | $\arccos x$ |
| Float atan(const Float& x, int precision) | $\arctan x$ (Taylor + argument halving) |
| Float atan2(const Float& y, const Float& x, int precision) | $\mathrm{atan2}(y, x)$ |
| Float asinPi(const Float& x, int precision) | $\arcsin(x) / \pi$ |
| Float acosPi(const Float& x, int precision) | $\arccos(x) / \pi$ |
| Float atanPi(const Float& x, int precision) | $\arctan(x) / \pi$ |
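The "Taylor + argument halving" note for atan refers to a standard technique, sketched here in double precision: the identity $\arctan x = 2\arctan\bigl(x / (1 + \sqrt{1 + x^2})\bigr)$ shrinks the argument until the Taylor series converges quickly. The threshold 0.125 and the name atanHalving are illustrative assumptions.

```cpp
#include <cmath>

// arctan(x) via repeated argument halving followed by the Taylor series
// atan(x) = x - x^3/3 + x^5/5 - ...  Intended for finite, moderate x.
double atanHalving(double x) {
    int halvings = 0;
    while (std::fabs(x) > 0.125) {
        x = x / (1.0 + std::sqrt(1.0 + x * x));   // atan(x_old) = 2 * atan(x_new)
        ++halvings;
    }
    double sum = 0.0;
    double power = x;                             // (-1)^k * x^(2k+1)
    for (int k = 0; k < 100; ++k) {
        double term = power / (2 * k + 1);
        if (sum + term == sum) break;             // converged in double precision
        sum += term;
        power *= -x * x;
    }
    return std::ldexp(sum, halvings);             // multiply by 2^halvings
}
```

Each halving costs one square root but lets the series gain more bits per term, a trade-off that becomes more favorable as the precision grows.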

Code Example

int prec = 50;
Float x("0.5", prec);
Float a = asin(x, prec);  // arcsin(0.5) = pi/6
Float b = acos(x, prec);  // arccos(0.5) = pi/3
Float c = atan(x, prec);  // arctan(0.5) = 0.4636...

Hyperbolic Functions

| Signature | Description |
| --- | --- |
| Float sinh(const Float& x, int precision) | $\sinh x$ |
| Float cosh(const Float& x, int precision) | $\cosh x$ |
| Float tanh(const Float& x, int precision) | $\tanh x$ |
| Float sech(const Float& x, int precision) | $\mathrm{sech}\,x = 1/\cosh x$ |
| Float csch(const Float& x, int precision) | $\mathrm{csch}\,x = 1/\sinh x$ |
| Float coth(const Float& x, int precision) | $\coth x = \cosh x/\sinh x$ |

Code Example

int prec = 50;
Float x("1.0", prec);
Float s = sinh(x, prec);  // sinh(1) = 1.1752...
Float c = cosh(x, prec);  // cosh(1) = 1.5430...
Float t = tanh(x, prec);  // tanh(1) = 0.7615...

Inverse Hyperbolic Functions

| Signature | Description |
| --- | --- |
| Float asinh(const Float& x, int precision) | $\mathrm{arcsinh}\,x$ |
| Float acosh(const Float& x, int precision) | $\mathrm{arccosh}\,x$ |
| Float atanh(const Float& x, int precision) | $\mathrm{arctanh}\,x$ |

Powers / Roots

| Signature | Description |
| --- | --- |
| Float sqr(const Float& x, int precision) | $x^2$ (squaring with optimized algorithm) |
| Float pow(const Float& x, const Float& y, int precision) | $x^y$ |
| Float pow(const Float& x, int n, int precision) | $x^n$ (integer power via binary exponentiation) |
| Float sqrt(const Float& x, int precision) | $\sqrt{x}$ |
| Float cbrt(const Float& x, int precision) | $\sqrt[3]{x}$ |
| Float nthRoot(const Float& x, int n, int precision) | $\sqrt[n]{x}$ |
| Float recSqrt(const Float& x, int precision) | $1/\sqrt{x}$ (reciprocal square root) |
| Float hypot(const Float& x, const Float& y, int precision) | $\sqrt{x^2 + y^2}$ (overflow-safe) |
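Binary exponentiation, mentioned for the integer overload of pow, computes $x^n$ with $O(\log |n|)$ multiplications by scanning the bits of the exponent. A double-precision sketch (powInt is a hypothetical name):

```cpp
// x^n by binary exponentiation: square the base once per exponent bit and
// multiply it into the result whenever that bit is set.
double powInt(double x, int n) {
    long long e = n;                 // widen so that negating n cannot overflow
    bool invert = e < 0;
    if (invert) e = -e;
    double result = 1.0;
    double base = x;
    while (e > 0) {
        if (e & 1) result *= base;   // this bit of the exponent is set
        base *= base;                // base is now x^(2^k) for the next bit
        e >>= 1;
    }
    return invert ? 1.0 / result : result;
}
```

For example, powInt(2.0, 10) takes four squarings and two multiplications into the result instead of nine plain multiplications, and returns 1024.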

Code Example

int prec = 50;
Float x("2.0", prec);
Float s = sqrt(x, prec);                  // sqrt(2) = 1.41421356...
Float c = cbrt(x, prec);                  // cbrt(2) = 1.25992104...
Float p = pow(x, Float("0.5", prec), prec); // 2^0.5 = sqrt(2)

Miscellaneous

| Signature | Description |
| --- | --- |
| Float abs(const Float& x) | $\lvert x \rvert$ |
| Float fma(const Float& a, const Float& b, const Float& c, int precision) | $ab + c$ (fused multiply-add) |
| Float fms(const Float& a, const Float& b, const Float& c, int precision) | $ab - c$ (fused multiply-subtract) |
| Float fmma(const Float& a, const Float& b, const Float& c, const Float& d, int precision) | $ab + cd$ (double fused multiply-add) |
| Float fmms(const Float& a, const Float& b, const Float& c, const Float& d, int precision) | $ab - cd$ (double fused multiply-subtract) |
| Float factorial(int n, int precision) | $n!$ |
| void sinCos(const Float& x, Float& s, Float& c, int precision) | Compute $\sin x$ and $\cos x$ simultaneously |
| void sinhCosh(const Float& x, Float& s, Float& c, int precision) | Compute $\sinh x$ and $\cosh x$ simultaneously |
| Float agm(const Float& a, const Float& b, int precision) | Arithmetic-geometric mean $\mathrm{AGM}(a, b)$ |
| Float sum(std::span<const Float> values, int precision) | High-precision summation |
| Float dot(std::span<const Float> a, std::span<const Float> b, int precision) | High-precision dot product |
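The arithmetic-geometric mean replaces the pair $(a, b)$ by its arithmetic and geometric means until the two agree. A double-precision sketch for positive inputs follows; agmSketch and its stopping tolerance are illustrative choices.

```cpp
#include <cmath>
#include <limits>

// Arithmetic-geometric mean of two positive numbers. Convergence is
// quadratic, so a handful of iterations suffices in double precision.
double agmSketch(double a, double b) {
    const double tol = 4.0 * std::numeric_limits<double>::epsilon();
    for (int i = 0; i < 64 && std::fabs(a - b) > tol * std::fabs(a); ++i) {
        double arithmetic = 0.5 * (a + b);
        double geometric = std::sqrt(a * b);
        a = arithmetic;
        b = geometric;
    }
    return 0.5 * (a + b);
}
```

For instance, $\mathrm{AGM}(24, 6) \approx 13.4581714817$ and $\mathrm{AGM}(1, \sqrt{2}) \approx 1.1981402347$. The number of correct digits roughly doubles per iteration, which is why AGM-based methods suit high-precision logarithms.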

Error Functions / Gamma Functions

| Signature | Description |
| --- | --- |
| Float erf(const Float& x, int precision) | Error function $\mathrm{erf}(x)$ |
| Float erfc(const Float& x, int precision) | Complementary error function $\mathrm{erfc}(x) = 1 - \mathrm{erf}(x)$ |
| Float erfcx(const Float& x, int precision) | Scaled complementary error function $e^{x^2}\,\mathrm{erfc}(x)$ |
| Float gamma(const Float& x, int precision) | $\Gamma(x)$ |
| Float lnGamma(const Float& x, int precision) | $\ln\Gamma(x)$ |
| Float beta(const Float& a, const Float& b, int precision) | $B(a, b) = \Gamma(a)\Gamma(b)/\Gamma(a+b)$ |
| Float digamma(const Float& x, int precision) | $\psi(x) = \Gamma'(x)/\Gamma(x)$ |
| Float trigamma(const Float& x, int precision) | $\psi'(x)$ |
| Float polygamma(int n, const Float& x, int precision) | $\psi^{(n)}(x)$ |
| Float gammaP(const Float& a, const Float& x, int precision) | Regularized lower incomplete gamma $P(a,x)$ |
| Float gammaQ(const Float& a, const Float& x, int precision) | Regularized upper incomplete gamma $Q(a,x)$ |
| Float gammaLower(const Float& a, const Float& x, int precision) | Lower incomplete gamma $\gamma(a,x)$ |
| Float gammaUpper(const Float& a, const Float& x, int precision) | Upper incomplete gamma $\Gamma(a,x)$ |

Code Example

int prec = 50;
Float x("1.0", prec);
Float e = erf(x, prec);                     // erf(1) = 0.84270079...
Float g = gamma(x, prec);                   // Gamma(1) = 1
Float lg = lnGamma(Float("10", prec), prec); // ln(9!) = 12.8018...

Rounding / Integer Part

All are free functions in the calx namespace. Move overloads are also provided.

| Signature | Description |
| --- | --- |
| Float floor(const Float& x) | $\lfloor x \rfloor$ (floor function) |
| Float ceil(const Float& x) | $\lceil x \rceil$ (ceiling function) |
| Float round(const Float& x) | Round to nearest (half-away-from-zero: ties round away from zero) |
| Float roundEven(const Float& x) | Round to nearest even (banker's rounding: ties round to even) |
| Float trunc(const Float& x) | Truncation toward zero |
| Float frac(const Float& x) | Fractional part $x - \lfloor x \rfloor$ |
| Float nearbyint(const Float& x) | Equivalent to round |
| Float rint(const Float& x) | Equivalent to round |
| Float modf(const Float& x, Float& iptr) | Stores integer part in iptr, returns fractional part |
| Float fmod(const Float& x, const Float& y) | Floating-point remainder $x - \mathrm{trunc}(x/y) \cdot y$ |
| Float remainder(const Float& x, const Float& y) | IEEE 754 remainder $x - \mathrm{roundEven}(x/y) \cdot y$ |
| std::pair<Float, int> remquo(const Float& x, const Float& y) | remainder + signed low 3 bits of the quotient |
| Float ldexp(const Float& x, int exp) | $x \times 2^{\mathrm{exp}}$ |
| Float frexp(const Float& x, int* exp) | Decompose into mantissa $[0.5, 1)$ and exponent |
| Float scalbn(const Float& x, int n) | Equivalent to ldexp |
| int64_t ilogb(const Float& x) | $\lfloor \log_2 \lvert x \rvert \rfloor$ |
| Float logb(const Float& x) | $\lfloor \log_2 \lvert x \rvert \rfloor$ (returns as Float) |
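The two remainder definitions in the table differ only in how the quotient is rounded, which changes the sign and size of the result. The sketch below transcribes both formulas for doubles; fmodSketch and remainderSketch are hypothetical names, and the direct formulas are only reliable while x/y is small enough to be rounded correctly.

```cpp
#include <cmath>

// Remainder with a truncated quotient: the result has the sign of x.
double fmodSketch(double x, double y) {
    return x - std::trunc(x / y) * y;
}

// Remainder with the quotient rounded to nearest, ties to even.
// std::nearbyint rounds ties to even under the default rounding mode.
double remainderSketch(double x, double y) {
    return x - std::nearbyint(x / y) * y;
}
```

For x = 5.5 and y = 2 the truncated quotient is 2, giving 1.5, while the rounded quotient is 3, giving -0.5. For x = 5 and y = 2 the quotient 2.5 is a tie that rounds to the even value 2, so the IEEE-style remainder is 1.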

Utilities

| Signature | Description |
| --- | --- |
| Float fmin(const Float& a, const Float& b) | Minimum ignoring NaN |
| Float fmax(const Float& a, const Float& b) | Maximum ignoring NaN |
| Float fdim(const Float& a, const Float& b) | $\max(a - b, 0)$ |
| Float copySign(const Float& x, const Float& y) | Copy the sign of $y$ to $x$ |
| bool signBit(const Float& x) | true if negative |
| Float nextAbove(const Float& x) | Smallest representable value greater than $x$ |
| Float nextBelow(const Float& x) | Largest representable value less than $x$ |
| Float lerp(const Float& a, const Float& b, const Float& t, int precision) | Linear interpolation $a + t(b - a)$ |
| Float midpoint(const Float& a, const Float& b) | Midpoint $(a + b) / 2$ |
| void swap(Float& a, Float& b) noexcept | Swap |

String Conversion

| Signature | Description |
| --- | --- |
| std::string toString(int precision = -1) const | Decimal scientific notation (e.g. 3.14e+0) |
| std::string toString(int base, int fracDigits) const | Base-N representation (base 2-36). toString(2, 8) → "11.01010101", toString(16, 4) → "3.243f" |
| std::string toDecimalString(int precision = -1) const | Decimal fixed-point notation |
| std::string toScientificString(int precision = -1) const | Scientific notation ($1.23 \times 10^4$ form) |
| double toDouble() const | Convert to double (possible precision loss) |
| Int toInt() const | Convert to Int (fractional part truncated) |
| ostream& operator<<(ostream&, const Float&) | Stream output |
| istream& operator>>(istream&, Float&) | Stream input |
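Base-N output with a fixed number of fractional digits can be produced by peeling integer digits with division and fractional digits with multiplication. The sketch below does this for a non-negative double; toBaseString is a hypothetical helper, and truncating (rather than rounding) the last digit is an assumption that may differ from what toString(base, fracDigits) does.

```cpp
#include <cmath>
#include <string>

// Formats a non-negative double in the given base (2..36) with exactly
// fracDigits digits after the point, truncating the remaining digits.
std::string toBaseString(double value, int base, int fracDigits) {
    const char* digits = "0123456789abcdefghijklmnopqrstuvwxyz";
    double intPart = std::floor(value);
    double frac = value - intPart;

    std::string out;
    if (intPart == 0.0) out = "0";
    while (intPart > 0.0) {                  // integer digits, least significant first
        int d = static_cast<int>(std::fmod(intPart, base));
        out.insert(out.begin(), digits[d]);
        intPart = std::floor(intPart / base);
    }
    if (fracDigits > 0) out += '.';
    for (int i = 0; i < fracDigits; ++i) {   // fraction digits, most significant first
        frac *= base;
        int d = static_cast<int>(frac);
        out += digits[d];
        frac -= d;
    }
    return out;
}
```

With these rules the double nearest to $\pi$ prints as "3.243f" in base 16 with four fractional digits, and 10/3 prints as "11.01010101" in base 2 with eight.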

Internal Access

| Signature | Description |
| --- | --- |
| const Int& mantissa() const | Reference to the mantissa (arbitrary-precision integer) |
| int64_t exponent() const | Exponent (binary exponent) |
| bool isNegative() const | Sign flag |
| int bitLength() const | Bit length of the mantissa |

Usage Examples

Computing Pi

#include <iostream>
#include <math/core/mp/Float.hpp>
using namespace calx;

// Compute pi to 1000 digits (Chudnovsky algorithm)
Float::setDefaultPrecision(1000);
Float pi = Float::pi(1000);
std::cout << pi.toDecimalString(1000) << std::endl;
// 3.14159265358979323846264338327950288419716939937510...

Multi-Precision exp / log

#include <iostream>
#include <math/core/mp/Float.hpp>
using namespace calx;

int prec = 30;
Float::setDefaultPrecision(prec);
Float x("1.5");

Float ex = exp(x, prec);       // e^1.5
Float lx = log(ex, prec);      // ln(e^1.5) = 1.5
std::cout << "exp(1.5) = " << ex.toDecimalString(prec) << std::endl;
std::cout << "log(exp(1.5)) = " << lx.toDecimalString(prec) << std::endl;
// The round trip should agree with 1.5 to about the requested 30 digits

Safe NaN / Infinity Propagation

Float inf = Float::positiveInfinity();
Float nan = Float::nan();
Float z   = Float::zero();

Float r1 = inf + Float(1);     // +Inf
Float r2 = inf - inf;          // NaN
Float r3 = nan + Float(42);   // NaN
Float r4 = Float(1) / z;      // +Inf

std::cout << r1.isInfinity()  // true
          << r2.isNaN()       // true
          << r3.isNaN()       // true
          << r4.isInfinity(); // true

Precision Control and Constant Cache

// thread_local cache makes repeated calls at the same precision fast
Float pi1 = Float::pi(500);   // First call: computed
Float pi2 = Float::pi(500);   // Second call: returned from cache instantly
Float pi3 = Float::pi(1000);  // Different precision: recomputed

// Changing the rounding mode
Float::setRoundingMode(RoundingMode::TowardZero);
Float x("1.999");
x.setPrecision(3);  // Rounded with TowardZero

Related Mathematical Background

The following articles explain the mathematical concepts underlying the Float class.

Differences from IEEE 754

calx Float is an arbitrary-precision floating-point type and differs in design goals from IEEE 754 fixed-width formats (float/double). The following table summarizes which IEEE 754 features Float supports and which are intentionally omitted.

| IEEE 754 Feature | calx Float | Notes |
| --- | --- | --- |
| 5 rounding modes | Implemented | Selectable per thread via setRoundingMode(). Uses guard bit + sticky bit for correct rounding decisions |
| NaN / ±Infinity | Fully supported | Generation, propagation, and comparison semantics implemented |
| Signed zero (±0) | Representable | However, x - x always returns +0. IEEE 754 mandates -0 under roundTowardNegative, but Float does not make this distinction |
| Subnormals (denormals) | Not applicable | IEEE 754 subnormals arise from the implicit leading 1 (hidden bit). Float stores the full significand explicitly as an Int, so there is no hidden bit and the concept of subnormals does not apply. The exponent range is $\pm 2^{60}$, so underflow does not occur in normal use. subnormalize() is provided as an MPFR-compatible emulation feature |
| Exception flags | Infrastructure only | Flags such as FE_INEXACT are defined but not raised by arithmetic operators. Only subnormalize() raises them |

Rounding Strategy: Per-Operation Design

IEEE 754 (and MPFR) round every operation to the target precision. calx Float uses a different rounding strategy depending on the type of operation.

| Operation | Rounding | Design Rationale |
| --- | --- | --- |
| Multiplication / Division | Rounded to requested_bits_ | Multiplying two significands of the same bit length nearly doubles the result length, so without rounding the significand length (and with it the cost) would grow exponentially over a chain of multiplications |
| Addition / Subtraction | No rounding | Addition only extends the significand by the exponent difference, and the extra low-order bits serve as guard bits for subsequent operations. Rounding after every addition would discard these guard bits, causing rounding errors to accumulate in long summation chains. When the exponent difference exceeds max(requested_bits, 1000) + 64, the smaller operand is discarded, preventing unbounded significand growth |
| setPrecision() | Rounded to specified precision | Explicit user control. Use after additions when a specific precision is needed |

With this design, in an expression like (a + b) * c, the extra bits from the addition act as guard bits for the multiplication, which can yield higher accuracy at the same requested_bits_ than rounding after every operation (the MPFR approach). To reach equivalent accuracy with MPFR, one must allocate additional working precision.
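The effect can be reproduced with a toy model that rounds doubles to 4 significant bits. This is only an illustration of the argument, not a model of calx internals; roundToBits, roundEveryOp, and roundProductOnly are hypothetical names.

```cpp
#include <cmath>

// Rounds x to `bits` significant binary digits (round to nearest, ties to
// even under the default floating-point rounding mode).
double roundToBits(double x, int bits) {
    if (x == 0.0) return 0.0;
    int exp = 0;
    double m = std::frexp(x, &exp);   // x = m * 2^exp with 0.5 <= |m| < 1
    return std::ldexp(std::nearbyint(std::ldexp(m, bits)), exp - bits);
}

// (a + b) * c with every operation rounded to `bits` bits.
double roundEveryOp(double a, double b, double c, int bits) {
    double sum = roundToBits(a + b, bits);
    return roundToBits(sum * c, bits);
}

// (a + b) * c where the sum keeps its extra low-order bits and only the
// product is rounded to `bits` bits.
double roundProductOnly(double a, double b, double c, int bits) {
    return roundToBits((a + b) * c, bits);
}
```

With a = 1, b = 0.0625, c = 3 and 4 bits, the exact result is 3.1875. Rounding after every operation first collapses the sum to 1.0 and returns 3.0 (error 0.1875), while rounding only the product returns 3.25 (error 0.0625).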

Precision Management in Other Libraries

| Library | Precision Management |
| --- | --- |
| MPFR | Fixed precision per result variable; every operation performs correct rounding to that precision |
| GMP (mpf) | Precision set per result variable; rounding mode guarantees are not provided |
| Arb (FLINT) | Ball arithmetic (midpoint ± error radius); every operation rigorously tracks the error radius |
| mpmath (Python) | Global precision (mp.prec); every operation rounds to that precision |
| calx Float | Two-value tracking: requested precision (requested_bits_) and effective precision (effective_bits_). Multiplication and division round to the requested precision; addition and subtraction preserve guard bits to improve accuracy in subsequent operations |