What sizes does the calx FFT support?

Any positive integer size is supported. Sizes of the form 2^a * 3^b * 5^c use an efficient mixed-radix algorithm. Power-of-two sizes additionally benefit from SIMD (AVX2/SSE) optimization. All other sizes fall back to the Bluestein (Chirp-Z) algorithm, which internally uses a power-of-two FFT.

How do I choose between FFT, RealFFT, and FFTEngine?

Use FFT > for complex-to-complex transforms. Use RealFFT when your input is purely real — it returns only N/2+1 bins, saving memory. Use FFTEngine when you need configurable normalization modes (None, Forward, Inverse, Both) via the FFTDivMode enum.

What transforms are available besides FFT?

calx provides DCT-II/III (with FFT-based O(n log n) fast path), DST-II/III, Hilbert transform (analytic signal, instantaneous amplitude/frequency), Walsh-Hadamard transform, Hartley transform, discrete Hankel transform (arbitrary order), and Number Theoretic Transform (NTT) for ModularInt types.

How does the batch FFT work?

batchFFT processes 4 complex signals simultaneously by interleaving them into a single AVX2 register layout. This shares twiddle factor computation and improves cache utilization. stereoFFT uses a single complex FFT to compute both L and R channel spectra from interleaved stereo audio data.

FFT — Fast Fourier Transform

Overview

Computes the Discrete Fourier Transform (DFT) and its inverse efficiently for arbitrary-length input.

$$X[k] = \sum_{n=0}^{N-1} x[n] \, e^{-2\pi i \, kn/N}, \qquad k = 0, \dots, N{-}1$$

Mixed-radix — radix-2, 3, 5 butterfly stages handle $N = 2^a \cdot 3^b \cdot 5^c$ in $O(N \log N)$
Bluestein — arbitrary lengths via the Chirp-Z transform
SIMD optimized — AVX2/SSE auto-dispatch for power-of-two sizes
Real FFT — exploits conjugate symmetry, returning only $N/2+1$ bins
2D / 3D FFT — sequential 1D FFT along each axis
NTT — Number Theoretic Transform for ModularInt
Utilities — DCT, DST, Hilbert transform, Walsh-Hadamard, Hartley, Hankel
Batch FFT — multi-signal simultaneous processing (AVX2 interleaved), stereo FFT

All APIs reside in the calx namespace (batch functions in calx::simd_fft).

Header

#include <math/fft/fft.hpp>         // FFT, RealFFT, FFTEngine, FFTDivMode
#include <math/fft/ntt.hpp>         // Ntt (Number Theoretic Transform)
#include <math/fft/fft2d.hpp>       // fft2d, ifft2d, fft2d_real
#include <math/fft/fft3d.hpp>       // Tensor3D, fft3d, ifft3d, fft3d_real, fftshift3d
#include <math/fft/fft_utils.hpp>   // DCT, DST, Hilbert, Walsh-Hadamard, Hartley, Hankel
#include <math/fft/fft_batch.hpp>   // batchFFT, batchIFFT, stereoFFT

Linking

Library	Required for
`calx_fft`	Using FFT / RealFFT / FFTEngine with `float` / `double`
`calx_ntt`	Using the `Ntt` class

// CMake
target_link_libraries(your_target PRIVATE calx_fft calx_ntt)

FFTDivMode

Enumeration specifying the normalization mode for FFTEngine.

enum class FFTDivMode {
    None,       // No normalization
    Forward,    // Normalize forward transform by 1/N
    Inverse,    // Normalize inverse transform by 1/N (default)
    Both        // Normalize both by 1/sqrt(N)
};

Value	Forward	Inverse	Use case
`None`	$1$	$1$	Performance-critical, manual normalization
`Forward`	$1/N$	$1$	Spectrum analysis (amplitudes are physical quantities)
`Inverse`	$1$	$1/N$	Mathematical convention (default)
`Both`	$1/\sqrt{N}$	$1/\sqrt{N}$	Unitary (Parseval's theorem)

Note: FFT<T> and RealFFT<Real> always use Inverse convention (1/N on inverse). Only FFTEngine supports arbitrary normalization modes.

FFT<T, Real> Class

Static FFT API for complex and ModularInt types.

template<typename T, typename Real = double>
class FFT;

T may be std::complex<float>, std::complex<double>, or ModularInt<P>. Real specifies the floating-point type for twiddle factors.

is_valid_size / next_valid_size

static bool is_valid_size(int n);
static int  next_valid_size(int n);

Function	Description
`is_valid_size`	Returns true if $n = 2^a \cdot 3^b \cdot 5^c$ (eligible for the mixed-radix path).
`next_valid_size`	Returns the smallest valid size $\geq n$.

Note: Non-valid sizes can still be passed to fft() (falls back to Bluestein).

fft

static void fft(std::span<T> data);

Parameter	Type	Description
`data`	`std::span<T>`	Input/output array (in-place transform)

Forward DFT. No normalization. Automatically dispatches based on size:

$2^k$ — SIMD-optimized radix-2 (float/double), scalar radix-2 (ModularInt)
$2^a \cdot 3^b \cdot 5^c$ — mixed-radix
Other — Bluestein (Chirp-Z)

ifft

static void ifft(std::span<T> data);

Inverse DFT. Normalized by $1/N$. For complex types, implemented as conjugate + FFT + conjugate + $1/N$.

convolve

static std::vector<T> convolve(std::span<const T> a, std::span<const T> b);

Parameter	Type	Description
`a`	`std::span<const T>`	First polynomial
`b`	`std::span<const T>`	Second polynomial

Returns: Convolution result of size a.size() + b.size() - 1. Internally pads to the next power of two.

Usage: Type T must be a complex type (e.g. std::complex<double>); intended for polynomial multiplication or complex-coefficient convolution. For convolution of real waveforms (float / double), use RealFFT::convolve instead (FIR filtering, impulse-response application, etc.).

ntt / intt / convolve_ntt (deprecated)

Deprecated: These methods are deprecated. Use the Ntt class instead. Ntt provides 5-smooth sizes, AVX2 Montgomery optimization, and Bluestein arbitrary-length support.

template<int P>
[[deprecated]] static void ntt(std::span<ModularInt<P>> data, ModularInt<P> primitive_root);

template<int P>
[[deprecated]] static void intt(std::span<ModularInt<P>> data, ModularInt<P> primitive_root);

template<int P>
[[deprecated]] static std::vector<ModularInt<P>> convolve_ntt(
    std::span<const ModularInt<P>> a, std::span<const ModularInt<P>> b,
    ModularInt<P> primitive_root);

Ntt Class — Number Theoretic Transform

Number Theoretic Transform (NTT) over 64-bit integers. Internally uses AVX2 Montgomery NTT pipeline. Supports 5-smooth sizes ($N = 2^a \cdot 3^b \cdot 5^c$, $b,c \in \{0,1\}$) and arbitrary sizes via Bluestein algorithm.

#include <math/fft/ntt.hpp>
// Link: calx_ntt

class Ntt;

Prime Struct

struct Prime {
    uint64_t p;          // NTT-friendly prime
    uint64_t g;          // Primitive root
    size_t maxNttSize;   // Maximum NTT length = 2^47
};

Recommended Primes

Function	Prime $p$	Root	Notes
`prime1()`	$30 \times 2^{47} + 1$	19	$30 = 2 \times 3 \times 5$
`prime2()`	$345 \times 2^{47} + 1$	13	$345 = 3 \times 5 \times 23$
`prime3()`	$420 \times 2^{47} + 1$	17	$420 = 4 \times 3 \times 5 \times 7$

All three primes satisfy $2^{47} \mid (p - 1)$, supporting NTT of up to $2^{47}$ points. Coefficient range is $[0, p)$ where $p \approx 2^{62}$.

forward / inverse

static void forward(uint64_t* data, size_t N, const Prime& prime);
static void inverse(uint64_t* data, size_t N, const Prime& prime);

Parameter	Type	Description
`data`	`uint64_t*`	In-place input/output array
`N`	`size_t`	Array size (requires $N \mid (p - 1)$)
`prime`	`const Prime&`	NTT prime (e.g. `prime1()`)

5-smooth sizes are handled via mixed-radix NTT. Non-smooth sizes fall back to Bluestein (Chirp-Z) algorithm. inverse includes $1/N$ scaling.

Throws: std::invalid_argument if $N \nmid (p - 1)$.

pointwiseMul

static void pointwiseMul(uint64_t* r, const uint64_t* a, const uint64_t* b,
                          size_t N, const Prime& prime);

Element-wise modular multiplication: $r[i] = (a[i] \times b[i]) \bmod p$. Used for convolution in NTT domain.

polyMul

static void polyMul(uint64_t* r, const uint64_t* a, size_t an,
                     const uint64_t* b, size_t bn, const Prime& prime);

Parameter	Type	Description
`r`	`uint64_t*`	Output array (size $\geq an + bn$)
`a, b`	`const uint64_t*`	Input polynomial coefficients
`an, bn`	`size_t`	Number of terms in each polynomial

Polynomial multiplication $\bmod p$. Uses schoolbook for small sizes ($an + bn \leq 16$), NTT otherwise. Montgomery transform is applied automatically.

Size Utilities

Function	Description
`nextSmoothSize(n)`	Returns the smallest 5-smooth number $\geq n$
`isSmooth(n)`	Tests whether $n$ is 5-smooth
`isValidSize(N, prime)`	Tests whether $N \mid (p - 1)$

Example

#include <math/fft/ntt.hpp>
#include <vector>
#include <iostream>

int main() {
    using calx::Ntt;
    auto prime = Ntt::prime1();

    // Polynomial multiplication: (1 + 2x) × (3 + 4x) = 3 + 10x + 8x²
    uint64_t a[] = {1, 2};
    uint64_t b[] = {3, 4};
    uint64_t r[4] = {};
    Ntt::polyMul(r, a, 2, b, 2, prime);

    for (int i = 0; i < 3; ++i)
        std::cout << r[i] << " ";   // 3 10 8
    std::cout << std::endl;

    // Manual NTT: forward → pointwise multiply → inverse
    size_t N = 4;  // zero-padded power of 2
    std::vector<uint64_t> fa(N, 0), fb(N, 0);
    fa[0] = 1; fa[1] = 2;
    fb[0] = 3; fb[1] = 4;

    Ntt::forward(fa.data(), N, prime);
    Ntt::forward(fb.data(), N, prime);
    Ntt::pointwiseMul(fa.data(), fa.data(), fb.data(), N, prime);
    Ntt::inverse(fa.data(), N, prime);

    for (size_t i = 0; i < 3; ++i)
        std::cout << fa[i] << " ";   // 3 10 8
    std::cout << std::endl;
}

RealFFT<Real> Class

Static FFT API specialized for real-valued data. Exploits conjugate symmetry to return only $N/2+1$ bins.

template<typename Real>
class RealFFT;

fft

static std::vector<std::complex<Real>> fft(std::span<const Real> data);

Parameter	Type	Description
`data`	`std::span<const Real>`	Real input of size $N$

Returns: Complex vector of size $N/2+1$.

ifft

static std::vector<Real> ifft(std::span<const std::complex<Real>> spectrum,
                               int output_size);

Parameter	Type	Description
`spectrum`	`std::span<const std::complex<Real>>`	Frequency domain ($N/2+1$ bins)
`output_size`	`int`	Output size $N$

Returns: Real vector of size $N$. Normalized by $1/N$.

Throws: MathError if spectrum.size() != output_size / 2 + 1.

convolve

static std::vector<Real> convolve(std::span<const Real> a, std::span<const Real> b);

Real-valued convolution using RealFFT::fft internally.

FFTEngine<Real> Class

Instance-based FFT wrapper that holds a normalization mode (FFTDivMode).

template<typename Real = double>
class FFTEngine;

Constructor

explicit FFTEngine(int nfft, FFTDivMode mode = FFTDivMode::Forward);

Parameter	Type	Description
`nfft`	`int`	FFT size (must be a power of two)
`mode`	`FFTDivMode`	Normalization mode (default: Forward)

Throws: std::invalid_argument if nfft is not a power of two.

Member Functions

Function	Signature	Description
`size`	`int size() const`	Returns the FFT size
`transform`	`vector<complex> transform(const vector<complex>&)`	Complex forward FFT
`inverse`	`vector<complex> inverse(const vector<complex>&)`	Complex inverse FFT
`real_transform`	`vector<complex> real_transform(const vector<Real>&)`	Real forward FFT ($N \to N/2+1$)
`real_inverse`	`vector<Real> real_inverse(const vector<complex>&)`	Real inverse FFT ($N/2+1 \to N$)

Throws std::invalid_argument if the input size does not match nfft.

2D FFT

Two-dimensional Fourier transform via sequential 1D FFTs along rows then columns. Matrix<complex>-based.

#include <math/fft/fft2d.hpp>

fft2d

template<typename Real = double>
Matrix<std::complex<Real>> fft2d(const Matrix<std::complex<Real>>& input);

2D FFT (row-wise then column-wise). Output has the same dimensions as input.

ifft2d

template<typename Real = double>
Matrix<std::complex<Real>> ifft2d(const Matrix<std::complex<Real>>& input);

2D inverse FFT. The $1/N$ normalization is applied automatically by the underlying FFT::ifft on each axis.

fft2d_real

template<typename Real = double>
Matrix<std::complex<Real>> fft2d_real(const Matrix<Real>& input);

Converts a real matrix to complex, then applies fft2d.

3D FFT

Three-dimensional Fourier transform via sequential 1D FFTs along $x \to y \to z$ axes.

#include <math/fft/fft3d.hpp>

Tensor3D<T>

template<typename T>
class Tensor3D;

3D tensor container. Row-major layout: data[z * ny * nx + y * nx + x].

Member	Description
`Tensor3D(nx, ny, nz)`	Construct with given dimensions
`Tensor3D(nx, ny, nz, val)`	Construct with initial value
`nx(), ny(), nz()`	Dimension along each axis
`size()`	Total element count ($nx \times ny \times nz$)
`operator()(x, y, z)`	Element access
`data()`	Pointer to the internal buffer

fft3d / ifft3d

template<typename Real = double>
Tensor3D<std::complex<Real>> fft3d(const Tensor3D<std::complex<Real>>& input);

template<typename Real = double>
Tensor3D<std::complex<Real>> ifft3d(const Tensor3D<std::complex<Real>>& input);

fft3d_real

template<typename Real = double>
Tensor3D<std::complex<Real>> fft3d_real(const Tensor3D<Real>& input);

fftshift3d

template<typename T>
Tensor3D<T> fftshift3d(const Tensor3D<T>& input);

Shifts the DC component to the center of the tensor by half-shifting along each axis.

Utilities

#include <math/fft/fft_utils.hpp>

Spectrum Analysis

Function	Description
`amplitude_spectrum(spectrum)`	Computes the amplitude spectrum $\|X[k]\|$
`phase_spectrum(spectrum)`	Computes the phase spectrum $\arg(X[k])$ as the principal value $(-\pi, \pi]$ (wrapped)
`unwrap_phase(phase, tol=π)`	1D phase unwrapping (equivalent to NumPy `np.unwrap`)
`reconstruct_spectrum(amp, phase)`	Reconstructs a complex spectrum from amplitude and phase

template<typename Real>
std::vector<Real> amplitude_spectrum(std::span<const std::complex<Real>> spectrum);

template<typename Real>
std::vector<Real> phase_spectrum(std::span<const std::complex<Real>> spectrum);

template<typename Real>
std::vector<Real> unwrap_phase(std::span<const Real> phase,
                                Real tol = std::numbers::pi_v<Real>);

template<typename Real>
std::vector<std::complex<Real>> reconstruct_spectrum(
    std::span<const Real> amplitude, std::span<const Real> phase);

Phase unwrapping: phase_spectrum returns the principal value of std::arg(), so the phase response wraps at $\pm\pi$. Apply unwrap_phase when a continuous phase is required (group delay $-d\phi/d\omega$, minimum-phase estimation, complex cepstrum, etc.). tol is the threshold on adjacent-sample phase differences beyond which a $2\pi$ offset is applied (default: $\pi$).

auto spec  = RealFFT<double>::fft(signal);
auto phase = phase_spectrum<double>(spec);       // wrapped principal values
auto cont  = unwrap_phase<double>(phase);        // continuous phase
// group delay: approximate by differencing -dφ/dω

Polynomials

template<typename T>
std::vector<T> polynomial_multiply(std::span<const T> a, std::span<const T> b);

template<typename T, typename U>
auto evaluate_polynomial(std::span<const T> poly, const U& x);

polynomial_multiply delegates to FFT::convolve. evaluate_polynomial uses Horner's method.

DCT / IDCT

template<typename Real>
std::vector<Real> dct(std::span<const Real> data);    // DCT-II

template<typename Real>
std::vector<Real> idct(std::span<const Real> coeffs);  // DCT-III (IDCT)

template<typename Real>
std::vector<Real> fast_dct(std::span<const Real> data); // Alias for dct

Normalized DCT-II: $X[k] = c(k) \sum_{i=0}^{N-1} x[i] \cos\bigl(\pi(i+0.5)k/N\bigr)$, where $c(0) = \sqrt{1/N}$, $c(k) = \sqrt{2/N}$ for $k > 0$.

When $N$ is an FFT-compatible size ($2^a \cdot 3^b \cdot 5^c$), uses the Makhoul method at $O(N \log N)$. Otherwise falls back to direct $O(N^2)$ computation.

DST / IDST

template<typename Real>
std::vector<Real> dst(std::span<const Real> data);     // DST-II

template<typename Real>
std::vector<Real> idst(std::span<const Real> coeffs);   // DST-III (IDST)

Normalized DST-II. Currently uses direct $O(N^2)$ computation.

Hilbert Transform

template<typename Real>
std::vector<std::complex<Real>> analyticSignal(std::span<const Real> data);

template<typename Real>
std::vector<Real> hilbertTransform(std::span<const Real> data);

template<typename Real>
std::vector<Real> instantaneousAmplitude(std::span<const Real> data);

template<typename Real>
std::vector<Real> instantaneousFrequency(std::span<const Real> data,
                                          Real sampleRate = Real(1));

Function	Description
`analyticSignal`	Constructs the analytic signal $z[n] = x[n] + j \cdot \mathcal{H}\{x\}[n]$
`hilbertTransform`	Hilbert transform (imaginary part of the analytic signal)
`instantaneousAmplitude`	Instantaneous amplitude $\|z[n]\|$ (envelope)
`instantaneousFrequency`	Instantaneous frequency $\frac{d}{dt}\arg(z[n]) \cdot f_s / 2\pi$

Walsh-Hadamard Transform

template<typename Real>
std::vector<Real> walshHadamard(std::span<const Real> data);

template<typename Real>
std::vector<Real> iwalshHadamard(std::span<const Real> coeffs);

template<typename Real>
void walshHadamardInplace(std::vector<Real>& data);

Fast Walsh-Hadamard Transform. Size must be a power of two. $O(N \log N)$ butterfly algorithm. Self-inverse with $1/N$ normalization.

Hartley Transform

template<typename Real>
std::vector<Real> hartley(std::span<const Real> data);

template<typename Real>
std::vector<Real> ihartley(std::span<const Real> coeffs);

$H[k] = \sum_{n=0}^{N-1} x[n] \cdot \mathrm{cas}(2\pi nk/N)$, where $\mathrm{cas}(\theta) = \cos\theta + \sin\theta$. Real input produces real output. Computed via FFT. Self-inverse with $1/N$ normalization.

Hankel Transform

template<typename Real>
struct HankelTransform {
    HankelTransform(int n, Real radius, Real order = Real(0));

    std::vector<Real> r;    // Sample points
    std::vector<Real> k;    // Frequency points

    std::vector<Real> transform(std::span<const Real> f) const;
    std::vector<Real> inverse(std::span<const Real> F) const;
};

Discrete Hankel Transform (GSL gsl_dht equivalent). Supports arbitrary order $\nu$. For $\nu = 0$, a fast path using McMahon asymptotics and $J_0$/$J_1$ approximations is used.

NTT Utilities

template<int P>
std::optional<ModularInt<P>> find_ntt_primitive_root(int n);

Searches for a primitive root suitable for NTT of size $n$ (power of two). Returns std::nullopt if none found.

Batch FFT

#include <math/fft/fft_batch.hpp>
namespace calx::simd_fft { ... }

Processes multiple same-sized signals simultaneously. Interleaves 4 signals for AVX2 butterfly operations.

batchFFT / batchIFFT

void batchFFT(std::complex<float>** signals, int count, int fftSize);
void batchIFFT(std::complex<float>** signals, int count, int fftSize);

Parameter	Type	Description
`signals`	`complex<float>**`	Array of signal pointers (each signal is updated in-place)
`count`	`int`	Number of signals
`fftSize`	`int`	FFT size (power of two)

Processes 4 signals at a time using AVX2 batch butterflies. Remaining signals are processed with standard simd_fft::fft(). Twiddle tables are cached in thread_local storage.

stereoFFT

void stereoFFT(
    const float* interleaved, int fftSize,
    const float* window,
    std::complex<float>* outL,
    std::complex<float>* outR);

Parameter	Type	Description
`interleaved`	`const float*`	Stereo interleaved data {L0,R0,L1,R1,...} (fftSize*2 floats)
`fftSize`	`int`	FFT size
`window`	`const float*`	Window function (fftSize floats), or nullptr for no window
`outL`	`complex<float>*`	L channel spectrum (fftSize/2+1 bins)
`outR`	`complex<float>*`	R channel spectrum (fftSize/2+1 bins)

Splits interleaved stereo audio into L and R channel spectra using a single complex FFT. Exploits Hermitian symmetry for separation.

Examples

Basic FFT

#include <math/fft/fft.hpp>
#include <iostream>
#include <complex>
#include <vector>

int main() {
    using C = std::complex<double>;

    // 8-point FFT
    std::vector<C> signal = {1, 2, 3, 4, 4, 3, 2, 1};
    calx::FFT<C>::fft(signal);

    std::cout << "FFT result:" << std::endl;
    for (auto& x : signal)
        std::cout << "  " << x << std::endl;

    // Inverse FFT to recover original
    calx::FFT<C>::ifft(signal);

    std::cout << "IFFT result:" << std::endl;
    for (auto& x : signal)
        std::cout << "  " << x.real() << std::endl;
}

Spectrum Analysis

#include <math/fft/fft.hpp>
#include <math/fft/fft_utils.hpp>
#include <vector>
#include <cmath>
#include <iostream>

int main() {
    constexpr int N = 1024;
    constexpr double fs = 44100.0;
    constexpr double f0 = 440.0;

    // Generate 440 Hz sine wave
    std::vector<double> signal(N);
    for (int i = 0; i < N; ++i)
        signal[i] = std::sin(2.0 * M_PI * f0 * i / fs);

    // Real FFT
    auto spectrum = calx::RealFFT<double>::fft(signal);

    // Amplitude spectrum
    auto amp = calx::amplitude_spectrum<double>(spectrum);

    // Find peak frequency
    int peak = 0;
    for (int k = 1; k < (int)amp.size(); ++k)
        if (amp[k] > amp[peak]) peak = k;

    double peakFreq = peak * fs / N;
    std::cout << "Peak frequency: " << peakFreq << " Hz" << std::endl;
}

Convolution (Polynomial Multiplication)

#include <math/fft/fft.hpp>
#include <iostream>

int main() {
    using C = std::complex<double>;

    // p(x) = 1 + 2x + 3x^2
    // q(x) = 4 + 5x
    std::vector<C> p = {1, 2, 3};
    std::vector<C> q = {4, 5};

    auto result = calx::FFT<C>::convolve(p, q);
    // result = {4, 13, 22, 15}  (4 + 13x + 22x^2 + 15x^3)

    for (auto& c : result)
        std::cout << c.real() << " ";
    std::cout << std::endl;
}

2D FFT

#include <math/fft/fft2d.hpp>
#include <math/core/matrix.hpp>
#include <iostream>

int main() {
    using C = std::complex<double>;

    // 4x4 matrix
    calx::Matrix<C> img(4, 4);
    for (int r = 0; r < 4; ++r)
        for (int c = 0; c < 4; ++c)
            img(r, c) = C(r + c, 0);

    // Forward 2D FFT
    auto freq = calx::fft2d(img);

    // Inverse 2D FFT
    auto recovered = calx::ifft2d(freq);

    std::cout << "Recovered (0,0): " << recovered(0, 0).real() << std::endl;
}

DCT (Discrete Cosine Transform)

#include <math/fft/fft_utils.hpp>
#include <vector>
#include <iostream>

int main() {
    std::vector<double> data = {1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0};

    auto coeffs = calx::dct<double>(data);
    auto recovered = calx::idct<double>(coeffs);

    std::cout << "DCT coefficients:";
    for (auto v : coeffs) std::cout << " " << v;
    std::cout << std::endl;

    std::cout << "Recovered:";
    for (auto v : recovered) std::cout << " " << v;
    std::cout << std::endl;
}

Hilbert Transform (Envelope Detection)

#include <math/fft/fft_utils.hpp>
#include <vector>
#include <cmath>
#include <iostream>

int main() {
    constexpr int N = 256;

    // AM signal: (1 + 0.5*cos(2pi*5*t)) * cos(2pi*50*t)
    std::vector<double> signal(N);
    for (int i = 0; i < N; ++i) {
        double t = double(i) / N;
        signal[i] = (1.0 + 0.5 * std::cos(2 * M_PI * 5 * t))
                   * std::cos(2 * M_PI * 50 * t);
    }

    // Instantaneous amplitude (envelope)
    auto envelope = calx::instantaneousAmplitude<double>(signal);

    std::cout << "Envelope[0]: " << envelope[0] << std::endl;
}

Related Mathematical Background

The following articles explain the mathematical concepts underlying the FFT module.

Fast Fourier Transform — Cooley-Tukey algorithm, complexity analysis
Fast Multiplication Algorithms — NTT, Schönhage-Strassen