Chapter 1: Frequency-Domain Derivation
1. Problem Setup
An unknown original signal $X(\omega)$ passes through a filter $H(\omega)$ and is further corrupted by additive noise $N(\omega)$, yielding the observed degraded signal $Y(\omega)$:
\begin{equation} Y(\omega) = H(\omega)X(\omega) + N(\omega) \label{eq:Y} \end{equation}We wish to apply a filter $G(\omega)$ to the degraded signal $Y(\omega)$ in order to recover the original signal $X(\omega)$. The goal is to find the filter $G(\omega)$ that minimizes the expected restoration error power spectrum:
\begin{equation} \varepsilon = E\left[|X(\omega) - G(\omega)Y(\omega)|^2\right] \to \min \label{eq:objective_freq} \end{equation}
2. Assumptions
Assumptions required for the derivation
- The signal $X$ and noise $N$ are uncorrelated: $E[NX^*] = E[N^*X] = 0$
- The power spectra $P_S(\omega)$ and $P_N(\omega)$ are known
Stationarity is commonly assumed because it guarantees that the power spectra are functions of frequency alone, enabling a fixed (time-invariant) filter to be used.
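As a sanity check, the degradation model \eqref{eq:Y} and the uncorrelatedness assumption can be simulated at a single frequency bin. The sketch below is illustrative only; the values $P_S = 2.0$, $P_N = 0.5$, and $H = 0.8 - 0.3j$ are hypothetical.

```python
import numpy as np

# Monte Carlo sketch of Y = H*X + N at one frequency bin.
# P_S, P_N, and H are hypothetical illustration values.
rng = np.random.default_rng(0)
n = 200_000
P_S, P_N = 2.0, 0.5
H = 0.8 - 0.3j

# Independent circular complex Gaussian draws with the chosen power spectra.
X = np.sqrt(P_S / 2) * (rng.standard_normal(n) + 1j * rng.standard_normal(n))
N = np.sqrt(P_N / 2) * (rng.standard_normal(n) + 1j * rng.standard_normal(n))
Y = H * X + N

print(abs(np.mean(N * np.conj(X))))  # E[N X*] ~ 0 (uncorrelated)
print(np.mean(np.abs(X) ** 2))       # ~ P_S
print(np.mean(np.abs(N) ** 2))       # ~ P_N
```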
3. Notation
- Signal power spectrum: $P_S = E[|X|^2] = E[XX^*]$
- Noise power spectrum: $P_N = E[|N|^2] = E[NN^*]$
4. Expanding the Objective Function
For brevity we drop the $(\omega)$ argument. Substituting \eqref{eq:Y} into \eqref{eq:objective_freq} and expanding:
\begin{align} \varepsilon &= E[|X - GY|^2] \nonumber\\ &= E[|X - G(HX + N)|^2] \nonumber\\ &= E[|(1 - GH)X - GN|^2] \end{align}Since $|z|^2 = zz^*$, expanding the modulus produces cross terms proportional to $E[XN^*]$ and $E[NX^*]$, which vanish by the uncorrelatedness assumption ($E[NX^*] = E[N^*X] = 0$):
\begin{align} \varepsilon &= |1-GH|^2 P_S + |G|^2 P_N \label{eq:eps_expanded} \end{align}
5. Derivation via the Orthogonality Principle
Let the estimated signal be $\hat{X} = GY$ and the estimation error be $e = X - \hat{X} = X - GY$ (written $e$ to avoid confusion with the expectation operator $E[\cdot]$). The orthogonality principle states:
Orthogonality Principle (frequency domain)
For the optimal filter $G$, the estimation error $e$ is orthogonal to (uncorrelated with) the observed signal $Y$:
\begin{equation} E[e \cdot Y^*] = 0 \label{eq:orthogonality_freq} \end{equation}Expanding \eqref{eq:orthogonality_freq}:
\begin{equation} E[(X - GY) \cdot Y^*] = 0 \end{equation}Since $G$ is not a random variable:
\begin{equation} E[XY^*] - G \cdot E[YY^*] = 0 \end{equation}Using $Y = HX + N$:
\begin{align} E[XY^*] &= E[X(HX + N)^*] = H^*E[XX^*] + E[XN^*] = H^* P_S \\ E[YY^*] &= E[(HX+N)(HX+N)^*] = |H|^2 P_S + P_N \end{align}where we used $E[XN^*] = 0$ and $E[NX^*] = 0$ (signal-noise uncorrelatedness).
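Both expectations can be checked by simulation; a minimal sketch with the same hypothetical values $P_S = 2.0$, $P_N = 0.5$, $H = 0.8 - 0.3j$:

```python
import numpy as np

# Sketch: verify E[XY*] = H* P_S and E[YY*] = |H|^2 P_S + P_N by
# Monte Carlo. P_S, P_N, and H are hypothetical illustration values.
rng = np.random.default_rng(1)
n = 200_000
P_S, P_N = 2.0, 0.5
H = 0.8 - 0.3j
X = np.sqrt(P_S / 2) * (rng.standard_normal(n) + 1j * rng.standard_normal(n))
N = np.sqrt(P_N / 2) * (rng.standard_normal(n) + 1j * rng.standard_normal(n))
Y = H * X + N

print(np.mean(X * np.conj(Y)), np.conj(H) * P_S)         # E[XY*] vs H* P_S
print(np.mean(np.abs(Y) ** 2), abs(H) ** 2 * P_S + P_N)  # E[YY*] vs |H|^2 P_S + P_N
```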
From the orthogonality condition $E[XY^*] = G \cdot E[YY^*]$:
Wiener Filter (derived via the orthogonality principle)
\begin{equation} G = \frac{H^* P_S}{|H|^2 P_S + P_N} \end{equation}
6. Derivation via Completing the Square
Expanding \eqref{eq:eps_expanded} further:
\begin{align} \varepsilon &= (|H|^2P_S + P_N)|G|^2 - P_S(GH + G^*H^*) + P_S \label{eq:eps_quadratic} \end{align}Applying the complex completing-the-square identity:
\begin{equation} a|G|^2 + bG + b^*G^* + c = a\left|G + \frac{b^*}{a}\right|^2 + c - \frac{|b|^2}{a} \label{eq:identity} \end{equation}we obtain:
\begin{equation} \varepsilon = (|H|^2P_S + P_N)\left|G - \frac{H^*P_S}{|H|^2P_S + P_N}\right|^2 + \frac{P_S P_N}{|H|^2P_S + P_N} \label{eq:eps_completed} \end{equation}
Optimal Solution
In \eqref{eq:eps_completed}, the error $\varepsilon$ is minimized when the argument of $|\cdot|^2$ vanishes:
Wiener Filter (frequency domain)
\begin{equation} G(\omega) = \frac{H^*(\omega) P_S(\omega)}{|H(\omega)|^2 P_S(\omega) + P_N(\omega)} \label{eq:wiener_freq} \end{equation}The corresponding minimum residual error power is:
\begin{equation} \varepsilon_{\min} = \frac{P_S P_N}{|H|^2 P_S + P_N} \end{equation}
7. Derivation via Wirtinger Derivative
The Wirtinger derivative provides a mechanical route to the optimal solution. Since $\varepsilon$ in \eqref{eq:eps_quadratic} is real-valued and depends on both $G$ and $G^*$, we treat $G$ and $G^*$ as formally independent variables; setting $\dfrac{\partial \varepsilon}{\partial G^*} = 0$ then yields:
\begin{equation} -H^*P_S + G|H|^2P_S + GP_N = 0 \end{equation}Solving for $G$:
\begin{equation} G = \frac{H^*P_S}{|H|^2P_S + P_N} \end{equation}This is identical to \eqref{eq:wiener_freq}.
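The agreement can also be confirmed numerically. The sketch below (hypothetical values for $H$, $P_S$, $P_N$) checks that the closed-form $G$ attains $\varepsilon_{\min}$ and that any perturbation of $G$ increases the error:

```python
import numpy as np

# Sketch: the closed-form G minimizes eps(G) = |1 - G*H|^2 P_S + |G|^2 P_N.
# P_S, P_N, and H are hypothetical illustration values.
P_S, P_N = 2.0, 0.5
H = 0.8 - 0.3j

def eps(G):
    return abs(1 - G * H) ** 2 * P_S + abs(G) ** 2 * P_N

G_opt = np.conj(H) * P_S / (abs(H) ** 2 * P_S + P_N)
eps_min = P_S * P_N / (abs(H) ** 2 * P_S + P_N)
print(eps(G_opt), eps_min)  # identical up to rounding

# Perturbing G in any complex direction increases the error.
rng = np.random.default_rng(2)
for _ in range(5):
    dG = 0.1 * (rng.standard_normal() + 1j * rng.standard_normal())
    assert eps(G_opt + dG) > eps(G_opt)
```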
Advantages of the Wirtinger derivative
- The computation is mechanical and less error-prone
- It exploits the structure of complex variables for a concise derivation
- It is the standard tool in machine learning (e.g., complex-valued neural networks)
For details on the Wirtinger derivative, see Wirtinger Derivatives (Complex Analysis, Advanced).
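To close the chapter, \eqref{eq:wiener_freq} can be applied bin-by-bin via the FFT to deconvolve a blurred, noisy 1-D signal. The sketch below is illustrative: the signal, blur kernel, and noise level are hypothetical, and $P_S$ is taken from the true signal spectrum (an oracle choice; in practice it must be estimated).

```python
import numpy as np

# Sketch: Wiener deconvolution of a blurred, noisy 1-D signal, applying
# G = H* P_S / (|H|^2 P_S + P_N) in each FFT bin. Signal, kernel, and
# noise level are hypothetical; P_S uses the true spectrum (oracle).
rng = np.random.default_rng(3)
n = 1024
t = np.arange(n)
x = np.sin(2 * np.pi * t / 64) + 0.5 * np.sin(2 * np.pi * t / 16)

# Zero-phase circular Gaussian blur kernel plus additive white noise.
kernel = np.exp(-0.5 * (np.minimum(t, n - t) / 3.0) ** 2)
kernel /= kernel.sum()
sigma = 0.05
y = np.fft.ifft(np.fft.fft(kernel) * np.fft.fft(x)).real
y += sigma * rng.standard_normal(n)

H = np.fft.fft(kernel)
P_S = np.abs(np.fft.fft(x)) ** 2   # oracle signal power spectrum
P_N = n * sigma ** 2 * np.ones(n)  # white-noise power spectrum
G = np.conj(H) * P_S / (np.abs(H) ** 2 * P_S + P_N)

x_hat = np.fft.ifft(G * np.fft.fft(y)).real
print(np.mean((y - x) ** 2), np.mean((x_hat - x) ** 2))  # restoration lowers MSE
```

Note the regularizing role of $P_N$ in the denominator: where the signal spectrum is weak, $G$ shrinks toward zero instead of amplifying noise as a naive inverse filter $1/H$ would.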