Proofs Chapter 16: Derivatives of Complex Matrices (Wirtinger Derivatives)

16.1 Wirtinger Derivatives

Formula: $\displaystyle\frac{\partial f}{\partial z} = \displaystyle\frac{1}{2}\left(\displaystyle\frac{\partial f}{\partial \Re z} - i\displaystyle\frac{\partial f}{\partial \Im z}\right)$, $\quad\displaystyle\frac{\partial f}{\partial z^*} = \displaystyle\frac{1}{2}\left(\displaystyle\frac{\partial f}{\partial \Re z} + i\displaystyle\frac{\partial f}{\partial \Im z}\right)$

Conditions: $f$ is a complex-valued function of $z$ whose partial derivatives with respect to $\Re z$ and $\Im z$ exist, $z = \Re z + i\Im z$

Proof

Decompose the complex number $z$ into its real and imaginary parts.

\begin{equation}z = x + iy \label{eq:16-1-1}\end{equation}

where $x = \Re z$ and $y = \Im z$.

From $\eqref{eq:16-1-1}$, expressing $z$ and $z^*$ in terms of $x$ and $y$:

\begin{equation}z = x + iy, \quad z^* = x - iy \label{eq:16-1-2}\end{equation}

Solving $\eqref{eq:16-1-2}$ for $x$ and $y$: adding the two equations gives

\begin{equation}z + z^* = 2x \quad \Rightarrow \quad x = \frac{z + z^*}{2} \label{eq:16-1-3}\end{equation}

Subtracting the two equations gives

\begin{equation}z - z^* = 2iy \quad \Rightarrow \quad y = \frac{z - z^*}{2i} \label{eq:16-1-4}\end{equation}

Treating $f(z)$ as $f(x, y)$ and applying the chain rule, the partial derivative of $f$ with respect to $z$ is

\begin{equation}\frac{\partial f}{\partial z} = \frac{\partial f}{\partial x}\frac{\partial x}{\partial z} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial z} \label{eq:16-1-5}\end{equation}

From $\eqref{eq:16-1-3}$, treating $z^*$ as a constant, we compute $\partial x / \partial z$:

\begin{equation}\frac{\partial x}{\partial z} = \frac{1}{2} \label{eq:16-1-6}\end{equation}

From $\eqref{eq:16-1-4}$, we compute $\partial y / \partial z$:

\begin{equation}\frac{\partial y}{\partial z} = \frac{1}{2i} = -\frac{i}{2} \label{eq:16-1-7}\end{equation}

Substituting $\eqref{eq:16-1-6}$ and $\eqref{eq:16-1-7}$ into $\eqref{eq:16-1-5}$:

\begin{equation}\frac{\partial f}{\partial z} = \frac{\partial f}{\partial x} \cdot \frac{1}{2} + \frac{\partial f}{\partial y} \cdot \left(-\frac{i}{2}\right) = \frac{1}{2}\left(\frac{\partial f}{\partial x} - i\frac{\partial f}{\partial y}\right) \label{eq:16-1-8}\end{equation}

Similarly, differentiate $f$ with respect to $z^*$:

\begin{equation}\frac{\partial f}{\partial z^*} = \frac{\partial f}{\partial x}\frac{\partial x}{\partial z^*} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial z^*} \label{eq:16-1-9}\end{equation}

From $\eqref{eq:16-1-3}$, treating $z$ as a constant, we compute $\partial x / \partial z^*$:

\begin{equation}\frac{\partial x}{\partial z^*} = \frac{1}{2} \label{eq:16-1-10}\end{equation}

From $\eqref{eq:16-1-4}$, we compute $\partial y / \partial z^*$:

\begin{equation}\frac{\partial y}{\partial z^*} = -\frac{1}{2i} = \frac{i}{2} \label{eq:16-1-11}\end{equation}

Substituting $\eqref{eq:16-1-10}$ and $\eqref{eq:16-1-11}$ into $\eqref{eq:16-1-9}$:

\begin{equation}\frac{\partial f}{\partial z^*} = \frac{\partial f}{\partial x} \cdot \frac{1}{2} + \frac{\partial f}{\partial y} \cdot \frac{i}{2} = \frac{1}{2}\left(\frac{\partial f}{\partial x} + i\frac{\partial f}{\partial y}\right) \label{eq:16-1-12}\end{equation}

Substituting $x = \Re z$ and $y = \Im z$ yields the final result:

\begin{equation}\frac{\partial f}{\partial z} = \frac{1}{2}\left(\frac{\partial f}{\partial \Re z} - i\frac{\partial f}{\partial \Im z}\right) \label{eq:16-1-13}\end{equation}

\begin{equation}\frac{\partial f}{\partial z^*} = \frac{1}{2}\left(\frac{\partial f}{\partial \Re z} + i\frac{\partial f}{\partial \Im z}\right) \label{eq:16-1-14}\end{equation}

Notes: Functions involving complex conjugates (e.g., $f(z) = z^*$) do not satisfy the Cauchy–Riemann equations and are therefore not differentiable in the classical sense. Wirtinger derivatives circumvent this problem. When $f$ is holomorphic, $\partial f/\partial z$ coincides with the ordinary complex derivative, and $\partial f/\partial z^* = 0$.
Source: W. Wirtinger (1927) "Zur formalen Theorie der Funktionen von mehr komplexen Veränderlichen", Mathematische Annalen 97, 357–375.
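
The formulas $\eqref{eq:16-1-13}$ and $\eqref{eq:16-1-14}$ can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the helper name `wirtinger`, the test point `z0`, and the step size `h` are illustrative choices, not part of the original text. It approximates the partial derivatives with respect to $\Re z$ and $\Im z$ by central differences and combines them as in the formula.

```python
import numpy as np

def wirtinger(f, z, h=1e-6):
    """Approximate (df/dz, df/dz*) from central differences in Re z and Im z."""
    df_dx = (f(z + h) - f(z - h)) / (2 * h)            # derivative w.r.t. Re z
    df_dy = (f(z + 1j * h) - f(z - 1j * h)) / (2 * h)  # derivative w.r.t. Im z
    return 0.5 * (df_dx - 1j * df_dy), 0.5 * (df_dx + 1j * df_dy)

z0 = 0.7 - 1.3j  # arbitrary test point (assumption)

# f(z) = |z|^2 = z z*: expect df/dz = z* and df/dz* = z
d_abs2, dc_abs2 = wirtinger(lambda z: abs(z) ** 2, z0)

# f(z) = z^2 (holomorphic): expect df/dz = 2z and df/dz* = 0
d_sq, dc_sq = wirtinger(lambda z: z ** 2, z0)

# f(z) = z* (not holomorphic): expect df/dz = 0 and df/dz* = 1
d_conj, dc_conj = wirtinger(np.conj, z0)

print("|z|^2:", d_abs2, dc_abs2)
print("z^2  :", d_sq, dc_sq)
print("z*   :", d_conj, dc_conj)
```

The second and third cases illustrate the remark in the Notes: for a holomorphic function the conjugate derivative vanishes, while $f(z) = z^*$ has a well-defined Wirtinger derivative even though it has no ordinary complex derivative.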

16.2 Complex Gradient Vector

Formula: $\nabla f(\boldsymbol{z}) = 2\displaystyle\frac{\partial f(\boldsymbol{z})}{\partial \boldsymbol{z}^*} = \displaystyle\frac{\partial f(\boldsymbol{z})}{\partial \Re\boldsymbol{z}} + i\displaystyle\frac{\partial f(\boldsymbol{z})}{\partial \Im\boldsymbol{z}}$

Conditions: $f(\boldsymbol{z})$ is a real-valued function, $\boldsymbol{z} \in \mathbb{C}^n$

Proof

Define the complex gradient of the real-valued function $f(\boldsymbol{z})$. Since $f$ is real-valued, $f = f^*$ holds.

From $\eqref{eq:16-1-14}$ in 16.1, the component-wise Wirtinger derivative is

\begin{equation}\frac{\partial f}{\partial z_k^*} = \frac{1}{2}\left(\frac{\partial f}{\partial x_k} + i\frac{\partial f}{\partial y_k}\right) \label{eq:16-2-1}\end{equation}

where $z_k = x_k + iy_k$ ($x_k = \Re z_k$, $y_k = \Im z_k$).

Multiplying both sides of $\eqref{eq:16-2-1}$ by 2:

\begin{equation}2\frac{\partial f}{\partial z_k^*} = \frac{\partial f}{\partial x_k} + i\frac{\partial f}{\partial y_k} \label{eq:16-2-2}\end{equation}

Writing $\eqref{eq:16-2-2}$ in vector form:

\begin{equation}2\frac{\partial f}{\partial \boldsymbol{z}^*} = \frac{\partial f}{\partial \Re\boldsymbol{z}} + i\frac{\partial f}{\partial \Im\boldsymbol{z}} \label{eq:16-2-3}\end{equation}

Define the complex gradient $\nabla f$ by the right-hand side of $\eqref{eq:16-2-3}$:

\begin{equation}\nabla f(\boldsymbol{z}) \stackrel{\text{def}}{=} \frac{\partial f}{\partial \Re\boldsymbol{z}} + i\frac{\partial f}{\partial \Im\boldsymbol{z}} = 2\frac{\partial f}{\partial \boldsymbol{z}^*} \label{eq:16-2-4}\end{equation}

We verify that this definition yields the steepest descent direction. The total differential of $f$ is

\begin{equation}df = \sum_k \left(\frac{\partial f}{\partial x_k}dx_k + \frac{\partial f}{\partial y_k}dy_k\right) \label{eq:16-2-5}\end{equation}

From $dz_k = dx_k + idy_k$ and $dz_k^* = dx_k - idy_k$:

\begin{equation}dx_k = \frac{dz_k + dz_k^*}{2}, \quad dy_k = \frac{dz_k - dz_k^*}{2i} \label{eq:16-2-6}\end{equation}

Substituting $\eqref{eq:16-2-6}$ into $\eqref{eq:16-2-5}$ and simplifying:

\begin{equation}df = \sum_k \left(\frac{\partial f}{\partial z_k}dz_k + \frac{\partial f}{\partial z_k^*}dz_k^*\right) \label{eq:16-2-7}\end{equation}

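When $f$ is real-valued, $\partial f/\partial z_k = (\partial f/\partial z_k^*)^*$ holds, so the two terms of each summand in $\eqref{eq:16-2-7}$ are complex conjugates of each other and $df = 2\Re\sum_k (\partial f/\partial z_k^*)^*\,dz_k = \Re\left((\nabla f)^H d\boldsymbol{z}\right)$. By the Cauchy–Schwarz inequality, for a fixed step length $\|d\boldsymbol{z}\|$ this is minimized when $d\boldsymbol{z}$ is a negative multiple of $\nabla f$. Hence the steepest descent direction is $-\nabla f = -2\partial f/\partial \boldsymbol{z}^*$.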

Notes: With this definition, $-\nabla f$ is the steepest descent direction, exactly as in the real case. This is used in the optimization of complex neural networks and adaptive filters.
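
The steepest-descent claim can be illustrated with a small experiment. This is a sketch under stated assumptions: the test function $f(\boldsymbol{z}) = \|\boldsymbol{z} - \boldsymbol{c}\|^2$, the vector `c`, the step length `eps`, and the helper `complex_gradient_fd` are all hypothetical choices made for illustration. For this $f$, $\partial f/\partial\boldsymbol{z}^* = \boldsymbol{z} - \boldsymbol{c}$, so the complex gradient should be $2(\boldsymbol{z} - \boldsymbol{c})$.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
c = rng.standard_normal(n) + 1j * rng.standard_normal(n)  # hypothetical target
z = rng.standard_normal(n) + 1j * rng.standard_normal(n)  # current point

def f(z):
    """Real-valued test function f(z) = ||z - c||^2."""
    return np.sum(np.abs(z - c) ** 2)

def complex_gradient_fd(f, z, h=1e-6):
    """Assemble df/dRe(z) + i df/dIm(z) from central differences."""
    g = np.zeros(z.shape, dtype=complex)
    for k in range(z.size):
        e = np.zeros(z.shape, dtype=complex)
        e[k] = 1.0
        d_re = (f(z + h * e) - f(z - h * e)) / (2 * h)
        d_im = (f(z + 1j * h * e) - f(z - 1j * h * e)) / (2 * h)
        g[k] = d_re + 1j * d_im
    return g

grad_fd = complex_gradient_fd(f, z)
grad_exact = 2 * (z - c)  # 2 * df/dz* for this particular f

# Compare the change in f along -grad with 200 random unit directions.
eps = 1e-3
d_star = -grad_exact / np.linalg.norm(grad_exact)
decrease_star = f(z + eps * d_star) - f(z)
dirs = rng.standard_normal((200, n)) + 1j * rng.standard_normal((200, n))
dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
decrease_rand = min(f(z + eps * d) - f(z) for d in dirs)

print(np.allclose(grad_fd, grad_exact, atol=1e-5))
print(decrease_star, decrease_rand)
```

No random unit direction should decrease $f$ more than the normalized $-\nabla f$ does, which is what the comparison of `decrease_star` and `decrease_rand` probes.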

16.3 Chain Rule for Complex Derivatives

Formula: $\displaystyle\frac{\partial g}{\partial z} = \displaystyle\frac{\partial g}{\partial f}\displaystyle\frac{\partial f}{\partial z} + \displaystyle\frac{\partial g}{\partial f^*}\displaystyle\frac{\partial f^*}{\partial z}$, $\quad\displaystyle\frac{\partial g}{\partial z^*} = \displaystyle\frac{\partial g}{\partial f}\displaystyle\frac{\partial f}{\partial z^*} + \displaystyle\frac{\partial g}{\partial f^*}\displaystyle\frac{\partial f^*}{\partial z^*}$

Conditions: $g(f(z))$ is a composite function

Proof

Consider the composite function $h(z) = g(f(z), f^*(z))$. In the Wirtinger framework, $f$ and $f^*$ are treated as independent variables.

Write $h$ in terms of real and imaginary parts. Setting $z = x + iy$ and $f = u + iv$:

\begin{equation}h = h(x, y), \quad f = f(x, y) = u(x, y) + iv(x, y) \label{eq:16-3-1}\end{equation}

The Wirtinger derivative of $h$ with respect to $z$, from $\eqref{eq:16-1-13}$ in 16.1, is

\begin{equation}\frac{\partial h}{\partial z} = \frac{1}{2}\left(\frac{\partial h}{\partial x} - i\frac{\partial h}{\partial y}\right) \label{eq:16-3-2}\end{equation}

Since $h$ depends on $x$ and $y$ through $f$ and $f^*$, applying the chain rule:

\begin{equation}\frac{\partial h}{\partial x} = \frac{\partial g}{\partial f}\frac{\partial f}{\partial x} + \frac{\partial g}{\partial f^*}\frac{\partial f^*}{\partial x} \label{eq:16-3-3}\end{equation}

\begin{equation}\frac{\partial h}{\partial y} = \frac{\partial g}{\partial f}\frac{\partial f}{\partial y} + \frac{\partial g}{\partial f^*}\frac{\partial f^*}{\partial y} \label{eq:16-3-4}\end{equation}

Substituting $\eqref{eq:16-3-3}$ and $\eqref{eq:16-3-4}$ into $\eqref{eq:16-3-2}$:

\begin{equation}\frac{\partial h}{\partial z} = \frac{1}{2}\left[\frac{\partial g}{\partial f}\left(\frac{\partial f}{\partial x} - i\frac{\partial f}{\partial y}\right) + \frac{\partial g}{\partial f^*}\left(\frac{\partial f^*}{\partial x} - i\frac{\partial f^*}{\partial y}\right)\right] \label{eq:16-3-5}\end{equation}

From $\eqref{eq:16-1-13}$ in 16.1:

\begin{equation}\frac{\partial f}{\partial z} = \frac{1}{2}\left(\frac{\partial f}{\partial x} - i\frac{\partial f}{\partial y}\right) \label{eq:16-3-6}\end{equation}

\begin{equation}\frac{\partial f^*}{\partial z} = \frac{1}{2}\left(\frac{\partial f^*}{\partial x} - i\frac{\partial f^*}{\partial y}\right) \label{eq:16-3-7}\end{equation}

Substituting $\eqref{eq:16-3-6}$ and $\eqref{eq:16-3-7}$ into $\eqref{eq:16-3-5}$:

\begin{equation}\frac{\partial h}{\partial z} = \frac{\partial g}{\partial f}\frac{\partial f}{\partial z} + \frac{\partial g}{\partial f^*}\frac{\partial f^*}{\partial z} \label{eq:16-3-8}\end{equation}

Similarly, compute the Wirtinger derivative of $h$ with respect to $z^*$. From $\eqref{eq:16-1-14}$ in 16.1:

\begin{equation}\frac{\partial h}{\partial z^*} = \frac{1}{2}\left(\frac{\partial h}{\partial x} + i\frac{\partial h}{\partial y}\right) \label{eq:16-3-9}\end{equation}

Substituting $\eqref{eq:16-3-3}$ and $\eqref{eq:16-3-4}$ into $\eqref{eq:16-3-9}$:

\begin{equation}\frac{\partial h}{\partial z^*} = \frac{1}{2}\left[\frac{\partial g}{\partial f}\left(\frac{\partial f}{\partial x} + i\frac{\partial f}{\partial y}\right) + \frac{\partial g}{\partial f^*}\left(\frac{\partial f^*}{\partial x} + i\frac{\partial f^*}{\partial y}\right)\right] \label{eq:16-3-10}\end{equation}

From $\eqref{eq:16-1-14}$ in 16.1:

\begin{equation}\frac{\partial f}{\partial z^*} = \frac{1}{2}\left(\frac{\partial f}{\partial x} + i\frac{\partial f}{\partial y}\right) \label{eq:16-3-11}\end{equation}

\begin{equation}\frac{\partial f^*}{\partial z^*} = \frac{1}{2}\left(\frac{\partial f^*}{\partial x} + i\frac{\partial f^*}{\partial y}\right) \label{eq:16-3-12}\end{equation}

Substituting $\eqref{eq:16-3-11}$ and $\eqref{eq:16-3-12}$ into $\eqref{eq:16-3-10}$:

\begin{equation}\frac{\partial h}{\partial z^*} = \frac{\partial g}{\partial f}\frac{\partial f}{\partial z^*} + \frac{\partial g}{\partial f^*}\frac{\partial f^*}{\partial z^*} \label{eq:16-3-13}\end{equation}

Notes: Unlike the real chain rule, one must account for both paths through $f$ and $f^*$. When $f$ is holomorphic, $\partial f/\partial z^* = 0$ and $\partial f^*/\partial z = 0$, and the formula reduces to the ordinary chain rule.
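
As a worked example (an illustrative choice of functions, not taken from the original text), let $f(z) = z^2 + z^*$ and $g(f, f^*) = f f^*$, so that $f^* = (z^*)^2 + z$. The four factors in $\eqref{eq:16-3-8}$ are

\begin{equation*}\frac{\partial g}{\partial f} = f^*, \quad \frac{\partial g}{\partial f^*} = f, \quad \frac{\partial f}{\partial z} = 2z, \quad \frac{\partial f^*}{\partial z} = 1\end{equation*}

and the chain rule gives

\begin{equation*}\frac{\partial h}{\partial z} = f^* \cdot 2z + f \cdot 1 = 2z\left((z^*)^2 + z\right) + z^2 + z^* = 2z(z^*)^2 + 3z^2 + z^*\end{equation*}

Expanding the composite directly, $h = (z^2 + z^*)\left((z^*)^2 + z\right) = z^2(z^*)^2 + z^3 + (z^*)^3 + zz^*$, and differentiating with respect to $z$ while holding $z^*$ fixed yields the same expression:

\begin{equation*}\frac{\partial h}{\partial z} = 2z(z^*)^2 + 3z^2 + z^*\end{equation*}

Dropping the path through $f^*$ would lose the contribution $f \cdot 1 = z^2 + z^*$, which is why both paths are needed whenever $f$ is not holomorphic.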

16.4 Derivative of $\text{Tr}(\boldsymbol{X}^*)$

Formula: $\displaystyle\frac{\partial \text{Tr}(\boldsymbol{X}^*)}{\partial \Re\boldsymbol{X}} = \boldsymbol{I}$, $\quad\displaystyle\frac{\partial \text{Tr}(\boldsymbol{X}^*)}{\partial \Im\boldsymbol{X}} = -i\boldsymbol{I}$

Conditions: $\boldsymbol{X} \in \mathbb{C}^{n \times n}$, $\boldsymbol{X}^* = \Re\boldsymbol{X} - i\Im\boldsymbol{X}$ (element-wise complex conjugate)

Proof

Decompose the entries of $\boldsymbol{X}$ into real and imaginary parts:

\begin{equation}X_{ij} = (\Re X)_{ij} + i(\Im X)_{ij} \label{eq:16-4-1}\end{equation}

The complex conjugate is

\begin{equation}X_{ij}^* = (\Re X)_{ij} - i(\Im X)_{ij} \label{eq:16-4-2}\end{equation}

By the definition of the trace:

\begin{equation}\text{Tr}(\boldsymbol{X}^*) = \sum_{i=0}^{n-1} X_{ii}^* \label{eq:16-4-3}\end{equation}

Substituting $\eqref{eq:16-4-2}$ into $\eqref{eq:16-4-3}$:

\begin{equation}\text{Tr}(\boldsymbol{X}^*) = \sum_{i=0}^{n-1} \left[(\Re X)_{ii} - i(\Im X)_{ii}\right] \label{eq:16-4-4}\end{equation}

Differentiating $\eqref{eq:16-4-4}$ with respect to the real part: for the $(k, l)$ entry,

\begin{equation}\frac{\partial \text{Tr}(\boldsymbol{X}^*)}{\partial (\Re X)_{kl}} = \frac{\partial}{\partial (\Re X)_{kl}} \sum_{i=0}^{n-1} (\Re X)_{ii} = \delta_{kl} \label{eq:16-4-5}\end{equation}

where $\delta_{kl}$ is the Kronecker delta.

Writing $\eqref{eq:16-4-5}$ in matrix form:

\begin{equation}\frac{\partial \text{Tr}(\boldsymbol{X}^*)}{\partial \Re\boldsymbol{X}} = \boldsymbol{I} \label{eq:16-4-6}\end{equation}

Similarly, differentiating $\eqref{eq:16-4-4}$ with respect to the imaginary part:

\begin{equation}\frac{\partial \text{Tr}(\boldsymbol{X}^*)}{\partial (\Im X)_{kl}} = \frac{\partial}{\partial (\Im X)_{kl}} \sum_{i=0}^{n-1} (-i)(\Im X)_{ii} = -i\delta_{kl} \label{eq:16-4-7}\end{equation}

Writing $\eqref{eq:16-4-7}$ in matrix form:

\begin{equation}\frac{\partial \text{Tr}(\boldsymbol{X}^*)}{\partial \Im\boldsymbol{X}} = -i\boldsymbol{I} \label{eq:16-4-8}\end{equation}

Multiplying both sides of $\eqref{eq:16-4-8}$ by $i$:

\begin{equation}i \cdot \frac{\partial \text{Tr}(\boldsymbol{X}^*)}{\partial \Im\boldsymbol{X}} = i \cdot (-i)\boldsymbol{I} = \boldsymbol{I} \label{eq:16-4-9}\end{equation}

From $\eqref{eq:16-4-6}$ and $\eqref{eq:16-4-9}$, the real-part and imaginary-part derivatives (after multiplication by $i$) have the same sign, giving $\boldsymbol{I}$ in both cases.

Notes: Since $\boldsymbol{X}^H = (\boldsymbol{X}^*)^\top$, we have $\text{Tr}(\boldsymbol{X}^H) = \text{Tr}(\boldsymbol{X}^*)$, so the same result holds.

16.5 Derivative of $\text{Tr}(\boldsymbol{X})$

Formula: $\displaystyle\frac{\partial \text{Tr}(\boldsymbol{X})}{\partial \Re\boldsymbol{X}} = \boldsymbol{I}$, $\quad\displaystyle\frac{\partial \text{Tr}(\boldsymbol{X})}{\partial \Im\boldsymbol{X}} = i\boldsymbol{I}$

Conditions: $\boldsymbol{X} \in \mathbb{C}^{n \times n}$

Proof

Decompose the entries of $\boldsymbol{X}$ into real and imaginary parts:

\begin{equation}X_{ij} = (\Re X)_{ij} + i(\Im X)_{ij} \label{eq:16-5-1}\end{equation}

By the definition of the trace:

\begin{equation}\text{Tr}(\boldsymbol{X}) = \sum_{i=0}^{n-1} X_{ii} \label{eq:16-5-2}\end{equation}

Substituting $\eqref{eq:16-5-1}$ into $\eqref{eq:16-5-2}$:

\begin{equation}\text{Tr}(\boldsymbol{X}) = \sum_{i=0}^{n-1} \left[(\Re X)_{ii} + i(\Im X)_{ii}\right] \label{eq:16-5-3}\end{equation}

Differentiating $\eqref{eq:16-5-3}$ with respect to the real part: for the $(k, l)$ entry,

\begin{equation}\frac{\partial \text{Tr}(\boldsymbol{X})}{\partial (\Re X)_{kl}} = \frac{\partial}{\partial (\Re X)_{kl}} \sum_{i=0}^{n-1} (\Re X)_{ii} = \delta_{kl} \label{eq:16-5-4}\end{equation}

Writing $\eqref{eq:16-5-4}$ in matrix form:

\begin{equation}\frac{\partial \text{Tr}(\boldsymbol{X})}{\partial \Re\boldsymbol{X}} = \boldsymbol{I} \label{eq:16-5-5}\end{equation}

Similarly, differentiating $\eqref{eq:16-5-3}$ with respect to the imaginary part:

\begin{equation}\frac{\partial \text{Tr}(\boldsymbol{X})}{\partial (\Im X)_{kl}} = \frac{\partial}{\partial (\Im X)_{kl}} \sum_{i=0}^{n-1} i(\Im X)_{ii} = i\delta_{kl} \label{eq:16-5-6}\end{equation}

Writing $\eqref{eq:16-5-6}$ in matrix form:

\begin{equation}\frac{\partial \text{Tr}(\boldsymbol{X})}{\partial \Im\boldsymbol{X}} = i\boldsymbol{I} \label{eq:16-5-7}\end{equation}

Multiplying both sides of $\eqref{eq:16-5-7}$ by $i$:

\begin{equation}i \cdot \frac{\partial \text{Tr}(\boldsymbol{X})}{\partial \Im\boldsymbol{X}} = i \cdot i\boldsymbol{I} = -\boldsymbol{I} \label{eq:16-5-8}\end{equation}

Comparing $\eqref{eq:16-5-5}$ and $\eqref{eq:16-5-8}$, the real-part and imaginary-part derivatives (after multiplication by $i$) have opposite signs. This contrasts with the case of $\text{Tr}(\boldsymbol{X}^*)$ in 16.4.

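Notes: The sign of the imaginary-part derivative differs between $\text{Tr}(\boldsymbol{X}^*)$ and $\text{Tr}(\boldsymbol{X})$. Combining the two parts via $\eqref{eq:16-1-13}$ and $\eqref{eq:16-1-14}$ of 16.1 (applied entrywise), this sign decides which Wirtinger derivative survives: $\partial\text{Tr}(\boldsymbol{X})/\partial\boldsymbol{X} = \boldsymbol{I}$ and $\partial\text{Tr}(\boldsymbol{X})/\partial\boldsymbol{X}^* = \boldsymbol{0}$, whereas $\partial\text{Tr}(\boldsymbol{X}^*)/\partial\boldsymbol{X} = \boldsymbol{0}$ and $\partial\text{Tr}(\boldsymbol{X}^*)/\partial\boldsymbol{X}^* = \boldsymbol{I}$.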
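
The results of 16.4 and 16.5 can be compared side by side numerically. The sketch below assumes NumPy; the helper `part_derivatives`, the matrix size, and the random seed are illustrative assumptions. It estimates the derivatives with respect to $\Re\boldsymbol{X}$ and $\Im\boldsymbol{X}$ entry by entry and then combines them into Wirtinger derivatives following 16.1.

```python
import numpy as np

def part_derivatives(f, X, h=1e-6):
    """Central-difference derivatives of f w.r.t. Re X and Im X, entry by entry."""
    d_re = np.zeros(X.shape, dtype=complex)
    d_im = np.zeros(X.shape, dtype=complex)
    for idx in np.ndindex(*X.shape):
        E = np.zeros(X.shape, dtype=complex)
        E[idx] = 1.0
        d_re[idx] = (f(X + h * E) - f(X - h * E)) / (2 * h)
        d_im[idx] = (f(X + 1j * h * E) - f(X - 1j * h * E)) / (2 * h)
    return d_re, d_im

rng = np.random.default_rng(1)
n = 3
X = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
I = np.eye(n)

re_tr, im_tr = part_derivatives(np.trace, X)                        # Tr(X)
re_trc, im_trc = part_derivatives(lambda M: np.trace(M.conj()), X)  # Tr(X*)

# Combine the parts into Wirtinger derivatives as in 16.1
dX_tr, dXc_tr = 0.5 * (re_tr - 1j * im_tr), 0.5 * (re_tr + 1j * im_tr)
dX_trc, dXc_trc = 0.5 * (re_trc - 1j * im_trc), 0.5 * (re_trc + 1j * im_trc)

print(np.allclose(im_tr, 1j * I), np.allclose(im_trc, -1j * I))
print(np.allclose(dX_tr, I), np.allclose(dXc_tr, 0))
print(np.allclose(dX_trc, 0), np.allclose(dXc_trc, I))
```

The opposite signs of the imaginary-part derivatives are what make $\text{Tr}(\boldsymbol{X})$ depend only on $\boldsymbol{X}$ and $\text{Tr}(\boldsymbol{X}^*)$ only on $\boldsymbol{X}^*$ in the Wirtinger sense.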

16.6 Derivative of $\text{Tr}(\boldsymbol{A}\boldsymbol{X}^H)$

Formula: $\displaystyle\frac{\partial \text{Tr}(\boldsymbol{A}\boldsymbol{X}^H)}{\partial \Re\boldsymbol{X}} = \boldsymbol{A}$, $\quad i\displaystyle\frac{\partial \text{Tr}(\boldsymbol{A}\boldsymbol{X}^H)}{\partial \Im\boldsymbol{X}} = \boldsymbol{A}$

Conditions: $\boldsymbol{A}$ is a constant matrix, $\boldsymbol{X}^H = (\boldsymbol{X}^*)^\top$ (Hermitian transpose)

Proof

By the definition of the Hermitian transpose:

\begin{equation}(\boldsymbol{X}^H)_{ij} = X_{ji}^* = (\Re X)_{ji} - i(\Im X)_{ji} \label{eq:16-6-1}\end{equation}

Expanding the trace in terms of entries:

\begin{equation}\text{Tr}(\boldsymbol{A}\boldsymbol{X}^H) = \sum_{i,j} A_{ij} (\boldsymbol{X}^H)_{ji} = \sum_{i,j} A_{ij} X_{ij}^* \label{eq:16-6-2}\end{equation}

Rewriting $\eqref{eq:16-6-2}$ using the form in $\eqref{eq:16-6-1}$:

\begin{equation}\text{Tr}(\boldsymbol{A}\boldsymbol{X}^H) = \sum_{i,j} A_{ij} \left[(\Re X)_{ij} - i(\Im X)_{ij}\right] \label{eq:16-6-3}\end{equation}

Differentiating $\eqref{eq:16-6-3}$ with respect to the real part: for the $(k, l)$ entry,

\begin{equation}\frac{\partial \text{Tr}(\boldsymbol{A}\boldsymbol{X}^H)}{\partial (\Re X)_{kl}} = \frac{\partial}{\partial (\Re X)_{kl}} \sum_{i,j} A_{ij} (\Re X)_{ij} = A_{kl} \label{eq:16-6-4}\end{equation}

Identifying the left-hand side of $\eqref{eq:16-6-4}$ with the $(k, l)$ entry of the derivative matrix:

\begin{equation}\left(\frac{\partial \text{Tr}(\boldsymbol{A}\boldsymbol{X}^H)}{\partial \Re\boldsymbol{X}}\right)_{kl} = A_{kl} \label{eq:16-6-5}\end{equation}

Since $\eqref{eq:16-6-5}$ is simply the $(k,l)$ entry of $\boldsymbol{A}$:

\begin{equation}\frac{\partial \text{Tr}(\boldsymbol{A}\boldsymbol{X}^H)}{\partial \Re\boldsymbol{X}} = \boldsymbol{A} \label{eq:16-6-6}\end{equation}

Similarly, differentiating $\eqref{eq:16-6-3}$ with respect to the imaginary part:

\begin{equation}\frac{\partial \text{Tr}(\boldsymbol{A}\boldsymbol{X}^H)}{\partial (\Im X)_{kl}} = \frac{\partial}{\partial (\Im X)_{kl}} \sum_{i,j} A_{ij} (-i)(\Im X)_{ij} = -iA_{kl} \label{eq:16-6-7}\end{equation}

Writing $\eqref{eq:16-6-7}$ in matrix form:

\begin{equation}\frac{\partial \text{Tr}(\boldsymbol{A}\boldsymbol{X}^H)}{\partial \Im\boldsymbol{X}} = -i\boldsymbol{A} \label{eq:16-6-8}\end{equation}

Multiplying both sides of $\eqref{eq:16-6-8}$ by $i$:

\begin{equation}i \cdot \frac{\partial \text{Tr}(\boldsymbol{A}\boldsymbol{X}^H)}{\partial \Im\boldsymbol{X}} = i \cdot (-i)\boldsymbol{A} = \boldsymbol{A} \label{eq:16-6-9}\end{equation}

From $\eqref{eq:16-6-6}$ and $\eqref{eq:16-6-9}$, the real-part and imaginary-part derivatives (after multiplication by $i$) both give $\boldsymbol{A}$.

Notes: The transpose-free result follows directly from the expansion $\text{Tr}(\boldsymbol{A}\boldsymbol{X}^H) = \sum_{i,j} A_{ij} X^*_{ij}$: differentiating with respect to $(\Re X)_{kl}$ picks out $A_{kl}$. The real analogue behaves the same way, $\partial\text{Tr}(\boldsymbol{A}\boldsymbol{X}^\top)/\partial\boldsymbol{X} = \boldsymbol{A}$. A transpose appears only when $\boldsymbol{X}$ enters untransposed, as in $\partial\text{Tr}(\boldsymbol{A}\boldsymbol{X})/\partial\boldsymbol{X} = \boldsymbol{A}^\top$ for real $\boldsymbol{X}$.
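
Because $\text{Tr}(\boldsymbol{A}\boldsymbol{X}^H)$ is real-linear in $\boldsymbol{X}$, its derivative with respect to $(\Re X)_{kl}$ is simply the function evaluated at the unit matrix $\boldsymbol{E}_{kl}$, and its derivative with respect to $(\Im X)_{kl}$ is the function evaluated at $i\boldsymbol{E}_{kl}$. The following sketch uses this observation to check $\eqref{eq:16-6-6}$ and $\eqref{eq:16-6-9}$ without finite differences; the sizes `m`, `n` and the random matrix `A` are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
m, n = 3, 4
A = rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))

def f(X):
    """f(X) = Tr(A X^H), with X of the same shape as A."""
    return np.trace(A @ X.conj().T)

# f is real-linear in X, so evaluating it on unit matrices gives the
# derivatives w.r.t. the real and imaginary parts exactly.
d_re = np.zeros((m, n), dtype=complex)
d_im = np.zeros((m, n), dtype=complex)
for k in range(m):
    for l in range(n):
        E = np.zeros((m, n), dtype=complex)
        E[k, l] = 1.0
        d_re[k, l] = f(E)        # derivative w.r.t. (Re X)_{kl}
        d_im[k, l] = f(1j * E)   # derivative w.r.t. (Im X)_{kl}

print(np.allclose(d_re, A))       # real-part derivative vs. A
print(np.allclose(1j * d_im, A))  # i times imaginary-part derivative vs. A
```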

16.7 Derivative of $\text{Tr}(\boldsymbol{A}\boldsymbol{X}^*)$

Formula: $\displaystyle\frac{\partial \text{Tr}(\boldsymbol{A}\boldsymbol{X}^*)}{\partial \Re\boldsymbol{X}} = \boldsymbol{A}^\top$, $\quad i\displaystyle\frac{\partial \text{Tr}(\boldsymbol{A}\boldsymbol{X}^*)}{\partial \Im\boldsymbol{X}} = \boldsymbol{A}^\top$

Conditions: $\boldsymbol{A}$ is a constant matrix, $\boldsymbol{X}^*$ (element-wise complex conjugate)

Proof

Expanding the trace in terms of entries:

\begin{equation}\text{Tr}(\boldsymbol{A}\boldsymbol{X}^*) = \sum_{i,j} A_{ij} (\boldsymbol{X}^*)_{ji} = \sum_{i,j} A_{ij} X_{ji}^* \label{eq:16-7-1}\end{equation}

Writing the complex conjugate in terms of real and imaginary parts:

\begin{equation}X_{ji}^* = (\Re X)_{ji} - i(\Im X)_{ji} \label{eq:16-7-2}\end{equation}

Substituting $\eqref{eq:16-7-2}$ into $\eqref{eq:16-7-1}$:

\begin{equation}\text{Tr}(\boldsymbol{A}\boldsymbol{X}^*) = \sum_{i,j} A_{ij} \left[(\Re X)_{ji} - i(\Im X)_{ji}\right] \label{eq:16-7-3}\end{equation}

Differentiating $\eqref{eq:16-7-3}$ with respect to the real part: for the $(k, l)$ entry,

\begin{equation}\frac{\partial \text{Tr}(\boldsymbol{A}\boldsymbol{X}^*)}{\partial (\Re X)_{kl}} = \frac{\partial}{\partial (\Re X)_{kl}} \sum_{i,j} A_{ij} (\Re X)_{ji} = A_{lk} \label{eq:16-7-4}\end{equation}

Here, differentiating $(\Re X)_{ji}$ with respect to $(\Re X)_{kl}$ yields $\delta_{jk}\delta_{il}$, which upon substitution gives $A_{lk}$.

Writing $\eqref{eq:16-7-4}$ in matrix form, using $(\boldsymbol{A}^\top)_{kl} = A_{lk}$:

\begin{equation}\frac{\partial \text{Tr}(\boldsymbol{A}\boldsymbol{X}^*)}{\partial \Re\boldsymbol{X}} = \boldsymbol{A}^\top \label{eq:16-7-5}\end{equation}

Similarly, differentiating $\eqref{eq:16-7-3}$ with respect to the imaginary part:

\begin{equation}\frac{\partial \text{Tr}(\boldsymbol{A}\boldsymbol{X}^*)}{\partial (\Im X)_{kl}} = \frac{\partial}{\partial (\Im X)_{kl}} \sum_{i,j} A_{ij} (-i)(\Im X)_{ji} = -iA_{lk} \label{eq:16-7-6}\end{equation}

Writing $\eqref{eq:16-7-6}$ in matrix form:

\begin{equation}\frac{\partial \text{Tr}(\boldsymbol{A}\boldsymbol{X}^*)}{\partial \Im\boldsymbol{X}} = -i\boldsymbol{A}^\top \label{eq:16-7-7}\end{equation}

Multiplying both sides of $\eqref{eq:16-7-7}$ by $i$:

\begin{equation}i \cdot \frac{\partial \text{Tr}(\boldsymbol{A}\boldsymbol{X}^*)}{\partial \Im\boldsymbol{X}} = \boldsymbol{A}^\top \label{eq:16-7-8}\end{equation}

From $\eqref{eq:16-7-5}$ and $\eqref{eq:16-7-8}$, both derivatives give $\boldsymbol{A}^\top$, i.e., the result of 16.6 with $\boldsymbol{A}$ replaced by $\boldsymbol{A}^\top$.

Notes: The result is the same as for $\text{Tr}(\boldsymbol{A}\boldsymbol{X}^H)$ but with a transpose. This can be understood from the identity $\text{Tr}(\boldsymbol{A}\boldsymbol{X}^*) = \text{Tr}((\boldsymbol{A}\boldsymbol{X}^*)^\top) = \text{Tr}(\boldsymbol{X}^H\boldsymbol{A}^\top) = \text{Tr}(\boldsymbol{A}^\top\boldsymbol{X}^H)$, to which 16.6 applies with $\boldsymbol{A}$ replaced by $\boldsymbol{A}^\top$.
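
The entrywise expansion $\eqref{eq:16-7-1}$ and the resulting derivative can be checked with a short sketch. The shapes are assumptions chosen so that the product is square ($\boldsymbol{A}$ is $n \times m$ and $\boldsymbol{X}$ is $m \times n$). The code also evaluates $\text{Tr}(\boldsymbol{A}^\top\boldsymbol{X}^H)$, which has the same entrywise expansion $\sum_{i,j} A_{ij} X_{ji}^*$. As in any real-linear function, evaluating on unit matrices gives the derivatives exactly.

```python
import numpy as np

rng = np.random.default_rng(3)
m, n = 3, 4
A = rng.standard_normal((n, m)) + 1j * rng.standard_normal((n, m))  # n x m
X = rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))  # m x n

def f(X):
    """f(X) = Tr(A X*), with X* the element-wise complex conjugate."""
    return np.trace(A @ X.conj())

# Entrywise form sum_{i,j} A_ij X*_ji, and the equivalent form Tr(A^T X^H)
entrywise = np.einsum("ij,ji->", A, X.conj())
via_hermitian = np.trace(A.T @ X.conj().T)

# f is real-linear in X: evaluate on unit matrices to get the derivatives
d_re = np.zeros((m, n), dtype=complex)
d_im = np.zeros((m, n), dtype=complex)
for k in range(m):
    for l in range(n):
        E = np.zeros((m, n), dtype=complex)
        E[k, l] = 1.0
        d_re[k, l] = f(E)        # derivative w.r.t. (Re X)_{kl}
        d_im[k, l] = f(1j * E)   # derivative w.r.t. (Im X)_{kl}

print(np.isclose(f(X), entrywise), np.isclose(f(X), via_hermitian))
print(np.allclose(d_re, A.T), np.allclose(1j * d_im, A.T))
```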

16.8 Derivative of $\text{Tr}(\boldsymbol{X}\boldsymbol{X}^H)$

Formula: $\displaystyle\frac{\partial \text{Tr}(\boldsymbol{X}\boldsymbol{X}^H)}{\partial \Re\boldsymbol{X}} = 2\Re\boldsymbol{X}$, $\quad\displaystyle\frac{\partial \text{Tr}(\boldsymbol{X}\boldsymbol{X}^H)}{\partial \Im\boldsymbol{X}} = 2\Im\boldsymbol{X}$

Conditions: $\boldsymbol{X} \in \mathbb{C}^{m \times n}$

Proof

By the cyclic property of the trace (1.12):

\begin{equation}\text{Tr}(\boldsymbol{X}\boldsymbol{X}^H) = \text{Tr}(\boldsymbol{X}^H\boldsymbol{X}) \label{eq:16-8-1}\end{equation}

Expanding $\eqref{eq:16-8-1}$ in terms of entries:

\begin{equation}\text{Tr}(\boldsymbol{X}\boldsymbol{X}^H) = \sum_{i,j} X_{ij} X_{ij}^* = \sum_{i,j} |X_{ij}|^2 \label{eq:16-8-2}\end{equation}

$\eqref{eq:16-8-2}$ is the squared Frobenius norm:

\begin{equation}\text{Tr}(\boldsymbol{X}\boldsymbol{X}^H) = \|\boldsymbol{X}\|_F^2 \label{eq:16-8-3}\end{equation}

Writing the squared absolute value of a complex number in terms of real and imaginary parts:

\begin{equation}|X_{ij}|^2 = (\Re X_{ij})^2 + (\Im X_{ij})^2 \label{eq:16-8-4}\end{equation}

Substituting $\eqref{eq:16-8-4}$ into $\eqref{eq:16-8-2}$:

\begin{equation}\text{Tr}(\boldsymbol{X}\boldsymbol{X}^H) = \sum_{i,j} \left[(\Re X_{ij})^2 + (\Im X_{ij})^2\right] \label{eq:16-8-5}\end{equation}

Differentiating $\eqref{eq:16-8-5}$ with respect to the real part: for the $(k, l)$ entry,

\begin{equation}\frac{\partial \text{Tr}(\boldsymbol{X}\boldsymbol{X}^H)}{\partial (\Re X)_{kl}} = \frac{\partial}{\partial (\Re X)_{kl}} \sum_{i,j} (\Re X_{ij})^2 = 2(\Re X)_{kl} \label{eq:16-8-6}\end{equation}

Writing $\eqref{eq:16-8-6}$ in matrix form:

\begin{equation}\frac{\partial \text{Tr}(\boldsymbol{X}\boldsymbol{X}^H)}{\partial \Re\boldsymbol{X}} = 2\Re\boldsymbol{X} \label{eq:16-8-7}\end{equation}

Similarly, differentiating $\eqref{eq:16-8-5}$ with respect to the imaginary part:

\begin{equation}\frac{\partial \text{Tr}(\boldsymbol{X}\boldsymbol{X}^H)}{\partial (\Im X)_{kl}} = \frac{\partial}{\partial (\Im X)_{kl}} \sum_{i,j} (\Im X_{ij})^2 = 2(\Im X)_{kl} \label{eq:16-8-8}\end{equation}

Writing $\eqref{eq:16-8-8}$ in matrix form:

\begin{equation}\frac{\partial \text{Tr}(\boldsymbol{X}\boldsymbol{X}^H)}{\partial \Im\boldsymbol{X}} = 2\Im\boldsymbol{X} \label{eq:16-8-9}\end{equation}

Notes: $\text{Tr}(\boldsymbol{X}\boldsymbol{X}^H) = \|\boldsymbol{X}\|_F^2$ is the squared Frobenius norm. This derivative appears frequently in matrix least-squares problems.

16.9 Wirtinger Derivative of $\text{Tr}(\boldsymbol{X}\boldsymbol{X}^H)$

Formula: $\displaystyle\frac{\partial \text{Tr}(\boldsymbol{X}\boldsymbol{X}^H)}{\partial \boldsymbol{X}} = \boldsymbol{X}^*$, $\quad\displaystyle\frac{\partial \text{Tr}(\boldsymbol{X}\boldsymbol{X}^H)}{\partial \boldsymbol{X}^*} = \boldsymbol{X}$

Conditions: $\boldsymbol{X} \in \mathbb{C}^{m \times n}$

Proof

Extending the Wirtinger derivative $\eqref{eq:16-1-13}$ of 16.1 entrywise to a matrix argument:

\begin{equation}\frac{\partial f}{\partial \boldsymbol{X}} = \frac{1}{2}\left(\frac{\partial f}{\partial \Re\boldsymbol{X}} - i\frac{\partial f}{\partial \Im\boldsymbol{X}}\right) \label{eq:16-9-1}\end{equation}

From $\eqref{eq:16-8-7}$ and $\eqref{eq:16-8-9}$ in 16.8:

\begin{equation}\frac{\partial \text{Tr}(\boldsymbol{X}\boldsymbol{X}^H)}{\partial \Re\boldsymbol{X}} = 2\Re\boldsymbol{X}, \quad \frac{\partial \text{Tr}(\boldsymbol{X}\boldsymbol{X}^H)}{\partial \Im\boldsymbol{X}} = 2\Im\boldsymbol{X} \label{eq:16-9-2}\end{equation}

Substituting $\eqref{eq:16-9-2}$ into $\eqref{eq:16-9-1}$:

\begin{equation}\frac{\partial \text{Tr}(\boldsymbol{X}\boldsymbol{X}^H)}{\partial \boldsymbol{X}} = \frac{1}{2}\left(2\Re\boldsymbol{X} - i \cdot 2\Im\boldsymbol{X}\right) = \Re\boldsymbol{X} - i\Im\boldsymbol{X} \label{eq:16-9-3}\end{equation}

By the definition of the complex conjugate:

\begin{equation}\boldsymbol{X}^* = \Re\boldsymbol{X} - i\Im\boldsymbol{X} \label{eq:16-9-4}\end{equation}

From $\eqref{eq:16-9-3}$ and $\eqref{eq:16-9-4}$:

\begin{equation}\frac{\partial \text{Tr}(\boldsymbol{X}\boldsymbol{X}^H)}{\partial \boldsymbol{X}} = \boldsymbol{X}^* \label{eq:16-9-5}\end{equation}

Similarly, using the conjugate derivative definition from $\eqref{eq:16-1-14}$ in 16.1:

\begin{equation}\frac{\partial f}{\partial \boldsymbol{X}^*} = \frac{1}{2}\left(\frac{\partial f}{\partial \Re\boldsymbol{X}} + i\frac{\partial f}{\partial \Im\boldsymbol{X}}\right) \label{eq:16-9-6}\end{equation}

Substituting $\eqref{eq:16-9-2}$ into $\eqref{eq:16-9-6}$:

\begin{equation}\frac{\partial \text{Tr}(\boldsymbol{X}\boldsymbol{X}^H)}{\partial \boldsymbol{X}^*} = \frac{1}{2}\left(2\Re\boldsymbol{X} + i \cdot 2\Im\boldsymbol{X}\right) = \Re\boldsymbol{X} + i\Im\boldsymbol{X} = \boldsymbol{X} \label{eq:16-9-7}\end{equation}

Notes: Wirtinger derivatives allow the derivative of a complex matrix function to be expressed directly, without decomposing into real and imaginary parts.
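
The chain from 16.8 to 16.9 can be reproduced numerically: first the derivatives with respect to $\Re\boldsymbol{X}$ and $\Im\boldsymbol{X}$, then their combination via $\eqref{eq:16-9-1}$ and $\eqref{eq:16-9-6}$. The sketch below assumes NumPy; the helper `part_derivatives`, the matrix shape, and the seed are illustrative assumptions.

```python
import numpy as np

def part_derivatives(f, X, h=1e-6):
    """Central-difference derivatives of f w.r.t. Re X and Im X, entry by entry."""
    d_re = np.zeros(X.shape, dtype=complex)
    d_im = np.zeros(X.shape, dtype=complex)
    for idx in np.ndindex(*X.shape):
        E = np.zeros(X.shape, dtype=complex)
        E[idx] = 1.0
        d_re[idx] = (f(X + h * E) - f(X - h * E)) / (2 * h)
        d_im[idx] = (f(X + 1j * h * E) - f(X - 1j * h * E)) / (2 * h)
    return d_re, d_im

rng = np.random.default_rng(4)
X = rng.standard_normal((3, 2)) + 1j * rng.standard_normal((3, 2))

def f(X):
    """f(X) = Tr(X X^H), the squared Frobenius norm (real-valued)."""
    return np.trace(X @ X.conj().T).real

d_re, d_im = part_derivatives(f, X)   # 16.8: expect 2 Re X and 2 Im X
dX = 0.5 * (d_re - 1j * d_im)         # 16.9: expect X*
dXc = 0.5 * (d_re + 1j * d_im)        # 16.9: expect X

print(np.allclose(d_re, 2 * X.real, atol=1e-6), np.allclose(d_im, 2 * X.imag, atol=1e-6))
print(np.allclose(dX, X.conj(), atol=1e-6), np.allclose(dXc, X, atol=1e-6))
```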

16.10 Complex Gradient of the Frobenius Norm

Formula: $\nabla\text{Tr}(\boldsymbol{X}\boldsymbol{X}^H) = \nabla\|\boldsymbol{X}\|_F^2 = 2\boldsymbol{X}$

Conditions: $\boldsymbol{X} \in \mathbb{C}^{m \times n}$

Proof

$\text{Tr}(\boldsymbol{X}\boldsymbol{X}^H) = \|\boldsymbol{X}\|_F^2$ is a real-valued function:

\begin{equation}f = \text{Tr}(\boldsymbol{X}\boldsymbol{X}^H) \in \mathbb{R} \label{eq:16-10-1}\end{equation}

From $\eqref{eq:16-2-4}$ in 16.2, the complex gradient of a real-valued function is

\begin{equation}\nabla f = 2\frac{\partial f}{\partial \boldsymbol{X}^*} \label{eq:16-10-2}\end{equation}

From $\eqref{eq:16-9-7}$ in 16.9:

\begin{equation}\frac{\partial \text{Tr}(\boldsymbol{X}\boldsymbol{X}^H)}{\partial \boldsymbol{X}^*} = \boldsymbol{X} \label{eq:16-10-3}\end{equation}

Substituting $\eqref{eq:16-10-3}$ into $\eqref{eq:16-10-2}$:

\begin{equation}\nabla\text{Tr}(\boldsymbol{X}\boldsymbol{X}^H) = 2\boldsymbol{X} \label{eq:16-10-4}\end{equation}

Since $\text{Tr}(\boldsymbol{X}\boldsymbol{X}^H) = \|\boldsymbol{X}\|_F^2$, $\eqref{eq:16-10-4}$ gives the complex gradient of the squared Frobenius norm:

\begin{equation}\nabla\|\boldsymbol{X}\|_F^2 = 2\boldsymbol{X} \label{eq:16-10-5}\end{equation}

Notes: This shows that the complex gradient of the squared Frobenius norm $\|\boldsymbol{X}\|_F^2$ is $2\boldsymbol{X}$. It arises frequently in complex optimization problems such as low-rank approximation of complex matrices.
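
To see the gradient $2\boldsymbol{X}$ at work, consider plain gradient descent on $\|\boldsymbol{X}\|_F^2$. Each step $\boldsymbol{X} \leftarrow \boldsymbol{X} - \eta \cdot 2\boldsymbol{X} = (1 - 2\eta)\boldsymbol{X}$ scales the matrix, so the objective should shrink by the factor $(1 - 2\eta)^2$ per iteration. The sketch below assumes NumPy; the step size `eta`, the matrix shape, and the iteration count are hypothetical values chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(5)
X = rng.standard_normal((3, 2)) + 1j * rng.standard_normal((3, 2))
eta = 0.1  # hypothetical step size

def frob2(X):
    """Squared Frobenius norm ||X||_F^2 = Tr(X X^H)."""
    return np.linalg.norm(X, "fro") ** 2

history = [frob2(X)]
for _ in range(5):
    X = X - eta * (2 * X)  # step along -grad, with grad = 2X from 16.10
    history.append(frob2(X))

# Ratio of consecutive objective values; each should equal (1 - 2*eta)**2
ratios = [b / a for a, b in zip(history, history[1:])]
print(ratios)
```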

16.11 Derivative of $\det(\boldsymbol{X}^H\boldsymbol{A}\boldsymbol{X})$

Formula: $\displaystyle\frac{\partial \det(\boldsymbol{X}^H\boldsymbol{A}\boldsymbol{X})}{\partial \boldsymbol{X}} = \det(\boldsymbol{X}^H\boldsymbol{A}\boldsymbol{X})\left((\boldsymbol{X}^H\boldsymbol{A}\boldsymbol{X})^{-1}\boldsymbol{X}^H\boldsymbol{A}\right)^\top$
$\displaystyle\frac{\partial \det(\boldsymbol{X}^H\boldsymbol{A}\boldsymbol{X})}{\partial \boldsymbol{X}^*} = \det(\boldsymbol{X}^H\boldsymbol{A}\boldsymbol{X})\boldsymbol{A}\boldsymbol{X}(\boldsymbol{X}^H\boldsymbol{A}\boldsymbol{X})^{-1}$

Conditions: $\boldsymbol{X} \in \mathbb{C}^{m \times n}$, $\boldsymbol{A}$ is a constant matrix, $\boldsymbol{X}^H\boldsymbol{A}\boldsymbol{X}$ is invertible

Proof

Define the auxiliary matrix $\boldsymbol{M}$:

\begin{equation}\boldsymbol{M} = \boldsymbol{X}^H\boldsymbol{A}\boldsymbol{X} \label{eq:16-11-1}\end{equation}

By the determinant differential formula:

\begin{equation}d(\det\boldsymbol{M}) = \det(\boldsymbol{M})\text{Tr}(\boldsymbol{M}^{-1}d\boldsymbol{M}) \label{eq:16-11-2}\end{equation}

Differentiating $\eqref{eq:16-11-1}$ using the product rule (1.25):

\begin{equation}d\boldsymbol{M} = d(\boldsymbol{X}^H)\boldsymbol{A}\boldsymbol{X} + \boldsymbol{X}^H\boldsymbol{A}(d\boldsymbol{X}) \label{eq:16-11-3}\end{equation}

Substituting $\eqref{eq:16-11-3}$ into $\eqref{eq:16-11-2}$:

\begin{equation}\text{Tr}(\boldsymbol{M}^{-1}d\boldsymbol{M}) = \text{Tr}(\boldsymbol{M}^{-1}d\boldsymbol{X}^H\boldsymbol{A}\boldsymbol{X}) + \text{Tr}(\boldsymbol{M}^{-1}\boldsymbol{X}^H\boldsymbol{A}\,d\boldsymbol{X}) \label{eq:16-11-4}\end{equation}

Applying the cyclic property of the trace (1.12) to the first term of $\eqref{eq:16-11-4}$:

\begin{equation}\text{Tr}(\boldsymbol{M}^{-1}d\boldsymbol{X}^H\boldsymbol{A}\boldsymbol{X}) = \text{Tr}(\boldsymbol{A}\boldsymbol{X}\boldsymbol{M}^{-1}d\boldsymbol{X}^H) \label{eq:16-11-5}\end{equation}

Combining $\eqref{eq:16-11-4}$ and $\eqref{eq:16-11-5}$:

\begin{equation}\text{Tr}(\boldsymbol{M}^{-1}d\boldsymbol{M}) = \text{Tr}(\boldsymbol{A}\boldsymbol{X}\boldsymbol{M}^{-1}d\boldsymbol{X}^H) + \text{Tr}(\boldsymbol{M}^{-1}\boldsymbol{X}^H\boldsymbol{A}\,d\boldsymbol{X}) \label{eq:16-11-6}\end{equation}

In the Wirtinger framework, the term corresponding to $d\boldsymbol{X}^H$ yields the coefficient of $\partial/\partial\boldsymbol{X}^*$, and the term corresponding to $d\boldsymbol{X}$ yields the coefficient of $\partial/\partial\boldsymbol{X}$.

Reading off the $\partial/\partial\boldsymbol{X}^*$ derivative from the first term of $\eqref{eq:16-11-6}$: since $\text{Tr}(\boldsymbol{B}\,d\boldsymbol{X}^H) = \sum_{k,l} B_{kl}\,dX_{kl}^*$, the coefficient matrix of $d\boldsymbol{X}^*$ is $\boldsymbol{B}$ itself, without a transpose. With $\boldsymbol{B} = \boldsymbol{A}\boldsymbol{X}\boldsymbol{M}^{-1}$,

\begin{equation}\frac{\partial \det\boldsymbol{M}}{\partial \boldsymbol{X}^*} = \det\boldsymbol{M} \cdot \boldsymbol{A}\boldsymbol{X}\boldsymbol{M}^{-1} \label{eq:16-11-7}\end{equation}

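Reading off the $\partial/\partial\boldsymbol{X}$ derivative from the second term of $\eqref{eq:16-11-6}$: since $\text{Tr}(\boldsymbol{C}\,d\boldsymbol{X}) = \sum_{k,l} C_{kl}\,dX_{lk}$, the coefficient matrix of $d\boldsymbol{X}$ is $\boldsymbol{C}^\top$. With $\boldsymbol{C} = \boldsymbol{M}^{-1}\boldsymbol{X}^H\boldsymbol{A}$,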

\begin{equation}\frac{\partial \det\boldsymbol{M}}{\partial \boldsymbol{X}} = \det\boldsymbol{M} \cdot (\boldsymbol{M}^{-1}\boldsymbol{X}^H\boldsymbol{A})^\top \label{eq:16-11-8}\end{equation}

Substituting $\boldsymbol{M} = \boldsymbol{X}^H\boldsymbol{A}\boldsymbol{X}$ yields the final result.

Notes: This formula is used in complex matrix eigenvalue problems and in capacity maximization problems in MIMO communications.

16.12 Derivative of the Complex Rayleigh Quotient

Formula: $\displaystyle\frac{\partial}{\partial \boldsymbol{x}}\displaystyle\frac{(\boldsymbol{A}\boldsymbol{x})^H(\boldsymbol{A}\boldsymbol{x})}{(\boldsymbol{B}\boldsymbol{x})^H(\boldsymbol{B}\boldsymbol{x})} = 2\displaystyle\frac{\boldsymbol{A}^H\boldsymbol{A}\boldsymbol{x}}{\boldsymbol{x}^H\boldsymbol{B}^H\boldsymbol{B}\boldsymbol{x}} - 2\displaystyle\frac{\boldsymbol{x}^H\boldsymbol{A}^H\boldsymbol{A}\boldsymbol{x} \cdot \boldsymbol{B}^H\boldsymbol{B}\boldsymbol{x}}{(\boldsymbol{x}^H\boldsymbol{B}^H\boldsymbol{B}\boldsymbol{x})^2}$

Conditions: $\boldsymbol{A}, \boldsymbol{B}$ are Hermitian matrices, $\boldsymbol{B}$ is positive definite, $\boldsymbol{x} \in \mathbb{C}^n$

Proof

Define the Rayleigh quotient $R(\boldsymbol{x})$:

\begin{equation}R(\boldsymbol{x}) = \frac{(\boldsymbol{A}\boldsymbol{x})^H(\boldsymbol{A}\boldsymbol{x})}{(\boldsymbol{B}\boldsymbol{x})^H(\boldsymbol{B}\boldsymbol{x})} = \frac{\boldsymbol{x}^H\boldsymbol{A}^H\boldsymbol{A}\boldsymbol{x}}{\boldsymbol{x}^H\boldsymbol{B}^H\boldsymbol{B}\boldsymbol{x}} \label{eq:16-12-1}\end{equation}

Define the numerator and denominator separately:

\begin{equation}f = \boldsymbol{x}^H\boldsymbol{A}^H\boldsymbol{A}\boldsymbol{x}, \quad g = \boldsymbol{x}^H\boldsymbol{B}^H\boldsymbol{B}\boldsymbol{x} \label{eq:16-12-2}\end{equation}

Applying the quotient rule (1.28):

\begin{equation}\frac{\partial R}{\partial \boldsymbol{x}} = \frac{\partial}{\partial \boldsymbol{x}}\left(\frac{f}{g}\right) = \frac{1}{g}\frac{\partial f}{\partial \boldsymbol{x}} - \frac{f}{g^2}\frac{\partial g}{\partial \boldsymbol{x}} \label{eq:16-12-3}\end{equation}

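For a Hermitian matrix $\boldsymbol{M}$, differentiating the quadratic form $\boldsymbol{x}^H\boldsymbol{M}\boldsymbol{x}$ with respect to $\boldsymbol{x}$ while holding $\boldsymbol{x}^*$ fixed gives $\boldsymbol{M}^\top\boldsymbol{x}^*$, which equals $(\boldsymbol{M}\boldsymbol{x})^*$ because $\boldsymbol{M}^\top = \boldsymbol{M}^*$: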

\begin{equation}\frac{\partial (\boldsymbol{x}^H\boldsymbol{M}\boldsymbol{x})}{\partial \boldsymbol{x}} = (\boldsymbol{M}\boldsymbol{x})^* \label{eq:16-12-4}\end{equation}

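Applying $\eqref{eq:16-12-4}$ with $\boldsymbol{M} = \boldsymbol{A}^H\boldsymbol{A}$ for $f$ and $\boldsymbol{M} = \boldsymbol{B}^H\boldsymbol{B}$ for $g$ (both matrices are Hermitian):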

\begin{equation}\frac{\partial f}{\partial \boldsymbol{x}} = (\boldsymbol{A}^H\boldsymbol{A}\boldsymbol{x})^* \label{eq:16-12-5}\end{equation}

\begin{equation}\frac{\partial g}{\partial \boldsymbol{x}} = (\boldsymbol{B}^H\boldsymbol{B}\boldsymbol{x})^* \label{eq:16-12-6}\end{equation}
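The derivatives of $f$ and $g$ can be spot-checked numerically. This is a minimal sketch under stated assumptions: `wirtinger_fd` is a hypothetical finite-difference helper, and $\boldsymbol{A}$, $\boldsymbol{B}$, $\boldsymbol{x}$ are random test data.

```python
import numpy as np

def wirtinger_fd(f, x, h=1e-6):
    """Central-difference estimates of (df/dx, df/dx*) for a scalar f of a complex vector x."""
    d_x = np.zeros(x.shape, dtype=complex)
    d_xc = np.zeros(x.shape, dtype=complex)
    for k in range(x.size):
        e = np.zeros(x.shape, dtype=complex)
        e[k] = 1.0
        d_re = (f(x + h * e) - f(x - h * e)) / (2 * h)
        d_im = (f(x + 1j * h * e) - f(x - 1j * h * e)) / (2 * h)
        d_x[k] = 0.5 * (d_re - 1j * d_im)
        d_xc[k] = 0.5 * (d_re + 1j * d_im)
    return d_x, d_xc

rng = np.random.default_rng(2)
n = 4  # hypothetical dimension
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
x = rng.standard_normal(n) + 1j * rng.standard_normal(n)

AHA = A.conj().T @ A  # Hermitian by construction
BHB = B.conj().T @ B  # Hermitian by construction

f = lambda v: np.vdot(v, AHA @ v)  # np.vdot conjugates its first argument: v^H (A^H A) v
g = lambda v: np.vdot(v, BHB @ v)

df_dx, df_dxc = wirtinger_fd(f, x)
dg_dx, dg_dxc = wirtinger_fd(g, x)

err_f = np.max(np.abs(df_dx - (AHA @ x).conj()))  # df/dx  vs (A^H A x)^*
err_g = np.max(np.abs(dg_dx - (BHB @ x).conj()))  # dg/dx  vs (B^H B x)^*
err_fc = np.max(np.abs(df_dxc - AHA @ x))         # df/dx* vs  A^H A x
print(err_f, err_g, err_fc)
```

Because $f$ and $g$ are quadratic, the central differences are exact up to rounding error.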

Substituting $\eqref{eq:16-12-5}$ and $\eqref{eq:16-12-6}$ into the quotient rule $\eqref{eq:16-12-3}$ gives the Wirtinger derivative $\partial R/\partial \boldsymbol{x}$:

\begin{equation}\frac{\partial R}{\partial \boldsymbol{x}} = \frac{1}{g}(\boldsymbol{A}^H\boldsymbol{A}\boldsymbol{x})^* - \frac{f}{g^2}(\boldsymbol{B}^H\boldsymbol{B}\boldsymbol{x})^* \label{eq:16-12-7}\end{equation}

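Computing the complex gradient $\nabla R = 2\,\partial R/\partial \boldsymbol{x}^*$: since $R$ is real-valued, $\partial R/\partial \boldsymbol{x}^* = (\partial R/\partial \boldsymbol{x})^*$. Conjugating $\eqref{eq:16-12-7}$ (with $f$ and $g$ real) and multiplying by 2 gives the expression stated in the formula at the top of this section, where $\partial/\partial\boldsymbol{x}$ is read as the complex gradient: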

\begin{equation}\nabla R = 2\frac{\boldsymbol{A}^H\boldsymbol{A}\boldsymbol{x}}{\boldsymbol{x}^H\boldsymbol{B}^H\boldsymbol{B}\boldsymbol{x}} - 2\frac{\boldsymbol{x}^H\boldsymbol{A}^H\boldsymbol{A}\boldsymbol{x} \cdot \boldsymbol{B}^H\boldsymbol{B}\boldsymbol{x}}{(\boldsymbol{x}^H\boldsymbol{B}^H\boldsymbol{B}\boldsymbol{x})^2} \label{eq:16-12-8}\end{equation}
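The gradient expression can be compared against a finite-difference estimate of $2\,\partial R/\partial\boldsymbol{x}^*$. The sketch below is an illustration under assumptions: `wirtinger_fd` is a hypothetical helper, and generic complex $\boldsymbol{A}$, $\boldsymbol{B}$ are used because only $\boldsymbol{A}^H\boldsymbol{A}$ and $\boldsymbol{B}^H\boldsymbol{B}$ enter the formula.

```python
import numpy as np

def wirtinger_fd(f, x, h=1e-6):
    """Central-difference estimates of (df/dx, df/dx*) for a scalar f of a complex vector x."""
    d_x = np.zeros(x.shape, dtype=complex)
    d_xc = np.zeros(x.shape, dtype=complex)
    for k in range(x.size):
        e = np.zeros(x.shape, dtype=complex)
        e[k] = 1.0
        d_re = (f(x + h * e) - f(x - h * e)) / (2 * h)
        d_im = (f(x + 1j * h * e) - f(x - 1j * h * e)) / (2 * h)
        d_x[k] = 0.5 * (d_re - 1j * d_im)
        d_xc[k] = 0.5 * (d_re + 1j * d_im)
    return d_x, d_xc

rng = np.random.default_rng(3)
n = 4  # hypothetical dimension
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
AHA = A.conj().T @ A
BHB = B.conj().T @ B

def rayleigh(v):
    """R(v) = (v^H A^H A v) / (v^H B^H B v), real-valued."""
    return np.vdot(v, AHA @ v).real / np.vdot(v, BHB @ v).real

f = np.vdot(x, AHA @ x).real
g = np.vdot(x, BHB @ x).real
grad_closed = 2 * (AHA @ x) / g - 2 * f * (BHB @ x) / g**2  # closed-form gradient

dR_dx, dR_dxc = wirtinger_fd(rayleigh, x)
err_grad = np.max(np.abs(2 * dR_dxc - grad_closed))  # gradient = 2 dR/dx*
err_conj = np.max(np.abs(dR_dxc - dR_dx.conj()))     # dR/dx* = (dR/dx)^* for real R
print(err_grad, err_conj)
```

The second check confirms numerically that the two Wirtinger derivatives of a real-valued function are complex conjugates of each other.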

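Notes: This formula is used in iterative solvers for the generalized eigenvalue problem (a complex extension of the power method). Setting $\nabla R = \boldsymbol{0}$ in $\eqref{eq:16-12-8}$ gives $\boldsymbol{A}^H\boldsymbol{A}\boldsymbol{x} = R(\boldsymbol{x})\,\boldsymbol{B}^H\boldsymbol{B}\boldsymbol{x}$, so the stationary points of $R(\boldsymbol{x})$ are the generalized eigenvectors of the pair $(\boldsymbol{A}^H\boldsymbol{A}, \boldsymbol{B}^H\boldsymbol{B})$, with $R(\boldsymbol{x})$ equal to the corresponding eigenvalue.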
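The link between stationary points and generalized eigenvectors can be illustrated numerically. This is a sketch under assumptions: the eigenpairs are obtained from `np.linalg.eig` applied to $(\boldsymbol{B}^H\boldsymbol{B})^{-1}\boldsymbol{A}^H\boldsymbol{A}$, which is one convenient route for small well-conditioned test matrices and not necessarily how an iterative solver would proceed.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 4  # hypothetical dimension
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
AHA = A.conj().T @ A
BHB = B.conj().T @ B

def rayleigh(x):
    """R(x) = (x^H A^H A x) / (x^H B^H B x)."""
    return np.vdot(x, AHA @ x).real / np.vdot(x, BHB @ x).real

def grad_rayleigh(x):
    """Closed-form complex gradient of R."""
    f = np.vdot(x, AHA @ x).real
    g = np.vdot(x, BHB @ x).real
    return 2 * (AHA @ x) / g - 2 * f * (BHB @ x) / g**2

# Generalized eigenpairs A^H A v = lam * B^H B v, via eig of (B^H B)^{-1} A^H A.
lams, V = np.linalg.eig(np.linalg.solve(BHB, AHA))

# At each eigenvector the gradient should vanish and R should equal the eigenvalue.
grad_norms = [np.linalg.norm(grad_rayleigh(V[:, k])) for k in range(n)]
gaps = [abs(rayleigh(V[:, k]) - lams[k]) for k in range(n)]
print(max(grad_norms), max(gaps))
```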

16.13 Derivative of the Complex Quadratic Form $(a - \boldsymbol{x}^H \boldsymbol{b})^2$

Formula: $\displaystyle\frac{\partial (a - \boldsymbol{x}^H \boldsymbol{b})^2}{\partial \boldsymbol{x}} = -2\bar{\boldsymbol{b}}(a - \boldsymbol{x}^H \boldsymbol{b})^*$

Conditions: $\boldsymbol{x}, \boldsymbol{b} \in \mathbb{C}^n$, $a \in \mathbb{C}$

Proof

Define the auxiliary variable $z$:

\begin{equation}z = a - \boldsymbol{x}^H \boldsymbol{b} = a - \sum_{i=0}^{n-1} \bar{x}_i b_i \label{eq:16-13-1}\end{equation}

Define the scalar function $f$:

\begin{equation}f = z^2 = (a - \boldsymbol{x}^H \boldsymbol{b})^2 \label{eq:16-13-2}\end{equation}

In the Wirtinger framework, $\boldsymbol{x}$ and $\bar{\boldsymbol{x}}$ are treated as independent variables. From $\eqref{eq:16-13-1}$, $z$ depends on $\bar{x}_k$ but not directly on $x_k$:

\begin{equation}\frac{\partial z}{\partial \bar{x}_k} = -b_k \label{eq:16-13-3}\end{equation}

\begin{equation}\frac{\partial z}{\partial x_k} = 0 \label{eq:16-13-4}\end{equation}

Differentiating $f = z^2$ by the chain rule (1.26):

\begin{equation}\frac{\partial f}{\partial \bar{x}_k} = \frac{\partial (z^2)}{\partial z} \cdot \frac{\partial z}{\partial \bar{x}_k} = 2z \cdot (-b_k) = -2b_k z \label{eq:16-13-5}\end{equation}

Writing $\eqref{eq:16-13-5}$ in vector form:

\begin{equation}\frac{\partial f}{\partial \bar{\boldsymbol{x}}} = -2\boldsymbol{b}z \label{eq:16-13-6}\end{equation}
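The derivatives of $z$ and the vector form above can be confirmed with finite differences. A minimal sketch, assuming a hypothetical helper `wirtinger_fd` and random test values for $a$, $\boldsymbol{b}$, and $\boldsymbol{x}$:

```python
import numpy as np

def wirtinger_fd(f, x, h=1e-6):
    """Central-difference estimates of (df/dx, df/dx_bar) for a scalar f of a complex vector x."""
    d_x = np.zeros(x.shape, dtype=complex)
    d_xb = np.zeros(x.shape, dtype=complex)
    for k in range(x.size):
        e = np.zeros(x.shape, dtype=complex)
        e[k] = 1.0
        d_re = (f(x + h * e) - f(x - h * e)) / (2 * h)
        d_im = (f(x + 1j * h * e) - f(x - 1j * h * e)) / (2 * h)
        d_x[k] = 0.5 * (d_re - 1j * d_im)
        d_xb[k] = 0.5 * (d_re + 1j * d_im)
    return d_x, d_xb

rng = np.random.default_rng(5)
n = 4  # hypothetical dimension
a = complex(rng.standard_normal(), rng.standard_normal())
b = rng.standard_normal(n) + 1j * rng.standard_normal(n)
x = rng.standard_normal(n) + 1j * rng.standard_normal(n)

z = lambda v: a - np.vdot(v, b)  # np.vdot conjugates its first argument: a - v^H b
f = lambda v: z(v) ** 2

dz_dx, dz_dxb = wirtinger_fd(z, x)
_, df_dxb = wirtinger_fd(f, x)

err_dz_dxb = np.max(np.abs(dz_dxb + b))             # dz/dx_bar = -b
err_dz_dx = np.max(np.abs(dz_dx))                   # dz/dx = 0
err_df_dxb = np.max(np.abs(df_dxb + 2 * b * z(x)))  # df/dx_bar = -2 b z
print(err_dz_dxb, err_dz_dx, err_df_dxb)
```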

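Using the Wirtinger conjugation identity, which holds in this form when $f$ is real-valued (for complex-valued $f$, the right-hand side equals $\partial \bar{f}/\partial \boldsymbol{x}$):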

\begin{equation}\frac{\partial f}{\partial \boldsymbol{x}} = \overline{\frac{\partial f}{\partial \bar{\boldsymbol{x}}}} \label{eq:16-13-7}\end{equation}

Substituting $\eqref{eq:16-13-6}$ into $\eqref{eq:16-13-7}$:

\begin{equation}\frac{\partial f}{\partial \boldsymbol{x}} = \overline{-2\boldsymbol{b}z} = -2\bar{\boldsymbol{b}} \bar{z} \label{eq:16-13-8}\end{equation}

Since $\bar{z}$ and $z^*$ both denote the complex conjugate of $z$, substituting $\eqref{eq:16-13-1}$ gives:

\begin{equation}\frac{\partial (a - \boldsymbol{x}^H \boldsymbol{b})^2}{\partial \boldsymbol{x}} = -2\bar{\boldsymbol{b}} (a - \boldsymbol{x}^H \boldsymbol{b})^* \label{eq:16-13-9}\end{equation}

Notes: When $\boldsymbol{b}$ is a real vector, $\bar{\boldsymbol{b}} = \boldsymbol{b}$, and the result simplifies to $-2\boldsymbol{b}(a - \boldsymbol{x}^H \boldsymbol{b})^*$.

References