Proofs Chapter 16: Derivatives of Complex Matrices
Wirtinger Derivatives, Complex Gradients, and Derivatives of Complex Trace / Determinant
This chapter proves derivative formulas for complex matrices within the framework of Wirtinger derivatives. Complex differentiation is an essential tool in signal processing and communications engineering, appearing in independent component analysis (ICA), adaptive filtering, signal processing for MIMO antenna systems, and the optimization of density matrices in quantum information theory. Wirtinger derivatives allow the gradient of non-holomorphic functions (such as the squared absolute value) to be written directly in terms of $z$ and $z^*$, without carrying separate real and imaginary parts through every calculation.
Prerequisites: Chapter 5 (Derivatives of Trace), Chapter 7 (Derivatives of Determinant). Related chapter: Chapter 15 (Derivatives of Special Matrices).
16. Derivatives of Complex Matrices
Unless stated otherwise, all formulas in this chapter hold under the following conditions:
- All formulas follow the denominator layout convention
- Complex differentiation uses Wirtinger derivatives ($\frac{\partial}{\partial z}$ and $\frac{\partial}{\partial z^*}$)
- The gradient of a real-valued function is $\nabla f = 2\frac{\partial f}{\partial z^*}$ (see 16.2)
We derive Wirtinger derivatives for functions involving complex conjugates, and differentiation formulas for the complex trace and determinant.
16.1 Wirtinger Derivatives
Proof
Decompose the complex number $z$ into its real and imaginary parts.
\begin{equation}z = x + iy \label{eq:16-1-1}\end{equation}
where $x = \Re z$ and $y = \Im z$.
From $\eqref{eq:16-1-1}$, expressing $z$ and $z^*$ in terms of $x$ and $y$:
\begin{equation}z = x + iy, \quad z^* = x - iy \label{eq:16-1-2}\end{equation}
Solving $\eqref{eq:16-1-2}$ for $x$ and $y$: adding the two equations gives
\begin{equation}z + z^* = 2x \quad \Rightarrow \quad x = \frac{z + z^*}{2} \label{eq:16-1-3}\end{equation}
Subtracting the two equations gives
\begin{equation}z - z^* = 2iy \quad \Rightarrow \quad y = \frac{z - z^*}{2i} \label{eq:16-1-4}\end{equation}
Treating $f(z)$ as $f(x, y)$ and applying the chain rule, the partial derivative of $f$ with respect to $z$ is
\begin{equation}\frac{\partial f}{\partial z} = \frac{\partial f}{\partial x}\frac{\partial x}{\partial z} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial z} \label{eq:16-1-5}\end{equation}
From $\eqref{eq:16-1-3}$, treating $z^*$ as a constant, we compute $\partial x / \partial z$:
\begin{equation}\frac{\partial x}{\partial z} = \frac{1}{2} \label{eq:16-1-6}\end{equation}
From $\eqref{eq:16-1-4}$, we compute $\partial y / \partial z$:
\begin{equation}\frac{\partial y}{\partial z} = \frac{1}{2i} = -\frac{i}{2} \label{eq:16-1-7}\end{equation}
Substituting $\eqref{eq:16-1-6}$ and $\eqref{eq:16-1-7}$ into $\eqref{eq:16-1-5}$:
\begin{equation}\frac{\partial f}{\partial z} = \frac{\partial f}{\partial x} \cdot \frac{1}{2} + \frac{\partial f}{\partial y} \cdot \left(-\frac{i}{2}\right) = \frac{1}{2}\left(\frac{\partial f}{\partial x} - i\frac{\partial f}{\partial y}\right) \label{eq:16-1-8}\end{equation}
Similarly, differentiate $f$ with respect to $z^*$:
\begin{equation}\frac{\partial f}{\partial z^*} = \frac{\partial f}{\partial x}\frac{\partial x}{\partial z^*} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial z^*} \label{eq:16-1-9}\end{equation}
From $\eqref{eq:16-1-3}$, treating $z$ as a constant, we compute $\partial x / \partial z^*$:
\begin{equation}\frac{\partial x}{\partial z^*} = \frac{1}{2} \label{eq:16-1-10}\end{equation}
From $\eqref{eq:16-1-4}$, we compute $\partial y / \partial z^*$:
\begin{equation}\frac{\partial y}{\partial z^*} = -\frac{1}{2i} = \frac{i}{2} \label{eq:16-1-11}\end{equation}
Substituting $\eqref{eq:16-1-10}$ and $\eqref{eq:16-1-11}$ into $\eqref{eq:16-1-9}$:
\begin{equation}\frac{\partial f}{\partial z^*} = \frac{\partial f}{\partial x} \cdot \frac{1}{2} + \frac{\partial f}{\partial y} \cdot \frac{i}{2} = \frac{1}{2}\left(\frac{\partial f}{\partial x} + i\frac{\partial f}{\partial y}\right) \label{eq:16-1-12}\end{equation}
Substituting $x = \Re z$ and $y = \Im z$ yields the final result:
\begin{equation}\frac{\partial f}{\partial z} = \frac{1}{2}\left(\frac{\partial f}{\partial \Re z} - i\frac{\partial f}{\partial \Im z}\right) \label{eq:16-1-13}\end{equation}
\begin{equation}\frac{\partial f}{\partial z^*} = \frac{1}{2}\left(\frac{\partial f}{\partial \Re z} + i\frac{\partial f}{\partial \Im z}\right) \label{eq:16-1-14}\end{equation}
Source: W. Wirtinger (1927) "Zur formalen Theorie der Funktionen von mehr komplexen Veränderlichen", Mathematische Annalen 97, 357–375.
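As a numerical sanity check of $\eqref{eq:16-1-13}$ and $\eqref{eq:16-1-14}$, the following sketch approximates $\partial f/\partial x$ and $\partial f/\partial y$ by central differences and combines them as in the two formulas. The helper name `wirtinger`, the step size, and the test point are illustrative assumptions, not part of the derivation.

```python
def wirtinger(f, z, h=1e-6):
    """Return (df/dz, df/dz*) built from central differences in x = Re z and y = Im z."""
    df_dx = (f(z + h) - f(z - h)) / (2 * h)
    df_dy = (f(z + 1j * h) - f(z - 1j * h)) / (2 * h)
    return 0.5 * (df_dx - 1j * df_dy), 0.5 * (df_dx + 1j * df_dy)

z0 = 1.5 - 0.7j  # arbitrary test point

# Holomorphic example: f(z) = z^2 has df/dz = 2z and df/dz* = 0.
sq_dz, sq_dzc = wirtinger(lambda z: z * z, z0)

# Non-holomorphic example: f(z) = |z|^2 = z z* has df/dz = z* and df/dz* = z.
ab_dz, ab_dzc = wirtinger(lambda z: abs(z) ** 2, z0)

print(sq_dz, sq_dzc)
print(ab_dz, ab_dzc)
```

The second example is the case that ordinary complex differentiation cannot handle: $|z|^2$ is not holomorphic, yet both Wirtinger derivatives exist.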
16.2 Complex Gradient Vector
Proof
Define the complex gradient of the real-valued function $f(\boldsymbol{z})$. Since $f$ is real-valued, $f = f^*$ holds.
From $\eqref{eq:16-1-14}$ in 16.1, the component-wise Wirtinger derivative is
\begin{equation}\frac{\partial f}{\partial z_k^*} = \frac{1}{2}\left(\frac{\partial f}{\partial x_k} + i\frac{\partial f}{\partial y_k}\right) \label{eq:16-2-1}\end{equation}
where $z_k = x_k + iy_k$ ($x_k = \Re z_k$, $y_k = \Im z_k$).
Multiplying both sides of $\eqref{eq:16-2-1}$ by 2:
\begin{equation}2\frac{\partial f}{\partial z_k^*} = \frac{\partial f}{\partial x_k} + i\frac{\partial f}{\partial y_k} \label{eq:16-2-2}\end{equation}
Writing $\eqref{eq:16-2-2}$ in vector form:
\begin{equation}2\frac{\partial f}{\partial \boldsymbol{z}^*} = \frac{\partial f}{\partial \Re\boldsymbol{z}} + i\frac{\partial f}{\partial \Im\boldsymbol{z}} \label{eq:16-2-3}\end{equation}
Define the complex gradient $\nabla f$ by the right-hand side of $\eqref{eq:16-2-3}$:
\begin{equation}\nabla f(\boldsymbol{z}) \stackrel{\text{def}}{=} \frac{\partial f}{\partial \Re\boldsymbol{z}} + i\frac{\partial f}{\partial \Im\boldsymbol{z}} = 2\frac{\partial f}{\partial \boldsymbol{z}^*} \label{eq:16-2-4}\end{equation}
We verify that this definition yields the steepest descent direction. The total differential of $f$ is
\begin{equation}df = \sum_k \left(\frac{\partial f}{\partial x_k}dx_k + \frac{\partial f}{\partial y_k}dy_k\right) \label{eq:16-2-5}\end{equation}
From $dz_k = dx_k + idy_k$ and $dz_k^* = dx_k - idy_k$:
\begin{equation}dx_k = \frac{dz_k + dz_k^*}{2}, \quad dy_k = \frac{dz_k - dz_k^*}{2i} \label{eq:16-2-6}\end{equation}
Substituting $\eqref{eq:16-2-6}$ into $\eqref{eq:16-2-5}$ and simplifying:
\begin{equation}df = \sum_k \left(\frac{\partial f}{\partial z_k}dz_k + \frac{\partial f}{\partial z_k^*}dz_k^*\right) \label{eq:16-2-7}\end{equation}
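When $f$ is real-valued, $\partial f/\partial z_k = (\partial f/\partial z_k^*)^*$ holds, so the two terms in each summand of $\eqref{eq:16-2-7}$ are complex conjugates of each other and $df = 2\Re\sum_k (\partial f/\partial z_k^*)^*\,dz_k = \Re\left[(\nabla f)^H d\boldsymbol{z}\right]$. For a fixed step length $\|d\boldsymbol{z}\|$, the Cauchy-Schwarz inequality shows that $df$ is minimized when $d\boldsymbol{z}$ is a negative multiple of $\nabla f$. The steepest descent direction is therefore $-\nabla f = -2\partial f/\partial \boldsymbol{z}^*$.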
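The steepest descent claim can be illustrated in the scalar case. The sketch below, under the assumption of a simple cost $f(z) = |z - c|^2$ with a hypothetical target `c`, scans unit directions on a one-degree grid and compares the direction of largest decrease with $-\nabla f / |\nabla f|$, where $\nabla f = 2\,\partial f/\partial z^* = 2(z - c)$.

```python
import cmath
import math

c = 1.0 + 2.0j                      # hypothetical target value
f = lambda z: abs(z - c) ** 2       # real-valued cost function
z0 = 2.0 + 3.0j                     # point at which the direction is tested

grad = 2 * (z0 - c)                 # complex gradient 2 * df/dz* for this f
predicted = -grad / abs(grad)       # unit vector along -grad

eps = 1e-3                          # small step length
directions = [cmath.exp(2j * math.pi * k / 360) for k in range(360)]
best = min(directions, key=lambda d: f(z0 + eps * d))

print(best, predicted)
```

The scan is only a brute-force check on a coarse grid; it agrees with the predicted direction up to the grid resolution.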
16.3 Chain Rule for Complex Derivatives
Proof
Consider the composite function $h(z) = g(f(z), f^*(z))$. In the Wirtinger framework, $f$ and $f^*$ are treated as independent variables.
Write $h$ in terms of real and imaginary parts. Setting $z = x + iy$ and $f = u + iv$:
\begin{equation}h = h(x, y), \quad f = f(x, y) = u(x, y) + iv(x, y) \label{eq:16-3-1}\end{equation}
The Wirtinger derivative of $h$ with respect to $z$, from $\eqref{eq:16-1-13}$ in 16.1, is
\begin{equation}\frac{\partial h}{\partial z} = \frac{1}{2}\left(\frac{\partial h}{\partial x} - i\frac{\partial h}{\partial y}\right) \label{eq:16-3-2}\end{equation}
Since $h$ depends on $x$ and $y$ through $f$ and $f^*$, applying the chain rule:
\begin{equation}\frac{\partial h}{\partial x} = \frac{\partial g}{\partial f}\frac{\partial f}{\partial x} + \frac{\partial g}{\partial f^*}\frac{\partial f^*}{\partial x} \label{eq:16-3-3}\end{equation}
\begin{equation}\frac{\partial h}{\partial y} = \frac{\partial g}{\partial f}\frac{\partial f}{\partial y} + \frac{\partial g}{\partial f^*}\frac{\partial f^*}{\partial y} \label{eq:16-3-4}\end{equation}
Substituting $\eqref{eq:16-3-3}$ and $\eqref{eq:16-3-4}$ into $\eqref{eq:16-3-2}$:
\begin{equation}\frac{\partial h}{\partial z} = \frac{1}{2}\left[\frac{\partial g}{\partial f}\left(\frac{\partial f}{\partial x} - i\frac{\partial f}{\partial y}\right) + \frac{\partial g}{\partial f^*}\left(\frac{\partial f^*}{\partial x} - i\frac{\partial f^*}{\partial y}\right)\right] \label{eq:16-3-5}\end{equation}
From $\eqref{eq:16-1-13}$ in 16.1:
\begin{equation}\frac{\partial f}{\partial z} = \frac{1}{2}\left(\frac{\partial f}{\partial x} - i\frac{\partial f}{\partial y}\right) \label{eq:16-3-6}\end{equation}
\begin{equation}\frac{\partial f^*}{\partial z} = \frac{1}{2}\left(\frac{\partial f^*}{\partial x} - i\frac{\partial f^*}{\partial y}\right) \label{eq:16-3-7}\end{equation}
Substituting $\eqref{eq:16-3-6}$ and $\eqref{eq:16-3-7}$ into $\eqref{eq:16-3-5}$:
\begin{equation}\frac{\partial h}{\partial z} = \frac{\partial g}{\partial f}\frac{\partial f}{\partial z} + \frac{\partial g}{\partial f^*}\frac{\partial f^*}{\partial z} \label{eq:16-3-8}\end{equation}
Similarly, compute the Wirtinger derivative of $h$ with respect to $z^*$. From $\eqref{eq:16-1-14}$ in 16.1:
\begin{equation}\frac{\partial h}{\partial z^*} = \frac{1}{2}\left(\frac{\partial h}{\partial x} + i\frac{\partial h}{\partial y}\right) \label{eq:16-3-9}\end{equation}
Substituting $\eqref{eq:16-3-3}$ and $\eqref{eq:16-3-4}$ into $\eqref{eq:16-3-9}$:
\begin{equation}\frac{\partial h}{\partial z^*} = \frac{1}{2}\left[\frac{\partial g}{\partial f}\left(\frac{\partial f}{\partial x} + i\frac{\partial f}{\partial y}\right) + \frac{\partial g}{\partial f^*}\left(\frac{\partial f^*}{\partial x} + i\frac{\partial f^*}{\partial y}\right)\right] \label{eq:16-3-10}\end{equation}
From $\eqref{eq:16-1-14}$ in 16.1:
\begin{equation}\frac{\partial f}{\partial z^*} = \frac{1}{2}\left(\frac{\partial f}{\partial x} + i\frac{\partial f}{\partial y}\right) \label{eq:16-3-11}\end{equation}
\begin{equation}\frac{\partial f^*}{\partial z^*} = \frac{1}{2}\left(\frac{\partial f^*}{\partial x} + i\frac{\partial f^*}{\partial y}\right) \label{eq:16-3-12}\end{equation}
Substituting $\eqref{eq:16-3-11}$ and $\eqref{eq:16-3-12}$ into $\eqref{eq:16-3-10}$:
\begin{equation}\frac{\partial h}{\partial z^*} = \frac{\partial g}{\partial f}\frac{\partial f}{\partial z^*} + \frac{\partial g}{\partial f^*}\frac{\partial f^*}{\partial z^*} \label{eq:16-3-13}\end{equation}
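The two chain rules $\eqref{eq:16-3-8}$ and $\eqref{eq:16-3-13}$ can be checked on a small example. The sketch below assumes the illustrative choices $f(z) = z^2 + z^*$ and $g(w, w^*) = w w^*$, so that $h(z) = |z^2 + z^*|^2$; the partial derivatives of $f$ and $g$ are written out by hand and compared with finite differences of $h$.

```python
def wirtinger(func, z, h=1e-5):
    """Return (d func/dz, d func/dz*) from central differences in Re z and Im z."""
    d_x = (func(z + h) - func(z - h)) / (2 * h)
    d_y = (func(z + 1j * h) - func(z - 1j * h)) / (2 * h)
    return 0.5 * (d_x - 1j * d_y), 0.5 * (d_x + 1j * d_y)

f = lambda z: z * z + z.conjugate()      # non-holomorphic inner function
h_fun = lambda z: abs(f(z)) ** 2         # h = g(f, f*) with g(w, w*) = w w*

z0 = 0.8 + 0.3j
fz = f(z0)

# Hand-computed pieces of the chain rule for this example.
dg_df, dg_dfc = fz.conjugate(), fz           # dg/df = f*, dg/df* = f
df_dz, df_dzc = 2 * z0, 1.0                  # from f = z^2 + z*
dfc_dz, dfc_dzc = 1.0, 2 * z0.conjugate()    # from f* = (z*)^2 + z

chain_z = dg_df * df_dz + dg_dfc * dfc_dz        # right-hand side of the dz rule
chain_zc = dg_df * df_dzc + dg_dfc * dfc_dzc     # right-hand side of the dz* rule

num_z, num_zc = wirtinger(h_fun, z0)
print(abs(num_z - chain_z), abs(num_zc - chain_zc))
```

Because $f$ is not holomorphic here, both terms of each chain rule contribute; dropping the $f^*$ term would make the comparison fail.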
16.4 Derivative of $\text{Tr}(\boldsymbol{X}^*)$
Proof
Decompose the entries of $\boldsymbol{X}$ into real and imaginary parts:
\begin{equation}X_{ij} = (\Re X)_{ij} + i(\Im X)_{ij} \label{eq:16-4-1}\end{equation}
The complex conjugate is
\begin{equation}X_{ij}^* = (\Re X)_{ij} - i(\Im X)_{ij} \label{eq:16-4-2}\end{equation}
By the definition of the trace:
\begin{equation}\text{Tr}(\boldsymbol{X}^*) = \sum_{i=0}^{n-1} X_{ii}^* \label{eq:16-4-3}\end{equation}
Substituting $\eqref{eq:16-4-2}$ into $\eqref{eq:16-4-3}$:
\begin{equation}\text{Tr}(\boldsymbol{X}^*) = \sum_{i=0}^{n-1} \left[(\Re X)_{ii} - i(\Im X)_{ii}\right] \label{eq:16-4-4}\end{equation}
Differentiating $\eqref{eq:16-4-4}$ with respect to the real part: for the $(k, l)$ entry,
\begin{equation}\frac{\partial \text{Tr}(\boldsymbol{X}^*)}{\partial (\Re X)_{kl}} = \frac{\partial}{\partial (\Re X)_{kl}} \sum_{i=0}^{n-1} (\Re X)_{ii} = \delta_{kl} \label{eq:16-4-5}\end{equation}
where $\delta_{kl}$ is the Kronecker delta.
Writing $\eqref{eq:16-4-5}$ in matrix form:
\begin{equation}\frac{\partial \text{Tr}(\boldsymbol{X}^*)}{\partial \Re\boldsymbol{X}} = \boldsymbol{I} \label{eq:16-4-6}\end{equation}
Similarly, differentiating $\eqref{eq:16-4-4}$ with respect to the imaginary part:
\begin{equation}\frac{\partial \text{Tr}(\boldsymbol{X}^*)}{\partial (\Im X)_{kl}} = \frac{\partial}{\partial (\Im X)_{kl}} \sum_{i=0}^{n-1} (-i)(\Im X)_{ii} = -i\delta_{kl} \label{eq:16-4-7}\end{equation}
Writing $\eqref{eq:16-4-7}$ in matrix form:
\begin{equation}\frac{\partial \text{Tr}(\boldsymbol{X}^*)}{\partial \Im\boldsymbol{X}} = -i\boldsymbol{I} \label{eq:16-4-8}\end{equation}
Multiplying both sides of $\eqref{eq:16-4-8}$ by $i$:
\begin{equation}i \cdot \frac{\partial \text{Tr}(\boldsymbol{X}^*)}{\partial \Im\boldsymbol{X}} = i \cdot (-i)\boldsymbol{I} = \boldsymbol{I} \label{eq:16-4-9}\end{equation}
From $\eqref{eq:16-4-6}$ and $\eqref{eq:16-4-9}$, the real-part and imaginary-part derivatives (after multiplication by $i$) have the same sign, giving $\boldsymbol{I}$ in both cases.
16.5 Derivative of $\text{Tr}(\boldsymbol{X})$
Proof
Decompose the entries of $\boldsymbol{X}$ into real and imaginary parts:
\begin{equation}X_{ij} = (\Re X)_{ij} + i(\Im X)_{ij} \label{eq:16-5-1}\end{equation}
By the definition of the trace:
\begin{equation}\text{Tr}(\boldsymbol{X}) = \sum_{i=0}^{n-1} X_{ii} \label{eq:16-5-2}\end{equation}
Substituting $\eqref{eq:16-5-1}$ into $\eqref{eq:16-5-2}$:
\begin{equation}\text{Tr}(\boldsymbol{X}) = \sum_{i=0}^{n-1} \left[(\Re X)_{ii} + i(\Im X)_{ii}\right] \label{eq:16-5-3}\end{equation}
Differentiating $\eqref{eq:16-5-3}$ with respect to the real part: for the $(k, l)$ entry,
\begin{equation}\frac{\partial \text{Tr}(\boldsymbol{X})}{\partial (\Re X)_{kl}} = \frac{\partial}{\partial (\Re X)_{kl}} \sum_{i=0}^{n-1} (\Re X)_{ii} = \delta_{kl} \label{eq:16-5-4}\end{equation}
Writing $\eqref{eq:16-5-4}$ in matrix form:
\begin{equation}\frac{\partial \text{Tr}(\boldsymbol{X})}{\partial \Re\boldsymbol{X}} = \boldsymbol{I} \label{eq:16-5-5}\end{equation}
Similarly, differentiating $\eqref{eq:16-5-3}$ with respect to the imaginary part:
\begin{equation}\frac{\partial \text{Tr}(\boldsymbol{X})}{\partial (\Im X)_{kl}} = \frac{\partial}{\partial (\Im X)_{kl}} \sum_{i=0}^{n-1} i(\Im X)_{ii} = i\delta_{kl} \label{eq:16-5-6}\end{equation}
Writing $\eqref{eq:16-5-6}$ in matrix form:
\begin{equation}\frac{\partial \text{Tr}(\boldsymbol{X})}{\partial \Im\boldsymbol{X}} = i\boldsymbol{I} \label{eq:16-5-7}\end{equation}
Multiplying both sides of $\eqref{eq:16-5-7}$ by $i$:
\begin{equation}i \cdot \frac{\partial \text{Tr}(\boldsymbol{X})}{\partial \Im\boldsymbol{X}} = i \cdot i\boldsymbol{I} = -\boldsymbol{I} \label{eq:16-5-8}\end{equation}
Comparing $\eqref{eq:16-5-5}$ and $\eqref{eq:16-5-8}$, the real-part and imaginary-part derivatives (after multiplication by $i$) have opposite signs. This contrasts with the case of $\text{Tr}(\boldsymbol{X}^*)$ in 16.4.
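The sign contrast between 16.4 and 16.5 can be reproduced numerically. The sketch below uses plain nested lists for a $2 \times 2$ matrix and a hypothetical helper `partials` that differentiates a scalar function with respect to each $(\Re X)_{kl}$ and $(\Im X)_{kl}$ by central differences; the matrix entries are arbitrary.

```python
def trace(X):
    return sum(X[i][i] for i in range(len(X)))

def conj(X):
    return [[x.conjugate() for x in row] for row in X]

def partials(f, X, h=1e-6):
    """Central differences of f with respect to each (Re X)_kl and (Im X)_kl."""
    n, m = len(X), len(X[0])
    d_re = [[0j] * m for _ in range(n)]
    d_im = [[0j] * m for _ in range(n)]
    for k in range(n):
        for l in range(m):
            for step, out in ((h, d_re), (1j * h, d_im)):
                Xp = [row[:] for row in X]
                Xm = [row[:] for row in X]
                Xp[k][l] += step
                Xm[k][l] -= step
                out[k][l] = (f(Xp) - f(Xm)) / (2 * h)
    return d_re, d_im

X = [[1 + 2j, 3 - 1j], [0.5j, -2 + 1j]]

re_tr, im_tr = partials(trace, X)                        # expect I and i*I
re_trc, im_trc = partials(lambda M: trace(conj(M)), X)   # expect I and -i*I

print(re_tr, im_tr)
print(re_trc, im_trc)
```

Multiplying the imaginary-part derivatives by $i$ gives $-\boldsymbol{I}$ for $\text{Tr}(\boldsymbol{X})$ and $+\boldsymbol{I}$ for $\text{Tr}(\boldsymbol{X}^*)$, matching the two sections.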
16.6 Derivative of $\text{Tr}(\boldsymbol{A}\boldsymbol{X}^H)$
Proof
By the definition of the Hermitian transpose:
\begin{equation}(\boldsymbol{X}^H)_{ij} = X_{ji}^* = (\Re X)_{ji} - i(\Im X)_{ji} \label{eq:16-6-1}\end{equation}
Expanding the trace in terms of entries:
\begin{equation}\text{Tr}(\boldsymbol{A}\boldsymbol{X}^H) = \sum_{i,j} A_{ij} (\boldsymbol{X}^H)_{ji} = \sum_{i,j} A_{ij} X_{ij}^* \label{eq:16-6-2}\end{equation}
Rewriting $\eqref{eq:16-6-2}$ using the form in $\eqref{eq:16-6-1}$:
\begin{equation}\text{Tr}(\boldsymbol{A}\boldsymbol{X}^H) = \sum_{i,j} A_{ij} \left[(\Re X)_{ij} - i(\Im X)_{ij}\right] \label{eq:16-6-3}\end{equation}
Differentiating $\eqref{eq:16-6-3}$ with respect to the real part: for the $(k, l)$ entry,
\begin{equation}\frac{\partial \text{Tr}(\boldsymbol{A}\boldsymbol{X}^H)}{\partial (\Re X)_{kl}} = \frac{\partial}{\partial (\Re X)_{kl}} \sum_{i,j} A_{ij} (\Re X)_{ij} = A_{kl} \label{eq:16-6-4}\end{equation}
Writing $\eqref{eq:16-6-4}$ in matrix form:
\begin{equation}\left(\frac{\partial \text{Tr}(\boldsymbol{A}\boldsymbol{X}^H)}{\partial \Re\boldsymbol{X}}\right)_{kl} = A_{kl} \label{eq:16-6-5}\end{equation}
Since $\eqref{eq:16-6-5}$ is simply the $(k,l)$ entry of $\boldsymbol{A}$:
\begin{equation}\frac{\partial \text{Tr}(\boldsymbol{A}\boldsymbol{X}^H)}{\partial \Re\boldsymbol{X}} = \boldsymbol{A} \label{eq:16-6-6}\end{equation}
Similarly, differentiating $\eqref{eq:16-6-3}$ with respect to the imaginary part:
\begin{equation}\frac{\partial \text{Tr}(\boldsymbol{A}\boldsymbol{X}^H)}{\partial (\Im X)_{kl}} = \frac{\partial}{\partial (\Im X)_{kl}} \sum_{i,j} A_{ij} (-i)(\Im X)_{ij} = -iA_{kl} \label{eq:16-6-7}\end{equation}
Writing $\eqref{eq:16-6-7}$ in matrix form:
\begin{equation}\frac{\partial \text{Tr}(\boldsymbol{A}\boldsymbol{X}^H)}{\partial \Im\boldsymbol{X}} = -i\boldsymbol{A} \label{eq:16-6-8}\end{equation}
Multiplying both sides of $\eqref{eq:16-6-8}$ by $i$:
\begin{equation}i \cdot \frac{\partial \text{Tr}(\boldsymbol{A}\boldsymbol{X}^H)}{\partial \Im\boldsymbol{X}} = i \cdot (-i)\boldsymbol{A} = \boldsymbol{A} \label{eq:16-6-9}\end{equation}
From $\eqref{eq:16-6-6}$ and $\eqref{eq:16-6-9}$, the real-part and imaginary-part derivatives (after multiplication by $i$) both give $\boldsymbol{A}$.
16.7 Derivative of $\text{Tr}(\boldsymbol{A}\boldsymbol{X}^*)$
Proof
Expanding the trace in terms of entries:
\begin{equation}\text{Tr}(\boldsymbol{A}\boldsymbol{X}^*) = \sum_{i,j} A_{ij} (\boldsymbol{X}^*)_{ji} = \sum_{i,j} A_{ij} X_{ji}^* \label{eq:16-7-1}\end{equation}
Writing the complex conjugate in terms of real and imaginary parts:
\begin{equation}X_{ji}^* = (\Re X)_{ji} - i(\Im X)_{ji} \label{eq:16-7-2}\end{equation}
Substituting $\eqref{eq:16-7-2}$ into $\eqref{eq:16-7-1}$:
\begin{equation}\text{Tr}(\boldsymbol{A}\boldsymbol{X}^*) = \sum_{i,j} A_{ij} \left[(\Re X)_{ji} - i(\Im X)_{ji}\right] \label{eq:16-7-3}\end{equation}
Differentiating $\eqref{eq:16-7-3}$ with respect to the real part: for the $(k, l)$ entry,
\begin{equation}\frac{\partial \text{Tr}(\boldsymbol{A}\boldsymbol{X}^*)}{\partial (\Re X)_{kl}} = \frac{\partial}{\partial (\Re X)_{kl}} \sum_{i,j} A_{ij} (\Re X)_{ji} = A_{lk} \label{eq:16-7-4}\end{equation}
Here, differentiating $(\Re X)_{ji}$ with respect to $(\Re X)_{kl}$ yields $\delta_{jk}\delta_{il}$, which upon substitution gives $A_{lk}$.
Writing $\eqref{eq:16-7-4}$ in matrix form, using $(\boldsymbol{A}^\top)_{kl} = A_{lk}$:
\begin{equation}\frac{\partial \text{Tr}(\boldsymbol{A}\boldsymbol{X}^*)}{\partial \Re\boldsymbol{X}} = \boldsymbol{A}^\top \label{eq:16-7-5}\end{equation}
Similarly, differentiating $\eqref{eq:16-7-3}$ with respect to the imaginary part:
\begin{equation}\frac{\partial \text{Tr}(\boldsymbol{A}\boldsymbol{X}^*)}{\partial (\Im X)_{kl}} = \frac{\partial}{\partial (\Im X)_{kl}} \sum_{i,j} A_{ij} (-i)(\Im X)_{ji} = -iA_{lk} \label{eq:16-7-6}\end{equation}
Writing $\eqref{eq:16-7-6}$ in matrix form:
\begin{equation}\frac{\partial \text{Tr}(\boldsymbol{A}\boldsymbol{X}^*)}{\partial \Im\boldsymbol{X}} = -i\boldsymbol{A}^\top \label{eq:16-7-7}\end{equation}
Multiplying both sides of $\eqref{eq:16-7-7}$ by $i$:
\begin{equation}i \cdot \frac{\partial \text{Tr}(\boldsymbol{A}\boldsymbol{X}^*)}{\partial \Im\boldsymbol{X}} = \boldsymbol{A}^\top \label{eq:16-7-8}\end{equation}
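From $\eqref{eq:16-7-5}$ and $\eqref{eq:16-7-8}$, the real-part and imaginary-part derivatives (after multiplication by $i$) both give $\boldsymbol{A}^\top$, the transpose of the result $\boldsymbol{A}$ obtained in 16.6.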
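A short numerical comparison of 16.6 and 16.7 follows. It evaluates the traces through the entry-wise sums $\eqref{eq:16-6-2}$ and $\eqref{eq:16-7-1}$ and differentiates them by central differences; the helper `partials` and the $2 \times 2$ test matrices are assumptions made for illustration.

```python
A = [[1 + 1j, 2 + 0j], [3j, -1 + 0j]]
X = [[1 + 2j, 3 - 1j], [0.5j, -2 + 1j]]

def tr_A_XH(X):
    # Tr(A X^H) = sum_ij A_ij conj(X_ij)
    return sum(A[i][j] * X[i][j].conjugate() for i in range(2) for j in range(2))

def tr_A_Xc(X):
    # Tr(A X^*) = sum_ij A_ij conj(X_ji)
    return sum(A[i][j] * X[j][i].conjugate() for i in range(2) for j in range(2))

def partials(f, X, h=1e-6):
    """Central differences of f with respect to each (Re X)_kl and (Im X)_kl."""
    d_re = [[0j] * 2 for _ in range(2)]
    d_im = [[0j] * 2 for _ in range(2)]
    for k in range(2):
        for l in range(2):
            for step, out in ((h, d_re), (1j * h, d_im)):
                Xp = [row[:] for row in X]
                Xm = [row[:] for row in X]
                Xp[k][l] += step
                Xm[k][l] -= step
                out[k][l] = (f(Xp) - f(Xm)) / (2 * h)
    return d_re, d_im

re_H, im_H = partials(tr_A_XH, X)   # compare with A and -i*A
re_C, im_C = partials(tr_A_Xc, X)   # compare with A^T and -i*A^T

print(re_H, re_C)
```

With a non-symmetric $\boldsymbol{A}$, as chosen here, the derivative matrices of the two traces differ by a transpose.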
16.8 Derivative of $\text{Tr}(\boldsymbol{X}\boldsymbol{X}^H)$
Proof
By the cyclic property of the trace (1.12):
\begin{equation}\text{Tr}(\boldsymbol{X}\boldsymbol{X}^H) = \text{Tr}(\boldsymbol{X}^H\boldsymbol{X}) \label{eq:16-8-1}\end{equation}
Expanding $\eqref{eq:16-8-1}$ in terms of entries:
\begin{equation}\text{Tr}(\boldsymbol{X}\boldsymbol{X}^H) = \sum_{i,j} X_{ij} X_{ij}^* = \sum_{i,j} |X_{ij}|^2 \label{eq:16-8-2}\end{equation}
$\eqref{eq:16-8-2}$ is the squared Frobenius norm:
\begin{equation}\text{Tr}(\boldsymbol{X}\boldsymbol{X}^H) = \|\boldsymbol{X}\|_F^2 \label{eq:16-8-3}\end{equation}
Writing the squared absolute value of a complex number in terms of real and imaginary parts:
\begin{equation}|X_{ij}|^2 = (\Re X_{ij})^2 + (\Im X_{ij})^2 \label{eq:16-8-4}\end{equation}
Substituting $\eqref{eq:16-8-4}$ into $\eqref{eq:16-8-2}$:
\begin{equation}\text{Tr}(\boldsymbol{X}\boldsymbol{X}^H) = \sum_{i,j} \left[(\Re X_{ij})^2 + (\Im X_{ij})^2\right] \label{eq:16-8-5}\end{equation}
Differentiating $\eqref{eq:16-8-5}$ with respect to the real part: for the $(k, l)$ entry,
\begin{equation}\frac{\partial \text{Tr}(\boldsymbol{X}\boldsymbol{X}^H)}{\partial (\Re X)_{kl}} = \frac{\partial}{\partial (\Re X)_{kl}} \sum_{i,j} (\Re X_{ij})^2 = 2(\Re X)_{kl} \label{eq:16-8-6}\end{equation}
Writing $\eqref{eq:16-8-6}$ in matrix form:
\begin{equation}\frac{\partial \text{Tr}(\boldsymbol{X}\boldsymbol{X}^H)}{\partial \Re\boldsymbol{X}} = 2\Re\boldsymbol{X} \label{eq:16-8-7}\end{equation}
Similarly, differentiating $\eqref{eq:16-8-5}$ with respect to the imaginary part:
\begin{equation}\frac{\partial \text{Tr}(\boldsymbol{X}\boldsymbol{X}^H)}{\partial (\Im X)_{kl}} = \frac{\partial}{\partial (\Im X)_{kl}} \sum_{i,j} (\Im X_{ij})^2 = 2(\Im X)_{kl} \label{eq:16-8-8}\end{equation}
Writing $\eqref{eq:16-8-8}$ in matrix form:
\begin{equation}\frac{\partial \text{Tr}(\boldsymbol{X}\boldsymbol{X}^H)}{\partial \Im\boldsymbol{X}} = 2\Im\boldsymbol{X} \label{eq:16-8-9}\end{equation}
16.9 Wirtinger Derivative of $\text{Tr}(\boldsymbol{X}\boldsymbol{X}^H)$
Proof
We use the matrix-extended Wirtinger derivative definition from $\eqref{eq:16-1-13}$ in 16.1:
\begin{equation}\frac{\partial f}{\partial \boldsymbol{X}} = \frac{1}{2}\left(\frac{\partial f}{\partial \Re\boldsymbol{X}} - i\frac{\partial f}{\partial \Im\boldsymbol{X}}\right) \label{eq:16-9-1}\end{equation}
From $\eqref{eq:16-8-7}$ and $\eqref{eq:16-8-9}$ in 16.8:
\begin{equation}\frac{\partial \text{Tr}(\boldsymbol{X}\boldsymbol{X}^H)}{\partial \Re\boldsymbol{X}} = 2\Re\boldsymbol{X}, \quad \frac{\partial \text{Tr}(\boldsymbol{X}\boldsymbol{X}^H)}{\partial \Im\boldsymbol{X}} = 2\Im\boldsymbol{X} \label{eq:16-9-2}\end{equation}
Substituting $\eqref{eq:16-9-2}$ into $\eqref{eq:16-9-1}$:
\begin{equation}\frac{\partial \text{Tr}(\boldsymbol{X}\boldsymbol{X}^H)}{\partial \boldsymbol{X}} = \frac{1}{2}\left(2\Re\boldsymbol{X} - i \cdot 2\Im\boldsymbol{X}\right) = \Re\boldsymbol{X} - i\Im\boldsymbol{X} \label{eq:16-9-3}\end{equation}
By the definition of the complex conjugate:
\begin{equation}\boldsymbol{X}^* = \Re\boldsymbol{X} - i\Im\boldsymbol{X} \label{eq:16-9-4}\end{equation}
From $\eqref{eq:16-9-3}$ and $\eqref{eq:16-9-4}$:
\begin{equation}\frac{\partial \text{Tr}(\boldsymbol{X}\boldsymbol{X}^H)}{\partial \boldsymbol{X}} = \boldsymbol{X}^* \label{eq:16-9-5}\end{equation}
Similarly, using the conjugate derivative definition from $\eqref{eq:16-1-14}$ in 16.1:
\begin{equation}\frac{\partial f}{\partial \boldsymbol{X}^*} = \frac{1}{2}\left(\frac{\partial f}{\partial \Re\boldsymbol{X}} + i\frac{\partial f}{\partial \Im\boldsymbol{X}}\right) \label{eq:16-9-6}\end{equation}
Substituting $\eqref{eq:16-9-2}$ into $\eqref{eq:16-9-6}$:
\begin{equation}\frac{\partial \text{Tr}(\boldsymbol{X}\boldsymbol{X}^H)}{\partial \boldsymbol{X}^*} = \frac{1}{2}\left(2\Re\boldsymbol{X} + i \cdot 2\Im\boldsymbol{X}\right) = \Re\boldsymbol{X} + i\Im\boldsymbol{X} = \boldsymbol{X} \label{eq:16-9-7}\end{equation}
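The results $\eqref{eq:16-9-5}$ and $\eqref{eq:16-9-7}$ can be verified entry by entry. The sketch below assumes a hypothetical helper `wirtinger_matrix` that applies $\eqref{eq:16-9-1}$ and $\eqref{eq:16-9-6}$ to finite-difference estimates of the real-part and imaginary-part derivatives, using an arbitrary $2 \times 3$ test matrix.

```python
def frob2(X):
    """Tr(X X^H) computed as the sum of squared absolute values of the entries."""
    return sum(abs(x) ** 2 for row in X for x in row)

def wirtinger_matrix(f, X, h=1e-6):
    """Return (df/dX, df/dX*) as nested lists, one entry at a time."""
    n, m = len(X), len(X[0])
    d_X = [[0j] * m for _ in range(n)]
    d_Xc = [[0j] * m for _ in range(n)]
    for k in range(n):
        for l in range(m):
            def shifted(step):
                Y = [row[:] for row in X]
                Y[k][l] += step
                return f(Y)
            d_re = (shifted(h) - shifted(-h)) / (2 * h)
            d_im = (shifted(1j * h) - shifted(-1j * h)) / (2 * h)
            d_X[k][l] = 0.5 * (d_re - 1j * d_im)
            d_Xc[k][l] = 0.5 * (d_re + 1j * d_im)
    return d_X, d_Xc

X = [[1 + 2j, 3 - 1j, 0.5j], [-2 + 1j, 1j, 2 + 0j]]
d_X, d_Xc = wirtinger_matrix(frob2, X)

print(d_X)    # close to the entry-wise conjugate of X
print(d_Xc)   # close to X itself
```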
16.10 Complex Gradient of the Frobenius Norm
Proof
$\text{Tr}(\boldsymbol{X}\boldsymbol{X}^H) = \|\boldsymbol{X}\|_F^2$ is a real-valued function:
\begin{equation}f = \text{Tr}(\boldsymbol{X}\boldsymbol{X}^H) \in \mathbb{R} \label{eq:16-10-1}\end{equation}
From $\eqref{eq:16-2-4}$ in 16.2, the complex gradient of a real-valued function is
\begin{equation}\nabla f = 2\frac{\partial f}{\partial \boldsymbol{X}^*} \label{eq:16-10-2}\end{equation}
From $\eqref{eq:16-9-7}$ in 16.9:
\begin{equation}\frac{\partial \text{Tr}(\boldsymbol{X}\boldsymbol{X}^H)}{\partial \boldsymbol{X}^*} = \boldsymbol{X} \label{eq:16-10-3}\end{equation}
Substituting $\eqref{eq:16-10-3}$ into $\eqref{eq:16-10-2}$:
\begin{equation}\nabla\text{Tr}(\boldsymbol{X}\boldsymbol{X}^H) = 2\boldsymbol{X} \label{eq:16-10-4}\end{equation}
By $\eqref{eq:16-8-3}$ in 16.8, $\eqref{eq:16-10-4}$ is the complex gradient of the squared Frobenius norm:
\begin{equation}\nabla\|\boldsymbol{X}\|_F^2 = 2\boldsymbol{X} \label{eq:16-10-5}\end{equation}
16.11 Derivative of $\det(\boldsymbol{X}^H\boldsymbol{A}\boldsymbol{X})$
$\displaystyle\frac{\partial \det(\boldsymbol{X}^H\boldsymbol{A}\boldsymbol{X})}{\partial \boldsymbol{X}^*} = \det(\boldsymbol{X}^H\boldsymbol{A}\boldsymbol{X})\boldsymbol{A}\boldsymbol{X}(\boldsymbol{X}^H\boldsymbol{A}\boldsymbol{X})^{-1}$
Proof
Define the auxiliary matrix $\boldsymbol{M}$:
\begin{equation}\boldsymbol{M} = \boldsymbol{X}^H\boldsymbol{A}\boldsymbol{X} \label{eq:16-11-1}\end{equation}
By the determinant differential formula:
\begin{equation}d(\det\boldsymbol{M}) = \det(\boldsymbol{M})\text{Tr}(\boldsymbol{M}^{-1}d\boldsymbol{M}) \label{eq:16-11-2}\end{equation}
Differentiating $\eqref{eq:16-11-1}$ using the product rule (1.25):
\begin{equation}d\boldsymbol{M} = d(\boldsymbol{X}^H)\boldsymbol{A}\boldsymbol{X} + \boldsymbol{X}^H\boldsymbol{A}(d\boldsymbol{X}) \label{eq:16-11-3}\end{equation}
Substituting $\eqref{eq:16-11-3}$ into $\eqref{eq:16-11-2}$:
\begin{equation}\text{Tr}(\boldsymbol{M}^{-1}d\boldsymbol{M}) = \text{Tr}(\boldsymbol{M}^{-1}d\boldsymbol{X}^H\boldsymbol{A}\boldsymbol{X}) + \text{Tr}(\boldsymbol{M}^{-1}\boldsymbol{X}^H\boldsymbol{A}\,d\boldsymbol{X}) \label{eq:16-11-4}\end{equation}
Applying the cyclic property of the trace (1.12) to the first term of $\eqref{eq:16-11-4}$:
\begin{equation}\text{Tr}(\boldsymbol{M}^{-1}d\boldsymbol{X}^H\boldsymbol{A}\boldsymbol{X}) = \text{Tr}(\boldsymbol{A}\boldsymbol{X}\boldsymbol{M}^{-1}d\boldsymbol{X}^H) \label{eq:16-11-5}\end{equation}
Combining $\eqref{eq:16-11-4}$ and $\eqref{eq:16-11-5}$:
\begin{equation}\text{Tr}(\boldsymbol{M}^{-1}d\boldsymbol{M}) = \text{Tr}(\boldsymbol{A}\boldsymbol{X}\boldsymbol{M}^{-1}d\boldsymbol{X}^H) + \text{Tr}(\boldsymbol{M}^{-1}\boldsymbol{X}^H\boldsymbol{A}\,d\boldsymbol{X}) \label{eq:16-11-6}\end{equation}
In the Wirtinger framework, the term corresponding to $d\boldsymbol{X}^H$ yields the coefficient of $\partial/\partial\boldsymbol{X}^*$, and the term corresponding to $d\boldsymbol{X}$ yields the coefficient of $\partial/\partial\boldsymbol{X}$.
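Reading off the $\partial/\partial\boldsymbol{X}^*$ derivative from the first term of $\eqref{eq:16-11-6}$: since $\text{Tr}(\boldsymbol{B}\,d\boldsymbol{X}^H) = \sum_{i,j} B_{ij}\,dX_{ij}^*$, the coefficient matrix of $d\boldsymbol{X}^*$ is $\boldsymbol{B}$ itself (no transpose), here with $\boldsymbol{B} = \boldsymbol{A}\boldsymbol{X}\boldsymbol{M}^{-1}$. Including the factor $\det\boldsymbol{M}$ from $\eqref{eq:16-11-2}$,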
\begin{equation}\frac{\partial \det\boldsymbol{M}}{\partial \boldsymbol{X}^*} = \det\boldsymbol{M} \cdot \boldsymbol{A}\boldsymbol{X}\boldsymbol{M}^{-1} \label{eq:16-11-7}\end{equation}
Reading off the $\partial/\partial\boldsymbol{X}$ derivative from the second term of $\eqref{eq:16-11-6}$: since $\text{Tr}(\boldsymbol{C}d\boldsymbol{X})$ has coefficient matrix $\boldsymbol{C}^\top$,
\begin{equation}\frac{\partial \det\boldsymbol{M}}{\partial \boldsymbol{X}} = \det\boldsymbol{M} \cdot (\boldsymbol{M}^{-1}\boldsymbol{X}^H\boldsymbol{A})^\top \label{eq:16-11-8}\end{equation}
Substituting $\boldsymbol{M} = \boldsymbol{X}^H\boldsymbol{A}\boldsymbol{X}$ yields the final result.
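The formula for $\partial\det(\boldsymbol{X}^H\boldsymbol{A}\boldsymbol{X})/\partial\boldsymbol{X}^*$ can be spot-checked with finite differences. The sketch below assumes a $3 \times 2$ matrix $\boldsymbol{X}$ and a non-Hermitian $3 \times 3$ matrix $\boldsymbol{A}$ chosen so that $\boldsymbol{M}$ is invertible; all helper names and numerical values are illustrative.

```python
def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q))) for j in range(len(Q[0]))]
            for i in range(len(P))]

def herm(X):
    """Conjugate transpose of a nested-list matrix with complex entries."""
    return [[X[i][j].conjugate() for i in range(len(X))] for j in range(len(X[0]))]

def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def inv2(M):
    d = det2(M)
    return [[M[1][1] / d, -M[0][1] / d], [-M[1][0] / d, M[0][0] / d]]

A = [[2 + 0.5j, 1j, 0], [-1j, 3 + 0.5j, 1], [0, 1, 1 + 0.5j]]
X = [[1 + 1j, 0.5 + 0j], [2j, 1 - 1j], [-1 + 0j, 1j]]

def f(X):
    return det2(matmul(matmul(herm(X), A), X))

def conj_wirtinger(f, X, h=1e-5):
    """df/dX* entry by entry: 0.5 * (d/dRe + i d/dIm) via central differences."""
    out = [[0j] * len(X[0]) for _ in range(len(X))]
    for k in range(len(X)):
        for l in range(len(X[0])):
            def shifted(step):
                Y = [row[:] for row in X]
                Y[k][l] += step
                return f(Y)
            d_re = (shifted(h) - shifted(-h)) / (2 * h)
            d_im = (shifted(1j * h) - shifted(-1j * h)) / (2 * h)
            out[k][l] = 0.5 * (d_re + 1j * d_im)
    return out

M = matmul(matmul(herm(X), A), X)
AXMinv = matmul(matmul(A, X), inv2(M))
closed_form = [[det2(M) * v for v in row] for row in AXMinv]   # det(M) A X M^{-1}
numeric = conj_wirtinger(f, X)

print(max(abs(numeric[k][l] - closed_form[k][l]) for k in range(3) for l in range(2)))
```

The closed-form inverse and determinant are specific to the $2 \times 2$ size of $\boldsymbol{M}$ in this example; a general implementation would use a linear algebra library instead.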
16.12 Derivative of the Complex Rayleigh Quotient
Proof
Define the Rayleigh quotient $R(\boldsymbol{x})$:
\begin{equation}R(\boldsymbol{x}) = \frac{(\boldsymbol{A}\boldsymbol{x})^H(\boldsymbol{A}\boldsymbol{x})}{(\boldsymbol{B}\boldsymbol{x})^H(\boldsymbol{B}\boldsymbol{x})} = \frac{\boldsymbol{x}^H\boldsymbol{A}^H\boldsymbol{A}\boldsymbol{x}}{\boldsymbol{x}^H\boldsymbol{B}^H\boldsymbol{B}\boldsymbol{x}} \label{eq:16-12-1}\end{equation}
Define the numerator and denominator separately:
\begin{equation}f = \boldsymbol{x}^H\boldsymbol{A}^H\boldsymbol{A}\boldsymbol{x}, \quad g = \boldsymbol{x}^H\boldsymbol{B}^H\boldsymbol{B}\boldsymbol{x} \label{eq:16-12-2}\end{equation}
Applying the quotient rule (1.28):
\begin{equation}\frac{\partial R}{\partial \boldsymbol{x}} = \frac{\partial}{\partial \boldsymbol{x}}\left(\frac{f}{g}\right) = \frac{1}{g}\frac{\partial f}{\partial \boldsymbol{x}} - \frac{f}{g^2}\frac{\partial g}{\partial \boldsymbol{x}} \label{eq:16-12-3}\end{equation}
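Next, compute the Wirtinger derivative of the quadratic form $\boldsymbol{x}^H\boldsymbol{M}\boldsymbol{x}$ for a Hermitian matrix $\boldsymbol{M}$ (here $\boldsymbol{M} = \boldsymbol{A}^H\boldsymbol{A}$ or $\boldsymbol{M} = \boldsymbol{B}^H\boldsymbol{B}$). Treating $\boldsymbol{x}^*$ as a constant gives $\boldsymbol{M}^\top\boldsymbol{x}^*$, and $\boldsymbol{M}^\top = \boldsymbol{M}^*$ for Hermitian $\boldsymbol{M}$, so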
\begin{equation}\frac{\partial (\boldsymbol{x}^H\boldsymbol{M}\boldsymbol{x})}{\partial \boldsymbol{x}} = (\boldsymbol{M}\boldsymbol{x})^* \label{eq:16-12-4}\end{equation}
Applying $\eqref{eq:16-12-4}$ to $f$ and $g$:
\begin{equation}\frac{\partial f}{\partial \boldsymbol{x}} = (\boldsymbol{A}^H\boldsymbol{A}\boldsymbol{x})^* \label{eq:16-12-5}\end{equation}
\begin{equation}\frac{\partial g}{\partial \boldsymbol{x}} = (\boldsymbol{B}^H\boldsymbol{B}\boldsymbol{x})^* \label{eq:16-12-6}\end{equation}
Substituting $\eqref{eq:16-12-5}$ and $\eqref{eq:16-12-6}$ into $\eqref{eq:16-12-3}$, the Wirtinger derivative $\partial R/\partial \boldsymbol{x}$ is
\begin{equation}\frac{\partial R}{\partial \boldsymbol{x}} = \frac{1}{g}(\boldsymbol{A}^H\boldsymbol{A}\boldsymbol{x})^* - \frac{f}{g^2}(\boldsymbol{B}^H\boldsymbol{B}\boldsymbol{x})^* \label{eq:16-12-7}\end{equation}
Computing the complex gradient $\nabla R = 2\partial R/\partial \boldsymbol{x}^*$ from $\eqref{eq:16-2-4}$: since $R$ is real-valued, $\partial R/\partial \boldsymbol{x}^* = (\partial R/\partial \boldsymbol{x})^*$ (see 16.2), and $f$ and $g$ are real. Conjugating $\eqref{eq:16-12-7}$ and multiplying by 2 gives
\begin{equation}\nabla R = 2\frac{\boldsymbol{A}^H\boldsymbol{A}\boldsymbol{x}}{\boldsymbol{x}^H\boldsymbol{B}^H\boldsymbol{B}\boldsymbol{x}} - 2\frac{\boldsymbol{x}^H\boldsymbol{A}^H\boldsymbol{A}\boldsymbol{x} \cdot \boldsymbol{B}^H\boldsymbol{B}\boldsymbol{x}}{(\boldsymbol{x}^H\boldsymbol{B}^H\boldsymbol{B}\boldsymbol{x})^2} \label{eq:16-12-8}\end{equation}
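The gradient $\eqref{eq:16-12-8}$ can be compared with the definition $\nabla R = \partial R/\partial\Re\boldsymbol{x} + i\,\partial R/\partial\Im\boldsymbol{x}$ from $\eqref{eq:16-2-4}$. The sketch below does this for arbitrary $2 \times 2$ matrices and a two-component vector; all names and values are assumptions for illustration.

```python
def matvec(M, v):
    return [sum(M[i][k] * v[k] for k in range(len(v))) for i in range(len(M))]

def herm(M):
    return [[M[i][j].conjugate() for i in range(len(M))] for j in range(len(M[0]))]

def norm2(v):
    return sum(abs(c) ** 2 for c in v)

A = [[1 + 1j, 2 + 0j], [0j, 1 - 1j]]
B = [[1 + 0j, 0j], [1j, 2 + 0j]]
x = [1 + 1j, 0.5 - 2j]

def R(x):
    """Rayleigh quotient |Ax|^2 / |Bx|^2."""
    return norm2(matvec(A, x)) / norm2(matvec(B, x))

# Closed-form gradient: 2 A^H A x / g - 2 f B^H B x / g^2.
f_val, g_val = norm2(matvec(A, x)), norm2(matvec(B, x))
AhAx = matvec(herm(A), matvec(A, x))
BhBx = matvec(herm(B), matvec(B, x))
closed_form = [2 * AhAx[k] / g_val - 2 * f_val * BhBx[k] / g_val ** 2 for k in range(2)]

def numeric_gradient(R, x, h=1e-6):
    """dR/dRe x_k + i dR/dIm x_k via central differences."""
    grad = []
    for k in range(len(x)):
        def shifted(step):
            y = list(x)
            y[k] += step
            return R(y)
        d_re = (shifted(h) - shifted(-h)) / (2 * h)
        d_im = (shifted(1j * h) - shifted(-1j * h)) / (2 * h)
        grad.append(d_re + 1j * d_im)
    return grad

numeric = numeric_gradient(R, x)
print(closed_form, numeric)
```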
16.13 Derivative of the Complex Quadratic Form $(a - \boldsymbol{x}^H \boldsymbol{b})^2$
Proof
Define the auxiliary variable $z$:
\begin{equation}z = a - \boldsymbol{x}^H \boldsymbol{b} = a - \sum_{i=0}^{n-1} \bar{x}_i b_i \label{eq:16-13-1}\end{equation}
Define the scalar function $f$:
\begin{equation}f = z^2 = (a - \boldsymbol{x}^H \boldsymbol{b})^2 \label{eq:16-13-2}\end{equation}
In the Wirtinger framework, $\boldsymbol{x}$ and $\bar{\boldsymbol{x}}$ are treated as independent variables. From $\eqref{eq:16-13-1}$, $z$ depends on $\bar{x}_k$ but not directly on $x_k$:
\begin{equation}\frac{\partial z}{\partial \bar{x}_k} = -b_k \label{eq:16-13-3}\end{equation}
\begin{equation}\frac{\partial z}{\partial x_k} = 0 \label{eq:16-13-4}\end{equation}
Differentiating $f = z^2$ by the chain rule (1.26):
\begin{equation}\frac{\partial f}{\partial \bar{x}_k} = \frac{\partial (z^2)}{\partial z} \cdot \frac{\partial z}{\partial \bar{x}_k} = 2z \cdot (-b_k) = -2b_k z \label{eq:16-13-5}\end{equation}
Writing $\eqref{eq:16-13-5}$ in vector form:
\begin{equation}\frac{\partial f}{\partial \bar{\boldsymbol{x}}} = -2\boldsymbol{b}z \label{eq:16-13-6}\end{equation}
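The conjugation rule for Wirtinger derivatives is $\overline{\partial f/\partial \bar{\boldsymbol{x}}} = \partial \bar{f}/\partial \boldsymbol{x}$. It reduces to $\partial f/\partial \boldsymbol{x} = \overline{\partial f/\partial \bar{\boldsymbol{x}}}$ only when $f$ is real-valued, which $f = z^2$ is not in general. Hence:
\begin{equation}\frac{\partial \bar{f}}{\partial \boldsymbol{x}} = \overline{\frac{\partial f}{\partial \bar{\boldsymbol{x}}}} \label{eq:16-13-7}\end{equation}
Substituting $\eqref{eq:16-13-6}$ into $\eqref{eq:16-13-7}$:
\begin{equation}\frac{\partial \bar{f}}{\partial \boldsymbol{x}} = \overline{-2\boldsymbol{b}z} = -2\bar{\boldsymbol{b}} \bar{z} \label{eq:16-13-8}\end{equation}
Since $\bar{z} = z^* = (a - \boldsymbol{x}^H \boldsymbol{b})^*$:
\begin{equation}\frac{\partial \left[(a - \boldsymbol{x}^H \boldsymbol{b})^2\right]^*}{\partial \boldsymbol{x}} = -2\bar{\boldsymbol{b}} (a - \boldsymbol{x}^H \boldsymbol{b})^* \label{eq:16-13-9}\end{equation}
For $f$ itself, $\eqref{eq:16-13-4}$ and the chain rule give $\partial f/\partial \boldsymbol{x} = \boldsymbol{0}$, so the only nonzero Wirtinger derivative of $f$ is the conjugate derivative $\eqref{eq:16-13-6}$.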
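The conjugate derivative $\eqref{eq:16-13-6}$ can be spot-checked numerically. The sketch below assumes arbitrary complex values for $a$, $\boldsymbol{b}$, and $\boldsymbol{x}$ and estimates $\partial f/\partial \bar{x}_k = \frac{1}{2}(\partial f/\partial \Re x_k + i\,\partial f/\partial \Im x_k)$ by central differences; the helper names are hypothetical.

```python
a = 2.0 - 1.0j
b = [1 + 2j, -0.5 + 1j]
x = [0.3 - 0.4j, 1 + 0.5j]

def z_of(x):
    """z = a - x^H b = a - sum_i conj(x_i) b_i."""
    return a - sum(xi.conjugate() * bi for xi, bi in zip(x, b))

def f(x):
    return z_of(x) ** 2

def conj_wirtinger(f, x, h=1e-6):
    """df/d(conj x_k) for each k, from central differences in Re x_k and Im x_k."""
    out = []
    for k in range(len(x)):
        def shifted(step):
            y = list(x)
            y[k] += step
            return f(y)
        d_re = (shifted(h) - shifted(-h)) / (2 * h)
        d_im = (shifted(1j * h) - shifted(-1j * h)) / (2 * h)
        out.append(0.5 * (d_re + 1j * d_im))
    return out

numeric = conj_wirtinger(f, x)
closed_form = [-2 * bk * z_of(x) for bk in b]   # -2 b z

print(numeric, closed_form)
```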
References
- Petersen, K. B., & Pedersen, M. S. (2012). The Matrix Cookbook. Technical University of Denmark.
- Magnus, J. R., & Neudecker, H. (1999). Matrix Differential Calculus with Applications in Statistics and Econometrics (Revised ed.). Wiley.
- Matrix calculus - Wikipedia