Proofs Chapter 16: Derivatives of Complex Matrices

Wirtinger Derivatives, Complex Gradients, and Derivatives of Complex Trace / Determinant

This chapter proves derivative formulas for functions of complex matrices within the framework of Wirtinger derivatives. Complex differentiation is an indispensable mathematical tool in signal processing and communications engineering, appearing in independent component analysis (ICA), adaptive filtering, signal processing for MIMO antenna systems, and optimization of density matrices in quantum information theory. Wirtinger derivatives allow the gradient of non-holomorphic functions (such as the squared absolute value) to be computed directly in terms of $z$ and $z^*$, without decomposing into real and imaginary parts.

Prerequisites: Chapter 5 (Derivatives of Trace), Chapter 7 (Derivatives of Determinant). Related chapter: Chapter 15 (Derivatives of Special Matrices).

16. Derivatives of Complex Matrices

Assumptions of this chapter
Unless stated otherwise, all formulas in this chapter hold under the following conditions:
  • All formulas follow the denominator layout convention
  • Complex differentiation uses Wirtinger derivatives ($\frac{\partial}{\partial z}$ and $\frac{\partial}{\partial z^*}$)
  • The gradient of a real-valued function is given by $2\frac{\partial f}{\partial z^*}$ (see 16.2)

We derive Wirtinger derivatives for functions involving complex conjugates, and differentiation formulas for the complex trace.

16.1 Wirtinger Derivatives

Formula: $\displaystyle\frac{\partial f}{\partial z} = \displaystyle\frac{1}{2}\left(\displaystyle\frac{\partial f}{\partial \Re z} - i\displaystyle\frac{\partial f}{\partial \Im z}\right)$, $\quad\displaystyle\frac{\partial f}{\partial z^*} = \displaystyle\frac{1}{2}\left(\displaystyle\frac{\partial f}{\partial \Re z} + i\displaystyle\frac{\partial f}{\partial \Im z}\right)$
Conditions: $f$ is a complex-valued function that is differentiable with respect to $\Re z$ and $\Im z$, $z = \Re z + i\Im z$
Proof

Decompose the complex number $z$ into its real and imaginary parts.

\begin{equation}z = x + iy \label{eq:16-1-1}\end{equation}

where $x = \Re z$ and $y = \Im z$.

From $\eqref{eq:16-1-1}$, expressing $z$ and $z^*$ in terms of $x$ and $y$:

\begin{equation}z = x + iy, \quad z^* = x - iy \label{eq:16-1-2}\end{equation}

Solving $\eqref{eq:16-1-2}$ for $x$ and $y$: adding the two equations gives

\begin{equation}z + z^* = 2x \quad \Rightarrow \quad x = \frac{z + z^*}{2} \label{eq:16-1-3}\end{equation}

Subtracting the two equations gives

\begin{equation}z - z^* = 2iy \quad \Rightarrow \quad y = \frac{z - z^*}{2i} \label{eq:16-1-4}\end{equation}

Treating $f(z)$ as $f(x, y)$ and applying the chain rule, the partial derivative of $f$ with respect to $z$ is

\begin{equation}\frac{\partial f}{\partial z} = \frac{\partial f}{\partial x}\frac{\partial x}{\partial z} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial z} \label{eq:16-1-5}\end{equation}

From $\eqref{eq:16-1-3}$, treating $z^*$ as a constant, we compute $\partial x / \partial z$:

\begin{equation}\frac{\partial x}{\partial z} = \frac{1}{2} \label{eq:16-1-6}\end{equation}

From $\eqref{eq:16-1-4}$, we compute $\partial y / \partial z$:

\begin{equation}\frac{\partial y}{\partial z} = \frac{1}{2i} = -\frac{i}{2} \label{eq:16-1-7}\end{equation}

Substituting $\eqref{eq:16-1-6}$ and $\eqref{eq:16-1-7}$ into $\eqref{eq:16-1-5}$:

\begin{equation}\frac{\partial f}{\partial z} = \frac{\partial f}{\partial x} \cdot \frac{1}{2} + \frac{\partial f}{\partial y} \cdot \left(-\frac{i}{2}\right) = \frac{1}{2}\left(\frac{\partial f}{\partial x} - i\frac{\partial f}{\partial y}\right) \label{eq:16-1-8}\end{equation}

Similarly, differentiate $f$ with respect to $z^*$:

\begin{equation}\frac{\partial f}{\partial z^*} = \frac{\partial f}{\partial x}\frac{\partial x}{\partial z^*} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial z^*} \label{eq:16-1-9}\end{equation}

From $\eqref{eq:16-1-3}$, treating $z$ as a constant, we compute $\partial x / \partial z^*$:

\begin{equation}\frac{\partial x}{\partial z^*} = \frac{1}{2} \label{eq:16-1-10}\end{equation}

From $\eqref{eq:16-1-4}$, we compute $\partial y / \partial z^*$:

\begin{equation}\frac{\partial y}{\partial z^*} = -\frac{1}{2i} = \frac{i}{2} \label{eq:16-1-11}\end{equation}

Substituting $\eqref{eq:16-1-10}$ and $\eqref{eq:16-1-11}$ into $\eqref{eq:16-1-9}$:

\begin{equation}\frac{\partial f}{\partial z^*} = \frac{\partial f}{\partial x} \cdot \frac{1}{2} + \frac{\partial f}{\partial y} \cdot \frac{i}{2} = \frac{1}{2}\left(\frac{\partial f}{\partial x} + i\frac{\partial f}{\partial y}\right) \label{eq:16-1-12}\end{equation}

Substituting $x = \Re z$ and $y = \Im z$ yields the final result:

\begin{equation}\frac{\partial f}{\partial z} = \frac{1}{2}\left(\frac{\partial f}{\partial \Re z} - i\frac{\partial f}{\partial \Im z}\right) \label{eq:16-1-13}\end{equation}

\begin{equation}\frac{\partial f}{\partial z^*} = \frac{1}{2}\left(\frac{\partial f}{\partial \Re z} + i\frac{\partial f}{\partial \Im z}\right) \label{eq:16-1-14}\end{equation}

Notes: Functions involving complex conjugates (e.g., $f(z) = z^*$) do not satisfy the Cauchy–Riemann equations and are therefore not complex-differentiable in the classical sense. Wirtinger derivatives circumvent this problem: they only require $f$ to be differentiable with respect to $x$ and $y$. When $f$ is holomorphic, $\partial f/\partial z$ coincides with the ordinary complex derivative, and $\partial f/\partial z^* = 0$.
Source: W. Wirtinger (1927) "Zur formalen Theorie der Funktionen von mehr komplexen Veränderlichen", Mathematische Annalen 97, 357–375.
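The two formulas of 16.1 can be checked numerically. The following is a minimal sketch in plain Python, assuming central differences with a hypothetical step `h`; the helper name `wirtinger` and the test point `z0` are illustrative choices, not part of the derivation.

```python
def wirtinger(f, z, h=1e-6):
    """Central-difference estimates of (df/dz, df/dz*) at the point z."""
    df_dx = (f(z + h) - f(z - h)) / (2 * h)            # slope along the real axis
    df_dy = (f(z + 1j * h) - f(z - 1j * h)) / (2 * h)  # slope along the imaginary axis
    # Combine the real partials exactly as in the two formulas of 16.1.
    return 0.5 * (df_dx - 1j * df_dy), 0.5 * (df_dx + 1j * df_dy)

z0 = 1.5 - 0.5j  # arbitrary test point

# Non-holomorphic example: |z|^2 = z z*, so df/dz = z* and df/dz* = z.
abs2_dz, abs2_dzc = wirtinger(lambda z: abs(z) ** 2, z0)

# Holomorphic example: z^2, so df/dz = 2z and df/dz* = 0.
sq_dz, sq_dzc = wirtinger(lambda z: z ** 2, z0)

print("d|z|^2/dz, d|z|^2/dz*:", abs2_dz, abs2_dzc)
print("dz^2/dz,   dz^2/dz*:  ", sq_dz, sq_dzc)
```

Up to finite-difference error, the first pair should match $(z_0^*, z_0)$ and the second pair $(2z_0, 0)$, which illustrates the remark that $\partial f/\partial z^*$ vanishes for holomorphic $f$.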

16.2 Complex Gradient Vector

Formula: $\nabla f(\boldsymbol{z}) = 2\displaystyle\frac{\partial f(\boldsymbol{z})}{\partial \boldsymbol{z}^*} = \displaystyle\frac{\partial f(\boldsymbol{z})}{\partial \Re\boldsymbol{z}} + i\displaystyle\frac{\partial f(\boldsymbol{z})}{\partial \Im\boldsymbol{z}}$
Conditions: $f(\boldsymbol{z})$ is a real-valued function, $\boldsymbol{z} \in \mathbb{C}^n$
Proof

Define the complex gradient of the real-valued function $f(\boldsymbol{z})$. Since $f$ is real-valued, $f = f^*$ holds.

From $\eqref{eq:16-1-14}$ in 16.1, the component-wise Wirtinger derivative is

\begin{equation}\frac{\partial f}{\partial z_k^*} = \frac{1}{2}\left(\frac{\partial f}{\partial x_k} + i\frac{\partial f}{\partial y_k}\right) \label{eq:16-2-1}\end{equation}

where $z_k = x_k + iy_k$ ($x_k = \Re z_k$, $y_k = \Im z_k$).

Multiplying both sides of $\eqref{eq:16-2-1}$ by 2:

\begin{equation}2\frac{\partial f}{\partial z_k^*} = \frac{\partial f}{\partial x_k} + i\frac{\partial f}{\partial y_k} \label{eq:16-2-2}\end{equation}

Writing $\eqref{eq:16-2-2}$ in vector form:

\begin{equation}2\frac{\partial f}{\partial \boldsymbol{z}^*} = \frac{\partial f}{\partial \Re\boldsymbol{z}} + i\frac{\partial f}{\partial \Im\boldsymbol{z}} \label{eq:16-2-3}\end{equation}

Define the complex gradient $\nabla f$ by the right-hand side of $\eqref{eq:16-2-3}$:

\begin{equation}\nabla f(\boldsymbol{z}) \stackrel{\text{def}}{=} \frac{\partial f}{\partial \Re\boldsymbol{z}} + i\frac{\partial f}{\partial \Im\boldsymbol{z}} = 2\frac{\partial f}{\partial \boldsymbol{z}^*} \label{eq:16-2-4}\end{equation}

We verify that this definition yields the steepest descent direction. The total differential of $f$ is

\begin{equation}df = \sum_k \left(\frac{\partial f}{\partial x_k}dx_k + \frac{\partial f}{\partial y_k}dy_k\right) \label{eq:16-2-5}\end{equation}

From $dz_k = dx_k + idy_k$ and $dz_k^* = dx_k - idy_k$:

\begin{equation}dx_k = \frac{dz_k + dz_k^*}{2}, \quad dy_k = \frac{dz_k - dz_k^*}{2i} \label{eq:16-2-6}\end{equation}

Substituting $\eqref{eq:16-2-6}$ into $\eqref{eq:16-2-5}$ and simplifying:

\begin{equation}df = \sum_k \left(\frac{\partial f}{\partial z_k}dz_k + \frac{\partial f}{\partial z_k^*}dz_k^*\right) \label{eq:16-2-7}\end{equation}

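When $f$ is real-valued, $\partial f/\partial z_k = (\partial f/\partial z_k^*)^*$ holds, so the two terms in each summand of $\eqref{eq:16-2-7}$ are complex conjugates of each other and $df = 2\Re\left[\left(\partial f/\partial \boldsymbol{z}^*\right)^H d\boldsymbol{z}\right]$. By the Cauchy–Schwarz inequality, for a fixed step length $\|d\boldsymbol{z}\|$ this expression is minimized when $d\boldsymbol{z}$ is a negative multiple of $\partial f/\partial \boldsymbol{z}^*$. The steepest descent direction is therefore $-\nabla f = -2\partial f/\partial \boldsymbol{z}^*$.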

Notes: With this definition, $-\nabla f$ is the steepest descent direction, exactly as in the real case. This is used in the optimization of complex neural networks and adaptive filters.
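As an illustration of the steepest-descent property, the sketch below runs a few gradient steps on a simple real-valued cost. It is only a sketch under stated assumptions: the cost $\sum_k |z_k - t_k|^2$, the vector `target`, the step size `step`, and the finite-difference helper `complex_gradient` are all hypothetical choices made for this example.

```python
def complex_gradient(f, z, h=1e-6):
    """Estimate grad f = df/dRe(z) + i*df/dIm(z) (= 2 df/dz*) for real-valued f."""
    grad = []
    for k in range(len(z)):
        def shifted(delta):
            w = list(z)
            w[k] += delta          # perturb only the k-th component
            return f(w)
        df_dx = (shifted(h) - shifted(-h)) / (2 * h)
        df_dy = (shifted(1j * h) - shifted(-1j * h)) / (2 * h)
        grad.append(df_dx + 1j * df_dy)
    return grad

target = [1 + 2j, -0.5j]  # hypothetical minimiser of the example cost

def cost(z):
    """Real-valued cost sum_k |z_k - target_k|^2."""
    return sum(abs(zk - tk) ** 2 for zk, tk in zip(z, target))

z = [0j, 0j]
step = 0.25               # hypothetical step size
initial_cost = cost(z)
for _ in range(50):
    g = complex_gradient(cost, z)
    z = [zk - step * gk for zk, gk in zip(z, g)]   # move along -grad f
final_cost = cost(z)

print("cost before:", initial_cost, "cost after:", final_cost)
print("z after descent:", z)
```

For this cost the gradient is $2(\boldsymbol{z} - \boldsymbol{t})$, so each step with `step = 0.25` halves the distance to `target`, and the iterate should approach it up to finite-difference error.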

16.3 Chain Rule for Complex Derivatives

Formula: $\displaystyle\frac{\partial g}{\partial z} = \displaystyle\frac{\partial g}{\partial f}\displaystyle\frac{\partial f}{\partial z} + \displaystyle\frac{\partial g}{\partial f^*}\displaystyle\frac{\partial f^*}{\partial z}$, $\quad\displaystyle\frac{\partial g}{\partial z^*} = \displaystyle\frac{\partial g}{\partial f}\displaystyle\frac{\partial f}{\partial z^*} + \displaystyle\frac{\partial g}{\partial f^*}\displaystyle\frac{\partial f^*}{\partial z^*}$
Conditions: $g(f(z))$ is a composite function
Proof

Consider the composite function $h(z) = g(f(z), f^*(z))$. In the Wirtinger framework, $f$ and $f^*$ are treated as independent variables.

Write $h$ in terms of real and imaginary parts. Setting $z = x + iy$ and $f = u + iv$:

\begin{equation}h = h(x, y), \quad f = f(x, y) = u(x, y) + iv(x, y) \label{eq:16-3-1}\end{equation}

The Wirtinger derivative of $h$ with respect to $z$, from $\eqref{eq:16-1-13}$ in 16.1, is

\begin{equation}\frac{\partial h}{\partial z} = \frac{1}{2}\left(\frac{\partial h}{\partial x} - i\frac{\partial h}{\partial y}\right) \label{eq:16-3-2}\end{equation}

Since $h$ depends on $x$ and $y$ through $f$ and $f^*$, applying the chain rule:

\begin{equation}\frac{\partial h}{\partial x} = \frac{\partial g}{\partial f}\frac{\partial f}{\partial x} + \frac{\partial g}{\partial f^*}\frac{\partial f^*}{\partial x} \label{eq:16-3-3}\end{equation}

\begin{equation}\frac{\partial h}{\partial y} = \frac{\partial g}{\partial f}\frac{\partial f}{\partial y} + \frac{\partial g}{\partial f^*}\frac{\partial f^*}{\partial y} \label{eq:16-3-4}\end{equation}

Substituting $\eqref{eq:16-3-3}$ and $\eqref{eq:16-3-4}$ into $\eqref{eq:16-3-2}$:

\begin{equation}\frac{\partial h}{\partial z} = \frac{1}{2}\left[\frac{\partial g}{\partial f}\left(\frac{\partial f}{\partial x} - i\frac{\partial f}{\partial y}\right) + \frac{\partial g}{\partial f^*}\left(\frac{\partial f^*}{\partial x} - i\frac{\partial f^*}{\partial y}\right)\right] \label{eq:16-3-5}\end{equation}

From $\eqref{eq:16-1-13}$ in 16.1:

\begin{equation}\frac{\partial f}{\partial z} = \frac{1}{2}\left(\frac{\partial f}{\partial x} - i\frac{\partial f}{\partial y}\right) \label{eq:16-3-6}\end{equation}

\begin{equation}\frac{\partial f^*}{\partial z} = \frac{1}{2}\left(\frac{\partial f^*}{\partial x} - i\frac{\partial f^*}{\partial y}\right) \label{eq:16-3-7}\end{equation}

Substituting $\eqref{eq:16-3-6}$ and $\eqref{eq:16-3-7}$ into $\eqref{eq:16-3-5}$:

\begin{equation}\frac{\partial h}{\partial z} = \frac{\partial g}{\partial f}\frac{\partial f}{\partial z} + \frac{\partial g}{\partial f^*}\frac{\partial f^*}{\partial z} \label{eq:16-3-8}\end{equation}

Similarly, compute the Wirtinger derivative of $h$ with respect to $z^*$. From $\eqref{eq:16-1-14}$ in 16.1:

\begin{equation}\frac{\partial h}{\partial z^*} = \frac{1}{2}\left(\frac{\partial h}{\partial x} + i\frac{\partial h}{\partial y}\right) \label{eq:16-3-9}\end{equation}

Substituting $\eqref{eq:16-3-3}$ and $\eqref{eq:16-3-4}$ into $\eqref{eq:16-3-9}$:

\begin{equation}\frac{\partial h}{\partial z^*} = \frac{1}{2}\left[\frac{\partial g}{\partial f}\left(\frac{\partial f}{\partial x} + i\frac{\partial f}{\partial y}\right) + \frac{\partial g}{\partial f^*}\left(\frac{\partial f^*}{\partial x} + i\frac{\partial f^*}{\partial y}\right)\right] \label{eq:16-3-10}\end{equation}

From $\eqref{eq:16-1-14}$ in 16.1:

\begin{equation}\frac{\partial f}{\partial z^*} = \frac{1}{2}\left(\frac{\partial f}{\partial x} + i\frac{\partial f}{\partial y}\right) \label{eq:16-3-11}\end{equation}

\begin{equation}\frac{\partial f^*}{\partial z^*} = \frac{1}{2}\left(\frac{\partial f^*}{\partial x} + i\frac{\partial f^*}{\partial y}\right) \label{eq:16-3-12}\end{equation}

Substituting $\eqref{eq:16-3-11}$ and $\eqref{eq:16-3-12}$ into $\eqref{eq:16-3-10}$:

\begin{equation}\frac{\partial h}{\partial z^*} = \frac{\partial g}{\partial f}\frac{\partial f}{\partial z^*} + \frac{\partial g}{\partial f^*}\frac{\partial f^*}{\partial z^*} \label{eq:16-3-13}\end{equation}

Notes: Unlike the real chain rule, one must account for both paths through $f$ and $f^*$. When $f$ is holomorphic, $\partial f/\partial z^* = 0$ and $\partial f^*/\partial z = 0$, and the formula reduces to the ordinary chain rule.
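The two-path structure of the chain rule can be checked numerically. The sketch below assumes a hypothetical non-holomorphic inner function $f(z) = z^2 + z^*$ and outer function $g(w) = |w|^2$, and estimates every factor by central differences; the helper `wirtinger` is illustrative only.

```python
def wirtinger(f, z, h=1e-6):
    """Central-difference estimates of (df/dz, df/dz*) at the point z."""
    df_dx = (f(z + h) - f(z - h)) / (2 * h)
    df_dy = (f(z + 1j * h) - f(z - 1j * h)) / (2 * h)
    return 0.5 * (df_dx - 1j * df_dy), 0.5 * (df_dx + 1j * df_dy)

def f(z):
    return z ** 2 + z.conjugate()   # hypothetical non-holomorphic inner function

def g(w):
    return abs(w) ** 2              # hypothetical outer function, depends on w and w*

def h_comp(z):
    return g(f(z))                  # the composite h(z)

z0 = 0.7 + 0.3j
w0 = f(z0)

dg_df, dg_dfc = wirtinger(g, w0)                               # dg/df,  dg/df*
df_dz, df_dzc = wirtinger(f, z0)                               # df/dz,  df/dz*
dfc_dz, dfc_dzc = wirtinger(lambda z: f(z).conjugate(), z0)    # df*/dz, df*/dz*

# Chain rule: sum over the path through f and the path through f*.
chain_dz = dg_df * df_dz + dg_dfc * dfc_dz
chain_dzc = dg_df * df_dzc + dg_dfc * dfc_dzc

# Direct finite-difference derivative of the composite, for comparison.
direct_dz, direct_dzc = wirtinger(h_comp, z0)

print(chain_dz, direct_dz)
print(chain_dzc, direct_dzc)
```

Dropping the second term (the path through $f^*$) would give a visibly different value here, since neither $f$ nor $g$ is holomorphic.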

16.4 Derivative of $\text{Tr}(\boldsymbol{X}^*)$

Formula: $\displaystyle\frac{\partial \text{Tr}(\boldsymbol{X}^*)}{\partial \Re\boldsymbol{X}} = \boldsymbol{I}$, $\quad\displaystyle\frac{\partial \text{Tr}(\boldsymbol{X}^*)}{\partial \Im\boldsymbol{X}} = -i\boldsymbol{I}$
Conditions: $\boldsymbol{X} \in \mathbb{C}^{n \times n}$, $\boldsymbol{X}^* = \Re\boldsymbol{X} - i\Im\boldsymbol{X}$ (element-wise complex conjugate)
Proof

Decompose the entries of $\boldsymbol{X}$ into real and imaginary parts:

\begin{equation}X_{ij} = (\Re X)_{ij} + i(\Im X)_{ij} \label{eq:16-4-1}\end{equation}

The complex conjugate is

\begin{equation}X_{ij}^* = (\Re X)_{ij} - i(\Im X)_{ij} \label{eq:16-4-2}\end{equation}

By the definition of the trace:

\begin{equation}\text{Tr}(\boldsymbol{X}^*) = \sum_{i=0}^{n-1} X_{ii}^* \label{eq:16-4-3}\end{equation}

Substituting $\eqref{eq:16-4-2}$ into $\eqref{eq:16-4-3}$:

\begin{equation}\text{Tr}(\boldsymbol{X}^*) = \sum_{i=0}^{n-1} \left[(\Re X)_{ii} - i(\Im X)_{ii}\right] \label{eq:16-4-4}\end{equation}

Differentiating $\eqref{eq:16-4-4}$ with respect to the real part: for the $(k, l)$ entry,

\begin{equation}\frac{\partial \text{Tr}(\boldsymbol{X}^*)}{\partial (\Re X)_{kl}} = \frac{\partial}{\partial (\Re X)_{kl}} \sum_{i=0}^{n-1} (\Re X)_{ii} = \delta_{kl} \label{eq:16-4-5}\end{equation}

where $\delta_{kl}$ is the Kronecker delta.

Writing $\eqref{eq:16-4-5}$ in matrix form:

\begin{equation}\frac{\partial \text{Tr}(\boldsymbol{X}^*)}{\partial \Re\boldsymbol{X}} = \boldsymbol{I} \label{eq:16-4-6}\end{equation}

Similarly, differentiating $\eqref{eq:16-4-4}$ with respect to the imaginary part:

\begin{equation}\frac{\partial \text{Tr}(\boldsymbol{X}^*)}{\partial (\Im X)_{kl}} = \frac{\partial}{\partial (\Im X)_{kl}} \sum_{i=0}^{n-1} (-i)(\Im X)_{ii} = -i\delta_{kl} \label{eq:16-4-7}\end{equation}

Writing $\eqref{eq:16-4-7}$ in matrix form:

\begin{equation}\frac{\partial \text{Tr}(\boldsymbol{X}^*)}{\partial \Im\boldsymbol{X}} = -i\boldsymbol{I} \label{eq:16-4-8}\end{equation}

Multiplying both sides of $\eqref{eq:16-4-8}$ by $i$:

\begin{equation}i \cdot \frac{\partial \text{Tr}(\boldsymbol{X}^*)}{\partial \Im\boldsymbol{X}} = i \cdot (-i)\boldsymbol{I} = \boldsymbol{I} \label{eq:16-4-9}\end{equation}

From $\eqref{eq:16-4-6}$ and $\eqref{eq:16-4-9}$, the real-part and imaginary-part derivatives (after multiplication by $i$) have the same sign, giving $\boldsymbol{I}$ in both cases.

Notes: Since $\boldsymbol{X}^H = (\boldsymbol{X}^*)^\top$, we have $\text{Tr}(\boldsymbol{X}^H) = \text{Tr}(\boldsymbol{X}^*)$, so the same result holds.

16.5 Derivative of $\text{Tr}(\boldsymbol{X})$

Formula: $\displaystyle\frac{\partial \text{Tr}(\boldsymbol{X})}{\partial \Re\boldsymbol{X}} = \boldsymbol{I}$, $\quad\displaystyle\frac{\partial \text{Tr}(\boldsymbol{X})}{\partial \Im\boldsymbol{X}} = i\boldsymbol{I}$
Conditions: $\boldsymbol{X} \in \mathbb{C}^{n \times n}$
Proof

Decompose the entries of $\boldsymbol{X}$ into real and imaginary parts:

\begin{equation}X_{ij} = (\Re X)_{ij} + i(\Im X)_{ij} \label{eq:16-5-1}\end{equation}

By the definition of the trace:

\begin{equation}\text{Tr}(\boldsymbol{X}) = \sum_{i=0}^{n-1} X_{ii} \label{eq:16-5-2}\end{equation}

Substituting $\eqref{eq:16-5-1}$ into $\eqref{eq:16-5-2}$:

\begin{equation}\text{Tr}(\boldsymbol{X}) = \sum_{i=0}^{n-1} \left[(\Re X)_{ii} + i(\Im X)_{ii}\right] \label{eq:16-5-3}\end{equation}

Differentiating $\eqref{eq:16-5-3}$ with respect to the real part: for the $(k, l)$ entry,

\begin{equation}\frac{\partial \text{Tr}(\boldsymbol{X})}{\partial (\Re X)_{kl}} = \frac{\partial}{\partial (\Re X)_{kl}} \sum_{i=0}^{n-1} (\Re X)_{ii} = \delta_{kl} \label{eq:16-5-4}\end{equation}

Writing $\eqref{eq:16-5-4}$ in matrix form:

\begin{equation}\frac{\partial \text{Tr}(\boldsymbol{X})}{\partial \Re\boldsymbol{X}} = \boldsymbol{I} \label{eq:16-5-5}\end{equation}

Similarly, differentiating $\eqref{eq:16-5-3}$ with respect to the imaginary part:

\begin{equation}\frac{\partial \text{Tr}(\boldsymbol{X})}{\partial (\Im X)_{kl}} = \frac{\partial}{\partial (\Im X)_{kl}} \sum_{i=0}^{n-1} i(\Im X)_{ii} = i\delta_{kl} \label{eq:16-5-6}\end{equation}

Writing $\eqref{eq:16-5-6}$ in matrix form:

\begin{equation}\frac{\partial \text{Tr}(\boldsymbol{X})}{\partial \Im\boldsymbol{X}} = i\boldsymbol{I} \label{eq:16-5-7}\end{equation}

Multiplying both sides of $\eqref{eq:16-5-7}$ by $i$:

\begin{equation}i \cdot \frac{\partial \text{Tr}(\boldsymbol{X})}{\partial \Im\boldsymbol{X}} = i \cdot i\boldsymbol{I} = -\boldsymbol{I} \label{eq:16-5-8}\end{equation}

Comparing $\eqref{eq:16-5-5}$ and $\eqref{eq:16-5-8}$, the real-part and imaginary-part derivatives (after multiplication by $i$) have opposite signs. This contrasts with the case of $\text{Tr}(\boldsymbol{X}^*)$ in 16.4.

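Notes: The sign of the imaginary-part derivative differs between $\text{Tr}(\boldsymbol{X}^*)$ and $\text{Tr}(\boldsymbol{X})$. Substituting into the Wirtinger definitions of 16.1 shows the consequence: $\text{Tr}(\boldsymbol{X})$ has $\partial/\partial\boldsymbol{X} = \boldsymbol{I}$ and $\partial/\partial\boldsymbol{X}^* = \boldsymbol{0}$, whereas $\text{Tr}(\boldsymbol{X}^*)$ has $\partial/\partial\boldsymbol{X} = \boldsymbol{0}$ and $\partial/\partial\boldsymbol{X}^* = \boldsymbol{I}$.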
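The sign difference between 16.4 and 16.5 can be seen numerically. The following sketch uses plain nested lists as matrices and a hypothetical helper `partials` that estimates the entry-wise derivatives with respect to $\Re\boldsymbol{X}$ and $\Im\boldsymbol{X}$ by central differences; the test matrix is arbitrary.

```python
def partials(f, X, h=1e-6):
    """Entry-wise central differences of f with respect to Re X and Im X."""
    rows, cols = len(X), len(X[0])
    d_re = [[0j] * cols for _ in range(rows)]
    d_im = [[0j] * cols for _ in range(rows)]
    for k in range(rows):
        for l in range(cols):
            def shifted(delta):
                Y = [row[:] for row in X]
                Y[k][l] += delta      # perturb only the (k, l) entry
                return f(Y)
            d_re[k][l] = (shifted(h) - shifted(-h)) / (2 * h)
            d_im[k][l] = (shifted(1j * h) - shifted(-1j * h)) / (2 * h)
    return d_re, d_im

def max_abs_diff(P, Q):
    """Largest entry-wise deviation between two equally shaped matrices."""
    return max(abs(p - q) for rp, rq in zip(P, Q) for p, q in zip(rp, rq))

def trace(X):
    return sum(X[i][i] for i in range(len(X)))

def trace_conj(X):
    return sum(X[i][i].conjugate() for i in range(len(X)))

X = [[1 + 2j, 0.5j], [-1 + 0j, 3 - 1j]]     # arbitrary 2x2 test matrix
identity = [[1, 0], [0, 1]]
i_identity = [[1j, 0], [0, 1j]]
minus_i_identity = [[-1j, 0], [0, -1j]]

re_tr, im_tr = partials(trace, X)           # expected: I and  iI  (16.5)
re_trc, im_trc = partials(trace_conj, X)    # expected: I and -iI  (16.4)

print(max_abs_diff(re_tr, identity), max_abs_diff(im_tr, i_identity))
print(max_abs_diff(re_trc, identity), max_abs_diff(im_trc, minus_i_identity))
```

All four printed deviations should be at the level of finite-difference rounding error.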

16.6 Derivative of $\text{Tr}(\boldsymbol{A}\boldsymbol{X}^H)$

Formula: $\displaystyle\frac{\partial \text{Tr}(\boldsymbol{A}\boldsymbol{X}^H)}{\partial \Re\boldsymbol{X}} = \boldsymbol{A}$, $\quad i\displaystyle\frac{\partial \text{Tr}(\boldsymbol{A}\boldsymbol{X}^H)}{\partial \Im\boldsymbol{X}} = \boldsymbol{A}$
Conditions: $\boldsymbol{A}$ is a constant matrix, $\boldsymbol{X}^H = (\boldsymbol{X}^*)^\top$ (Hermitian transpose)
Proof

By the definition of the Hermitian transpose:

\begin{equation}(\boldsymbol{X}^H)_{ij} = X_{ji}^* = (\Re X)_{ji} - i(\Im X)_{ji} \label{eq:16-6-1}\end{equation}

Expanding the trace in terms of entries:

\begin{equation}\text{Tr}(\boldsymbol{A}\boldsymbol{X}^H) = \sum_{i,j} A_{ij} (\boldsymbol{X}^H)_{ji} = \sum_{i,j} A_{ij} X_{ij}^* \label{eq:16-6-2}\end{equation}

Rewriting $\eqref{eq:16-6-2}$ using the form in $\eqref{eq:16-6-1}$:

\begin{equation}\text{Tr}(\boldsymbol{A}\boldsymbol{X}^H) = \sum_{i,j} A_{ij} \left[(\Re X)_{ij} - i(\Im X)_{ij}\right] \label{eq:16-6-3}\end{equation}

Differentiating $\eqref{eq:16-6-3}$ with respect to the real part: for the $(k, l)$ entry,

\begin{equation}\frac{\partial \text{Tr}(\boldsymbol{A}\boldsymbol{X}^H)}{\partial (\Re X)_{kl}} = \frac{\partial}{\partial (\Re X)_{kl}} \sum_{i,j} A_{ij} (\Re X)_{ij} = A_{kl} \label{eq:16-6-4}\end{equation}

In terms of the entries of the derivative matrix, $\eqref{eq:16-6-4}$ reads:

\begin{equation}\left(\frac{\partial \text{Tr}(\boldsymbol{A}\boldsymbol{X}^H)}{\partial \Re\boldsymbol{X}}\right)_{kl} = A_{kl} \label{eq:16-6-5}\end{equation}

Since $\eqref{eq:16-6-5}$ is simply the $(k,l)$ entry of $\boldsymbol{A}$:

\begin{equation}\frac{\partial \text{Tr}(\boldsymbol{A}\boldsymbol{X}^H)}{\partial \Re\boldsymbol{X}} = \boldsymbol{A} \label{eq:16-6-6}\end{equation}

Similarly, differentiating $\eqref{eq:16-6-3}$ with respect to the imaginary part:

\begin{equation}\frac{\partial \text{Tr}(\boldsymbol{A}\boldsymbol{X}^H)}{\partial (\Im X)_{kl}} = \frac{\partial}{\partial (\Im X)_{kl}} \sum_{i,j} A_{ij} (-i)(\Im X)_{ij} = -iA_{kl} \label{eq:16-6-7}\end{equation}

Writing $\eqref{eq:16-6-7}$ in matrix form:

\begin{equation}\frac{\partial \text{Tr}(\boldsymbol{A}\boldsymbol{X}^H)}{\partial \Im\boldsymbol{X}} = -i\boldsymbol{A} \label{eq:16-6-8}\end{equation}

Multiplying both sides of $\eqref{eq:16-6-8}$ by $i$:

\begin{equation}i \cdot \frac{\partial \text{Tr}(\boldsymbol{A}\boldsymbol{X}^H)}{\partial \Im\boldsymbol{X}} = i \cdot (-i)\boldsymbol{A} = \boldsymbol{A} \label{eq:16-6-9}\end{equation}

From $\eqref{eq:16-6-6}$ and $\eqref{eq:16-6-9}$, the real-part and imaginary-part derivatives (after multiplication by $i$) both give $\boldsymbol{A}$.

Notes: The real-part and imaginary-part derivatives (after multiplication by $i$) both yield $\boldsymbol{A}$, with no transpose. This matches the real case $\partial\text{Tr}(\boldsymbol{A}\boldsymbol{X}^\top)/\partial\boldsymbol{X} = \boldsymbol{A}$; a transpose appears only when $\boldsymbol{X}$ enters untransposed, as in $\partial\text{Tr}(\boldsymbol{A}\boldsymbol{X})/\partial\boldsymbol{X} = \boldsymbol{A}^\top$. The reason is visible in the entry-wise form $\text{Tr}(\boldsymbol{A}\boldsymbol{X}^H) = \sum_{ij} A_{ij} X^*_{ij}$: differentiating with respect to $(\Re X)_{kl}$ picks out $A_{kl}$, so the result is $\boldsymbol{A}$ without a transpose.
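A quick numerical check of this result, written as a sketch in plain Python. The matrices `A` and `X` (both $2 \times 3$ here) are arbitrary test values, and `matmul`, `hermitian`, `partials`, and `max_abs_diff` are hypothetical helpers defined only for this example.

```python
def matmul(P, Q):
    """Plain-Python matrix product of nested lists."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def hermitian(X):
    """Conjugate transpose X^H."""
    return [[X[i][j].conjugate() for i in range(len(X))] for j in range(len(X[0]))]

def trace(M):
    return sum(M[i][i] for i in range(len(M)))

def partials(f, X, h=1e-6):
    """Entry-wise central differences of f with respect to Re X and Im X."""
    rows, cols = len(X), len(X[0])
    d_re = [[0j] * cols for _ in range(rows)]
    d_im = [[0j] * cols for _ in range(rows)]
    for k in range(rows):
        for l in range(cols):
            def shifted(delta):
                Y = [row[:] for row in X]
                Y[k][l] += delta
                return f(Y)
            d_re[k][l] = (shifted(h) - shifted(-h)) / (2 * h)
            d_im[k][l] = (shifted(1j * h) - shifted(-1j * h)) / (2 * h)
    return d_re, d_im

def max_abs_diff(P, Q):
    return max(abs(p - q) for rp, rq in zip(P, Q) for p, q in zip(rp, rq))

A = [[2 - 1j, 0.5j, 1 + 0j], [-1j, 3 + 0j, 0.25 + 0.5j]]     # constant, 2x3
X = [[1 + 2j, -0.5j, 0.3 + 0j], [0.7 - 0.2j, 1j, -1 + 1j]]   # variable, 2x3

d_re, d_im = partials(lambda Y: trace(matmul(A, hermitian(Y))), X)
i_d_im = [[1j * v for v in row] for row in d_im]   # multiply the Im-part derivative by i

err_re = max_abs_diff(d_re, A)      # d Tr(A X^H) / d Re X   vs  A
err_im = max_abs_diff(i_d_im, A)    # i d Tr(A X^H) / d Im X vs  A
print(err_re, err_im)
```

Both deviations should be negligible, and the comparison is against `A` itself, not its transpose.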

16.7 Derivative of $\text{Tr}(\boldsymbol{A}\boldsymbol{X}^*)$

Formula: $\displaystyle\frac{\partial \text{Tr}(\boldsymbol{A}\boldsymbol{X}^*)}{\partial \Re\boldsymbol{X}} = \boldsymbol{A}^\top$, $\quad i\displaystyle\frac{\partial \text{Tr}(\boldsymbol{A}\boldsymbol{X}^*)}{\partial \Im\boldsymbol{X}} = \boldsymbol{A}^\top$
Conditions: $\boldsymbol{A}$ is a constant matrix, $\boldsymbol{X}^*$ (element-wise complex conjugate)
Proof

Expanding the trace in terms of entries:

\begin{equation}\text{Tr}(\boldsymbol{A}\boldsymbol{X}^*) = \sum_{i,j} A_{ij} (\boldsymbol{X}^*)_{ji} = \sum_{i,j} A_{ij} X_{ji}^* \label{eq:16-7-1}\end{equation}

Writing the complex conjugate in terms of real and imaginary parts:

\begin{equation}X_{ji}^* = (\Re X)_{ji} - i(\Im X)_{ji} \label{eq:16-7-2}\end{equation}

Substituting $\eqref{eq:16-7-2}$ into $\eqref{eq:16-7-1}$:

\begin{equation}\text{Tr}(\boldsymbol{A}\boldsymbol{X}^*) = \sum_{i,j} A_{ij} \left[(\Re X)_{ji} - i(\Im X)_{ji}\right] \label{eq:16-7-3}\end{equation}

Differentiating $\eqref{eq:16-7-3}$ with respect to the real part: for the $(k, l)$ entry,

\begin{equation}\frac{\partial \text{Tr}(\boldsymbol{A}\boldsymbol{X}^*)}{\partial (\Re X)_{kl}} = \frac{\partial}{\partial (\Re X)_{kl}} \sum_{i,j} A_{ij} (\Re X)_{ji} = A_{lk} \label{eq:16-7-4}\end{equation}

Here, differentiating $(\Re X)_{ji}$ with respect to $(\Re X)_{kl}$ yields $\delta_{jk}\delta_{il}$, which upon substitution gives $A_{lk}$.

Writing $\eqref{eq:16-7-4}$ in matrix form, using $(\boldsymbol{A}^\top)_{kl} = A_{lk}$:

\begin{equation}\frac{\partial \text{Tr}(\boldsymbol{A}\boldsymbol{X}^*)}{\partial \Re\boldsymbol{X}} = \boldsymbol{A}^\top \label{eq:16-7-5}\end{equation}

Similarly, differentiating $\eqref{eq:16-7-3}$ with respect to the imaginary part:

\begin{equation}\frac{\partial \text{Tr}(\boldsymbol{A}\boldsymbol{X}^*)}{\partial (\Im X)_{kl}} = \frac{\partial}{\partial (\Im X)_{kl}} \sum_{i,j} A_{ij} (-i)(\Im X)_{ji} = -iA_{lk} \label{eq:16-7-6}\end{equation}

Writing $\eqref{eq:16-7-6}$ in matrix form:

\begin{equation}\frac{\partial \text{Tr}(\boldsymbol{A}\boldsymbol{X}^*)}{\partial \Im\boldsymbol{X}} = -i\boldsymbol{A}^\top \label{eq:16-7-7}\end{equation}

Multiplying both sides of $\eqref{eq:16-7-7}$ by $i$:

\begin{equation}i \cdot \frac{\partial \text{Tr}(\boldsymbol{A}\boldsymbol{X}^*)}{\partial \Im\boldsymbol{X}} = \boldsymbol{A}^\top \label{eq:16-7-8}\end{equation}

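From $\eqref{eq:16-7-5}$ and $\eqref{eq:16-7-8}$, both derivatives give $\boldsymbol{A}^\top$, that is, the result of 16.6 with $\boldsymbol{A}$ replaced by $\boldsymbol{A}^\top$.

Notes: The result is the same as for $\text{Tr}(\boldsymbol{A}\boldsymbol{X}^H)$ but with a transpose. This can be understood from the identity $\text{Tr}(\boldsymbol{A}\boldsymbol{X}^*) = \text{Tr}((\boldsymbol{A}\boldsymbol{X}^*)^\top) = \text{Tr}(\boldsymbol{X}^H\boldsymbol{A}^\top) = \text{Tr}(\boldsymbol{A}^\top\boldsymbol{X}^H)$, to which 16.6 applies with $\boldsymbol{A}$ replaced by $\boldsymbol{A}^\top$.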
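The transpose in this result can be confirmed numerically. The sketch below assumes an arbitrary $3 \times 2$ matrix `A` and $2 \times 3$ matrix `X` so that $\boldsymbol{A}\boldsymbol{X}^*$ is square; all helper names are hypothetical and exist only for this example.

```python
def matmul(P, Q):
    """Plain-Python matrix product of nested lists."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def conj(X):
    """Element-wise complex conjugate X^* (no transpose)."""
    return [[v.conjugate() for v in row] for row in X]

def transpose(M):
    return [list(col) for col in zip(*M)]

def trace(M):
    return sum(M[i][i] for i in range(len(M)))

def partials(f, X, h=1e-6):
    """Entry-wise central differences of f with respect to Re X and Im X."""
    rows, cols = len(X), len(X[0])
    d_re = [[0j] * cols for _ in range(rows)]
    d_im = [[0j] * cols for _ in range(rows)]
    for k in range(rows):
        for l in range(cols):
            def shifted(delta):
                Y = [row[:] for row in X]
                Y[k][l] += delta
                return f(Y)
            d_re[k][l] = (shifted(h) - shifted(-h)) / (2 * h)
            d_im[k][l] = (shifted(1j * h) - shifted(-1j * h)) / (2 * h)
    return d_re, d_im

def max_abs_diff(P, Q):
    return max(abs(p - q) for rp, rq in zip(P, Q) for p, q in zip(rp, rq))

A = [[2 - 1j, 0.5j], [-1j, 3 + 0j], [1 + 1j, 0.25 + 0j]]     # constant, 3x2
X = [[1 + 2j, -0.5j, 0.3 + 0j], [0.7 - 0.2j, 1j, -1 + 1j]]   # variable, 2x3

d_re, d_im = partials(lambda Y: trace(matmul(A, conj(Y))), X)
i_d_im = [[1j * v for v in row] for row in d_im]

A_T = transpose(A)                      # 2x3, same shape as X
err_re = max_abs_diff(d_re, A_T)        # d Tr(A X^*) / d Re X   vs  A^T
err_im = max_abs_diff(i_d_im, A_T)      # i d Tr(A X^*) / d Im X vs  A^T
print(err_re, err_im)
```

Because `A` is not square here, comparing against `A` instead of `A_T` would not even be shape-compatible, which makes the role of the transpose explicit.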

16.8 Derivative of $\text{Tr}(\boldsymbol{X}\boldsymbol{X}^H)$

Formula: $\displaystyle\frac{\partial \text{Tr}(\boldsymbol{X}\boldsymbol{X}^H)}{\partial \Re\boldsymbol{X}} = 2\Re\boldsymbol{X}$, $\quad\displaystyle\frac{\partial \text{Tr}(\boldsymbol{X}\boldsymbol{X}^H)}{\partial \Im\boldsymbol{X}} = 2\Im\boldsymbol{X}$
Conditions: $\boldsymbol{X} \in \mathbb{C}^{m \times n}$
Proof

By the cyclic property of the trace (1.12):

\begin{equation}\text{Tr}(\boldsymbol{X}\boldsymbol{X}^H) = \text{Tr}(\boldsymbol{X}^H\boldsymbol{X}) \label{eq:16-8-1}\end{equation}

Expanding $\eqref{eq:16-8-1}$ in terms of entries:

\begin{equation}\text{Tr}(\boldsymbol{X}\boldsymbol{X}^H) = \sum_{i,j} X_{ij} X_{ij}^* = \sum_{i,j} |X_{ij}|^2 \label{eq:16-8-2}\end{equation}

$\eqref{eq:16-8-2}$ is the squared Frobenius norm:

\begin{equation}\text{Tr}(\boldsymbol{X}\boldsymbol{X}^H) = \|\boldsymbol{X}\|_F^2 \label{eq:16-8-3}\end{equation}

Writing the squared absolute value of a complex number in terms of real and imaginary parts:

\begin{equation}|X_{ij}|^2 = (\Re X_{ij})^2 + (\Im X_{ij})^2 \label{eq:16-8-4}\end{equation}

Substituting $\eqref{eq:16-8-4}$ into $\eqref{eq:16-8-2}$:

\begin{equation}\text{Tr}(\boldsymbol{X}\boldsymbol{X}^H) = \sum_{i,j} \left[(\Re X_{ij})^2 + (\Im X_{ij})^2\right] \label{eq:16-8-5}\end{equation}

Differentiating $\eqref{eq:16-8-5}$ with respect to the real part: for the $(k, l)$ entry,

\begin{equation}\frac{\partial \text{Tr}(\boldsymbol{X}\boldsymbol{X}^H)}{\partial (\Re X)_{kl}} = \frac{\partial}{\partial (\Re X)_{kl}} \sum_{i,j} (\Re X_{ij})^2 = 2(\Re X)_{kl} \label{eq:16-8-6}\end{equation}

Writing $\eqref{eq:16-8-6}$ in matrix form:

\begin{equation}\frac{\partial \text{Tr}(\boldsymbol{X}\boldsymbol{X}^H)}{\partial \Re\boldsymbol{X}} = 2\Re\boldsymbol{X} \label{eq:16-8-7}\end{equation}

Similarly, differentiating $\eqref{eq:16-8-5}$ with respect to the imaginary part:

\begin{equation}\frac{\partial \text{Tr}(\boldsymbol{X}\boldsymbol{X}^H)}{\partial (\Im X)_{kl}} = \frac{\partial}{\partial (\Im X)_{kl}} \sum_{i,j} (\Im X_{ij})^2 = 2(\Im X)_{kl} \label{eq:16-8-8}\end{equation}

Writing $\eqref{eq:16-8-8}$ in matrix form:

\begin{equation}\frac{\partial \text{Tr}(\boldsymbol{X}\boldsymbol{X}^H)}{\partial \Im\boldsymbol{X}} = 2\Im\boldsymbol{X} \label{eq:16-8-9}\end{equation}

Notes: $\text{Tr}(\boldsymbol{X}\boldsymbol{X}^H) = \|\boldsymbol{X}\|_F^2$ is the squared Frobenius norm. This derivative appears frequently in matrix least-squares problems.

16.9 Wirtinger Derivative of $\text{Tr}(\boldsymbol{X}\boldsymbol{X}^H)$

Formula: $\displaystyle\frac{\partial \text{Tr}(\boldsymbol{X}\boldsymbol{X}^H)}{\partial \boldsymbol{X}} = \boldsymbol{X}^*$, $\quad\displaystyle\frac{\partial \text{Tr}(\boldsymbol{X}\boldsymbol{X}^H)}{\partial \boldsymbol{X}^*} = \boldsymbol{X}$
Conditions: $\boldsymbol{X} \in \mathbb{C}^{m \times n}$
Proof

We use the matrix-extended Wirtinger derivative definition from $\eqref{eq:16-1-13}$ in 16.1:

\begin{equation}\frac{\partial f}{\partial \boldsymbol{X}} = \frac{1}{2}\left(\frac{\partial f}{\partial \Re\boldsymbol{X}} - i\frac{\partial f}{\partial \Im\boldsymbol{X}}\right) \label{eq:16-9-1}\end{equation}

From $\eqref{eq:16-8-7}$ and $\eqref{eq:16-8-9}$ in 16.8:

\begin{equation}\frac{\partial \text{Tr}(\boldsymbol{X}\boldsymbol{X}^H)}{\partial \Re\boldsymbol{X}} = 2\Re\boldsymbol{X}, \quad \frac{\partial \text{Tr}(\boldsymbol{X}\boldsymbol{X}^H)}{\partial \Im\boldsymbol{X}} = 2\Im\boldsymbol{X} \label{eq:16-9-2}\end{equation}

Substituting $\eqref{eq:16-9-2}$ into $\eqref{eq:16-9-1}$:

\begin{equation}\frac{\partial \text{Tr}(\boldsymbol{X}\boldsymbol{X}^H)}{\partial \boldsymbol{X}} = \frac{1}{2}\left(2\Re\boldsymbol{X} - i \cdot 2\Im\boldsymbol{X}\right) = \Re\boldsymbol{X} - i\Im\boldsymbol{X} \label{eq:16-9-3}\end{equation}

By the definition of the complex conjugate:

\begin{equation}\boldsymbol{X}^* = \Re\boldsymbol{X} - i\Im\boldsymbol{X} \label{eq:16-9-4}\end{equation}

From $\eqref{eq:16-9-3}$ and $\eqref{eq:16-9-4}$:

\begin{equation}\frac{\partial \text{Tr}(\boldsymbol{X}\boldsymbol{X}^H)}{\partial \boldsymbol{X}} = \boldsymbol{X}^* \label{eq:16-9-5}\end{equation}

Similarly, using the conjugate derivative definition from $\eqref{eq:16-1-14}$ in 16.1:

\begin{equation}\frac{\partial f}{\partial \boldsymbol{X}^*} = \frac{1}{2}\left(\frac{\partial f}{\partial \Re\boldsymbol{X}} + i\frac{\partial f}{\partial \Im\boldsymbol{X}}\right) \label{eq:16-9-6}\end{equation}

Substituting $\eqref{eq:16-9-2}$ into $\eqref{eq:16-9-6}$:

\begin{equation}\frac{\partial \text{Tr}(\boldsymbol{X}\boldsymbol{X}^H)}{\partial \boldsymbol{X}^*} = \frac{1}{2}\left(2\Re\boldsymbol{X} + i \cdot 2\Im\boldsymbol{X}\right) = \Re\boldsymbol{X} + i\Im\boldsymbol{X} = \boldsymbol{X} \label{eq:16-9-7}\end{equation}

Notes: Wirtinger derivatives allow the derivative of a complex matrix function to be expressed directly, without decomposing into real and imaginary parts.
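The results of 16.8 and 16.9 can be reproduced numerically by forming the two Wirtinger combinations from finite-difference partials. This is a minimal sketch with an arbitrary $2 \times 3$ test matrix; the helpers `matmul`, `hermitian`, `partials`, and `max_abs_diff` are hypothetical names introduced for the example.

```python
def matmul(P, Q):
    """Plain-Python matrix product of nested lists."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def hermitian(X):
    """Conjugate transpose X^H."""
    return [[X[i][j].conjugate() for i in range(len(X))] for j in range(len(X[0]))]

def trace(M):
    return sum(M[i][i] for i in range(len(M)))

def partials(f, X, h=1e-6):
    """Entry-wise central differences of f with respect to Re X and Im X."""
    rows, cols = len(X), len(X[0])
    d_re = [[0j] * cols for _ in range(rows)]
    d_im = [[0j] * cols for _ in range(rows)]
    for k in range(rows):
        for l in range(cols):
            def shifted(delta):
                Y = [row[:] for row in X]
                Y[k][l] += delta
                return f(Y)
            d_re[k][l] = (shifted(h) - shifted(-h)) / (2 * h)
            d_im[k][l] = (shifted(1j * h) - shifted(-1j * h)) / (2 * h)
    return d_re, d_im

def max_abs_diff(P, Q):
    return max(abs(p - q) for rp, rq in zip(P, Q) for p, q in zip(rp, rq))

X = [[1 + 2j, -0.5j, 0.3 + 0j], [0.7 - 0.2j, 1j, -1 + 1j]]   # arbitrary 2x3 matrix

d_re, d_im = partials(lambda Y: trace(matmul(Y, hermitian(Y))), X)

# Wirtinger combinations, entry by entry.
d_dX = [[0.5 * (r - 1j * m) for r, m in zip(rr, rm)] for rr, rm in zip(d_re, d_im)]
d_dXc = [[0.5 * (r + 1j * m) for r, m in zip(rr, rm)] for rr, rm in zip(d_re, d_im)]

X_conj = [[v.conjugate() for v in row] for row in X]
two_re_X = [[2 * v.real for v in row] for row in X]

err_re = max_abs_diff(d_re, two_re_X)   # 16.8: d/dRe X = 2 Re X
err_dX = max_abs_diff(d_dX, X_conj)     # 16.9: d/dX    = X^*
err_dXc = max_abs_diff(d_dXc, X)        # 16.9: d/dX^*  = X
print(err_re, err_dX, err_dXc)
```

All three deviations should be at rounding level, since the function is quadratic in the real and imaginary parts and central differences are exact for quadratics up to floating-point error.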

16.10 Complex Gradient of the Frobenius Norm

Formula: $\nabla\text{Tr}(\boldsymbol{X}\boldsymbol{X}^H) = \nabla\|\boldsymbol{X}\|_F^2 = 2\boldsymbol{X}$
Conditions: $\boldsymbol{X} \in \mathbb{C}^{m \times n}$
Proof

$\text{Tr}(\boldsymbol{X}\boldsymbol{X}^H) = \|\boldsymbol{X}\|_F^2$ is a real-valued function:

\begin{equation}f = \text{Tr}(\boldsymbol{X}\boldsymbol{X}^H) \in \mathbb{R} \label{eq:16-10-1}\end{equation}

From $\eqref{eq:16-2-4}$ in 16.2, the complex gradient of a real-valued function is

\begin{equation}\nabla f = 2\frac{\partial f}{\partial \boldsymbol{X}^*} \label{eq:16-10-2}\end{equation}

From $\eqref{eq:16-9-7}$ in 16.9:

\begin{equation}\frac{\partial \text{Tr}(\boldsymbol{X}\boldsymbol{X}^H)}{\partial \boldsymbol{X}^*} = \boldsymbol{X} \label{eq:16-10-3}\end{equation}

Substituting $\eqref{eq:16-10-3}$ into $\eqref{eq:16-10-2}$:

\begin{equation}\nabla\text{Tr}(\boldsymbol{X}\boldsymbol{X}^H) = 2\boldsymbol{X} \label{eq:16-10-4}\end{equation}

Since $\text{Tr}(\boldsymbol{X}\boldsymbol{X}^H) = \|\boldsymbol{X}\|_F^2$ (see $\eqref{eq:16-8-3}$ in 16.8), $\eqref{eq:16-10-4}$ gives the complex gradient of the squared Frobenius norm:

\begin{equation}\nabla\|\boldsymbol{X}\|_F^2 = 2\boldsymbol{X} \label{eq:16-10-5}\end{equation}

Notes: This shows that the complex gradient of the squared Frobenius norm $\|\boldsymbol{X}\|_F^2$ is $2\boldsymbol{X}$, the direct analogue of the real-case result. It arises frequently in complex optimization problems such as low-rank approximation of complex matrices.
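A single gradient step makes the meaning of this gradient concrete. The sketch below assumes a hypothetical step size `step` and an arbitrary test matrix; with gradient $2\boldsymbol{X}$, the update $\boldsymbol{X} - \eta \cdot 2\boldsymbol{X} = (1 - 2\eta)\boldsymbol{X}$ scales the squared norm by $(1 - 2\eta)^2$.

```python
def frob2(X):
    """Squared Frobenius norm, i.e. Tr(X X^H) = sum of |X_ij|^2."""
    return sum(abs(v) ** 2 for row in X for v in row)

X = [[3 + 4j, 1 - 1j], [0.5j, -2 + 0j]]   # arbitrary complex test matrix
step = 0.25                                # hypothetical step size (eta)

grad = [[2 * v for v in row] for row in X]                  # complex gradient 2X
X_new = [[v - step * g for v, g in zip(rx, rg)]             # one steepest-descent step
         for rx, rg in zip(X, grad)]

ratio = frob2(X_new) / frob2(X)            # should equal (1 - 2*step)**2
print(frob2(X), frob2(X_new), ratio)
```

With `step = 0.25` the update halves every entry, so the ratio of squared norms is $(1 - 0.5)^2 = 0.25$.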

16.11 Derivative of $\det(\boldsymbol{X}^H\boldsymbol{A}\boldsymbol{X})$

Formula: $\displaystyle\frac{\partial \det(\boldsymbol{X}^H\boldsymbol{A}\boldsymbol{X})}{\partial \boldsymbol{X}} = \det(\boldsymbol{X}^H\boldsymbol{A}\boldsymbol{X})\left((\boldsymbol{X}^H\boldsymbol{A}\boldsymbol{X})^{-1}\boldsymbol{X}^H\boldsymbol{A}\right)^\top$
$\displaystyle\frac{\partial \det(\boldsymbol{X}^H\boldsymbol{A}\boldsymbol{X})}{\partial \boldsymbol{X}^*} = \det(\boldsymbol{X}^H\boldsymbol{A}\boldsymbol{X})\boldsymbol{A}\boldsymbol{X}(\boldsymbol{X}^H\boldsymbol{A}\boldsymbol{X})^{-1}$
Conditions: $\boldsymbol{X} \in \mathbb{C}^{m \times n}$, $\boldsymbol{A}$ is a constant matrix, $\boldsymbol{X}^H\boldsymbol{A}\boldsymbol{X}$ is invertible
Proof

Define the auxiliary matrix $\boldsymbol{M}$:

\begin{equation}\boldsymbol{M} = \boldsymbol{X}^H\boldsymbol{A}\boldsymbol{X} \label{eq:16-11-1}\end{equation}

By the determinant differential formula:

\begin{equation}d(\det\boldsymbol{M}) = \det(\boldsymbol{M})\text{Tr}(\boldsymbol{M}^{-1}d\boldsymbol{M}) \label{eq:16-11-2}\end{equation}

Differentiating $\eqref{eq:16-11-1}$ using the product rule (1.25):

\begin{equation}d\boldsymbol{M} = d(\boldsymbol{X}^H)\boldsymbol{A}\boldsymbol{X} + \boldsymbol{X}^H\boldsymbol{A}(d\boldsymbol{X}) \label{eq:16-11-3}\end{equation}

Substituting $\eqref{eq:16-11-3}$ into $\eqref{eq:16-11-2}$:

\begin{equation}\text{Tr}(\boldsymbol{M}^{-1}d\boldsymbol{M}) = \text{Tr}(\boldsymbol{M}^{-1}d\boldsymbol{X}^H\boldsymbol{A}\boldsymbol{X}) + \text{Tr}(\boldsymbol{M}^{-1}\boldsymbol{X}^H\boldsymbol{A}\,d\boldsymbol{X}) \label{eq:16-11-4}\end{equation}

Applying the cyclic property of the trace (1.12) to the first term of $\eqref{eq:16-11-4}$:

\begin{equation}\text{Tr}(\boldsymbol{M}^{-1}d\boldsymbol{X}^H\boldsymbol{A}\boldsymbol{X}) = \text{Tr}(\boldsymbol{A}\boldsymbol{X}\boldsymbol{M}^{-1}d\boldsymbol{X}^H) \label{eq:16-11-5}\end{equation}

Combining $\eqref{eq:16-11-4}$ and $\eqref{eq:16-11-5}$:

\begin{equation}\text{Tr}(\boldsymbol{M}^{-1}d\boldsymbol{M}) = \text{Tr}(\boldsymbol{A}\boldsymbol{X}\boldsymbol{M}^{-1}d\boldsymbol{X}^H) + \text{Tr}(\boldsymbol{M}^{-1}\boldsymbol{X}^H\boldsymbol{A}\,d\boldsymbol{X}) \label{eq:16-11-6}\end{equation}

In the Wirtinger framework, the term corresponding to $d\boldsymbol{X}^H$ yields the coefficient of $\partial/\partial\boldsymbol{X}^*$, and the term corresponding to $d\boldsymbol{X}$ yields the coefficient of $\partial/\partial\boldsymbol{X}$.

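Reading off the $\partial/\partial\boldsymbol{X}^*$ derivative from the first term of $\eqref{eq:16-11-6}$: since $d\boldsymbol{X}^H = (d\boldsymbol{X}^*)^\top$, we have $\text{Tr}(\boldsymbol{B}\,d\boldsymbol{X}^H) = \sum_{i,j} B_{ij}\,dX^*_{ij}$, so the coefficient matrix of $d\boldsymbol{X}^*$ is $\boldsymbol{B}$ itself. With $\boldsymbol{B} = \boldsymbol{A}\boldsymbol{X}\boldsymbol{M}^{-1}$,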

\begin{equation}\frac{\partial \det\boldsymbol{M}}{\partial \boldsymbol{X}^*} = \det\boldsymbol{M} \cdot \boldsymbol{A}\boldsymbol{X}\boldsymbol{M}^{-1} \label{eq:16-11-7}\end{equation}

Reading off the $\partial/\partial\boldsymbol{X}$ derivative from the second term of $\eqref{eq:16-11-6}$: since $\text{Tr}(\boldsymbol{C}d\boldsymbol{X})$ has coefficient matrix $\boldsymbol{C}^\top$,

\begin{equation}\frac{\partial \det\boldsymbol{M}}{\partial \boldsymbol{X}} = \det\boldsymbol{M} \cdot (\boldsymbol{M}^{-1}\boldsymbol{X}^H\boldsymbol{A})^\top \label{eq:16-11-8}\end{equation}

Substituting $\boldsymbol{M} = \boldsymbol{X}^H\boldsymbol{A}\boldsymbol{X}$ yields the final result.
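
As a quick consistency check (a sketch added for illustration, not part of the derivation above), take $n = 1$, so that $\boldsymbol{X} = \boldsymbol{x} \in \mathbb{C}^m$ and $\boldsymbol{M} = \boldsymbol{x}^H\boldsymbol{A}\boldsymbol{x}$ is a nonzero scalar with $\det\boldsymbol{M} = \boldsymbol{M}$. Then $\eqref{eq:16-11-7}$ and $\eqref{eq:16-11-8}$ reduce to

\begin{equation*}\frac{\partial (\boldsymbol{x}^H\boldsymbol{A}\boldsymbol{x})}{\partial \boldsymbol{x}^*} = \boldsymbol{M}\cdot\boldsymbol{A}\boldsymbol{x}\,\boldsymbol{M}^{-1} = \boldsymbol{A}\boldsymbol{x}, \qquad \frac{\partial (\boldsymbol{x}^H\boldsymbol{A}\boldsymbol{x})}{\partial \boldsymbol{x}} = \boldsymbol{M}\cdot(\boldsymbol{M}^{-1}\boldsymbol{x}^H\boldsymbol{A})^\top = \boldsymbol{A}^\top\boldsymbol{x}^*\end{equation*}

which agrees with differentiating $\boldsymbol{x}^H\boldsymbol{A}\boldsymbol{x} = \sum_{i,j} x_i^* A_{ij} x_j$ entry by entry while treating $\boldsymbol{x}$ and $\boldsymbol{x}^*$ as independent.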

Notes: This formula is used in complex matrix eigenvalue problems and in capacity maximization problems in MIMO communications.
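
The two formulas of this section can also be checked numerically. The following is a minimal sketch, assuming NumPy is available; the helper name `wirtinger_fd`, the matrix sizes, the random seed, and the step size `h` are illustrative choices, not part of the text. It estimates both Wirtinger derivatives by central differences along the real and imaginary directions of each entry, using the definition from 16.1, and compares them with the closed forms.

```python
import numpy as np

def wirtinger_fd(func, X, h=1e-6):
    """Central-difference estimates of (d func/dX, d func/dX^*), entry by entry."""
    d_X = np.zeros(X.shape, dtype=complex)
    d_Xc = np.zeros(X.shape, dtype=complex)
    for idx in np.ndindex(*X.shape):
        E = np.zeros(X.shape, dtype=complex)
        E[idx] = 1.0
        # Partial derivatives with respect to Re(X_ij) and Im(X_ij).
        d_re = (func(X + h * E) - func(X - h * E)) / (2 * h)
        d_im = (func(X + 1j * h * E) - func(X - 1j * h * E)) / (2 * h)
        d_X[idx] = 0.5 * (d_re - 1j * d_im)   # d/dX
        d_Xc[idx] = 0.5 * (d_re + 1j * d_im)  # d/dX^*
    return d_X, d_Xc

# Illustrative random test data (assumed sizes m = 4, n = 2).
rng = np.random.default_rng(0)
m, n = 4, 2
X = rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))
A = rng.standard_normal((m, m)) + 1j * rng.standard_normal((m, m))

def det_form(X):
    return np.linalg.det(X.conj().T @ A @ X)

M = X.conj().T @ A @ X
M_inv = np.linalg.inv(M)
det_M = np.linalg.det(M)

# Closed forms from the Formula lines of this section.
grad_X = det_M * (M_inv @ X.conj().T @ A).T
grad_Xc = det_M * (A @ X @ M_inv)

fd_X, fd_Xc = wirtinger_fd(det_form, X)
print("max |error| for d/dX  :", np.max(np.abs(fd_X - grad_X)))
print("max |error| for d/dX^*:", np.max(np.abs(fd_Xc - grad_Xc)))
```

Both printed errors should be small relative to the size of the gradient entries; the finite-difference step limits the attainable accuracy, so exact agreement is not expected.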

16.12 Derivative of the Complex Rayleigh Quotient

Formula: $\displaystyle 2\frac{\partial}{\partial \boldsymbol{x}^*}\displaystyle\frac{(\boldsymbol{A}\boldsymbol{x})^H(\boldsymbol{A}\boldsymbol{x})}{(\boldsymbol{B}\boldsymbol{x})^H(\boldsymbol{B}\boldsymbol{x})} = 2\displaystyle\frac{\boldsymbol{A}^H\boldsymbol{A}\boldsymbol{x}}{\boldsymbol{x}^H\boldsymbol{B}^H\boldsymbol{B}\boldsymbol{x}} - 2\displaystyle\frac{\boldsymbol{x}^H\boldsymbol{A}^H\boldsymbol{A}\boldsymbol{x} \cdot \boldsymbol{B}^H\boldsymbol{B}\boldsymbol{x}}{(\boldsymbol{x}^H\boldsymbol{B}^H\boldsymbol{B}\boldsymbol{x})^2}$ (the complex gradient $\nabla = 2\,\partial/\partial \boldsymbol{x}^*$ of the quotient)
Conditions: $\boldsymbol{A}, \boldsymbol{B}$ are Hermitian matrices, $\boldsymbol{B}$ is positive definite, $\boldsymbol{x} \in \mathbb{C}^n$, $\boldsymbol{x} \neq \boldsymbol{0}$
Proof

Define the Rayleigh quotient $R(\boldsymbol{x})$:

\begin{equation}R(\boldsymbol{x}) = \frac{(\boldsymbol{A}\boldsymbol{x})^H(\boldsymbol{A}\boldsymbol{x})}{(\boldsymbol{B}\boldsymbol{x})^H(\boldsymbol{B}\boldsymbol{x})} = \frac{\boldsymbol{x}^H\boldsymbol{A}^H\boldsymbol{A}\boldsymbol{x}}{\boldsymbol{x}^H\boldsymbol{B}^H\boldsymbol{B}\boldsymbol{x}} \label{eq:16-12-1}\end{equation}

Define the numerator and denominator separately:

\begin{equation}f = \boldsymbol{x}^H\boldsymbol{A}^H\boldsymbol{A}\boldsymbol{x}, \quad g = \boldsymbol{x}^H\boldsymbol{B}^H\boldsymbol{B}\boldsymbol{x} \label{eq:16-12-2}\end{equation}

Applying the quotient rule (1.28):

\begin{equation}\frac{\partial R}{\partial \boldsymbol{x}} = \frac{\partial}{\partial \boldsymbol{x}}\left(\frac{f}{g}\right) = \frac{1}{g}\frac{\partial f}{\partial \boldsymbol{x}} - \frac{f}{g^2}\frac{\partial g}{\partial \boldsymbol{x}} \label{eq:16-12-3}\end{equation}

Compute the Wirtinger derivative of the Hermitian quadratic form $\boldsymbol{x}^H\boldsymbol{M}\boldsymbol{x}$, where $\boldsymbol{M}$ is Hermitian. Here $\boldsymbol{M}$ stands for either $\boldsymbol{A}^H\boldsymbol{A}$ or $\boldsymbol{B}^H\boldsymbol{B}$, both of which are Hermitian by construction:

\begin{equation}\frac{\partial (\boldsymbol{x}^H\boldsymbol{M}\boldsymbol{x})}{\partial \boldsymbol{x}} = (\boldsymbol{M}\boldsymbol{x})^* \label{eq:16-12-4}\end{equation}
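
The identity $\eqref{eq:16-12-4}$ can be checked componentwise; the index notation below is introduced only for this check. Writing $\boldsymbol{x}^H\boldsymbol{M}\boldsymbol{x} = \sum_{i,j} x_i^* M_{ij} x_j$ and treating $\boldsymbol{x}$ and $\boldsymbol{x}^*$ as independent,

\begin{equation*}\frac{\partial (\boldsymbol{x}^H\boldsymbol{M}\boldsymbol{x})}{\partial x_k} = \sum_{i} x_i^* M_{ik} = (\boldsymbol{M}^\top\boldsymbol{x}^*)_k = (\boldsymbol{M}^*\boldsymbol{x}^*)_k = \left((\boldsymbol{M}\boldsymbol{x})^*\right)_k\end{equation*}

where the third equality uses $\boldsymbol{M}^\top = \boldsymbol{M}^*$, which is equivalent to $\boldsymbol{M}^H = \boldsymbol{M}$.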

Applying $\eqref{eq:16-12-4}$ to $f$ and $g$:

\begin{equation}\frac{\partial f}{\partial \boldsymbol{x}} = (\boldsymbol{A}^H\boldsymbol{A}\boldsymbol{x})^* \label{eq:16-12-5}\end{equation}

\begin{equation}\frac{\partial g}{\partial \boldsymbol{x}} = (\boldsymbol{B}^H\boldsymbol{B}\boldsymbol{x})^* \label{eq:16-12-6}\end{equation}

Substituting $\eqref{eq:16-12-5}$ and $\eqref{eq:16-12-6}$ into $\eqref{eq:16-12-3}$, the Wirtinger derivative $\partial R/\partial \boldsymbol{x}$ is

\begin{equation}\frac{\partial R}{\partial \boldsymbol{x}} = \frac{1}{g}(\boldsymbol{A}^H\boldsymbol{A}\boldsymbol{x})^* - \frac{f}{g^2}(\boldsymbol{B}^H\boldsymbol{B}\boldsymbol{x})^* \label{eq:16-12-7}\end{equation}

The complex gradient is $\nabla R = 2\partial R/\partial \boldsymbol{x}^*$. Since $R$ is real-valued, $\partial R/\partial \boldsymbol{x}^* = (\partial R/\partial \boldsymbol{x})^*$. Conjugating $\eqref{eq:16-12-7}$ (note that $f$ and $g$ are real) and multiplying by 2 gives

\begin{equation}\nabla R = 2\frac{\boldsymbol{A}^H\boldsymbol{A}\boldsymbol{x}}{\boldsymbol{x}^H\boldsymbol{B}^H\boldsymbol{B}\boldsymbol{x}} - 2\frac{\boldsymbol{x}^H\boldsymbol{A}^H\boldsymbol{A}\boldsymbol{x} \cdot \boldsymbol{B}^H\boldsymbol{B}\boldsymbol{x}}{(\boldsymbol{x}^H\boldsymbol{B}^H\boldsymbol{B}\boldsymbol{x})^2} \label{eq:16-12-8}\end{equation}

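Notes: This formula is used in iterative solvers for the generalized eigenvalue problem (a complex extension of the power method). Setting $\nabla R = \boldsymbol{0}$ in $\eqref{eq:16-12-8}$ gives $\boldsymbol{A}^H\boldsymbol{A}\boldsymbol{x} = R(\boldsymbol{x})\,\boldsymbol{B}^H\boldsymbol{B}\boldsymbol{x}$, so the stationary points of $R(\boldsymbol{x})$ are the generalized eigenvectors of the pair $(\boldsymbol{A}^H\boldsymbol{A}, \boldsymbol{B}^H\boldsymbol{B})$, with eigenvalue $R(\boldsymbol{x})$.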
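
The gradient formula and the stationarity claim can be illustrated numerically. The sketch below assumes NumPy; the function names `rayleigh`, `grad_rayleigh`, and `fd_gradient`, the dimension, and the random test matrices are hypothetical choices made for this example. It compares the closed-form gradient with the finite-difference value of $\partial R/\partial \Re\boldsymbol{x} + i\,\partial R/\partial \Im\boldsymbol{x}$ (which equals $2\partial R/\partial\boldsymbol{x}^*$), and then evaluates the gradient at a generalized eigenvector.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
G = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
H = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = G + G.conj().T                # Hermitian
B = H.conj().T @ H + np.eye(n)    # Hermitian positive definite
P = A.conj().T @ A                # numerator matrix A^H A
Q = B.conj().T @ B                # denominator matrix B^H B

def rayleigh(x):
    # np.vdot conjugates its first argument, so vdot(x, P @ x) = x^H P x.
    return np.vdot(x, P @ x).real / np.vdot(x, Q @ x).real

def grad_rayleigh(x):
    f = np.vdot(x, P @ x).real
    g = np.vdot(x, Q @ x).real
    return 2 * (P @ x) / g - 2 * f * (Q @ x) / g**2

def fd_gradient(func, x, h=1e-6):
    """Finite-difference estimate of dR/dRe(x) + i dR/dIm(x) = 2 dR/dx^*."""
    grad = np.zeros(x.shape, dtype=complex)
    for k in range(x.size):
        e = np.zeros(x.shape, dtype=complex)
        e[k] = 1.0
        d_re = (func(x + h * e) - func(x - h * e)) / (2 * h)
        d_im = (func(x + 1j * h * e) - func(x - 1j * h * e)) / (2 * h)
        grad[k] = d_re + 1j * d_im
    return grad

x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
fd_err = np.max(np.abs(fd_gradient(rayleigh, x) - grad_rayleigh(x)))
print("max |finite difference - closed form|:", fd_err)

# Generalized eigenvectors of (P, Q), obtained here from Q^{-1} P.
eigvals, eigvecs = np.linalg.eig(np.linalg.solve(Q, P))
v = eigvecs[:, 0]
grad_at_v = grad_rayleigh(v)
print("gradient norm at an eigenvector:", np.linalg.norm(grad_at_v))
print("R(v) vs eigenvalue:", rayleigh(v), eigvals[0].real)
```

Forming $\boldsymbol{Q}^{-1}\boldsymbol{P}$ explicitly is a convenience for a small example; a dedicated generalized eigensolver would be the usual choice for larger or ill-conditioned problems.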

16.13 Derivative of the Complex Quadratic Form $(a - \boldsymbol{x}^H \boldsymbol{b})^2$

Formula: $\displaystyle\frac{\partial (a - \boldsymbol{x}^H \boldsymbol{b})^2}{\partial \bar{\boldsymbol{x}}} = -2\boldsymbol{b}(a - \boldsymbol{x}^H \boldsymbol{b})$, $\quad\displaystyle\frac{\partial (a - \boldsymbol{x}^H \boldsymbol{b})^2}{\partial \boldsymbol{x}} = \boldsymbol{0}$, $\quad\displaystyle\frac{\partial\, \overline{(a - \boldsymbol{x}^H \boldsymbol{b})^2}}{\partial \boldsymbol{x}} = -2\bar{\boldsymbol{b}}(a - \boldsymbol{x}^H \boldsymbol{b})^*$
Conditions: $\boldsymbol{x}, \boldsymbol{b} \in \mathbb{C}^n$, $a \in \mathbb{C}$
Proof

Define the auxiliary variable $z$:

\begin{equation}z = a - \boldsymbol{x}^H \boldsymbol{b} = a - \sum_{i=0}^{n-1} \bar{x}_i b_i \label{eq:16-13-1}\end{equation}

Define the scalar function $f$:

\begin{equation}f = z^2 = (a - \boldsymbol{x}^H \boldsymbol{b})^2 \label{eq:16-13-2}\end{equation}

In the Wirtinger framework, $\boldsymbol{x}$ and $\bar{\boldsymbol{x}}$ are treated as independent variables. From $\eqref{eq:16-13-1}$, $z$ depends on $\bar{x}_k$ but not directly on $x_k$:

\begin{equation}\frac{\partial z}{\partial \bar{x}_k} = -b_k \label{eq:16-13-3}\end{equation}

\begin{equation}\frac{\partial z}{\partial x_k} = 0 \label{eq:16-13-4}\end{equation}

Differentiating $f = z^2$ by the chain rule (1.26):

\begin{equation}\frac{\partial f}{\partial \bar{x}_k} = \frac{\partial (z^2)}{\partial z} \cdot \frac{\partial z}{\partial \bar{x}_k} = 2z \cdot (-b_k) = -2b_k z \label{eq:16-13-5}\end{equation}

Writing $\eqref{eq:16-13-5}$ in vector form:

\begin{equation}\frac{\partial f}{\partial \bar{\boldsymbol{x}}} = -2\boldsymbol{b}z \label{eq:16-13-6}\end{equation}

Because $z$ does not depend on $\boldsymbol{x}$ by $\eqref{eq:16-13-4}$, the same chain rule gives a vanishing derivative with respect to $\boldsymbol{x}$:

\begin{equation}\frac{\partial f}{\partial x_k} = 2z \cdot \frac{\partial z}{\partial x_k} = 0 \label{eq:16-13-7}\end{equation}

The conjugate-symmetry relation $\partial f/\partial \boldsymbol{x} = \overline{\partial f/\partial \bar{\boldsymbol{x}}}$ holds only for real-valued $f$, and $f = z^2$ is complex in general, so it cannot be applied to $f$ directly. The identity that holds for every $f$ is

\begin{equation}\frac{\partial \bar{f}}{\partial \boldsymbol{x}} = \overline{\frac{\partial f}{\partial \bar{\boldsymbol{x}}}} \label{eq:16-13-8}\end{equation}

Substituting $\eqref{eq:16-13-6}$ into $\eqref{eq:16-13-8}$ and using $\bar{z} = z^*$:

\begin{equation}\frac{\partial\, \overline{(a - \boldsymbol{x}^H \boldsymbol{b})^2}}{\partial \boldsymbol{x}} = \overline{-2\boldsymbol{b}z} = -2\bar{\boldsymbol{b}} (a - \boldsymbol{x}^H \boldsymbol{b})^* \label{eq:16-13-9}\end{equation}

Notes: When $\boldsymbol{b}$ is a real vector, $\bar{\boldsymbol{b}} = \boldsymbol{b}$, and $\eqref{eq:16-13-9}$ simplifies to $-2\boldsymbol{b}(a - \boldsymbol{x}^H \boldsymbol{b})^*$. For the real-valued squared modulus $|a - \boldsymbol{x}^H \boldsymbol{b}|^2 = z\bar{z}$, only the factor $z$ depends on $\bar{\boldsymbol{x}}$, so $\partial |z|^2/\partial \bar{\boldsymbol{x}} = -\boldsymbol{b}(a - \boldsymbol{x}^H \boldsymbol{b})^*$.
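
The intermediate results $\eqref{eq:16-13-3}$, $\eqref{eq:16-13-4}$, and $\eqref{eq:16-13-6}$ can be checked by finite differences. This is a minimal sketch, assuming NumPy; the helper `wirtinger_fd`, the test values of `a`, `b`, `x`, and the step size are illustrative assumptions. The derivatives with respect to $x_k$ and $\bar{x}_k$ are formed from the real and imaginary directional derivatives as in 16.1.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 3
x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
b = rng.standard_normal(n) + 1j * rng.standard_normal(n)
a = 0.7 - 0.2j

def z_of(x):
    # np.vdot conjugates its first argument, so vdot(x, b) = x^H b.
    return a - np.vdot(x, b)

def f_of(x):
    return z_of(x) ** 2

def wirtinger_fd(func, x, h=1e-6):
    """Central-difference estimates of (d func/dx, d func/d x-bar)."""
    d_x = np.zeros(x.shape, dtype=complex)
    d_xbar = np.zeros(x.shape, dtype=complex)
    for k in range(x.size):
        e = np.zeros(x.shape, dtype=complex)
        e[k] = 1.0
        d_re = (func(x + h * e) - func(x - h * e)) / (2 * h)
        d_im = (func(x + 1j * h * e) - func(x - 1j * h * e)) / (2 * h)
        d_x[k] = 0.5 * (d_re - 1j * d_im)
        d_xbar[k] = 0.5 * (d_re + 1j * d_im)
    return d_x, d_xbar

dz_dx, dz_dxbar = wirtinger_fd(z_of, x)
_, df_dxbar = wirtinger_fd(f_of, x)

print("max |dz/dx|              :", np.max(np.abs(dz_dx)))
print("max |dz/dxbar + b|       :", np.max(np.abs(dz_dxbar + b)))
print("max |df/dxbar + 2 b z|   :", np.max(np.abs(df_dxbar + 2 * b * z_of(x))))
```

All three printed values should be close to zero, up to finite-difference rounding error.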
