BCH Formula and Perturbation

Lie-2: BCH Formula and Perturbation (Differentiation on Lie Groups)

Notation
Formulas on this page follow the denominator layout. For the basics of Lie groups and Lie algebras, see Lie Algebra Intro, Chapter 6. The differential of the exponential map itself (dexp/dlog, SO(3)/SE(3)) is covered separately in Lie-1.

BCH Formula and Perturbation

Starting from the differentials of the Lie bracket and the adjoint representation, we derive the Baker-Campbell-Hausdorff (BCH) formula and its differential, as well as the Jacobi identity, the Killing form, and the Maurer-Cartan form. The proof numbers are kept from the original Chapter 11.

11.12 Differential of the Lie Bracket

Formula: $\displaystyle\dfrac{\partial}{\partial X_{ij}} [X, Y] = [J^{ij}, Y]$, $\displaystyle\dfrac{\partial}{\partial Y_{ij}} [X, Y] = [X, J^{ij}]$
Conditions: $X, Y \in \mathbb{R}^{n \times n}$, $[X, Y] = XY - YX$ (Lie bracket / commutator), $J^{ij}$ is the single-entry matrix with a $1$ only in position $(i,j)$.
Proof

By the definition of the Lie bracket (commutator),

\begin{equation}[X, Y] = XY - YX \label{eq:11-12-1}\end{equation}

Differentiate with respect to $X_{ij}$, assuming $Y$ does not depend on $X$.

\begin{equation}\dfrac{\partial [X, Y]}{\partial X_{ij}} = \dfrac{\partial (XY)}{\partial X_{ij}} - \dfrac{\partial (YX)}{\partial X_{ij}} \label{eq:11-12-2}\end{equation}

From 4.3, $\displaystyle\dfrac{\partial X}{\partial X_{ij}} = J^{ij}$.

\begin{equation}\dfrac{\partial (XY)}{\partial X_{ij}} = J^{ij} Y \label{eq:11-12-3}\end{equation}

\begin{equation}\dfrac{\partial (YX)}{\partial X_{ij}} = Y J^{ij} \label{eq:11-12-4}\end{equation}

Substituting $\eqref{eq:11-12-3}$ and $\eqref{eq:11-12-4}$ into $\eqref{eq:11-12-2}$,

\begin{equation}\dfrac{\partial [X, Y]}{\partial X_{ij}} = J^{ij} Y - Y J^{ij} = [J^{ij}, Y] \label{eq:11-12-5}\end{equation}

Similarly, differentiating with respect to $Y_{ij}$,

\begin{equation}\dfrac{\partial [X, Y]}{\partial Y_{ij}} = X J^{ij} - J^{ij} X = [X, J^{ij}] \label{eq:11-12-6}\end{equation}

Note: The Lie bracket is bilinear and antisymmetric, $[X, Y] = -[Y, X]$, and satisfies the Jacobi identity $[X, [Y, Z]] + [Y, [Z, X]] + [Z, [X, Y]] = 0$. This differential formula is important in optimization problems on a Lie algebra $\mathfrak{g}$ (e.g., geodesic computation).
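The two formulas of 11.12 can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the helper names `bracket` and `single_entry` are illustrative and not part of the original text. It compares a central difference in each entry $X_{ij}$ (and $Y_{ij}$) against $[J^{ij}, Y]$ (and $[X, J^{ij}]$).

```python
import numpy as np

def bracket(A, B):
    """Matrix commutator [A, B] = AB - BA."""
    return A @ B - B @ A

def single_entry(n, i, j):
    """Single-entry matrix J^{ij}: 1 in position (i, j), 0 elsewhere."""
    J = np.zeros((n, n))
    J[i, j] = 1.0
    return J

rng = np.random.default_rng(0)
n = 3
X = rng.standard_normal((n, n))
Y = rng.standard_normal((n, n))

h = 1e-6
max_err = 0.0
for i in range(n):
    for j in range(n):
        J = single_entry(n, i, j)
        # Central difference in the (i, j) entry of X, compared with [J, Y].
        fd_x = (bracket(X + h * J, Y) - bracket(X - h * J, Y)) / (2 * h)
        max_err = max(max_err, np.abs(fd_x - bracket(J, Y)).max())
        # Central difference in the (i, j) entry of Y, compared with [X, J].
        fd_y = (bracket(X, Y + h * J) - bracket(X, Y - h * J)) / (2 * h)
        max_err = max(max_err, np.abs(fd_y - bracket(X, J)).max())

print("largest deviation over all entries:", max_err)
```

Because the bracket is linear in each argument, the central difference is exact up to rounding, so the deviation reflects floating-point error only.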

11.13 Differential of the Adjoint Representation

Formula: $\displaystyle\dfrac{d}{dt} \text{Ad}_{e^{tX}}(Y) \Big|_{t=0} = \text{ad}_X(Y) = [X, Y]$
Conditions: $X, Y \in \mathfrak{g}$ (Lie algebra), $\text{Ad}_g(Y) = gYg^{-1}$ (adjoint representation), $\text{ad}_X(Y) = [X, Y]$ (adjoint action).
Proof

The adjoint representation of a Lie group $G$ is, for $g \in G$ and $Y \in \mathfrak{g}$,

\begin{equation}\text{Ad}_g(Y) = gYg^{-1} \label{eq:11-13-1}\end{equation}

Set $g = e^{tX}$ and compute the derivative at $t = 0$.

\begin{equation}\dfrac{d}{dt} \text{Ad}_{e^{tX}}(Y) = \dfrac{d}{dt} \left( e^{tX} Y e^{-tX} \right) \label{eq:11-13-2}\end{equation}

Applying the product rule, with $\displaystyle\dfrac{d}{dt} e^{tX} = X e^{tX}$ and $\displaystyle\dfrac{d}{dt} e^{-tX} = -X e^{-tX}$,

\begin{equation}\dfrac{d}{dt} \left( e^{tX} Y e^{-tX} \right) = X e^{tX} Y e^{-tX} + e^{tX} Y (-X) e^{-tX} \label{eq:11-13-3}\end{equation}

\begin{equation}= X e^{tX} Y e^{-tX} - e^{tX} Y X e^{-tX} \label{eq:11-13-4}\end{equation}

Evaluate at $t = 0$. Since $e^{0 \cdot X} = I$,

\begin{equation}\dfrac{d}{dt} \text{Ad}_{e^{tX}}(Y) \Big|_{t=0} = XY - YX = [X, Y] \label{eq:11-13-5}\end{equation}

This is the definition of the adjoint action $\text{ad}_X$ of the Lie algebra.

\begin{equation}\text{ad}_X(Y) = [X, Y] \label{eq:11-13-6}\end{equation}

Note: The adjoint representation $\text{Ad}: G \to \text{GL}(\mathfrak{g})$ is a homomorphism from the Lie group to the automorphism group of the Lie algebra. Its differential is the adjoint action $\text{ad}: \mathfrak{g} \to \text{End}(\mathfrak{g})$. This means $\text{ad}_X$ is a linear transformation on $\mathfrak{g}$ and admits a matrix representation.
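As an illustration of both statements (the derivative at $t = 0$ and the matrix representation of $\text{ad}_X$), here is a small sketch assuming NumPy. The helper `expm_taylor` is a plain truncated Taylor series introduced only for this example (a library routine such as `scipy.linalg.expm` would normally be used), and `ad_matrix` assumes the row-major vectorization of $Y$.

```python
import numpy as np

def expm_taylor(A, terms=30):
    """Matrix exponential by truncated Taylor series (adequate for small ||A||)."""
    result = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for k in range(1, terms):
        term = term @ A / k
        result = result + term
    return result

def Ad(g, Y):
    """Adjoint representation Ad_g(Y) = g Y g^{-1}."""
    return g @ Y @ np.linalg.inv(g)

def ad_matrix(X):
    """n^2 x n^2 matrix of ad_X acting on the row-major vec of Y."""
    n = X.shape[0]
    I = np.eye(n)
    # vec(XY) = (X kron I) vec(Y) and vec(YX) = (I kron X^T) vec(Y), row-major.
    return np.kron(X, I) - np.kron(I, X.T)

rng = np.random.default_rng(1)
X = 0.5 * rng.standard_normal((3, 3))
Y = rng.standard_normal((3, 3))
commutator = X @ Y - Y @ X

# Central difference of t -> Ad_{exp(tX)}(Y) at t = 0.
h = 1e-5
fd = (Ad(expm_taylor(h * X), Y) - Ad(expm_taylor(-h * X), Y)) / (2 * h)
err_derivative = np.abs(fd - commutator).max()

# ad_X as an ordinary matrix acting on vec(Y).
via_matrix = (ad_matrix(X) @ Y.reshape(-1)).reshape(3, 3)
err_matrix = np.abs(via_matrix - commutator).max()

print(err_derivative, err_matrix)
```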

11.14 Exponential of the Adjoint Action

Formula: $\displaystyle e^{\text{ad}_X}(Y) = \text{Ad}_{e^X}(Y) = e^X Y e^{-X}$
Conditions: $X, Y \in \mathfrak{g}$ (Lie algebra), $e^{\text{ad}_X} = \displaystyle\sum_{k=0}^{\infty} \displaystyle\dfrac{(\text{ad}_X)^k}{k!}$
Proof

The $k$-th power of $\text{ad}_X$ is expressed as a nested Lie bracket.

\begin{equation}(\text{ad}_X)^k(Y) = \underbrace{[X, [X, \cdots [X}_{k \text{ times}}, Y] \cdots ]] \label{eq:11-14-1}\end{equation}

Let $f(t) = e^{tX} Y e^{-tX}$ and expand it in a Taylor series.

\begin{equation}f(t) = \displaystyle\sum_{k=0}^{\infty} \dfrac{t^k}{k!} f^{(k)}(0) \label{eq:11-14-2}\end{equation}

We have $f(0) = Y$. Since $f'(t) = X f(t) - f(t) X = [X, f(t)]$,

\begin{equation}f'(0) = [X, Y] = \text{ad}_X(Y) \label{eq:11-14-3}\end{equation}

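Since $X$ is constant, differentiating $f'(t) = [X, f(t)]$ repeatedly gives the recursion below, and by induction $f^{(k)}(0) = (\text{ad}_X)^k(Y)$.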

\begin{equation}f^{(k)}(t) = [X, f^{(k-1)}(t)] \label{eq:11-14-4}\end{equation}

\begin{equation}f^{(k)}(0) = [X, f^{(k-1)}(0)] = \text{ad}_X(f^{(k-1)}(0)) = (\text{ad}_X)^k(Y) \label{eq:11-14-5}\end{equation}

Evaluating at $t = 1$,

\begin{equation}e^X Y e^{-X} = f(1) = \displaystyle\sum_{k=0}^{\infty} \dfrac{(\text{ad}_X)^k(Y)}{k!} = e^{\text{ad}_X}(Y) \label{eq:11-14-6}\end{equation}

Note: This formula can be written as $\text{Ad}_{e^X} = e^{\text{ad}_X}$, a fundamental relation connecting the exponential map of the Lie group with the adjoint action of the Lie algebra. Expanded, $e^X Y e^{-X} = Y + [X, Y] + \displaystyle\dfrac{1}{2!}[X, [X, Y]] + \displaystyle\dfrac{1}{3!}[X, [X, [X, Y]]] + \cdots$.
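The identity $e^{\text{ad}_X}(Y) = e^X Y e^{-X}$ is easy to test by summing the nested-commutator series directly. The sketch below assumes NumPy; `expm_taylor` and `exp_ad` are hypothetical helper names, and the series is simply truncated after a fixed number of terms.

```python
import numpy as np

def expm_taylor(A, terms=30):
    """Matrix exponential by truncated Taylor series (adequate for small ||A||)."""
    result = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for k in range(1, terms):
        term = term @ A / k
        result = result + term
    return result

def exp_ad(X, Y, terms=30):
    """Partial sum of sum_k (ad_X)^k(Y) / k!, built from nested commutators."""
    total = np.zeros_like(Y)
    term = Y.copy()                              # (ad_X)^0(Y) / 0!
    for k in range(terms):
        total = total + term
        term = (X @ term - term @ X) / (k + 1)   # (ad_X)^{k+1}(Y) / (k+1)!
    return total

rng = np.random.default_rng(2)
X = 0.5 * rng.standard_normal((3, 3))
Y = rng.standard_normal((3, 3))

lhs = exp_ad(X, Y)                               # e^{ad_X}(Y)
rhs = expm_taylor(X) @ Y @ expm_taylor(-X)       # e^X Y e^{-X}
err = np.abs(lhs - rhs).max()
print("max |e^{ad_X}(Y) - e^X Y e^{-X}| =", err)
```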

11.15 Baker-Campbell-Hausdorff Formula

Formula: $\displaystyle e^X e^Y = e^{Z}$, where $Z = X + Y + \displaystyle\dfrac{1}{2}[X, Y] + \displaystyle\dfrac{1}{12}[X, [X, Y]] - \displaystyle\dfrac{1}{12}[Y, [X, Y]] + \cdots$
Conditions: $X, Y \in \mathfrak{g}$ (Lie algebra), with $\|X\|, \|Y\|$ sufficiently small.
Proof

[Strategy] We derive the result two ways. First, (A) expand the series directly to watch the low-order terms reorganize from plain matrix products into commutators (most concrete). Then, (B) use a differential equation to systematize this to all orders.

[(A) Low-order terms by direct expansion]

Expand $e^X$ and $e^Y$ as power series and multiply.

\begin{equation}e^X e^Y = \Bigl(I + X + \tfrac{1}{2}X^2 + \cdots\Bigr)\Bigl(I + Y + \tfrac{1}{2}Y^2 + \cdots\Bigr) \label{eq:11-15-a1}\end{equation}

Collecting terms up to second order,

\begin{equation}e^X e^Y = I + (X + Y) + \Bigl(\tfrac{1}{2}X^2 + XY + \tfrac{1}{2}Y^2\Bigr) + O(3) \label{eq:11-15-a2}\end{equation}

To obtain $Z = \log(e^X e^Y)$, set $W = e^X e^Y - I$ and use $\log(I + W) = W - \tfrac{1}{2}W^2 + \tfrac{1}{3}W^3 - \cdots$. The lowest-order part of $W$ is $X + Y$, so the contribution of $W^2$ up to second order is

\begin{equation}W^2 = (X + Y)^2 + O(3) = X^2 + XY + YX + Y^2 + O(3) \label{eq:11-15-a3}\end{equation}

Substituting $\eqref{eq:11-15-a2}$ and $\eqref{eq:11-15-a3}$ into $\log(I+W) = W - \tfrac{1}{2}W^2 + \cdots$,

\begin{equation}Z = (X + Y) + \Bigl(\tfrac{1}{2}X^2 + XY + \tfrac{1}{2}Y^2\Bigr) - \tfrac{1}{2}\bigl(X^2 + XY + YX + Y^2\bigr) + O(3) \label{eq:11-15-a4}\end{equation}

The $X^2$ and $Y^2$ terms cancel exactly, leaving only $XY$ and $YX$.

\begin{equation}Z = X + Y + \Bigl(XY - \tfrac{1}{2}XY - \tfrac{1}{2}YX\Bigr) + O(3) = X + Y + \tfrac{1}{2}[X, Y] + O(3) \label{eq:11-15-a5}\end{equation}

The second-order term collapses into the commutator $[X, Y]$; this is the heart of the BCH formula. Continuing the same computation to third order gives $Z_3 = \tfrac{1}{12}\bigl(X^2Y + XY^2 - 2XYX + Y^2X + YX^2 - 2YXY\bigr)$ (as listed e.g. on Wikipedia), which likewise reorganizes into nested commutators.

\begin{equation}Z_3 = \tfrac{1}{12}\bigl([X, [X, Y]] + [Y, [Y, X]]\bigr) \label{eq:11-15-a6}\end{equation}

That every order is expressible using commutators alone holds in general (existence theorem). The reason the coefficient $\tfrac{1}{12}$ appears is explained naturally in (B) below, via the Bernoulli number $B_2 = \tfrac{1}{6}$.

[(B) Differential equation giving all orders]

Tracking low-order terms by hand as in (A) becomes rapidly tedious from fourth order on. We therefore introduce $t$, derive the differential equation satisfied by $Z(t) = \log(e^{tX} e^Y)$, and handle all orders at once. Differentiate both sides of $e^{Z(t)} = e^{tX} e^Y$ with respect to $t$.

\begin{equation}\dfrac{d}{dt} e^{Z(t)} = X e^{tX} e^Y = X e^{Z(t)} \label{eq:11-15-1}\end{equation}

By the differential formula for the matrix exponential (11.7, right trivialization $d(e^{Z})\,e^{-Z} = \dfrac{e^{\text{ad}_Z}-1}{\text{ad}_Z}(dZ)$),

\begin{equation}\dfrac{d}{dt} e^{Z(t)} = \dfrac{e^{\text{ad}_{Z(t)}} - 1}{\text{ad}_{Z(t)}} \left( \dfrac{dZ}{dt} \right) e^{Z(t)} \label{eq:11-15-2}\end{equation}

Comparing $\eqref{eq:11-15-1}$ and $\eqref{eq:11-15-2}$,

\begin{equation}\dfrac{e^{\text{ad}_{Z(t)}} - 1}{\text{ad}_{Z(t)}} \left( \dfrac{dZ}{dt} \right) = X \label{eq:11-15-3}\end{equation}

Apply the inverse operator $\displaystyle\dfrac{\text{ad}_{Z(t)}}{e^{\text{ad}_{Z(t)}} - 1}$ to both sides. This function expands as $\displaystyle\dfrac{z}{e^{z} - 1} = \displaystyle\sum_{k=0}^{\infty} \displaystyle\dfrac{B_k}{k!} z^k$ ($B_k$ the Bernoulli numbers, $B_0=1,\ B_1=-\tfrac{1}{2},\ B_2=\tfrac{1}{6},\ B_4=-\tfrac{1}{30},\dots$).

\begin{equation}\dfrac{dZ}{dt} = \dfrac{\text{ad}_{Z(t)}}{e^{\text{ad}_{Z(t)}} - 1}(X) \label{eq:11-15-4}\end{equation}

The initial condition is $Z(0) = Y$. Substituting the expansion $\displaystyle\dfrac{z}{e^{z}-1} = 1 - \dfrac{z}{2} + \dfrac{z^2}{12} + O(z^4)$ into the right-hand side of $\eqref{eq:11-15-4}$ and evaluating at $t = 0$ (where $Z = Y$), the initial rate of change is

\begin{equation}\left.\dfrac{dZ}{dt}\right|_{t=0} = \dfrac{\text{ad}_Y}{e^{\text{ad}_Y} - 1}(X) = X - \dfrac{1}{2}[Y, X] + \dfrac{1}{12}[Y, [Y, X]] + O(4) \label{eq:11-15-5}\end{equation}

\begin{equation}= X + \dfrac{1}{2}[X, Y] + \dfrac{1}{12}[Y, [Y, X]] + O(4) \label{eq:11-15-5b}\end{equation}

Here one sees the term $\tfrac{1}{2}[X,Y]$ agreeing with $\eqref{eq:11-15-a5}$ of (A), and the coefficient $\tfrac{1}{12}$ arising from the Bernoulli number $B_2 = \tfrac{1}{6}$ as $\tfrac{B_2}{2!}$.
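The differential equation $\eqref{eq:11-15-4}$ with $Z(0) = Y$ can also be integrated numerically and compared with $\log(e^X e^Y)$. The following is a rough sketch under stated assumptions: NumPy only, small $\|X\|, \|Y\|$, the Bernoulli series truncated after the $z^6$ term, and a classical fourth-order Runge-Kutta step. All function names are illustrative.

```python
import numpy as np

def expm_taylor(A, terms=30):
    """Matrix exponential by truncated Taylor series (adequate for small ||A||)."""
    result = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for k in range(1, terms):
        term = term @ A / k
        result = result + term
    return result

def logm_series(M, terms=80):
    """log(M) via log(I + W) = W - W^2/2 + W^3/3 - ..., valid for ||M - I|| < 1."""
    W = M - np.eye(M.shape[0])
    result = np.zeros_like(W)
    power = np.eye(M.shape[0])
    for k in range(1, terms):
        power = power @ W
        result = result + ((-1) ** (k + 1)) * power / k
    return result

def dexp_inv(Z, V):
    """ad_Z / (e^{ad_Z} - 1) applied to V, truncated: coefficients B_k / k! up to k = 6."""
    coeffs = {0: 1.0, 1: -1 / 2, 2: 1 / 12, 4: -1 / 720, 6: 1 / 30240}
    total = np.zeros_like(V)
    term = V.copy()
    for k in range(7):
        total = total + coeffs.get(k, 0.0) * term
        term = Z @ term - term @ Z               # next power of ad_Z applied to V
    return total

def bch_by_ode(X, Y, steps=50):
    """Integrate dZ/dt = dexp_inv(Z, X), Z(0) = Y, from t = 0 to t = 1 with RK4."""
    Z = Y.copy()
    h = 1.0 / steps
    for _ in range(steps):
        k1 = dexp_inv(Z, X)
        k2 = dexp_inv(Z + 0.5 * h * k1, X)
        k3 = dexp_inv(Z + 0.5 * h * k2, X)
        k4 = dexp_inv(Z + h * k3, X)
        Z = Z + (h / 6) * (k1 + 2 * k2 + 2 * k3 + k4)
    return Z

rng = np.random.default_rng(3)
X = 0.1 * rng.standard_normal((3, 3))
Y = 0.1 * rng.standard_normal((3, 3))

Z_ode = bch_by_ode(X, Y)
Z_ref = logm_series(expm_taylor(X) @ expm_taylor(Y))
ode_err = np.abs(Z_ode - Z_ref).max()
print("max |Z_ode - log(e^X e^Y)| =", ode_err)
```

The truncation of the Bernoulli series is the main modelling assumption here; for larger $\|Z\|$ more terms (or a different evaluation of the operator function) would be needed.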

[Result at $t=1$ (up to fifth order)]

Integrating from $t = 0$ to $t = 1$ determines $Z = Z(1)$. Carrying out the order-by-order expansion and integration (mechanical, but with many terms) gives the following known expansion through fifth order (Dynkin series). Orders 1–4 are $\eqref{eq:11-15-7}$–$\eqref{eq:11-15-8}$ and the fifth order is $\eqref{eq:11-15-9}$–$\eqref{eq:11-15-11}$; the latter consists of the same nested brackets as $C_5$ below, listed in a different order.

\begin{equation}Z = X + Y + \dfrac{1}{2}[X, Y] + \dfrac{1}{12}\bigl([X, [X, Y]] + [Y, [Y, X]]\bigr) \label{eq:11-15-7}\end{equation}

\begin{equation}- \dfrac{1}{24}[Y, [X, [X, Y]]] \label{eq:11-15-8}\end{equation}

\begin{equation}- \dfrac{1}{720}\bigl([Y, [Y, [Y, [Y, X]]]] + [X, [X, [X, [X, Y]]]]\bigr) \label{eq:11-15-9}\end{equation}

\begin{equation}+ \dfrac{1}{360}\bigl([X, [Y, [Y, [Y, X]]]] + [Y, [X, [X, [X, Y]]]]\bigr) \label{eq:11-15-10}\end{equation}

\begin{equation}+ \dfrac{1}{120}\bigl([Y, [X, [Y, [X, Y]]]] + [X, [Y, [X, [Y, X]]]]\bigr) + O(6) \label{eq:11-15-11}\end{equation}
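To see how the truncated series approaches $\log(e^X e^Y)$, the terms of orders 1 to 5 can be coded exactly as written in $\eqref{eq:11-15-7}$–$\eqref{eq:11-15-11}$ and summed one order at a time. This is a sketch assuming NumPy and small random matrices; `expm_taylor`, `logm_series`, `br`, and `bch_terms` are names chosen for this example.

```python
import numpy as np

def expm_taylor(A, terms=30):
    """Matrix exponential by truncated Taylor series (adequate for small ||A||)."""
    result = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for k in range(1, terms):
        term = term @ A / k
        result = result + term
    return result

def logm_series(M, terms=80):
    """log(M) via log(I + W) = W - W^2/2 + W^3/3 - ..., valid for ||M - I|| < 1."""
    W = M - np.eye(M.shape[0])
    result = np.zeros_like(W)
    power = np.eye(M.shape[0])
    for k in range(1, terms):
        power = power @ W
        result = result + ((-1) ** (k + 1)) * power / k
    return result

def br(A, B):
    """Commutator [A, B]."""
    return A @ B - B @ A

def bch_terms(X, Y):
    """Homogeneous BCH terms of orders 1..5 as listed in the expansion above."""
    XY, YX = br(X, Y), br(Y, X)
    c1 = X + Y
    c2 = XY / 2
    c3 = (br(X, XY) + br(Y, YX)) / 12
    c4 = -br(Y, br(X, XY)) / 24
    c5 = (-(br(Y, br(Y, br(Y, YX))) + br(X, br(X, br(X, XY)))) / 720
          + (br(X, br(Y, br(Y, YX))) + br(Y, br(X, br(X, XY)))) / 360
          + (br(Y, br(X, br(Y, XY))) + br(X, br(Y, br(X, YX)))) / 120)
    return [c1, c2, c3, c4, c5]

rng = np.random.default_rng(4)
X = 0.05 * rng.standard_normal((3, 3))
Y = 0.05 * rng.standard_normal((3, 3))

Z_ref = logm_series(expm_taylor(X) @ expm_taylor(Y))
partial = np.zeros_like(X)
errs = []
for order, term in enumerate(bch_terms(X, Y), start=1):
    partial = partial + term
    errs.append(np.abs(partial - Z_ref).max())
    print(f"through order {order}: max error {errs[-1]:.3e}")
```

For matrices of this size each additional order should reduce the error by roughly a factor of $\|X\| + \|Y\|$, which is what the printed errors are meant to show.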

[Dynkin form (symmetric form)]

The BCH formula has a closed expression due to Dynkin. Let $C_n$ be the term consisting of $n$-th order Lie brackets; then

\begin{equation}Z = \displaystyle\sum_{n=1}^{\infty} C_n \label{eq:11-15-12}\end{equation}

where

\begin{equation}C_1 = X + Y \label{eq:11-15-13}\end{equation}

\begin{equation}C_2 = \dfrac{1}{2}[X, Y] \label{eq:11-15-14}\end{equation}

\begin{equation}C_3 = \dfrac{1}{12}\bigl([X, [X, Y]] - [Y, [X, Y]]\bigr) \label{eq:11-15-15}\end{equation}

\begin{equation}C_4 = -\dfrac{1}{24}[X, [Y, [X, Y]]] \label{eq:11-15-16}\end{equation}

(Note: we used $[Y, [X, [X, Y]]] = [X, [Y, [X, Y]]]$, which follows from the Jacobi identity because $[[X, Y], [X, Y]] = 0$.)

[Fifth- and sixth-order terms]

\begin{equation}C_5 = -\dfrac{1}{720}\bigl([X,[X,[X,[X,Y]]]] + [Y,[Y,[Y,[Y,X]]]]\bigr) \label{eq:11-15-17}\end{equation}

\begin{equation}+ \dfrac{1}{360}\bigl([X,[Y,[Y,[Y,X]]]] + [Y,[X,[X,[X,Y]]]]\bigr) \label{eq:11-15-18}\end{equation}

\begin{equation}+ \dfrac{1}{120}\bigl([X,[Y,[X,[Y,X]]]] + [Y,[X,[Y,[X,Y]]]]\bigr) \label{eq:11-15-19}\end{equation}

\begin{equation}C_6 = \dfrac{1}{720}\bigl([X,[Y,[X,[X,[X,Y]]]]] - [Y,[X,[Y,[Y,[Y,X]]]]]\bigr) \label{eq:11-15-20}\end{equation}

\begin{equation}\quad + \dfrac{1}{240}[X,[Y,[Y,[X,[X,Y]]]]] \label{eq:11-15-20b}\end{equation}

\begin{equation}\quad + \dfrac{1}{1440}\bigl([X,[X,[Y,[X,[Y,X]]]]] - [Y,[Y,[X,[Y,[X,Y]]]]]\bigr) \label{eq:11-15-21}\end{equation}

\begin{equation}\quad - \dfrac{1}{720}[X,[X,[Y,[Y,[X,Y]]]]] \label{eq:11-15-21b}\end{equation}

Note: The BCH formula is an infinite series, and every term is expressed as nested Lie brackets. In the commutative case ($[X, Y] = 0$) it reduces to $e^X e^Y = e^{X+Y}$. On nilpotent Lie algebras (e.g., strictly upper-triangular matrices) it truncates to finitely many terms. The coefficients are computed from Bernoulli numbers and combinatorial factors. Goldberg (1956) showed that the number of $n$-th order terms is bounded above by $\displaystyle\dfrac{2^n - 2}{n}$.
On convergence: $Z$ is always defined as a formal power series (its existence in the free Lie algebra is guaranteed independently of the coefficient computation). When summed analytically for matrices, it converges absolutely under a condition such as $\|X\| + \|Y\| < \ln 2$ in operator norm.
Non-uniqueness of higher-order terms: The free Lie algebra carries relations coming from the Jacobi identity, so the same element can be written with different nested commutators. The $C_n$ on this page (especially fifth and sixth order) are one standard choice; a different basis (e.g., a Hall basis) yields a different-looking expression.
References: H.F. Baker (1905) "Alternants and continuous groups"; J.E. Campbell (1897) "On a law of combination of operators"; F. Hausdorff (1906) "Die symbolische Exponentialformel in der Gruppentheorie". The Dynkin form is due to E.B. Dynkin (1947). For explicit higher-order terms see M.W. Reinsch (2000), "A simple expression for the terms in the Baker-Campbell-Hausdorff series", J. Math. Phys. 41, 2434. For a modern treatment see B.C. Hall, "Lie Groups, Lie Algebras, and Representations", Ch.5. The fifth- and sixth-order terms on this page have been verified by computer algebra.
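A concrete case of the truncation on a nilpotent Lie algebra is the Heisenberg algebra of strictly upper-triangular $3 \times 3$ matrices. There $[X, Y]$ commutes with both $X$ and $Y$, so all brackets of order three and higher vanish and $Z = X + Y + \tfrac{1}{2}[X, Y]$ holds exactly, with no smallness assumption. The sketch below assumes NumPy; the matrices are arbitrary example values.

```python
import numpy as np

def exp_nilpotent3(N):
    """Exact exponential of a strictly upper-triangular 3x3 matrix (N^3 = 0)."""
    return np.eye(3) + N + N @ N / 2

# Arbitrary (not small) elements of the Heisenberg algebra.
X = np.array([[0.0, 2.0, 5.0],
              [0.0, 0.0, -3.0],
              [0.0, 0.0, 0.0]])
Y = np.array([[0.0, -1.0, 4.0],
              [0.0, 0.0, 7.0],
              [0.0, 0.0, 0.0]])

comm = X @ Y - Y @ X                 # [X, Y], nonzero only in the corner entry
Z = X + Y + comm / 2                 # BCH series, which stops after second order

lhs = exp_nilpotent3(X) @ exp_nilpotent3(Y)
rhs = exp_nilpotent3(Z)
print(np.allclose(lhs, rhs))

# Higher brackets vanish because [X, Y] is central.
higher = np.abs(X @ comm - comm @ X).max() + np.abs(Y @ comm - comm @ Y).max()
print(higher)
```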

11.16 Differential of the BCH Formula

Formula: $\displaystyle\dfrac{\partial Z}{\partial X} = \dfrac{\text{ad}_Z}{e^{\text{ad}_Z} - 1} \cdot \dfrac{e^{\text{ad}_X} - 1}{\text{ad}_X}$, $\displaystyle\dfrac{\partial Z}{\partial Y} = \dfrac{\text{ad}_Z}{e^{\text{ad}_Z} - 1}\, e^{\text{ad}_X}\, \dfrac{e^{\text{ad}_Y} - 1}{\text{ad}_Y}$
Conditions: $Z = \log(e^X e^Y)$ (BCH formula), where $\text{ad}_Z$ is the adjoint action by $Z$.
Proof

Differentiate $e^Z = e^X e^Y$ with respect to $X$ (using the right trivialization $d(e^A)=\dfrac{e^{\text{ad}_A}-1}{\text{ad}_A}(dA)\,e^A$). The right-hand side is

\begin{equation}\dfrac{\partial}{\partial X}(e^X e^Y) = \dfrac{e^{\text{ad}_X} - 1}{\text{ad}_X}(dX) \cdot e^X e^Y \label{eq:11-16-1}\end{equation}

The left-hand side is

\begin{equation}\dfrac{\partial}{\partial X} e^Z = \dfrac{e^{\text{ad}_Z} - 1}{\text{ad}_Z}(dZ) \cdot e^Z \label{eq:11-16-2}\end{equation}

Since $e^X e^Y = e^Z$, equate $\eqref{eq:11-16-1}$ and $\eqref{eq:11-16-2}$.

\begin{equation}\dfrac{e^{\text{ad}_Z} - 1}{\text{ad}_Z}(dZ) = \dfrac{e^{\text{ad}_X} - 1}{\text{ad}_X}(dX) \label{eq:11-16-3}\end{equation}

Apply $\displaystyle\dfrac{\text{ad}_Z}{e^{\text{ad}_Z} - 1}$ to both sides. Writing $\dfrac{e^{\text{ad}_A}-1}{\text{ad}_A}=\text{dexp}_A$ and $\dfrac{\text{ad}_A}{e^{\text{ad}_A}-1}=\text{dexp}_A^{-1}$,

\begin{equation}\dfrac{\partial Z}{\partial X} = \dfrac{\text{ad}_Z}{e^{\text{ad}_Z} - 1} \cdot \dfrac{e^{\text{ad}_X} - 1}{\text{ad}_X} = \text{dexp}_Z^{-1} \circ \text{dexp}_X \label{eq:11-16-5}\end{equation}

Next, differentiate with respect to $Y$. Since $\displaystyle\dfrac{\partial}{\partial Y}(e^X e^Y) = e^X \dfrac{e^{\text{ad}_Y} - 1}{\text{ad}_Y}(dY)\, e^Y$, equate with $\eqref{eq:11-16-2}$ and cancel $e^Y$ on the right.

\begin{equation}\dfrac{e^{\text{ad}_Z} - 1}{\text{ad}_Z}(dZ)\, e^X = e^X\, \dfrac{e^{\text{ad}_Y} - 1}{\text{ad}_Y}(dY) \label{eq:11-16-6}\end{equation}

Using $e^X (\,\cdot\,) e^{-X} = \text{Ad}_{e^X} = e^{\text{ad}_X}$ and applying $\dfrac{\text{ad}_Z}{e^{\text{ad}_Z}-1}$ from the left,

\begin{equation}\dfrac{\partial Z}{\partial Y} = \dfrac{\text{ad}_Z}{e^{\text{ad}_Z} - 1}\, e^{\text{ad}_X}\, \dfrac{e^{\text{ad}_Y} - 1}{\text{ad}_Y} = \text{dexp}_Z^{-1} \circ \text{Ad}_{e^X} \circ \text{dexp}_Y \label{eq:11-16-7}\end{equation}

Note: The right-trivialization convention ($d(e^A)e^{-A}$) is used throughout. Here $\dfrac{\partial Z}{\partial X}$ is not a giant Jacobian matrix but a linear operator (Fréchet derivative) acting on an increment $\Delta X \in \mathfrak{g}$; it is read precisely as $D_X Z[\Delta X] = \text{dexp}_Z^{-1}\!\bigl(\text{dexp}_X(\Delta X)\bigr)$ ($\circ$ denotes operator composition). This differential formula is important in optimization on Lie groups (geodesic gradient methods) and in robotics (attitude control). Using the expansion $\displaystyle\dfrac{z}{e^{z} - 1} = 1 - \displaystyle\dfrac{z}{2} + \displaystyle\dfrac{z^2}{12} - \displaystyle\dfrac{z^4}{720} + \cdots$ (Bernoulli numbers) makes numerical computation feasible.
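Both derivative formulas can be compared against finite differences of $Z = \log(e^X e^Y)$. In the sketch below (NumPy only), every operator is represented as an $n^2 \times n^2$ matrix acting on the row-major vectorization, $\text{dexp}$ is summed as $\sum_k \text{ad}^k/(k+1)!$, and $\text{dexp}_Z^{-1}$ is applied by solving a linear system instead of using the Bernoulli series. These representation choices and all helper names are assumptions made for this example.

```python
import numpy as np

def expm_taylor(A, terms=30):
    """Matrix exponential by truncated Taylor series (adequate for small ||A||)."""
    result = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for k in range(1, terms):
        term = term @ A / k
        result = result + term
    return result

def logm_series(M, terms=80):
    """log(M) via log(I + W) = W - W^2/2 + W^3/3 - ..., valid for ||M - I|| < 1."""
    W = M - np.eye(M.shape[0])
    result = np.zeros_like(W)
    power = np.eye(M.shape[0])
    for k in range(1, terms):
        power = power @ W
        result = result + ((-1) ** (k + 1)) * power / k
    return result

def ad_matrix(A):
    """Matrix of ad_A on row-major vec: vec([A, V]) = ad_matrix(A) @ vec(V)."""
    I = np.eye(A.shape[0])
    return np.kron(A, I) - np.kron(I, A.T)

def dexp_matrix(A, terms=30):
    """Matrix of (e^{ad_A} - 1) / ad_A = sum_k ad_A^k / (k+1)!."""
    ad = ad_matrix(A)
    result = np.zeros_like(ad)
    term = np.eye(ad.shape[0])               # ad^0 / 1!
    for k in range(terms):
        result = result + term
        term = term @ ad / (k + 2)           # ad^{k+1} / (k+2)!
    return result

def bch(X, Y):
    return logm_series(expm_taylor(X) @ expm_taylor(Y))

rng = np.random.default_rng(5)
n = 3
X = 0.1 * rng.standard_normal((n, n))
Y = 0.1 * rng.standard_normal((n, n))
Z = bch(X, Y)

# dZ/dX = dexp_Z^{-1} o dexp_X and dZ/dY = dexp_Z^{-1} o Ad_{e^X} o dexp_Y.
eX = expm_taylor(X)
Ad_eX = np.kron(eX, np.linalg.inv(eX).T)     # vec(g V g^{-1}) = (g kron g^{-T}) vec(V)
DX = np.linalg.solve(dexp_matrix(Z), dexp_matrix(X))
DY = np.linalg.solve(dexp_matrix(Z), Ad_eX @ dexp_matrix(Y))

# Compare with central differences along random increments.
dX = rng.standard_normal((n, n))
dY = rng.standard_normal((n, n))
h = 1e-5
fd_X = (bch(X + h * dX, Y) - bch(X - h * dX, Y)) / (2 * h)
fd_Y = (bch(X, Y + h * dY) - bch(X, Y - h * dY)) / (2 * h)
err_X = np.abs((DX @ dX.reshape(-1)).reshape(n, n) - fd_X).max()
err_Y = np.abs((DY @ dY.reshape(-1)).reshape(n, n) - fd_Y).max()
print(err_X, err_Y)
```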

11.20 Jacobi Identity and Differentiation

Formula: $\text{ad}_{[X,Y]} = [\text{ad}_X, \text{ad}_Y]$, i.e., $\text{ad}_{[X,Y]}(Z) = \text{ad}_X(\text{ad}_Y(Z)) - \text{ad}_Y(\text{ad}_X(Z))$
Conditions: $X, Y, Z \in \mathfrak{g}$ (Lie algebra).
Proof

We first prove the Jacobi identity, which for the commutator bracket reads

\begin{equation}[X, [Y, Z]] + [Y, [Z, X]] + [Z, [X, Y]] = 0 \label{eq:11-20-1}\end{equation}

Expand $\eqref{eq:11-20-1}$. Since $[A, B] = AB - BA$,

\begin{equation}[X, [Y, Z]] = X(YZ - ZY) - (YZ - ZY)X = XYZ - XZY - YZX + ZYX \label{eq:11-20-2}\end{equation}

Similarly,

\begin{equation}[Y, [Z, X]] = YZX - YXZ - ZXY + XZY \label{eq:11-20-3}\end{equation}

\begin{equation}[Z, [X, Y]] = ZXY - ZYX - XYZ + YXZ \label{eq:11-20-4}\end{equation}

Adding $\eqref{eq:11-20-2}$, $\eqref{eq:11-20-3}$, and $\eqref{eq:11-20-4}$, all terms cancel and the sum is $0$.

[Relation to the adjoint representation]

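Rewrite the Jacobi identity using $\text{ad}$. Since $\text{ad}_X(Y) = [X, Y]$, the left-hand side below equals $[X, [Y, Z]] - [Y, [X, Z]]$, which is $[[X, Y], Z]$ by the Jacobi identity and antisymmetry.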

\begin{equation}\text{ad}_X(\text{ad}_Y(Z)) - \text{ad}_Y(\text{ad}_X(Z)) = [[X, Y], Z] = \text{ad}_{[X,Y]}(Z) \label{eq:11-20-5}\end{equation}

Thus $\eqref{eq:11-20-5}$ can be written as $\text{ad}_{[X,Y]} = [\text{ad}_X, \text{ad}_Y]$. This shows that $\text{ad}: \mathfrak{g} \to \text{End}(\mathfrak{g})$ is a Lie algebra homomorphism.

Note: The Jacobi identity is the axiom that characterizes a Lie algebra in place of associativity. In a basis $e_i$ with structure constants $c_{ij}^k$ ($[e_i, e_j] = \displaystyle\sum_k c_{ij}^k e_k$), $\text{ad}$ has the matrix representation $(\text{ad}_{e_i})_j^k = c_{ij}^k$; that $\text{ad}$ is a Lie algebra homomorphism means these matrices satisfy the same bracket relations as the $e_i$.
References: Due to C.G.J. Jacobi (posthumous manuscripts, ca. 1862). The modern formulation is due to S. Lie (1888-1893), "Theorie der Transformationsgruppen".
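As a small worked case, take $\mathfrak{so}(3)$ with the basis $(e_i)_{jk} = -\varepsilon_{ijk}$, for which $c_{ij}^k = \varepsilon_{ijk}$. The sketch below (NumPy; the basis choice and helper names are assumptions of this example) recovers the structure constants from the brackets, builds the matrices $(\text{ad}_{e_i})_j^k = c_{ij}^k$, and checks $\text{ad}_{[X,Y]} = [\text{ad}_X, \text{ad}_Y]$.

```python
import numpy as np

# Levi-Civita symbol eps[i, j, k].
eps = np.zeros((3, 3, 3))
for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    eps[i, j, k], eps[j, i, k] = 1.0, -1.0

# Basis of so(3): (e_i)_{jk} = -eps_{ijk}.
basis = [-eps[i] for i in range(3)]

def coords(M):
    """Coordinates of a skew-symmetric 3x3 matrix in the basis above."""
    return np.array([M[2, 1], M[0, 2], M[1, 0]])

def bracket(A, B):
    return A @ B - B @ A

# Structure constants c[i, j, k] read off from [e_i, e_j] = sum_k c_ij^k e_k.
c = np.array([[coords(bracket(basis[i], basis[j])) for j in range(3)]
              for i in range(3)])

# (ad_{e_i}) has entry c_ij^k in row k, column j.
ad_basis = [c[i].T for i in range(3)]

def ad_of(x):
    """Matrix of ad_X for X = sum_i x_i e_i."""
    return sum(x[i] * ad_basis[i] for i in range(3))

rng = np.random.default_rng(6)
x, y = rng.standard_normal(3), rng.standard_normal(3)
X = sum(x[i] * basis[i] for i in range(3))
Y = sum(y[i] * basis[i] for i in range(3))

ad_X, ad_Y = ad_of(x), ad_of(y)
ad_Z = ad_of(coords(bracket(X, Y)))          # ad of the bracket [X, Y]
hom_err = np.abs(ad_Z - (ad_X @ ad_Y - ad_Y @ ad_X)).max()
print(hom_err)
```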

11.21 Differential of the Killing Form

Formula: $\displaystyle\dfrac{\partial \kappa(X, Y)}{\partial X_{ij}} = \text{tr}(\text{ad}_{J^{ij}} \text{ad}_Y)$, $\displaystyle\dfrac{\partial \kappa(X, Y)}{\partial Y_{ij}} = \text{tr}(\text{ad}_X \text{ad}_{J^{ij}})$
Conditions: $\kappa(X, Y) = \text{tr}(\text{ad}_X \text{ad}_Y)$ (Killing form), $X, Y \in \mathfrak{g}$.
Proof

The Killing form is defined by

\begin{equation}\kappa(X, Y) = \text{tr}(\text{ad}_X \text{ad}_Y) \label{eq:11-21-1}\end{equation}

Since $\text{ad}_X$ is a linear function of $X$, we have $\displaystyle\dfrac{\partial}{\partial X_{ij}} \text{ad}_X = \text{ad}_{J^{ij}}$.

Differentiate $\eqref{eq:11-21-1}$ with respect to $X_{ij}$. By the product rule and the linearity of the trace,

\begin{equation}\dfrac{\partial \kappa(X, Y)}{\partial X_{ij}} = \text{tr}\left( \dfrac{\partial \text{ad}_X}{\partial X_{ij}} \text{ad}_Y \right) = \text{tr}(\text{ad}_{J^{ij}} \text{ad}_Y) \label{eq:11-21-2}\end{equation}

Using the symmetry $\kappa(X, Y) = \kappa(Y, X)$, the derivative with respect to $Y$ follows similarly.

\begin{equation}\dfrac{\partial \kappa(X, Y)}{\partial Y_{ij}} = \text{tr}(\text{ad}_X \text{ad}_{J^{ij}}) \label{eq:11-21-3}\end{equation}

When it depends on both variables,

\begin{equation}d\kappa(X, Y) = \text{tr}(\text{ad}_{dX} \text{ad}_Y) + \text{tr}(\text{ad}_X \text{ad}_{dY}) \label{eq:11-21-4}\end{equation}

Note: The Killing form is $\text{Ad}$-invariant ($\kappa(\text{Ad}_g X, \text{Ad}_g Y) = \kappa(X, Y)$) and is non-degenerate on semisimple Lie algebras. For $\mathfrak{su}(n)$, $\kappa(X, Y) = 2n \cdot \text{tr}(XY)$, and for $\mathfrak{so}(n)$, $\kappa(X, Y) = (n-2) \cdot \text{tr}(XY)$. On a compact semisimple Lie group, $-\kappa$ is positive definite and defines a bi-invariant Riemannian metric.
References: W. Killing (1888), "Die Zusammensetzung der stetigen endlichen Transformationsgruppen". Cartan's criterion is due to É. Cartan (1894).
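The value $\kappa(X, Y) = (n-2)\,\text{tr}(XY)$ on $\mathfrak{so}(n)$ and the differential $\eqref{eq:11-21-4}$ can be checked by building $\text{ad}_X$ explicitly in a basis. The sketch below assumes NumPy and uses the basis $E_{ab} - E_{ba}$ ($a < b$); here $\text{ad}$ is restricted to $\mathfrak{so}(n)$, and the function names are illustrative.

```python
import numpy as np
from itertools import combinations

def so_basis(n):
    """Basis E_ab - E_ba (a < b) of so(n)."""
    basis = []
    for a, b in combinations(range(n), 2):
        E = np.zeros((n, n))
        E[a, b], E[b, a] = 1.0, -1.0
        basis.append(E)
    return basis

def coords(M, n):
    """Coordinates of a skew-symmetric M in so_basis(n): the entries M[a, b], a < b."""
    return np.array([M[a, b] for a, b in combinations(range(n), 2)])

def ad_matrix(X, basis, n):
    """Matrix of ad_X on so(n): column j holds the coordinates of [X, e_j]."""
    return np.column_stack([coords(X @ e - e @ X, n) for e in basis])

def killing(X, Y, basis, n):
    """Killing form kappa(X, Y) = tr(ad_X ad_Y)."""
    return np.trace(ad_matrix(X, basis, n) @ ad_matrix(Y, basis, n))

def random_skew(rng, n):
    A = rng.standard_normal((n, n))
    return A - A.T

rng = np.random.default_rng(7)
results = {}
for n in (3, 4, 5):
    basis = so_basis(n)
    X, Y = random_skew(rng, n), random_skew(rng, n)
    results[n] = (killing(X, Y, basis, n), (n - 2) * np.trace(X @ Y))
    print(n, results[n])

# Differential: d kappa = kappa(dX, Y) + kappa(X, dY), checked by central difference.
n = 4
basis = so_basis(n)
X, Y = random_skew(rng, n), random_skew(rng, n)
dX, dY = random_skew(rng, n), random_skew(rng, n)
h = 1e-4
fd = (killing(X + h * dX, Y + h * dY, basis, n)
      - killing(X - h * dX, Y - h * dY, basis, n)) / (2 * h)
exact = killing(dX, Y, basis, n) + killing(X, dY, basis, n)
diff_err = abs(fd - exact)
print(diff_err)
```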

11.22 Maurer-Cartan Form

Formula: $\omega = g^{-1} dg$, $d\omega + \omega \wedge \omega = 0$ (Maurer-Cartan equation)
Conditions: $g: M \to G$ (map into a Lie group), $\omega \in \Omega^1(M, \mathfrak{g})$ ($\mathfrak{g}$-valued 1-form).
Proof

Define the Maurer-Cartan form. For a map $g: M \to G$,

\begin{equation}\omega = g^{-1} dg \label{eq:11-22-1}\end{equation}

We check that $\omega$ takes values in the Lie algebra $\mathfrak{g}$: $dg$ takes values in $T_g G$, and left translation by $g^{-1}$ carries it to $T_e G \cong \mathfrak{g}$, so $g^{-1} dg$ is an element of $\mathfrak{g}$.

[Derivation of the Maurer-Cartan equation]

Use $d(g^{-1}) = -g^{-1} (dg) g^{-1}$ (from differentiating $g \cdot g^{-1} = I$).

\begin{equation}d(g \cdot g^{-1}) = dg \cdot g^{-1} + g \cdot d(g^{-1}) = 0 \label{eq:11-22-2}\end{equation}

\begin{equation}d(g^{-1}) = -g^{-1} dg \cdot g^{-1} \label{eq:11-22-3}\end{equation}

Compute the exterior derivative of $\omega = g^{-1} dg$. Since $d(dg) = 0$, only the term coming from $d(g^{-1})$ survives.

\begin{equation}d\omega = d(g^{-1}) \wedge dg = -g^{-1} dg \cdot g^{-1} \wedge dg \label{eq:11-22-4}\end{equation}

Compute $\omega \wedge \omega$. In the matrix-valued case, $\omega \wedge \omega$ combines matrix multiplication with the wedge product.

\begin{equation}\omega \wedge \omega = (g^{-1} dg) \wedge (g^{-1} dg) \label{eq:11-22-5}\end{equation}

Since $g^{-1}$ is a 0-form,

\begin{equation}\omega \wedge \omega = g^{-1} dg \cdot g^{-1} \wedge dg = g^{-1} dg \wedge g^{-1} dg \label{eq:11-22-6}\end{equation}

Comparing $\eqref{eq:11-22-4}$ and $\eqref{eq:11-22-6}$,

\begin{equation}d\omega = -\omega \wedge \omega \label{eq:11-22-7}\end{equation}

that is,

\begin{equation}d\omega + \omega \wedge \omega = 0 \label{eq:11-22-8}\end{equation}

Note: The Maurer-Cartan equation expresses that the curvature of the connection is zero (flat). The right-invariant form $\tilde{\omega} = dg \cdot g^{-1}$ satisfies the sign-flipped equation $d\tilde{\omega} - \tilde{\omega} \wedge \tilde{\omega} = 0$ (since $d(g^{-1}) = -g^{-1}\,dg\,g^{-1}$ gives $d\tilde\omega = +\tilde\omega\wedge\tilde\omega$). In components, $d\omega^k + \displaystyle\dfrac{1}{2} c_{ij}^k \omega^i \wedge \omega^j = 0$ (where $c_{ij}^k$ are the structure constants).
References: Due to L. Maurer (1888) and É. Cartan (1904), "Sur la structure des groupes infinis de transformations". For a modern treatment see S. Kobayashi & K. Nomizu, "Foundations of Differential Geometry", Vol.1, Ch.1.
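The Maurer-Cartan equation can be tested numerically on a two-parameter family $g(u, v) = e^{uA} e^{vB}$, chosen here purely as an example map $g: \mathbb{R}^2 \to \mathrm{GL}(3)$. Writing $\omega = \omega_u\,du + \omega_v\,dv$, the equation becomes $\partial_u \omega_v - \partial_v \omega_u + [\omega_u, \omega_v] = 0$. The sketch assumes NumPy, uses nested central differences (so the residual is small but not at machine precision), and all names are hypothetical.

```python
import numpy as np

def expm_taylor(A, terms=30):
    """Matrix exponential by truncated Taylor series (adequate for small ||A||)."""
    result = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for k in range(1, terms):
        term = term @ A / k
        result = result + term
    return result

rng = np.random.default_rng(8)
A = 0.5 * rng.standard_normal((3, 3))
B = 0.5 * rng.standard_normal((3, 3))

def g(u, v):
    """Example map from the (u, v) plane into GL(3)."""
    return expm_taylor(u * A) @ expm_taylor(v * B)

def omega(u, v, h=1e-4):
    """Components (omega_u, omega_v) of g^{-1} dg via central differences of g."""
    g_inv = np.linalg.inv(g(u, v))
    w_u = g_inv @ (g(u + h, v) - g(u - h, v)) / (2 * h)
    w_v = g_inv @ (g(u, v + h) - g(u, v - h)) / (2 * h)
    return w_u, w_v

u0, v0, h = 0.3, -0.2, 1e-4

# Coefficient of du ^ dv in d(omega): d_u(omega_v) - d_v(omega_u).
d_omega = ((omega(u0 + h, v0)[1] - omega(u0 - h, v0)[1]) / (2 * h)
           - (omega(u0, v0 + h)[0] - omega(u0, v0 - h)[0]) / (2 * h))

# Coefficient of du ^ dv in omega ^ omega: omega_u omega_v - omega_v omega_u.
w_u, w_v = omega(u0, v0)
omega_wedge_omega = w_u @ w_v - w_v @ w_u

residual = np.abs(d_omega + omega_wedge_omega).max()
print("size of omega ^ omega:", np.abs(omega_wedge_omega).max())
print("residual of d(omega) + omega ^ omega:", residual)
```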