Proof Collection, Chapter 1: Scalar Derivatives of a Single Variable

1. Scalar Derivatives of a Single Variable

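This chapter rigorously proves, from the definition through the basic formulas, the differentiation of scalar functions of a single variable, which underlies matrix calculus. Many matrix-calculus formulas are derived by applying single-variable results component by component, so a solid grasp of single-variable differentiation is a prerequisite for studying matrix derivatives.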

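Notation conventions for this series
From Chapter 2 onward, this proof series uses the denominator layout for matrix derivatives. In the denominator layout, the derivative of a scalar with respect to a vector is a column vector, and the derivative of a vector with respect to a scalar is a row vector. See the layout conventions for details.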

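Chapter roadmap

This chapter develops the theory in the order listed below. Results are arranged so that, as far as possible, each theorem or formula is proved before it is used. The exceptions are 1.19, 1.21, and 1.22, which rely on the chain rule (1.26) and the inverse function rule (1.27) proved later in Section 1.5. The proofs of 1.26 and 1.27 do not depend on 1.19–1.23, so no circularity arises.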

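1.1 Definition of the derivative and basic concepts (1.1–1.3): derivative at a point, derivative function, differentiability and continuity
1.2 Foundational theorems and identities (1.4–1.10): Pascal's identity, the binomial theorem, trigonometric identities and basic limits, the hyperbolic identity
1.3 Foundational theorems of linear algebra (1.11–1.15): properties of the trace and the determinant
1.4 Derivatives of basic functions (1.16–1.23): constant, power, exponential, and logarithmic functions
1.5 Rules of differentiation (1.24–1.29): linearity, product rule, chain rule, inverse function rule, quotient rule, logarithmic differentiation
1.6 Derivatives of trigonometric functions (1.30–1.33): sin, cos, tan, etc.
1.7 Derivatives of inverse trigonometric functions (1.34–1.36): arcsin, arccos, arctan
1.8 Derivatives of hyperbolic functions (1.37–1.39): sinh, cosh, tanh
1.9 Other important derivative formulas (1.40–1.43): absolute value, sigmoid, Softplus, Leibniz's formula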

1.1 微分の定義と基本概念

関数 $f(x)$ の点 $x = a$ における微分係数は、その点での瞬間変化率を表す。これは物理学における速度（位置の時間変化率）や、経済学における限界効用（効用の消費量変化率）など、様々な分野で本質的な概念である。

1.1 点での微分係数の定義

定義：$\displaystyle f'(a) = \lim_{h \to 0} \dfrac{f(a+h) - f(a)}{h}$

条件：極限が存在する

解説

この定義を幾何学的に解説する。

点 $(a, f(a))$ と点 $(a+h, f(a+h))$ を結ぶ直線（割線）の傾きを考える。

\begin{equation}\text{割線の傾き} = \dfrac{f(a+h) - f(a)}{(a+h) - a} = \dfrac{f(a+h) - f(a)}{h} \label{eq:1-1-1}\end{equation}

$h \to 0$ のとき、点 $(a+h, f(a+h))$ は曲線に沿って点 $(a, f(a))$ に近づく。このとき割線は接線に近づく。

\begin{equation}f'(a) = \lim_{h \to 0} \dfrac{f(a+h) - f(a)}{h} \label{eq:1-1-2}\end{equation}

用語：$\eqref{eq:1-1-2}$ の極限が存在するとき、$f$ は点 $a$ で微分可能であるといい、その極限値 $f'(a)$ を点 $a$ における微分係数と呼ぶ。

補足：微分係数は接線の傾きを表す。$f'(a) > 0$ なら $f$ は点 $a$ で増加、$f'(a) < 0$ なら減少している。

微分係数の表記法として、$f'(a)$ のほかに $\displaystyle \dfrac{df}{dx}\bigg|_{x=a}$ とも書く。
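
The limit in $\eqref{eq:1-1-2}$ can be observed numerically. The following is a minimal Python sketch, not part of the original text; the helper names `difference_quotient` and `square`, the test function $f(x) = x^2$, and the point $a = 3$ are illustrative assumptions.

```python
def difference_quotient(f, a, h):
    """Slope of the secant line through (a, f(a)) and (a + h, f(a + h))."""
    return (f(a + h) - f(a)) / h


def square(x):
    return x * x


# Secant slopes at a = 3 for shrinking h. For f(x) = x^2 the secant slope
# equals 6 + h mathematically, so the values approach f'(3) = 6.
slopes = [difference_quotient(square, 3.0, 10.0 ** -k) for k in range(1, 7)]
for k, slope in enumerate(slopes, start=1):
    print(f"h = 1e-{k}: secant slope = {slope:.8f}")
```

For very small $h$, floating-point cancellation in $f(a+h) - f(a)$ eventually dominates, so this sketch illustrates the definition rather than providing a robust way to compute derivatives.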

1.2 導関数の定義（微分係数を関数としてみる）

定義：$\displaystyle f'(x) = \lim_{h \to 0} \dfrac{f(x+h) - f(x)}{h}$

条件：各点で極限が存在する

解説

微分係数 $f'(a)$ は特定の点 $a$ での値であった。$a$ を変数 $x$ に置き換えると、$x$ の関数としての導関数が定義される。

\begin{equation}f'(x) = \lim_{h \to 0} \dfrac{f(x+h) - f(x)}{h} \label{eq:1-2-1}\end{equation}

導関数の表記法には複数の流儀がある。

Leibniz記法：$\displaystyle \dfrac{df}{dx}$、$\displaystyle \dfrac{d}{dx}f(x)$

Lagrange記法：$f'(x)$

Newton記法：$\dot{f}$（時間微分によく使用）

高階微分は以下のように定義される。

\begin{equation}f''(x) = \dfrac{d^2f}{dx^2} = \dfrac{d}{dx}\left(\dfrac{df}{dx}\right) \label{eq:1-2-2}\end{equation}

\begin{equation}f^{(n)}(x) = \dfrac{d^n f}{dx^n} = \dfrac{d}{dx}\left(\dfrac{d^{n-1}f}{dx^{n-1}}\right) \label{eq:1-2-3}\end{equation}

用語：$\eqref{eq:1-2-1}$ の極限が各点で存在するとき、$f'(x)$ を $f$ の導関数という。

補足：行列微分では、スカラ関数 $f$ を行列 $\boldsymbol{X}$ の各成分 $X_{ij}$ で偏微分する。これは1変数微分を各成分に適用することに相当する。

1.3 微分可能性と連続性

定理：$f$ が点 $a$ で微分可能 $\Rightarrow$ $f$ は点 $a$ で連続

条件：逆は一般には成り立たない（例：$f(x) = |x|$ は $x = 0$ で連続だが微分不可能）

証明

前提：本証明では、極限の基本性質（和・積・定数倍の極限法則）を既知として用いる。

$f$ が点 $a$ で微分可能であると仮定する。微分可能の定義より、極限

\begin{equation}f'(a) = \lim_{h \to 0} \dfrac{f(a+h) - f(a)}{h} \label{eq:1-3-1}\end{equation}

が存在する。

連続性を示すには、$\lim_{h \to 0} f(a+h) = f(a)$ を証明すればよい。

$h \neq 0$ のとき、$f(a+h) - f(a)$ を以下のように変形する。

\begin{equation}f(a+h) - f(a) = \dfrac{f(a+h) - f(a)}{h} \cdot h \label{eq:1-3-2}\end{equation}

$\eqref{eq:1-3-2}$ の両辺で $h \to 0$ の極限を取る。極限の積の法則 $\lim (AB) = (\lim A)(\lim B)$（両極限が存在する場合）より

\begin{equation}\lim_{h \to 0} [f(a+h) - f(a)] = \lim_{h \to 0} \dfrac{f(a+h) - f(a)}{h} \cdot \lim_{h \to 0} h \label{eq:1-3-3}\end{equation}

$\eqref{eq:1-3-1}$ より第1因子は $f'(a)$（有限値）に収束し、第2因子は 0 に収束する。

\begin{equation}\lim_{h \to 0} [f(a+h) - f(a)] = f'(a) \cdot 0 = 0 \label{eq:1-3-4}\end{equation}

$\eqref{eq:1-3-4}$ より

\begin{equation}\lim_{h \to 0} f(a+h) = f(a) \label{eq:1-3-5}\end{equation}

$\eqref{eq:1-3-5}$ は $f$ が点 $a$ で連続であることを意味する。

補足：逆の反例として $f(x) = |x|$ がある。$x = 0$ で連続だが、$\displaystyle \lim_{h \to 0^+} \dfrac{|h|}{h} = 1$ と $\displaystyle \lim_{h \to 0^-} \dfrac{|h|}{h} = -1$ が異なるため微分不可能である。
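
The counterexample $f(x) = |x|$ can be checked numerically. The sketch below is illustrative only (the name `one_sided_quotient` is an assumption): it evaluates the difference quotient at $0$ from the right and from the left and shows that the two one-sided values disagree.

```python
def one_sided_quotient(f, a, h):
    """Difference quotient (f(a + h) - f(a)) / h; the sign of h picks the side."""
    return (f(a + h) - f(a)) / h


# Approach 0 from the right (h > 0) and from the left (h < 0).
right = [one_sided_quotient(abs, 0.0, 10.0 ** -k) for k in range(1, 6)]
left = [one_sided_quotient(abs, 0.0, -(10.0 ** -k)) for k in range(1, 6)]

print("from the right:", right)
print("from the left: ", left)
```

Because the two one-sided limits differ, the two-sided limit in the definition does not exist at $0$, even though $|x|$ is continuous there.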

1.2 基礎定理と恒等式

本節では、微分公式の導出に必要となる基礎的な定理・恒等式を証明する。これらは後続の証明で参照される土台となる。

1.4 Pascalの恒等式

公式：$\displaystyle \binom{n}{k-1} + \binom{n}{k} = \binom{n+1}{k}$

条件：$n \geq 0$、$1 \leq k \leq n$

証明

二項係数の定義から直接計算する。

二項係数の定義より

\begin{equation}\binom{n}{k-1} = \dfrac{n!}{(k-1)!(n-k+1)!}, \quad \binom{n}{k} = \dfrac{n!}{k!(n-k)!} \label{eq:1-4-1}\end{equation}

左辺を計算する。通分のため、分母を $k!(n-k+1)!$ に揃える。

\begin{equation}\binom{n}{k-1} + \binom{n}{k} = \dfrac{n! \cdot k}{k!(n-k+1)!} + \dfrac{n! \cdot (n-k+1)}{k!(n-k+1)!} \label{eq:1-4-2}\end{equation}

分子を整理する。

\begin{equation}\binom{n}{k-1} + \binom{n}{k} = \dfrac{n! (k + n - k + 1)}{k!(n-k+1)!} = \dfrac{n! (n + 1)}{k!(n-k+1)!} \label{eq:1-4-3}\end{equation}

$n! (n + 1) = (n + 1)!$ であるから

\begin{equation}\binom{n}{k-1} + \binom{n}{k} = \dfrac{(n+1)!}{k!(n+1-k)!} = \binom{n+1}{k} \label{eq:1-4-4}\end{equation}

補足：二項定理 1.5 の帰納法による証明、およびLeibnizの公式 1.43 で使用される。Pascalの三角形の各成分が上の2つの成分の和になることを表す。

1.5 二項定理

公式：$\displaystyle (x + y)^n = \displaystyle\sum_{k=0}^{n} \binom{n}{k} x^{n-k} y^k$

条件：$n$ は非負整数、$\binom{n}{k} = \dfrac{n!}{k!(n-k)!}$ は二項係数

証明

数学的帰納法で証明する。

基底：$n = 0$ のとき、左辺は $(x + y)^0 = 1$、右辺は $\displaystyle\sum_{k=0}^{0} \binom{0}{0} x^0 y^0 = 1$ で一致する。

帰納段階：$n = m$ で成立すると仮定する。

\begin{equation}(x + y)^m = \displaystyle\sum_{k=0}^{m} \binom{m}{k} x^{m-k} y^k \label{eq:1-5-1}\end{equation}

$n = m + 1$ の場合を考える。

\begin{equation}(x + y)^{m+1} = (x + y)(x + y)^m = (x + y) \displaystyle\sum_{k=0}^{m} \binom{m}{k} x^{m-k} y^k \label{eq:1-5-2}\end{equation}

$\eqref{eq:1-5-2}$ を展開する。

\begin{equation}(x + y)^{m+1} = \displaystyle\sum_{k=0}^{m} \binom{m}{k} x^{m+1-k} y^k + \displaystyle\sum_{k=0}^{m} \binom{m}{k} x^{m-k} y^{k+1} \label{eq:1-5-3}\end{equation}

第2項で $j = k + 1$ と置換すると（$k = j - 1$）

\begin{equation}\displaystyle\sum_{k=0}^{m} \binom{m}{k} x^{m-k} y^{k+1} = \displaystyle\sum_{j=1}^{m+1} \binom{m}{j-1} x^{m+1-j} y^j \label{eq:1-5-4}\end{equation}

$\eqref{eq:1-5-3}$ と $\eqref{eq:1-5-4}$ を合わせると

\begin{equation}(x + y)^{m+1} = \binom{m}{0} x^{m+1} + \displaystyle\sum_{k=1}^{m} \left[ \binom{m}{k} + \binom{m}{k-1} \right] x^{m+1-k} y^k + \binom{m}{m} y^{m+1} \label{eq:1-5-5}\end{equation}

Pascalの恒等式（1.4）$\binom{m}{k} + \binom{m}{k-1} = \binom{m+1}{k}$ と、$\binom{m}{0} = \binom{m+1}{0} = 1$、$\binom{m}{m} = \binom{m+1}{m+1} = 1$ を用いると

\begin{equation}(x + y)^{m+1} = \displaystyle\sum_{k=0}^{m+1} \binom{m+1}{k} x^{m+1-k} y^k \label{eq:1-5-6}\end{equation}

数学的帰納法により、すべての非負整数 $n$ について二項定理が成立する。

補足：べき関数の微分 1.18 で $(x + h)^n$ を展開する際に使用される。
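
Pascal's identity (1.4) and the binomial theorem (1.5) are easy to sanity-check with exact arithmetic. The following Python sketch is an illustration under assumed names (`pascal_holds`, `binomial_sum`) and an arbitrarily chosen test range; it is a check on finitely many cases, not a proof.

```python
from fractions import Fraction
from math import comb


def pascal_holds(n, k):
    """Check C(n, k-1) + C(n, k) == C(n+1, k) for 1 <= k <= n."""
    return comb(n, k - 1) + comb(n, k) == comb(n + 1, k)


def binomial_sum(x, y, n):
    """Right-hand side of the binomial theorem: sum of C(n, k) x^(n-k) y^k."""
    return sum(comb(n, k) * x ** (n - k) * y ** k for k in range(n + 1))


# Exhaustive check of Pascal's identity on a small, illustrative range.
pascal_ok = all(pascal_holds(n, k) for n in range(1, 30) for k in range(1, n + 1))

# Rational inputs keep the comparison exact (no floating-point rounding).
x, y = Fraction(1, 2), Fraction(-1, 3)
binomial_ok = all(binomial_sum(x, y, n) == (x + y) ** n for n in range(0, 12))

print(pascal_ok, binomial_ok)
```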

1.6 ピタゴラスの恒等式

公式：$\sin^2 x + \cos^2 x = 1$

条件：$x \in \mathbb{R}$

証明

単位円の定義から証明する。

単位円は原点を中心とする半径 1 の円であり、方程式は

\begin{equation}x^2 + y^2 = 1 \label{eq:1-6-1}\end{equation}

角度 $\theta$ に対応する単位円上の点の座標は $(\cos\theta, \sin\theta)$ と定義される。

\begin{equation}(x, y) = (\cos\theta, \sin\theta) \label{eq:1-6-2}\end{equation}

$\eqref{eq:1-6-2}$ を $\eqref{eq:1-6-1}$ に代入すると

\begin{equation}\cos^2\theta + \sin^2\theta = 1 \label{eq:1-6-3}\end{equation}

変数名を変えて、すべての $x \in \mathbb{R}$ に対して

\begin{equation}\sin^2 x + \cos^2 x = 1 \label{eq:1-6-4}\end{equation}

補足：正接関数の微分 1.32 で $\cos^2 x = 1 - \sin^2 x$ の形で使用される。また、$1 + \tan^2 x = \sec^2 x$ の導出にも用いられる。

1.7 三角関数の加法定理

公式：$\sin(x + y) = \sin x \cos y + \cos x \sin y$、$\cos(x + y) = \cos x \cos y - \sin x \sin y$

条件：$x, y \in \mathbb{R}$

証明

単位円と回転行列を用いて証明する。

角度 $\theta$ に対応する単位円上の点は $(\cos\theta, \sin\theta)$ で表される。これは点 $(1, 0)$ を原点周りに角度 $\theta$ だけ回転させた点に等しい。

角度 $\theta$ の回転を表す行列は

\begin{equation}R(\theta) = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \label{eq:1-7-1}\end{equation}

回転の合成より、角度 $x$ の回転に続けて角度 $y$ の回転を行うと、合計で角度 $x + y$ の回転になる。

\begin{equation}R(x + y) = R(y) R(x) \label{eq:1-7-2}\end{equation}

$\eqref{eq:1-7-2}$ の右辺を計算する。

\begin{equation}R(y) R(x) = \begin{pmatrix} \cos y & -\sin y \\ \sin y & \cos y \end{pmatrix} \begin{pmatrix} \cos x & -\sin x \\ \sin x & \cos x \end{pmatrix} \label{eq:1-7-3}\end{equation}

行列積を展開する。

\begin{equation}R(y) R(x) = \begin{pmatrix} \cos y \cos x - \sin y \sin x & -\cos y \sin x - \sin y \cos x \\ \sin y \cos x + \cos y \sin x & -\sin y \sin x + \cos y \cos x \end{pmatrix} \label{eq:1-7-4}\end{equation}

$\eqref{eq:1-7-2}$ の左辺は

\begin{equation}R(x + y) = \begin{pmatrix} \cos(x + y) & -\sin(x + y) \\ \sin(x + y) & \cos(x + y) \end{pmatrix} \label{eq:1-7-5}\end{equation}

$\eqref{eq:1-7-4}$ と $\eqref{eq:1-7-5}$ の成分を比較すると

\begin{equation}\cos(x + y) = \cos x \cos y - \sin x \sin y \label{eq:1-7-6}\end{equation}

\begin{equation}\sin(x + y) = \sin x \cos y + \cos x \sin y \label{eq:1-7-7}\end{equation}

補足：正弦・余弦関数の微分 1.30、1.31 で $\sin(x + h)$、$\cos(x + h)$ を展開する際に使用される。
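
The matrix identity $R(x + y) = R(y) R(x)$ in $\eqref{eq:1-7-2}$ can be verified numerically for sample angles. This is a minimal sketch with assumed helper names (`rotation`, `matmul2`) and arbitrarily chosen angles.

```python
import math


def rotation(theta):
    """2 x 2 rotation matrix R(theta) as a nested list."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]


def matmul2(p, q):
    """Product of two 2 x 2 matrices."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


x, y = 0.7, 1.9                                   # sample angles in radians
composed = matmul2(rotation(y), rotation(x))      # rotate by x, then by y
direct = rotation(x + y)                          # single rotation by x + y

# Largest entrywise difference; it should be at the level of rounding error.
max_gap = max(abs(composed[i][j] - direct[i][j])
              for i in range(2) for j in range(2))
print(f"max entrywise gap: {max_gap:.2e}")
```

The $(2,1)$ entry of `composed` is $\sin y \cos x + \cos y \sin x$ and the $(1,1)$ entry is $\cos y \cos x - \sin y \sin x$, which are exactly the two addition formulas being compared.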

1.8 正弦関数の基本極限

公式：$\displaystyle \lim_{x \to 0} \dfrac{\sin x}{x} = 1$

条件：$x$ はラジアン単位

証明

単位円を用いた幾何学的証明を行う。$0 < x < \dfrac{\pi}{2}$ の場合を考える。

単位円において、中心角 $x$（ラジアン）に対応する弧、弦、接線を考える。原点を $O$、単位円上の点を $A = (1, 0)$、角度 $x$ に対応する点を $B = (\cos x, \sin x)$、点 $A$ における単位円の接線と直線 $OB$ の延長との交点を $C = (1, \tan x)$ とする。

これらの図形の面積を比較する。

\begin{equation}\text{三角形 } OAB \text{ の面積} < \text{扇形 } OAB \text{ の面積} < \text{三角形 } OAC \text{ の面積} \label{eq:1-8-1}\end{equation}

各面積を計算する。

\begin{equation}\text{三角形 } OAB = \dfrac{1}{2} \cdot 1 \cdot \sin x = \dfrac{\sin x}{2} \label{eq:1-8-2}\end{equation}

\begin{equation}\text{扇形 } OAB = \dfrac{1}{2} \cdot 1^2 \cdot x = \dfrac{x}{2} \label{eq:1-8-3}\end{equation}

\begin{equation}\text{三角形 } OAC = \dfrac{1}{2} \cdot 1 \cdot \tan x = \dfrac{\tan x}{2} \label{eq:1-8-4}\end{equation}

$\eqref{eq:1-8-1}$ に $\eqref{eq:1-8-2}$、$\eqref{eq:1-8-3}$、$\eqref{eq:1-8-4}$ を代入する。

\begin{equation}\dfrac{\sin x}{2} < \dfrac{x}{2} < \dfrac{\tan x}{2} = \dfrac{\sin x}{2 \cos x} \label{eq:1-8-5}\end{equation}

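Since $\sin x > 0$ for $0 < x < \dfrac{\pi}{2}$, divide every term by $\dfrac{\sin x}{2}$ to obtain $1 < \dfrac{x}{\sin x} < \dfrac{1}{\cos x}$. All three quantities are positive, so taking reciprocals reverses the inequalities.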

\begin{equation}1 > \dfrac{\sin x}{x} > \cos x \label{eq:1-8-6}\end{equation}

すなわち

\begin{equation}\cos x < \dfrac{\sin x}{x} < 1 \label{eq:1-8-7}\end{equation}

$x \to 0^+$ のとき $\cos x \to 1$ であるから、はさみうちの原理より

\begin{equation}\lim_{x \to 0^+} \dfrac{\sin x}{x} = 1 \label{eq:1-8-8}\end{equation}

$\dfrac{\sin x}{x}$ は偶関数（$\sin(-x) = -\sin x$ より $\dfrac{\sin(-x)}{-x} = \dfrac{\sin x}{x}$）であるから、$x \to 0^-$ のときも同じ極限値をもつ。

\begin{equation}\lim_{x \to 0} \dfrac{\sin x}{x} = 1 \label{eq:1-8-9}\end{equation}

補足：この極限は正弦関数の微分 1.30 の証明で本質的に使用される。
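
The squeeze $\cos x < \dfrac{\sin x}{x} < 1$ from $\eqref{eq:1-8-7}$ can be tabulated for small positive $x$. The sketch below is illustrative; the function name `squeeze_bounds` and the sample points are assumptions.

```python
import math


def squeeze_bounds(x):
    """Return (cos x, sin x / x, 1) for 0 < x < pi/2."""
    return math.cos(x), math.sin(x) / x, 1.0


# As x shrinks, the lower bound cos x rises toward 1 and traps sin x / x.
for x in (1.0, 0.5, 0.1, 0.01, 0.001):
    lower, middle, upper = squeeze_bounds(x)
    print(f"x = {x:<6} cos x = {lower:.10f}  sin x / x = {middle:.10f}  upper = {upper}")
```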

1.9 余弦関数の基本極限

公式：$\displaystyle \lim_{x \to 0} \dfrac{1 - \cos x}{x} = 0$

条件：$x$ はラジアン単位

証明

半角公式と 1.8 を用いて証明する。

半角公式より

\begin{equation}1 - \cos x = 2\sin^2\dfrac{x}{2} \label{eq:1-9-1}\end{equation}

$\eqref{eq:1-9-1}$ を用いると

\begin{equation}\dfrac{1 - \cos x}{x} = \dfrac{2\sin^2\dfrac{x}{2}}{x} = \dfrac{2\sin^2\dfrac{x}{2}}{2 \cdot \dfrac{x}{2}} = \sin\dfrac{x}{2} \cdot \dfrac{\sin\dfrac{x}{2}}{\dfrac{x}{2}} \label{eq:1-9-2}\end{equation}

$x \to 0$ のとき、$\dfrac{x}{2} \to 0$ であるから、1.8 より $\dfrac{\sin\dfrac{x}{2}}{\dfrac{x}{2}} \to 1$。また $\sin\dfrac{x}{2} \to 0$ であるから

\begin{equation}\lim_{x \to 0} \dfrac{1 - \cos x}{x} = 0 \cdot 1 = 0 \label{eq:1-9-3}\end{equation}

補足：正弦関数の微分の証明において $\dfrac{\cos h - 1}{h} \to 0$ の形で使用される。

1.10 双曲線恒等式

公式：$\cosh^2 x - \sinh^2 x = 1$

条件：$\displaystyle \sinh x = \dfrac{e^x - e^{-x}}{2}$、$\displaystyle \cosh x = \dfrac{e^x + e^{-x}}{2}$

証明

双曲線関数の定義から直接計算する。

$\cosh^2 x$ を計算する。

\begin{equation}\cosh^2 x = \left( \dfrac{e^x + e^{-x}}{2} \right)^2 = \dfrac{e^{2x} + 2 + e^{-2x}}{4} \label{eq:1-10-1}\end{equation}

$\sinh^2 x$ を計算する。

\begin{equation}\sinh^2 x = \left( \dfrac{e^x - e^{-x}}{2} \right)^2 = \dfrac{e^{2x} - 2 + e^{-2x}}{4} \label{eq:1-10-2}\end{equation}

$\eqref{eq:1-10-1}$ から $\eqref{eq:1-10-2}$ を引く。

\begin{equation}\cosh^2 x - \sinh^2 x = \dfrac{e^{2x} + 2 + e^{-2x}}{4} - \dfrac{e^{2x} - 2 + e^{-2x}}{4} = \dfrac{4}{4} = 1 \label{eq:1-10-3}\end{equation}

補足：双曲線正接の微分 1.39 で使用される。ピタゴラスの恒等式の双曲線版である。

1.3 線形代数の基礎定理

行列微分では、トレースや行列式の性質が頻繁に使用される。本節ではこれらの基本的な性質を証明する。

1.11 トレースの線形性

公式：$\text{tr}(\alpha \boldsymbol{A} + \beta \boldsymbol{B}) = \alpha \text{tr}(\boldsymbol{A}) + \beta \text{tr}(\boldsymbol{B})$

条件：$\boldsymbol{A}, \boldsymbol{B} \in \mathbb{R}^{n \times n}$、$\alpha, \beta \in \mathbb{R}$

証明

トレースの定義から直接証明する。

トレースの定義は対角成分の和である。

\begin{equation}\text{tr}(\boldsymbol{A}) = \displaystyle\sum_{i=0}^{n-1} A_{ii} \label{eq:1-11-1}\end{equation}

$\alpha \boldsymbol{A} + \beta \boldsymbol{B}$ の $(i, i)$ 成分は $\alpha A_{ii} + \beta B_{ii}$ である。

\begin{equation}\text{tr}(\alpha \boldsymbol{A} + \beta \boldsymbol{B}) = \displaystyle\sum_{i=0}^{n-1} (\alpha A_{ii} + \beta B_{ii}) \label{eq:1-11-2}\end{equation}

和の線形性より

\begin{equation}\text{tr}(\alpha \boldsymbol{A} + \beta \boldsymbol{B}) = \alpha \displaystyle\sum_{i=0}^{n-1} A_{ii} + \beta \displaystyle\sum_{i=0}^{n-1} B_{ii} = \alpha \text{tr}(\boldsymbol{A}) + \beta \text{tr}(\boldsymbol{B}) \label{eq:1-11-3}\end{equation}

補足：トレースを含む行列微分の計算で頻繁に使用される。

1.12 トレースの巡回性

公式：$\text{tr}(\boldsymbol{ABC}) = \text{tr}(\boldsymbol{BCA}) = \text{tr}(\boldsymbol{CAB})$

条件：行列積 $\boldsymbol{ABC}$ が正方行列となるサイズ

証明

まず2行列の場合 $\text{tr}(\boldsymbol{AB}) = \text{tr}(\boldsymbol{BA})$ を証明し、それを3行列に拡張する。

$\boldsymbol{A} \in \mathbb{R}^{m \times n}$、$\boldsymbol{B} \in \mathbb{R}^{n \times m}$ とする。行列積の定義より

\begin{equation}(\boldsymbol{AB})_{ij} = \displaystyle\sum_{k=0}^{n-1} A_{ik} B_{kj} \label{eq:1-12-1}\end{equation}

$\boldsymbol{AB} \in \mathbb{R}^{m \times m}$ のトレースを計算する。

\begin{equation}\text{tr}(\boldsymbol{AB}) = \displaystyle\sum_{i=0}^{m-1} (\boldsymbol{AB})_{ii} = \displaystyle\sum_{i=0}^{m-1} \displaystyle\sum_{k=0}^{n-1} A_{ik} B_{ki} \label{eq:1-12-2}\end{equation}

同様に $\boldsymbol{BA} \in \mathbb{R}^{n \times n}$ のトレースを計算する。

\begin{equation}\text{tr}(\boldsymbol{BA}) = \displaystyle\sum_{k=0}^{n-1} (\boldsymbol{BA})_{kk} = \displaystyle\sum_{k=0}^{n-1} \displaystyle\sum_{i=0}^{m-1} B_{ki} A_{ik} \label{eq:1-12-3}\end{equation}

$\eqref{eq:1-12-2}$ と $\eqref{eq:1-12-3}$ を比較する。和の順序を入れ替えて

\begin{equation}\text{tr}(\boldsymbol{AB}) = \displaystyle\sum_{i=0}^{m-1} \displaystyle\sum_{k=0}^{n-1} A_{ik} B_{ki} = \displaystyle\sum_{k=0}^{n-1} \displaystyle\sum_{i=0}^{m-1} A_{ik} B_{ki} = \displaystyle\sum_{k=0}^{n-1} \displaystyle\sum_{i=0}^{m-1} B_{ki} A_{ik} = \text{tr}(\boldsymbol{BA}) \label{eq:1-12-4}\end{equation}

3行列の場合に拡張する。$\boldsymbol{D} = \boldsymbol{AB}$ とおくと

\begin{equation}\text{tr}(\boldsymbol{ABC}) = \text{tr}(\boldsymbol{DC}) = \text{tr}(\boldsymbol{CD}) = \text{tr}(\boldsymbol{CAB}) \label{eq:1-12-5}\end{equation}

同様に $\boldsymbol{E} = \boldsymbol{BC}$ とおくと

\begin{equation}\text{tr}(\boldsymbol{ABC}) = \text{tr}(\boldsymbol{AE}) = \text{tr}(\boldsymbol{EA}) = \text{tr}(\boldsymbol{BCA}) \label{eq:1-12-6}\end{equation}

補足：トレースを含む行列微分の計算で最も頻繁に使用される性質。$\text{tr}(\boldsymbol{ABC}) \neq \text{tr}(\boldsymbol{ACB})$ であることに注意（巡回的な置換のみ成立）。
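
Both the two-matrix identity $\text{tr}(\boldsymbol{AB}) = \text{tr}(\boldsymbol{BA})$ for rectangular matrices and the failure of non-cyclic permutations can be seen on small integer examples. The following sketch uses plain nested lists; the helper names and the specific matrices are illustrative assumptions.

```python
def matmul(p, q):
    """Matrix product of nested lists p (m x n) and q (n x r)."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]


def trace(m):
    """Sum of the diagonal entries of a square matrix."""
    return sum(m[i][i] for i in range(len(m)))


# Rectangular case: AB is 2 x 2 and BA is 3 x 3, yet the traces agree.
A = [[1, 2, 3], [4, 5, 6]]
B = [[1, 0], [2, -1], [0, 3]]
tr_ab, tr_ba = trace(matmul(A, B)), trace(matmul(B, A))

# Three square matrices: cyclic shifts agree, a non-cyclic swap need not.
P = [[0, 1], [0, 0]]
Q = [[0, 0], [1, 0]]
R = [[1, 0], [0, 0]]
tr_pqr = trace(matmul(matmul(P, Q), R))
tr_qrp = trace(matmul(matmul(Q, R), P))
tr_rpq = trace(matmul(matmul(R, P), Q))
tr_prq = trace(matmul(matmul(P, R), Q))   # non-cyclic reordering

print(tr_ab, tr_ba, tr_pqr, tr_qrp, tr_rpq, tr_prq)
```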

1.13 トレースと転置

公式：$\text{tr}(\boldsymbol{A}^\top) = \text{tr}(\boldsymbol{A})$

条件：$\boldsymbol{A} \in \mathbb{R}^{n \times n}$

証明

転置行列の対角成分は元の行列の対角成分と同じである。

転置の定義より $(\boldsymbol{A}^\top)_{ij} = A_{ji}$ である。特に対角成分については

\begin{equation}(\boldsymbol{A}^\top)_{ii} = A_{ii} \label{eq:1-13-1}\end{equation}

トレースの定義より

\begin{equation}\text{tr}(\boldsymbol{A}^\top) = \displaystyle\sum_{i=0}^{n-1} (\boldsymbol{A}^\top)_{ii} = \displaystyle\sum_{i=0}^{n-1} A_{ii} = \text{tr}(\boldsymbol{A}) \label{eq:1-13-2}\end{equation}

1.14 行列式の積

公式：$\det(\boldsymbol{AB}) = \det(\boldsymbol{A}) \det(\boldsymbol{B})$

条件：$\boldsymbol{A}, \boldsymbol{B} \in \mathbb{R}^{n \times n}$

証明

ブロック行列の行列式を用いて証明する。

次のブロック行列を考える。

\begin{equation}\boldsymbol{M} = \begin{pmatrix} \boldsymbol{A} & \boldsymbol{O} \\ -\boldsymbol{I} & \boldsymbol{B} \end{pmatrix} \label{eq:1-14-1}\end{equation}

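Multiply the first block column of $\boldsymbol{M}$ on the right by $\boldsymbol{B}$ and add it to the second block column. This is a sequence of elementary column operations (adding multiples of columns to other columns), which does not change the determinant. In matrix form it is a right multiplication of $\boldsymbol{M}$; the identity matrix written on the left of the product below has no effect.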

\begin{equation}\begin{pmatrix} \boldsymbol{I} & \boldsymbol{O} \\ \boldsymbol{O} & \boldsymbol{I} \end{pmatrix} \begin{pmatrix} \boldsymbol{A} & \boldsymbol{O} \\ -\boldsymbol{I} & \boldsymbol{B} \end{pmatrix} \begin{pmatrix} \boldsymbol{I} & \boldsymbol{B} \\ \boldsymbol{O} & \boldsymbol{I} \end{pmatrix} = \begin{pmatrix} \boldsymbol{A} & \boldsymbol{AB} \\ -\boldsymbol{I} & \boldsymbol{O} \end{pmatrix} \label{eq:1-14-2}\end{equation}

さらに第2ブロック行に $\boldsymbol{A}$ を左から掛けて第1ブロック行に加える。

\begin{equation}\begin{pmatrix} \boldsymbol{I} & \boldsymbol{A} \\ \boldsymbol{O} & \boldsymbol{I} \end{pmatrix} \begin{pmatrix} \boldsymbol{A} & \boldsymbol{AB} \\ -\boldsymbol{I} & \boldsymbol{O} \end{pmatrix} = \begin{pmatrix} \boldsymbol{O} & \boldsymbol{AB} \\ -\boldsymbol{I} & \boldsymbol{O} \end{pmatrix} \label{eq:1-14-3}\end{equation}

三角ブロック行列の行列式は対角ブロックの行列式の積に等しい。

\begin{equation}\det(\boldsymbol{M}) = \det\begin{pmatrix} \boldsymbol{A} & \boldsymbol{O} \\ -\boldsymbol{I} & \boldsymbol{B} \end{pmatrix} = \det(\boldsymbol{A}) \det(\boldsymbol{B}) \label{eq:1-14-4}\end{equation}

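Next, compute the determinant of the matrix in $\eqref{eq:1-14-3}$. Exchanging its two block rows takes $n$ row swaps, each of which flips the sign of the determinant, and $\det(-\boldsymbol{I}) = (-1)^n$, so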

\begin{equation}\det\begin{pmatrix} \boldsymbol{O} & \boldsymbol{AB} \\ -\boldsymbol{I} & \boldsymbol{O} \end{pmatrix} = (-1)^n \det\begin{pmatrix} -\boldsymbol{I} & \boldsymbol{O} \\ \boldsymbol{O} & \boldsymbol{AB} \end{pmatrix} = (-1)^n \cdot (-1)^n \det(\boldsymbol{AB}) = \det(\boldsymbol{AB}) \label{eq:1-14-5}\end{equation}

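The elementary column and row operations used in $\eqref{eq:1-14-2}$ and $\eqref{eq:1-14-3}$ do not change the determinant, so the matrix in $\eqref{eq:1-14-3}$ has the same determinant as $\boldsymbol{M}$. From $\eqref{eq:1-14-4}$ and $\eqref{eq:1-14-5}$,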

\begin{equation}\det(\boldsymbol{A}) \det(\boldsymbol{B}) = \det(\boldsymbol{AB}) \label{eq:1-14-6}\end{equation}

補足：行列式の微分公式の導出で頻繁に使用される。

1.15 転置の行列式

公式：$\det(\boldsymbol{A}^\top) = \det(\boldsymbol{A})$

条件：$\boldsymbol{A} \in \mathbb{R}^{n \times n}$

証明

行列式のLeibniz公式（A.5）を用いて証明する。

行列式のLeibniz公式は

\begin{equation}\det(\boldsymbol{A}) = \displaystyle\sum_{\sigma \in S_n} \text{sgn}(\sigma) \displaystyle\prod_{i=0}^{n-1} A_{i, \sigma(i)} \label{eq:1-15-1}\end{equation}

ここで $S_n$ は $\{0, 1, \ldots, n-1\}$ の置換全体、$\text{sgn}(\sigma)$ は置換 $\sigma$ の符号である。

転置行列 $\boldsymbol{A}^\top$ の行列式を計算する。$(\boldsymbol{A}^\top)_{ij} = A_{ji}$ より

\begin{equation}\det(\boldsymbol{A}^\top) = \displaystyle\sum_{\sigma \in S_n} \text{sgn}(\sigma) \displaystyle\prod_{i=0}^{n-1} (\boldsymbol{A}^\top)_{i, \sigma(i)} = \displaystyle\sum_{\sigma \in S_n} \text{sgn}(\sigma) \displaystyle\prod_{i=0}^{n-1} A_{\sigma(i), i} \label{eq:1-15-2}\end{equation}

$j = \sigma(i)$ と置換を導入する。$\sigma$ が全単射であるから、$i$ が $0$ から $n-1$ を動くとき $j = \sigma(i)$ も $0$ から $n-1$ のすべての値を一度ずつとる。逆置換 $\sigma^{-1}$ を用いると $i = \sigma^{-1}(j)$ である。

\begin{equation}\displaystyle\prod_{i=0}^{n-1} A_{\sigma(i), i} = \displaystyle\prod_{j=0}^{n-1} A_{j, \sigma^{-1}(j)} \label{eq:1-15-3}\end{equation}

$\sigma$ が $S_n$ を動くとき $\sigma^{-1}$ も $S_n$ を動く。また $\text{sgn}(\sigma^{-1}) = \text{sgn}(\sigma)$ である。$\tau = \sigma^{-1}$ と置き換えると

\begin{equation}\det(\boldsymbol{A}^\top) = \displaystyle\sum_{\tau \in S_n} \text{sgn}(\tau) \displaystyle\prod_{j=0}^{n-1} A_{j, \tau(j)} = \det(\boldsymbol{A}) \label{eq:1-15-4}\end{equation}

補足：行列式について、行に関する性質と列に関する性質が対称であることを示す基本的な結果。
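
The Leibniz formula $\eqref{eq:1-15-1}$ translates directly into code, which makes it possible to check 1.14 and 1.15 on small integer matrices. The sketch below is illustrative only: the helper names and the two $3 \times 3$ matrices are assumptions, and the $n!$-term sum is practical only for very small $n$.

```python
from itertools import permutations


def sign(perm):
    """Sign of a permutation, computed from the parity of its inversion count."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1


def det(m):
    """Determinant via the Leibniz formula: sum over all permutations sigma."""
    n = len(m)
    total = 0
    for perm in permutations(range(n)):
        term = sign(perm)
        for i in range(n):
            term *= m[i][perm[i]]      # factor A_{i, sigma(i)}
        total += term
    return total


def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]


def transpose(m):
    return [list(row) for row in zip(*m)]


A = [[2, -1, 0], [1, 3, 4], [0, 5, -2]]
B = [[1, 2, 0], [0, 1, -3], [4, 0, 1]]

print(det(matmul(A, B)) == det(A) * det(B))   # product rule for determinants
print(det(transpose(A)) == det(A))            # invariance under transposition
```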

1.4 基本関数の微分

以下では、基本的な関数の導関数を定義から直接導出する。これらの結果は、合成関数の微分法則と組み合わせることで、より複雑な関数の微分を計算する際の基礎となる。

1.16 定数関数の微分

公式：$\displaystyle\dfrac{d}{dx} c = 0$

条件：$c$ は任意の定数

証明

$f(x) = c$（定数関数）とする。微分の定義に従って計算する。

\begin{equation}\dfrac{df}{dx} = \lim_{h \to 0} \dfrac{f(x+h) - f(x)}{h} \label{eq:1-16-1}\end{equation}

$f(x+h) = c$ および $f(x) = c$ を $\eqref{eq:1-16-1}$ に代入する。

\begin{equation}\dfrac{df}{dx} = \lim_{h \to 0} \dfrac{c - c}{h} = \lim_{h \to 0} \dfrac{0}{h} = \lim_{h \to 0} 0 = 0 \label{eq:1-16-2}\end{equation}

補足：幾何学的には、定数関数のグラフは水平線であり、傾きは 0 である。

1.17 恒等関数の微分

公式：$\displaystyle\dfrac{d}{dx} x = 1$

証明

$f(x) = x$ とする。微分の定義に従って計算する。

\begin{equation}\dfrac{df}{dx} = \lim_{h \to 0} \dfrac{f(x+h) - f(x)}{h} \label{eq:1-17-1}\end{equation}

$f(x+h) = x + h$ および $f(x) = x$ を $\eqref{eq:1-17-1}$ に代入する。

\begin{equation}\dfrac{df}{dx} = \lim_{h \to 0} \dfrac{(x+h) - x}{h} = \lim_{h \to 0} \dfrac{h}{h} = \lim_{h \to 0} 1 = 1 \label{eq:1-17-2}\end{equation}

補足：$y = x$ のグラフは傾き 1 の直線である。

1.18 べき関数の微分（正整数）

公式：$\displaystyle\dfrac{d}{dx} x^n = n x^{n-1}$

条件：$n$ は正整数

証明

$f(x) = x^n$（$n$ は正整数）とする。微分の定義に従って計算する。

\begin{equation}\dfrac{df}{dx} = \lim_{h \to 0} \dfrac{(x+h)^n - x^n}{h} \label{eq:1-18-1}\end{equation}

二項定理（1.5）を用いて $(x+h)^n$ を展開する。

\begin{equation}(x+h)^n = \displaystyle\sum_{k=0}^{n} \binom{n}{k} x^{n-k} h^k = x^n + nx^{n-1}h + \binom{n}{2}x^{n-2}h^2 + \cdots + h^n \label{eq:1-18-2}\end{equation}

$\eqref{eq:1-18-2}$ を $\eqref{eq:1-18-1}$ に代入する。

\begin{equation}\dfrac{df}{dx} = \lim_{h \to 0} \dfrac{x^n + nx^{n-1}h + \binom{n}{2}x^{n-2}h^2 + \cdots + h^n - x^n}{h} \label{eq:1-18-3}\end{equation}

$x^n$ が打ち消し合い、分子の各項を $h$ で割る。

\begin{equation}\dfrac{df}{dx} = \lim_{h \to 0} \left[ nx^{n-1} + \binom{n}{2}x^{n-2}h + \cdots + h^{n-1} \right] \label{eq:1-18-4}\end{equation}

$h \to 0$ の極限を取る。第2項以降は $h$ の正べきを含むのですべて 0 に収束する。

\begin{equation}\dfrac{d}{dx} x^n = nx^{n-1} \label{eq:1-18-5}\end{equation}

補足：行列微分において、$\text{tr}(\boldsymbol{X}^n)$ の微分などでこの公式が間接的に使用される。
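
As a concrete instance of the argument above (an illustrative case, not part of the original proof), take $n = 3$. The binomial expansion gives $(x+h)^3 = x^3 + 3x^2 h + 3x h^2 + h^3$, so

\begin{equation*}\dfrac{(x+h)^3 - x^3}{h} = 3x^2 + 3xh + h^2 \;\longrightarrow\; 3x^2 \quad (h \to 0)\end{equation*}

which agrees with $n x^{n-1}$ for $n = 3$. The two terms that vanish in the limit, $3xh$ and $h^2$, are exactly the terms carrying a positive power of $h$.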

1.19 べき関数の微分（一般の実数）

公式：$\displaystyle\dfrac{d}{dx} x^a = a x^{a-1}$

条件：$a$ は任意の実数、$x > 0$

証明

$x > 0$ のとき、$x^a = e^{a \ln x}$ と書ける。この表現を用いて微分する。

注：解析学では、自然対数 $\ln$ を $e^y$ の逆関数として定義し、$x^a = e^{a \ln x}$ を一般の実数べきの定義として採用する。これにより、べき関数の微分が指数・対数関数の微分に帰着される。

$f(x) = x^a = e^{a \ln x}$ とおく。合成関数の微分法則（1.26）を適用する。

\begin{equation}\dfrac{df}{dx} = \dfrac{d}{dx} e^{a \ln x} \label{eq:1-19-1}\end{equation}

$u = a \ln x$ とおくと、$f = e^u$ であり

\begin{equation}\dfrac{df}{dx} = \dfrac{de^u}{du} \cdot \dfrac{du}{dx} \label{eq:1-19-2}\end{equation}

$\displaystyle \dfrac{d}{du} e^u = e^u$（1.20）および $\displaystyle \dfrac{d}{dx}(a \ln x) = \dfrac{a}{x}$（1.21）より

\begin{equation}\dfrac{df}{dx} = e^{a \ln x} \cdot \dfrac{a}{x} = x^a \cdot \dfrac{a}{x} = a x^{a-1} \label{eq:1-19-3}\end{equation}

補足：$a = -1$ のとき $\displaystyle \dfrac{d}{dx} x^{-1} = -x^{-2} = -\dfrac{1}{x^2}$、$a = \dfrac{1}{2}$ のとき $\displaystyle \dfrac{d}{dx} \sqrt{x} = \dfrac{1}{2\sqrt{x}}$ となる。

1.20 指数関数の微分

公式：$\displaystyle\dfrac{d}{dx} e^x = e^x$

証明

$f(x) = e^x$ とする。微分の定義に従って計算する。

\begin{equation}\dfrac{df}{dx} = \lim_{h \to 0} \dfrac{e^{x+h} - e^x}{h} \label{eq:1-20-1}\end{equation}

指数法則 $e^{x+h} = e^x \cdot e^h$ を用いて $e^x$ を因数として括り出す。

\begin{equation}\dfrac{df}{dx} = \lim_{h \to 0} \dfrac{e^x \cdot e^h - e^x}{h} = \lim_{h \to 0} e^x \cdot \dfrac{e^h - 1}{h} = e^x \cdot \lim_{h \to 0} \dfrac{e^h - 1}{h} \label{eq:1-20-2}\end{equation}

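Now compute the limit $\lim_{h \to 0} \dfrac{e^h - 1}{h}$. Here the power series of $e^h$ is taken as known, for instance as the definition of $e^h$, which avoids circularity with the derivative being computed. The convergence of the series and the justification of term-by-term operations are proved separately in analysis.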

\begin{equation}e^h = 1 + h + \dfrac{h^2}{2!} + \dfrac{h^3}{3!} + \cdots \label{eq:1-20-3}\end{equation}

$\eqref{eq:1-20-3}$ より

\begin{equation}e^h - 1 = h + \dfrac{h^2}{2!} + \dfrac{h^3}{3!} + \cdots \label{eq:1-20-4}\end{equation}

$\eqref{eq:1-20-4}$ の両辺を $h$ で割る。

\begin{equation}\dfrac{e^h - 1}{h} = 1 + \dfrac{h}{2!} + \dfrac{h^2}{3!} + \cdots \label{eq:1-20-5}\end{equation}

$h \to 0$ の極限を取る。

\begin{equation}\lim_{h \to 0} \dfrac{e^h - 1}{h} = 1 \label{eq:1-20-6}\end{equation}

$\eqref{eq:1-20-6}$ を $\eqref{eq:1-20-2}$ に代入する。

\begin{equation}\dfrac{d}{dx} e^x = e^x \cdot 1 = e^x \label{eq:1-20-7}\end{equation}

補足：$e^x$ は微分しても変化しない唯一の（定数倍を除く）関数である。この性質は $e$ の定義の一つでもある。ニューラルネットワークの活性化関数の微分で重要。
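
The key limit $\eqref{eq:1-20-6}$ can be compared numerically with the truncated series in $\eqref{eq:1-20-5}$. The following is a small sketch under assumed names (`exp_quotient`, `truncated_series`); the number of retained terms is an arbitrary choice.

```python
import math


def exp_quotient(h):
    """The quotient (e^h - 1) / h for h != 0."""
    return (math.exp(h) - 1.0) / h


def truncated_series(h, terms=6):
    """First `terms` terms of 1 + h/2! + h^2/3! + ..."""
    return sum(h ** k / math.factorial(k + 1) for k in range(terms))


# Both columns approach 1 as h shrinks, from either side of 0.
for h in (0.1, 0.01, 0.001, -0.001, -0.01, -0.1):
    print(f"h = {h:>7}: quotient = {exp_quotient(h):.10f}, "
          f"series = {truncated_series(h):.10f}")
```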

1.21 自然対数の微分

公式：$\displaystyle\dfrac{d}{dx} \ln x = \dfrac{1}{x}$

条件：$x > 0$

証明

$f(x) = \ln x$ とする。逆関数の微分公式を用いる。

Let $y = \ln x$, so that $x = e^y$.

逆関数の微分公式（1.27）より

\begin{equation}\dfrac{dy}{dx} = \dfrac{1}{\dfrac{dx}{dy}} \label{eq:1-21-1}\end{equation}

$x = e^y$ を $y$ で微分すると、1.20 より $\displaystyle \dfrac{dx}{dy} = e^y$ である。

\begin{equation}\dfrac{dy}{dx} = \dfrac{1}{e^y} \label{eq:1-21-2}\end{equation}

$e^y = x$（$y = \ln x$ の定義より）を $\eqref{eq:1-21-2}$ に代入する。

\begin{equation}\dfrac{d}{dx} \ln x = \dfrac{1}{x} \label{eq:1-21-3}\end{equation}

補足：行列微分において $\log|\boldsymbol{A}|$ の微分で本質的に使用される。

1.22 一般の指数関数の微分

公式：$\displaystyle\dfrac{d}{dx} a^x = a^x \ln a$

条件：$a > 0$、$a \neq 1$

証明

$a^x = e^{x \ln a}$ と変形する。

\begin{equation}a^x = (e^{\ln a})^x = e^{x \ln a} \label{eq:1-22-1}\end{equation}

合成関数の微分法則（1.26）を適用する。$u = x \ln a$ とおくと

\begin{equation}\dfrac{d}{dx} a^x = \dfrac{d}{dx} e^u = \dfrac{de^u}{du} \cdot \dfrac{du}{dx} \label{eq:1-22-2}\end{equation}

$\displaystyle \dfrac{d}{du} e^u = e^u$（1.20）および $\displaystyle \dfrac{d}{dx}(x \ln a) = \ln a$（$\ln a$ は定数）より

\begin{equation}\dfrac{d}{dx} a^x = e^{x \ln a} \cdot \ln a = a^x \ln a \label{eq:1-22-3}\end{equation}

補足：$a = e$ のとき $\ln e = 1$ なので $\displaystyle \dfrac{d}{dx} e^x = e^x$ となり、1.20 と一致する。

1.23 一般の対数関数の微分

公式：$\displaystyle\dfrac{d}{dx} \log_a x = \dfrac{1}{x \ln a}$

条件：$a > 0$、$a \neq 1$、$x > 0$

証明

底の変換公式を用いて $\log_a x$ を自然対数で表す。

\begin{equation}\log_a x = \dfrac{\ln x}{\ln a} \label{eq:1-23-1}\end{equation}

$\ln a$ は定数なので

\begin{equation}\dfrac{d}{dx} \log_a x = \dfrac{1}{\ln a} \cdot \dfrac{d}{dx} \ln x \label{eq:1-23-2}\end{equation}

1.21 より $\displaystyle \dfrac{d}{dx} \ln x = \dfrac{1}{x}$ を代入する。

\begin{equation}\dfrac{d}{dx} \log_a x = \dfrac{1}{\ln a} \cdot \dfrac{1}{x} = \dfrac{1}{x \ln a} \label{eq:1-23-3}\end{equation}

補足：$a = e$ のとき $\ln e = 1$ なので $\displaystyle \dfrac{d}{dx} \ln x = \dfrac{1}{x}$ となり、1.21 と一致する。

1.5 微分の演算法則

微分の演算法則により、複雑な関数を基本関数の組み合わせとして微分できる。これらの法則は行列微分においても（適切に一般化された形で）成り立つ。

1.24 線形性（和と定数倍）

公式：$\displaystyle\dfrac{d}{dx}[af(x) + bg(x)] = a\dfrac{df}{dx} + b\dfrac{dg}{dx}$

条件：$a, b$ は定数、$f, g$ は微分可能な関数

証明

$h(x) = af(x) + bg(x)$ とおく。微分の定義に従って計算する。

\begin{equation}\dfrac{dh}{dx} = \lim_{\Delta x \to 0} \dfrac{h(x + \Delta x) - h(x)}{\Delta x} \label{eq:1-24-1}\end{equation}

$h(x + \Delta x) = af(x + \Delta x) + bg(x + \Delta x)$ を代入する。

\begin{equation}\dfrac{dh}{dx} = \lim_{\Delta x \to 0} \dfrac{af(x + \Delta x) + bg(x + \Delta x) - af(x) - bg(x)}{\Delta x} \label{eq:1-24-2}\end{equation}

項を整理する。

\begin{equation}\dfrac{dh}{dx} = \lim_{\Delta x \to 0} \left[ a \cdot \dfrac{f(x + \Delta x) - f(x)}{\Delta x} + b \cdot \dfrac{g(x + \Delta x) - g(x)}{\Delta x} \right] \label{eq:1-24-3}\end{equation}

極限の線形性より

\begin{equation}\dfrac{dh}{dx} = a \cdot \lim_{\Delta x \to 0} \dfrac{f(x + \Delta x) - f(x)}{\Delta x} + b \cdot \lim_{\Delta x \to 0} \dfrac{g(x + \Delta x) - g(x)}{\Delta x} \label{eq:1-24-4}\end{equation}

微分の定義より

\begin{equation}\dfrac{d}{dx}[af(x) + bg(x)] = a\dfrac{df}{dx} + b\dfrac{dg}{dx} \label{eq:1-24-5}\end{equation}

補足：微分演算子 $\displaystyle \dfrac{d}{dx}$ は線形演算子である。行列微分 $\displaystyle \dfrac{\partial}{\partial \boldsymbol{X}}$ も同様に線形である。

1.25 積の微分法則（Leibniz則）

公式：$\displaystyle\dfrac{d}{dx}[f(x)g(x)] = f'(x)g(x) + f(x)g'(x)$

条件：$f, g$ は微分可能な関数

証明

$h(x) = f(x)g(x)$ とおく。微分の定義に従って計算する。

\begin{equation}\dfrac{dh}{dx} = \lim_{\Delta x \to 0} \dfrac{f(x + \Delta x)g(x + \Delta x) - f(x)g(x)}{\Delta x} \label{eq:1-25-1}\end{equation}

分子に $f(x + \Delta x)g(x) - f(x + \Delta x)g(x)$（$= 0$）を加える。

\begin{equation}\dfrac{dh}{dx} = \lim_{\Delta x \to 0} \dfrac{f(x + \Delta x)g(x + \Delta x) - f(x + \Delta x)g(x) + f(x + \Delta x)g(x) - f(x)g(x)}{\Delta x} \label{eq:1-25-2}\end{equation}

項をグループ化する。

\begin{equation}\dfrac{dh}{dx} = \lim_{\Delta x \to 0} \left[ f(x + \Delta x) \cdot \dfrac{g(x + \Delta x) - g(x)}{\Delta x} + g(x) \cdot \dfrac{f(x + \Delta x) - f(x)}{\Delta x} \right] \label{eq:1-25-3}\end{equation}

極限を各項に適用する。$f$ は微分可能なので連続（1.3）であり、$\lim_{\Delta x \to 0} f(x + \Delta x) = f(x)$ である。

\begin{equation}\dfrac{dh}{dx} = f(x) \cdot \lim_{\Delta x \to 0} \dfrac{g(x + \Delta x) - g(x)}{\Delta x} + g(x) \cdot \lim_{\Delta x \to 0} \dfrac{f(x + \Delta x) - f(x)}{\Delta x} \label{eq:1-25-4}\end{equation}

微分の定義より

\begin{equation}\dfrac{d}{dx}[f(x)g(x)] = f(x)g'(x) + g(x)f'(x) = f'(x)g(x) + f(x)g'(x) \label{eq:1-25-5}\end{equation}

補足：行列積の微分 $\displaystyle \dfrac{\partial}{\partial X_{ij}}(\boldsymbol{AB})_{kl}$ でも同様の法則が成り立つ。n個の積への一般化は $\displaystyle \dfrac{d}{dx}[f_0 \cdots f_{n-1}] = \displaystyle\sum_{i=0}^{n-1} f_0 \cdots f_{i-1} f'_i f_{i+1} \cdots f_{n-1}$ となる。
出典：G.W. Leibniz (1684) "Nova methodus pro maximis et minimis", Acta Eruditorum. 「Leibniz則」の名称で知られる。
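
The product rule can be checked numerically on a sample pair of functions. The sketch below is illustrative: it swaps the one-sided quotient of the definition for a symmetric (central) difference, which is a numerical convenience and not the method of the proof, and the functions $f(x) = x^3$, $g(x) = \sin x$ and all helper names are assumptions.

```python
import math


def central_difference(f, x, h=1e-6):
    """Symmetric difference quotient, used as a numerical stand-in for f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def f(x):
    return x ** 3


def g(x):
    return math.sin(x)


def product(x):
    return f(x) * g(x)


def product_rule(x):
    # f'(x) g(x) + f(x) g'(x), with f'(x) = 3x^2 (1.18) and g'(x) = cos x (1.30)
    return 3.0 * x ** 2 * math.sin(x) + x ** 3 * math.cos(x)


x0 = 1.3
gap = abs(central_difference(product, x0) - product_rule(x0))
print(f"|numerical - formula| at x = {x0}: {gap:.2e}")
```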

1.26 合成関数の微分（連鎖律）

公式：$\displaystyle\dfrac{d}{dx}f(g(x)) = f'(g(x)) \cdot g'(x)$

条件：$g$ は $x$ で微分可能、$f$ は $g(x)$ で微分可能

証明

$h(x) = f(g(x))$ とおく。$u = g(x)$ と置換すると $h = f(u)$ である。

微分の定義に従って計算する。

\begin{equation}\dfrac{dh}{dx} = \lim_{\Delta x \to 0} \dfrac{f(g(x + \Delta x)) - f(g(x))}{\Delta x} \label{eq:1-26-1}\end{equation}

$\Delta u = g(x + \Delta x) - g(x)$ とおく。$g$ は微分可能なので連続であり（1.3）、$\Delta x \to 0$ のとき $\Delta u \to 0$ である。

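When $\Delta u \neq 0$, multiply and divide by $\Delta u$. If $\Delta u = 0$ occurs for values of $\Delta x$ arbitrarily close to $0$ (for example, when $g$ is constant near $x$), this step is not available, and one cannot simply restrict attention to points with $\Delta u \neq 0$. The standard remedy is to define $\varphi(k) = \dfrac{f(u + k) - f(u)}{k}$ for $k \neq 0$ and $\varphi(0) = f'(u)$, where $u = g(x)$. Since $f$ is differentiable at $u$, $\varphi$ is continuous at $k = 0$, and $f(u + \Delta u) - f(u) = \varphi(\Delta u)\,\Delta u$ holds for every $\Delta u$, including $\Delta u = 0$. The first factor in the next equation is to be read as $\varphi(\Delta u)$, which tends to $f'(g(x))$ as $\Delta x \to 0$ because $\Delta u \to 0$.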

\begin{equation}\dfrac{dh}{dx} = \lim_{\Delta x \to 0} \dfrac{f(g(x) + \Delta u) - f(g(x))}{\Delta u} \cdot \dfrac{\Delta u}{\Delta x} \label{eq:1-26-2}\end{equation}

$u = g(x)$ とおくと、第1因子は

\begin{equation}\lim_{\Delta u \to 0} \dfrac{f(u + \Delta u) - f(u)}{\Delta u} = f'(u) = f'(g(x)) \label{eq:1-26-3}\end{equation}

第2因子は

\begin{equation}\lim_{\Delta x \to 0} \dfrac{\Delta u}{\Delta x} = \lim_{\Delta x \to 0} \dfrac{g(x + \Delta x) - g(x)}{\Delta x} = g'(x) \label{eq:1-26-4}\end{equation}

$\eqref{eq:1-26-3}$ と $\eqref{eq:1-26-4}$ を $\eqref{eq:1-26-2}$ に代入する。

\begin{equation}\dfrac{d}{dx}f(g(x)) = f'(g(x)) \cdot g'(x) \label{eq:1-26-5}\end{equation}

Leibniz記法では

\begin{equation}\dfrac{dh}{dx} = \dfrac{df}{du} \cdot \dfrac{du}{dx} \label{eq:1-26-6}\end{equation}

補足：連鎖律は行列微分における最も重要な法則の一つ。行列関数の合成 $f(\boldsymbol{U}(\boldsymbol{X}))$ の微分は $\displaystyle \text{tr}\left[\left(\dfrac{\partial f}{\partial \boldsymbol{U}}\right)^\top \dfrac{\partial \boldsymbol{U}}{\partial X_{ij}}\right]$ のようなトレース形式で表される。
出典：G.W. Leibniz (1684) "Nova methodus pro maximis et minimis" で微分記法とともに導入。厳密な証明は A.L. Cauchy (1821) "Cours d'analyse" による。
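
A numerical check of the chain rule on one composite function is sketched below. The choice $f(u) = \sin u$, $g(x) = x^2$, the helper names, and the use of a central difference in place of the one-sided quotient are all illustrative assumptions.

```python
import math


def central_difference(f, x, h=1e-6):
    """Symmetric difference quotient, used as a numerical stand-in for f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def inner(x):
    return x * x                 # g(x) = x^2, so g'(x) = 2x


def outer(u):
    return math.sin(u)           # f(u) = sin u, so f'(u) = cos u (1.30)


def composite(x):
    return outer(inner(x))       # f(g(x)) = sin(x^2)


def chain_rule_value(x):
    # f'(g(x)) * g'(x)
    return math.cos(inner(x)) * 2.0 * x


for x0 in (0.0, 0.8, 2.1):
    numeric = central_difference(composite, x0)
    print(f"x = {x0}: numerical = {numeric:.8f}, chain rule = {chain_rule_value(x0):.8f}")
```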

1.27 逆関数の微分

公式：$\displaystyle\dfrac{dx}{dy} = \dfrac{1}{\dfrac{dy}{dx}}$

条件：$y = f(x)$ が狭義単調で微分可能、$f'(x) \neq 0$

証明

$y = f(x)$ とし、逆関数を $x = f^{-1}(y)$ とする。

定義より $f(f^{-1}(y)) = y$ である。両辺を $y$ で微分する。

\begin{equation}\dfrac{d}{dy} f(f^{-1}(y)) = \dfrac{d}{dy} y = 1 \label{eq:1-27-1}\end{equation}

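Apply the chain rule (1.26) to the left-hand side. This step additionally assumes that $f^{-1}$ is differentiable at $y$, which is a standard fact when $f$ is continuous and strictly monotone on an interval with $f'(x) \neq 0$, but is not proved here. Setting $u = f^{-1}(y)$,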

\begin{equation}\dfrac{df}{du} \cdot \dfrac{du}{dy} = 1 \label{eq:1-27-2}\end{equation}

$u = f^{-1}(y) = x$ なので、$\displaystyle \dfrac{df}{du} = \dfrac{dy}{dx} = f'(x)$ である。

\begin{equation}f'(x) \cdot \dfrac{dx}{dy} = 1 \label{eq:1-27-3}\end{equation}

$f'(x) \neq 0$ のとき、$\eqref{eq:1-27-3}$ を解いて

\begin{equation}\dfrac{dx}{dy} = \dfrac{1}{f'(x)} = \dfrac{1}{\dfrac{dy}{dx}} \label{eq:1-27-4}\end{equation}

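Note: the inverse-matrix derivative formula $\displaystyle \dfrac{d\boldsymbol{A}^{-1}}{dt} = -\boldsymbol{A}^{-1} \dfrac{d\boldsymbol{A}}{dt} \boldsymbol{A}^{-1}$ concerns the multiplicative inverse rather than the inverse function, so its scalar counterpart is the reciprocal rule $\dfrac{d}{dx}[g(x)]^{-1} = -\dfrac{g'(x)}{[g(x)]^2}$ used in 1.28. What it shares with the present proof is the technique: differentiate an identity (here $f(f^{-1}(y)) = y$, there $\boldsymbol{A}\boldsymbol{A}^{-1} = \boldsymbol{I}$) and solve for the unknown derivative.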
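
As a worked instance of $\eqref{eq:1-27-4}$ (an illustrative example, not part of the original text), take $y = f(x) = x^3$ on $x > 0$, which is strictly increasing with $f'(x) = 3x^2 \neq 0$. The inverse is $x = y^{1/3}$, and the formula gives

\begin{equation*}\dfrac{dx}{dy} = \dfrac{1}{3x^2} = \dfrac{1}{3 y^{2/3}} = \dfrac{1}{3}\, y^{-2/3}\end{equation*}

which agrees with the power rule 1.19 applied directly to $y^{1/3}$ with $a = \dfrac{1}{3}$.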

1.28 商の微分法則

公式：$\displaystyle\dfrac{d}{dx}\dfrac{f(x)}{g(x)} = \dfrac{f'(x)g(x) - f(x)g'(x)}{[g(x)]^2}$

条件：$f, g$ は微分可能、$g(x) \neq 0$

証明

$h(x) = \dfrac{f(x)}{g(x)} = f(x) \cdot [g(x)]^{-1}$ と書き、積の微分法則を適用する。

積の微分法則（1.25）より

\begin{equation}\dfrac{dh}{dx} = f'(x) \cdot [g(x)]^{-1} + f(x) \cdot \dfrac{d}{dx}[g(x)]^{-1} \label{eq:1-28-1}\end{equation}

$[g(x)]^{-1}$ の微分を求める。$u = g(x)$ とおくと、連鎖律（1.26）より

\begin{equation}\dfrac{d}{dx}[g(x)]^{-1} = \dfrac{d}{dx} u^{-1} = \dfrac{d(u^{-1})}{du} \cdot \dfrac{du}{dx} = (-u^{-2}) \cdot g'(x) = -\dfrac{g'(x)}{[g(x)]^2} \label{eq:1-28-2}\end{equation}

$\eqref{eq:1-28-2}$ を $\eqref{eq:1-28-1}$ に代入する。

\begin{equation}\dfrac{dh}{dx} = \dfrac{f'(x)}{g(x)} + f(x) \cdot \left(-\dfrac{g'(x)}{[g(x)]^2}\right) \label{eq:1-28-3}\end{equation}

通分して整理する。

\begin{equation}\dfrac{dh}{dx} = \dfrac{f'(x) \cdot g(x)}{[g(x)]^2} - \dfrac{f(x) \cdot g'(x)}{[g(x)]^2} = \dfrac{f'(x)g(x) - f(x)g'(x)}{[g(x)]^2} \label{eq:1-28-4}\end{equation}

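Note (mnemonic): derivative of the numerator times the denominator, minus the numerator times the derivative of the denominator, all divided by the square of the denominator.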
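
A short worked example of the quotient rule (chosen for illustration) is $f(x) = x$, $g(x) = 1 + x^2$, where $g(x) \neq 0$ for every real $x$. With $f'(x) = 1$ and $g'(x) = 2x$,

\begin{equation*}\dfrac{d}{dx}\dfrac{x}{1 + x^2} = \dfrac{1 \cdot (1 + x^2) - x \cdot 2x}{(1 + x^2)^2} = \dfrac{1 - x^2}{(1 + x^2)^2}\end{equation*}

The derivative vanishes at $x = \pm 1$, is positive for $|x| < 1$, and is negative for $|x| > 1$.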

1.29 対数微分法

公式：$\displaystyle\dfrac{d}{dx}[f(x)]^{g(x)} = [f(x)]^{g(x)} \left[ g'(x) \ln f(x) + g(x) \dfrac{f'(x)}{f(x)} \right]$

条件：$f(x) > 0$、$f, g$ は微分可能

証明

$h(x) = [f(x)]^{g(x)}$ とおく。$f(x) > 0$ より、両辺の自然対数を取る。

\begin{equation}\ln h(x) = g(x) \ln f(x) \label{eq:1-29-1}\end{equation}

Differentiate both sides of $\eqref{eq:1-29-1}$ with respect to $x$. By the chain rule (1.26), the left-hand side is

\begin{equation}\dfrac{d}{dx} \ln h(x) = \dfrac{1}{h(x)} \cdot h'(x) = \dfrac{h'(x)}{h(x)} \label{eq:1-29-2}\end{equation}

By the product rule (1.25), together with the chain rule applied to $\ln f(x)$, the right-hand side is

\begin{equation}\dfrac{d}{dx}[g(x) \ln f(x)] = g'(x) \ln f(x) + g(x) \cdot \dfrac{f'(x)}{f(x)} \label{eq:1-29-3}\end{equation}

From $\eqref{eq:1-29-2}$ and $\eqref{eq:1-29-3}$,

\begin{equation}\dfrac{h'(x)}{h(x)} = g'(x) \ln f(x) + g(x) \dfrac{f'(x)}{f(x)} \label{eq:1-29-4}\end{equation}

Multiply both sides by $h(x) = [f(x)]^{g(x)}$.

\begin{equation}h'(x) = [f(x)]^{g(x)} \left[ g'(x) \ln f(x) + g(x) \dfrac{f'(x)}{f(x)} \right] \label{eq:1-29-5}\end{equation}

Note: Special cases. When $g(x) = n$ (a constant), $g'(x) = 0$ and the formula reduces to $\displaystyle \dfrac{d}{dx}[f(x)]^n = n[f(x)]^{n-1} f'(x)$. When $f(x) = x$ and $g(x) = x$ (with $x > 0$), it gives $\displaystyle \dfrac{d}{dx} x^x = x^x (\ln x + 1)$.
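The formula of 1.29 translates directly into code. The sketch below uses a hypothetical helper `log_diff_derivative` (a name introduced here, not from the text) and checks the special case $x^x$ against a central difference at an arbitrarily chosen point.

```python
import math

def log_diff_derivative(f, f_prime, g, g_prime, x):
    """Derivative of f(x)**g(x) via the formula of 1.29; requires f(x) > 0."""
    fx, gx = f(x), g(x)
    if fx <= 0:
        raise ValueError("logarithmic differentiation needs f(x) > 0")
    return fx ** gx * (g_prime(x) * math.log(fx) + gx * f_prime(x) / fx)

def central_diff(func, x, h=1e-5):
    # Symmetric difference quotient approximating func'(x).
    return (func(x + h) - func(x - h)) / (2.0 * h)

def identity(x):
    return x

def one(x):
    return 1.0

x0 = 1.5
# Special case f(x) = g(x) = x, i.e. the derivative of x**x.
formula = log_diff_derivative(identity, one, identity, one, x0)
closed_form = x0 ** x0 * (math.log(x0) + 1.0)
numeric = central_diff(lambda x: x ** x, x0)
print(formula, closed_form, numeric)
```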

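1.6 Derivatives of Trigonometric Functions

This section derives the derivative formulas for the trigonometric functions; their inverse functions are treated in 1.7. These formulas are used heavily in Fourier analysis and signal processing, and they are needed in matrix calculus whenever a function contains trigonometric terms.

1.30 Derivative of the Sine Function

Formula: $\displaystyle\dfrac{d}{dx} \sin x = \cos x$

Proof

Compute directly from the definition of the derivative.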

\begin{equation}\dfrac{d}{dx} \sin x = \lim_{h \to 0} \dfrac{\sin(x+h) - \sin x}{h} \label{eq:1-30-1}\end{equation}

Use the addition formula (1.7), $\sin(x+h) = \sin x \cos h + \cos x \sin h$.

\begin{equation}\dfrac{d}{dx} \sin x = \lim_{h \to 0} \dfrac{\sin x \cos h + \cos x \sin h - \sin x}{h} \label{eq:1-30-2}\end{equation}

Group the terms containing $\sin x$ and the term containing $\cos x$.

\begin{equation}\dfrac{d}{dx} \sin x = \lim_{h \to 0} \left[ \sin x \cdot \dfrac{\cos h - 1}{h} + \cos x \cdot \dfrac{\sin h}{h} \right] \label{eq:1-30-3}\end{equation}

We use the following basic limits (1.8, 1.9).

\begin{equation}\lim_{h \to 0} \dfrac{\sin h}{h} = 1 \label{eq:1-30-4}\end{equation}

\begin{equation}\lim_{h \to 0} \dfrac{\cos h - 1}{h} = 0 \label{eq:1-30-5}\end{equation}

Since $\sin x$ and $\cos x$ do not depend on $h$, substituting $\eqref{eq:1-30-4}$ and $\eqref{eq:1-30-5}$ into $\eqref{eq:1-30-3}$ gives

\begin{equation}\dfrac{d}{dx} \sin x = \sin x \cdot 0 + \cos x \cdot 1 = \cos x \label{eq:1-30-7}\end{equation}

Note: For the proofs of the basic limits, see 1.8 and 1.9.
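The two basic limits and the resulting difference quotient can be observed numerically. The following sketch (function names and the sample point $x = 1$ are illustrative choices made here) prints the three quantities for shrinking $h$; it illustrates the limits but does not replace their proofs.

```python
import math

def sine_quotient(h):
    # sin(h) / h, which tends to 1 as h -> 0
    return math.sin(h) / h

def cosine_quotient(h):
    # (cos(h) - 1) / h, which tends to 0 as h -> 0
    return (math.cos(h) - 1.0) / h

def sin_difference_quotient(x, h):
    # Forward difference quotient of sin at x, which tends to cos(x)
    return (math.sin(x + h) - math.sin(x)) / h

x0 = 1.0
for h in (1e-1, 1e-2, 1e-3, 1e-4):
    print(h, sine_quotient(h), cosine_quotient(h),
          sin_difference_quotient(x0, h), math.cos(x0))
```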

1.31 Derivative of the Cosine Function

Formula: $\displaystyle\dfrac{d}{dx} \cos x = -\sin x$

Proof

Use the relation $\cos x = \sin\left(\dfrac{\pi}{2} - x\right)$.

Apply the chain rule (1.26). Setting $u = \displaystyle \dfrac{\pi}{2} - x$ gives

\begin{equation}\dfrac{d}{dx} \cos x = \dfrac{d}{dx} \sin u = \dfrac{d(\sin u)}{du} \cdot \dfrac{du}{dx} \label{eq:1-31-1}\end{equation}

By 1.30, $\displaystyle \dfrac{d(\sin u)}{du} = \cos u$, and $\displaystyle \dfrac{du}{dx} = -1$.

\begin{equation}\dfrac{d}{dx} \cos x = \cos u \cdot (-1) = -\cos\left(\dfrac{\pi}{2} - x\right) = -\sin x \label{eq:1-31-2}\end{equation}

The last equality uses $\cos\left(\dfrac{\pi}{2} - x\right) = \sin x$.

1.32 Derivative of the Tangent Function

Formula: $\displaystyle\dfrac{d}{dx} \tan x = \sec^2 x = \dfrac{1}{\cos^2 x}$

Conditions: $\cos x \neq 0$

Proof

Apply the quotient rule (1.28) to $\tan x = \dfrac{\sin x}{\cos x}$.

\begin{equation}\dfrac{d}{dx} \tan x = \dfrac{(\sin x)' \cos x - \sin x (\cos x)'}{\cos^2 x} \label{eq:1-32-1}\end{equation}

By 1.30 and 1.31, substitute $(\sin x)' = \cos x$ and $(\cos x)' = -\sin x$.

\begin{equation}\dfrac{d}{dx} \tan x = \dfrac{\cos x \cdot \cos x - \sin x \cdot (-\sin x)}{\cos^2 x} = \dfrac{\cos^2 x + \sin^2 x}{\cos^2 x} \label{eq:1-32-2}\end{equation}

By the Pythagorean identity (1.6), $\cos^2 x + \sin^2 x = 1$, so

\begin{equation}\dfrac{d}{dx} \tan x = \dfrac{1}{\cos^2 x} = \sec^2 x \label{eq:1-32-3}\end{equation}

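1.33 Derivatives of the Other Trigonometric Functions

Formulas:
$\displaystyle \dfrac{d}{dx} \cot x = -\csc^2 x$
$\displaystyle \dfrac{d}{dx} \sec x = \sec x \tan x$
$\displaystyle \dfrac{d}{dx} \csc x = -\csc x \cot x$

Conditions: $\sin x \neq 0$ for $\cot x$ and $\csc x$; $\cos x \neq 0$ for $\sec x$

Proof

Derivative of $\cot x$:

Apply the quotient rule (1.28) to $\cot x = \dfrac{\cos x}{\sin x}$.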

\begin{equation}\dfrac{d}{dx} \cot x = \dfrac{-\sin x \cdot \sin x - \cos x \cdot \cos x}{\sin^2 x} = \dfrac{-(\sin^2 x + \cos^2 x)}{\sin^2 x} = -\dfrac{1}{\sin^2 x} = -\csc^2 x \label{eq:1-33-1}\end{equation}

Derivative of $\sec x$:

Apply the chain rule (1.26) to $\sec x = \dfrac{1}{\cos x} = (\cos x)^{-1}$.

\begin{equation}\dfrac{d}{dx} \sec x = -(\cos x)^{-2} \cdot (-\sin x) = \dfrac{\sin x}{\cos^2 x} = \dfrac{1}{\cos x} \cdot \dfrac{\sin x}{\cos x} = \sec x \tan x \label{eq:1-33-2}\end{equation}

Derivative of $\csc x$:

Apply the chain rule (1.26) to $\csc x = \dfrac{1}{\sin x} = (\sin x)^{-1}$.

\begin{equation}\dfrac{d}{dx} \csc x = -(\sin x)^{-2} \cdot \cos x = -\dfrac{\cos x}{\sin^2 x} = -\dfrac{1}{\sin x} \cdot \dfrac{\cos x}{\sin x} = -\csc x \cot x \label{eq:1-33-3}\end{equation}
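The three formulas of 1.33 can be checked together with a small table-driven test. This is a sketch under stated assumptions: Python's `math` module has no `cot`, `sec`, or `csc`, so they are defined here from `sin` and `cos`, and the sample point is an arbitrary value where all three functions are defined.

```python
import math

def cot(x):
    return math.cos(x) / math.sin(x)

def sec(x):
    return 1.0 / math.cos(x)

def csc(x):
    return 1.0 / math.sin(x)

# Each entry pairs a function with the derivative formula claimed in 1.33.
derivative_table = {
    "cot": (cot, lambda x: -csc(x) ** 2),
    "sec": (sec, lambda x: sec(x) * math.tan(x)),
    "csc": (csc, lambda x: -csc(x) * cot(x)),
}

def central_diff(func, x, h=1e-5):
    # Symmetric difference quotient approximating func'(x).
    return (func(x + h) - func(x - h)) / (2.0 * h)

def max_error(x):
    # Largest gap between the numerical derivative and the closed form.
    return max(abs(central_diff(func, x) - formula(x))
               for func, formula in derivative_table.values())

print(max_error(0.8))
```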

1.7 Derivatives of Inverse Trigonometric Functions

1.34 Derivative of the Inverse Sine Function

Formula: $\displaystyle\dfrac{d}{dx} \arcsin x = \dfrac{1}{\sqrt{1 - x^2}}$

Conditions: $-1 < x < 1$

Proof

Let $y = \arcsin x$. Then $x = \sin y$ with $-\dfrac{\pi}{2} \leq y \leq \dfrac{\pi}{2}$.

Apply the inverse function rule (1.27).

\begin{equation}\dfrac{dy}{dx} = \dfrac{1}{\dfrac{dx}{dy}} = \dfrac{1}{\cos y} \label{eq:1-34-1}\end{equation}

Express $\cos y$ in terms of $x$. From $\sin^2 y + \cos^2 y = 1$ (1.6),

\begin{equation}\cos y = \pm\sqrt{1 - \sin^2 y} = \pm\sqrt{1 - x^2} \label{eq:1-34-2}\end{equation}

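For $-\dfrac{\pi}{2} \leq y \leq \dfrac{\pi}{2}$ we have $\cos y \geq 0$, so we take the nonnegative square root. (For $-1 < x < 1$ this root is strictly positive, so the division in $\eqref{eq:1-34-1}$ is valid.)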

\begin{equation}\cos y = \sqrt{1 - x^2} \label{eq:1-34-3}\end{equation}

Substitute $\eqref{eq:1-34-3}$ into $\eqref{eq:1-34-1}$.

\begin{equation}\dfrac{d}{dx} \arcsin x = \dfrac{1}{\sqrt{1 - x^2}} \label{eq:1-34-4}\end{equation}

1.35 Derivative of the Inverse Cosine Function

Formula: $\displaystyle\dfrac{d}{dx} \arccos x = -\dfrac{1}{\sqrt{1 - x^2}}$

Conditions: $-1 < x < 1$

Proof

Let $y = \arccos x$. Then $x = \cos y$ with $0 \leq y \leq \pi$.

Apply the inverse function rule (1.27).

\begin{equation}\dfrac{dy}{dx} = \dfrac{1}{\dfrac{dx}{dy}} = \dfrac{1}{-\sin y} \label{eq:1-35-1}\end{equation}

Express $\sin y$ in terms of $x$. From $\sin^2 y + \cos^2 y = 1$ (1.6),

\begin{equation}\sin y = \pm\sqrt{1 - \cos^2 y} = \pm\sqrt{1 - x^2} \label{eq:1-35-2}\end{equation}

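For $0 \leq y \leq \pi$ we have $\sin y \geq 0$, so we take the nonnegative square root. (For $-1 < x < 1$ this root is strictly positive, so the division in $\eqref{eq:1-35-1}$ is valid.)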

\begin{equation}\sin y = \sqrt{1 - x^2} \label{eq:1-35-3}\end{equation}

Substitute $\eqref{eq:1-35-3}$ into $\eqref{eq:1-35-1}$.

\begin{equation}\dfrac{d}{dx} \arccos x = -\dfrac{1}{\sqrt{1 - x^2}} \label{eq:1-35-4}\end{equation}

Note: Differentiating the identity $\displaystyle \arcsin x + \arccos x = \dfrac{\pi}{2}$ confirms that $\displaystyle \dfrac{d}{dx} \arccos x = -\dfrac{d}{dx} \arcsin x$.

1.36 Derivative of the Inverse Tangent Function

Formula: $\displaystyle\dfrac{d}{dx} \arctan x = \dfrac{1}{1 + x^2}$

Proof

Let $y = \arctan x$. Then $x = \tan y$ with $-\dfrac{\pi}{2} < y < \dfrac{\pi}{2}$.

Apply the inverse function rule (1.27), using $\dfrac{dx}{dy} = \sec^2 y$ from 1.32.

\begin{equation}\dfrac{dy}{dx} = \dfrac{1}{\dfrac{dx}{dy}} = \dfrac{1}{\sec^2 y} = \cos^2 y \label{eq:1-36-1}\end{equation}

Express $\cos^2 y$ in terms of $x$. From $\sec^2 y = 1 + \tan^2 y$,

\begin{equation}\cos^2 y = \dfrac{1}{\sec^2 y} = \dfrac{1}{1 + \tan^2 y} = \dfrac{1}{1 + x^2} \label{eq:1-36-2}\end{equation}

Substitute $\eqref{eq:1-36-2}$ into $\eqref{eq:1-36-1}$.

\begin{equation}\dfrac{d}{dx} \arctan x = \dfrac{1}{1 + x^2} \label{eq:1-36-3}\end{equation}

Note: This result means that $\displaystyle\int \dfrac{1}{1+x^2}\, dx = \arctan x + C$, an antiderivative that appears frequently in integration.
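The formulas of 1.34 to 1.36 can be spot-checked at a few points inside $(-1, 1)$. The sketch below is illustrative only: the sample points and step size are arbitrary choices made here, kept away from $x = \pm 1$ where the derivatives of $\arcsin$ and $\arccos$ are unbounded.

```python
import math

def central_diff(func, x, h=1e-6):
    # Symmetric difference quotient approximating func'(x).
    return (func(x + h) - func(x - h)) / (2.0 * h)

def d_arcsin(x):
    return 1.0 / math.sqrt(1.0 - x * x)

def d_arccos(x):
    return -1.0 / math.sqrt(1.0 - x * x)

def d_arctan(x):
    return 1.0 / (1.0 + x * x)

samples = [-0.9, -0.3, 0.0, 0.5, 0.9]   # strictly inside (-1, 1)

errors = []
for x in samples:
    errors.append(abs(central_diff(math.asin, x) - d_arcsin(x)))
    errors.append(abs(central_diff(math.acos, x) - d_arccos(x)))
    errors.append(abs(central_diff(math.atan, x) - d_arctan(x)))

worst = max(errors)
print(worst)
```

Because $\arcsin x + \arccos x = \dfrac{\pi}{2}$, the two closed forms `d_arcsin` and `d_arccos` cancel exactly at every sample point.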

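1.8 Derivatives of Hyperbolic Functions

1.37 Derivative of the Hyperbolic Sine

Formula: $\displaystyle\dfrac{d}{dx} \sinh x = \cosh x$

Proof

Differentiate the definition $\sinh x = \dfrac{e^x - e^{-x}}{2}$, using linearity (1.24).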

\begin{equation}\dfrac{d}{dx} \sinh x = \dfrac{d}{dx} \dfrac{e^x - e^{-x}}{2} = \dfrac{1}{2} \left( \dfrac{d}{dx} e^x - \dfrac{d}{dx} e^{-x} \right) \label{eq:1-37-1}\end{equation}

By 1.20 and the chain rule (1.26), $\displaystyle \dfrac{d}{dx} e^x = e^x$ and $\displaystyle \dfrac{d}{dx} e^{-x} = -e^{-x}$.

\begin{equation}\dfrac{d}{dx} \sinh x = \dfrac{1}{2} (e^x - (-e^{-x})) = \dfrac{e^x + e^{-x}}{2} = \cosh x \label{eq:1-37-2}\end{equation}

1.38 Derivative of the Hyperbolic Cosine

Formula: $\displaystyle\dfrac{d}{dx} \cosh x = \sinh x$

Proof

Differentiate the definition $\cosh x = \dfrac{e^x + e^{-x}}{2}$, using linearity (1.24).

\begin{equation}\dfrac{d}{dx} \cosh x = \dfrac{d}{dx} \dfrac{e^x + e^{-x}}{2} = \dfrac{1}{2} \left( \dfrac{d}{dx} e^x + \dfrac{d}{dx} e^{-x} \right) \label{eq:1-38-1}\end{equation}

As in 1.37, $\displaystyle \dfrac{d}{dx} e^x = e^x$ and $\displaystyle \dfrac{d}{dx} e^{-x} = -e^{-x}$, so

\begin{equation}\dfrac{d}{dx} \cosh x = \dfrac{1}{2} (e^x + (-e^{-x})) = \dfrac{e^x - e^{-x}}{2} = \sinh x \label{eq:1-38-2}\end{equation}

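Note: Unlike the trigonometric case, $(\cosh x)' = \sinh x$ carries no minus sign. This mirrors the sign difference between the hyperbolic identity $\cosh^2 x - \sinh^2 x = 1$ (1.10) and the Pythagorean identity $\cos^2 x + \sin^2 x = 1$ (1.6).

1.39 Derivative of the Hyperbolic Tangent

Formula: $\displaystyle\dfrac{d}{dx} \tanh x = \text{sech}^2 x = 1 - \tanh^2 x$

Proof

Apply the quotient rule (1.28) to $\tanh x = \dfrac{\sinh x}{\cosh x}$; note that $\cosh x > 0$ for all $x$.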

\begin{equation}\dfrac{d}{dx} \tanh x = \dfrac{(\sinh x)' \cosh x - \sinh x (\cosh x)'}{\cosh^2 x} \label{eq:1-39-1}\end{equation}

By 1.37 and 1.38, substitute $(\sinh x)' = \cosh x$ and $(\cosh x)' = \sinh x$.

\begin{equation}\dfrac{d}{dx} \tanh x = \dfrac{\cosh x \cdot \cosh x - \sinh x \cdot \sinh x}{\cosh^2 x} = \dfrac{\cosh^2 x - \sinh^2 x}{\cosh^2 x} \label{eq:1-39-2}\end{equation}

By the hyperbolic identity (1.10), $\cosh^2 x - \sinh^2 x = 1$, so

\begin{equation}\dfrac{d}{dx} \tanh x = \dfrac{1}{\cosh^2 x} = \text{sech}^2 x \label{eq:1-39-3}\end{equation}

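In addition, $\text{sech}^2 x = 1 - \tanh^2 x$ holds, because $\dfrac{1}{\cosh^2 x} = \dfrac{\cosh^2 x - \sinh^2 x}{\cosh^2 x} = 1 - \tanh^2 x$.

Note: $\tanh$ is used as an activation function in neural networks. Because the gradient $1 - \tanh^2 x$ can be computed from the output $\tanh x$ alone, the backward pass can reuse the value already computed in the forward pass.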
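The remark about backpropagation can be sketched as a forward/backward pair. The function names `tanh_forward` and `tanh_backward` and the `upstream` argument are hypothetical, introduced here for illustration; they are not the API of any particular framework.

```python
import math

def tanh_forward(x):
    # Forward pass: return the activation; a caller would cache it for later.
    return math.tanh(x)

def tanh_backward(y, upstream=1.0):
    # Backward pass: uses only the cached output y = tanh(x), not x itself.
    # d/dx tanh(x) = 1 - tanh(x)^2 = 1 - y^2
    return upstream * (1.0 - y * y)

x0 = 0.5
y0 = tanh_forward(x0)
grad = tanh_backward(y0)
reference = 1.0 / math.cosh(x0) ** 2     # sech^2(x), computed independently

print(grad, reference)
```

The design point is that `tanh_backward` never evaluates an exponential: one multiplication and one subtraction on the cached output are enough.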

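1.9 Other Important Derivative Formulas

1.40 Derivative of the Absolute Value Function

Formula: $\displaystyle\dfrac{d}{dx} |x| = \text{sgn}(x) = \begin{cases} 1 & (x > 0) \\ -1 & (x < 0) \end{cases}$

Conditions: $x \neq 0$ (not differentiable at $x = 0$)

Proof

For $x \neq 0$, write $|x| = \sqrt{x^2}$ and apply the chain rule (1.26).

Let $u = x^2$, so that $u > 0$ and $|x| = u^{1/2}$.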

\begin{equation}\dfrac{d}{dx}|x| = \dfrac{d(u^{1/2})}{du} \cdot \dfrac{du}{dx} = \dfrac{1}{2}u^{-1/2} \cdot 2x = \dfrac{x}{\sqrt{x^2}} = \dfrac{x}{|x|} \label{eq:1-40-1}\end{equation}

For $x > 0$, $\dfrac{x}{|x|} = \dfrac{x}{x} = 1$; for $x < 0$, $\dfrac{x}{|x|} = \dfrac{x}{-x} = -1$.

\begin{equation}\dfrac{d}{dx}|x| = \text{sgn}(x) = \begin{cases} 1 & (x > 0) \\ -1 & (x < 0) \end{cases} \label{eq:1-40-2}\end{equation}

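Note: At $x = 0$ the left-hand and right-hand limits of the difference quotient are $-1$ and $+1$, which do not agree, so $|x|$ is not differentiable there. This derivative appears in the subgradient of the L1 regularizer $\|\boldsymbol{w}\|_1$ in machine learning.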
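The disagreement of the one-sided limits at $x = 0$ is easy to see numerically. In the sketch below, `one_sided_quotients` and `sign` are hypothetical helper names; returning $0$ from `sign(0)` is one common convention, and it is a valid subgradient of $|x|$ at $0$ because any value in $[-1, 1]$ is.

```python
def one_sided_quotients(func, a, h):
    # Difference quotients approaching a from the left and from the right.
    right = (func(a + h) - func(a)) / h
    left = (func(a - h) - func(a)) / (-h)
    return left, right

def sign(x):
    # sgn(x) for x != 0; at x = 0 this returns 0, one element of the
    # subdifferential [-1, 1] of |x| at the origin.
    return (x > 0) - (x < 0)

left0, right0 = one_sided_quotients(abs, 0.0, 1e-3)   # at the kink
left2, right2 = one_sided_quotients(abs, 2.0, 1e-3)   # away from the kink

print(left0, right0)
print(left2, right2, sign(2.0))
```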

1.41 Derivative of the Sigmoid Function

Formula: $\displaystyle\dfrac{d}{dx} \sigma(x) = \sigma(x)(1 - \sigma(x))$

Conditions: $\displaystyle \sigma(x) = \dfrac{1}{1 + e^{-x}}$ (the sigmoid function)

Proof

Differentiate $\sigma(x) = \dfrac{1}{1 + e^{-x}} = (1 + e^{-x})^{-1}$ with the chain rule (1.26).

Let $u = 1 + e^{-x}$, so that $\sigma = u^{-1}$.

\begin{equation}\dfrac{d\sigma}{dx} = \dfrac{d(u^{-1})}{du} \cdot \dfrac{du}{dx} = (-u^{-2}) \cdot (-e^{-x}) = \dfrac{e^{-x}}{(1 + e^{-x})^2} \label{eq:1-41-1}\end{equation}

Express this result in terms of $\sigma(x)$. From $\sigma = \dfrac{1}{1 + e^{-x}}$,

\begin{equation}1 - \sigma = 1 - \dfrac{1}{1 + e^{-x}} = \dfrac{e^{-x}}{1 + e^{-x}} \label{eq:1-41-2}\end{equation}

Therefore

\begin{equation}\sigma(1 - \sigma) = \dfrac{1}{1 + e^{-x}} \cdot \dfrac{e^{-x}}{1 + e^{-x}} = \dfrac{e^{-x}}{(1 + e^{-x})^2} \label{eq:1-41-3}\end{equation}

Comparing $\eqref{eq:1-41-1}$ with $\eqref{eq:1-41-3}$ gives

\begin{equation}\dfrac{d\sigma}{dx} = \sigma(1 - \sigma) \label{eq:1-41-4}\end{equation}

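Note: The sigmoid function is used as a neural network activation function and in logistic regression. Because the gradient can be computed from $\sigma$ itself, it is efficient in backpropagation. The derivative attains its maximum at $x = 0$, where $\sigma'(0) = 0.25$.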
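The following sketch evaluates the sigmoid gradient from the output and locates its maximum on a grid. The branch on the sign of $x$ is an assumption added here to avoid overflow in `math.exp` for large negative inputs; it is algebraically the same function as $\dfrac{1}{1 + e^{-x}}$. The grid is an arbitrary illustrative choice.

```python
import math

def sigmoid(x):
    # Both branches equal 1 / (1 + e^{-x}); the split avoids exp overflow.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def sigmoid_grad_from_output(s):
    # sigma'(x) = sigma(x) * (1 - sigma(x)), using only the output s.
    return s * (1.0 - s)

grid = [k / 10.0 for k in range(-80, 81)]        # -8.0, -7.9, ..., 8.0
grads = [sigmoid_grad_from_output(sigmoid(x)) for x in grid]

largest = max(grads)
argmax = grid[grads.index(largest)]
print(argmax, largest)
```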

1.42 Derivative of the Softplus Function

Formula: $\displaystyle\dfrac{d}{dx} \ln(1 + e^x) = \sigma(x) = \dfrac{1}{1 + e^{-x}}$

Proof

Differentiate $f(x) = \ln(1 + e^x)$ (the Softplus function) with the chain rule (1.26).

Let $u = 1 + e^x$, so that $f = \ln u$.

\begin{equation}\dfrac{df}{dx} = \dfrac{d(\ln u)}{du} \cdot \dfrac{du}{dx} = \dfrac{1}{u} \cdot e^x = \dfrac{e^x}{1 + e^x} \label{eq:1-42-1}\end{equation}

Multiply the numerator and the denominator by $e^{-x}$.

\begin{equation}\dfrac{df}{dx} = \dfrac{e^x \cdot e^{-x}}{(1 + e^x) \cdot e^{-x}} = \dfrac{1}{e^{-x} + 1} = \dfrac{1}{1 + e^{-x}} = \sigma(x) \label{eq:1-42-2}\end{equation}

Note: Softplus is a smooth approximation of ReLU, $\max(0, x)$. The relation $\displaystyle \dfrac{d}{dx} \text{softplus}(x) = \sigma(x)$ means that Softplus is an antiderivative of the sigmoid.
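A numerical check of the Softplus derivative is sketched below. The rewritten form $\max(x, 0) + \ln(1 + e^{-|x|})$ is an assumption introduced here for numerical stability; it equals $\ln(1 + e^x)$ for every real $x$ but does not overflow for large inputs. The sample points are arbitrary.

```python
import math

def softplus(x):
    # Equals ln(1 + e^x); written so that exp never receives a large argument.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def sigmoid(x):
    # Overflow-safe evaluation of 1 / (1 + e^{-x}).
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def relu(x):
    return max(0.0, x)

def central_diff(func, x, h=1e-5):
    # Symmetric difference quotient approximating func'(x).
    return (func(x + h) - func(x - h)) / (2.0 * h)

x0 = 0.3
numeric = central_diff(softplus, x0)       # should be close to sigmoid(x0)
gap_at_zero = softplus(0.0) - relu(0.0)    # ln 2, the gap to ReLU at x = 0

print(numeric, sigmoid(x0), gap_at_zero)
```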

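1.43 Leibniz's Formula (Higher-Order Derivatives of a Product)

Formula: $\displaystyle (fg)^{(n)} = \displaystyle\sum_{k=0}^{n} \binom{n}{k} f^{(k)} g^{(n-k)}$

Conditions: $f, g$ are $n$ times differentiable

Proof

We proceed by mathematical induction on $n$.

Base case ($n = 1$):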

\begin{equation}(fg)' = f'g + fg' = \binom{1}{0}f^{(0)}g^{(1)} + \binom{1}{1}f^{(1)}g^{(0)} \label{eq:1-43-1}\end{equation}

This agrees with the product rule (1.25).

Inductive step:

Assume that the formula holds for $n = m$.

\begin{equation}(fg)^{(m)} = \displaystyle\sum_{k=0}^{m} \binom{m}{k} f^{(k)} g^{(m-k)} \label{eq:1-43-2}\end{equation}

We show the case $n = m + 1$. Differentiate both sides of $\eqref{eq:1-43-2}$, applying the product rule (1.25) to each term.

\begin{equation}(fg)^{(m+1)} = \displaystyle\sum_{k=0}^{m} \binom{m}{k} \left( f^{(k+1)} g^{(m-k)} + f^{(k)} g^{(m-k+1)} \right) \label{eq:1-43-3}\end{equation}

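Split $\eqref{eq:1-43-3}$ into two sums and shift the index $k \to k - 1$ in the first. The coefficient of $f^{(k)} g^{(m+1-k)}$ is then $\binom{m}{k-1} + \binom{m}{k}$ for $1 \leq k \leq m$, and $1$ for $k = 0$ and $k = m + 1$. Applying Pascal's identity (1.4), $\binom{m}{k-1} + \binom{m}{k} = \binom{m+1}{k}$, gives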

\begin{equation}(fg)^{(m+1)} = \displaystyle\sum_{k=0}^{m+1} \binom{m+1}{k} f^{(k)} g^{(m+1-k)} \label{eq:1-43-4}\end{equation}

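Hence the formula also holds for $n = m + 1$, which completes the induction.

Note: This formula is the differentiation analogue of the binomial theorem (1.5) and is used, for example, when computing Taylor expansions.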
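Leibniz's formula can be verified exactly on polynomials, where differentiation is a finite integer computation. The sketch below represents a polynomial as a list of coefficients (index $k$ holds the coefficient of $x^k$); this representation, the helper names, and the two sample polynomials are illustrative assumptions made here.

```python
from math import comb

def poly_mul(p, q):
    # Product of two coefficient lists.
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_add(p, q):
    # Sum of two coefficient lists, padding the shorter one with zeros.
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

def poly_diff(p, times=1):
    # Differentiate `times` times: x^k -> k x^(k-1); constants become [0].
    for _ in range(times):
        p = [k * c for k, c in enumerate(p)][1:] or [0]
    return p

def trim(p):
    # Drop trailing zero coefficients so equal polynomials compare equal.
    while len(p) > 1 and p[-1] == 0:
        p = p[:-1]
    return p

def leibniz(f, g, n):
    # Right-hand side: sum over k of C(n, k) * f^(k) * g^(n-k).
    total = [0]
    for k in range(n + 1):
        term = poly_mul(poly_diff(f, k), poly_diff(g, n - k))
        total = poly_add(total, [comb(n, k) * c for c in term])
    return trim(total)

def direct(f, g, n):
    # Left-hand side: differentiate the product n times.
    return trim(poly_diff(poly_mul(f, g), n))

f = [1, 2, 0, 3]     # 1 + 2x + 3x^3
g = [0, 1, 4]        # x + 4x^2
print(leibniz(f, g, 3), direct(f, g, 3))
```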

