Differentiation on Matrix Manifold

Author: Dr. Jack Yansong Li
Affiliation: Liii Network
Email: yansong@liii.pro

Question

How do we calculate the derivative of $f(X) = \log \det(X)$ , where $X \in \mathbb{S}^n$ is an $n \times n$ real symmetric positive-definite matrix? And the derivative of $g(x) = x^{\top} A x$ , where $x \in \mathbb{R}^n$ and $A \in \mathbb{S}^n$ ?

Definitions

Scalar field: A mapping $\psi : \mathcal{M} \rightarrow \mathbb{R}$ . The set of all scalar fields on $\mathcal{M}$ is denoted by $\mathcal{F}_{\mathcal{M}}$ .

Example: The functions $f$ and $g$ in the question are scalar fields.

Taylor expansion for a scalar field $f$ :

f(x + \Delta x) = f(x) + \mathrm{D} f(x) [\Delta x] + \frac{1}{2} \mathrm{D}^2 f(x) [\Delta x, \Delta x] + \text{h.o.t},

where $\mathrm{D} f$ maps $\mathcal{M} \rightarrow \mathcal{L}(\mathcal{M}, \mathbb{R})$ and $\mathrm{D}^2 f$ maps $\mathcal{M} \rightarrow \operatorname{BL}(\mathcal{M}, \mathbb{R})$ .

Notation:

$\mathcal{L}(\mathcal{M}, \mathbb{R})$ : set of linear maps $\mathcal{M} \to \mathbb{R}$
$\operatorname{BL}(\mathcal{M}, \mathbb{R})$ : set of bilinear maps $\mathcal{M} \times \mathcal{M} \to \mathbb{R}$

Remark: $\mathrm{D}^2 f = \mathrm{D}(\mathrm{D} f)$ and $\mathcal{L}(\mathcal{M}, \mathcal{L}(\mathcal{M}, \mathbb{R})) \cong \operatorname{BL}(\mathcal{M}, \mathbb{R})$ .

Vector at $x \in \mathcal{M}$ : a linear map $v : \mathcal{F}_{\mathcal{M}} \rightarrow \mathbb{R}$ that satisfies the Leibniz rule $v(\psi \phi) = \psi(x)\, v(\phi) + \phi(x)\, v(\psi)$ .

Example: The perturbation $\Delta x$ acts as a vector at $x$ via:

\Delta x(f) \triangleq \mathrm{D} f(x) [\Delta x].
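
The action of a perturbation on a scalar field can be illustrated numerically. The sketch below is not part of the original notes: the scalar field `psi`, the point `x`, the direction `dx`, and the helper `directional_derivative` are all hypothetical choices made for illustration. It estimates $\mathrm{D} \psi(x)[\Delta x]$ by a central difference and compares it with the value computed by hand.

```python
import numpy as np

def directional_derivative(f, x, dx, eps=1e-6):
    """Central-difference estimate of D f(x)[dx]."""
    return (f(x + eps * dx) - f(x - eps * dx)) / (2.0 * eps)

# Hypothetical scalar field on R^3, chosen only for illustration.
def psi(x):
    return np.sin(x[0]) + x[1] * x[2]

x = np.array([0.3, -1.2, 2.0])
dx = np.array([1.0, 0.5, -2.0])

# Hand-computed D psi(x)[dx] = cos(x0)*dx0 + x2*dx1 + x1*dx2.
exact = np.cos(x[0]) * dx[0] + x[2] * dx[1] + x[1] * dx[2]
approx = directional_derivative(psi, x, dx)

print("finite-difference error:", abs(approx - exact))
```

The same helper works for any scalar field that accepts a NumPy array, so it can serve as a quick sanity check on hand-derived derivatives.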

Derivative of $g(x) = x^{\top} A x$

Expand:

g(x + \Delta x) = (x + \Delta x)^{\top} A (x + \Delta x) = x^{\top} A x + x^{\top} A \Delta x + \Delta x^{\top} A x + \Delta x^{\top} A \Delta x.

Using symmetry of $A$ ( $\Delta x^{\top} A x = x^{\top} A \Delta x$ ):

g(x + \Delta x) \approx g(x) + 2 x^{\top} A \Delta x.

By definition of Taylor expansion:

\mathrm{D} g(x) [\Delta x] = 2 x^{\top} A \Delta x. \tag{1}

$\mathrm{D} g(x) \in \mathcal{L}(\mathbb{R}^n, \mathbb{R})$ . By the Riesz representation theorem, there exists a unique vector $\nabla g(x) \in \mathbb{R}^n$ , the gradient, such that for every $\Delta x$ :

\mathrm{D} g(x) [\Delta x] = \langle \nabla g(x), \Delta x \rangle_{\mathbb{R}^n}.

Comparing with (1) and using $A^{\top} = A$ gives $\langle \nabla g(x), \Delta x \rangle = 2 x^{\top} A \Delta x = \langle 2 A x, \Delta x \rangle$ for every $\Delta x$ , so:

\boxed{\nabla g(x) = 2 A x}.
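
A quick numerical check of this result may be helpful. The following is a minimal sketch, assuming NumPy is available; the random symmetric matrix `A`, the point `x`, and the step size `eps` are arbitrary illustrative choices. It compares the closed-form gradient $2 A x$ with a central-difference gradient taken along each coordinate direction.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4

# Random symmetric A and a random point x (illustrative data only).
B = rng.standard_normal((n, n))
A = (B + B.T) / 2.0
x = rng.standard_normal(n)

def g(x):
    """Quadratic form g(x) = x^T A x."""
    return x @ A @ x

# Closed-form gradient derived in the text.
grad_closed_form = 2.0 * A @ x

# Central differences along each standard basis vector e_i.
eps = 1e-6
grad_fd = np.array(
    [(g(x + eps * e) - g(x - eps * e)) / (2.0 * eps) for e in np.eye(n)]
)

max_err = np.max(np.abs(grad_fd - grad_closed_form))
print("max abs difference:", max_err)
```

If `A` were not symmetric, the same expansion would give $(A + A^{\top}) x$ instead, which is why the symmetry assumption matters in the step above.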

Derivative of $f(X) = \log \det(X)$

Expand:

f(X + \Delta X) = \log \det(X + \Delta X).

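Since $X$ is positive definite, it has a symmetric, invertible square root $X^{1/2}$ with $X = X^{1/2} X^{1/2}$ . Factoring it out of $X + \Delta X$ :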

f(X + \Delta X) = \log \det(X^{1/2} (I + X^{-1/2} \Delta X X^{-1/2}) X^{1/2}).

Using $\det(AB) = \det(A)\det(B)$ and $\det(X^{1/2})^2 = \det(X)$ :

f(X + \Delta X) = \log \det(X) + \log \det(I + X^{-1/2} \Delta X X^{-1/2}).

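Let $\lambda_i$ be the eigenvalues of the symmetric matrix $X^{-1/2} \Delta X X^{-1/2}$ . The eigenvalues of $I + X^{-1/2} \Delta X X^{-1/2}$ are then $1 + \lambda_i$ , and the determinant is their product, so: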

f(X + \Delta X) = \log \det(X) + \sum_{i=1}^n \log(1 + \lambda_i).

For small $\Delta X$ every $\lambda_i$ is small, and $\log(1 + \lambda_i) = \lambda_i + O(\lambda_i^2)$ , so to first order:

f(X + \Delta X) \approx \log \det(X) + \sum_{i=1}^n \lambda_i.
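
The eigenvalue step can be checked numerically. The sketch below is an illustration under stated assumptions, not part of the original derivation: the matrix `X`, the perturbation `dX`, and its scale `1e-3` are arbitrary, and the inverse square root is built from an eigendecomposition. It compares the exact increment of $\log \det$ with $\sum \log(1 + \lambda_i)$ and with the first-order term $\sum \lambda_i$ .

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4

# Illustrative symmetric positive-definite X and small symmetric dX.
B = rng.standard_normal((n, n))
X = B @ B.T + n * np.eye(n)
S = rng.standard_normal((n, n))
dX = 1e-3 * (S + S.T) / 2.0

# Symmetric inverse square root of X from its eigendecomposition.
w, V = np.linalg.eigh(X)
X_inv_sqrt = V @ np.diag(w ** -0.5) @ V.T

# Eigenvalues of X^{-1/2} dX X^{-1/2} (symmetrized against round-off).
M = X_inv_sqrt @ dX @ X_inv_sqrt
lam = np.linalg.eigvalsh((M + M.T) / 2.0)

# slogdet returns (sign, log|det|); both matrices here are positive definite.
_, logdet_X = np.linalg.slogdet(X)
_, logdet_Xp = np.linalg.slogdet(X + dX)

exact_increment = logdet_Xp - logdet_X
sum_log = np.sum(np.log1p(lam))   # sum of log(1 + lambda_i)
first_order = np.sum(lam)         # sum of lambda_i

print("exact vs sum log(1+lam):", abs(exact_increment - sum_log))
print("exact vs sum lam       :", abs(exact_increment - first_order))
```

The first gap should sit at round-off level, while the second reflects the neglected second-order terms and shrinks as the scale of `dX` is reduced.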

Since $\sum \lambda_i = \operatorname{tr}(X^{-1/2} \Delta X X^{-1/2})$ and $\operatorname{tr}(AB) = \operatorname{tr}(BA)$ :

\sum \lambda_i = \operatorname{tr}(X^{-1} \Delta X).

Thus:

f(X + \Delta X) \approx f(X) + \operatorname{tr}(X^{-1} \Delta X).

Hence:

\mathrm{D} f(X) [\Delta X] = \operatorname{tr}(X^{-1} \Delta X).

On $\mathbb{S}^n$ , we use the trace (Frobenius) inner product $\langle A, B \rangle = \operatorname{tr}(A^{\top} B)$ . Since $X^{-1}$ is symmetric:

\operatorname{tr}(X^{-1} \Delta X) = \langle X^{-1}, \Delta X \rangle_{\mathbb{S}^n}.

Therefore:

\boxed{\nabla f(X) = X^{-1}}.
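
As a sanity check on this result, the directional derivative can be compared with a finite difference. This is a minimal sketch, assuming NumPy; the test matrix `X`, the symmetric direction `dX`, and the step `eps` are illustrative choices rather than anything prescribed by the text.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4

# Illustrative symmetric positive-definite X and symmetric direction dX.
B = rng.standard_normal((n, n))
X = B @ B.T + n * np.eye(n)
S = rng.standard_normal((n, n))
dX = (S + S.T) / 2.0

def f(X):
    """log det(X), computed stably via slogdet."""
    _, logabsdet = np.linalg.slogdet(X)
    return logabsdet

# Central-difference estimate of D f(X)[dX].
eps = 1e-6
fd = (f(X + eps * dX) - f(X - eps * dX)) / (2.0 * eps)

# Closed form tr(X^{-1} dX), using a linear solve instead of an explicit inverse.
closed_form = np.trace(np.linalg.solve(X, dX))

# The same number written as the trace inner product <X^{-1}, dX>.
inner_product = np.sum(np.linalg.inv(X) * dX)

print("finite difference vs closed form:", abs(fd - closed_form))
print("closed form vs inner product    :", abs(closed_form - inner_product))
```

Restricting `dX` to symmetric matrices keeps the perturbed matrix inside $\mathbb{S}^n$ , which matches the setting of the derivation.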

Useful Identities

$\det(AB) = \det(A)\det(B)$
$\operatorname{tr}(AB) = \operatorname{tr}(BA)$
$\operatorname{tr}(X) = \sum_{i=1}^n \lambda_i$ , where $\lambda_i$ are eigenvalues of $X$ .
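
These identities are easy to spot-check on random matrices. The sketch below assumes NumPy and uses arbitrary $4 \times 4$ Gaussian matrices; it illustrates the identities on one sample and is not a proof.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 4
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, n))

# det(AB) = det(A) det(B)
det_gap = abs(np.linalg.det(A @ B) - np.linalg.det(A) * np.linalg.det(B))

# tr(AB) = tr(BA)
trace_gap = abs(np.trace(A @ B) - np.trace(B @ A))

# tr(A) = sum of eigenvalues (eigenvalues may be complex for non-symmetric A,
# but their sum is real up to round-off).
eig_gap = abs(np.trace(A) - np.sum(np.linalg.eigvals(A)))

print(det_gap, trace_gap, eig_gap)
```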

Exercise

Calculate the second derivative operator of $f(X) = \log \det(X)$ using the expansion method.

Hint 1: Treat $\mathrm{D} f(\cdot)[\Delta X]$ as a scalar field and expand $\mathrm{D} f(X + \delta X)[\Delta X]$ .

Hint 2: For small $A$ , $(I + A)^{-1} \approx I - A$ .

Hint 3: The representation of $\mathrm{D}^2 f$ is a tensor, not necessarily a matrix.
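
The approximation in Hint 2 can be explored numerically before using it. The following sketch assumes NumPy; the random direction `S` and the scales in `scales` are illustrative. It measures the spectral-norm error of $(I + A)^{-1} \approx I - A$ as $A$ shrinks, and it does not carry out the exercise itself.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 4
S = rng.standard_normal((n, n))
I = np.eye(n)

scales = (1e-1, 1e-2, 1e-3)
errors = []
for t in scales:
    A = t * S
    # Spectral norm of the difference between the exact inverse and I - A.
    err = np.linalg.norm(np.linalg.inv(I + A) - (I - A), 2)
    errors.append(err)
    print(f"scale {t:g}: error {err:.3e}")
```

Because $(I + A)^{-1} - (I - A) = A^2 (I + A)^{-1}$ , the error should fall by roughly a factor of 100 each time the scale drops by a factor of 10.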

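Download the Problem File

📥 Download the problem file 001_matrix_calculus.tmu

Submission requirements:

Write your answer at the end of the file 001_matrix_calculus.tmu
Rename the file to: 001_YourName_School.tmu
Send it to: yansong@liii.pro
Deadline: this Sunday, 23:59

Prize: a custom Liii STEM T-shirt

How to participate:

Join QQ group 934456971 to get the prerequisite materials
Download the problem file and read it carefully
Complete the problem and submit it as required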
