Generalised Diagonalisation

Let \mathbb K be a field and V be a finite-dimensional vector space over \mathbb K. Recall that T : V \to V is diagonalisable if and only if [T]_K : \mathbb K^n \to \mathbb K^n is a diagonal matrix for some basis K of V. This condition is equivalent to the decomposition

V = E_{\lambda_1} \oplus \cdots \oplus E_{\lambda_k},

where E_{\lambda_i} := \ker(T - \lambda_i I_V) is the eigenspace of \lambda_i. We abbreviate T - \lambda \equiv T - \lambda I_V. If the characteristic polynomial c_T can be factored into (possibly repeated) linear factors, we can find positive integers \alpha_i such that

c_T(x) = (x-\lambda_1)^{\alpha_1}\cdot \cdots \cdot (x-\lambda_k)^{\alpha_k}.

Then the minimal polynomial m_T of T can be factored into

m_T(x) = (x-\lambda_1)^{\beta_1}\cdot \cdots \cdot (x-\lambda_k)^{\beta_k},\quad 1 \leq \beta_i \leq \alpha_i.
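As a quick sanity check, here is a minimal SymPy sketch (the matrix \mathbf A below is an arbitrary illustrative choice, not one from the text) that factors a characteristic polynomial and verifies a candidate minimal polynomial:

```python
from sympy import Matrix, symbols, factor, eye, zeros

x = symbols("x")

# An illustrative 3x3 matrix whose characteristic polynomial
# factors into linear factors: c_A(x) = (x - 2)^2 (x - 3).
A = Matrix([
    [2, 1, 0],
    [0, 2, 0],
    [0, 0, 3],
])

print(factor(A.charpoly(x).as_expr()))  # (x - 3)*(x - 2)**2

# Candidate minimal polynomial m_A(x) = (x - 2)^2 (x - 3): it must
# annihilate A, and no factor combination with smaller exponents may.
print((A - 2*eye(3))**2 * (A - 3*eye(3)) == zeros(3, 3))  # True
print((A - 2*eye(3)) * (A - 3*eye(3)) == zeros(3, 3))     # False, so beta_1 = 2
```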

Now, if \mathbb K = \mathbb C, then the fundamental theorem of algebra guarantees that this condition is satisfied. In this case, can we find a basis K of V such that [T]_K is sufficiently simple, even if [T]_K is not diagonal?

One hint in this direction is that in this case, we can make the decomposition

V = K_{\lambda_1} \oplus \cdots \oplus K_{\lambda_k},

where K_{\lambda_i} := \ker( ( T - \lambda_i )^{\beta_i} ) is a generalised eigenspace (the case \beta_i = 1 yields an actual eigenspace, hence the name). Can we use this decomposition to derive a generalised diagonalisation?
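Before pursuing that question, we can at least check the decomposition numerically. Here is a hedged SymPy sketch (reusing the illustrative matrix from above) that computes the generalised eigenspaces and confirms their bases together span \mathbb K^n:

```python
from sympy import Matrix, eye

# Illustrative matrix with eigenvalues 2 (beta = 2) and 3 (beta = 1).
A = Matrix([[2, 1, 0], [0, 2, 0], [0, 0, 3]])

# Generalised eigenspaces K_lambda = ker((A - lambda I)^beta).
K2 = ((A - 2*eye(3))**2).nullspace()  # basis of K_2
K3 = ((A - 3*eye(3))**1).nullspace()  # basis of K_3

# dim K_2 + dim K_3 = 3 = n, and the combined basis is invertible,
# consistent with K^3 = K_2 (+) K_3 being a direct sum.
print(len(K2) + len(K3))                     # 3
print(Matrix.hstack(*(K2 + K3)).det() != 0)  # True
```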

Lemma 1. Define the matrix \mathbf J : \mathbb K^n \to \mathbb K^n by \mathbf J(\mathbf e_1) = \lambda \mathbf e_1 and

\mathbf J( \mathbf e_{i+1} ) = \mathbf e_i + \lambda \mathbf e_{i+1},\quad  1 \leq i \leq n-1.

Then (\mathbf J - \lambda)^n = \mathbf 0. Equivalently, \ker((\mathbf J - \lambda)^n) = \mathbb K^n. Explicitly, we call

\mathbf J \equiv \mathbf J_n(\lambda) := \begin{bmatrix} \lambda & 1 & 0 & \cdots & 0 \\ 0 & \ddots & \ddots & \ddots & \vdots \\ \vdots & \ddots & \ddots & \ddots & 0 \\ \vdots &  & \ddots & \ddots & 1\\ 0 & \cdots & \cdots & 0 & \lambda \end{bmatrix} : \mathbb K^n \to \mathbb K^n

a Jordan block of \lambda with size n.

Proof. The first condition yields (\mathbf J - \lambda)(\mathbf e_1) = \mathbf 0. More generally, if (\mathbf J - \lambda)^i (\mathbf e_i)= \mathbf 0, then

\begin{aligned} (\mathbf J - \lambda)^{i+1}(\mathbf e_{i+1}) &= (\mathbf J - \lambda)^i ( ( \mathbf J - \lambda )( \mathbf e_{i+1}) ) \\ &= (\mathbf J - \lambda)^i (\mathbf e_i) \\ &= \mathbf 0. \end{aligned}

By induction, (\mathbf J - \lambda)^i(\mathbf e_i) = \mathbf 0 for all 1 \leq i \leq n, so (\mathbf J - \lambda)^n(\mathbf e_i) = (\mathbf J - \lambda)^{n-i}\left( (\mathbf J - \lambda)^i(\mathbf e_i) \right) = \mathbf 0 for every basis vector \mathbf e_i. Therefore, (\mathbf J - \lambda)^n = \mathbf 0.

With some work, we can prove that c_{\mathbf J}(x) = m_{\mathbf J}(x) = (x-\lambda)^n.
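For the sceptical, the lemma is easy to verify computationally. A small SymPy sketch, with illustrative values n = 4 and \lambda = 5:

```python
from sympy import Matrix, symbols, factor, eye, zeros

n, lam = 4, 5
x = symbols("x")

# The Jordan block J_n(lam): lam on the diagonal, 1 on the superdiagonal.
J = Matrix(n, n, lambda i, j: lam if i == j else (1 if j == i + 1 else 0))

# Lemma 1: (J - lam I)^n = 0, while (J - lam I)^(n-1) != 0,
# so m_J(x) = (x - lam)^n; and c_J(x) = (x - lam)^n as well.
print((J - lam*eye(n))**n == zeros(n, n))        # True
print((J - lam*eye(n))**(n - 1) != zeros(n, n))  # True
print(factor(J.charpoly(x).as_expr()))           # (x - 5)**4
```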

Lemma 2. Suppose there exist constants \lambda_i \in \mathbb K and positive integers \beta_i such that

\mathbf J = \begin{bmatrix} \mathbf J_{\beta_1}(\lambda_1) & 0 & \cdots & 0 \\ 0 & \ddots & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & \mathbf J_{\beta_k}(\lambda_k)\end{bmatrix} : \mathbb K^n \to \mathbb K^n,

where n := \beta_1 + \cdots + \beta_k. Then for each j, there exists i such that (\mathbf J - \lambda_i)^n (\mathbf e_j) = \mathbf 0, namely the i whose block contains the j-th basis vector. We call \mathbf J a matrix in Jordan normal form.

Proof. If 1 \leq j \leq \beta_1, then

\begin{aligned} (\mathbf J - \lambda_1)^n (\mathbf e_j) &= (\mathbf J_{\beta_1}(\lambda_1) - \lambda_1)^n (\mathbf e_j) \\ &= (\mathbf J_{\beta_1}(\lambda_1) - \lambda_1)^{n-\beta_1} \left( (\mathbf J_{\beta_1}(\lambda_1) - \lambda_1)^{\beta_1} (\mathbf e_j) \right) = \mathbf 0, \end{aligned}

where the last equality follows from Lemma 1.

More generally, if \beta_1 + \cdots + \beta_i + 1 \leq j \leq \beta_1 + \cdots + \beta_{i+1}, then

\begin{aligned} (\mathbf J - \lambda_{i+1})^n (\mathbf e_j) &= (\mathbf J_{\beta_{i+1}}(\lambda_{i+1}) - \lambda_{i+1})^n (\mathbf e_j) \\ &= (\mathbf J_{\beta_{i+1}}(\lambda_{i+1}) - \lambda_{i+1})^{n-\beta_{i+1}} \left( (\mathbf J_{\beta_{i+1}}(\lambda_{i+1}) - \lambda_{i+1})^{\beta_{i+1}} (\mathbf e_j) \right) = \mathbf 0. \end{aligned}
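Again, a short SymPy sketch can confirm Lemma 2 on an illustrative example, here with blocks \mathbf J_2(1) and \mathbf J_3(4), so n = 5:

```python
from sympy import Matrix, eye, zeros

def jordan_block(size, lam):
    # The Jordan block J_size(lam) from Lemma 1.
    return Matrix(size, size, lambda i, j: lam if i == j else (1 if j == i + 1 else 0))

# A matrix in Jordan normal form with blocks J_2(1) and J_3(4).
J = Matrix.diag(jordan_block(2, 1), jordan_block(3, 4))
n = 5

# Lemma 2: (J - lam_i I)^n kills every e_j belonging to the block of lam_i.
print(all((J - 1*eye(n))**n * eye(n).col(j) == zeros(n, 1) for j in range(2)))     # True
print(all((J - 4*eye(n))**n * eye(n).col(j) == zeros(n, 1) for j in range(2, 5)))  # True
```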

Earlier, we saw that if the characteristic polynomial c_{\mathbf A} of a matrix \mathbf A : \mathbb K^n \to \mathbb K^n can be factored into a product of linear factors, then there exist constants \lambda_i \in \mathbb K and positive integers \beta_i such that

\mathbb K^n = K_{\lambda_1} \oplus \cdots \oplus K_{\lambda_k}.

Since each \beta_i \leq n, this yields (\mathbf A - \lambda_i)^n(\mathbf v) = \mathbf 0 for every \mathbf v \in K_{\lambda_i}. Can we go the other direction, just like diagonalisation? Yes!

Theorem 1 (Jordan Normal Form). If c_T can be factored into a product of linear factors, then there exists a basis B of V such that [T]_B is in Jordan normal form.

Proof Sketch. Suppose that for each \lambda_i, we can find a basis B_i of K_{\lambda_i} such that \mathbf J_i := [T|_{K_{\lambda_i}}]_{B_i} is in Jordan normal form. Then the decomposition

V = K_{\lambda_1} \oplus \cdots \oplus K_{\lambda_k}

yields a basis B := B_1 \cup \cdots \cup B_k such that

[T]_B = \begin{bmatrix} \mathbf J_1 & 0 & \cdots & 0 \\ 0 & \ddots & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & \mathbf J_k \end{bmatrix}.

It remains to find, for each i, a basis B_i of K_{\lambda_i} such that [T|_{K_{\lambda_i}}]_{B_i} is in Jordan normal form. Denote S_i := T|_{K_{\lambda_i}} - \lambda_i, and for any \mathbf v \in K_{\lambda_i}, let \langle \mathbf v \rangle := \mathrm{span}\{S_i^j(\mathbf v) : j \in \mathbb N\} be the S_i-cyclic subspace generated by \mathbf v.

Fix \mathbf v_i^{(1)} \in K_{\lambda_i} and find a basis B_i^{(1)} for \langle \mathbf v_i^{(1)} \rangle. If \mathrm{span}(B_i^{(1)}) = K_{\lambda_i}, then we are done. Inductively, if \mathrm{span}(B_i^{(1)} \cup \dots \cup B_i^{(j)}) \neq K_{\lambda_i}, find

\mathbf v_i^{(j+1)} \in K_{\lambda_i} \setminus \mathrm{span}(B_i^{(1)} \cup \dots \cup B_i^{(j)})

and obtain a basis B_i^{(j+1)} for \langle \mathbf v_i^{(j+1)} \rangle. Since K_{\lambda_i} is finite-dimensional, this process must end, yielding the finite basis

B_i = B_i^{(1)} \cup \dots \cup B_i^{(j_i)}

of K_{\lambda_i}, so that

[T|_{K_{\lambda_i}}]_{B_i} = \begin{bmatrix} \mathbf J_{|B_i^{(1)}|}(\lambda_i) & 0 & \cdots & 0 \\ 0 & \ddots & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & \mathbf J_{|B_i^{(j_i)}|}(\lambda_i) \end{bmatrix} =: \mathbf J_i.
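To make the cyclic-subspace step concrete, here is a hedged sketch of computing one Jordan chain: given S := T - \lambda with S^m(\mathbf v) = \mathbf 0 but S^{m-1}(\mathbf v) \neq \mathbf 0, the ordered chain S^{m-1}(\mathbf v), \dots, S(\mathbf v), \mathbf v is a basis of \langle \mathbf v \rangle on which T acts as a Jordan block. (The matrix below is an illustrative choice with a single eigenvalue.)

```python
from sympy import Matrix, eye

# Illustrative matrix with a single eigenvalue 2 and one Jordan block of size 3.
A = Matrix([[2, 1, 1], [0, 2, 1], [0, 0, 2]])
lam, n = 2, 3
S = A - lam*eye(n)  # the map S = T - lam in the standard basis

# Pick v in ker(S^3) \ ker(S^2); here v = e_3 works, since S^2 v != 0.
v = Matrix([0, 0, 1])

# Jordan chain, ordered so that [T]_B is an upper Jordan block.
B = Matrix.hstack(S**2 * v, S * v, v)

# Change of basis: B^-1 A B is the Jordan block J_3(2).
print(B.inv() * A * B)
```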

I think this proof sketch more-or-less provides the key steps and intuition required to prove Theorem 1. Check out a more complete proof in this link, which effectively formalises our ideas using mathematical induction.
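In practice, computer algebra systems carry out this whole construction for us. For instance, SymPy's jordan_form returns both the change-of-basis matrix and the Jordan normal form (the matrix below is a standard illustrative example, not one from the text):

```python
from sympy import Matrix

# An illustrative matrix that is not diagonalisable: its characteristic
# polynomial is (x - 1)(x - 2)(x - 4)^2, with a size-2 block for lambda = 4.
A = Matrix([
    [ 5,  4,  2,  1],
    [ 0,  1, -1, -1],
    [-1, -1,  3,  0],
    [ 1,  1, -1,  2],
])

# P's columns form the generalised-eigenvector basis B; J = P^-1 A P.
P, J = A.jordan_form()
print(J)
print(P.inv() * A * P == J)  # True
```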

Essentially, for any linear transformation \mathbf A : \mathbb C^n \to \mathbb C^n, the space \mathbb C^n splits into subspaces on which \mathbf A acts through Jordan blocks, which are precise formulations of generalised diagonalisation.

—Joel Kindiak, 11 Mar 25, 1914H
