The Nuts and Bolts of Diagonalisation

Previously, we defined eigenvectors and eigenvalues, and used them to obtain a succinct formula for the n-th Fibonacci number x_n, known as Binet’s formula:

\displaystyle x_n = \frac{\phi^n - (-\phi)^{-n}}{\sqrt 5},\quad n \in \mathbb N,

where \phi = \frac 12 (1+\sqrt 5) denotes the golden ratio. We needed to take advantage of the eigenvectors and eigenvalues of the matrix \mathbf A := \begin{bmatrix} 0 & 1 \\ 1 & 1 \end{bmatrix}. In particular, we needed to find two eigenvectors \mathbf v_1, \mathbf v_2 that can encode any \mathbf v \in \mathbb R^2. That is, we wanted a basis for \mathbb R^2 consisting of the eigenvectors of \mathbf A.
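As a quick sanity check, here is a minimal Python sketch (assuming the conventions x_0 = 0, x_1 = 1) comparing Binet’s formula against the Fibonacci recurrence:

import math

PHI = (1 + math.sqrt(5)) / 2  # the golden ratio

def binet(n: int) -> int:
    # Binet's formula; rounding absorbs floating-point error for moderate n
    return round((PHI ** n - (-PHI) ** (-n)) / math.sqrt(5))

# Compare against the recurrence x_0 = 0, x_1 = 1, x_{n+1} = x_n + x_{n-1}
a, b = 0, 1
for n in range(30):
    assert binet(n) == a
    a, b = b, a + b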

Let \mathbb K be a field and V be a vector space over \mathbb K.

Definition 1. Let U be a vector space over \mathbb K. The linear transformation S : U \to U is similar to T : V \to V if there exists an isomorphism R : V \to U such that S = R \circ T \circ R^{-1}. Equivalently, S \circ R = R \circ T. We remark that the “is similar to” relation is an equivalence relation.

Example 1. If V is finite dimensional with \dim(V) = n, then any linear transformation T : V \to V is similar to some linear transformation S : \mathbb K^n \to \mathbb K^n. This means that computations involving T can effectively be reduced to matrix computations in \mathbb K^n.
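To see why, fix a basis (\mathbf b_1, \dots, \mathbf b_n) of V and let R : V \to \mathbb K^n be the coordinate isomorphism determined by R(\mathbf b_j) = \mathbf e_j. Then

\displaystyle S := R \circ T \circ R^{-1} : \mathbb K^n \to \mathbb K^n

is a linear transformation on \mathbb K^n (hence a matrix) similar to T by construction.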

Lemma 1. Let S : U \to U and T : V \to V be similar linear transformations. Then there exists an isomorphism R : V \to U such that for any n \in \mathbb N,

S^n = R \circ T^n \circ R^{-1},

where T^n denotes n-fold composition of T.
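The proof is a short induction on n; the case n = 2 already exhibits the telescoping pattern, since R^{-1} \circ R is the identity on V:

\displaystyle S^2 = (R \circ T \circ R^{-1}) \circ (R \circ T \circ R^{-1}) = R \circ T \circ (R^{-1} \circ R) \circ T \circ R^{-1} = R \circ T^2 \circ R^{-1}.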

Roughly speaking, Definition 1 gives us a condition under which S and T have essentially the same (i.e. similar) effect on the vectors in their respective vector spaces. The similarity vocabulary empowers us to discuss diagonalisation.

Definition 2. A matrix \mathbf D : \mathbb K^n \to \mathbb K^n is diagonal if each \mathbf e_i is an eigenvector of \mathbf D, i.e. for each \mathbf e_i, there exists \lambda_i \in \mathbb K such that \mathbf D(\mathbf e_i) = \lambda_i \mathbf e_i.

In matrix notation, we have

\begin{aligned} \mathbf D &= \begin{bmatrix} \mathbf D(\mathbf e_1) & \mathbf D(\mathbf e_2) & \cdots & \mathbf D(\mathbf e_n) \end{bmatrix} \\ &= \begin{bmatrix} \lambda_1 \mathbf e_1 & \lambda_2 \mathbf e_2 & \cdots & \lambda_n \mathbf e_n \end{bmatrix} = \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots &  \vdots & \ddots &  \vdots \\ 0 &  0 & \cdots &  \lambda_n \end{bmatrix} .\end{aligned}

In particular, for any \mathbf e_i and any k \in \mathbb N, \mathbf D^k \mathbf e_i = \lambda_i^k \mathbf e_i. A matrix \mathbf A : \mathbb K^n \to \mathbb K^n is diagonalisable if it is similar to some diagonal matrix.
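Here is a minimal numpy sketch of this fact; the diagonal entries 2, 3, 5 are illustrative assumptions:

import numpy as np

# Diagonal matrix with entries lambda_1 = 2, lambda_2 = 3, lambda_3 = 5
D = np.diag([2.0, 3.0, 5.0])
k = 4

# Powering D simply raises each diagonal entry to the k-th power
assert np.allclose(np.linalg.matrix_power(D, k),
                   np.diag([2.0 ** k, 3.0 ** k, 5.0 ** k]))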

Theorem 1. A matrix \mathbf A : \mathbb K^n \to \mathbb K^n is diagonalisable if and only if there exists a basis \{\mathbf v_1,\dots,\mathbf v_n\} of \mathbb K^n such that each \mathbf v_i is an eigenvector of \mathbf A.

Proof. We prove this theorem in two directions. Suppose \mathbf A : \mathbb K^n \to \mathbb K^n is diagonalisable. Then there exists a diagonal matrix \mathbf D : \mathbb K^n \to \mathbb K^n and an invertible matrix \mathbf P : \mathbb K^n \to \mathbb K^n such that

\mathbf A = \mathbf P \mathbf D \mathbf P^{-1} \quad \iff \quad \mathbf A \mathbf P = \mathbf P \mathbf D.

Writing \mathbf P = \begin{bmatrix} \mathbf v_1 &\cdots & \mathbf v_n\end{bmatrix} and employing the recipe-ingredient analogy for matrix multiplication,

\begin{aligned}\mathbf A\begin{bmatrix} \mathbf v_1 &\cdots & \mathbf v_n\end{bmatrix} &=\mathbf P \begin{bmatrix}\lambda_1 \mathbf e_1 & \cdots & \lambda_n \mathbf e_n\end{bmatrix} \\ \begin{bmatrix}  \mathbf A \mathbf v_1 &\cdots & \mathbf A \mathbf v_n\end{bmatrix} &= \begin{bmatrix}\lambda_1 \mathbf P\mathbf e_1 & \cdots & \lambda_n \mathbf P \mathbf e_n\end{bmatrix} \\ &= \begin{bmatrix}\lambda_1 \mathbf v_1 & \cdots & \lambda_n \mathbf v_n\end{bmatrix}.\end{aligned}

Since \mathbf A\mathbf v_i = \lambda_i\mathbf v_i for each i, \{\mathbf v_1,\dots,\mathbf v_n\} forms the desired basis, proving the direction (\Rightarrow). Conversely, given a basis of eigenvectors \{\mathbf v_1,\dots,\mathbf v_n\} with \mathbf A \mathbf v_i = \lambda_i \mathbf v_i, the matrix \mathbf P := \begin{bmatrix} \mathbf v_1 &\cdots & \mathbf v_n\end{bmatrix} is invertible, and running the same computation backwards gives \mathbf A \mathbf P = \mathbf P \mathbf D with \mathbf D := \operatorname{diag}(\lambda_1,\dots,\lambda_n), establishing the direction (\Leftarrow).
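Numerically, numpy’s eig returns exactly the data in this proof: a vector of eigenvalues and a matrix \mathbf P whose columns are eigenvectors. A minimal sketch, with an illustrative symmetric (hence diagonalisable) matrix:

import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])   # symmetric, hence diagonalisable
lam, P = np.linalg.eig(A)    # columns of P are eigenvectors of A
D = np.diag(lam)

# The key identity from the proof: A P = P D, i.e. A v_i = lambda_i v_i columnwise
assert np.allclose(A @ P, P @ D)
assert np.allclose(A, P @ D @ np.linalg.inv(P))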

One theoretical subtlety is that the notion of a diagonal matrix only makes sense for vector spaces with a convenient “standard basis”, such as \mathbb K^n with the standard ordered basis (\mathbf e_1,\dots,\mathbf e_n). Nevertheless, the discussion of eigenvectors and eigenvalues still works in broader contexts, especially when the vector space is finite-dimensional.

Corollary 1. Suppose \dim(V) = n, and say that a linear transformation T : V \to V is diagonalisable if it is similar to a diagonalisable matrix \mathbf A : \mathbb K^n \to \mathbb K^n. Then T is diagonalisable if and only if there exists a basis \{\mathbf v_1,\dots,\mathbf v_n\} of V such that each \mathbf v_i is an eigenvector of T.

Definition 3. A linear transformation T : V \to V is diagonalisable if there exists a basis \{\mathbf v_1,\dots,\mathbf v_n\} of V such that each \mathbf v_i is an eigenvector of T.

Therefore, we can reduce our study of general finite-dimensional vector spaces to the concrete setting of \mathbb K^n. In this setting, what simple tests do we have for deciding whether a matrix \mathbf A is diagonalisable?

Lemma 2. Any matrix \mathbf A : \mathbb K^n \to \mathbb K^n has at most n distinct eigenvalues. In particular, if \mathbf A has n distinct eigenvalues, then \mathbf A is diagonalisable.

Proof. Let \lambda_1,\dots,\lambda_k denote the distinct eigenvalues of \mathbf A. Each eigenvalue \lambda_i yields a nonzero eigenvector \mathbf v_i, and since eigenvectors corresponding to distinct eigenvalues are linearly independent, the set \{\mathbf v_1,\dots,\mathbf v_k\} \subseteq \mathbb K^n is linearly independent, so that k \leq n. In particular, if k = n, the set forms a basis for \mathbb K^n, and Theorem 1 applies.

Example 2. Since \mathbf A := \begin{bmatrix} 0 & 1 \\ 1 & 1 \end{bmatrix} : \mathbb R^2 \to \mathbb R^2 has two distinct eigenvalues \lambda_1 = \phi and \lambda_2 = -\phi^{-1}, it is diagonalisable by Lemma 2, and by Theorem 1, we could extract two key eigenvectors \mathbf v_1,\mathbf v_2 such that \mathbf A^n \mathbf v_i = \lambda_i^n \mathbf v_i. This property was key in helping us derive the formula for the Fibonacci numbers.
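A minimal numpy sketch of that computation (again assuming x_0 = 0, x_1 = 1):

import numpy as np

A = np.array([[0.0, 1.0],
              [1.0, 1.0]])
lam, P = np.linalg.eig(A)    # lam holds phi and -1/phi, in some order
P_inv = np.linalg.inv(P)

def fib(n: int) -> int:
    # A^n = P D^n P^{-1}, and D^n just powers the eigenvalues
    A_n = P @ np.diag(lam ** n) @ P_inv
    # A^n maps (x_0, x_1) = (0, 1) to (x_n, x_{n+1})
    return round((A_n @ np.array([0.0, 1.0]))[0])

print([fib(n) for n in range(10)])  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]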

Now, what if we had fewer than n distinct eigenvalues? Is all hope lost? Not quite. Theorem 1 only requires us to have n linearly independent eigenvectors, and some of them may share an eigenvalue; that is alright. Let \lambda_1,\dots,\lambda_k denote the k \leq n distinct eigenvalues of \mathbf A. Each eigenvalue \lambda_i yields at least one nonzero eigenvector \mathbf v_i such that

\begin{aligned} \mathbf A\mathbf v_i = \lambda_i \mathbf v_i \quad &\iff \quad (\lambda_i \mathbf I_n - \mathbf A)\mathbf v_i = \mathbf 0 \\ &\iff \quad \mathbf v_i \in \ker(\lambda_i \mathbf I_n - \mathbf A) =: E_{\lambda_i}. \end{aligned}

Since this kernel contains the nonzero vector \mathbf v_i, it follows that \det(\lambda_i \mathbf I_n - \mathbf A) = 0.
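For instance, for the Fibonacci matrix \mathbf A from Example 2 and \lambda_1 = \phi, a direct computation (using \phi^2 = \phi + 1) gives

\displaystyle E_\phi = \ker \begin{bmatrix} \phi & -1 \\ -1 & \phi - 1 \end{bmatrix} = \operatorname{span} \left\{ \begin{bmatrix} 1 \\ \phi \end{bmatrix} \right\},

and indeed \mathbf A \begin{bmatrix} 1 \\ \phi \end{bmatrix} = \begin{bmatrix} \phi \\ 1 + \phi \end{bmatrix} = \phi \begin{bmatrix} 1 \\ \phi \end{bmatrix}.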

Definition 4. We call the function c_{\mathbf A}(x) := \det(x\mathbf I_n - \mathbf A) the characteristic polynomial of \mathbf A. In particular, \lambda is an eigenvalue of \mathbf A if and only if c_{\mathbf A}(\lambda) = 0.
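In numpy, np.poly computes the coefficients of the characteristic polynomial of a square matrix, and np.roots recovers the eigenvalues. A minimal sketch with the Fibonacci matrix:

import numpy as np

A = np.array([[0.0, 1.0],
              [1.0, 1.0]])

coeffs = np.poly(A)          # coefficients of det(x I - A), highest degree first
print(coeffs)                # [ 1. -1. -1.], i.e. c_A(x) = x^2 - x - 1
print(np.roots(coeffs))      # the roots phi and -1/phi
print(np.linalg.eigvals(A))  # the eigenvalues, matching the roots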

Lemma 3. If \mathbf A,\mathbf B : \mathbb K^n \to \mathbb K^n are similar, then c_{\mathbf A}(x) = c_{\mathbf B}(x).

Proof. Let \mathbf P be a matrix such that \mathbf A = \mathbf P \mathbf B \mathbf P^{-1}. Then

\begin{aligned} c_{\mathbf A}(x) &= |x\mathbf I_n - \mathbf A| \\ &= |x \mathbf P \mathbf P^{-1} - \mathbf P \mathbf B \mathbf P^{-1}| \\ &= |\mathbf P (x \mathbf I_n - \mathbf B) \mathbf P^{-1}| \\ &= |\mathbf P| |x \mathbf I_n - \mathbf B| |\mathbf P^{-1}| \\ &= |x \mathbf I_n - \mathbf B| = c_{\mathbf B}(x). \end{aligned}
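A quick numerical illustration of Lemma 3; the change-of-basis matrix \mathbf P below is an arbitrary assumed choice:

import numpy as np

A = np.array([[0.0, 1.0],
              [1.0, 1.0]])
P = np.array([[1.0, 2.0],
              [0.0, 1.0]])   # invertible, since det(P) = 1
B = P @ A @ np.linalg.inv(P) # B is similar to A

# Similar matrices share the same characteristic polynomial
assert np.allclose(np.poly(A), np.poly(B))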

By the factor theorem, there exist unique positive integers \alpha_i and a polynomial p \in \mathbb K[x] with p(\lambda_i) \neq 0 for each i, such that

c_{\mathbf A}(x) = (x-\lambda_1)^{\alpha_1} \cdots (x-\lambda_k)^{\alpha_k} p(x).

We call \alpha_i the algebraic multiplicity of \lambda_i. If \mathbb K = \mathbb C, then by the fundamental theorem of algebra, we can take p(x)=1, so that

\alpha_1 + \cdots + \alpha_k = n.
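For a quick concrete instance of these multiplicities, consider the shear matrix

\displaystyle \mathbf M := \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}, \qquad c_{\mathbf M}(x) = \det \begin{bmatrix} x - 1 & -1 \\ 0 & x - 1 \end{bmatrix} = (x-1)^2,

so \mathbf M has the single eigenvalue \lambda = 1 with algebraic multiplicity \alpha_1 = 2.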

In any case, the point of the equivalence above is that E_{\lambda_i} captures all of the eigenvectors of \mathbf A with eigenvalue \lambda_i, together with the zero vector. Denote \gamma_i := \dim(E_{\lambda_i}), called the geometric multiplicity of \lambda_i. We have a fascinating result.

Theorem 2. For any eigenvalue \lambda_i of \mathbf A for 1 \leq i \leq k, \gamma_i \leq \alpha_i. Hence, if \mathbb K = \mathbb C and \gamma_i = \alpha_i for all 1 \leq i \leq k, then \mathbf A is diagonalisable.

Proof. Let \{\mathbf v_1,\dots,\mathbf v_{\gamma_i}\} be a basis of E_{\lambda_i}. Extend it to a basis \{\mathbf v_1,\dots,\mathbf v_{\gamma_i}, \mathbf u_1,\dots, \mathbf u_{n -\gamma_i}\} of \mathbb K^n, with corresponding invertible matrix \mathbf P. Then \mathbf A is similar to the matrix

\mathbf J := \begin{bmatrix} \lambda_i \mathbf I_{\gamma_i} & \mathbf B \\ \mathbf 0 & \mathbf C \end{bmatrix},

not necessarily diagonal. It follows that

\begin{aligned} c_{\mathbf A}(x) = c_{\mathbf J}(x) &= | x \mathbf I_n - \mathbf J| \\ &=\left| \begin{matrix} (x - \lambda_i) \mathbf I_{\gamma_i} & \mathbf B \\ \mathbf 0 & x \mathbf I_{n-\gamma_i} - \mathbf C \end{matrix} \right| \\ &= |(x - \lambda_i) \mathbf I_{\gamma_i}| |x \mathbf I_{n-\gamma_i} - \mathbf C| \\ &= (x -\lambda_i)^{\gamma_i} |x \mathbf I_{n-\gamma_i} - \mathbf C|. \end{aligned}

Hence, (x-\lambda_i)^{\gamma_i} divides c_{\mathbf A}(x), and since \alpha_i is the exact multiplicity of the factor (x-\lambda_i) in c_{\mathbf A}(x), we conclude \gamma_i \leq \alpha_i. For the second claim, suppose \mathbb K = \mathbb C and \gamma_i = \alpha_i for all i. Concatenating bases of the eigenspaces yields \gamma_1 + \cdots + \gamma_k = \alpha_1 + \cdots + \alpha_k = n eigenvectors, and since eigenvectors drawn from distinct eigenspaces are linearly independent, they form a basis of \mathbb C^n; Theorem 1 then shows \mathbf A is diagonalisable. (In fact, over \mathbb C, \mathbf A is diagonalisable if and only if \gamma_i = \alpha_i for each eigenvalue \lambda_i.)
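Returning to the shear matrix \mathbf M above, a minimal numpy sketch confirms \gamma_1 < \alpha_1, so \mathbf M is not diagonalisable:

import numpy as np

M = np.array([[1.0, 1.0],
              [0.0, 1.0]])   # c_M(x) = (x - 1)^2, so alpha_1 = 2 for lambda = 1

# Geometric multiplicity: gamma_1 = dim ker(1*I - M) = 2 - rank(I - M)
gamma = 2 - np.linalg.matrix_rank(np.eye(2) - M)
print(gamma)                 # 1, strictly less than alpha_1 = 2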

Far from just determining eigenvalues, the characteristic polynomial boasts remarkable algebraic properties, encapsulated by the Cayley-Hamilton theorem. In particular, rather surprisingly, we can use characteristic polynomials to compute inverse matrices. We will explore these ideas in the next post.

—Joel Kindiak, 7 Mar 25, 1707H
