Problem 1. Let be a real square matrix. Suppose
and
. Compute
.
Solution. Since , we have
Taking determinants on both sides,
Therefore, . Since
, we have
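To illustrate the technique of taking determinants on both sides with purely hypothetical data (the hypotheses below are illustrative and not those of the problem): if an \(n \times n\) real matrix \(A\) satisfies \(A^2 = 2A\) and \(A\) is invertible, then
\[
\det(A)^2 = \det(A^2) = \det(2A) = 2^n \det(A),
\]
and since \(\det(A) \neq 0\), cancelling gives \(\det(A) = 2^n\).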
—Joel Kindiak, 9 Feb 25, 1322H
The “algebra” in linear algebra essentially refers to the kinds of transformations one can perform on vectors. Since the algebra is linear, we shall define linear transformations, which are rather familiar to us from various fields of study.
Before giving a formal definition of a linear transformation, let’s motivate the discussion through some elementary school geometry. Given a rectangle with base and height
, what is the area
of the rectangle? It would be
. This means
These properties have been used again and again, especially in calculus, since
and
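A minimal sketch of the identities presumably intended here, writing \(A(b,h) = bh\) for the area of a rectangle with base \(b\) and height \(h\): the area is linear in the base, and differentiation and integration are linear in the same sense,
\[
A(b_1 + b_2, h) = A(b_1, h) + A(b_2, h), \qquad A(cb, h) = c\,A(b, h),
\]
\[
\frac{d}{dx}\bigl(f(x) + g(x)\bigr) = \frac{d}{dx}f(x) + \frac{d}{dx}g(x), \qquad \int \bigl(f(x) + g(x)\bigr)\,dx = \int f(x)\,dx + \int g(x)\,dx.
\]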
We can now define this notion of a linear transformation in broad generality. The idea is that any result that is true of linear transformations would hold for each of these examples, be it contrived ones like , or less contrived ones like the calculus operations.
For the rest of this post, let be any set, and let
be vector spaces over a field
.
Definition 1. A function is a linear transformation if for any
and
,
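A sketch of the two defining conditions, assuming the standard formulation and using \(T \colon V \to W\), \(u, v \in V\) and \(c \in F\) as placeholder names:
\[
T(u + v) = T(u) + T(v), \qquad T(cu) = c\,T(u).
\]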
These properties mimic the closure properties of a vector space. In fact, the key idea is that we want to preserve these closure properties. If
forms a basis for
, then it turns out that
and
are not too different, allowing us to formally define dimensions. Finally, when we are dealing with the special cases
and
, we want to recover the usual notion of a matrix.
We first state a seemingly obvious yet incredibly useful result for linear transformations.
Lemma 1. A linear transformation is injective if and only if for any
,
.
Proof. For , if
is injective, then
implies
, as required. On the other hand, for
, for vectors
, suppose
. Then
Hence, , so that
is injective, as required.
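The key computation in the backward direction, sketched with placeholder names \(T \colon V \to W\) and \(u, v \in V\): if \(T(u) = T(v)\), then linearity gives
\[
T(u - v) = T(u) - T(v) = 0_W,
\]
so the hypothesis forces \(u - v = 0_V\), that is, \(u = v\).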
Let’s start with a question that sounds obvious but requires more thought: Is the constant a function? Well, for any
, we can define the corresponding function
by
, where
for any
. Therefore, we can define a linear injection between the two vector spaces.
Lemma 2. Define the function by the map
, where
for any
. Then
is a linear transformation and is injective.
Proof. For linearity, we observe that
Therefore, . Similarly,
, establishing linearity. For injectivity, we notice that
implies that for any
,
, so that
, as required.
Recall that given any function ,
. In particular, if
is injective, then we can regard
as a subset via the imbedding
. Contextualising to Lemma 2, we can regard
as a subspace. Therefore, constants are as good as functions, via the imbedding
.
Using techniques in real analysis, we can prove that the collection such that
is a subspace, with the property that
For elements , we write
. We can use this definition to extend the calculus-based subsets of functions.
For any function and
, define
to mean
Lemma 3. If and
, then
.
Proof. Since is a subspace,
Henceforth, we will denote as the unique limit such that
.
Theorem 1. Define the subset of functions
Then
as subspaces and the map defined by
is a well-defined linear transformation.
Proof. We first remark that . Hence
implies that
, which yields
so that . Similarly,
, so that
as a subspace. With
,
. Finally, the uniqueness of limits yields
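A sketch of the limit laws being invoked, writing \(L(f)\) for the limit of \(f\) (placeholder notation): for functions \(f, g\) in this subspace and any scalar \(c\),
\[
L(f + g) = L(f) + L(g), \qquad L(cf) = c\,L(f),
\]
which is precisely the statement that \(L\) is a linear transformation.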
Having defined limits at , we can generalise to limits at any other point
. In fact, we don’t need to do a lot of hard work to even show that the set of functions with limits at
exists; we’ll just transport the subspace property in the following manner:
Theorem 2. Let be a linear transformation. Then the range of
defined by
is a subspace of .
Proof. For additivity, if , then
yields
Similarly, for any
, thus
as a subspace.
Lemma 4. Let be a linear transformation.
Proof. Exercise.
Corollary 1. For any , define the subset of functions
Then as a subspace and the map
defined by
is a well-defined linear transformation.
Proof. Define the linear transformation by
. Then
is clearly bijective so that
as a subspace.
Furthermore
is a linear transformation.
In most calculus courses, limits are applied to discuss continuous functions, in the following manner: is continuous at
if and only if
In other words, is not all that is required; we also require
We could define this property then manually prove that the set of functions continuous at is indeed a subspace of
. However, we have an even more efficient tool up our linear algebraic sleeves.
Theorem 3. Let be a linear transformation. Then the kernel of
defined by
is a subspace of .
Proof. For additivity, if , then
so that . Similarly,
for any
, thus
as a subspace.
Corollary 2. For any , define the subset of functions continuous at
by
Then for any ,
as subspaces. Here denotes the set of functions that are continuous on
.
Proof. Since arbitrary intersections of subspaces are again subspaces, it remains to prove that as a subspace (as a subset, this is obvious). To that end, define the linear transformation
by
Then Theorem 3 establishes the result.
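A minimal sketch of why Theorem 3 applies, with placeholder notation: continuity at \(a\) can be packaged as membership in a kernel, since
\[
f \text{ is continuous at } a \iff \lim_{x \to a} f(x) - f(a) = 0,
\]
so the functions continuous at \(a\) form the kernel of the linear map \(f \mapsto \lim_{x \to a} f(x) - f(a)\), defined on the functions having a limit at \(a\).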
Corollary 3. For any , define the subset of functions differentiable at
by
Then for any ,
as subspaces. Here denotes the set of functions that are differentiable on
. Furthermore, the functions
and
defined by
are linear transformations.
Proof. To prove that requires the use of several limit laws in calculus. For the subspace property, define the injective linear transformation
by
It is not hard to verify that as a subspace. Thus,
is a bijective linear transformation, so that
as a subspace. Finally,
as a linear transformation, and
as a linear transformation.
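For concreteness, the linearity being claimed of differentiation is the familiar pair of derivative rules, sketched here with \(D\) as a placeholder name for the map \(f \mapsto f'\):
\[
D(f + g) = f' + g' = D(f) + D(g), \qquad D(cf) = c f' = c\,D(f).
\]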
Of course, these ideas extend even to integral calculus, which includes the generalisation of integral transforms, whose special cases include the Laplace transform and the Fourier transform. However, establishing the existence of these objects requires new ideas in Lebesgue integration (and even in Riemann integration we don’t get a sufficiently big picture). Thus, we shall defer those discussions to if, and when, we get there.
Next up: differentiating polynomials which causes us to touch base, pun intended, with bases for vector spaces once again.
—Joel Kindiak, 26 Feb 25, 2344H
Let be a vector space over a field
. We have previously seen that for vectors
,
Furthermore, if , we require at least one vector, namely
, to cook up the space
.
We are tempted then to think that requires two minimum ingredients. However, we have seen that this does not always hold. For instance, suppose
contains some
. Define
and
. Then
yields
which requires only one ingredient. The key observation is that . That is,
doesn’t truly increase
at all. We can consolidate by making the following observation.
Lemma 1. Let be sets of vectors. Suppose
. Then
as subspaces. Furthermore, if , then
Proof. Firstly, , so that
contains
. Since
is the smallest subspace containing
, we must have
If , then
implies
Hence, .
The main point of the equality is that for any excess vectors , if these vectors belong to
, then
doesn’t add any new information to
, and thus does not increase the minimum required number of ingredients to generate the space.
Recall our example where both vectors are nonzero. We have seen that if
, then
What if ? Then there are no scalars
such that
. In fact, something stronger happens.
Lemma 2. Let be nonzero vectors. Then
if and only if for any
,
Proof. We first prove . Fix scalars
such that
By algebra, . If
, then we have
and we are done. Otherwise,
, a contradiction. Therefore,
necessarily.
We next prove by contrapositive. Suppose
. Then there exists some
such that
. If
then
, a contradiction. Hence,
. Therefore,
Setting and
,
and yet is not true.
It is this latter condition that we will define as linear independence.
Definition 2. The finite set is called linearly independent if for any
,
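A sketch of the defining condition, assuming the standard formulation with placeholder vectors \(v_1, \dots, v_n\) and scalars \(c_1, \dots, c_n \in F\):
\[
c_1 v_1 + \cdots + c_n v_n = 0 \implies c_1 = c_2 = \cdots = c_n = 0.
\]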
It is not hard to see that nonempty finite subsets of are linearly independent as well. A singleton set consisting of a nonzero vector is linearly independent too.
We use this property to generalise to infinite sets. For any set , we say that
is linearly independent if every nonempty finite subset of
is linearly independent. We say that
is linearly dependent if it is not linearly independent.
Corollary 1. Let be nonzero vectors. Then
if and only if
is linearly independent.
Roughly speaking, a set is linearly independent if it contributes
pieces of information to its span, with no redundancy. This count is what we define to be the dimension of a span.
However, let’s first return to mathematical earth and discuss . Intuitively,
ought to have
dimensions. This is true.
Example 3. For each , define
, which means
Then is linearly independent, and
Proof. For linear independence, fix scalars such that
Then for any ,
Therefore, , so that
is linearly independent. For the spanning property, fix
, where
for each
. Then
This yields . On the other hand, since
and
is the smallest vector space containing
, we automatically have
. Therefore,
We call a basis for
, and can generalise the idea to other vector spaces.
Definition 3. We call a basis for
if
is linearly independent and
.
Theorem 1. Suppose is a basis for
. Then for any vector
, there exist unique scalars
such that
Proof. Existence is immediate from . For uniqueness, suppose there are two representations
Subtracting on both sides,
Since is linearly independent,
for each
, so that the “recipe” that creates
is unique.
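A small worked example of this uniqueness, in \(\mathbb{R}^2\) with the (hypothetical) basis \(\{(1,0), (1,1)\}\): writing \((3,5) = a(1,0) + b(1,1)\) forces
\[
(3, 5) = (a + b,\; b) \implies b = 5,\; a = -2,
\]
so the only recipe is \((3,5) = -2\,(1,0) + 5\,(1,1)\).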
Corollary 2. Suppose is a basis for
. Define the map
in the following manner:
for any
and for any
and
,
Then is a well-defined bijection. In fact,
is a linear transformation that maps the dish
to the recipe
that is required to “cook” it.
In a sense, any vector space that only requires a minimum of ingredients to cook all dishes is essentially the same as the vector space
. We can formalise this idea using isomorphisms, which is our next topic of discussion.
—Joel Kindiak, 24 Feb 0007H
Let be any set and
be any field. Recall that the function space
forms a vector space over
. Furthermore, given any vector space
over
, the function space
forms a vector space over
. In this manner, we can create many vector spaces.
But even if we restrict our attention to just one vector space over
, we can create many, many vector spaces.
For any , define
It should seem intuitive that forms a vector space over
. In this case, we would call
a subspace of
.
Definition 1. We say that is a subspace of
if
forms a vector space over
.
However, this definition requires us to verify all 8 or more conditions of a vector space—this would be an incredibly arduous task. Is there a short-cut to determine this result? Thankfully, the answer is yes.
Theorem 1. For any ,
is a subspace of
if and only if the following three conditions are satisfied:
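A sketch of the three conditions, assuming the usual criterion for a subset \(W \subseteq V\): (i) \(0_V \in W\); (ii) for any \(u, v \in W\), \(u + v \in W\); (iii) for any \(c \in F\) and \(u \in W\), \(cu \in W\).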
Proof Sketch. Apart from these closure properties, all other properties are inherited directly from the ambient vector space.
Example 1. For any ,
is a subspace of
.
Proof. We verify the three conditions of Theorem 1, and suppose for nontriviality.
Therefore, is a subspace of
. In fact, if
, then
is isomorphic to
as vector spaces (more on isomorphisms in a future post). We call
a
-dimensional subspace of
.
Theorem 2. Let be subspaces. Then
is a subspace of
(and similarly,
).
Proof. We verify the three identities.
In general, does not form a vector space. For a concrete example, recall that
. For each
, define
. Then clearly,
, but
.
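A concrete instance of this failure (a sketch; the original example may differ): in \(\mathbb{R}^2\), take the two axes
\[
W_1 = \{(x, 0) : x \in \mathbb{R}\}, \qquad W_2 = \{(0, y) : y \in \mathbb{R}\}.
\]
Then \((1,0) \in W_1\) and \((0,1) \in W_2\), but \((1,0) + (0,1) = (1,1) \notin W_1 \cup W_2\), so the union is not closed under addition.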
The correct generalisation would be direct sum. In fact, we can characterise subspaces in terms of sums of subsets.
Theorem 3. For subsets , and
, define the subsets
Then is a subspace of
if and only if
and
. Furthermore, if
and
are subspaces of
, then
is a subspace of
.
Proof. It is not hard to see that if are subspaces of
, then
Furthermore, .
Let be a vector space and
be subspaces. Then any subspace
containing
must contain
. In that sense,
is the smallest subspace that contains
.
In particular, given nonzero vectors ,
will be the smallest vector space that contains
. In fact, more is true.
Lemma 1. For any subspace such that
,
Hence is the smallest vector space that contains
.
More generally, if is just a set, then
is a subspace of
that contains
. The smallest subspace that contains
will be the intersections of all subspaces that contain
. This is called the span of
.
Theorem 4. Let . Let
denote the collection of subspaces of
that contain
. The span of
is then defined by
Then is the smallest subspace of
that contains
.
Proof. Exercise.
If is a finite set, then we can write
as a combination of one-dimensional subspaces.
Theorem 5. For vectors , then
Proof. For the case , the equality
holds since both sides of the proposed equality are the smallest subspace containing . Apply induction together with Lemma 1 to obtain the desired result.
In fact, we can even take the span of the empty set.
Example 2. .
Proof. We leave it as an exercise to verify that is a subspace that contains
. On the other hand, any subspace that contains
just refers to any subspace, which must contain
. Hence,
.
Finally, we obtain the usual definition for spans of sets in a vector space.
Corollary 1. For vectors ,
Intuitively, any vector belongs to
if we can “cook”
up by combining some recipe
with the ingredients:
where denotes the amount of the ingredient
used to cook up
.
It might be tempting therefore to assume that subspaces of the form require all
ingredients
to generate. However, that is not always true. Observe that
implies
so that only requires one vector instead of two. How do we know if we have hit the lowest possible number of ingredients? We need a fundamental tool called linear independence, which is our next topic of discussion.
—Joel Kindiak, 22 Feb 25, 2226H
Let’s talk linear algebra. This subject involves two key words: linear—referring to some nice vector-ish objects and related properties, and algebra—the manipulations and transformations we can perform on said vector-ish properties.
For an introduction to the topic, we will discuss 2D vectors. But we shall not (and will not) shy away from its more exciting abstractions.
Throughout this post, let denote any set and
denote any field, which, roughly speaking, refers to any set where addition, subtraction, multiplication, and division are sufficiently well-defined.
Definition 1. The two-dimensional -space is defined to be
where we will denote the ordered pairs in column notation. In particular, denotes the two-dimensional real space that we all know and love.
Very soon we will discuss ideas in much broader generality. But perhaps to motivate the subject, we can recall our usual vector operations that correspond to two-dimensional vectors used in high school physics.
Definition 2. Define addition and scalar multiplication on via
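A sketch of the intended operations, assuming the usual componentwise definitions in column notation:
\[
\begin{pmatrix} a \\ b \end{pmatrix} + \begin{pmatrix} c \\ d \end{pmatrix} = \begin{pmatrix} a + c \\ b + d \end{pmatrix}, \qquad \lambda \begin{pmatrix} a \\ b \end{pmatrix} = \begin{pmatrix} \lambda a \\ \lambda b \end{pmatrix}.
\]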
We expect these objects to behave like the vectors that we are familiar with, that in essence, encode directed distance. We call the set of these vectors a vector space.
Theorem 1. Let . Then
satisfies the following additive properties:
In this case, we call a group under
. In addition, we can add vectors in either order (this is the commutativity property):
In this case, we call an abelian group under
. In addition,
satisfies the following scaling properties:
In this case, we call a vector space over
.
Proof Sketch. The proof is a matter of definition-checking. Nevertheless, we will complete some proofs to illustrate some of the techniques being used.
For the second property, we take advantage of the associativity of in
:
For the third property, we define and check that it satisfies the required equations. For the fourth property, we define
and do the necessary bookkeeping.
Notice that this idea is not unique to . It could apply to many, many other sets, as we are about to see.
Lemma 1. For any field ,
forms a vector space over
. In particular,
forms a vector space over
.
Arguably the most important instances of vector spaces would be the function spaces. These spaces don’t always share all of the same properties as , but when they do, they share these properties in a beautifully unified manner.
Theorem 2. For any vector space over
, let
denote the set of
-valued functions on
. Define addition and scalar multiplication according to the vector space structure of
:
For any and
,
Then forms a vector space over
with additive identity
defined by
and for any , additive inverse
defined by
In particular, forms a vector space over
.
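A sketch of the pointwise formulas presumably intended, for \(f, g \in V^X\), \(c \in F\) and \(x \in X\):
\[
(f + g)(x) = f(x) + g(x), \qquad (cf)(x) = c\,f(x), \qquad 0_{V^X}(x) = 0_V, \qquad (-f)(x) = -f(x).
\]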
It is this last example that we want to emphasise as the twin brother of .
Theorem 3. For any , define
by
Then the function defined by
satisfies the following property:
For any and
,
In this case, we call a linear transformation. In addition,
is bijective, and we call
a vector space isomorphism. Therefore, we can write
without ambiguity.
Proof Sketch. The proof is immediate after we recognise that for each ,
which implies that
The bijectivity of is easily verifiable.
This connection allows us to define -space as a function space.
Definition 3. For any vector space over
and
, we define the vector space
, which is a vector space over
. In particular,
, of which
is a special case.
We insist on defining the vector space , since we can make remarkable connections with other areas of mathematics, as we will see in the next post.
—Joel Kindiak, 19 Feb 25, 2233H
Throughout this post, let be vector spaces over some field
and
be a linear transformation.
Problem 1. For any subspace , prove that
is a subspace of
. In particular, the range
of
is a subspace of
.
Solution. For any and for any scalar
,
Problem 2. For any subspace , prove that
is a subspace of
. In particular, the kernel
of
is a subspace of
.
Solution. For any ,
. Since
is a subspace of
, for any scalar
,
Problem 3. For any , prove that
if and only if
.
Solution. We have .
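A sketch of the standard equivalence underlying this kind of statement, with placeholder names \(T\), \(u\), \(v\):
\[
T(u) = T(v) \iff T(u) - T(v) = 0 \iff T(u - v) = 0 \iff u - v \in \ker(T).
\]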
Problem 4. For subspaces , prove that
is a subspace of . Also, for any
, prove that
is a subspace of
.
Solution. For any ,
Similarly, for any scalar ,
The proof for is similar.
Problem 5. For subspaces , prove that
where equality holds if is surjective.
Solution. For the first claim, to prove , we note that for any
,
,
so that .
To prove when
is surjective, fix
. Then
. Hence, there exists
such that
.
Since is surjective, find
such that
. Then
. By Problem 3,
Hence, .
Problem 6. For any scalar and subspace
, prove that
.
Solution. We make the quick observation that
for any vector subspace . This means for any
,
where for brevity. The result follows by bookkeeping.
—Joel Kindiak, 26 Jan 25, 1826H
Problem 1. Let be a vector space over
. Prove that if
has at least one nonzero element, then
has infinitely many elements.
Solution. Suppose contains some nonzero
. Since
is a vector space over
,
contains
for any
. Each
yields a unique
since
. Thus,
contains an infinite number of elements given by the injection
.
Problem 2. Let be vector spaces over
, and
be a linear transformation. For any
, prove that the equation
either has no solutions, exactly one solution, or infinitely many solutions.
Solution. If the equation has no solutions, then we are done. Otherwise, it has at least one solution
, which yields
.
If the equation has at most one solution, then we are done. Otherwise, the equation has at least one other solution , which yields
. Since
is linear,
For any , consider the vector
. Each
yields a unique
since
. On the other hand
Thus, the equation has infinitely many solutions defined by
.
—Joel Kindiak, 30 Nov 24, 0112H
Problem 1. Let be a vector space and
be a linear transformation. Suppose
, where we denote
for brevity. Prove that
.
Solution. Under the hypothesis that , we have
.
We will first prove . Fix
. Find
such that
. Then
so that . Hence,
.
Now, we prove . Fix
. Then
By repeated application of this result,
Hence, .
Combining both results, .
—Joel Kindiak, 29 Nov 24, 2130H
I couldn’t solve this linear algebra question as an undergraduate. Today, we revisit it and defeat it once and for all.
Problem 1. Let be a finite-dimensional (complex) vector space equipped with the inner product
. Let
be an invertible linear operator. Suppose that for any
,
Prove that is a scalar multiple of some unitary operator.
It is not hard to show that the converse holds (i.e. if is a scalar multiple of a unitary operator then the equation holds).
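As a sketch of why the converse direction is easy, under the assumption (mine, for illustration) that the hypothesis asserts \(T\) preserves orthogonality: if \(T = cU\) with \(U\) unitary, then for any \(u, v\),
\[
\langle Tu, Tv \rangle = \langle cUu, cUv \rangle = |c|^2 \langle Uu, Uv \rangle = |c|^2 \langle u, v \rangle,
\]
so \(\langle u, v\rangle = 0\) forces \(\langle Tu, Tv\rangle = 0\).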
In the original exam paper, the examiner actually offered sub-steps for the question. The solution below will therefore follow the suggested roadmap and, furthermore, address the result more directly.
Solution. We first observe that implies that
is self-adjoint, therefore normal, and thus unitarily diagonalisable.
Thus, there exists an orthonormal basis for
such that
, where each
.
We observe that
In particular,
so that . Thus, substituting
into the equation
and simplifying,
Simplifying for any such that
,
Since was chosen arbitrarily, we must have
so that
It is not hard to show that this equation holds for any . Thus,
is unitary, and
, as required.
—Joel Kindiak, 19 Oct 24, 0150H