The Key Ingredients of Vector Spaces

Let V be a vector space over a field \mathbb K. We have previously seen that for vectors \mathbf v_1, \mathbf v_2 \in V,

\mathrm{span}(\{\mathbf v_1, \mathbf v_2\}) = \mathbb K\{\mathbf v_1\} + \mathbb K \{\mathbf v_2\}.

Furthermore, if \mathbf v \neq \mathbf 0, we require at least one vector, namely \mathbf v, to cook up the space \mathbb K\{\mathbf v\}.

We are then tempted to think that \mathrm{span}(\{\mathbf v_1, \mathbf v_2\}) requires a minimum of two ingredients. However, we have seen that this does not always hold. For instance, suppose V contains some \mathbf v \neq \mathbf 0. Define \mathbf v_1 = \mathbf v and \mathbf v_2 = 2\mathbf v. Then \mathbb K\{\mathbf v_1\} + \mathbb K \{\mathbf v_2\} = \mathbb K\{\mathbf v\} yields

\mathrm{span}(\{\mathbf v_1, \mathbf v_2\}) = \mathrm{span}(\{\mathbf v\}),

which requires only one ingredient. The key observation is that \mathbf v_2 = 2 \mathbf v_1 \in \mathrm{span}(\{\mathbf v_1\}). That is, \mathbf v_2 doesn’t truly increase \mathrm{span}(\{\mathbf v_1\}) at all.
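
To see this collapse concretely, here is a small numerical check in Python with NumPy, assuming \mathbb K = \mathbb R and a made-up nonzero vector in \mathbb R^3; the rank of the matrix whose rows are \mathbf v_1, \mathbf v_2 counts the number of genuinely distinct ingredients in their span.

    import numpy as np

    v = np.array([1.0, 2.0, 3.0])  # any nonzero v in R^3
    v1, v2 = v, 2 * v              # v2 = 2*v1 lies in span({v1})

    # rank of the stacked matrix = number of independent directions
    print(np.linalg.matrix_rank(np.vstack([v1, v2])))  # prints 1, not 2

We can consolidate by making the following observation.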

Lemma 1. Let W_1,W_2 \subseteq V be sets of vectors. Suppose W_1 \subseteq W_2. Then

\mathrm{span}(W_1) \subseteq \mathrm{span}(W_2)

as subspaces. Furthermore, if W_2 \backslash W_1 \subseteq  \mathrm{span}(W_1), then

\mathrm{span}(W_1) = \mathrm{span}(W_2).

Proof. Firstly, W_1 \subseteq W_2 \subseteq \mathrm{span}(W_2), so that \mathrm{span}(W_2) contains W_1. Since \mathrm{span}(W_1) is the smallest subspace containing W_1, we must have

\mathrm{span}(W_1) \subseteq \mathrm{span}(W_2).

If W_2 \backslash W_1 \subseteq  \mathrm{span}(W_1), then W_2 = (W_1 \cap W_2) \cup (W_2 \backslash W_1) implies

\begin{aligned} \mathrm{span}(W_2) &\subseteq \mathrm{span}(W_1 \cap W_2) + \mathrm{span}(W_2 \backslash W_1) \\ &\subseteq \mathrm{span}(W_1) + \mathrm{span}(W_1) \\ &\subseteq \mathrm{span}(W_1). \end{aligned}

Hence, \mathrm{span}(W_1) = \mathrm{span}(W_2).

The main point of the equality is that if the excess vectors in W_2 \backslash W_1 all belong to \mathrm{span}(W_1), then \mathrm{span}(W_2) doesn’t add any new information to \mathrm{span}(W_1), and thus does not increase the minimum number of ingredients required to generate the space.
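
Lemma 1 can be sanity-checked numerically as well. A minimal sketch over \mathbb K = \mathbb R with made-up vectors: enlarging W_1 by a vector already in \mathrm{span}(W_1) leaves the span unchanged, while a vector outside it genuinely grows the span.

    import numpy as np

    W1 = [np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])]
    inside = 3 * W1[0] - 2 * W1[1]       # belongs to span(W1)
    outside = np.array([0.0, 0.0, 1.0])  # does not belong to span(W1)

    def dim_span(vecs):
        # rank of the stacked matrix = number of independent directions
        return np.linalg.matrix_rank(np.vstack(vecs))

    print(dim_span(W1))              # 2
    print(dim_span(W1 + [inside]))   # still 2: the span is unchanged
    print(dim_span(W1 + [outside]))  # 3: the span grew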

Recall our example \{\mathbf v_1, \mathbf v_2\} where both vectors are nonzero. We have seen that if \mathbf v_2 \in \mathrm{span}(\{ \mathbf v_1 \}), then

\mathrm{span}(\{\mathbf v_1, \mathbf v_2\}) = \mathrm{span}(\{\mathbf v_1\}).

What if \mathbf v_2 \notin \mathrm{span}(\{ \mathbf v_1 \}) = \mathbb K\{\mathbf v_1\}? Then there is no scalar c \in \mathbb K such that \mathbf v_2 = c\mathbf v_1, or equivalently, such that c\mathbf v_1 + (-1)\mathbf v_2 = \mathbf 0. In fact, something stronger happens.

Lemma 2. Let \mathbf v_1,\mathbf v_2 \in V be nonzero vectors. Then \mathbf v_2 \notin \mathrm{span}(\{\mathbf v_1\}) if and only if for any c_1,c_2 \in \mathbb K,

c_1 \mathbf v_1 + c_2 \mathbf v_2 = \mathbf 0\quad \Rightarrow \quad c_1 = c_2 = 0.

Proof. We first prove (\Rightarrow). Fix scalars c_1,c_2 \in \mathbb K such that

c_1 \mathbf v_1 + c_2 \mathbf v_2 = \mathbf 0.

By algebra, c_1 \mathbf v_1 = -c_2 \mathbf v_2. If c_2 = 0, then c_1 \mathbf v_1 = \mathbf 0; since \mathbf v_1 \neq \mathbf 0, this forces c_1 = 0 and we are done. Otherwise, \mathbf v_2 = -(c_1/c_2) \mathbf v_1 \in \mathrm{span}(\{\mathbf v_1\}), a contradiction. Therefore, c_1 = c_2 = 0 necessarily.

We next prove (\Leftarrow) by contrapositive. Suppose \mathbf v_2 \in \mathrm{span}(\{\mathbf v_1\}). Then there exists some c \in \mathbb K such that \mathbf v_2 = c\mathbf v_1. If c = 0 then \mathbf v_2 = \mathbf 0, a contradiction. Hence, c \neq 0. Therefore,

c\mathbf v_1 + (-1)\mathbf v_2 = \mathbf 0.

Setting c_1 = c and c_2 = -1,

c_1 \mathbf v_1 + c_2\mathbf v_2 = \mathbf 0,

and yet c_1 = c_2 = 0 fails, since c_2 = -1 \neq 0.

It is this latter condition that we will define as linear independence.

Definition 2. The finite set \{\mathbf v_1, \dots, \mathbf v_n\} \subseteq V is called linearly independent if for any c_1, \dots, c_n \in \mathbb K,

c_1 \mathbf v_1 + \dots + c_n \mathbf v_n = \mathbf 0\quad \Rightarrow \quad c_1 = \cdots = c_n = 0.

It is not hard to see that every nonempty finite subset of a linearly independent set \{\mathbf v_1, \dots, \mathbf v_n\} is linearly independent as well; in particular, each singleton \{\mathbf v_i\} is.

We use this property to generalise to infinite sets. For any set W \subseteq V, we say that W is linearly independent if every nonempty finite subset of W is linearly independent. We say that W is linearly dependent if it is not linearly independent.
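
Over \mathbb K = \mathbb R, Definition 2 suggests a direct computational test: \{\mathbf v_1, \dots, \mathbf v_n\} is linearly independent exactly when the matrix with rows \mathbf v_i has rank n, since only then does c_1 \mathbf v_1 + \dots + c_n \mathbf v_n = \mathbf 0 force every c_i = 0. A sketch in Python with NumPy (floating-point rank is approximate, so this is illustrative rather than a proof):

    import numpy as np

    def is_linearly_independent(vectors):
        # {v_1, ..., v_n} is independent iff the matrix with rows v_i
        # has rank n, i.e. only c = 0 solves sum c_i v_i = 0.
        A = np.vstack(vectors)
        return np.linalg.matrix_rank(A) == len(vectors)

    print(is_linearly_independent([np.array([1.0, 0.0]),
                                   np.array([0.0, 1.0])]))  # True
    print(is_linearly_independent([np.array([1.0, 2.0]),
                                   np.array([2.0, 4.0])]))  # False: v2 = 2*v1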

Corollary 1. Let \mathbf v_1,\mathbf v_2 \in V be nonzero vectors. Then \mathbf v_2 \notin \mathrm{span}(\{\mathbf v_1\}) if and only if \{\mathbf v_1,\mathbf v_2\} is linearly independent.

Roughly speaking, a linearly independent set \{\mathbf v_1,\dots,\mathbf v_n\} contributes n genuinely distinct pieces of information to its span. This count is what we will define to be the dimension of the span.

However, let’s first return to mathematical earth and discuss \mathbb R^n = \mathcal F(\{1,\dots,n\}, \mathbb R). Intuitively, \mathbb R^n ought to have n dimensions. This is true.

Example 3. For each i, define \mathbf e_i := \mathbb I_{\{i\}}, which means

\mathbf e_i(j) = \begin{cases}1, & i = j,\\ 0, & i \neq j.\end{cases}

Then \{ \mathbf e_1, \dots, \mathbf e_n \} is linearly independent, and

\mathbb R^n = \mathrm{span}(\{ \mathbf e_1, \dots, \mathbf e_n \}).

Proof. For linear independence, fix scalars c_1,\dots,c_n \in \mathbb R such that

c_1 \mathbf e_1 + \dots + c_n \mathbf e_n = \mathbf 0.

Then for any i,

\begin{aligned} 0 = \mathbf 0(i) &= (c_1 \mathbf e_1 + \dots + c_n \mathbf e_n)(i) \\ &= c_1 \mathbf e_1(i) + \dots + c_i \mathbf e_i(i) + \cdots + c_n \mathbf e_n(i) \\ &= c_1 \cdot 0+ \dots + c_i \cdot 1 + \cdots + c_n \cdot 0 = c_i.\end{aligned}

Therefore, c_1 = \dots = c_n = 0, so that \{ \mathbf e_1, \dots, \mathbf e_n \} is linearly independent. For the spanning property, fix \mathbf v \in \mathbb R^n, where \mathbf v(i) = v_i for each i. Then

\mathbf v = v_1 \mathbf e_1 + \dots + v_n \mathbf e_n \in \mathrm{span}(\{ \mathbf e_1, \dots, \mathbf e_n \}).

This yields \mathbb R^n \subseteq \mathrm{span}(\{ \mathbf e_1, \dots, \mathbf e_n \}). On the other hand, since \{ \mathbf e_1, \dots, \mathbf e_n \} \subseteq \mathbb R^n and \mathrm{span}(\{ \mathbf e_1, \dots, \mathbf e_n \}) is the smallest subspace containing \{ \mathbf e_1, \dots, \mathbf e_n \}, we automatically have \mathrm{span}(\{ \mathbf e_1, \dots, \mathbf e_n \}) \subseteq \mathbb R^n. Therefore,

\mathbb R^n = \mathrm{span}(\{ \mathbf e_1, \dots, \mathbf e_n \}).
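
In NumPy, the standard basis vectors \mathbf e_1, \dots, \mathbf e_n are the rows of the identity matrix, so both halves of Example 3 can be checked directly; a small sketch for a hypothetical n = 4:

    import numpy as np

    n = 4
    E = np.eye(n)  # row i is e_{i+1}

    # Linear independence: the identity matrix has full rank n.
    print(np.linalg.matrix_rank(E) == n)  # True

    # Spanning: any v is rebuilt as v_1 e_1 + ... + v_n e_n.
    v = np.array([3.0, -1.0, 2.5, 7.0])
    print(np.allclose(sum(v[i] * E[i] for i in range(n)), v))  # True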

We call \{ \mathbf e_1, \dots, \mathbf e_n \} a basis for \mathbb R^n, and can generalise the idea to other vector spaces.

Definition 3. We call B \subseteq V a basis for V if B is linearly independent and \mathrm{span}(B) = V.

Theorem 1. Suppose \{ \mathbf v_1, \dots, \mathbf v_n \} is a basis for V. Then for any vector \mathbf v \in V, there exist unique scalars c_1,\dots,c_n \in \mathbb K such that

\mathbf v = c_1 \mathbf v_1 + \dots + c_n \mathbf v_n.

Proof. Existence is immediate from V = \mathrm{span}(\{ \mathbf v_1, \dots, \mathbf v_n \}). For uniqueness, suppose there are two representations

c_1 \mathbf v_1 + \dots + c_n \mathbf v_n = d_1 \mathbf v_1 + \dots + d_n \mathbf v_n.

Subtracting on both sides,

(c_1-d_1) \mathbf v_1 + \dots + (c_n-d_n) \mathbf v_n = \mathbf 0.

Since \{ \mathbf v_1, \dots, \mathbf v_n \} is linearly independent, c_i - d_i = 0, i.e. c_i = d_i, for each i, so that the “recipe” that creates \mathbf v is unique.

Corollary 2. Suppose \{ \mathbf v_1, \dots, \mathbf v_n \} is a basis for V. Define the map T : V \to \mathbb K^n as follows: T(\mathbf v_i) = \mathbf e_i for each i, and for any \mathbf u,\mathbf v \in V and c,d \in \mathbb K,

T(c \mathbf u + d \mathbf v) = c T(\mathbf u) + d T(\mathbf v).

Then T : V \to \mathbb K^n is a well-defined bijection. In fact, T is a linear transformation that maps the dish \mathbf v to the recipe T(\mathbf v) that is required to “cook” it.
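
Concretely, over \mathbb K = \mathbb R, the map T sends each dish to its recipe: T(\mathbf v) is the unique coefficient vector from Theorem 1. A minimal sketch, assuming a made-up basis of \mathbb R^3 stored as the columns of a matrix B, where computing the recipe amounts to solving B\mathbf c = \mathbf v:

    import numpy as np

    # Hypothetical basis of R^3, one basis vector per column.
    B = np.column_stack([np.array([1.0, 0.0, 0.0]),
                         np.array([1.0, 1.0, 0.0]),
                         np.array([0.0, 1.0, 1.0])])

    def T(v):
        # Coordinate map: solve B c = v for the unique recipe c.
        return np.linalg.solve(B, v)

    v = np.array([2.0, 3.0, 1.0])
    c = T(v)
    print(np.allclose(B @ c, v))  # True: the recipe reproduces the dish

    # Linearity check on random inputs:
    u, w = np.random.randn(3), np.random.randn(3)
    print(np.allclose(T(2*u + 3*w), 2*T(u) + 3*T(w)))  # True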

In a sense, any vector space that only requires a minimum of n ingredients to cook all dishes is essentially the same as the vector space \mathbb K^n. We can formalise this idea using isomorphisms, which is our next topic of discussion.

—Joel Kindiak, 24 Feb 0007H
