The Peculiar Dimensions of Polynomial Space

Let’s discuss polynomials. Let \mathbb K be any field.

Definition 1. Recall the constant function 1 \in \mathcal F(\mathbb K, \mathbb K). A function \mu_n \in \mathcal F(\mathbb K, \mathbb K) is a monomial with degree n \in \mathbb N_0 if \mu_n(x) = x^n for any x \in \mathbb K. By convention, we define \mu_0 = 1. We then define the set \mathbb K_n[x] \subseteq \mathcal F(\mathbb K, \mathbb K) of polynomials over \mathbb K with degree n by

\mathbb K_n[x] := \mathrm{span} \{\mu_0,\mu_1,\dots,\mu_n\} \subseteq \mathcal F(\mathbb K,\mathbb K).

Then the set \mathbb K[x] \subseteq \mathcal F(\mathbb K, \mathbb K) of polynomials over \mathbb K is then defined to be the union of all such \mathbb K_n[x]:

\displaystyle \mathbb K[x] := \bigcup_{n=0}^\infty \mathbb K_n[x] \subseteq \mathcal F(\mathbb K, \mathbb K).

It is clear that for any n \in \mathbb N, \mathbb K_n[x] \subseteq \mathbb K[x] as a subspace. Furthermore, for any m, n \in \mathbb N, \mu_{m+n} = \mu_m \cdot \mu_n. In fact, more is true.

Theorem 1. For any n \in \mathbb N, \{\mu_0,\mu_1,\dots,\mu_n\} forms a basis for \mathbb K_n[x].

Proof. We prove by induction. It is obvious that \{\mu_0\} is linearly independent. Suppose for any k \in \mathbb N, \{\mu_0,\mu_1,\dots,\mu_k\} is linearly independent. Consider the equation

c_0 \mu_0 + c_1 \mu_1 + \cdots + c_k \mu_k + c_{k+1} \mu_{k+1} = 0.

Setting x = 0, \mu_0(0) = 1 while \mu_i(0) = 0 for i \geq 1. Hence, c_0 = 0, yielding

c_1 \mu_1 + \cdots + c_k \mu_k + c_{k+1} \mu_{k+1} = 0.

Factoring \mu_1,

\mu_1 \cdot (c_1 \mu_0 + \cdots + c_k \mu_{k-1} + c_{k+1} \mu_k) = 0.

For x \neq 0, since \mu_1(x)  \neq 0, we must have

c_1 \mu_0 + \cdots + c_k \mu_{k-1} + c_{k+1} \mu_k = 0.

By the induction hypothesis, since \{\mu_0, \mu_1,\dots,\mu_k\} is linearly independent, we must have

c_1 = c_2 = c_3 = \cdots = c_k = c_{k+1} = 0.

Hence, c_i = 0 for 0 \leq i \leq k+1. Therefore, the set \{\mu_0,\mu_1,\dots,\mu_{k+1}\} is linearly independent. By induction, for any n \in \mathbb N_0, the set \{\mu_0,\mu_1,\dots,\mu_{n}\} forms a basis for \mathbb K_n[x].

Corollary 1. The set M :=\{\mu_i : i \in \mathbb N_0\} forms a basis for \mathbb K[x].

Proof. To prove that \mathbb K[x] = \mathrm{span}(M), fix f \in \mathbb K[x]. Then f \in \mathbb K_n[x] for some n \in \mathbb N, yielding

f \in  \mathbb K_n[x] = \mathrm{span}\{\mu_0,\mu_1,\dots,\mu_{n}\} \subseteq \mathrm{span}(M).

On the other hand, since \mathrm{span}(M) is the smallest subspace of \mathbb K[x] containing M, we have

M \subseteq \mathrm{span}(M) \subseteq \mathbb K[x],

as required. To prove that M is linearly independent, let L \subseteq M be a finite set. Define

n := \max \{i \in \mathbb N : \mu_i \in L\}.

Then L \subseteq \{\mu_0,\mu_1,\dots,\mu_n\} is linearly independent, as required. Since any finite subset of M is linearly independent, M is linearly independent.

Therefore, M forms a basis for \mathbb K[x].

Recall that differentiation \displaystyle \frac{\mathrm d}{\mathrm dx} : \mathbb R[x] \subseteq \mathcal D \to \mathcal F(\mathbb R,\mathbb R) is a linear transformation. In fact, techniques in calculus yield the famous result

\displaystyle \frac{\mathrm d}{\mathrm dx}(1) = 0,\quad \frac{\mathrm d}{\mathrm dx}(\mu_n) = n \cdot \mu_{n-1},\quad n \in \mathbb N.

In other words, \displaystyle \frac{\mathrm d}{\mathrm dx} is defined on the basis \{\mu_i : i \in \mathbb N\}. It turns out that we can go the other way around—first start with a function defined on a basis, then extend it to the rest of the vector space.

Let V be a vector space over \mathbb K.

Theorem 2. Let W be a vector space over \mathbb K. Suppose K forms a basis for V. For any function T : K \to W, there exists a unique linear transformation \widetilde T : V \to W such that \widetilde T|_K = T. Furthermore, if T(K) is linearly independent, then T(K) forms a basis for T(V).

Proof. For any \mathbf v \in V, find unique vectors \mathbf v_1,\dots,\mathbf v_k \in K and constants c_1,\dots,c_k \in \mathbb K such that

\mathbf v = c_1 \mathbf v_1 + \cdots + c_k \mathbf v_k.

Then define \widetilde T : V \to W by

\widetilde T(\mathbf v) := c_1 T(\mathbf v_1) + \cdots + c_k T(\mathbf v_k).

which can be verified to be a well-defined linear transformation satisfying \widetilde T|_K = T. For uniqueness, suppose there exists another linear transformation S : V \to W such that S|_K = T. Then

(S-T)|_K = S|_K - T|_K = 0.

The extension yields (S - T) : V \to W defined by S - T = 0 \Rightarrow S = T, establishing uniqueness.

Corollary 2. The map D : \{\mu_i : i \in \mathbb N \} \to \mathbb K[x] defined by

D(1) := 0,\quad D(\mu_n) :=n \cdot \mu_{n-1},\quad n \in \mathbb N,

extends to a canonical linear transformation \widetilde D : \mathbb K[x] \to \mathbb K[x] called the formal derivative.

From now on, we will write T = \widetilde T without loss of ambiguity.

Corollary 3. For any n \in \mathbb N_0, the linear transformation T : \mathbb K_n[x] \to \mathbb K^{n+1} defined by T(\mu_i) = \mathbf e_{i+1} is a bijective linear transformation.

Proof. Since T(\{\mu_0, \mu_1,\dots,\mu_n\}) = \{ \mathbf e_1, \mathbf e_2, \dots, \mathbf e_{n+1} \}, we have

\begin{aligned} T(\mathbb K_n[x]) &= \mathrm{span}\ \! T(\{\mu_0, \mu_1,\dots,\mu_n\}) \\ &= \mathrm{span}\{ \mathbf e_1, \mathbf e_2, \dots, \mathbf e_{n+1} \} = \mathbb K^{n+1}\end{aligned}

and thus T is surjective. It suffices to prove that T is injective. Fix f \in \mathbb K_n[x] and find unique scalars c_0,c_1,\dots,c_n \in \mathbb K such that

f = c_0 \mu_0 + c_1 \mu_1 + \cdots + c_n \mu_n.

If T(f) = 0, then

\begin{aligned} 0 = T(f) &= T(c_0 \mu_0 + c_1 \mu_1 + \cdots + c_n \mu_n) \\ &= c_0 T(\mu_0) + c_1 T(\mu_1) + \cdots + c_n T(\mu_n).\end{aligned}

Since T(\{\mu_0, \mu_1,\dots,\mu_n\}) = \{ \mathbf e_1, \mathbf e_2, \dots, \mathbf e_{n+1} \} forms a basis for T(\mathbb K_n[x]),

c_0 = c_1 = \dots = c_n = 0,

so that f = 0, as required.

Corollary 3 tells us that \mathbb K_n[x] and \mathbb K^{n+1} are essentially the same as vector spaces. The technical term is that these two vector spaces are isomorphic to each other.

Definition 2. Two vector spaces V,W over a field \mathbb K are isomorphic if there exists a vector space isomorphism (i.e. a bijective linear transformation) T : V \to W. In this case, we denote V \cong W.

Example 1. For any n \in \mathbb N_0, \mathbb K_n[x] \cong \mathbb K^{n+1} \cong \mathcal F(\{1,\dots,n+1\}, \mathbb K).

As the infinite version of \mathbb K_n[x], it true that \mathbb K[x] \cong \mathcal F(\mathbb N,\mathbb K)? No! Consider the case \mathbb K = \mathbb Q. It turns out there isn’t even a bijection, since \mathbb Q[x] is a countably infinite set and \mathcal F(\mathbb N, \mathbb Q) is uncountably infinite.

This surprising fact boils down to \mathbb Q[x] being defined as a union of a countably infinite number of countably infinite sets \mathbb Q_n[x]. In a sense, \mathbb Q[x] encodes the smallest infinity that encompasses the finite cases.

Yet, the finite-dimensional case is a fascinating study on its own right. We have seen that \mathbb K_n[x] behaves essentially like \mathbb K^n, which we will define to have dimension n. In fact, we will use this rather broad definition of a dimension.

Definition 3. We say that V is finite-dimensional if there exists some n \in \mathbb N such that V \cong \mathbb K^n. Otherwise, V is infinite-dimensional.

Example 2. For any n \in \mathbb N, \mathbb K_n[x] is finite-dimensional, while \mathbb K[x] is infinite-dimensional.

Furthermore, the study of finite-dimensional vector spaces is the motivation for the rather strange object known as matrices. In fact, studying matrices turns out to be the key to help us understand dimensions, and so we turn our attention there in the next post.

—Joel Kindiak, 27 Feb 25, 2339H

,

Published by


Leave a comment