Generalised Perpendicularity

Recall that an inner product \langle \cdot , \cdot \rangle on a vector space V over a field \mathbb K (\mathbb R or \mathbb C) generalises the dot product on \mathbb R^2. In particular, if \mathbb K = \mathbb R, then two nonzero vectors \mathbf u, \mathbf v \in V have angle \theta defined by

\langle \mathbf u , \mathbf v \rangle = \| \mathbf u \| \| \mathbf v \| \cos(\theta).

In particular, \langle \mathbf u, \mathbf v \rangle = 0 if and only if \cos(\theta) = 0, which means \theta = \pi/2. In this case, we say that \mathbf u, \mathbf v are perpendicular. In the usual two-dimensional space,

\mathbf u \perp \mathbf v \quad \iff \quad \mathbf u \cdot \mathbf v = 0.
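As a quick numerical sanity check, here is a minimal sketch in Python (the helper names `dot` and `is_perpendicular` are my own, not from the text):

```python
def dot(u, v):
    """Standard dot product on R^n."""
    return sum(ui * vi for ui, vi in zip(u, v))

def is_perpendicular(u, v, tol=1e-12):
    """Two nonzero vectors are perpendicular iff their dot product vanishes."""
    return abs(dot(u, v)) <= tol

print(is_perpendicular([1.0, 2.0], [-2.0, 1.0]))  # True: (1)(-2) + (2)(1) = 0
print(is_perpendicular([1.0, 2.0], [3.0, 4.0]))   # False: dot product is 11
```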

In particular, we recover a classic result from coordinate geometry.

Theorem 1. Two lines a_1 x + b_1 y = c_1 and a_2 x + b_2 y = c_2 are perpendicular if and only if a_1 a_2 + b_1 b_2 = 0. In particular, if b_1 = b_2 = 1, then the lines have slopes m_{ \text{tangent} } = -a_1 and m_{ \text{normal} } = -a_2, and the condition a_1 a_2 + 1 = 0 becomes the more familiar formula

m_{ \text{tangent} } \cdot m_{ \text{normal} } = -1.

Proof. The first line is parallel to the direction vector b_1 \mathbf e_1 - a_1 \mathbf e_2 and the second line is parallel to the direction vector b_2 \mathbf e_1 - a_2 \mathbf e_2. Hence the lines are perpendicular if and only if these direction vectors are, i.e. (b_1 \mathbf e_1 - a_1 \mathbf e_2) \cdot (b_2 \mathbf e_1 - a_2 \mathbf e_2) = b_1 b_2 + a_1 a_2 = 0.
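Theorem 1 is easy to check by hand, but a small sketch in Python makes the criterion concrete (the function name `lines_perpendicular` is mine):

```python
def lines_perpendicular(a1, b1, a2, b2):
    """a1*x + b1*y = c1 is perpendicular to a2*x + b2*y = c2
    iff a1*a2 + b1*b2 == 0 (Theorem 1)."""
    return a1 * a2 + b1 * b2 == 0

# x + y = 1 (slope -1) and x - y = 2 (slope 1): slopes multiply to -1
print(lines_perpendicular(1, 1, 1, -1))  # True
# x + y = 1 and 2x + y = 0: 1*2 + 1*1 = 3, not perpendicular
print(lines_perpendicular(1, 1, 2, 1))   # False
```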

While the angle interpretation only makes sense when \mathbf u, \mathbf v are nonzero vectors, the condition \langle \mathbf u, \mathbf v \rangle = 0 makes sense for any pair of vectors. This leads us to the more general (and arguably more precise) notion of right (i.e. ortho) angles (i.e. gonal): orthogonality.

Henceforth, let (V, \langle \cdot , \cdot \rangle) be an inner product space over \mathbb K.

Definition 1. Two vectors \mathbf u, \mathbf v are orthogonal if \langle \mathbf u, \mathbf v \rangle = 0.

  • A set K of vectors is orthogonal if any distinct vectors \mathbf u, \mathbf v \in K are orthogonal.
  • We say that \mathbf v is perpendicular to the subset U \subseteq V if for any \mathbf u \in U, \mathbf u,\mathbf v are orthogonal. In this case, we denote \mathbf v \perp U.
  • Two subsets U, W \subseteq V are orthogonal if for any \mathbf u \in U and \mathbf w \in W, \{ \mathbf u, \mathbf w \} is orthogonal. In this case, we denote U \perp W.

Lemma 1. The zero vector \mathbf 0 is orthogonal to any \mathbf v \in V. Furthermore, if U,W \subseteq V are orthogonal subspaces, then U \cap W = \{\mathbf 0\}.

Proof. For the first claim, \langle \mathbf 0, \mathbf v \rangle = \langle 0 \cdot \mathbf v, \mathbf v \rangle = 0 \cdot \langle \mathbf v, \mathbf v \rangle = 0. For the second, any \mathbf v \in U \cap W satisfies \langle \mathbf v, \mathbf v \rangle = 0, so \mathbf v = \mathbf 0 by positive-definiteness.

Lemma 2. Any orthogonal set K of nonzero vectors is linearly independent.

Proof. Since linear independence is checked on finite subsets, we may assume K = \{\mathbf v_1,\dots,\mathbf v_k\} without loss of generality. Consider the equation

\displaystyle \sum_{i=1}^k c_i \mathbf v_i = \mathbf 0.

For each j, apply \langle \cdot , \mathbf v_j\rangle on both sides to obtain

\begin{aligned}\langle \mathbf 0, \mathbf v_j\rangle &= \left\langle \sum_{i=1}^k c_i \mathbf v_i, \mathbf v_j \right\rangle = \sum_{i=1}^k c_i  \langle \mathbf v_i, \mathbf v_j \rangle \\ &= \sum_{i \neq j} c_i \cdot 0 + c_j \langle \mathbf v_j , \mathbf v_j \rangle = c_j \cdot \|\mathbf v_j \|^2. \end{aligned}

Since \|\mathbf v_j \| \neq 0, c_j = 0. Since this argument holds for any j, we get c_1 = \cdots = c_k = 0, and the result holds.

Lemma 3. The map \langle \cdot, \cdot \rangle: \mathbb K^n \times \mathbb K^n \to \mathbb K defined by

\displaystyle \langle \mathbf u, \mathbf v \rangle := \sum_{i=1}^n u_i \bar v_i

is an inner product on \mathbb K^n. In particular, \langle \mathbf e_i, \mathbf e_j \rangle = \mathbb I_{\{i\}}(j). Hence, we call the standard basis \{\mathbf e_1,\dots,\mathbf e_n\} for \mathbb K^n orthonormal.
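A minimal sketch of this inner product in Python (the helper name `inner` is mine; note that conjugating the second argument matches the convention \sum u_i \bar v_i used above):

```python
def inner(u, v):
    """Standard inner product on K^n: sum of u_i * conj(v_i)."""
    return sum(ui * vi.conjugate() for ui, vi in zip(u, v))

e1, e2 = [1, 0], [0, 1]
print(inner(e1, e1), inner(e1, e2))  # 1 0, i.e. the standard basis is orthonormal
# The conjugation makes <v, v> real and nonnegative even over C:
print(inner([1j, 0], [1j, 0]))       # (1+0j)
```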

Definition 2. An orthogonal set K \subseteq V is orthonormal if \| \mathbf v \| = 1 for each \mathbf v \in K. We call K an orthonormal basis for V if \mathrm{span}(K) = V.

Why are orthonormal sets so special? That is because decomposing any vector in \mathrm{span}(K) becomes trivial.

Theorem 2. Let I be an index set and K = \{\mathbf u_\alpha : \alpha \in I\} be an orthonormal set. For any vector \mathbf v \in \mathrm{span}(K), there exist finitely many vectors \mathbf u_1,\dots,\mathbf u_k \in K such that

\displaystyle \mathbf v = \sum_{i=1}^k \langle \mathbf v, \mathbf u_i \rangle \mathbf u_i.

Proof. Since K is linearly independent, it forms an orthonormal basis for \mathrm{span}(K). Hence, for any \mathbf v \in \mathrm{span}(K), there exist finitely many vectors \mathbf u_i \in K and unique scalars c_i \in \mathbb K such that

\displaystyle \mathbf v = \sum_{i=1}^k c_i \mathbf u_i.

For each j, apply \langle \cdot, \mathbf u_j\rangle on both sides,

\displaystyle \langle \mathbf v, \mathbf u_j\rangle = \sum_{i=1}^k c_i \langle \mathbf u_i, \mathbf u_j\rangle = \sum_{i \neq j} c_i \cdot 0 + c_j \cdot 1 = c_j,

as required.
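Theorem 2 can be illustrated numerically. This Python sketch decomposes a vector over an orthonormal basis of \mathbb R^2 (the rotated basis and the helper names `inner` and `coords` are my own choices):

```python
import math

def inner(u, v):
    """Real dot product, sufficient for this R^2 example."""
    return sum(ui * vi for ui, vi in zip(u, v))

def coords(v, basis):
    """Coefficients of v in an orthonormal basis: c_i = <v, u_i> (Theorem 2)."""
    return [inner(v, u) for u in basis]

# An orthonormal basis of R^2: the standard basis rotated by 45 degrees
s = 1 / math.sqrt(2)
basis = [[s, s], [-s, s]]
v = [3.0, 1.0]

c = coords(v, basis)
# Reconstruct v as the sum of c_i * u_i
recon = [sum(ci * u[k] for ci, u in zip(c, basis)) for k in range(2)]
print(recon)  # approximately [3.0, 1.0]
```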

In this regard, the standard basis \{\mathbf e_1,\dots,\mathbf e_n\} plays a special role as a “preferred” basis for \mathbb K^n among all bases.

That’s great, but what if we only have a basis K for V, and not an orthonormal one? Can we make such a conversion? For finite-dimensional vector spaces, the answer is ‘Yes’!

But first, we will need an “orthogonalising” tool. Given an orthonormal set K, if \mathbf v \notin \mathrm{span}(K), then can we find some vector \mathbf u \in \mathrm{span}(K) such that (\mathbf v - \mathbf u) \perp K?

Lemma 4. Let K := \{ \mathbf u_1,\dots,\mathbf u_n \} be an orthonormal set and U := \mathrm{span}(K). Motivated by Theorem 2, define the projection map by

\displaystyle \mathrm{proj}_U := \sum_{i=1}^n \langle \cdot, \mathbf u_i \rangle \mathbf u_i : V \to U.

Clearly, (\mathrm{proj}_U)|_U = \mathrm{id}_U. Then for any \mathbf v \in V, (\mathbf v - \mathrm{proj}_U(\mathbf v)) \perp U.

Proof. For any j,

\begin{aligned} \langle \mathrm{proj}_U(\mathbf v), \mathbf u_j \rangle &= \sum_{i=1}^n \langle \mathbf v, \mathbf u_i \rangle \langle \mathbf u_i , \mathbf u_j\rangle = \langle \mathbf v , \mathbf u_j \rangle. \end{aligned}

Hence,

\langle \mathbf v - \mathrm{proj}_U(\mathbf v), \mathbf u_j \rangle = \langle \mathbf v , \mathbf u_j \rangle -\langle \mathrm{proj}_U(\mathbf v), \mathbf u_j \rangle = 0.

The result follows by the conjugate-linearity of \langle \mathbf v - \mathrm{proj}_U(\mathbf v), \cdot \rangle.
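Here is a minimal Python sketch of Lemma 4 for real vectors (the helper names `inner` and `proj` are mine): we project onto a span and check that the residual is orthogonal to the orthonormal set.

```python
def inner(u, v):
    """Real dot product."""
    return sum(ui * vi for ui, vi in zip(u, v))

def proj(v, onb):
    """Orthogonal projection of v onto the span of the orthonormal set onb:
    proj_U(v) = sum of <v, u_i> * u_i."""
    out = [0.0] * len(v)
    for u in onb:
        c = inner(v, u)
        out = [o + c * uk for o, uk in zip(out, u)]
    return out

u1 = [1.0, 0.0, 0.0]
u2 = [0.0, 1.0, 0.0]
v = [2.0, 3.0, 5.0]

p = proj(v, [u1, u2])                   # [2.0, 3.0, 0.0]
r = [vi - pi for vi, pi in zip(v, p)]   # residual v - proj_U(v) = [0, 0, 5]
print(inner(r, u1), inner(r, u2))       # both 0: the residual is perpendicular to U
```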

Theorem 3 (Gram-Schmidt Process). Every finite-dimensional inner product space V has an orthonormal basis.

We may think that the isomorphism \mathbf v_k \mapsto \mathbf e_k \in \mathbb K^n suffices. However, we need this bijection to preserve angles, which isn’t a trivial matter to check at all.

Instead, we will slowly construct this orthonormal basis. Let \{\mathbf v_1,\dots,\mathbf v_n\} be a basis (not necessarily orthonormal!) for V, and write \hat{\mathbf w} := \mathbf w / \|\mathbf w\| for the normalisation of a nonzero vector \mathbf w. Define \mathbf w_1 := \mathbf v_1 and \mathbf u_1 := \hat{\mathbf w}_1. Now \{\mathbf v_1, \mathbf v_2\} is obviously linearly independent. This means \mathbf v_2 \notin \mathrm{span}\{\mathbf u_1\} = \mathrm{span}\{\mathbf v_1\}.

Define

\mathbf w_2 := \mathbf v_2 - \mathrm{proj}_{\mathrm{span}\{\mathbf u_1\}}(\mathbf v_2),

which is orthogonal to \mathbf u_1. Hence, define \mathbf u_2 := \hat{\mathbf w}_2. Inductively, given K_i := \{\mathbf u_1,\dots,\mathbf u_i\}, define

\mathbf w_{i+1} := \mathbf v_{i+1} -  \mathrm{proj}_{\mathrm{span}(K_i)}(\mathbf v_{i+1})

and \mathbf u_{i+1} := \hat{\mathbf w}_{i+1}. Note that \mathbf w_{i+1} \neq \mathbf 0 since \mathbf v_{i+1} \notin \mathrm{span}(K_i), so the normalisation is well-defined. Then K_n is the desired orthonormal basis, since by construction we have for each i, \mathbf v_i \in \mathrm{span}(K_i) \subseteq \mathrm{span}(K_n).
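The inductive construction above translates directly into code. This is a minimal sketch for real vectors (the function name `gram_schmidt` and the sample inputs are mine); each step subtracts the projection onto the span built so far, then normalises:

```python
import math

def inner(u, v):
    """Real dot product."""
    return sum(ui * vi for ui, vi in zip(u, v))

def gram_schmidt(vectors):
    """Turn a linearly independent list of real vectors into an orthonormal list,
    following the construction in Theorem 3."""
    onb = []
    for v in vectors:
        # w = v - proj onto span of the orthonormal vectors built so far
        w = list(v)
        for u in onb:
            c = inner(v, u)
            w = [wk - c * uk for wk, uk in zip(w, u)]
        # w != 0 since v is not in the span of the previous vectors
        norm = math.sqrt(inner(w, w))
        onb.append([wk / norm for wk in w])
    return onb

onb = gram_schmidt([[1.0, 1.0, 0.0], [1.0, 0.0, 1.0]])
# The outputs are pairwise orthogonal with unit norm (up to rounding)
print(inner(onb[0], onb[1]), inner(onb[0], onb[0]))
```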

The infinite-dimensional case isn’t easy to analyse, but can be reasonably studied in the context of Hilbert spaces, which feature surprisingly prominently in modern physics. Doing so will require us to study functional analysis, since we need to scrutinise convergence issues more closely than in the finite-dimensional setting.

For now, let’s discuss the crucial finite-dimensional application in best-fit approximations, one of my favourite areas of study in applied mathematics—approximation theory.

—Joel Kindiak, 12 Mar 25, 1711H
