Problem 1. Let be a real square matrix. Suppose
and
. Compute
.
Solution. Since , we have
Taking determinants on both sides,
Therefore, . Since
, we have
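To illustrate the technique of taking determinants on both sides with purely hypothetical data (the hypotheses below are illustrative and not those of the problem): if an \(n \times n\) real matrix \(A\) satisfies \(A^2 = 2A\) and \(A\) is invertible, then
\[
\det(A)^2 = \det(A^2) = \det(2A) = 2^n \det(A),
\]
and since \(\det(A) \neq 0\), cancelling gives \(\det(A) = 2^n\).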
—Joel Kindiak, 9 Feb 25, 1322H
The “algebra” in linear algebra essentially refers to the kinds of transformations one can perform on vectors. Since the algebra is linear, we shall define linear transformations, which are rather familiar to us from various fields of study.
Before giving a formal definition of a linear transformation, let’s motivate the discussion through some elementary school geometry. Given a rectangle with base and height
, what is the area
of the rectangle? It would be
. This means
These properties have been used again and again, especially in calculus, since
and
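A minimal sketch of the identities presumably intended here, writing \(A(b,h) = bh\) for the area of a rectangle with base \(b\) and height \(h\): the area is linear in the base, and differentiation and integration are linear in the same sense,
\[
A(b_1 + b_2, h) = A(b_1, h) + A(b_2, h), \qquad A(cb, h) = c\,A(b, h),
\]
\[
\frac{d}{dx}\bigl(f(x) + g(x)\bigr) = \frac{d}{dx}f(x) + \frac{d}{dx}g(x), \qquad \int \bigl(f(x) + g(x)\bigr)\,dx = \int f(x)\,dx + \int g(x)\,dx.
\]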
We can now define this notion of a linear transformation in broad generality. The idea is that any result that is true of linear transformations would hold for each of these examples, be it contrived ones like , or less contrived ones like the calculus operations.
For the rest of this post, let be any set, and let
be vector spaces over a field
.
Definition 1. A function is a linear transformation if for any
and
,
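A sketch of the two defining conditions, assuming the standard formulation and using \(T \colon V \to W\), \(u, v \in V\) and \(c \in F\) as placeholder names:
\[
T(u + v) = T(u) + T(v), \qquad T(cu) = c\,T(u).
\]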
These properties mimic the closure properties of a vector space. In fact, the key idea is that we want to preserve these closure properties. If
forms a basis for
, then it turns out that
and
are not too different, allowing us to formally define dimensions. Finally, when we are dealing with the special cases
and
, we want to recover the usual notion of a matrix.
We first state a seemingly obvious yet incredibly useful result for linear transformations.
Lemma 1. A linear transformation is injective if and only if for any
,
.
Proof. For , if
is injective, then
implies
, as required. On the other hand, for
, for vectors
, suppose
. Then
Hence, , so that
is injective, as required.
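The key computation in the backward direction, sketched with placeholder names \(T \colon V \to W\) and \(u, v \in V\): if \(T(u) = T(v)\), then linearity gives
\[
T(u - v) = T(u) - T(v) = 0_W,
\]
so the hypothesis forces \(u - v = 0_V\), that is, \(u = v\).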
Let’s start with a question that sounds obvious but requires more thought: Is the constant a function? Well, for any
, we can define the corresponding function
by
, where
for any
. Therefore, we can define a linear injection between the two vector spaces.
Lemma 2. Define the function by the map
, where
for any
. Then
is a linear transformation and is injective.
Proof. For linearity, we observe that
Therefore, . Similarly,
, establishing linearity. For injectivity, we notice that
implies that for any
,
, so that
, as required.
Recall that given any function ,
. In particular, if
is injective, then we can regard
as a subset via the imbedding
. Contextualising to Lemma 2, we can regard
as a subspace. Therefore, constants are as good as functions, via the imbedding
.
Using techniques in real analysis, we can prove that the collection such that
is a subspace, with the property that
For elements , we write
. We can use this definition to extend the calculus-based subsets of functions.
For any function and
, define
to mean
Lemma 3. If and
, then
.
Proof. Since is a subspace,
Henceforth, we will denote as the unique limit such that
.
Theorem 1. Define the subset of functions
Then
as subspaces and the map defined by
is a well-defined linear transformation.
Proof. We first remark that . Hence
implies that
, which yields
so that . Similarly,
, so that
as a subspace. With
,
. Finally, the uniqueness of limits yields
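A sketch of the limit laws being invoked, writing \(L(f)\) for the limit of \(f\) (placeholder notation): for functions \(f, g\) in this subspace and any scalar \(c\),
\[
L(f + g) = L(f) + L(g), \qquad L(cf) = c\,L(f),
\]
which is precisely the statement that \(L\) is a linear transformation.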
Having defined limits at , we can generalise to limits at any other point
. In fact, we don’t need to do a lot of hard work to even show that the set of functions with limits at
exists; we’ll just transport the subspace property in the following manner:
Theorem 2. Let be a linear transformation. Then the range of
defined by
is a subspace of .
Proof. For additivity, if , then
yields
Similarly, for any
, thus
as a subspace.
Lemma 4. Let be a linear transformation.
Proof. Exercise.
Corollary 1. For any , define the subset of functions
Then as a subspace and the map
defined by
is a well-defined linear transformation.
Proof. Define the linear transformation by
. Then
is clearly bijective so that
as a subspace.
Furthermore
is a linear transformation.
In most calculus courses, limits are applied to discuss continuous functions, in the following manner: is continuous at
if and only if
In other words, is not all that is required; we also require
We could define this property then manually prove that the set of functions continuous at is indeed a subspace of
. However, we have an even more efficient tool up our linear algebraic sleeves.
Theorem 3. Let be a linear transformation. Then the kernel of
defined by
is a subspace of .
Proof. For additivity, if , then
so that . Similarly,
for any
, thus
as a subspace.
Corollary 2. For any , define the subset of functions continuous at
by
Then for any ,
as subspaces. Here denotes the set of functions that are continuous on
.
Proof. Since arbitrary intersections of subspaces are again subspaces, it remains to prove that as a subspace (as a subset, this is obvious). To that end, define the linear transformation
by
Then Theorem 3 establishes the result.
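A minimal sketch of why Theorem 3 applies, with placeholder notation: continuity at \(a\) can be packaged as membership in a kernel, since
\[
f \text{ is continuous at } a \iff \lim_{x \to a} f(x) - f(a) = 0,
\]
so the functions continuous at \(a\) form the kernel of the linear map \(f \mapsto \lim_{x \to a} f(x) - f(a)\), defined on the functions having a limit at \(a\).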
Corollary 3. For any , define the subset of functions differentiable at
by
Then for any ,
as subspaces. Here denotes the set of functions that are differentiable on
. Furthermore, the functions
and
defined by
are linear transformations.
Proof. To prove that requires the use of several limit laws in calculus. For the subspace property, define the injective linear transformation
by
It is not hard to verify that as a subspace. Thus,
is a bijective linear transformation, so that
as a subspace. Finally,
as a linear transformation, and
as a linear transformation.
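For concreteness, the linearity being claimed of differentiation is the familiar pair of derivative rules, sketched here with \(D\) as a placeholder name for the map \(f \mapsto f'\):
\[
D(f + g) = f' + g' = D(f) + D(g), \qquad D(cf) = c f' = c\,D(f).
\]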
Of course, these ideas extend even to integral calculus, which includes the generalisation of integral transforms, whose special cases include the Laplace transform and the Fourier transform. However, establishing the existence of these objects requires new ideas in Lebesgue integration (and even in Riemann integration we don’t get a sufficiently big picture). Thus, we shall defer those discussions to if, and when, we get there.
Next up: differentiating polynomials which causes us to touch base, pun intended, with bases for vector spaces once again.
—Joel Kindiak, 26 Feb 25, 2344H
Let be a vector space over a field
. We have previously seen that for vectors
,
Furthermore, if , we require at least one vector, namely
, to cook up the space
.
We are tempted then to think that requires two minimum ingredients. However, we have seen that this does not always hold. For instance, suppose
contains some
. Define
and
. Then
yields
which requires only one ingredient. The key observation is that . That is,
doesn’t truly increase
at all. We can consolidate by making the following observation.
Lemma 1. Let be sets of vectors. Suppose
. Then
as subspaces. Furthermore, if , then
Proof. Firstly, , so that
contains
. Since
is the smallest subspace containing
, we must have
If , then
implies
Hence, .
The main point of the equality is that for any excess vectors , if these vectors belong to
, then
doesn’t add any new information to
, and thus does not increase the minimum required number of ingredients to generate the space.
Recall our example where both vectors are nonzero. We have seen that if
, then
What if ? Then there are no scalars
such that
. In fact, something stronger happens.
Lemma 2. Let be nonzero vectors. Then
if and only if for any
,
Proof. We first prove . Fix scalars
such that
By algebra, . If
, then we have
and we are done. Otherwise,
, a contradiction. Therefore,
necessarily.
We next prove by contrapositive. Suppose
. Then there exists some
such that
. If
then
, a contradiction. Hence,
. Therefore,
Setting and
,
and yet is not true.
It is this latter condition that we will define as linear independence.
Definition 2. The finite set is called linearly independent if for any
,
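A sketch of the defining condition, assuming the standard formulation with placeholder vectors \(v_1, \dots, v_n\) and scalars \(c_1, \dots, c_n \in F\):
\[
c_1 v_1 + \cdots + c_n v_n = 0 \implies c_1 = c_2 = \cdots = c_n = 0.
\]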
It is not hard to see that nonempty finite subsets of are linearly independent as well. A singleton set consisting of a nonzero vector is linearly independent too.
We use this property to generalise to infinite sets. For any set , we say that
is linearly independent if every nonempty finite subset of
is linearly independent. We say that
is linearly dependent if it is not linearly independent.
Corollary 1. Let be nonzero vectors. Then
if and only if
is linearly independent.
Roughly speaking, a set is linearly independent if it contributes
pieces of information to its span, with no redundancy. This count is what we define to be the dimension of a span.
However, let’s first return to mathematical earth and discuss . Intuitively,
ought to have
dimensions. This is true.
Example 3. For each , define
, which means
Then is linearly independent, and
Proof. For linear independence, fix scalars such that
Then for any ,
Therefore, , so that
is linearly independent. For the spanning property, fix
, where
for each
. Then
This yields . On the other hand, since
and
is the smallest vector space containing
, we automatically have
. Therefore,
We call a basis for
, and can generalise the idea to other vector spaces.
Definition 3. We call a basis for
if
is linearly independent and
.
Theorem 1. Suppose is a basis for
. Then for any vector
, there exist unique scalars
such that
Proof. Existence is immediate from . For uniqueness, suppose there are two representations
Subtracting on both sides,
Since is linearly independent,
for each
, so that the “recipe” that creates
is unique.
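A small worked example of this uniqueness, in \(\mathbb{R}^2\) with the (hypothetical) basis \(\{(1,0), (1,1)\}\): writing \((3,5) = a(1,0) + b(1,1)\) forces
\[
(3, 5) = (a + b,\; b) \implies b = 5,\; a = -2,
\]
so the only recipe is \((3,5) = -2\,(1,0) + 5\,(1,1)\).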
Corollary 2. Suppose is a basis for
. Define the map
in the following manner:
for any
and for any
and
,
Then is a well-defined bijection. In fact,
is a linear transformation that maps the dish
to the recipe
that is required to “cook” it.
In a sense, any vector space that only requires a minimum of ingredients to cook all dishes is essentially the same as the vector space
. We can formalise this idea using isomorphisms, which is our next topic of discussion.
—Joel Kindiak, 24 Feb 0007H
Let be any set and
be any field. Recall that the function space
forms a vector space over
. Furthermore, given any vector space
over
, the function space
forms a vector space over
. In this manner, we can create many vector spaces.
But even if we restrict our attention to just one vector space over
, we can create many, many vector spaces.
For any , define
It should seem intuitive that forms a vector space over
. In this case, we would call
a subspace of
.
Definition 1. We say that is a subspace of
if
forms a vector space over
.
However, this definition requires us to verify all 8 or more conditions of a vector space—this would be an incredibly arduous task. Is there a short-cut to determine this result? Thankfully, the answer is yes.
Theorem 1. For any ,
is a subspace of
if and only if the following three conditions are satisfied:
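A sketch of the three conditions, assuming the usual criterion for a subset \(W \subseteq V\): (i) \(0_V \in W\); (ii) for any \(u, v \in W\), \(u + v \in W\); (iii) for any \(c \in F\) and \(u \in W\), \(cu \in W\).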
Proof Sketch. Apart from these closure properties, all other properties are inherited directly from the ambient vector space.
Example 1. For any ,
is a subspace of
.
Proof. We verify the three conditions of Theorem 1, and suppose for nontriviality.
Therefore, is a subspace of
. In fact, if
, then
is isomorphic to
as vector spaces (more on isomorphisms in a future post). We call
a
-dimensional subspace of
.
Theorem 2. Let be subspaces. Then
is a subspace of
(and similarly,
).
Proof. We verify the three identities.
In general, does not form a vector space. For a concrete example, recall that
. For each
, define
. Then clearly,
, but
.
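A concrete instance of this failure (a sketch; the original example may differ): in \(\mathbb{R}^2\), take the two axes
\[
W_1 = \{(x, 0) : x \in \mathbb{R}\}, \qquad W_2 = \{(0, y) : y \in \mathbb{R}\}.
\]
Then \((1,0) \in W_1\) and \((0,1) \in W_2\), but \((1,0) + (0,1) = (1,1) \notin W_1 \cup W_2\), so the union is not closed under addition.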
The correct generalisation would be direct sum. In fact, we can characterise subspaces in terms of sums of subsets.
Theorem 3. For subsets , and
, define the subsets
Then is a subspace of
if and only if
and
. Furthermore, if
and
are subspaces of
, then
is a subspace of
.
Proof. It is not hard to see that if are subspaces of
, then
Furthermore, .
Let be a vector space and
be subspaces. Then any subspace
containing
must contain
. In that sense,
is the smallest subspace that contains
.
In particular, given nonzero vectors ,
will be the smallest vector space that contains
. In fact, more is true.
Lemma 1. For any subspace such that
,
Hence is the smallest vector space that contains
.
More generally, if is just a set, then
is a subspace of
that contains
. The smallest subspace that contains
will be the intersections of all subspaces that contain
. This is called the span of
.
Theorem 4. Let . Let
denote the collection of subspaces of
that contain
. The span of
is then defined by
Then is the smallest subspace of
that contains
.
Proof. Exercise.
If is a finite set, then we can write
as a combination of one-dimensional subspaces.
Theorem 5. For vectors , then
Proof. For the case , the equality
holds since both sides of the proposed equality are the smallest subspace containing . Apply induction together with Lemma 1 to obtain the desired result.
In fact, we can even take the span of the empty set.
Example 2. .
Proof. We leave it as an exercise to verify that is a subspace that contains
. On the other hand, any subspace that contains
just refers to any subspace, which must contain
. Hence,
.
Finally, we obtain the usual definition for spans of sets in a vector space.
Corollary 1. For vectors ,
Intuitively, any vector belongs to
if we can “cook”
up by combining some recipe
with the ingredients:
where denotes the amount of the ingredient
used to cook up
.
It might be tempting therefore to assume that subspaces of the form require all
ingredients
to generate. However, that is not always true. Observe that
implies
so that only requires one vector instead of two. How do we know if we have hit the lowest possible number of ingredients? We need a fundamental tool called linear independence, which is our next topic of discussion.
—Joel Kindiak, 22 Feb 25, 2226H
Let’s talk linear algebra. This subject involves two key words: linear—referring to some nice vector-ish objects and related properties, and algebra—the manipulations and transformations we can perform on said vector-ish properties.
For an introduction to the topic, we will discuss 2D vectors. But we shall not (and will not) shy away from its more exciting abstractions.
Throughout this post, let denote any set and
denote any field, which, roughly speaking, refers to any set where addition, subtraction, multiplication, and division are sufficiently well-defined.
Definition 1. The two-dimensional -space is defined to be
where we will denote the ordered pairs in column notation. In particular, denotes the two-dimensional real space that we all know and love.
Very soon we will discuss ideas in much broader generality. But perhaps to motivate the subject, we can recall our usual vector operations that correspond to two-dimensional vectors used in high school physics.
Definition 2. Define addition and scalar multiplication on via
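A sketch of the intended operations, assuming the usual componentwise definitions in column notation:
\[
\begin{pmatrix} a \\ b \end{pmatrix} + \begin{pmatrix} c \\ d \end{pmatrix} = \begin{pmatrix} a + c \\ b + d \end{pmatrix}, \qquad \lambda \begin{pmatrix} a \\ b \end{pmatrix} = \begin{pmatrix} \lambda a \\ \lambda b \end{pmatrix}.
\]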
We expect these objects to behave like the vectors that we are familiar with, that in essence, encode directed distance. We call the set of these vectors a vector space.
Theorem 1. Let . Then
satisfies the following additive properties:
In this case, we call a group under
. In addition, we can add vectors in either order (this is the commutativity property):
In this case, we call an abelian group under
. In addition,
satisfies the following scaling properties:
In this case, we call a vector space over
.
Proof Sketch. The proof is a matter of definition-checking. Nevertheless, we will complete some proofs to illustrate some of the techniques being used.
For the second property, we take advantage of the associativity of in
:
For the third property, we define and check that it satisfies the required equations. For the fourth property, we define
and do the necessary bookkeeping.
Notice that this idea is not unique to . It could apply to many, many other sets, as we are about to see.
Lemma 1. For any field ,
forms a vector space over
. In particular,
forms a vector space over
.
Arguably the most important instances of vector spaces would be the function spaces. These spaces don’t always share all of the same properties as , but when they do, they share these properties in a beautifully unified manner.
Theorem 2. For any vector space over
, let
denote the set of
-valued functions on
. Define addition and scalar multiplication according to the vector space structure of
:
For any and
,
Then forms a vector space over
with additive identity
defined by
and for any , additive inverse
defined by
In particular, forms a vector space over
.
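A sketch of the pointwise formulas presumably intended, for \(f, g \in V^X\), \(c \in F\) and \(x \in X\):
\[
(f + g)(x) = f(x) + g(x), \qquad (cf)(x) = c\,f(x), \qquad 0_{V^X}(x) = 0_V, \qquad (-f)(x) = -f(x).
\]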
It is this last example that we want to emphasise as the twin brother of .
Theorem 3. For any , define
by
Then the function defined by
satisfies the following property:
For any and
,
In this case, we call a linear transformation. In addition,
is bijective, and we call
a vector space isomorphism. Therefore, we can write
without ambiguity.
Proof Sketch. The proof is immediate after we recognise that for each ,
which implies that
The bijectivity of is easily verifiable.
This connection allows us to define -space as a function space.
Definition 3. For any vector space over
and
, we define the vector space
, which is a vector space over
. In particular,
, of which
is a special case.
We insist on defining the vector space , since we can make remarkable connections with other areas of mathematics, as we will see in the next post.
—Joel Kindiak, 19 Feb 25, 2233H
Throughout this post, let be vector spaces over some field
and
be a linear transformation.
Problem 1. For any subspace , prove that
is a subspace of
. In particular, the range
of
is a subspace of
.
Solution. For any and for any scalar
,
Problem 2. For any subspace , prove that
is a subspace of
. In particular, the kernel
of
is a subspace of
.
Solution. For any ,
. Since
is a subspace of
, for any scalar
,
Problem 3. For any , prove that
if and only if
.
Solution. We have .
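A sketch of the standard equivalence underlying this kind of statement, with placeholder names \(T\), \(u\), \(v\):
\[
T(u) = T(v) \iff T(u) - T(v) = 0 \iff T(u - v) = 0 \iff u - v \in \ker(T).
\]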
Problem 4. For subspaces , prove that
is a subspace of . Also, for any
, prove that
is a subspace of
.
Solution. For any ,
Similarly, for any scalar ,
The proof for is similar.
Problem 5. For subspaces , prove that
where equality holds if is surjective.
Solution. For the first claim, to prove , we note that for any
,
,
so that .
To prove when
is surjective, fix
. Then
. Hence, there exists
such that
.
Since is surjective, find
such that
. Then
. By Problem 3,
Hence, .
Problem 6. For any scalar and subspace
, prove that
.
Solution. We make the quick observation that
for any vector subspace . This means for any
,
where for brevity. The result follows by bookkeeping.
—Joel Kindiak, 26 Jan 25, 1826H
Problem 1. Let be a vector space over
. Prove that if
has at least one nonzero element, then
has infinitely many elements.
Solution. Suppose contains some nonzero
. Since
is a vector space over
,
contains
for any
. Each
yields a unique
since
. Thus,
contains an infinite number of elements given by the injection
.
Problem 2. Let be vector spaces over
, and
be a linear transformation. For any
, prove that the equation
either has no solutions, exactly one solution, or infinitely many solutions.
Solution. If the equation has no solutions, then we are done. Otherwise, it has at least one solution
, which yields
.
If the equation has at most one solution, then we are done. Otherwise, the equation has at least one other solution , which yields
. Since
is linear,
For any , consider the vector
. Each
yields a unique
since
. On the other hand
Thus, the equation has infinitely many solutions defined by
.
—Joel Kindiak, 30 Nov 24, 0112H
Problem 1. Let be a vector space and
be a linear transformation. Suppose
, where we denote
for brevity. Prove that
.
Solution. Under the hypothesis that , we have
.
We will first prove . Fix
. Find
such that
. Then
so that . Hence,
.
Now, we prove . Fix
. Then
By repeated application of this result,
Hence, .
Combining both results, .
—Joel Kindiak, 29 Nov 24, 2130H
I couldn’t solve this linear algebra question as an undergraduate. Today, we revisit it and defeat it once and for all.
Problem 1. Let be a finite-dimensional (complex) vector space equipped with the inner product
. Let
be an invertible linear operator. Suppose that for any
,
Prove that is a scalar multiple of some unitary operator.
It is not hard to show that the converse holds (i.e. if is a scalar multiple of a unitary operator then the equation holds).
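As a sketch of why the converse direction is easy, under the assumption (mine, for illustration) that the hypothesis asserts \(T\) preserves orthogonality: if \(T = cU\) with \(U\) unitary, then for any \(u, v\),
\[
\langle Tu, Tv \rangle = \langle cUu, cUv \rangle = |c|^2 \langle Uu, Uv \rangle = |c|^2 \langle u, v \rangle,
\]
so \(\langle u, v\rangle = 0\) forces \(\langle Tu, Tv\rangle = 0\).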
In the original exam paper, the examiner actually offered sub-steps for the question. The solution below will therefore follow the suggested roadmap and, furthermore, address the result more directly.
Solution. We first observe that implies that
is self-adjoint, therefore normal, and thus unitarily diagonalisable.
Thus, there exists an orthonormal basis for
such that
, where each
.
We observe that
In particular,
so that . Thus, substituting
into the equation
and simplifying,
Simplifying for any such that
,
Since was chosen arbitrarily, we must have
so that
It is not hard to show that this equation holds for any . Thus,
is unitary, and
, as required.
—Joel Kindiak, 19 Oct 24, 0150H