Having discussed vector spaces and linear transformations in sufficient generality, we are finally ready to discuss matrices: the beefed-up brother to vectors.
Let $\mathbb{F}$ denote a field and $\{e_1, e_2, \dots, e_n\}$ denote the standard (ordered) basis for $\mathbb{F}^n$. Here, we will distinguish the ordered bases $(e_1, e_2, \dots, e_n)$ and, say, $(e_2, e_1, \dots, e_n)$ since the elements are sequenced in different orders.
Definition 1. For positive integers $m$ and $n$, an $m \times n$ matrix over $\mathbb{F}$ is a rectangular array of elements in $\mathbb{F}$. We use capital letters $A, B, C, \dots$ to denote matrices and small letters $a_{ij}, b_{ij}, c_{ij}, \dots$ to denote the entry in the $i$-th row and $j$-th column:
$$A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix}.$$
We will let $M_{m \times n}(\mathbb{F})$ denote the collection of such matrices. Rather obviously, we denote $A = B$ to mean $a_{ij} = b_{ij}$ for any $1 \le i \le m$ and $1 \le j \le n$.
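For instance, taking $\mathbb{F} = \mathbb{R}$ for concreteness,
$$A = \begin{pmatrix} 1 & 0 & -2 \\ 3 & 5 & 4 \end{pmatrix}$$
is a $2 \times 3$ matrix over $\mathbb{R}$, i.e. an element of $M_{2 \times 3}(\mathbb{R})$, with entries such as $a_{12} = 0$, $a_{21} = 3$ and $a_{23} = 4$.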
This is where our discussion on matrices will diverge from the usual progression of defining the various operations and then computing several examples. We will not adopt that approach. Instead, we will interpret matrices, fundamentally, as linear transformations. Then, using established ideas about linear transformations, we will recover the natural definitions of the matrix operations.
For vector spaces $V$ and $W$ over $\mathbb{F}$, let $\mathcal{L}(V, W)$ denote the collection of linear transformations from $V$ to $W$.
Theorem 1. Let $m, n$ be positive integers. For each $T \in \mathcal{L}(\mathbb{F}^n, \mathbb{F}^m)$, define the matrix of $T$ to be
$$[T] := (a_{ij}) \in M_{m \times n}(\mathbb{F}), \quad \text{where} \quad T(e_j) = \sum_{i=1}^m a_{ij} e_i \quad \text{for each } 1 \le j \le n.$$
Then the map $\Phi : \mathcal{L}(\mathbb{F}^n, \mathbb{F}^m) \to M_{m \times n}(\mathbb{F})$ defined by
$$\Phi(T) = [T]$$
is a well-defined bijection.
Proof. That $\Phi$ is well-defined follows from $T$ being a function: each $T(e_j)$ is a single vector in $\mathbb{F}^m$, and its coordinates $a_{ij}$ relative to the standard basis are unique. For surjectivity, fix $A \in M_{m \times n}(\mathbb{F})$. Define
$$T(e_j) := \sum_{i=1}^m a_{ij} e_i \quad \text{for each } 1 \le j \le n,$$
and extend it to a linear transformation $T : \mathbb{F}^n \to \mathbb{F}^m$ by linearity. Then
$$\Phi(T) = A.$$
As of now, we have not shown that $M_{m \times n}(\mathbb{F})$ forms a vector space over $\mathbb{F}$. Thus, we cannot use the property
$$\Phi(T) = 0 \implies T = 0$$
to prove that $\Phi$ is injective. (Indeed, what do we mean by the $0$ on the right-hand side of $\Phi(T) = 0$, when $M_{m \times n}(\mathbb{F})$ has not yet been given a zero vector?) Thankfully, we can still use the vanilla definition of injectivity with relative ease.
For injectivity, fix $T, U \in \mathcal{L}(\mathbb{F}^n, \mathbb{F}^m)$ and suppose
$$\Phi(T) = \Phi(U).$$
Then for each $1 \le j \le n$, $T(e_j) = U(e_j)$, since both are read off the $j$-th column of $\Phi(T) = \Phi(U)$. Therefore $T$ and $U$ agree on the standard basis. Since the extension to $\mathbb{F}^n$ is unique, we have $T = U$.
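To see the correspondence in action, consider for instance the linear transformation $T : \mathbb{R}^2 \to \mathbb{R}^2$ given by $T(x, y) = (x + 2y,\ 3y)$. Then $T(e_1) = (1, 0) = 1\,e_1 + 0\,e_2$ and $T(e_2) = (2, 3) = 2\,e_1 + 3\,e_2$, so
$$\Phi(T) = \begin{pmatrix} 1 & 2 \\ 0 & 3 \end{pmatrix},$$
whose $j$-th column records the coordinates of $T(e_j)$.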
Since $\Phi$ is bijective, $\Phi^{-1}$ is also a bijection. This means that we can create the vector space structure in $M_{m \times n}(\mathbb{F})$ using the vector space structure in $\mathcal{L}(\mathbb{F}^n, \mathbb{F}^m)$.
Corollary 1. Let $A, B \in M_{m \times n}(\mathbb{F})$ be matrices and $c \in \mathbb{F}$ be a scalar. Using the transformation $\Phi$ in Theorem 1, addition and scalar multiplication are defined by
$$A + B := \Phi\big(\Phi^{-1}(A) + \Phi^{-1}(B)\big) = (a_{ij} + b_{ij}) \quad \text{and} \quad cA := \Phi\big(c\,\Phi^{-1}(A)\big) = (c\,a_{ij}).$$
In particular, we have the additive identity $O$, the matrix with every entry $0$, and for any $A \in M_{m \times n}(\mathbb{F})$, the additive inverse $-A = (-a_{ij})$.
Proof. Define $T_A = \Phi^{-1}(A)$ and $T_B = \Phi^{-1}(B)$. Then by an exercise, matrix addition is defined by
$$A + B := \Phi(T_A + T_B).$$
For any $1 \le j \le n$, we use the vector space structure of $\mathcal{L}(\mathbb{F}^n, \mathbb{F}^m)$ to compute
$$(T_A + T_B)(e_j) = T_A(e_j) + T_B(e_j) = \sum_{i=1}^m a_{ij} e_i + \sum_{i=1}^m b_{ij} e_i = \sum_{i=1}^m (a_{ij} + b_{ij}) e_i.$$
Therefore, the $(i, j)$-th entry of $A + B$ is $a_{ij} + b_{ij}$. In more concise terms,
$$A + B = (a_{ij} + b_{ij}).$$
Scalar multiplication is derived similarly using the definition $cA := \Phi(c\,T_A)$.
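Concretely, over $\mathbb{R}$ say,
$$\begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} + \begin{pmatrix} 0 & -1 \\ 5 & 2 \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 8 & 6 \end{pmatrix} \quad \text{and} \quad 3 \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} = \begin{pmatrix} 3 & 6 \\ 9 & 12 \end{pmatrix},$$
exactly as the entrywise formulas suggest.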
This construction also strengthens $\Phi$ into a vector space isomorphism.
Corollary 2. $M_{m \times n}(\mathbb{F}) \cong \mathcal{L}(\mathbb{F}^n, \mathbb{F}^m)$.
The key point worth highlighting from Corollary 2 is this:
Matrices are effectively the same as linear transformations from $\mathbb{F}^n$ to $\mathbb{F}^m$ with respect to the standard ordered basis.
In layperson's terms, the key point is to think of matrices not as arbitrary tables of numbers, but as encodings of linear transformations. In this regard, the definition of matrix multiplication and its properties become trivial consequences of composing linear transformations.
Definition 2. Let $A \in M_{m \times n}(\mathbb{F})$ and $B \in M_{n \times p}(\mathbb{F})$. Define matrix multiplication by
$$AB := \Phi(T_A \circ T_B) \in M_{m \times p}(\mathbb{F}),$$
where $T_A = \Phi^{-1}(A)$, $T_B = \Phi^{-1}(B)$, and $\Phi$ denotes the bijection of Theorem 1 for the appropriate dimensions.
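Even before we derive the familiar entry-by-entry formula, this definition already lets us compute products. For instance, over $\mathbb{R}$, take
$$A = \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix}, \qquad B = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.$$
Then $T_B(e_1) = e_2$ and $T_B(e_2) = e_1$, while $T_A(e_1) = e_1$ and $T_A(e_2) = 2e_1 + e_2$. Hence $(T_A \circ T_B)(e_1) = T_A(e_2) = 2e_1 + e_2$ and $(T_A \circ T_B)(e_2) = T_A(e_1) = e_1$, so
$$AB = \Phi(T_A \circ T_B) = \begin{pmatrix} 2 & 1 \\ 1 & 0 \end{pmatrix}.$$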
Corollary 3. Let $A \in M_{m \times n}(\mathbb{F})$, $B \in M_{n \times p}(\mathbb{F})$, $C \in M_{p \times q}(\mathbb{F})$. Then
$$(AB)C = A(BC).$$
Proof. By the associativity of function composition,
$$(T_A \circ T_B) \circ T_C = T_A \circ (T_B \circ T_C).$$
Therefore,
$$(AB)C = \Phi\big((T_A \circ T_B) \circ T_C\big) = \Phi\big(T_A \circ (T_B \circ T_C)\big) = A(BC).$$
Lemma 1. Let $A \in M_{m \times n}(\mathbb{F})$. For any $x \in \mathbb{F}^n$, regarded as an $n \times 1$ matrix, $T_A(x) = Ax$. In particular,
$$Ax = \sum_{j=1}^n x_j \begin{pmatrix} a_{1j} \\ \vdots \\ a_{mj} \end{pmatrix}.$$
Proof. To prove $T_A(x) = Ax$, we first note that $T_x = \Phi^{-1}(x) : \mathbb{F} \to \mathbb{F}^n$ is defined by $T_x(1) = x$, then extended by linearity since $\{1\}$ is the standard basis for $\mathbb{F}^1 = \mathbb{F}$. By definition of matrix multiplication,
$$Ax = \Phi(T_A \circ T_x) \quad \text{and} \quad (T_A \circ T_x)(1) = T_A(x),$$
so the column vector $Ax$ records precisely the coordinates of $T_A(x)$, i.e. $T_A(x) = Ax$. For the matrix representation, find unique scalars $x_1, \dots, x_n$ such that
$$x = \sum_{j=1}^n x_j e_j.$$
Then
$$T_A(x) = \sum_{j=1}^n x_j T_A(e_j) = \sum_{j=1}^n x_j \begin{pmatrix} a_{1j} \\ \vdots \\ a_{mj} \end{pmatrix}.$$
As a side-note, Lemma 1 gives us a rather comical yet useful analogy for left-multiplying the matrix $A$ with the vector $x$. We can think of each column $T_A(e_j)$ as an “ingredient”, and $x$ as a “recipe”, which tells us to add up $x_j$ “units” of each “ingredient” $T_A(e_j)$. Then $Ax$ gives us the “dish” that we created by combining the recipe with the ingredients.
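For example, with $\mathbb{F} = \mathbb{R}$, $A = \begin{pmatrix} 1 & 0 & -2 \\ 3 & 5 & 4 \end{pmatrix}$ and $x = (2, 1, 1)$, the recipe $x$ tells us to take $2$ units of the first column and $1$ unit each of the second and third:
$$Ax = 2\begin{pmatrix} 1 \\ 3 \end{pmatrix} + 1\begin{pmatrix} 0 \\ 5 \end{pmatrix} + 1\begin{pmatrix} -2 \\ 4 \end{pmatrix} = \begin{pmatrix} 0 \\ 15 \end{pmatrix}.$$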
Corollary 4. Let $A \in M_{m \times n}(\mathbb{F})$ and $B \in M_{n \times p}(\mathbb{F})$. Then for any vector $x \in \mathbb{F}^p$,
$$(AB)x = A(Bx).$$
In particular,
$$AB = \begin{pmatrix} Ab_1 & Ab_2 & \cdots & Ab_p \end{pmatrix},$$
where $b_j$ denotes the $j$-th column of $B$. Furthermore, for $1 \le i \le m$ and $1 \le j \le p$,
$$(AB)_{ij} = \sum_{k=1}^n a_{ik} b_{kj}.$$
Proof. The associativity of matrix multiplication follows from Corollary 3, regarding $x$ as a $p \times 1$ matrix. The $(i, j)$-entry of $AB$ refers to the $i$-th coordinate of $(AB)e_j$, which we evaluate by the first part to equal
$$(AB)e_j = A(Be_j) = Ab_j = A\left(\sum_{k=1}^n b_{kj} e_k\right) = \sum_{k=1}^n b_{kj}\,(Ae_k) = \sum_{k=1}^n b_{kj} \sum_{i=1}^m a_{ik} e_i = \sum_{i=1}^m \left( \sum_{k=1}^n a_{ik} b_{kj} \right) e_i,$$
where the expansion in the fourth “$=$” works thanks to Lemma 1. In particular, the $i$-th coordinate is $\sum_{k=1}^n a_{ik} b_{kj}$, as required.
The final result is the conventional definition for matrix multiplication, but the intermediate result tends to be more intuitive and useful, since it can be interpreted using our recipe-ingredient analogy in the following manner:
The matrix $B$ encodes the $p$ different recipes, with the $j$-th column $b_j$ denoting the $j$-th recipe. Since we have $p$ recipes, we want to cook $p$ dishes. For the $j$-th dish, we use the $j$-th recipe with the ingredients in $A$ to obtain the dish $Ab_j$, which is the $j$-th column in $AB$. Therefore, $AB$ encodes the $p$ different dishes:
$$AB = \begin{pmatrix} Ab_1 & Ab_2 & \cdots & Ab_p \end{pmatrix}.$$
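As a quick illustration, take
$$A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}, \qquad B = \begin{pmatrix} 1 & 0 \\ 1 & 2 \end{pmatrix},$$
so that $b_1 = (1, 1)$ and $b_2 = (0, 2)$. The two dishes are
$$Ab_1 = 1\begin{pmatrix} 1 \\ 3 \end{pmatrix} + 1\begin{pmatrix} 2 \\ 4 \end{pmatrix} = \begin{pmatrix} 3 \\ 7 \end{pmatrix}, \qquad Ab_2 = 0\begin{pmatrix} 1 \\ 3 \end{pmatrix} + 2\begin{pmatrix} 2 \\ 4 \end{pmatrix} = \begin{pmatrix} 4 \\ 8 \end{pmatrix},$$
so $AB = \begin{pmatrix} 3 & 4 \\ 7 & 8 \end{pmatrix}$, in agreement with the entrywise formula from Corollary 4.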
Therefore, many of our desired matrix operations arise from regarding matrices as linear transformations, including matrix multiplication, which encodes the composition of linear transformations. As such, it should make intuitive sense that, more often than not, $AB \neq BA$ for matrices $A$ and $B$ of compatible sizes, since function composition is likewise in general not commutative, i.e. $T \circ U = U \circ T$ is a very rare sighting.
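Indeed, taking for instance
$$A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \qquad B = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix},$$
one finds
$$AB = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \neq \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} = BA.$$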
We could wonder to ourselves: are there infinite-dimensional matrices? Once again, if we interpret matrices as linear transformations, then, given that infinite-dimensional vector spaces do exist, it wouldn’t be too far-fetched to rigorously formulate infinite-dimensional matrices, albeit with more care than in the finite-dimensional case.
But perhaps let’s at least posit a more immediate question. If there are inverses of bijective linear transformations $T : \mathbb{F}^n \to \mathbb{F}^n$, could there be inverses of matrices $A \in M_{n \times n}(\mathbb{F})$? Intuitively, we would like the inverse matrix to correspond to the inverse transformation:
$$A^{-1} := \Phi\big(T_A^{-1}\big).$$
This idea turns out to work. In fact, since $T_A \circ T_A^{-1} = \mathrm{id}_{\mathbb{F}^n} = T_A^{-1} \circ T_A$, this idea gives us a nice commutative property
$$AA^{-1} = A^{-1}A.$$
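As a quick sanity check of this intuition, take for instance $A = \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix}$, so that $T_A(x, y) = (x + 2y,\ y)$. The inverse transformation is $(x, y) \mapsto (x - 2y,\ y)$, whose matrix is $\begin{pmatrix} 1 & -2 \\ 0 & 1 \end{pmatrix}$, and indeed
$$\begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} 1 & -2 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & -2 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix},$$
the matrix of the identity transformation.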
But we will explore this idea in more detail next time. In particular, invertible matrices will help us formalise the otherwise mundane algorithm known as Gaussian elimination.
—Joel Kindiak, 28 Feb 25, 1825H