Given two random variables $X, Y$, what is the distribution of $X + Y$? We could define this for discrete random variables, and perhaps continuous ones as well. But let’s go back to measure theory and define the joint distribution of $(X, Y)$ as rigorously as possible.
Let $(\Omega, \mathcal{F}, \mu)$ be a measure space (or a probability space if $\mu = \mathbb{P}$ is a probability measure). Equip $\mathbb{R}^2$ with the Borel $\sigma$-algebra $\mathcal{B}(\mathbb{R}^2)$, generated by open balls under the Euclidean metric.
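Lemma 1. Let $(X, Y) : \Omega \to \mathbb{R}^2$ be a random variable. Then $X + Y : \Omega \to \mathbb{R}$ defined by $(X + Y)(\omega) := X(\omega) + Y(\omega)$ is a random variable.
Proof. The map $s : \mathbb{R}^2 \to \mathbb{R}$, $s(x, y) := x + y$, is continuous, and therefore has open sets as pre-images of open sets. Therefore, $s$ is $\mathcal{B}(\mathbb{R}^2)$/$\mathcal{B}(\mathbb{R})$-measurable, and the composition $X + Y = s \circ (X, Y)$ is $\mathcal{F}$/$\mathcal{B}(\mathbb{R})$-measurable.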
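What would be a reasonable measure on $\mathbb{R}^2$? Intuitively, we should have a measure $\lambda_2$ on $\mathcal{B}(\mathbb{R}^2)$ such that $\lambda_2([a, b] \times [c, d]) = \lambda([a, b]) \cdot \lambda([c, d]) = (b - a)(d - c)$, where $\lambda$ denotes the usual Lebesgue measure that we painstakingly constructed. In fact, more generally, given measure spaces $(\Omega_i, \mathcal{F}_i, \mu_i)$ for $i = 1, \dots, n$, we would like to define a reasonable $\sigma$-algebra $\mathcal{F}$ on $\Omega_1 \times \cdots \times \Omega_n$ and a measure $\mu$ on $\mathcal{F}$ such that
$$\mu(A_1 \times \cdots \times A_n) = \mu_1(A_1) \cdots \mu_n(A_n) \quad \text{for all } A_i \in \mathcal{F}_i.$$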
It turns out that with the help of Carathéodory’s extension theorem, this task isn’t as Sisyphean as it seems.
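Theorem 1. Given measure spaces $(\Omega_i, \mathcal{F}_i, \mu_i)$ for $i = 1, \dots, n$, there exists a $\sigma$-algebra $\mathcal{F}$ on $\Omega_1 \times \cdots \times \Omega_n$ and a measure $\mu$ on $\mathcal{F}$ such that $A_1 \times \cdots \times A_n \in \mathcal{F}$ for all $A_i \in \mathcal{F}_i$, and
$$\mu(A_1 \times \cdots \times A_n) = \mu_1(A_1) \cdots \mu_n(A_n).$$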
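Proof. We will prove the special case $n = 2$ for simplicity. Define the algebra
$$\mathcal{A} := \left\{ \bigcup_{i=1}^{k} A_i \times B_i : k \in \mathbb{N},\ A_i \in \mathcal{F}_1,\ B_i \in \mathcal{F}_2 \right\}.$$
For any $E \in \mathcal{A}$ and $x \in \Omega_1$, define the $x$-section $E_x \subseteq \Omega_2$ by
$$E_x := \{ y \in \Omega_2 : (x, y) \in E \}.$$
Define the $y$-section $E^y \subseteq \Omega_1$ similarly. Now given $E = \bigcup_{i=1}^{k} A_i \times B_i \in \mathcal{A}$, for any $x \in \Omega_1$,
$$E_x = \bigcup_{i \,:\, x \in A_i} B_i \in \mathcal{F}_2.$$
Therefore, the quantity $\mu_2(E_x)$ is well-defined. Similarly, $\mu_1(E^y)$ is well-defined for any $y \in \Omega_2$. Hence, define the function $f_E : \Omega_1 \to [0, \infty]$ by $f_E(x) := \mu_2(E_x)$, which is non-negative and simple since $E_x$ takes only finitely many values as $x$ varies; for instance, in the special case $E = A \times B$,
$$f_E = \mu_2(B) \cdot \mathbb{1}_A.$$
We can similarly define $g_E(y) := \mu_1(E^y)$, and define
$$\mu(E) := \int_{\Omega_1} f_E \, d\mu_1 = \int_{\Omega_2} g_E \, d\mu_2.$$
To see this equality in the simplest case when $E = \bigsqcup_{i=1}^{k} A_i \times B_i$ is a disjoint union of rectangles (the rest follows by careful bookkeeping),
$$\int_{\Omega_1} f_E \, d\mu_1 = \sum_{i=1}^{k} \mu_1(A_i) \, \mu_2(B_i) = \int_{\Omega_2} g_E \, d\mu_2.$$
In particular, $\mu(A \times B) = \mu_1(A) \, \mu_2(B)$. We claim that $\mu$ is countably additive on $\mathcal{A}$. Fix pairwise disjoint $F_1, F_2, \dots \in \mathcal{A}$ with $E := \bigsqcup_{k} F_k \in \mathcal{A}$. Then for any $x \in \Omega_1$, the sections $(F_k)_x$ are pairwise disjoint with union $E_x$, so
$$f_E(x) = \mu_2(E_x) = \sum_{k=1}^{\infty} \mu_2((F_k)_x) = \sum_{k=1}^{\infty} f_{F_k}(x).$$
Therefore, the function $\sum_{k=1}^{K} f_{F_k}$ converges monotonically to $f_E$ as $K \to \infty$, and by the monotone convergence theorem,
$$\mu(E) = \int_{\Omega_1} f_E \, d\mu_1 = \lim_{K \to \infty} \sum_{k=1}^{K} \int_{\Omega_1} f_{F_k} \, d\mu_1 = \sum_{k=1}^{\infty} \mu(F_k).$$
Now apply Carathéodory’s extension theorem to obtain a $\sigma$-algebra $\mathcal{F} \supseteq \mathcal{A}$ and a measure $\bar{\mu}$ on $\mathcal{F}$ such that $\bar{\mu}|_{\mathcal{A}} = \mu$.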
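The section-based definition of the product measure can be checked by hand on finite spaces. The following is a minimal sketch, assuming two hypothetical finite measure spaces whose measures are given by point weights (the names `mu1`, `mu2`, and all helper functions are illustrative, not from the text). It integrates $x \mapsto \mu_2(E_x)$ against $\mu_1$ and compares the result with the $y$-section version.

```python
from fractions import Fraction
from itertools import product

# Hypothetical finite measure spaces: each point carries a weight (its measure).
mu1 = {"a": Fraction(1, 2), "b": Fraction(3, 2)}
mu2 = {0: Fraction(2), 1: Fraction(1, 3), 2: Fraction(1)}


def measure(weights, subset):
    """Measure of a subset of a finite space described by point weights."""
    return sum((weights[p] for p in subset), Fraction(0))


def x_section(E, x):
    """E_x = {y : (x, y) in E}."""
    return {y for (u, y) in E if u == x}


def y_section(E, y):
    """E^y = {x : (x, y) in E}."""
    return {x for (x, v) in E if v == y}


def product_measure_via_x(E):
    """Integrate x -> mu2(E_x) against mu1 (a finite weighted sum here)."""
    return sum((mu1[x] * measure(mu2, x_section(E, x)) for x in mu1), Fraction(0))


def product_measure_via_y(E):
    """Integrate y -> mu1(E^y) against mu2 (a finite weighted sum here)."""
    return sum((mu2[y] * measure(mu1, y_section(E, y)) for y in mu2), Fraction(0))


# A rectangle A x B, and a union of two overlapping rectangles.
A, B = {"a", "b"}, {0, 2}
rectangle = set(product(A, B))
union = set(product({"a"}, {0, 1})) | set(product({"a", "b"}, {1, 2}))

# The rectangle gets measure mu1(A) * mu2(B); both section integrals agree.
assert product_measure_via_x(rectangle) == measure(mu1, A) * measure(mu2, B)
assert product_measure_via_x(union) == product_measure_via_y(union)
```

Exact rational arithmetic via `Fraction` is a design choice here, so that the equalities can be asserted without floating-point tolerance.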
Theoretically, we could just start defining random variables on the product space and go on our merry way. But we still need to answer a key question: given the distributions $\mathbb{P}_X$ and $\mathbb{P}_Y$, how do we compute $\mathbb{P}_{X+Y}$? More abstractly, we need to integrate with respect to our newly minted measure $\mu_1 \times \mu_2$ in a manner that is computationally consistent with integrals with respect to our old measures $\mu_1, \mu_2$ respectively. Surprisingly, answering this question leads us to one of the most important theorems in multivariable calculus, Fubini’s theorem, which allows us to rigorously swap integrals, a key tool in any reasonable calculation.
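To make the key question concrete, here is a minimal sketch for discrete random variables. It rests on an assumption the text has not made at this point: that $X$ and $Y$ are independent, so that the joint law of $(X, Y)$ is the product of the two marginal laws. The function name `law_of_sum` and the dice example are hypothetical illustrations.

```python
from collections import defaultdict
from fractions import Fraction


def law_of_sum(p_x, p_y):
    """Push the product of two discrete laws forward under (x, y) -> x + y.

    ASSUMPTION: X and Y are independent, so the joint law is the product measure.
    """
    p_sum = defaultdict(Fraction)  # missing keys start at Fraction(0)
    for x, px in p_x.items():
        for y, py in p_y.items():
            p_sum[x + y] += px * py  # mass of the point (x, y) lands on x + y
    return dict(p_sum)


# Example: the sum of two fair six-sided dice.
die = {k: Fraction(1, 6) for k in range(1, 7)}
two_dice = law_of_sum(die, die)
```

The double loop is an iterated sum over the product space; that it may be carried out in either order is exactly what the Fubini-Tonelli theorem guarantees in general.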
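Denote the base measure spaces by $(\Omega_1, \mathcal{F}_1, \mu_1)$ and $(\Omega_2, \mathcal{F}_2, \mu_2)$, and their product space by $(\Omega_1 \times \Omega_2, \mathcal{F}_1 \otimes \mathcal{F}_2, \mu_1 \times \mu_2)$. By construction, $\mathcal{F}_1 \otimes \mathcal{F}_2 \supseteq \mathcal{A}$, where $\mathcal{A}$ is the algebra of finite unions of rectangles $A \times B$ with $A \in \mathcal{F}_1$, $B \in \mathcal{F}_2$; we take $\mathcal{F}_1 \otimes \mathcal{F}_2 := \sigma(\mathcal{A})$, restricting the extended measure if necessary. We remark that for any $E \in \mathcal{F}_1 \otimes \mathcal{F}_2$ and $(x, y) \in \Omega_1 \times \Omega_2$, the sections satisfy $E_x := \{ y' : (x, y') \in E \} \in \mathcal{F}_2$ and $E^y := \{ x' : (x', y) \in E \} \in \mathcal{F}_1$, since the $\sigma$-algebra $\{ E \subseteq \Omega_1 \times \Omega_2 : E_x \in \mathcal{F}_2 \text{ and } E^y \in \mathcal{F}_1 \text{ for all } x, y \}$ contains $\mathcal{A}$.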
Now we observe that $\mathbb{R} = \bigcup_{n=1}^{\infty} [-n, n]$ and each $[-n, n]$ has Lebesgue measure $2n < \infty$.
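Definition 1. A measure space $(\Omega, \mathcal{F}, \mu)$ is $\sigma$-finite if there exist $A_1, A_2, \dots \in \mathcal{F}$ with $\mu(A_n) < \infty$ such that $\Omega = \bigcup_{n=1}^{\infty} A_n$. For instance, $(\mathbb{R}, \mathcal{B}(\mathbb{R}), \lambda)$ is $\sigma$-finite.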
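Lemma 2. Suppose $\mu_1, \mu_2$ are $\sigma$-finite. For any $E \in \mathcal{F}_1 \otimes \mathcal{F}_2$, let $P(E)$ denote the statement that the non-negative functions $x \mapsto \mu_2(E_x)$ and $y \mapsto \mu_1(E^y)$ are measurable and satisfy
$$\int_{\Omega_1} \mu_2(E_x) \, d\mu_1(x) = (\mu_1 \times \mu_2)(E) = \int_{\Omega_2} \mu_1(E^y) \, d\mu_2(y).$$
Then $P(E)$ holds for any $E \in \mathcal{F}_1 \otimes \mathcal{F}_2$. Note that this result extends the corresponding identity for $E \in \mathcal{A}$ obtained in the proof of Theorem 1.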
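Proof. We first prove the case that $\mu_1(\Omega_1) < \infty$ and $\mu_2(\Omega_2) < \infty$. It is straightforward that $P(E)$ holds if $E = A \times B$ is a measurable rectangle, or even a finite disjoint union of such rectangles, that is, any set in $\mathcal{A}$. If $E \subseteq F$ and both $P(E)$ and $P(F)$ hold, then $P(F \setminus E)$ holds, since $\mu_2((F \setminus E)_x) = \mu_2(F_x) - \mu_2(E_x)$ and all quantities involved are finite. Finally, if $E_1 \subseteq E_2 \subseteq \cdots$ and $P(E_n)$ holds for every $n$, then defining $E := \bigcup_{n} E_n$, the function $x \mapsto \mu_2(E_x) = \lim_{n} \mu_2((E_n)_x)$ is measurable, and by the monotone convergence theorem,
$$\int_{\Omega_1} \mu_2(E_x) \, d\mu_1(x) = \lim_{n \to \infty} \int_{\Omega_1} \mu_2((E_n)_x) \, d\mu_1(x) = \lim_{n \to \infty} (\mu_1 \times \mu_2)(E_n) = (\mu_1 \times \mu_2)(E),$$
and symmetrically for the $y$-sections. Therefore, $P(E)$ holds. Let $\mathcal{M}$ denote the smallest subset of $\mathcal{F}_1 \otimes \mathcal{F}_2$ that contains $\mathcal{A}$ and satisfies these two closure properties; every set in $\mathcal{M}$ satisfies $P$. We can verify that $\mathcal{M}$ is a $\sigma$-algebra (this is Dynkin’s $\pi$-$\lambda$ argument, using that $\mathcal{A}$ is closed under finite intersections), and hence contains $\sigma(\mathcal{A}) = \mathcal{F}_1 \otimes \mathcal{F}_2$, as required.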
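We now generalise to the $\sigma$-finite case. Suppose $A_1 \subseteq A_2 \subseteq \cdots$ in $\mathcal{F}_1$ such that $\Omega_1 = \bigcup_{n} A_n$ and $\mu_1(A_n) < \infty$ (replacing the sets by finite unions if necessary, we may take them increasing), and $B_1 \subseteq B_2 \subseteq \cdots$ in $\mathcal{F}_2$ similarly. For each $n$, define $E_n := E \cap (A_n \times B_n)$ so that $E_n \uparrow E$. The finite case applied on $A_n \times B_n$ gives $P(E_n)$, and the result follows by the monotone convergence theorem.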
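Lemma 3. For any measurable map $f : \Omega_1 \times \Omega_2 \to \mathbb{R}$, all of its sections $f_x := f(x, \cdot)$ and $f^y := f(\cdot, y)$ are measurable.
Proof. Apply the remark preceding Lemma 2, that sections of sets in $\mathcal{F}_1 \otimes \mathcal{F}_2$ are measurable, to the identity $(f_x)^{-1}(B) = (f^{-1}(B))_x$ for Borel sets $B \subseteq \mathbb{R}$, and similarly for $f^y$.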
We can now discuss the Fubini-Tonelli theorem. The Fubini theorem is the special case when all integrals therein are finite. Here, a function $f : \Omega \to [-\infty, \infty]$ is integrable if $\{ |f| = \infty \}$ has measure zero and $f \cdot \mathbb{1}_{\{ |f| < \infty \}}$ is integrable.
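Theorem 2 (Fubini-Tonelli Theorem). Suppose $\mu_1, \mu_2$ are $\sigma$-finite. If $f : \Omega_1 \times \Omega_2 \to [-\infty, \infty]$ is non-negative and measurable (resp. integrable), then the functions $g$ on $\Omega_1$ and $h$ on $\Omega_2$ defined by
$$g(x) := \int_{\Omega_2} f(x, y) \, d\mu_2(y), \qquad h(y) := \int_{\Omega_1} f(x, y) \, d\mu_1(x)$$
are measurable (resp. integrable, and defined almost everywhere) and
$$\int_{\Omega_1} g \, d\mu_1 = \int_{\Omega_1 \times \Omega_2} f \, d(\mu_1 \times \mu_2) = \int_{\Omega_2} h \, d\mu_2.$$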
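Proof. We return to the usual simple $\to$ non-negative measurable $\to$ integrable strategy. If $f = \mathbb{1}_E$ for some $E \in \mathcal{F}_1 \otimes \mathcal{F}_2$, then we obtain this result by Lemma 2. The result extends by linearity to non-negative simple functions.
If $f$ is non-negative, find a sequence of non-negative simple functions $f_n$ that monotonically converge to $f$. By the monotone convergence theorem,
$$\int f \, d(\mu_1 \times \mu_2) = \lim_{n \to \infty} \int f_n \, d(\mu_1 \times \mu_2).$$
For each $n$, define $g_n : \Omega_1 \to [0, \infty]$ by setting for each $x \in \Omega_1$, $g_n(x) := \int_{\Omega_2} f_n(x, y) \, d\mu_2(y)$. Then $g_n$ monotonically increases to $g$, so $g$ is measurable. By the monotone convergence theorem again, since the $f_n$ are all simple functions,
$$\int_{\Omega_1} g \, d\mu_1 = \lim_{n \to \infty} \int_{\Omega_1} g_n \, d\mu_1 = \lim_{n \to \infty} \int f_n \, d(\mu_1 \times \mu_2) = \int f \, d(\mu_1 \times \mu_2).$$
The same argument applies to $h$. Finally, in the case $f$ is integrable, write $f = f^+ - f^-$ and perform the needful bookkeeping.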
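The integrability hypothesis in the Fubini case cannot be dropped. The following worked example is a standard illustration, not taken from the text above; it uses counting measure on $\mathbb{N} = \{0, 1, 2, \dots\}$ in both factors (a $\sigma$-finite choice made here for illustration), so that integrals become sums.

```latex
% Counting measure on N = {0, 1, 2, ...} in both factors.
\[
  a(m, n) =
  \begin{cases}
    1  & \text{if } n = m, \\
    -1 & \text{if } n = m + 1, \\
    0  & \text{otherwise.}
  \end{cases}
\]
% Summing over n first: every row m contains exactly one +1 and one -1.
\[
  \sum_{m=0}^{\infty} \sum_{n=0}^{\infty} a(m, n)
    = \sum_{m=0}^{\infty} (1 - 1) = 0.
\]
% Summing over m first: column n = 0 contains only the +1 at m = 0,
% while every column n >= 1 contains one +1 (m = n) and one -1 (m = n - 1).
\[
  \sum_{n=0}^{\infty} \sum_{m=0}^{\infty} a(m, n)
    = 1 + \sum_{n=1}^{\infty} (1 - 1) = 1.
\]
% No contradiction with Theorem 2: a is neither non-negative nor integrable.
\[
  \sum_{m, n} |a(m, n)| = \infty.
\]
```

The two iterated sums disagree, which is consistent with Theorem 2 because $a$ fails both hypotheses: it takes negative values and $|a|$ has infinite integral.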
As much as we feel somewhat justified to add distributions in general, there is one more piece of measure-theoretic machinery we need to discuss: the technical density function known as the Radon-Nikodým derivative. In doing so, we can be justified in letting $f_X$ denote the density function for any sufficiently nice random variable $X$.
—Joel Kindiak, 21 Jul 25, 2313H