The infinite coin toss is an intuitively straightforward idea that takes real effort to construct properly: what do we mean by tossing a coin infinitely many times? And yet, such constructions are crucial for discussing discrete distributions like the geometric and Poisson distributions, which find meaningful applications in mathematical finance and other fields of applied mathematics.
Consider the following experiment. Flip a biased coin with $\mathbb{P}(\text{Head}) = p \in (0,1)$. Let $X_n$ denote the outcome of the $n$-th flip (i.e. $X_n = 1$ if the $n$-th flip is Head, and $X_n = 0$ otherwise, excluding the case when the coin lands on its side). Let $\tau$ denote the number of flips needed for you to obtain the first Head. What is the value of $\mathbb{P}(\tau = k)$ for $k \in \mathbb{N}$? Intuitively, the first $k-1$ flips have to be Tails, and the last should be Head. Assuming all coin flips are independent, we should have the probability
$$\mathbb{P}(\tau = k) = (1-p)^{k-1} p.$$
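Before we worry about rigour, we can at least sanity-check this formula numerically. Here is a minimal Python sketch (the helper name `first_head`, the bias $p = 0.3$, and the number of trials are illustrative choices of mine, not part of the construction) that estimates $\mathbb{P}(\tau = k)$ by simulation and compares it against $(1-p)^{k-1}p$.

```python
import random

def first_head(p, rng):
    """Flip a p-biased coin until the first Head; return the number of flips."""
    k = 1
    while rng.random() >= p:  # an event of probability p counts as Head
        k += 1
    return k

rng = random.Random(0)
p, trials = 0.3, 200_000
samples = [first_head(p, rng) for _ in range(trials)]

for k in range(1, 6):
    empirical = sum(s == k for s in samples) / trials
    exact = (1 - p) ** (k - 1) * p
    print(f"k={k}: empirical {empirical:.4f} vs (1-p)^(k-1) p = {exact:.4f}")
```

Of course, a simulation only ever performs finitely many flips, which is precisely why the construction below is needed.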
The problem is that $\tau$ could take on any of the infinitely many values in $\mathbb{N}$, and so a reasonable sample space $\Omega$ will require outcomes of the form $\omega = (\omega_1, \omega_2, \omega_3, \ldots)$. To be fair, that is not the challenging bit: define
$$\Omega := \{H, T\}^{\mathbb{N}} = \{(\omega_1, \omega_2, \ldots) : \omega_n \in \{H, T\} \text{ for every } n \in \mathbb{N}\}.$$
The problem arises in defining the underlying probability measure. The issue isn't Head and Tail; we obtain effectively the same probability space working with $\{0,1\}^{\mathbb{N}}$. The problem is this: how do we evaluate the quantity $\mathbb{P}(\{\omega\})$ for a single outcome $\omega \in \Omega$?
Your first instinct should be to take the limit of the probability of the first $n$ flips as $n \to \infty$, and your instinct is not wrong. But how do we set up the probability space so that this instinct does, in fact, yield a valid answer? Furthermore, by that principle,
$$\mathbb{P}(\{\omega\}) \le \lim_{n \to \infty} \max(p, 1-p)^n = 0$$
for any sequence $\omega \in \Omega$. How do we obtain nonzero probabilities in that case?
To that end, let’s scale down our analyses a bit, lest we overthink in excessive panic. Rather than think of infinitely many coin tosses, let’s start with the first coin toss. No matter which infinite outcome we obtain, we know that $\omega_1 = H$ or $\omega_1 = T$, and that these outcomes are assigned the probabilities $p$ and $1-p$ respectively. This observation motivates us to define sub-$\sigma$-algebras of $2^{\Omega}$ that agree with our intuitions.
To that end, equip $\{H, T\}$ with the usual $\sigma$-algebra $2^{\{H,T\}}$ and probability measure $\mu_1$ given by $\mu_1(\{H\}) := p$ and $\mu_1(\{T\}) := 1-p$. Then for any $A \subseteq \{H, T\}$, define
$$\bar{A} := \{\omega \in \Omega : \omega_1 \in A\}.$$
Then define the $\sigma$-algebra
$$\mathcal{F}_1 := \{\bar{A} : A \subseteq \{H, T\}\}$$
on $\Omega$ and equip it with the probability measure $\mathbb{P}_1(\bar{A}) := \mu_1(A)$. We can repeat this procedure for every $n \in \mathbb{N}$ as follows.
Definition 1. For any $n \in \mathbb{N}$, equip $\{H, T\}^n$ with the usual $\sigma$-algebra $2^{\{H,T\}^n}$ and probability measure $\mu_n$, the $n$-fold product of $\mu_1$:
$$\mu_n(\{(a_1, \ldots, a_n)\}) := \prod_{i=1}^{n} \mu_1(\{a_i\}).$$
Then for any $A \subseteq \{H, T\}^n$, define
$$\bar{A} := \{\omega \in \Omega : (\omega_1, \ldots, \omega_n) \in A\}.$$
Then define the $\sigma$-algebra
$$\mathcal{F}_n := \{\bar{A} : A \subseteq \{H, T\}^n\}$$
on $\Omega$ and equip it with the probability measure $\mathbb{P}_n(\bar{A}) := \mu_n(A)$. Define
$$\mathcal{A}_0 := \bigcup_{n=1}^{\infty} \mathcal{F}_n.$$
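To make the finite-dimensional layers concrete, the following Python sketch (the function names and the bias $p = 0.3$ are mine, chosen only for illustration) computes $\mu_n(A)$ for a finite set $A \subseteq \{H, T\}^n$, which is exactly the probability $\mathbb{P}_n(\bar{A})$ assigned to the corresponding cylinder set.

```python
from itertools import product
from math import prod

p = 0.3  # illustrative bias: P(Head) = p

def mu_1(a):
    """One-flip measure on {"H", "T"}."""
    return p if a == "H" else 1 - p

def mu_n(A):
    """Product measure of a finite set A of length-n tuples over {"H", "T"}.

    This is the value P_n assigns to the cylinder set determined by A.
    """
    return sum(prod(mu_1(a) for a in atom) for atom in A)

# "The first Head occurs on the third flip" is the cylinder set of
# A = {(T, T, H)} inside {H, T}^3:
print(mu_n({("T", "T", "H")}))              # (1 - p)^2 * p = 0.147
print(mu_n(set(product("HT", repeat=3))))   # the whole of {H, T}^3 -> 1.0
```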
Lemma 1. We have $\mathcal{F}_n \subseteq \mathcal{F}_{n+1}$ for every $n \in \mathbb{N}$. Furthermore, $\mathcal{A}_0$ contains $\Omega$ and is closed under complementation and finite unions, though not necessarily countably infinite unions. We call $\mathcal{A}_0$ an algebra of subsets of $\Omega$.
Lemma 2. The set function $\mathbb{P}_0 : \mathcal{A}_0 \to [0, 1]$ given by $\mathbb{P}_0(\bar{A}) := \mathbb{P}_n(\bar{A})$ whenever $\bar{A} \in \mathcal{F}_n$ is well defined, and $\mathbb{P}_0$ is countably additive: if $A, A_1, A_2, \ldots \in \mathcal{A}_0$ satisfy $A = \bigsqcup_{k=1}^{\infty} A_k$, then
$$\mathbb{P}_0(A) = \sum_{k=1}^{\infty} \mathbb{P}_0(A_k).$$
Proof. Consider $\Omega$ as the topological space $\prod_{n=1}^{\infty} \{H, T\}$. Equip $\{H, T\}$ with the discrete topology, and it is clear that $\{H, T\}$ is compact. Then $\Omega$ equipped with the product topology is compact by the Tychonoff theorem. Note also that every element of $\mathcal{A}_0$ is a finite union of cylinder sets, each of which is both open and closed in this topology.

Now fix $A, A_1, A_2, \ldots \in \mathcal{A}_0$ such that $A = \bigsqcup_{k=1}^{\infty} A_k$. Then $\{A_k\}_{k=1}^{\infty}$ forms an open cover of $A$. Since $A$ is closed, it is compact, and thus is covered by a finite subcollection $A_1, \ldots, A_N$ without loss of generality. Since the union is disjoint, we can conclude that $A_k = \emptyset$ for $k > N$. Hence,
$$\mathbb{P}_0(A) = \sum_{k=1}^{N} \mathbb{P}_0(A_k) = \sum_{k=1}^{\infty} \mathbb{P}_0(A_k),$$
where the first equality holds because $A, A_1, \ldots, A_N$ all lie in a common $\mathcal{F}_n$, on which $\mathbb{P}_n$ is finitely additive.
Unfortunately, we still need a $\sigma$-algebra on $\Omega$. Currently, we only have the algebra $\mathcal{A}_0$, which behaves like a $\sigma$-algebra except that it is closed under only finite, rather than countable, unions. Thankfully, an algebra together with a countably additive measure on it allows us to recover a $\sigma$-algebra and a bona fide measure. This result is called Carathéodory’s extension theorem.
Theorem 1. Let $\mathcal{A}_0$ be an algebra of subsets of $\Omega$. Suppose there exists a countably additive map $\mu_0 : \mathcal{A}_0 \to [0, \infty]$ with $\mu_0(\emptyset) = 0$, which we call a pre-measure on $\mathcal{A}_0$. Then there exists a $\sigma$-algebra $\mathcal{F} \supseteq \mathcal{A}_0$ on $\Omega$ and a measure $\mu : \mathcal{F} \to [0, \infty]$ such that $\mu|_{\mathcal{A}_0} = \mu_0$. We say that $\mu_0$ extends to a measure $\mu$ on $\mathcal{F}$.
Carathéodory’s extension theorem does more than just help us establish the logical validity of infinite coin flips: it will also help us create suitable $\sigma$-algebras on $\mathbb{R}$. But we will need to put in some nontrivial effort to prove it.
Henceforth, assume the hypotheses of Theorem 1. Let $\mathcal{A}_0$ be an algebra of subsets of $\Omega$ and $\mu_0 : \mathcal{A}_0 \to [0, \infty]$ be a pre-measure on $\mathcal{A}_0$.
Lemma 3. Define the outer measure $\mu^* : 2^{\Omega} \to [0, \infty]$ by
$$\mu^*(E) := \inf\left\{ \sum_{k=1}^{\infty} \mu_0(A_k) : E \subseteq \bigcup_{k=1}^{\infty} A_k,\ A_k \in \mathcal{A}_0 \right\}.$$
Then $\mu^*(A) = \mu_0(A)$ for any $A \in \mathcal{A}_0$ and, in addition, $\mu^*$ satisfies the following properties:
$\mu^*(E) \le \mu^*(F)$ if $E \subseteq F$,
$\mu^*(\emptyset) = 0$,
$\mu^*\left( \bigcup_{k=1}^{\infty} E_k \right) \le \sum_{k=1}^{\infty} \mu^*(E_k)$ for any $E_1, E_2, \ldots \subseteq \Omega$.
Proof. For the subset claim, any cover of $F$ is a cover of $E$, so the result is immediate; similarly, $\mu^*(\emptyset) = 0$ follows by taking $A_k = \emptyset$ for every $k$.

For the claim that $\mu^*$ agrees with $\mu_0$ on $\mathcal{A}_0$, we have $\mu^*(A) \le \mu_0(A)$ since $A$ covers itself, so only the reverse inequality needs proof. Fix $A \in \mathcal{A}_0$. Fix $\varepsilon > 0$. By the definition of $\mu^*$, there exist $A_1, A_2, \ldots \in \mathcal{A}_0$ such that $A \subseteq \bigcup_{k=1}^{\infty} A_k$ and
$$\sum_{k=1}^{\infty} \mu_0(A_k) \le \mu^*(A) + \varepsilon.$$
Define
$$B_k := A \cap \left( A_k \setminus \bigcup_{j=1}^{k-1} A_j \right) \in \mathcal{A}_0,$$
so that $A = \bigsqcup_{k=1}^{\infty} B_k$. By the countable additivity of $\mu_0$,
$$\mu_0(A) = \sum_{k=1}^{\infty} \mu_0(B_k) \le \sum_{k=1}^{\infty} \mu_0(A_k) \le \mu^*(A) + \varepsilon.$$
Taking $\varepsilon \to 0$ yields the desired result.
For countable subadditivity, if there exists $k \in \mathbb{N}$ such that $\mu^*(E_k) = \infty$, then the inequality holds trivially. Suppose therefore that $\mu^*(E_k) < \infty$ for any $k \in \mathbb{N}$, and fix $\varepsilon > 0$. For each $k$, find $A_{k,1}, A_{k,2}, \ldots \in \mathcal{A}_0$ that cover $E_k$ (i.e. $E_k \subseteq \bigcup_{j=1}^{\infty} A_{k,j}$) and
$$\sum_{j=1}^{\infty} \mu_0(A_{k,j}) \le \mu^*(E_k) + \frac{\varepsilon}{2^k}.$$
Then we notice that $\{A_{k,j}\}_{k,j \in \mathbb{N}}$ covers $\bigcup_{k=1}^{\infty} E_k$, and by the sum of a geometric series,
$$\mu^*\left( \bigcup_{k=1}^{\infty} E_k \right) \le \sum_{k=1}^{\infty} \sum_{j=1}^{\infty} \mu_0(A_{k,j}) \le \sum_{k=1}^{\infty} \mu^*(E_k) + \varepsilon.$$
Taking $\varepsilon \to 0$ yields the desired result.
Lemma 4. The subset $\mathcal{F} \subseteq 2^{\Omega}$ defined by
$$\mathcal{F} := \left\{ B \subseteq \Omega : \mu^*(E) = \mu^*(E \cap B) + \mu^*(E \cap B^c) \text{ for every } E \subseteq \Omega \right\}$$
forms an algebra over $\Omega$, and $\mu^*$ is countably additive on $\mathcal{F}$. We say that each $B \in \mathcal{F}$ satisfies the Carathéodory condition.
Proof. By Lemma 3, the inequality
$$\mu^*(E) \le \mu^*(E \cap B) + \mu^*(E \cap B^c)$$
holds automatically, so that the only direction that needs to be checked is the direction
$$\mu^*(E) \ge \mu^*(E \cap B) + \mu^*(E \cap B^c).$$
It is not hard to see that $\Omega \in \mathcal{F}$, since $\mu^*(E \cap \Omega) + \mu^*(E \cap \Omega^c) = \mu^*(E) + \mu^*(\emptyset) = \mu^*(E)$. Furthermore, closure under complementation is obvious, since for any $B \in \mathcal{F}$ and $E \subseteq \Omega$,
$$\mu^*(E \cap B^c) + \mu^*(E \cap (B^c)^c) = \mu^*(E \cap B) + \mu^*(E \cap B^c) = \mu^*(E)$$
since $(B^c)^c = B$. It remains to prove closure under finite unions, and by induction, it suffices to prove the two-set case. Fix $B_1, B_2 \in \mathcal{F}$. We claim that for any $E \subseteq \Omega$,
$$\mu^*(E) \ge \mu^*(E \cap (B_1 \cup B_2)) + \mu^*(E \cap (B_1 \cup B_2)^c).$$
By subadditivity in Lemma 3,
$$\mu^*(E \cap (B_1 \cup B_2)) \le \mu^*(E \cap B_1) + \mu^*(E \cap B_1^c \cap B_2).$$
By definition,
$$E \cap (B_1 \cup B_2)^c = E \cap B_1^c \cap B_2^c.$$
Applying the Carathéodory criterion to $B_1$,
$$\mu^*(E) = \mu^*(E \cap B_1) + \mu^*(E \cap B_1^c).$$
Since $B_2$ also satisfies the Carathéodory criterion, adding
$$\mu^*(E \cap B_1^c) = \mu^*(E \cap B_1^c \cap B_2) + \mu^*(E \cap B_1^c \cap B_2^c)$$
yields the finite union result:
$$\mu^*(E) = \mu^*(E \cap B_1) + \mu^*(E \cap B_1^c \cap B_2) + \mu^*(E \cap B_1^c \cap B_2^c) \ge \mu^*(E \cap (B_1 \cup B_2)) + \mu^*(E \cap (B_1 \cup B_2)^c).$$
For the countably additive claim, we first observe that for disjoint $B_1, B_2 \in \mathcal{F}$,
$$\mu^*(B_1 \sqcup B_2) = \mu^*((B_1 \sqcup B_2) \cap B_1) + \mu^*((B_1 \sqcup B_2) \cap B_1^c) = \mu^*(B_1) + \mu^*(B_2).$$
Inductively,
$$\mu^*\left( \bigsqcup_{k=1}^{N} B_k \right) = \sum_{k=1}^{N} \mu^*(B_k).$$
Now, fix disjoint $B_1, B_2, \ldots \in \mathcal{F}$. Then
$$\sum_{k=1}^{N} \mu^*(B_k) = \mu^*\left( \bigsqcup_{k=1}^{N} B_k \right) \le \mu^*\left( \bigsqcup_{k=1}^{\infty} B_k \right)$$
by monotonicity, so that coupled with countable subadditivity in Lemma 3,
$$\sum_{k=1}^{N} \mu^*(B_k) \le \mu^*\left( \bigsqcup_{k=1}^{\infty} B_k \right) \le \sum_{k=1}^{\infty} \mu^*(B_k).$$
Taking $N \to \infty$ yields the desired result.
Lemma 5. The set $\mathcal{F}$ as defined in Lemma 4 forms a $\sigma$-algebra over $\Omega$. Furthermore, $\mathcal{A}_0 \subseteq \mathcal{F}$.
Proof. Fix $B_1, B_2, \ldots \in \mathcal{F}$. Since
$$\bigcup_{k=1}^{\infty} B_k = \bigsqcup_{k=1}^{\infty} \left( B_k \setminus \bigcup_{j=1}^{k-1} B_j \right),$$
we may assume without loss of generality that $\{B_k\}_{k=1}^{\infty}$ is pairwise disjoint. We need to establish Carathéodory’s criterion for $B := \bigsqcup_{k=1}^{\infty} B_k$, that is, prove that for any $E \subseteq \Omega$,
$$\mu^*(E) \ge \mu^*(E \cap B) + \mu^*(E \cap B^c).$$
By countable subadditivity,
$$\mu^*(E \cap B) \le \sum_{k=1}^{\infty} \mu^*(E \cap B_k).$$
By monotonicity, for any $N \in \mathbb{N}$,
$$\mu^*(E) = \mu^*\left( E \cap \bigsqcup_{k=1}^{N} B_k \right) + \mu^*\left( E \cap \left( \bigsqcup_{k=1}^{N} B_k \right)^c \right) \ge \sum_{k=1}^{N} \mu^*(E \cap B_k) + \mu^*(E \cap B^c);$$
here the equality is the Carathéodory criterion for $\bigsqcup_{k=1}^{N} B_k \in \mathcal{F}$ (Lemma 4), the first term equals $\sum_{k=1}^{N} \mu^*(E \cap B_k)$ by applying the criterion to $B_1, \ldots, B_N$ inductively, and the second term is bounded below using $E \cap B^c \subseteq E \cap \left( \bigsqcup_{k=1}^{N} B_k \right)^c$. Taking $N \to \infty$ yields the desired result.

For the subset claim, fix $A \in \mathcal{A}_0$. Fix $E \subseteq \Omega$ and $\varepsilon > 0$. Consider a cover $\{C_k\}_{k=1}^{\infty} \subseteq \mathcal{A}_0$ for $E$ with $\sum_{k=1}^{\infty} \mu_0(C_k) \le \mu^*(E) + \varepsilon$. Then $\{C_k \cap A\}_{k=1}^{\infty}$ forms a cover for $E \cap A$ and $\{C_k \cap A^c\}_{k=1}^{\infty}$ forms a cover for $E \cap A^c$. The result follows from bookkeeping:
$$\mu^*(E \cap A) + \mu^*(E \cap A^c) \le \sum_{k=1}^{\infty} \mu_0(C_k \cap A) + \sum_{k=1}^{\infty} \mu_0(C_k \cap A^c) = \sum_{k=1}^{\infty} \mu_0(C_k) \le \mu^*(E) + \varepsilon.$$
Taking $\varepsilon \to 0$ shows that $A$ satisfies the Carathéodory condition, so $A \in \mathcal{F}$.
Proof of Theorem 1. The map $\mu := \mu^*|_{\mathcal{F}}$ is a rigorously defined measure (by Lemma 4) on the rigorously defined $\sigma$-algebra $\mathcal{F}$ (by Lemma 5) that extends the pre-measure $\mu_0$ (by Lemma 3).
Corollary 1. The sample space $\Omega = \{H, T\}^{\mathbb{N}}$ equipped with the algebra $\mathcal{A}_0$ and pre-measure $\mathbb{P}_0$ can be extended to a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ such that $\mathbb{P}(\bar{A}) = \mathbb{P}_n(\bar{A})$ for any $\bar{A} \in \mathcal{F}_n$ and $n \in \mathbb{N}$.
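To see why the extension buys us anything, note for instance that the event that a Head eventually appears,
$$\bigcup_{k=1}^{\infty} \{\omega \in \Omega : \omega_1 = \cdots = \omega_{k-1} = T,\ \omega_k = H\},$$
is a countable union of cylinder sets and so lies in $\mathcal{F}$; it does not lie in $\mathcal{A}_0$, since otherwise its complement, the single sequence $(T, T, T, \ldots)$, would be a finite union of cylinder sets, which is impossible because every nonempty cylinder set is infinite. Events of exactly this kind are what the geometric distribution is built from.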
Finally, we can rigorously define the geometric distribution.
Theorem 2. Define the random variable $\tau : \Omega \to \mathbb{N} \cup \{\infty\}$ by
$$\tau(\omega) := \inf\{n \in \mathbb{N} : \omega_n = H\}.$$
Then for any $p \in (0,1)$ and $k \in \mathbb{N}$,
$$\mathbb{P}(\tau = k) = (1-p)^{k-1} p.$$
We say that $\tau$ follows a geometric distribution, denoted $\tau \sim \mathrm{Geom}(p)$. Furthermore, $\mathbb{E}[\tau] = \dfrac{1}{p}$ and $\mathrm{Var}(\tau) = \dfrac{1-p}{p^2}$.
Remark 1. We also call $\tau$ a stopping time of the coin-flip process $(X_n)_{n \in \mathbb{N}}$.
Proof. We observe that $\{\tau = k\} = \bar{A}$ for $A = \{(T, \ldots, T, H)\} \subseteq \{H, T\}^k$, so that
$$\mathbb{P}(\tau = k) = \mathbb{P}_k(\bar{A}) = \mu_k(\{(T, \ldots, T, H)\}) = (1-p)^{k-1} p.$$
For the expectation,
$$\mathbb{E}[\tau] = \sum_{k=1}^{\infty} k\, \mathbb{P}(\tau = k) = \lim_{K \to \infty} \sum_{k=1}^{K} k (1-p)^{k-1} p$$
if the limit on the right-hand side exists. We first define
$$S_K := \sum_{k=1}^{K} k (1-p)^{k-1} p$$
and observe that
$$(1-p) S_K = \sum_{k=1}^{K} k (1-p)^{k} p = \sum_{k=2}^{K+1} (k-1) (1-p)^{k-1} p.$$
By algebruh,
$$p S_K = S_K - (1-p) S_K = \sum_{k=1}^{K} (1-p)^{k-1} p - K (1-p)^{K} p = 1 - (1-p)^{K} - K (1-p)^{K} p.$$
Taking $K \to \infty$, we have $\lim_{K \to \infty} q(K)(1-p)^{K} = 0$ for any polynomial $q$, so that $\lim_{K \to \infty} S_K = \frac{1}{p}$. Therefore, $\mathbb{E}[\tau] = \frac{1}{p}$.
For the variance, we first define
$$S := \sum_{k=1}^{\infty} k^2 (1-p)^{k-1} p = \mathbb{E}[\tau^2],$$
which converges by the ratio test. By observation,
$$(1-p) S = \sum_{k=1}^{\infty} k^2 (1-p)^{k} p = \sum_{k=2}^{\infty} (k-1)^2 (1-p)^{k-1} p.$$
Taking differences on both sides,
$$p S = S - (1-p) S = \sum_{k=1}^{\infty} (2k-1)(1-p)^{k-1} p = 2\, \mathbb{E}[\tau] - 1 = \frac{2}{p} - 1.$$
By algebruh,
$$\mathbb{E}[\tau^2] = S = \frac{2-p}{p^2}.$$
Hence,
$$\mathrm{Var}(\tau) = \mathbb{E}[\tau^2] - \mathbb{E}[\tau]^2 = \frac{2-p}{p^2} - \frac{1}{p^2} = \frac{1-p}{p^2}.$$
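As a quick numerical cross-check of these closed forms (the bias $p = 0.3$ and the truncation point $K$ below are arbitrary choices, purely for illustration), one can truncate the defining series for $\mathbb{E}[\tau]$ and $\mathbb{E}[\tau^2]$ and compare against $\frac{1}{p}$ and $\frac{1-p}{p^2}$:

```python
p = 0.3        # illustrative bias
K = 10_000     # truncation point; the neglected tail is geometrically small

pmf = [(1 - p) ** (k - 1) * p for k in range(1, K + 1)]
mean = sum(k * q for k, q in enumerate(pmf, start=1))
second_moment = sum(k * k * q for k, q in enumerate(pmf, start=1))
variance = second_moment - mean ** 2

print(mean, 1 / p)                 # both approximately 3.3333
print(variance, (1 - p) / p ** 2)  # both approximately 7.7778
```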
Not only can we define the geometric distribution, but we can finally define the length measure on $\mathbb{R}$ in a measure-theoretically useful manner. This we will do next time.
—Joel Kindiak, 7 Jul 25, 2310H