Having ventured far into the measure-theoretic world and taken great pains to verify that taking the length of a compact interval is mathematically legitimate, we turn our attention to modelling continuous data. If the uniform counting measure is considered the “basic” distribution in discrete probability, then the uniform continuous distribution would naturally be considered the “basic” distribution in continuous probability.
Definition 1. Let $\Omega$ be a sample space. The uniform continuous distribution on $[a, b]$ is the random variable $X : \Omega \to [a, b]$ with a distribution defined as follows: for any measurable $A \subseteq [a, b]$,
$$\Pr[X \in A] = \frac{\lambda(A)}{b - a}.$$
Here, the probability density function is given by $f_X(x) = \frac{1}{b - a}$ for $x \in [a, b]$. Henceforth we will disregard the sample space unless we need to undertake some theoretic construction. As usual, $\lambda$ here denotes the Lebesgue measure on $\mathbb{R}$, and in this case, we write $X \sim \mathcal{U}(a, b)$.
The expected value and variance tend to be useful statistical quantities for our purposes. Moreover, there is a special expected value and a special variance that cause the rest of our calculations to become trivial.
Lemma 1. Suppose $\mathbb{E}[X] = \mu$ and $\mathrm{Var}[X] = \sigma^2 > 0$. Define the standardised random variable $Z$ by
$$Z = \frac{X - \mu}{\sigma}.$$
Then $\mathbb{E}[Z] = 0$ and $\mathrm{Var}[Z] = 1$. Conversely, for any $X = \sigma Z + \mu$ with $\sigma > 0$, if the p.d.f. $f_Z$ of $Z$ is continuous, then the p.d.f. $f_X$ of $X$ can be written in terms of the p.d.f. of $Z$:
$$f_X(x) = \frac{1}{\sigma} f_Z\!\left(\frac{x - \mu}{\sigma}\right).$$
Furthermore, $\mathbb{E}[X] = \mu$ and $\mathrm{Var}[X] = \sigma^2$.
Proof. For the p.d.f. result, by the fundamental theorem of calculus,
$$f_X(x) = \frac{d}{dx} \Pr[X \le x] = \frac{d}{dx} \Pr\!\left[Z \le \frac{x - \mu}{\sigma}\right] = \frac{1}{\sigma} f_Z\!\left(\frac{x - \mu}{\sigma}\right).$$
The other results follow from the linearity of expectation and the scaling property of variance.
Theorem 1. If $X \sim \mathcal{U}(a, b)$, then $\mathbb{E}[X] = \frac{a + b}{2}$ and $\mathrm{Var}[X] = \frac{(b - a)^2}{12}$.
Proof. Let $U \sim \mathcal{U}(0, 1)$ for simplicity, so that $X = (b - a)U + a$ implies that
$$\mathbb{E}[X] = (b - a)\,\mathbb{E}[U] + a, \qquad \mathrm{Var}[X] = (b - a)^2\,\mathrm{Var}[U].$$
By definition,
$$\mathbb{E}[U] = \int_0^1 u \, du = \frac{1}{2}.$$
Similarly,
$$\mathbb{E}[U^2] = \int_0^1 u^2 \, du = \frac{1}{3},$$
so that
$$\mathrm{Var}[U] = \mathbb{E}[U^2] - \mathbb{E}[U]^2 = \frac{1}{3} - \frac{1}{4} = \frac{1}{12}.$$
Then the general case follows from
$$\mathbb{E}[X] = \frac{b - a}{2} + a = \frac{a + b}{2}, \qquad \mathrm{Var}[X] = \frac{(b - a)^2}{12}.$$
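As a quick sanity check of Theorem 1, here is a minimal Monte Carlo sketch using Python's standard `random` and `statistics` modules; the interval $(2, 5)$, sample size, and seed are our own arbitrary choices.

```python
import random
import statistics

random.seed(0)  # fix the pseudorandom stream for reproducibility

a, b = 2.0, 5.0
n = 200_000
xs = [random.uniform(a, b) for _ in range(n)]

sample_mean = statistics.fmean(xs)
sample_var = statistics.pvariance(xs)

# Theorem 1 predicts E[X] = (a + b)/2 = 3.5 and Var[X] = (b - a)^2/12 = 0.75
print(sample_mean, (a + b) / 2)
print(sample_var, (b - a) ** 2 / 12)
```

The sample moments land within a few thousandths of the theoretical values, as the standard error $\sigma/\sqrt{n}$ predicts.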
We can generalise Lemma 1 to any differentiable, invertible transformation of $Z$.
Lemma 2. For any $X = g(Z)$ where $g$ is an invertible differentiable transformation, if the p.d.f. $f_Z$ of $Z$ is continuous, then the p.d.f. $f_X$ of $X$ can be written in terms of the p.d.f. of $Z$:
$$f_X(x) = f_Z\!\left(g^{-1}(x)\right) \left| \frac{d}{dx}\, g^{-1}(x) \right|.$$
Proof. Apply the same arguments as in Lemma 1, together with the chain rule.
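As a worked instance of Lemma 2 (our own illustration, not part of the lemma), take $Z \sim \mathcal{U}(0, 1)$, so $f_Z = 1$ on $(0, 1)$, and $g(z) = -\ln z$, so that $g^{-1}(x) = e^{-x}$ for $x > 0$. Then

```latex
f_X(x) = f_Z\!\left(g^{-1}(x)\right) \left| \frac{d}{dx}\, e^{-x} \right|
       = 1 \cdot \left| -e^{-x} \right|
       = e^{-x}, \qquad x > 0,
```

which is precisely the $\mathrm{Exp}(1)$ density defined in Theorem 2 below.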
The continuous analogue of the geometric distribution is the exponential distribution.
Theorem 2. The random variable $X$ follows an exponential distribution with rate parameter $\lambda > 0$, denoted $X \sim \mathrm{Exp}(\lambda)$, if $X$ is defined by the probability density function
$$f_X(x) = \lambda e^{-\lambda x}, \qquad x \ge 0.$$
Then $\mathbb{E}[X] = \frac{1}{\lambda}$ and $\mathrm{Var}[X] = \frac{1}{\lambda^2}$.
Proof. Let $Y = \lambda X \sim \mathrm{Exp}(1)$ for simplicity, so that
$$\mathbb{E}[X] = \frac{\mathbb{E}[Y]}{\lambda}, \qquad \mathrm{Var}[X] = \frac{\mathrm{Var}[Y]}{\lambda^2}$$
gives the final result. Integrating by parts,
$$\mathbb{E}[Y] = \int_0^\infty y e^{-y} \, dy = \left[-y e^{-y}\right]_0^\infty + \int_0^\infty e^{-y} \, dy = 1, \qquad \mathbb{E}[Y^2] = \int_0^\infty y^2 e^{-y} \, dy = 2\,\mathbb{E}[Y] = 2,$$
so that $\mathrm{Var}[Y] = \mathbb{E}[Y^2] - \mathbb{E}[Y]^2 = 2 - 1 = 1$.
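Theorem 2 can be checked numerically the same way as Theorem 1; this is a minimal sketch with an arbitrary rate $\lambda = 2$, seed, and sample size of our own choosing.

```python
import random
import statistics

random.seed(1)  # reproducible pseudorandom stream
lam = 2.0
xs = [random.expovariate(lam) for _ in range(200_000)]

mean_x = statistics.fmean(xs)
var_x = statistics.pvariance(xs)

# Theorem 2 predicts E[X] = 1/lam = 0.5 and Var[X] = 1/lam^2 = 0.25
print(mean_x, 1 / lam)
print(var_x, 1 / lam ** 2)
```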
Another crucial distribution in probability and statistics is the normal distribution. It is a useful model to represent all kinds of continuous data: heights of humans, test scores for exams, and even month-on-month returns on investment. Actually, more is true: as long as we have a continuous nonzero function that is integrable, we obtain a corresponding probability distribution.
Lemma 3. For any continuous, integrable, nonnegative function $g$ that is not identically zero, there exists a continuous random variable $X$ with probability density function
$$f_X(x) = \frac{g(x)}{C}, \qquad C = \int_{-\infty}^{\infty} g(x) \, dx,$$
where $C$ is called the normalising constant of $g$.
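For example (our own illustration), take $g(x) = e^{-|x|}$, which is continuous, nonnegative, and integrable with normalising constant

```latex
C = \int_{-\infty}^{\infty} e^{-|x|} \, dx = 2 \int_0^\infty e^{-x} \, dx = 2,
```

so $f_X(x) = \frac{1}{2} e^{-|x|}$ is a valid probability density function (that of the Laplace distribution).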
Theorem 3. The random variable $X$ follows a normal distribution with mean $\mu$ and variance $\sigma^2$, denoted $X \sim \mathcal{N}(\mu, \sigma^2)$, if $X$ is defined by the probability density function
$$f_X(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\!\left(-\frac{(x - \mu)^2}{2\sigma^2}\right).$$
Then $\mathbb{E}[X] = \mu$ and $\mathrm{Var}[X] = \sigma^2$.
Proof. Denoting $Z = \frac{X - \mu}{\sigma}$, Lemma 1 gives
$$f_Z(z) = \sigma f_X(\sigma z + \mu) = \frac{1}{\sqrt{2\pi}} e^{-z^2/2}, \qquad \mathbb{E}[X] = \sigma\,\mathbb{E}[Z] + \mu, \qquad \mathrm{Var}[X] = \sigma^2\,\mathrm{Var}[Z],$$
so it suffices to prove that $\mathbb{E}[Z] = 0$ and $\mathrm{Var}[Z] = 1$. We remark that the normalising constant of $z \mapsto e^{-z^2/2}$ is $\sqrt{2\pi}$ by the Gaussian integral
$$\int_{-\infty}^{\infty} e^{-t^2} \, dt = \sqrt{\pi}$$
and the change of variables $t = z/\sqrt{2}$. The former is immediate since the function $z \mapsto z e^{-z^2/2}$ is odd and its integral over $\mathbb{R}$ converges absolutely, so that
$$\mathbb{E}[Z] = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} z e^{-z^2/2} \, dz = 0.$$
The latter requires integration by parts (here, as with the exponential distribution calculation, all integrals converge absolutely so we do not need to worry about the limits):
$$\mathbb{E}[Z^2] = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} z^2 e^{-z^2/2} \, dz = \frac{1}{\sqrt{2\pi}} \left( \left[-z e^{-z^2/2}\right]_{-\infty}^{\infty} + \int_{-\infty}^{\infty} e^{-z^2/2} \, dz \right) = 0 + 1 = 1.$$
Therefore,
$$\mathrm{Var}[Z] = \mathbb{E}[Z^2] - \mathbb{E}[Z]^2 = 1.$$
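The same Monte Carlo check works for Theorem 3; this sketch uses Python's standard `random.gauss`, with $\mu = 1$, $\sigma = 2$, and the seed chosen arbitrarily by us.

```python
import random
import statistics

random.seed(4)  # reproducible pseudorandom stream
mu, sigma = 1.0, 2.0
xs = [random.gauss(mu, sigma) for _ in range(200_000)]

mean_x = statistics.fmean(xs)
var_x = statistics.pvariance(xs)

# Theorem 3 predicts E[X] = mu = 1.0 and Var[X] = sigma^2 = 4.0
print(mean_x, mu)
print(var_x, sigma ** 2)
```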
What makes the normal distribution so ubiquitous is how it arises from samples of virtually any distribution.
Theorem 4 (Central Limit Theorem). Let $X_1, X_2, \ldots$ be i.i.d. random variables with mean $\mu$ and variance $\sigma^2$. Define
$$Z_n = \frac{X_1 + \cdots + X_n - n\mu}{\sigma\sqrt{n}}.$$
Then $Z_n \to Z \sim \mathcal{N}(0, 1)$ in the following sense: for any $z \in \mathbb{R}$ and $\varepsilon > 0$, there exists $N \in \mathbb{N}$ such that for any $n \ge N$,
$$\left| \Pr[Z_n \le z] - \Pr[Z \le z] \right| < \varepsilon.$$
We say that $Z_n$ converges in distribution to the standard normal distribution $\mathcal{N}(0, 1)$.
Proof. Delayed.
Our main goal in these posts on probability is to prove this statement rigorously. The central limit theorem is responsible for our intuition about probability arising from repeated experiments.
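Even before the rigorous proof, the convergence is easy to see empirically. The following is a minimal sketch (sample sizes, the evaluation point $z = 1$, and the seed are our own choices): standardised sums of $n = 30$ uniform variates already match the standard normal c.d.f. closely.

```python
import math
import random

random.seed(2)  # reproducible pseudorandom stream
mu, var = 0.5, 1 / 12          # mean and variance of U(0, 1)
n, trials = 30, 50_000

def z_n() -> float:
    """One draw of the standardised sum Z_n over n uniform samples."""
    s = sum(random.random() for _ in range(n))
    return (s - n * mu) / math.sqrt(n * var)

zs = [z_n() for _ in range(trials)]

# Compare the empirical CDF of Z_n at z = 1 with Phi(1) for N(0, 1)
phi_1 = 0.5 * (1 + math.erf(1 / math.sqrt(2)))   # about 0.8413
frac = sum(z <= 1.0 for z in zs) / trials
print(frac, phi_1)
```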
A more immediate application comes from simulation. It turns out that the humble uniform random variable (implemented using pseudorandom numbers) can be used to generate all other random variables.
Theorem 5. Suppose $U \sim \mathcal{U}(0, 1)$. For any random variable $X$, there exists a measurable function $g$ such that $g(U) \sim X$, i.e. $g(U)$ and $X$ share the same distribution.
Proof. Let $F$ denote the c.d.f. of $X$. We observe that
$$F(x) = \Pr[X \le x]$$
by definition. Therefore, the idea is to generate $u \sim \mathcal{U}(0, 1)$, then define the variate $x = F^{-1}(u)$, so that $F(x) = u$. Since $F$ need not be invertible, we use a generalised inverse instead.
To that end, define the measurable map $g : (0, 1) \to \mathbb{R}$ by
$$g(u) = \inf\{x \in \mathbb{R} : F(x) \ge u\},$$
which is measurable because it is nondecreasing. Then define $Y = g(U)$. Observe that, by the right-continuity of $F$, we have $g(u) \le x$ if and only if $u \le F(x)$. Therefore
$$\Pr[Y \le x] = \Pr[g(U) \le x] = \Pr[U \le F(x)] = F(x).$$
Therefore, $Y = g(U) \sim X$.
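As a concrete instance of Theorem 5 (our own illustration): for $X \sim \mathrm{Exp}(\lambda)$, the c.d.f. $F(x) = 1 - e^{-\lambda x}$ is invertible, so $g$ is the honest inverse $F^{-1}(u) = -\ln(1 - u)/\lambda$. The sketch below generates exponential variates from uniform ones this way; the rate, seed, and sample size are arbitrary.

```python
import math
import random
import statistics

def exp_inverse_cdf(u: float, lam: float) -> float:
    """g(u) = F^{-1}(u) = -ln(1 - u)/lam, inverting F(x) = 1 - exp(-lam * x)."""
    return -math.log(1.0 - u) / lam

random.seed(3)  # reproducible pseudorandom stream
lam = 1.5
xs = [exp_inverse_cdf(random.random(), lam) for _ in range(200_000)]

mean_x = statistics.fmean(xs)
print(mean_x, 1 / lam)   # Theorem 2 predicts E[X] = 1/lam
```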
But we have a slightly more urgent question to answer: what’s the distribution of $Z_n$ in general? We need to construct the relevant sample space, and prove relevant properties in the multivariable setting, before we can legitimately continue on our quest to prove the central limit theorem. In fact, once we have at least done so for normal distributions, we will be in a sufficiently good place to discuss the principle of statistical hypothesis testing, a quantitative implementation of the scientific method.
—Joel Kindiak. 19 Jul 25, 2258H