The Mathematical Average

Given a discrete random variable X taking values in \mathbb Z, what is its expectation, if it exists?

Let’s suppose X takes on finitely many values, i.e. the set

\mathrm{supp}(X) := \{x \in \mathbb Z : \mathbb P(X = x) > 0\}

is finite. Denote \mathrm{supp}(X) = \{x_1,\dots,x_n\}. The expectation of X is the value that X takes on average: we seek the “center value” \mu of the distribution of X.

We can think of the center value \mu as a “pivot” on a “balance beam”. Each x_i < \mu will induce a “weight” of \mathbb P(X = x_i) that tilts the beam anticlockwise, and similarly, each x_i > \mu will induce a “weight” of \mathbb P(X = x_i) that tilts the beam clockwise. Intuitively, the total contributions ought to cancel out, yielding the equality

\displaystyle \sum_{x \in \mathrm{supp}(X)} (x - \mu) \cdot \mathbb P(X = x) = 0.

Expanding the left-hand side,

\begin{aligned} \sum_{x \in \mathrm{supp}(X)} (x - \mu) \cdot \mathbb P(X = x) &= \sum_{x \in \mathrm{supp}(X)} x \cdot \mathbb P(X = x) - \mu \cdot \sum_{x \in \mathrm{supp}(X)} \mathbb P(X = x) \\ &= \sum_{x \in \mathrm{supp}(X)} x \cdot \mathbb P(X = x) - \mu \cdot 1 \\ &= \sum_{x \in \mathrm{supp}(X)} x \cdot \mathbb P(X = x) - \mu. \end{aligned}

Therefore,

\displaystyle \sum_{x \in \mathrm{supp}(X)} x \cdot \mathbb P(X = x) = \mu.

Using measure notation \mathbb P_X \equiv \mathbb P(X \in \cdot) and \mathbb P_X(x) \equiv \mathbb P_X(\{x\}) for brevity,

\displaystyle  \sum_{x \in \mathbb Z} x \cdot \mathbb P_X(x) = \mu,

where the left-hand side is well-defined since \mathrm{supp}(X) is finite (every term outside the support vanishes). We take this quantity as the formal definition of the expectation whenever the sum exists, even when \mathrm{supp}(X) is infinite.

Let X be a \mathbb Z-valued random variable.

Definition 1. The expectation of X, denoted \mathbb E[X], is defined by

\displaystyle \mathbb E[X] := \sum_{x \in \mathbb Z} x \cdot \mathbb P_X(x)

whenever the right-hand side exists.

Example 1. For p \in [0, 1], if X \sim \mathrm{Ber}(p), then

\begin{aligned} \mathbb E[X] &= 0 \cdot \mathbb P_X(0) + 1 \cdot \mathbb P_X(1) \\ &= 0 \cdot (1-p) + 1 \cdot p = p. \end{aligned}
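As a quick sanity check, Definition 1 translates directly into code. The helper `expectation` below is a hypothetical name of my own choosing (not part of the text), with the pmf passed as a dictionary from values to probabilities:

```python
# Definition 1 as code: E[X] = sum over x of x * P_X(x),
# with the pmf represented as a dict {value: probability}.

def expectation(pmf):
    """Expectation of a discrete random variable given its pmf."""
    return sum(x * p for x, p in pmf.items())

# Example 1: X ~ Ber(p) with p = 0.3, so P_X(0) = 1 - p and P_X(1) = p.
p = 0.3
print(expectation({0: 1 - p, 1: p}))  # 0.3
```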

Lemma 1. Let (\Omega, \mathcal F, \mathbb P) be a probability space with \Omega countable (so that sums over \Omega make sense), and let X : \Omega \to \mathbb Z be a random variable. Then

\displaystyle \mathbb P_X(x) = \sum_{\omega \in \Omega} \mathbb I\{X(\omega) = x\} \cdot \mathbb P(\{\omega\}).

Whenever both sides are well-defined,

\displaystyle \mathbb E[X] = \sum_{\omega \in \Omega} X(\omega) \cdot \mathbb P(\{ \omega \}).

Proof. We first observe that

\displaystyle \mathbb P_X(x) = \mathbb P(X = x) = \sum_{\omega : X(\omega)=x} \mathbb P(\{ \omega \}) = \sum_{\omega \in \Omega} \mathbb I\{X(\omega)=x\} \cdot \mathbb P(\{ \omega \}).

Hence,

\begin{aligned} \sum_{x \in \mathbb Z} x \cdot \mathbb P_X(x) &= \sum_{x \in \mathbb Z} x \cdot \sum_{\omega \in \Omega} \mathbb I\{X(\omega)=x\} \cdot \mathbb P(\{ \omega \}) \\ &=\sum_{x \in \mathbb Z}  \sum_{\omega \in \Omega} x \cdot \mathbb I\{X(\omega)=x\} \cdot \mathbb P(\{ \omega \}) \\ &= \sum_{\omega \in \Omega} \sum_{x \in \mathbb Z}  x \cdot \mathbb I\{X(\omega)=x\} \cdot \mathbb P(\{ \omega \}) \\ &= \sum_{\omega \in \Omega} X(\omega) \cdot \mathbb P(\{ \omega \}) . \end{aligned}

The interchange of the two sums in the third equality is justified because the double sum converges absolutely whenever \mathbb E[X] is well-defined.
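Both formulas in Lemma 1 can be checked on a small finite sample space. The setup below, two fair coin flips with X counting heads, is just an illustrative choice:

```python
# Lemma 1 on a concrete finite sample space: compute E[X] both as
# sum_x x * P_X(x) and as sum_omega X(omega) * P({omega}).

omegas = ["HH", "HT", "TH", "TT"]          # two fair coin flips
P = {w: 0.25 for w in omegas}              # uniform probability measure
X = {w: w.count("H") for w in omegas}      # X(omega) = number of heads

# Pushforward pmf: P_X(x) = sum of P({omega}) over {omega : X(omega) = x}.
P_X = {}
for w in omegas:
    P_X[X[w]] = P_X.get(X[w], 0.0) + P[w]

lhs = sum(x * p for x, p in P_X.items())   # sum over values of X
rhs = sum(X[w] * P[w] for w in omegas)     # sum over outcomes omega
print(lhs, rhs)  # 1.0 1.0
```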

Lemma 2. For any map g : \mathbb Z \to \mathbb Z, g(X) := g \circ X is a \mathbb Z-valued random variable.

Theorem 1. Whenever both sides are well-defined,

\displaystyle \mathbb E[g(X)] = \sum_{x \in \mathbb Z} g(x) \cdot \mathbb P_X(x).

Proof. Define Y := g(X), which is a random variable by Lemma 2. By the proof and result of Lemma 1,

\begin{aligned} \mathbb E[g(X)] = \mathbb E[Y] &= \sum_{\omega \in  \Omega} Y(\omega) \cdot \mathbb P(\{\omega\}) \\ &= \sum_{\omega \in  \Omega} g(X(\omega)) \cdot \mathbb P(\{\omega\}) \\ &= \sum_{\omega \in  \Omega} \sum_{x \in \mathbb Z} g(X(\omega)) \cdot \mathbb I\{X(\omega) = x\} \cdot \mathbb P(\{\omega\}) \\ &= \sum_{x \in \mathbb Z} \sum_{\omega \in  \Omega} g(X(\omega)) \cdot \mathbb I\{X(\omega) = x\} \cdot \mathbb P(\{\omega\}) \\ &= \sum_{x \in \mathbb Z} g(x) \cdot \sum_{\omega \in  \Omega} \mathbb I\{X(\omega) = x\} \cdot \mathbb P(\{\omega\}) \\ &= \sum_{x \in \mathbb Z} g(x) \cdot \mathbb P_X(x). \end{aligned}
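Theorem 1 says \mathbb E[g(X)] can be computed without ever constructing the pmf of g(X). A minimal sketch with a made-up pmf and g(x) = x^2:

```python
# Theorem 1: sum_x g(x) * P_X(x) agrees with E[Y] where Y = g(X).

pmf_X = {-1: 0.25, 0: 0.25, 1: 0.5}   # a hypothetical pmf on Z

def g(x):
    return x * x

# Right-hand side of Theorem 1: no pmf of g(X) needed.
rhs = sum(g(x) * p for x, p in pmf_X.items())

# Left-hand side: build the pmf of Y = g(X), then apply Definition 1.
pmf_Y = {}
for x, p in pmf_X.items():
    pmf_Y[g(x)] = pmf_Y.get(g(x), 0.0) + p
lhs = sum(y * p for y, p in pmf_Y.items())

print(lhs, rhs)  # 0.75 0.75
```

Note how (-1) and 1 collapse onto the single value g(x) = 1 when the pmf of Y is built, which is exactly the bookkeeping Theorem 1 lets us skip.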

Corollary 1. Let Y be a \mathbb Z-valued random variable. For any map g : \mathbb Z^2 \to \mathbb Z, g(X,Y) := g \circ (X,Y) is a \mathbb Z-valued random variable. Furthermore,

\displaystyle \mathbb E[g(X,Y)] = \sum_{(x, y) \in \mathbb Z^2} g(x,y) \cdot \mathbb P_{X,Y}(x,y).

Moreover, if \mathbb E[X] and \mathbb E[Y] exist, then the following hold:

  • \mathbb E[X + Y] = \mathbb E[X] + \mathbb E[Y],
  • \mathbb E[\alpha X] = \alpha \cdot \mathbb E[X] for any \alpha \in \mathbb Z,
  • \mathbb E[X - \mathbb E[X]] = 0.

Proof. We prove the first identity for simplicity. Define g(x,y) = x+y. Then

\begin{aligned} \mathbb E[g(X,Y)] &= \sum_{(x, y) \in \mathbb Z^2} g(x,y) \cdot \mathbb P_{X,Y}(x,y) \\ \mathbb E[X+Y]&= \sum_{(x, y) \in \mathbb Z^2} (x+y) \cdot \mathbb P_{X,Y}(x,y) \\ &= \sum_{(x, y) \in \mathbb Z^2} x \cdot \mathbb P_{X,Y}(x,y) + \sum_{(x, y) \in \mathbb Z^2} y \cdot \mathbb P_{X,Y}(x,y) \\ &= \sum_{x \in \mathbb Z} x \cdot \mathbb P_{X}(x) + \sum_{y \in \mathbb Z} y \cdot \mathbb P_{Y}(y) \\ &= \mathbb E[X] + \mathbb E[Y], \end{aligned}

where the simplifications arise from

\begin{aligned} \sum_{(x, y) \in \mathbb Z^2} x \cdot \mathbb P_{X,Y}(x,y) &= \sum_{x \in \mathbb Z} \sum_{y \in \mathbb Z} x \cdot \mathbb P_{X,Y}(x,y) \\ &= \sum_{x \in \mathbb Z} \sum_{y \in \mathbb Z} x \cdot \mathbb P(Y = y \mid X = x) \cdot \mathbb P_X(x) \\ &= \sum_{x \in \mathbb Z} x \cdot \mathbb P_X(x) \cdot \sum_{y \in \mathbb Z} \mathbb P(Y = y \mid X = x) \\ &= \sum_{x \in \mathbb Z} x \cdot \mathbb P_X(x) \cdot 1 \\ &= \sum_{x \in \mathbb Z} x \cdot \mathbb P_X(x).\end{aligned}

All series manipulations above are valid because the sums converge absolutely whenever \mathbb E[X] and \mathbb E[Y] exist; terms with \mathbb P_X(x) = 0 vanish, so conditioning on \{X = x\} is only ever needed when \mathbb P_X(x) > 0.
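Linearity can also be verified directly on a joint pmf. The table below is a made-up example; note that X and Y need not be independent:

```python
# E[X + Y] = E[X] + E[Y] via Corollary 1 with g(x, y) = x + y.
# The joint pmf is an arbitrary (dependent) example; probabilities sum to 1.

joint = {(0, 0): 0.125, (0, 1): 0.25, (1, 0): 0.375, (1, 1): 0.25}

# Left-hand side: sum over the joint pmf with g(x, y) = x + y.
E_sum = sum((x + y) * p for (x, y), p in joint.items())

# Right-hand side via the marginals P_X and P_Y (summing out y and x).
E_X = sum(x * p for (x, y), p in joint.items())
E_Y = sum(y * p for (x, y), p in joint.items())

print(E_sum, E_X + E_Y)  # 1.125 1.125
```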

Example 2. For n \in \mathbb N and p \in [0, 1], if X \sim \mathrm{Bin}(n, p), then \mathbb E[X] = np.

Proof. Realize X as a sum of independent and identically distributed (i.i.d.) Bernoulli random variables \xi_1, \dots, \xi_n \sim \mathrm{Ber}(p), so that

\displaystyle X = \sum_{i=1}^n \xi_i.

By Corollary 1,

\displaystyle \mathbb E[X] = \mathbb E\left[ \sum_{i=1}^n \xi_i \right] = \sum_{i=1}^n \mathbb E[\xi_i] = \sum_{i=1}^n p = np.

Note that only linearity of expectation, not independence, is used here.
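The same value falls out of Definition 1 applied to the binomial pmf directly, which makes a nice cross-check (the choices of n and p below are arbitrary):

```python
from math import comb

# Example 2 cross-check: E[X] for X ~ Bin(n, p) computed from the pmf
# P_X(k) = C(n, k) * p^k * (1 - p)^(n - k), compared against n * p.

n, p = 5, 0.5
pmf = {k: comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)}
E_X = sum(k * q for k, q in pmf.items())
print(E_X, n * p)  # 2.5 2.5
```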

Example 3. Equip the finite sample \{x_1,\dots,x_n\} \subseteq \mathbb Z of distinct values with the uniform probability measure, and let X := \mathrm{id} be the induced random variable. Then

\begin{aligned} \mathbb E[X] &= \sum_{x \in \mathbb Z} x \cdot \mathbb P_X(x) \\ &= \sum_{i=1}^n x_i \cdot \mathbb P_X(x_i) \\ &= \sum_{i=1}^n x_i \cdot \frac 1n = \frac 1n \cdot \sum_{i=1}^n x_i =: \bar x. \end{aligned}

Thus, the right-hand side is called the mean of the sample (x_1,\dots,x_n).
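In code, the uniform-measure construction of Example 3 reduces to the familiar average. The sample below is an arbitrary choice, with distinct values so that each carries mass 1/n:

```python
# Example 3: the uniform measure on a finite sample of distinct integers
# turns the expectation into the sample mean.

sample = [2, 4, 6, 8]
n = len(sample)
pmf = {x: 1 / n for x in sample}          # uniform mass 1/n on each value

E_X = sum(x * p for x, p in pmf.items())  # Definition 1
mean = sum(sample) / n                    # the usual sample mean
print(E_X, mean)  # 5.0 5.0
```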

The expectation has a cousin, the covariance, and its child, the variance. We will discuss these ideas in the next post.

—Joel Kindiak, 1 Jul 25, 1915H
