Why is the variance equal to E[X²] − E[X]²?

Oscar Nieves
3 min read · Dec 29, 2022

In statistics and probability theory, we use the variance of a random variable X, denoted Var(X), to quantify how large the deviations from the mean value (also known as the expected value) E[X] are; in other words, it tells us how large the fluctuations in X are. The larger Var(X) is, the more unpredictable X is. For a discrete random variable X, the expected value E[X] is defined as:

$$E[X] = \sum_i x_i \, p_i$$
where each xᵢ is an outcome (a possible value of X) and each pᵢ is the probability of that outcome occurring. In the case that X is a continuous random variable, we express E[X] as an integral rather than a discrete sum:

$$E[X] = \int_{-\infty}^{\infty} x \, f(x) \, dx$$
where f(x) is the probability density function (pdf) of X, which tells us how likely each outcome x is.
The variance, on the other hand, is expressed as:

$$\mathrm{Var}(X) = E\left[(X - E[X])^2\right]$$
and if we expand this binomial expression we end up with:
$$\mathrm{Var}(X) = E\left[X^2 - 2X\,E[X] + E[X]^2\right] = E[X^2] - 2E[X]\,E[X] + E[X]^2 = E[X^2] - E[X]^2$$
This follows because the expected value of a constant is equal to that constant, namely E[c] = c. The mean E[X] is itself a constant, so taking its expected value returns the same value: E[E[X]] = E[X]. Similarly, E[cX] = cE[X] for any constant c. Hence, the variance ends up as E[X²] − E[X]².

This seems like a simple enough derivation, but oftentimes by looking at the final formula we tend to forget what the variance actually means: we take the difference between a random variable X and its mean E[X], square that value (since we are not interested in whether X lies below or above its mean, only in how far away it is), and then take the expected value of all those squared deviations. The main reason we use Var(X) = E[X²] − E[X]² more often in practice is that it is easier to compute: it only requires E[X] and E[X²], both of which can be accumulated in a single pass over the data, but the definition remains the same.
