Alternate Variance Formula

As we saw in the previous post, for a population, the variance is calculated as
σ² = ( Σ (x-μ)² ) / N.
Another equivalent formula is σ² = ( (Σ x²) / N ) – μ². If we need to calculate variance by hand, this alternate formula is easier to work with.
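
As a quick sanity check, here is a minimal Python sketch that computes the population variance both ways. The data set and the function names are made up for illustration and do not come from the original source; `statistics.pvariance` from the standard library is used only as a cross-check.

```python
import math
import statistics


def variance_definition(xs):
    """Population variance from the definition: mean of squared deviations."""
    n = len(xs)
    mu = sum(xs) / n
    return sum((x - mu) ** 2 for x in xs) / n


def variance_alternate(xs):
    """Population variance as (mean of the squares) minus (square of the mean)."""
    n = len(xs)
    mu = sum(xs) / n
    return sum(x * x for x in xs) / n - mu ** 2


data = [1, 2, 3, 4, 5]  # illustrative data, assumed for this example

print(variance_definition(data))  # 2.0
print(variance_alternate(data))   # 2.0

# Cross-check against the standard library's population variance.
assert math.isclose(variance_alternate(data), statistics.pvariance(data))
```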

Let’s build some intuition for why the two formulas agree by manipulating the sigma notation.

The formula for population variance is:

[math]\sigma^{2}=\frac{\sum_{i=1}^{N}(x_{i}-\mu)^{2}}{N}[/math]

Let’s focus on the numerator, multiply out the squared term, and see where it takes us.

The numerator is the same thing as:

[math]\sum_{i=1}^{N}(x_{i}^{2}-2x_{i}\mu+\mu^{2})[/math]

[math]=\sum_{i=1}^{N}x_{i}^{2}-2\mu\sum_{i=1}^{N}x_{i}+\mu^{2}\sum_{i=1}^{N}1[/math]

Before we bring the denominator back, let’s look at the last sum first.

[math]\sum_{i=1}^{N}c[/math] means: add the constant c to itself N times, which gives Nc.

In our case the summand is 1, so that sum equals N and the last term becomes μ²N.
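
The same fact can be seen numerically. This is only a small sketch; the values of `N` and `mu` are arbitrary choices for illustration.

```python
N = 5
mu = 3.0  # assumed value, for illustration only

count = sum(1 for _ in range(N))            # adds 1 to itself N times
last_term = sum(mu ** 2 for _ in range(N))  # adds mu^2 to itself N times

print(count == N)                # True
print(last_term == N * mu ** 2)  # True
```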

Now let’s put the denominator back.

[math]=\frac{\sum_{i=1}^{N}x_{i}^{2}}{N}-\frac{2\mu\sum_{i=1}^{N}x_{i}}{N}+\frac{\mu^{2}N}{N}[/math]

Recall also that [math]\frac{\sum_{i=1}^{N}x_{i}}{N}[/math] is the population mean μ, so the middle term becomes 2μ · μ = 2μ².

[math]=\frac{\sum_{i=1}^{N}x_{i}^{2}}{N}-2\mu^{2}+\mu^{2}=\frac{\sum_{i=1}^{N}x_{i}^{2}}{N}-\mu^{2}[/math]
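
To see the three terms collapse on actual numbers, here is a short sketch. The data set is an assumption chosen so that the arithmetic comes out in whole numbers; the variable names are illustrative.

```python
data = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative data, assumed for this example
N = len(data)
mu = sum(data) / N  # 5.0

term1 = sum(x ** 2 for x in data) / N  # average of the squares: 29.0
term2 = 2 * mu * sum(data) / N         # 2 * mu * mu: 50.0
term3 = mu ** 2 * N / N                # mu squared: 25.0

print(term1 - term2 + term3)  # 4.0
print(term1 - mu ** 2)        # 4.0
```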

Now we have reached a neat way of writing the variance.

This means that we can take the average of the squares of all the numbers in the population and then subtract the square of the mean from it.

We can go a little further and replace μ with its definition, the sum of the values divided by N, so that μ² is written in terms of the data.

[math]\frac{\sum_{i=1}^{N}x_{i}^{2}}{N}-\frac{(\sum_{i=1}^{N}x_{i})^{2}}{N^{2}}[/math]

With this last one, we don’t even have to calculate the mean ahead of time.
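
Because this form needs only N, the sum of the values, and the sum of their squares, it can be computed in a single pass over the data. The sketch below assumes a hypothetical helper name, `variance_from_sums`, and an illustrative data set.

```python
def variance_from_sums(xs):
    """Population variance from running totals, without computing the mean first."""
    n = 0
    total = 0      # running sum of x
    total_sq = 0   # running sum of x squared
    for x in xs:
        n += 1
        total += x
        total_sq += x * x
    return total_sq / n - (total ** 2) / (n ** 2)


print(variance_from_sums([2, 4, 4, 4, 5, 5, 7, 9]))  # 4.0
```

One caveat: with floating-point data whose mean is large compared to its spread, subtracting two nearly equal numbers can lose precision, so this version is best treated as a sketch for hand-sized examples.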


Disclaimer: Like most of my posts, this content is intended solely for educational purposes and was created primarily for my personal reference. At times, I may rephrase original texts, and in some cases, I include materials such as graphs, equations, and datasets directly from their original sources. I typically reference a variety of sources and update my posts whenever new or related information becomes available. For this particular post, the primary source was Khan Academy’s Statistics and Probability series.