Intuitive explanation for dividing by $n-1$ when calculating standard deviation?
I was asked today in class why you divide the sum of squared errors by $n-1$ instead of by $n$ when calculating the standard deviation.
I said I am not going to answer it in class (since I didn't wanna go into unbiased estimators), but later I wondered - is there an intuitive explanation for this?!
I'd like to quote this zinger from the book *Numerical Recipes*: "...if the difference between $n$ and $n-1$ ever matters to you, then you are probably up to no good anyway - e.g., trying to substantiate a questionable hypothesis with marginal data."
A really elegant, intuitive explanation is presented here (below the proof): https://en.wikipedia.org/wiki/Bessel%27s_correction#Proof_of_correctness_-_Alternate_3 The basic idea is that your observations are, naturally, going to be closer to the sample mean than to the population mean.
@Tal, This is why schools suck. You ask them "why *this*?", and they reply "just memorize it".
If you are looking for an intuitive explanation, you should see the reason for yourself by actually taking samples! Watch this, it precisely answers your question. https://www.youtube.com/watch?v=xslIhnquFoE
**tl;dr:** (from top answer:) "...the standard deviation which is calculated using deviations from the sample mean underestimates the desired standard deviation of the population..." See also: https://en.wikipedia.org/wiki/Unbiased_estimation_of_standard_deviation#Bias_correction So, unless you feel like calculating something somewhat complex, just use n-1 if it's from a sample.
The standard deviation calculated with a divisor of $n-1$ is a standard deviation calculated from the sample as an estimate of the standard deviation of the population from which the sample was drawn. Because the observed values fall, on average, closer to the sample mean than to the population mean, the standard deviation which is calculated using deviations from the sample mean underestimates the desired standard deviation of the population. Using $n-1$ instead of $n$ as the divisor corrects for that by making the result a little bit bigger.
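The "closer to the sample mean" claim can be made concrete with a short worked identity. This is only a sketch, and the notation is an assumption rather than the answer's own: $x_1,\dots,x_n$ are the observations, $\bar{x}$ is their sample mean, and $\mu$ and $\sigma^2$ are the population mean and variance, with the observations assumed to be independent draws from the same population. Writing $x_i-\mu = (x_i-\bar{x}) + (\bar{x}-\mu)$ and noting that the cross term vanishes because $\sum_i (x_i-\bar{x}) = 0$ gives

$$\sum_{i=1}^{n}(x_i-\mu)^2 \;=\; \sum_{i=1}^{n}(x_i-\bar{x})^2 \;+\; n(\bar{x}-\mu)^2 .$$

The last term is never negative, so in any sample the squared deviations about the sample mean add up to no more than those about the population mean. Taking expectations, the left side is $n\sigma^2$ and the last term is $n\cdot\sigma^2/n=\sigma^2$, so

$$E\!\left[\sum_{i=1}^{n}(x_i-\bar{x})^2\right] \;=\; (n-1)\,\sigma^2 ,$$

which suggests why the divisor is $n-1$ in particular: it makes the variance estimate correct on average.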
Note that the correction has a larger proportional effect when $n$ is small than when it is large, which is what we want, because when $n$ is larger the sample mean is likely to be a better estimator of the population mean.
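For anyone who would rather see this than take it on faith, here is a small simulation sketch. Everything in it is an illustrative assumption: the population is standard normal (true variance 1), and the function name `mean_variance_estimates`, the trial count, and the seed are made up for this example. It averages the variance (the squared standard deviation), since that is the quantity the $n-1$ divisor corrects exactly; the standard deviation itself stays slightly biased even after the correction.

```python
import random

def mean_variance_estimates(n, trials=20000, seed=0):
    """Average the divide-by-n and divide-by-(n-1) variance estimates
    over many samples of size n from a standard normal population."""
    rng = random.Random(seed)
    total_div_n = 0.0
    total_div_n_minus_1 = 0.0
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        xbar = sum(sample) / n
        # Sum of squared deviations from the SAMPLE mean
        ss = sum((x - xbar) ** 2 for x in sample)
        total_div_n += ss / n
        total_div_n_minus_1 += ss / (n - 1)
    return total_div_n / trials, total_div_n_minus_1 / trials

for n in (2, 5, 30):
    biased, corrected = mean_variance_estimates(n)
    print(f"n={n:2d}  divide by n: {biased:.3f}  "
          f"divide by n-1: {corrected:.3f}  factor n/(n-1): {n / (n - 1):.3f}")
```

With the true variance equal to 1, the divide-by-$n$ average should come out near $(n-1)/n$ and the divide-by-$(n-1)$ average near 1, with the gap between them shrinking as $n$ grows.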
When the sample is the whole population we use the standard deviation with $n$ as the divisor, because the sample mean is the population mean.
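As a minimal illustration of the two divisors, Python's standard `statistics` module offers both: `pstdev` divides by $n$ (data treated as the whole population) and `stdev` divides by $n-1$ (data treated as a sample). The numbers below are made up for the example.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up values, n = 8

# Treat the data as the entire population: divisor is n
population_sd = statistics.pstdev(data)

# Treat the data as a sample from a larger population: divisor is n - 1
sample_sd = statistics.stdev(data)

print(population_sd, sample_sd)
```

The sample version is always the larger of the two, by a factor of $\sqrt{n/(n-1)}$.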
(I note parenthetically that nothing that starts with "second moment recentered around a known, definite mean" is going to fulfil the questioner's request for an intuitive explanation.)
@Michael, This doesn't explain why we use `n−1` instead of `n−2` (or even `n−3`).
@Pacerier Have a look at Whuber's answer below for detail on that point. In essence, the correction is $n-1$ rather than $n-2$ etc. because the $n-1$ correction gives results that are very close to what we need. More exact corrections are shown here: http://en.wikipedia.org/wiki/Unbiased_estimation_of_standard_deviation
Hi @Michael, so why does the deviation calculated from the sample mean tend to be smaller than the deviation calculated from the population mean?
"Because the observed values fall, on average, closer to the sample mean than to the population mean, the standard deviation which is calculated using deviations from the sample mean underestimates the desired standard deviation of the population." Why does using the sample mean always underestimate? What if it overestimates?