### What does the size of the standard deviation mean?

According to Wikipedia (my emphasis),

In statistics, the standard deviation ... is a measure that is used to **quantify the amount of variation** or dispersion of a set of data values. A standard deviation close to 0 indicates that the data points tend to be very close to the mean (also called the expected value) of the set, while a high standard deviation indicates that the data points are spread out over a wider range of values.

The purpose of the standard deviation (SD), then, is to tell us **how varied** or uniform (SD → 0) the data is.

So, given a certain SD, how varied *is* the data? **What does the size of the standard deviation mean?**

Please explain the meaning of the SD by interpreting an *SD* = 1 (*M* = 0).

If you cannot interpret the size (quantity) of this SD, what other information would you need to be able to interpret it, and how would you interpret it, given that information? Please provide an example.

If, on the other hand, the quantity of the SD cannot be *qualified* in this manner, my argument is that it is essentially meaningless. If you disagree, please explain the meaning of the SD.

The following are earlier versions to give context to the answers. Unfortunately these didn't really convey what I wanted, and my attempt to ask it elsewhere was closed. (I don't need these versions answered now.)

*First revision:*

What does the size of the standard deviation mean?

For example, if I want to study human body size and I find that adult human body size has a standard deviation of 2 cm, I would probably infer that adult human body size is very uniform, while a 2 cm standard deviation in the size of mice would mean that mice differ surprisingly much in body size.

Obviously the meaning of the standard deviation is its relation to the mean, and a standard deviation around a tenth of the mean is unremarkable (e.g. for IQ: SD = 0.15 * M).

But what is considered "small" and what is "large", when it comes to the relation between standard deviation and mean? Are there guidelines similar to the ones that Cohen gives for correlations (a correlation of 0.5 is large, 0.3 is moderate, and 0.1 is small)?

*Original question:*

We always calculate and report means and standard deviations. But what does the size of the variance actually mean?

For example, assume we are observing which seat people take in an empty room. If we observe that the majority of people sit close to the window with little variance, we can assume this to mean that people generally prefer sitting near the window and getting a view or enough light is the main motivating factor in choosing a seat. If on the other hand we observe that while the largest proportion sit close to the window there is a large variance with other seats taken often also (e.g. many sit close to the door, others sit close to the water dispenser or the newspapers), we might assume that while many people prefer to sit close to the window, there seem to be more factors than light or view that influence choice of seating and differing preferences in different people.

At what values can we say that the behavior we have observed is very varied (different people like to sit in different places)? And when can we infer that behavior is mostly uniform (everyone likes to sit at the window) and the little variation our data shows is mostly a result of random effects or confounding variables (dirt on one chair, the sun having moved and more shade in the back, etc.)?

Are there guidelines for assessing the magnitude of variance in data, similar to Cohen's guidelines for interpreting effect size (a correlation of 0.5 is large, 0.3 is moderate, and 0.1 is small)?

For example, if 90% (or only 30%) of observations fall within one standard deviation from the mean, is that uncommon or completely unremarkable?

@whuber As you can see, I have tried what you suggest in the second revision of my question, to which glen_b has replied that no meaning can be derived from this. Since your comment is being continually upvoted, maybe you or some of the upvoters can explain what your comment means, where I went wrong (with my second revision) or where glen_b might be mistaken. As it stands, your comment does not provide any insights to me. Also, please consider the current (hopefully final) revision of my question, where I have attempted to express my question without any of the obviously distracting examples.

What is missing from this question and my comment is any indication of the units of measure. "90" by itself is meaningless. Another crucial missing element is any contextual frame of reference to determine whether 90 is large or small.

You are leading me around in circles. I had units of measure and contexts in the examples in previous versions of my question. These were heavily criticized. Obviously I am unable to find appropriate examples and come to a conclusion on my own. I explicitly ask you (or anyone else) to *give* an example and explain the answer to me.

A review of your original post shows you were asking this question in great generality: "Are there guidelines for assessing the magnitude of variance in data?" If this were (say) the Physics site and somebody were to ask "are there guidelines for assessing the magnitude of length," don't you think the question would immediately be closed as being too broad (or too vague or both)? I was only hoping that this analogy would make it apparent just how impossible it is to answer your question here.

Discussion of the new question:

For example, if I want to study human body size and I find that adult human body size has a standard deviation of 2 cm, I would probably infer that adult human body size is very uniform

It depends on what we're comparing to. What's the standard of comparison that makes that very uniform? If you compare it to the variability in bolt-lengths for a particular type of bolt that might be hugely variable.

while a 2 cm standard deviation in the size of mice would mean that mice differ surprisingly much in body size.

By comparison to the same thing in your more-uniform humans example, certainly; when it comes to lengths of things, which can only be positive, it probably makes more sense to compare coefficient of variation (as I point out in my original answer), which is the same thing as comparing sd to mean you're suggesting here.

Obviously the meaning of the standard deviation is its relation to the mean,

No, not always. In the case of *sizes* of things or *amounts* of things (e.g. tonnage of coal, volume of money), that often makes sense, but in other contexts it doesn't make sense to compare to the mean.

Even then, they're not necessarily comparable from one thing to another. There's no applies-to-all-things standard of how variable something is before it's variable.

and a standard deviation around a tenth of the mean is unremarkable (e.g. for IQ: SD = 0.15 * M).

Which things are we comparing here? Lengths to IQ's? Why does it make sense to compare one set of things to another? Note that the choice of mean 100 and sd 15 for one kind of IQ test is entirely arbitrary. They don't have units. It could as easily have been mean 0 sd 1 or mean 0.5 and sd 0.1.

But what is considered "small" and what is "large", when it comes to the relation between standard deviation and mean?

Already covered in my original answer but more eloquently covered in whuber's comment -- there is no one standard, and there *can't* be.

Some of my points about Cohen there still apply to this case (sd relative to mean is at least unit-free); but even with something like say Cohen's d, a suitable standard in one context isn't necessarily suitable in another.

Answers to an earlier version

We always calculate and report means and standard deviations.

Well, maybe a lot of the time; I don't know that I *always* do it. There are cases where it's not that relevant.

But what does the size of the variance actually mean?

The standard deviation is a kind of average* distance from the mean. The variance is the square of the standard deviation. Standard deviation is measured in the same units as the data; variance is in squared units.

*(RMS -- https://en.wikipedia.org/wiki/Root_mean_square)

They tell you something about how "spread out" the data are (or the distribution, in the case that you're calculating the sd or variance of a distribution).
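To make the "root mean square distance from the mean" reading concrete, here is a minimal Python sketch. The height values are made up for illustration, and the functions use the population form (dividing by n) rather than the n - 1 sample form, which is an assumption of this example. It shows the SD carrying the data's units while the variance carries squared units.

```python
import math

def mean(xs):
    return sum(xs) / len(xs)

def variance(xs):
    """Population variance: the mean squared deviation from the mean."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def sd(xs):
    """Standard deviation: the root-mean-square (RMS) deviation from the mean."""
    return math.sqrt(variance(xs))

# Hypothetical heights in centimetres (illustrative values only).
heights_cm = [160.0, 165.0, 170.0, 175.0, 180.0]
# The same heights expressed in metres.
heights_m = [h / 100 for h in heights_cm]

print("SD:", sd(heights_cm), "cm vs", sd(heights_m), "m")
print("variance:", variance(heights_cm), "cm^2 vs", variance(heights_m), "m^2")
```

Changing from centimetres to metres divides the SD by 100 and the variance by 100 squared; nothing about the spread of the people has changed, only the unit.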

For example, assume we are observing which seat people take in an empty room. If we observe that the majority of people sit close to the window with little variance,

That's not exactly a case of recording "which seat" but recording "distance from the window". (Knowing "the majority sit close to the window" doesn't necessarily tell you anything about the mean nor the variation about the mean. What it tells you is that the *median* distance from the window must be small.)

we can assume this to mean that people generally prefer sitting near the window and getting a view or enough light is the main motivating factor in choosing a seat.

That the median is small doesn't of itself tell you that. You might infer it from other considerations, but there may be all manner of reasons for it that we can't in any way discern from the data.

If on the other hand we observe that while the largest proportion sit close to the window there is a large variance with other seats taken often also (e.g. many sit close to the door, others sit close to the water dispenser or the newspapers), we might assume that while many people prefer to sit close to the window, there seem to be more factors than light or view that influence choice of seating and differing preferences in different people.

Again, you're bringing in information outside the data; it might apply or it might not. For all we know the light is better far from the window, because the day is overcast or the blinds are drawn.

At what values can we say that the behavior we have observed is very varied (different people like to sit in different places)?

What makes a standard deviation large or small is not determined by some external standard but by subject matter considerations, and to some extent what you're doing with the data, and even personal factors.

However, with positive measurements, such as distances, it's sometimes relevant to consider standard deviation relative to the mean (the coefficient of variation). It's still arbitrary, but distributions with coefficients of variation much smaller than 1 (standard deviation much smaller than the mean) are "different" in some sense from ones where it's much greater than 1 (standard deviation much larger than the mean, which will often tend to be heavily right-skewed).
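As a rough illustration of the coefficient of variation, the sketch below reuses the 2 cm SD from the question. The two means (170 cm for adult humans, 8 cm for mice) are assumed values chosen only for the example, not measurements.

```python
def coefficient_of_variation(mean, sd):
    """SD relative to the mean; only sensible for positive, ratio-scale data."""
    if mean <= 0:
        raise ValueError("coefficient of variation needs a positive mean")
    return sd / mean

# Assumed (hypothetical) means; the 2 cm SD is the figure from the question.
cv_human = coefficient_of_variation(mean=170.0, sd=2.0)
cv_mouse = coefficient_of_variation(mean=8.0, sd=2.0)

print(f"human CV = {cv_human:.3f}")
print(f"mouse CV = {cv_mouse:.3f}")
```

The same 2 cm gives very different relative spreads, but whether either one counts as "large" still depends on what you compare it to.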

And when can we infer that behavior is mostly uniform (everyone likes to sit at the window)

Be wary of using the word "uniform" in that sense, since it's easy to misinterpret your meaning (e.g. if I say that people are "uniformly seated about the room" that means almost the opposite of what you mean). More generally, when discussing statistics, avoid using jargon terms in their ordinary sense.

and the little variation our data shows is mostly a result of random effects or confounding variables (dirt on one chair, the sun having moved and more shade in the back, etc.)?

No, again, you're bringing in external information to the statistical quantity you're discussing. The variance doesn't tell you any such thing.

Are there guidelines for assessing the magnitude of variance in data, similar to Cohen's guidelines for interpreting effect size (a correlation of 0.5 is large, 0.3 is moderate, and 0.1 is small)?

Not in general, no.

Cohen's discussion[1] of effect sizes is more nuanced and situational than you indicate; he gives a table of 8 different values of small, medium and large depending on what kind of thing is being discussed. Those numbers you give apply to differences in independent means (Cohen's *d*).

Cohen's effect sizes are all scaled to be unitless quantities. Standard deviation and variance are not -- change the units and both will change.
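The contrast between a unitless effect size and a unit-bound SD can be sketched as follows. The two groups are hypothetical measurements, and `cohens_d` here uses the pooled sample SD, which is one common convention and an assumption of this example.

```python
import math

def mean(xs):
    return sum(xs) / len(xs)

def sample_var(xs):
    """Sample variance with the n - 1 denominator."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def cohens_d(a, b):
    """Difference in means divided by the pooled sample standard deviation."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * sample_var(a) + (nb - 1) * sample_var(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled_var)

def to_inches(xs_cm):
    return [x / 2.54 for x in xs_cm]

# Hypothetical measurements in centimetres.
group_a_cm = [172.0, 168.0, 175.0, 171.0, 169.0]
group_b_cm = [166.0, 170.0, 164.0, 167.0, 168.0]

d_cm = cohens_d(group_a_cm, group_b_cm)
d_in = cohens_d(to_inches(group_a_cm), to_inches(group_b_cm))
sd_cm = math.sqrt(sample_var(group_a_cm))
sd_in = math.sqrt(sample_var(to_inches(group_a_cm)))

print("Cohen's d (cm vs inches):", d_cm, d_in)  # same up to rounding
print("SD of group A (cm vs inches):", sd_cm, sd_in)  # differ by a factor of 2.54
```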

Cohen's effect sizes are intended to apply in a particular application area (and even then I regard too much focus on those standards of what's small, medium and large as both somewhat arbitrary and somewhat more prescriptive than I'd like). They're more or less reasonable for their intended application area but may be entirely unsuitable in other areas (high energy physics, for example, frequently requires effects that cover many standard errors, but equivalents of Cohen's *effect sizes* may be many orders of magnitude more than what's attainable).

For example, if 90% (or only 30%) of observations fall within one standard deviation from the mean, is that uncommon or completely unremarkable?

Ah, note now that you have stopped discussing the size of standard deviation / variance, and started discussing the proportion of observations within one standard deviation of the mean, an entirely different concept. Very roughly speaking this is more related to the peakedness of the distribution.

For example, without changing the variance at all, I can change the proportion of a population within 1 sd of the mean quite readily. If the population has a $t_3$ distribution, about 82% of it lies within 1 sd of the mean; if it has a uniform distribution, about 58% lies within 1 sd of the mean; and with a beta($\frac18,\frac18$) distribution, it's about 29%. This can happen with all of them having the same standard deviation, or with any of them being larger or smaller, without changing those percentages -- it's not really related to spread at all, because you defined the interval in terms of standard deviation.
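The proportion within one standard deviation can be checked numerically for each of these distributions. The following is a sketch using only the Python standard library: the uniform and $t_3$ cases use closed forms, and the beta case integrates the density with Simpson's rule. The function names are made up for this example, and the beta routine assumes the interval mean ± sd lies strictly inside (0, 1), which holds for beta(1/8, 1/8).

```python
import math

def within_one_sd_uniform():
    """Uniform(0, 1): sd = 1/sqrt(12), and the interval mean +/- sd has width 2*sd."""
    return 2 / math.sqrt(12)

def within_one_sd_normal():
    """Normal: P(|Z| < 1), shown for reference."""
    return math.erf(1 / math.sqrt(2))

def within_one_sd_t3():
    """Student t with 3 df: sd = sqrt(3); uses the closed-form CDF for 3 df."""
    def cdf(t):
        u = t / math.sqrt(3)
        return 0.5 + (u / (1 + u * u) + math.atan(u)) / math.pi
    s = math.sqrt(3)
    return cdf(s) - cdf(-s)

def within_one_sd_beta(a, b, n=2000):
    """Beta(a, b): Simpson's rule over mean +/- sd (n must be even)."""
    mu = a / (a + b)
    s = math.sqrt(a * b / ((a + b) ** 2 * (a + b + 1)))
    lo, hi = mu - s, mu + s  # assumed to lie strictly inside (0, 1)
    norm = math.gamma(a) * math.gamma(b) / math.gamma(a + b)

    def pdf(x):
        return x ** (a - 1) * (1 - x) ** (b - 1) / norm

    h = (hi - lo) / n
    total = pdf(lo) + pdf(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * pdf(lo + i * h)
    return total * h / 3

print(f"t_3:            {within_one_sd_t3():.3f}")
print(f"normal:         {within_one_sd_normal():.3f}")
print(f"uniform:        {within_one_sd_uniform():.3f}")
print(f"beta(1/8, 1/8): {within_one_sd_beta(0.125, 0.125):.3f}")
```

None of these functions takes a scale parameter, because rescaling a distribution moves the interval endpoints along with it; that is why the proportion says nothing about how large the SD is.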

[1]: Cohen J. (1992), "A power primer," *Psychol Bull.*, **112**(1), Jul: 155-9.

If the distribution is identical, the percentage would be fixed, not changing.

If things work as they should, you won't be able to delete it; while you "own" your question, once a question has answers, you don't get to delete them, so the question - a valid question with valid answers - should stay, *even if it's not what you wanted to ask about*. I'd suggest you start your new question with some basic concepts; you may find many of your current intuitions don't apply.

It's a clearer question, and would have been a good one to ask. Unfortunately, the problem is that you've dramatically changed the question in a way that invalidates the answers you received (the other one fairly completely, mine partially). Why should it not simply be rolled back to as it stood when it got those answers?

However, rather than remove what you had before, you can add your revised question at the end, and leave the original for context, so that the other answer still looks like it answers a question. It's hardly fair to put Tim's originally valid answer in danger of being marked as "not an answer" (and then deleted) when his answer responded to an important part of what you originally asked. The easy way is to copy what you have now (into say a notepad window), roll your question back, then edit to repaste in the new content (and add any explanation of the change you feel is necessary).

(a), no the comparison to mice came later in the discussion. At the time you called it "very uniform" no mention of mice had been made. (b) No, there's no relationship between mean and sd for normal distributions in general; the normal is a location-scale family. There is for say exponential distributions. ... (ctd)

(ctd)... (c) I'm not sure it makes sense to view the set of population standard deviations (or coefficients of variation if you prefer that) across species as being a population of its own; I guess if one views it as a kind of random effect, it could be done, but there's no a priori basis I am aware of to assume any kind of distribution for those, so it's a bit hard to figure out how one would conclude a value was anomalous. Maybe a biologist has some way to justify such an approach.

Of course the SD tells you something; this was right near the start of my opening response. It just doesn't tell you the exact kind of thing you appear to want it to -- whether the variation is large in some absolute sense. The length of a piece of string tells you something -- how long the piece of string is. You appear now to be saying the equivalent of "unless I can say that length is *long* or *short* in some absolute sense, knowing the string is 23 cm long doesn't tell me anything"

By Chebyshev's inequality we know that the probability of $X$ lying at least $k$ standard deviations ($k\sigma$) from the mean is at most $\frac{1}{k^2}$:

$$ \Pr(|X-\mu|\geq k\sigma) \leq \frac{1}{k^2} $$

However, by making some distributional assumptions you can be more precise; e.g. the normal approximation leads to the 68–95–99.7 rule. Generally, using any cumulative distribution function you can choose an interval that should encompass a certain percentage of cases. However, choosing the confidence interval width is a subjective decision, as discussed in this thread.
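A small Python sketch can put the distribution-free Chebyshev bound next to the exact tail probability under a normal assumption. The function names are made up for this example; the normal tail uses `math.erfc` from the standard library.

```python
import math

def chebyshev_bound(k):
    """Upper bound on P(|X - mu| >= k*sigma) for any distribution with finite variance."""
    return min(1.0, 1.0 / k ** 2)

def normal_two_sided_tail(k):
    """P(|X - mu| >= k*sigma) when X is exactly normal."""
    return math.erfc(k / math.sqrt(2))

for k in (1, 2, 3):
    print(
        f"k={k}: Chebyshev bound {chebyshev_bound(k):.4f}, "
        f"normal tail {normal_two_sided_tail(k):.4f}"
    )
```

The normal tails (roughly 32%, 5% and 0.3%) are the complements of the 68–95–99.7 rule and sit well below the Chebyshev bounds, which is the price of making no distributional assumption.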

*Example*

The most intuitive example that comes to my mind is the intelligence scale. Intelligence is something that cannot be measured directly; we do not have direct "units" of intelligence (by the way, centimeters or degrees Celsius are also somewhat arbitrary). Intelligence tests are scored so that they have a mean of 100 and a standard deviation of 15. What does this tell us? Knowing the mean and standard deviation, we can easily infer which scores can be regarded as "low", "average", or "high". As "average" we can classify the scores obtained by most people (say 50%), higher scores can be classified as "above average", uncommonly high scores as "superior", and so on. This translates to the table below.

Wechsler (WAIS–III) 1997 IQ test classification

| IQ Range ("deviation IQ") | IQ Classification |
|---|---|
| 130 and above | Very superior |
| 120–129 | Superior |
| 110–119 | High average |
| 90–109 | Average |
| 80–89 | Low average |
| 70–79 | Borderline |
| 69 and below | Extremely low |

So the standard deviation tells us how far we can expect individual values to lie from the mean. You can think of $\sigma$ as a unitless distance from the mean. If you think of observable scores, say intelligence test scores, then knowing the standard deviation enables you to easily infer how far (how many $\sigma$'s) some value lies from the mean, and so how common or uncommon it is. It is subjective how many $\sigma$'s qualify as "far away", but this can easily be qualified by thinking in terms of the probability of observing values lying a certain distance from the mean.
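As a sketch of this reasoning, the snippet below converts an IQ score to a number of $\sigma$'s from the mean and then to a tail probability. Treating the scores as exactly normal is an assumption made for the example, and the helper names are hypothetical.

```python
import math

def z_score(x, mean, sd):
    """Distance of x from the mean, measured in standard deviations."""
    return (x - mean) / sd

def normal_upper_tail(z):
    """P(Z >= z) for a standard normal Z (normality is assumed here)."""
    return 0.5 * math.erfc(z / math.sqrt(2))

IQ_MEAN, IQ_SD = 100.0, 15.0  # scoring convention stated in the text

z = z_score(130.0, IQ_MEAN, IQ_SD)
p = normal_upper_tail(z)

print(f"IQ 130 is {z:.1f} sd above the mean")
print(f"share of scores at or above 130 under normality: {p:.4f}")
```

Under that assumption a score of 130 lies two standard deviations above the mean and is reached by a little over 2% of people; with a different distribution the same z-score would map to a different share.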

This is obvious if you look at what the variance ($\sigma^2$) is:

$$ \operatorname{Var}(X) = \operatorname{E}\left[(X - \mu)^2 \right]. $$

...the expected (average) squared distance of $X$ from $\mu$. If you wonder why it is squared, you can read about it here.

Your interpretation of the mean requires normality. IQ is not normally distributed (the tails are thicker and the curve is skewed). Therefore the 3-sigma-rule does not apply. Also, your interpretation is circular, because the IQ classification is randomly based on the SD and cannot in turn explain the SD.

Licensed under CC-BY-SA with attribution

Content dated before 6/26/2020 9:53 AM

whuber 5 years ago

I would like to suggest that considerable insight into these questions can be had by replacing "variance" or "standard deviation" by some other (more familiar) quantity that plays an analogous role in quantitative description, such as length. When describing most physical objects, scientists will report a length. What does the length actually mean? What length is considered uncommonly large or small? Are there guidelines for assessing the magnitudes of lengths? If a length is 90 (or 30), is that uncommon or completely unremarkable?