
Standard deviation

The average satisfaction score is 4.2. But what does that mean: did everyone answer "4" or "5", or did roughly half give a "1" and the rest a "7"? To understand how much answers scatter around the mean, we use standard deviation. It shows the typical distance between an individual value and the mean: the larger it is, the wider the spread. In surveys, standard deviation helps you assess how consistent the answers are, build confidence intervals, and judge how "typical" the mean really is.

Standard deviation is one of the measures of spread in descriptive statistics. Without it the mean can be misleading: the same "average score of 3.5" can come from consensus (everyone around 3-4) or from polarization (half low scores, half high ones).

What standard deviation is in plain terms

Standard deviation (SD) is a measure of how spread out data are around the arithmetic mean. It shows how far values typically deviate from the mean: if the standard deviation is small, the data are "clustered" near the center; if it is large, they are spread out widely. It is calculated as the square root of the variance (the mean of the squared deviations from the mean). In a distribution close to normal (bell-shaped), roughly two thirds of observations fall within the band "mean plus or minus one standard deviation".

To put it simply: standard deviation answers the question "how much, on average, do the answers differ from the mean". If on a 1-5 scale the mean is 4 and the standard deviation is 0.3, almost everyone answered "4"; if the standard deviation is 1.5, the answers are scattered from "1" to "5".

How it is calculated

Formula. First you compute the arithmetic mean of all values. Then for each value you find its deviation from the mean (value minus mean), square it, and average those squares - this gives the variance. The standard deviation is the square root of the variance. In software and spreadsheets this is done automatically with the STDEV function or an equivalent; note that such functions may divide by N-1 rather than N (see "For a sample and for a population" below).
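The steps above can be written compactly as follows. The symbols are notation introduced here for illustration, not taken from the article: x_i is an individual answer, N the number of answers, x-bar the mean, and sigma the standard deviation (the version that divides by N).

```latex
\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i, \qquad
\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} \left(x_i - \bar{x}\right)^2, \qquad
\sigma = \sqrt{\sigma^2}
```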

Worked example. Five answers: 3, 4, 4, 4, 5. Mean = (3+4+4+4+5)/5 = 4. Deviations: -1, 0, 0, 0, 1. Squares: 1, 0, 0, 0, 1. Mean of the squares (variance) = 0.4. Standard deviation = sqrt(0.4) ~ 0.63. The answers are close to the mean - the spread is small.

If the answers had been 1, 2, 4, 6, 7 (mean also 4), the standard deviation would be about 2.3 - the spread is noticeably larger.
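To check the two examples by hand, here is a minimal Python sketch that follows the same steps (mean, deviations, squares, average, square root). The function name population_sd is only illustrative, and it divides by N, as the worked example does.

```python
import math

def population_sd(values):
    """Standard deviation that divides by N, as in the worked example."""
    mean = sum(values) / len(values)
    squared_deviations = [(v - mean) ** 2 for v in values]
    variance = sum(squared_deviations) / len(values)
    return math.sqrt(variance)

tight = [3, 4, 4, 4, 5]   # answers close to the mean
wide = [1, 2, 4, 6, 7]    # same mean, wider spread

print(round(population_sd(tight), 2))  # 0.63
print(round(population_sd(wide), 2))   # 2.28
```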

For a sample and for a population. The formula divides either by N (the number of observations) or by N-1 (Bessel's correction). Dividing by N-1 gives an unbiased estimate of the population variance from a sample, and its square root is the usual estimate of the population standard deviation. Dividing by N describes the spread of the data you actually have. For large samples the two versions are almost identical; in reports N-1 is more common. In Excel the STDEV.S function uses N-1, while STDEV.P uses N.
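The difference between the two versions is easy to see on a small data set. This sketch uses Python's standard statistics module, where pstdev divides by N and stdev divides by N-1; the comparison to the Excel functions in the comments follows the description above.

```python
import statistics

answers = [3, 4, 4, 4, 5]

sd_population = statistics.pstdev(answers)  # divides by N, like STDEV.P
sd_sample = statistics.stdev(answers)       # divides by N-1, like STDEV.S

print(round(sd_population, 2))  # 0.63
print(round(sd_sample, 2))      # 0.71
```

With only five answers the gap is noticeable; with hundreds of answers it becomes negligible.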

When you need it

Assessing how consistent the answers are. A low standard deviation (for example, 0.5 on a 1-5 scale) means respondents answered similarly; a high one (for example, 1.8) means opinions differ greatly. This is useful when interpreting the mean: "a mean of 4.2 with a standard deviation of 0.4" indicates consensus, while "a mean of 4.2 with a standard deviation of 1.6" indicates polarization.

Confidence intervals. To build a confidence interval around the mean you need to know the standard error of the mean (the standard deviation divided by the square root of the sample size). The larger the standard deviation, the wider the interval - the uncertainty is higher.
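As a rough illustration, the sketch below computes the standard error and an approximate 95% interval for a hypothetical set of 100 answers. It assumes the normal approximation with a multiplier of 1.96, which is a common simplification for larger samples rather than the only valid method; the function name and the data are made up for the example.

```python
import math
import statistics

def mean_with_interval(values, z=1.96):
    """Return (mean, standard error, lower bound, upper bound)."""
    n = len(values)
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)   # sample SD (divides by N-1)
    se = sd / math.sqrt(n)          # standard error of the mean
    margin = z * se                 # assumes a normal approximation
    return mean, se, mean - margin, mean + margin

# Hypothetical answers on a 1-5 scale
answers = [4] * 80 + [5] * 15 + [3] * 5

mean, se, low, high = mean_with_interval(answers)
print(f"mean = {mean:.2f}, SE = {se:.3f}, 95% CI = [{low:.2f}, {high:.2f}]")
# mean = 4.10, SE = 0.044, 95% CI = [4.01, 4.19]
```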

Comparing groups. When comparing means across two groups (for example, by segments), it matters to look not only at the difference in means but also at the standard deviations. If group A has a mean of 4.5 (standard deviation 0.3) and group B has a mean of 4.2 (standard deviation 1.2), the groups differ not only in their mean but also in how consistent the answers are.

Checking data quality. An unusually large standard deviation can signal problems: data-entry errors, an ambiguous question, a mixed audience. For example, if for the question "Rate from 1 to 5" the standard deviation is close to 2.0 with a mean of 3.0, this may mean the question was understood in different ways or the sample is heterogeneous.

Interpretation in surveys

By scales. For a 1-5 scale a standard deviation of about 0.5-0.8 usually indicates high consistency; 1.0-1.3 indicates moderate spread; above 1.5 indicates strong spread or polarization. But it matters to read it in the context of the mean: with a mean of 4.5 a standard deviation of 0.6 may mean "mostly 4s and 5s", whereas with a mean of 3.0 the same standard deviation means "mostly 2s, 3s and 4s".

Comparison with the range. The range (maximum minus minimum) shows the full span but does not capture how the values are distributed within it. Standard deviation takes into account all values and their frequencies. For example, with answers 1, 1, 5, 5 the range = 4 and the standard deviation ~ 2.0 (polarization). With answers 1, 3, 3, 3, 5 the range is also 4, but the standard deviation is only ~ 1.3, because most answers sit at the center. With answers 2, 3, 3, 4, 4 the range is 2 and the standard deviation ~ 0.8 (consensus).

Relation to the median. In a symmetric distribution the mean and the median are close, and the standard deviation describes the spread around both. With skew, the median is more robust to outliers, while the standard deviation can be inflated by single extreme values. That is why, with skew, people sometimes look at the median and the interquartile range (IQR) as an alternative to the mean and the standard deviation.
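The effect of a single extreme value can be seen in a small experiment. The data below are hypothetical scores on a 0-10 scale, and the quartiles come from statistics.quantiles with its default method, so other tools may give slightly different IQR values.

```python
import statistics

def describe(values):
    """Mean and SD next to the more robust median and IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return {
        "mean": statistics.fmean(values),
        "sd": statistics.stdev(values),
        "median": statistics.median(values),
        "iqr": q3 - q1,
    }

typical = [6, 7, 7, 8, 8, 8, 8, 9, 9, 10]
with_outlier = typical + [0]   # one extreme low answer

before = describe(typical)
after = describe(with_outlier)
for key in before:
    print(f"{key}: {before[key]:.2f} -> {after[key]:.2f}")
```

In this example the mean drops and the standard deviation more than doubles, while the median does not move.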

Examples in numbers

Scenario 1: high consistency. Question "Rate the service from 1 to 5": 100 answers, 80% are "4", 15% are "5", 5% are "3". Mean ~ 4.1, standard deviation ~ 0.4. Almost everyone is satisfied, the spread is minimal. In the report you can write: "an average score of 4.1 (standard deviation 0.4), which indicates a high level of agreement in opinions".

Scenario 2: polarization. Again 100 answers, but 40% are "1", 20% are "3", 40% are "5". The mean is 3.0 and the standard deviation ~ 1.8. Here the mean does not reflect the "typical" answer - there are two "camps". In the report it is appropriate to note: "a mean of 3.0 with a standard deviation of 1.8 points to polarized opinions; analysis by segments is recommended".

Scenario 3: moderate spread. Distribution: 10% are "1", 20% are "2", 30% are "3", 25% are "4", 15% are "5". Mean ~ 3.2, standard deviation ~ 1.2. There is spread, but no clear polarization. This is a typical picture for many surveys.
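The three scenarios can be reproduced directly from the answer shares. This is a minimal sketch: the function name mean_and_sd and the dictionary layout are illustrative, and it divides by N (with 100 answers the N-1 version gives almost the same result).

```python
import math

def mean_and_sd(distribution):
    """distribution maps a score to its share of answers (shares sum to 1)."""
    mean = sum(score * share for score, share in distribution.items())
    variance = sum(share * (score - mean) ** 2
                   for score, share in distribution.items())
    return mean, math.sqrt(variance)

scenarios = {
    "high consistency": {3: 0.05, 4: 0.80, 5: 0.15},
    "polarization": {1: 0.40, 3: 0.20, 5: 0.40},
    "moderate spread": {1: 0.10, 2: 0.20, 3: 0.30, 4: 0.25, 5: 0.15},
}

for name, shares in scenarios.items():
    mean, sd = mean_and_sd(shares)
    print(f"{name}: mean = {mean:.2f}, SD = {sd:.2f}")
# high consistency: mean = 4.10, SD = 0.44
# polarization: mean = 3.00, SD = 1.79
# moderate spread: mean = 3.15, SD = 1.19
```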

Relation to other metrics

Variance. Standard deviation is the square root of the variance. Variance is measured in squared units (for example, if scores are in points, the variance is in "squared points"), while standard deviation is in the same units as the original data. That is why standard deviation is more convenient for interpretation.

Arithmetic mean. The mean and the standard deviation together describe the center and the spread. Without the standard deviation the mean can be deceptive: a "mean of 3.5" with a standard deviation of 0.3 and with one of 1.8 are different situations. Survey reports often present both figures: "a mean of 4.2 (SD = 0.7)".

Confidence intervals. To build an interval around the mean you use the standard error of the mean (standard deviation / sqrt(N)). The larger the standard deviation and the smaller the sample, the wider the interval. Read more in the article on confidence intervals.

The normal distribution. In the normal distribution about 68% of values fall within "mean +/- 1 standard deviation", about 95% within "mean +/- 2 standard deviations", and about 99.7% within "mean +/- 3 standard deviations". This "three sigma" rule (also called the 68-95-99.7 rule) helps you estimate how many answers fall in a given range when the distribution is close to normal.
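These percentages can be verified with the NormalDist class from Python's standard statistics module. The helper share_within is an illustrative name; the sketch applies to an exactly normal distribution, which survey answers on a short scale only approximate.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Share of a normal distribution within mean +/- k standard deviations."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, f"{share_within(k):.1%}")
# 1 68.3%
# 2 95.4%
# 3 99.7%
```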

Coefficient of variation. The ratio of the standard deviation to the mean (CV = SD / mean) shows the relative spread. It is useful when comparing variables in different units or with different means. For example, if the mean is 4.0 and the standard deviation is 0.8, then CV = 0.2 (20% spread relative to the mean).

Common mistakes

Ignoring the standard deviation when interpreting the mean. A "mean of 3.8" without a measure of spread does not give the full picture. Always state the standard deviation next to the mean, or at least mention the spread in words ("answers ranged from 1 to 5").

Computing a standard deviation for categorical variables. For nominal variables (region, customer type) a standard deviation is not calculated - only frequencies and proportions are appropriate there. For ordinal scales (for example, a Likert scale) the standard deviation does make sense, but it is interpreted with the limitations of the scale in mind.

Comparing standard deviations without taking the means into account. A standard deviation of 1.0 with a mean of 2.0 and with a mean of 4.5 are different situations. With a mean of 2.0 a spread of 1.0 is relatively large (50% of the mean), while with a mean of 4.5 it is moderate (about 22%). Use the coefficient of variation to compare relative spread.
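The comparison above takes one line of arithmetic per case. A small sketch, with an illustrative function name:

```python
def coefficient_of_variation(sd, mean):
    """Relative spread: the standard deviation as a fraction of the mean."""
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return sd / mean

print(f"{coefficient_of_variation(1.0, 2.0):.0%}")  # 50%
print(f"{coefficient_of_variation(1.0, 4.5):.0%}")  # 22%
```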

Confusing standard deviation with standard error. Standard deviation describes the spread of the original data; the standard error of the mean (SD / sqrt(N)) describes the uncertainty in the estimate of the mean. For confidence intervals you need the standard error, not the standard deviation itself.

Expecting normality. The rule "68% within one standard deviation" works for the normal distribution. In surveys, distributions are often skewed or bounded by the scale - in that case this rule is only approximate. Look at the histogram and use the standard deviation as a descriptive measure, not as a strict rule.

By segments and subgroups

Standard deviation is calculated not only for the whole sample but also within segments: by region, customer type, age. This helps you understand in which groups opinions are consistent and in which they are polarized. For example, if in segment A the mean is 4.5 (SD = 0.4) and in segment B the mean is 3.8 (SD = 1.5), the groups differ both in level and in consistency. In the report it is appropriate to state the standard deviations for each subgroup next to the means.
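A per-segment summary can be sketched as below. The segment labels and scores are hypothetical, and the sample version of the standard deviation (N-1) is used, which needs at least two answers in every segment.

```python
import statistics
from collections import defaultdict

# Hypothetical (segment, score) pairs on a 1-5 scale
responses = [
    ("A", 5), ("A", 4), ("A", 5), ("A", 4), ("A", 5),
    ("B", 5), ("B", 2), ("B", 5), ("B", 1), ("B", 5),
]

def summarize_by_segment(rows):
    """Return {segment: (mean, sample SD)}."""
    grouped = defaultdict(list)
    for segment, score in rows:
        grouped[segment].append(score)
    return {
        segment: (statistics.fmean(scores), statistics.stdev(scores))
        for segment, scores in grouped.items()
    }

summary = summarize_by_segment(responses)
for segment, (mean, sd) in summary.items():
    print(f"Segment {segment}: mean = {mean:.1f} (SD = {sd:.2f})")
# Segment A: mean = 4.6 (SD = 0.55)
# Segment B: mean = 3.6 (SD = 1.95)
```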

How this looks in SurveyNinja

In reports, the mean is shown by default for scale questions; the standard deviation is not displayed in the interface. You can obtain it after exporting answers to CSV/XLSX and computing it in Excel (the STDEV.S function) or in another statistical package. When preparing a report for a client, it is convenient to add the standard deviation next to the mean - this gives a fuller picture of how the answers are spread.
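If you prefer a script to a spreadsheet, the same calculation can be sketched in Python. The column names respondent_id and service_rating are hypothetical (the actual export layout is not described here), and an in-memory string stands in for the exported file so that the example runs on its own.

```python
import csv
import io
import statistics

# Stand-in for an exported CSV file; column names are hypothetical.
exported_csv = io.StringIO(
    "respondent_id,service_rating\n"
    "1,4\n2,5\n3,4\n4,3\n5,4\n6,\n"
)

def column_mean_and_sd(file_obj, column):
    """Read one numeric column, skip blank cells, return (n, mean, sample SD)."""
    scores = [
        float(row[column])
        for row in csv.DictReader(file_obj)
        if row[column].strip()
    ]
    return len(scores), statistics.fmean(scores), statistics.stdev(scores)

n, mean, sd = column_mean_and_sd(exported_csv, "service_rating")
print(f"n = {n}, mean = {mean:.1f} (SD = {sd:.2f})")
# n = 5, mean = 4.0 (SD = 0.71)
```

With a real file you would pass open("export.csv", newline="") instead of the in-memory string; the file name is again only an example. statistics.stdev divides by N-1, matching STDEV.S.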

Practical recommendations

Always state the standard deviation next to the mean. Instead of "an average score of 4.2", write "an average score of 4.2 (SD = 0.7)" or "an average score of 4.2, standard deviation 0.7". This helps the reader assess how consistent the answers are.

Interpret it in the context of the scale. For a 1-5 scale a standard deviation of 0.5 means high consistency; for a 0-100 scale the same standard deviation of 0.5 means almost complete agreement. Always take the range of the scale into account.

When there is polarization, extend the analysis. If the standard deviation is large (for example, more than about a third of the range of the scale; half the range is the maximum possible), it makes sense to look at the distribution and run analysis by segments - there may be groups with different opinions within the sample.

When comparing groups, state the standard deviations. When comparing means across two groups, state the standard deviation for each group. This helps you understand whether the groups differ only in their mean or also in how consistent the answers are.

What to write in the report. In the results section, state the mean and the standard deviation for the key scale questions. If the standard deviation is unusually large or small, comment briefly: "a high level of agreement in opinions" or "polarized answers are observed, analysis by segments is recommended".

Standard deviation shows the spread of data around the mean and helps you understand how consistent the answers are. Without it the mean can be misleading; together they give a full picture of central tendency and spread - the basis for interpreting survey results.
