Scale (rating scale)
May 31, 2026
A familiar situation: you open a survey and see the question "Rate the quality of service from 1 to 5." You give it a 4, because "overall it was fine, but not perfect."
Your colleague also gives it a 4 — but in their mind it means "almost everything was bad, I only give a 5 for perfection." The same number, a completely different meaning. How did this happen? It comes down to the scale — or more precisely, to the fact that the scale was chosen carelessly: without labels, without calibration, without an understanding of what exactly it measures. A scale is not just "a strip from 1 to 5." It is a measuring instrument, and whether you get accurate data or a set of random numbers depends on its quality.
What a scale is
A scale is a system of gradations through which a respondent expresses their evaluation, attitude, degree of agreement, or the intensity of an attribute. A scale translates a person's subjective experience ("I liked it," "I somewhat disagree," "very convenient") into a numeric or categorical value suitable for analysis.
Without scales, surveys would consist only of open-ended questions, and analyzing hundreds of free-form text answers by hand is slow and unreliable. Scales solve this problem: they standardize answers, make them comparable to one another, and prepare them for statistical processing. But standardization works only when the scale is chosen correctly. A badly chosen scale does not help; it masks the real picture.
Levels of measurement: the foundation of any scale
Before examining specific scales, it is worth understanding the four levels of measurement, a classification proposed by Stanley Stevens in 1946 that still defines the rules for working with data.
Nominal Scale. The simplest level: categories without order. "Which browser do you use: Chrome / Firefox / Safari / Other." The categories can be counted (how many people chose each option), but they cannot be ranked — Chrome is not "greater than" or "better than" Firefox in the sense of a scale. Permitted operations: frequency counts, mode.
Ordinal Scale. The categories are ordered, but the distances between them are unknown. "Beginner → Intermediate → Advanced → Expert." Advanced is "greater than" intermediate, but by exactly how much is unknown. The gap between beginner and intermediate may be enormous, while the gap between advanced and expert may be minimal. Permitted operations: median, rank correlations.
Interval Scale. The distances between the divisions are equal, but the zero point is arbitrary. The classic example is temperature in Celsius: the difference between 10° and 20° is the same as between 20° and 30°, but 0° does not mean "an absence of temperature." In surveys, the Likert scale is strictly speaking ordinal, but it is commonly treated as interval (with caveats; more on this below). Permitted operations: mean, standard deviation.
Ratio Scale. Like the interval scale, but with an absolute zero. Income, age, number of purchases — zero means the absence of the attribute, and "40 years" is exactly twice as much as "20 years." In its pure form it is rare in questionnaire scales — more often it appears in quantitative questions ("How many times have you used the service?").
The level of measurement determines which mathematical operations are permissible. Calculating a mean for a nominal scale (for example, "the average gender is 1.4") is nonsense. Calculating a mean for a 5-point satisfaction rating is debatable, but widely practiced. Understanding these boundaries helps you avoid presenting artifacts as conclusions.
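The permitted operations listed above can be sketched as a small lookup table. This is only an illustration in Python: the names PERMITTED and summarize are hypothetical, the sample data is invented, and the sketch assumes the usual convention that each level also allows the statistics of the levels below it.

```python
from statistics import mean, median, mode

# Summary statistics allowed at each level of measurement.
# Assumption: higher levels inherit the statistics of lower ones.
PERMITTED = {
    "nominal": ["mode"],
    "ordinal": ["mode", "median"],
    "interval": ["mode", "median", "mean"],
    "ratio": ["mode", "median", "mean"],
}

STATS = {"mode": mode, "median": median, "mean": mean}


def summarize(values, level):
    """Compute only the statistics that are meaningful for the given level."""
    return {name: STATS[name](values) for name in PERMITTED[level]}


browsers = ["Chrome", "Firefox", "Chrome", "Safari"]  # nominal categories
skill_ranks = [1, 2, 2, 3, 4]  # ordinal codes: 1 = Beginner ... 4 = Expert

print(summarize(browsers, "nominal"))
print(summarize(skill_ranks, "ordinal"))
```

Because the mean is simply not offered for nominal or ordinal data, a result like "the average gender is 1.4" cannot be produced by accident.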
Popular scales in surveys
Likert Scale
The most widespread scale in the world of surveys. The respondent is presented with a statement and expresses their degree of agreement: "Strongly disagree — Disagree — Neutral — Agree — Strongly agree." The classic version has 5 points, but 7-point and 4-point (without a neutral option) versions are also used.
When to use it: measuring attitudes, opinions, degree of satisfaction. A universal tool for HR surveys, marketing research, and measuring CSAT.
Pitfalls: a tendency toward the middle (respondents en masse choose "Neutral" so they don't have to think) and socially desirable or acquiescent answers (they mark "Agree" because that feels "more correct"). Read more about the properties of this scale in the article Likert Scale.
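To make the coding step concrete, here is a minimal Python sketch that maps the five labels to the codes 1 to 5, then reports the median and the share of "Neutral" answers as a rough signal of the tendency toward the middle. The names LIKERT_CODES and likert_summary and the sample answers are hypothetical; what counts as "too many" neutral answers is left to the analyst.

```python
from collections import Counter
from statistics import median

# Classic 5-point agreement scale, coded 1..5.
LIKERT_CODES = {
    "Strongly disagree": 1,
    "Disagree": 2,
    "Neutral": 3,
    "Agree": 4,
    "Strongly agree": 5,
}


def likert_summary(answers):
    """Median code, share of 'Neutral', and counts per label."""
    codes = [LIKERT_CODES[a] for a in answers]
    counts = Counter(answers)
    return {
        "median": median(codes),
        "neutral_share": counts["Neutral"] / len(answers),
        "counts": {label: counts[label] for label in LIKERT_CODES},
    }


answers = ["Agree", "Neutral", "Neutral", "Strongly agree", "Disagree"]
summary = likert_summary(answers)
print(summary["median"], summary["neutral_share"])
```

The sketch reports the median rather than the mean, which stays on the safe side of the ordinal-versus-interval debate.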
Numeric Rating Scale
"Rate from 0 to 10" or "from 1 to 5." The respondent gives a number. Visually it can be presented as a row of digits, a slider, or stars.
When to use it: when you need a quick numeric measurement. The NPS metric is built on an 11-point numeric scale (0–10). Star ratings in CSAT are on a 5-point scale.
Pitfalls: without labels for the values, respondents interpret the scale differently. For one person "7 out of 10" is good, for another it is mediocre. Always label the endpoints ("1 = Very poor," "5 = Excellent"), and ideally every point.
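As an illustration of how a 0–10 numeric scale becomes a metric, the sketch below computes NPS with the conventional cut-offs: 9–10 are promoters, 7–8 are passives, and 0–6 are detractors. The function name nps and the sample scores are hypothetical, and this is not a description of how any particular survey tool implements the calculation.

```python
def nps(scores):
    """Net Promoter Score: percent promoters (9-10) minus percent detractors (0-6)."""
    if not scores:
        raise ValueError("at least one score is required")
    if any(not 0 <= s <= 10 for s in scores):
        raise ValueError("NPS scores must be between 0 and 10")
    promoters = sum(1 for s in scores if s >= 9)
    detractors = sum(1 for s in scores if s <= 6)
    # Passives (7-8) count toward the total but toward neither group.
    return 100 * (promoters - detractors) / len(scores)


scores = [10, 9, 8, 7, 6, 3, 10, 9, 5, 8]
print(nps(scores))  # 4 promoters, 3 detractors, 10 answers -> 10.0
```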
Semantic Differential
Two opposite poles with a scale between them. "Expensive ——— Cheap," "Modern ——— Outdated," "Friendly ——— Cold." The respondent marks their position on this spectrum.
When to use it: evaluating perceptions of brands, products, interfaces. It works well for comparing several objects across the same parameters. The UEQ questionnaire for measuring user experience is built on semantic-differential items; SUS, by contrast, uses Likert-type agreement statements.
Pitfalls: the wording of the poles must be truly opposite. "Fast — Unreliable" are not antonyms, and such a scale will mislead respondents.
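Comparing several objects across the same parameters usually means building a profile per object: the mean position on each pole pair. The Python sketch below assumes a 7-point range where 1 is the left pole and 7 is the right pole; the names PAIRS, profile and profile_gap and all ratings are hypothetical.

```python
from statistics import mean

# Pole pairs from the text; ratings run from 1 (left pole) to 7 (right pole).
# The 7-point range is an assumption made for this sketch.
PAIRS = ["Expensive/Cheap", "Modern/Outdated", "Friendly/Cold"]


def profile(responses):
    """Mean position on each pole pair across all respondents."""
    return {pair: mean(r[pair] for r in responses) for pair in PAIRS}


def profile_gap(a, b):
    """Per-pair difference between two profiles (a minus b)."""
    return {pair: a[pair] - b[pair] for pair in PAIRS}


brand_a = [
    {"Expensive/Cheap": 2, "Modern/Outdated": 2, "Friendly/Cold": 3},
    {"Expensive/Cheap": 4, "Modern/Outdated": 2, "Friendly/Cold": 1},
]
brand_b = [
    {"Expensive/Cheap": 6, "Modern/Outdated": 5, "Friendly/Cold": 4},
    {"Expensive/Cheap": 6, "Modern/Outdated": 3, "Friendly/Cold": 4},
]

gap = profile_gap(profile(brand_a), profile(brand_b))
print(gap)
```

A negative gap means brand A sits closer to the left pole than brand B on that pair; in this invented data brand A is perceived as more expensive, more modern, and friendlier.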
Visual Analogue Scale (VAS)
A continuous line without divisions — the respondent places a mark anywhere on it. The line is usually 100 mm long, which lets you convert the position of the mark into a number from 0 to 100. It is used in medical research (pain assessment), psychology, and UX testing. Read more in the article VAS.
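The conversion from mark position to score is a simple proportion. A minimal Python sketch, assuming the mark is measured in millimetres from the left end of the line; the function name vas_score is hypothetical, and the line_mm parameter covers the case where a printed line is not exactly 100 mm long.

```python
def vas_score(mark_mm, line_mm=100.0):
    """Convert the distance of the mark from the left end into a 0-100 score."""
    if line_mm <= 0:
        raise ValueError("line length must be positive")
    if not 0 <= mark_mm <= line_mm:
        raise ValueError("mark must lie on the line")
    return 100 * mark_mm / line_mm


print(vas_score(63))       # standard 100 mm line -> 63.0
print(vas_score(75, 150))  # a 150 mm line, mark at the midpoint -> 50.0
```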
Guttman Scale
A set of statements arranged by increasing difficulty or intensity. If a respondent agrees with a "stronger" statement, it is implied that they also agree with all the weaker ones before it. Example: "I use the app at least once a month" → "I use the app at least once a week" → "I use it every day" → "I recommend the app to friends." Read more in the article Guttman Scale.
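Whether real answers follow this cumulative logic can be checked mechanically. In the Python sketch below each respondent is a list of 0/1 answers ordered from the weakest to the strongest statement. Errors are counted as deviations from the ideal pattern with the same number of agreements, which is one of several error-counting conventions; the function names and the data are hypothetical.

```python
def is_cumulative(pattern):
    """True if agreement never returns after the first disagreement."""
    return all(a >= b for a, b in zip(pattern, pattern[1:]))


def guttman_errors(pattern):
    """Mismatches against the ideal pattern with the same total score."""
    score = sum(pattern)
    ideal = [1] * score + [0] * (len(pattern) - score)
    return sum(o != i for o, i in zip(pattern, ideal))


def reproducibility(patterns):
    """Coefficient of reproducibility: 1 - errors / total answers."""
    total = sum(len(p) for p in patterns)
    errors = sum(guttman_errors(p) for p in patterns)
    return 1 - errors / total


patterns = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],  # skips the second statement: not cumulative
    [1, 0, 0, 0],
]

print([is_cumulative(p) for p in patterns])
print(reproducibility(patterns))  # 2 errors in 16 answers -> 0.875
```

A coefficient close to 1 suggests the statements really do form a single ladder; a low value suggests that they measure more than one thing.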
How many points in a scale is optimal
The eternal question: 5, 7, 10, or something else? There is no single answer, but there are guidelines.
Fewer than 4 points is too coarse. A binary scale ("Yes / No") does not capture nuances. A 3-point scale ("Bad / Okay / Good") loses most of the information. It is suitable only for the simplest measurements.
5 points is the classic. Enough granularity for most tasks, without overloading the respondent. The standard for CSAT, CES, and general satisfaction ratings.
7 points is a little more precise than 5 and is common in academic research and attitude measurement. Some studies report slightly higher reliability for 7-point scales than for 5-point ones, but the difference is small.
10–11 points is high granularity. NPS uses a 0–10 scale. The advantage: greater variability of answers. The downside: it is harder for a respondent to distinguish "6" from "7" — the choice becomes less deliberate.
Even vs. odd number of points. An odd number of points (5, 7) allows a neutral position — the respondent can "dodge." An even number (4, 6) forces them to pick a side. If getting a definite answer matters to you, use an even number. If you want to grant the right to neutrality, use an odd one.
Practical recommendations for working with scales
Label every point. "1 — 2 — 3 — 4 — 5" without explanations is guesswork with numbers. "1 = Strongly disagree, 2 = Somewhat disagree, 3 = Neutral, 4 = Somewhat agree, 5 = Strongly agree" is measurement. Labels reduce the spread of interpretations and increase the reliability of the data.
Use one scale for one block of questions. If in a block of 10 questions the first three use a 5-point scale, the next two a 7-point one, and the last five a 10-point one, the respondent will get confused and the data will be incomparable. Consistency is the key to clean results.
Avoid ambiguous labels. "Satisfactory" means "acceptable" to some and "just barely enough" to others. The more specific the wording, the smaller the spread. "I liked everything" is more precise than "Good."
Take cultural context into account. In some cultures, extreme ratings (1 and 5) are given rarely — it is considered "impolite" or "categorical." In others, on the contrary, middle values are unpopular. If your audience is international, test the scale on different groups.
Add a "Cannot rate" / "Not applicable" option. If a respondent has not used a feature, forcing them to give a rating is pointless. They will put down a random number that pollutes the data. An "N/A" option filters out irrelevant answers and makes averages cleaner.
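In analysis, the "N/A" option only helps if those answers are excluded from the average rather than coded as a number. A minimal Python sketch, assuming that "Cannot rate" answers are stored as None; the function name mean_rating and the sample data are hypothetical.

```python
from statistics import mean


def mean_rating(ratings):
    """Average only the real ratings; None stands for 'Cannot rate' / N/A.

    Returns (mean, number of ratings used). The mean is None if nobody rated.
    """
    rated = [r for r in ratings if r is not None]
    if not rated:
        return None, 0
    return mean(rated), len(rated)


ratings = [5, 4, None, 3, None, 4]
avg, n = mean_rating(ratings)
print(avg, n)  # mean of 5, 4, 3, 4 over 4 real ratings
```

Reporting the count alongside the mean shows how many respondents the average actually rests on.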
Scales in SurveyNinja
All the main types of scales are available in the SurveyNinja builder.
Star and number ratings. A configurable scale from 1 to N: stars, numbers, emoji. Labels for each value are optional. Suitable for quick satisfaction ratings and CSAT.
Slider. A continuous scale with a configurable range and step. The respondent drags a marker — visually it resembles a VAS, but with a numeric value. Convenient for questions like "How likely are you to recommend us?"
Matrix. Several statements with the same scale, combined into a table. It saves screen space and speeds up completion. Ideal for blocks of Likert scales. Read more about setting up questions in the help article on elements.
NPS question. A ready-made element with a 0–10 scale, automatic splitting into promoters, passives, and detractors, and index calculation in the analytics.
A scale is not a decoration of a questionnaire but its measuring instrument. A poor scale produces data that looks convincing but means nothing. A good one turns respondents' subjective feelings into numbers you can trust. Choose the scale before writing the questions, label every point, and do not change the scale in the middle of the questionnaire.
Published: May 31, 2026
Mike Taylor