Pilot survey
May 31, 2026 Reading time ≈ 9 min
Picture this: you spent two weeks designing a questionnaire. You aligned it with your team, checked the wording, set up logic jumps. You launched it to your entire database of 10,000 contacts. Three days later you have 1,200 responses and discover that question 7 is being read two different ways: half the respondents interpret it as a rating of the product, the other half as a rating of the purchase process.
The data for that question is useless. And yet it was the very question that measured the key metric the whole study was built around. Fifteen minutes of pilot testing on 15 people would have prevented this disaster — but the pilot stage was "skipped to save time."
What a pilot survey is
A pilot survey (pilot study) is a trial run of a questionnaire on a small group of participants from the target audience before the main study. The goal is to uncover problems with wording, logic, technical settings, and the overall perception of the questionnaire before they spoil real data.
An analogy from aviation: a pilot does not take an aircraft on a commercial flight without running a pre-flight check of every system. A pilot survey is exactly that pre-flight check: everything seems to be set up correctly, but until it has been tested on live people, there is no certainty. Human perception is too complex a system to predict its behavior from behind a desk.
A pilot is not just "taking the questionnaire yourself." Self-review is necessary but not sufficient: as the author, you see in the questions exactly the meaning you intended. A pilot is valuable precisely because it involves people who see your questions for the first time — and interpret them in their own way.
What problems a pilot uncovers
The list of things that can go wrong in a questionnaire is astonishingly long. Piloting catches most of these problems — provided you know where to look.
Confusing wording
The most common finding. A question that seemed crystal clear to the author is understood differently by a pilot participant — or not understood at all.
Example. The question: "How often do you use our service for professional purposes?" A pilot participant asks: "And if I use it both for work and for personal tasks at the same time — is that 'for professional purposes' or not?" The wording failed to account for an edge case that, as it turned out, is typical for half the audience.
Missing answer options
A multiple-choice question lists five options, but a pilot participant says: "My case doesn't fit any of them — I had to pick the closest one, even though it's inaccurate." That's a signal to add an option or rethink the categories. In a field study, such a respondent will silently pick the "wrong" option — and you won't even know the data is distorted.
Length problems
You planned for 5 minutes, but pilot participants complete the questionnaire in 12. After the eighth minute they start speeding up, giving less considered answers, skipping open-ended questions. This is a direct signal: the questionnaire needs to be shortened, otherwise in the main launch the abandonment rate will turn out to be unacceptably high.
Broken branching logic
You set up a logic jump: if option "No" is selected on question 3, the respondent should skip to question 8. But because of an error in the settings, they see question 4, which begins with the words "Tell us more about your experience using..." — an experience they don't have, judging by their "No" answer. In a pilot, the participant will notice this and tell you. In a field launch, they'll simply get confused and answer at random.
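Part of this check can be rehearsed before any participant sees the questionnaire. Below is a minimal sketch, assuming the questionnaire can be written down as an ordered list of question ids plus a table of skip rules. The names `QUESTIONS`, `SKIP_RULES`, and `walk` are hypothetical and do not correspond to any particular survey tool's export format.

```python
# Hypothetical representation: question ids in display order.
QUESTIONS = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# Hypothetical skip rules: (question id, answer) -> question id to jump to.
SKIP_RULES = {(3, "No"): 8}


def walk(answers, questions=QUESTIONS, skip_rules=SKIP_RULES):
    """Return the list of question ids a respondent would see.

    `answers` maps question id -> chosen answer. Questions without a
    matching skip rule fall through to the next question in order.
    """
    seen = []
    index = 0
    while index < len(questions):
        qid = questions[index]
        seen.append(qid)
        target = skip_rules.get((qid, answers.get(qid)))
        if target is None:
            index += 1
            continue
        # A jump must point forward to an existing question.
        if target not in questions or questions.index(target) <= index:
            raise ValueError(f"bad jump target {target} from question {qid}")
        index = questions.index(target)
    return seen


path_no = walk({3: "No"})
path_yes = walk({3: "Yes"})
print(path_no)   # questions shown after answering "No" on question 3
print(path_yes)  # questions shown after answering "Yes" on question 3
```

A walk like this only confirms that the rules say what you meant. It cannot confirm that the survey platform was configured the same way, which is why the pilot on live participants is still needed.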
Technical defects
The survey won't open on certain mobile devices. A slider scale doesn't work in Safari. An image in a question loads only on desktop. The "Next" button is cut off on a small screen. None of these defects can be detected by testing the questionnaire on a single laptop.
Sensitive questions
A question about income, health, or political views may cause discomfort. A pilot helps gauge the reaction: if three out of ten participants say "this question made me uncomfortable," it's worth softening the wording, adding a "Prefer not to answer" option, or moving the question to the end of the questionnaire, where it won't affect willingness to answer the rest.
Uniform answers
If 90% of pilot participants answer "4" on a five-point scale question, the question doesn't differentiate: it fails to distinguish between opinions. Either the wording is too general, or the scale doesn't fit, or the answer is simply obvious. Such a question should be removed or reworded, because if everyone answers the same way, the data carries no information.
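This check is easy to script once pilot answers are exported. The sketch below assumes the answers are available as a mapping from question id to a list of chosen values. The helper names `modal_share` and `flag_uniform`, the sample data, and the 0.9 threshold are illustrative assumptions taken from the 90% example above, not a standard.

```python
from collections import Counter


def modal_share(answers):
    """Return (most common answer, its share of all answers)."""
    counts = Counter(answers)
    value, count = counts.most_common(1)[0]
    return value, count / len(answers)


def flag_uniform(responses, threshold=0.9):
    """Return ids of questions whose most common answer reaches `threshold`."""
    flagged = []
    for qid, answers in responses.items():
        if answers and modal_share(answers)[1] >= threshold:
            flagged.append(qid)
    return flagged


# Made-up pilot data: question id -> answers on a five-point scale.
pilot = {
    "q5": [4, 4, 4, 4, 4, 4, 4, 4, 4, 3],  # 9 of 10 chose "4"
    "q6": [1, 2, 4, 5, 3, 4, 2, 5, 3, 4],  # spread across the scale
}
flagged = flag_uniform(pilot)
print(flagged)
```

With only 10 to 20 participants, a flag is a prompt to reread the question, not proof that it is broken.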
A pilot doesn't guarantee a perfect questionnaire — but it does guarantee that the crudest mistakes won't make it into the final version. Every problem found at the pilot stage is a problem that won't spoil a thousand real responses.
How to run a pilot: step by step
Step 1. Determine the size of the pilot group
For most tasks, 10–20 people is enough. This is not a statistical sample — a pilot does not aim to obtain representative data. Its job is to find problems. Research in survey methodology shows that 80% of wording problems are uncovered within the first 10–12 participants. Increasing the group to 30–50 is justified only for complex questionnaires with many branches — so that each branch is tested by at least a few people.
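For branched questionnaires, a rough back-of-the-envelope calculation shows why the group has to grow. The sketch below assumes you can estimate what percentage of the audience will land in each branch. The branch names, the percentages, and the minimum of 3 participants per branch are made-up illustrations, and the result is an expected count, not a guarantee.

```python
import math


def expected_per_branch(group_size, branch_percent):
    """Expected number of pilot participants landing in each branch."""
    return {name: group_size * pct / 100 for name, pct in branch_percent.items()}


def pilot_size_for_branches(branch_percent, min_per_branch=3):
    """Smallest group size whose expected count in the rarest branch
    reaches `min_per_branch`."""
    rarest = min(branch_percent.values())
    return math.ceil(min_per_branch * 100 / rarest)


# Made-up estimate of how the audience splits across three branches (in %).
branches = {"never used": 10, "occasional": 30, "regular": 60}

with_15 = expected_per_branch(15, branches)
needed = pilot_size_for_branches(branches)
print(with_15)
print(needed)
```

In this made-up case a group of 15 would be expected to send only one or two people down the rarest branch, while a group of 30 brings the expected count up to three.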
Step 2. Recruit the right participants
The pilot group should match the study's target audience as closely as possible. Testing a questionnaire for construction workers on marketers is pointless: they won't notice the industry terminology that would stump the target respondent. And vice versa: a marketer won't stumble over the word "conversion," but a construction worker might.
Try to include representatives of different subgroups of the target audience in the pilot: different ages, different levels of experience, different devices (both mobile and desktop are a must). The more diverse the group, the wider the range of problems it will help uncover.
Step 3. Run the pilot in two stages
Stage A: "Think-Aloud." Ask 3–5 participants to go through the questionnaire with you (in person or via video call), voicing their thoughts: "I understand this question to mean...", "Here I paused to think...", "I'm not sure which option to choose...". This is the most informative format: you see not only the final answer but also the decision-making process. The method came from cognitive interviewing — a technique developed specifically for testing questionnaires.
Stage B: Independent completion. The remaining 10–15 participants go through the questionnaire on their own — as in real conditions. After they finish, ask them 3–4 questions: "Were there any questions that seemed unclear?", "Did the survey seem too long?", "Were there moments where you wanted to choose an option that wasn't there?", "Is there anything you'd like to add?"
Step 4. Analyze the pilot results
Look not only at participant feedback, but also at the data itself:
- Completion time. Look at both the average and the spread. If the spread is huge (one participant finishes in 3 minutes, another in 15), some participants most likely got stuck on certain questions.
- Drop-off points. If several participants abandoned the questionnaire on the same question, that's a red flag.
- Answer distribution. A question that everyone answers the same way doesn't differentiate. A question where most pick "Other" is poorly designed.
- Quality of open-ended answers. If participants write "ok" or "don't know," or leave the field empty on an open-ended question, the wording doesn't motivate a detailed response.
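The first two checks in this list can be sketched in a few lines. The example assumes each pilot record holds the minutes spent and the last question reached, with `None` meaning the participant finished. The field names `minutes` and `last_question` and all the values are made up for illustration.

```python
from collections import Counter
from statistics import mean, median

# Made-up pilot records: minutes spent and the last question reached.
# `last_question` is None for participants who finished the questionnaire.
records = [
    {"minutes": 3.0, "last_question": None},
    {"minutes": 6.5, "last_question": None},
    {"minutes": 15.0, "last_question": None},
    {"minutes": 4.0, "last_question": 7},
    {"minutes": 5.5, "last_question": 7},
    {"minutes": 2.0, "last_question": 12},
]

# Completion time: average and spread, for finished questionnaires only.
completed = [r["minutes"] for r in records if r["last_question"] is None]
summary = {
    "mean": mean(completed),
    "median": median(completed),
    "min": min(completed),
    "max": max(completed),
}

# Drop-off points: how many participants quit at each question.
drop_offs = Counter(
    r["last_question"] for r in records if r["last_question"] is not None
)
# Questions where more than one participant quit deserve a closer look.
red_flags = sorted(q for q, n in drop_offs.items() if n > 1)

print(summary)
print(red_flags)
```

Reporting the minimum and maximum next to the mean matters here: with a handful of participants, one person who got stuck can move the average by several minutes.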
Step 5. Make edits and (if necessary) repeat
After the pilot, edit the questionnaire. If the edits were substantial (a key question rephrased, the branching structure changed, an entire block removed), it's worth running a second mini-pilot on 5–7 people to make sure the new wording works. If the edits are cosmetic (clarifying one word, adding an answer option), a repeat pilot is usually unnecessary.
What a pilot does not check
It's important to understand the tool's limits so as not to overestimate its capabilities.
Representativeness of results. 15 pilot participants are not a sample, and their answers cannot be generalized. If 8 of 15 gave a recommendation score of 9, that doesn't mean 53% of your audience are promoters, let alone that your NPS is 53. A pilot tests the instrument; it doesn't measure the audience.
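The arithmetic itself shows how little 15 answers pin down. NPS is the percentage of promoters (scores 9–10) minus the percentage of detractors (scores 0–6), so eight scores of 9 out of 15 fix only the promoter share. The short sketch below uses two made-up pilots to show how far the result swings depending on the other seven answers.

```python
def nps(scores):
    """Net Promoter Score: % of promoters (9-10) minus % of detractors (0-6)."""
    promoters = sum(1 for s in scores if s >= 9)
    detractors = sum(1 for s in scores if s <= 6)
    return 100 * (promoters - detractors) / len(scores)


# Two made-up pilots in which 8 of 15 participants gave a 9.
rest_passive = [9] * 8 + [8] * 7     # the other 7 are passives (7-8)
rest_detractors = [9] * 8 + [5] * 7  # the other 7 are detractors (0-6)

print(round(nps(rest_passive)))
print(round(nps(rest_detractors)))
```

The same eight promoters yield a score of about 53 in one pilot and about 7 in the other, and neither number says anything reliable about an audience of 10,000.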
The main launch's response rate. The fact that 15 of 15 invitees completed the pilot does not guarantee a high response rate when you send the survey to 10,000 contacts. Pilot participants are usually more motivated (you asked them personally), and their behavior is not representative of a mass audience.
Long-term comparability. A pilot verifies that the questionnaire works now. But if six months from now you change the wording of one key question, results from before and after the change will no longer be comparable, and the pilot won't warn you about it.
Piloting in SurveyNinja
In the SurveyNinja builder, piloting requires no separate tools — all the functionality for it is already built into the standard survey-creation process.
Preview. Before publishing, go through the questionnaire in preview mode — it reproduces the survey exactly as the respondent will see it, including all branches and conditions. Test it on mobile and desktop: behavior may differ.
A private link for the pilot group. Publish the survey and send the link only to the pilot participants. After the pilot, you can delete the test responses, make edits, and launch to the main audience — without recreating the survey.
Analytics of pilot data. Even on 15 responses, the built-in analytics will show distributions, average completion time, and drop-off points. The incomplete responses section will tell you exactly which question participants abandoned the questionnaire on.
Collaborative editing. Bring in colleagues through collaborative editing so that, after the pilot, the team can quickly make edits — without waiting for one person to work through all the comments one by one.
Piloting is the cheapest stage of research and at the same time the most cost-effective. 15 minutes of testing on 15 people save you from mistakes that would surface only when analyzing a thousand responses — when it's already too late to fix them.
Mike Taylor