Five-user testing still beats every fancy research method we've paid for.

A prospective client showed up last month with a 47-page research deck. Fifty users, eight tasks each, statistical significance bars on everything. They had paid a research firm about $14,000 for it. They wanted us to redesign their checkout based on the findings. We read the deck twice. Then we asked if we could put five people in front of the existing checkout for forty-five minutes apiece.

The five-user sessions found three problems the 47-page deck had missed entirely.

One of them explained their high checkout abandonment. A label said "Continue" when users expected "Pay." Two of the five users hovered over it, hesitated, scrolled back up to look for a payment summary, and then bounced. The fifty-user panel study had asked about the label in a multiple-choice question and gotten clean data. None of the fifty users had volunteered the actual confusion, because the panel format does not surface confusion. It surfaces opinions about confusion.

Why five is the right number

Jakob Nielsen published the math in 2000, building on his earlier modeling work with Tom Landauer. With five users you catch roughly 85% of the usability problems present in an interface. The curve flattens hard after that: under the same model, user six adds about 5% and user ten adds about 1%. If you have the budget for fifteen users on one design, you will learn dramatically more by running five users on three different designs than fifteen on one. The dollar-for-dollar return is not close.
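If you want to check the curve yourself, here is a minimal sketch of the model behind those numbers. It assumes the commonly cited form, found(n) = 1 - (1 - L)^n, with L set to 0.31, the average per-user detection rate Nielsen and Landauer reported. Your product's real L may be higher or lower, and the function names are ours, invented for illustration.

```python
# Sketch of the problem-discovery curve: found(n) = 1 - (1 - L) ** n.
# ASSUMPTION: DETECTION_RATE = 0.31 is the published average share of
# problems a single test user uncovers. It is not measured on your product.

DETECTION_RATE = 0.31


def share_found(users, rate=DETECTION_RATE):
    """Expected share of usability problems found after `users` sessions."""
    return 1 - (1 - rate) ** users


def marginal_gain(user_number, rate=DETECTION_RATE):
    """Extra share of problems contributed by the user_number-th user."""
    return share_found(user_number, rate) - share_found(user_number - 1, rate)


if __name__ == "__main__":
    for n in (1, 3, 5, 6, 10, 15):
        print(
            f"{n:>2} users: {share_found(n):.1%} found, "
            f"last user added {marginal_gain(n):.1%}"
        )
```

The sketch only covers a single design. The case for three rounds of five rests on each redesign exposing new problems, which this formula does not model.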

The reason this still surprises people is that five sounds unscientific. It is unscientific in the inferential-statistics sense. It is not unscientific in the "did we find the bug" sense, which is the only sense that matters when your goal is shipping a better product instead of publishing a paper.

What we actually charge for a round

A five-user moderated test with recruiting, a discussion guide, the sessions themselves, and a written summary runs us about three to five hours of work and $200 to $300 in participant incentives. We bill the client somewhere between $1,200 and $1,800 depending on recruiting difficulty. Turnaround is usually under two weeks.

An unmoderated panel study through a platform like UserTesting or Maze, fifty participants, lands a client in the $3,000–$6,000 range and takes about the same amount of time once you factor in the analysis. A custom-recruited quantitative study with a third-party firm is $10,000 and up. We have done all three. We have never been more useful to a client than during the moderated rounds.

What each method actually surfaces

Moderated testing with five users is good for one thing: finding out where real humans get stuck on a specific flow. The moderator catches the hesitation, the misread, the small "wait" the user mutters before clicking. None of that survives transcription. The video is the artifact. Watching a real person fail to find the "edit address" link is worth more than any chart of how many users found it.

Large unmoderated studies are good for a different thing: telling you whether a problem is widespread once you already know it exists. They are confirmatory, not exploratory. They are also good for testing variants of copy, layout, or pricing in a way that needs statistical confidence. Use them after you have qualitative insight, not before.

Diary studies are excellent for understanding context of use over time, and we recommend them maybe twice a year. Surveys are useful for sizing a market or validating a positioning hypothesis, and worse than useless for understanding usability. People cannot reliably introspect on their own clicking.

The pitch clients keep falling for

The reason agencies push panel studies is that they bill better and they look more rigorous on a proposal. A 47-page deck with bar charts feels like real research. A 4-page memo summarizing what five people stumbled on does not. Clients buy the deck. The deck does not fix the checkout.

We've had to talk three different clients out of panel studies this year. The pitch we use is simple. Spend a tenth of the budget on five-user testing. If we don't find at least three actionable problems in those five sessions, we'll refund and you can go run your panel. We have not had to refund.

The one case we'd skip it

If the question is genuinely "which of these two designs converts better," you do not want a usability test. You want an A/B test on real traffic with real money. Usability testing tells you why something is bad. It does not tell you what fraction of users will buy. Use the right tool. We've watched clients spend twenty grand on research to answer a question that a Cursor-generated test plan, a coin flip, and a week of live traffic would have settled.
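For the conversion question itself, the arithmetic is usually a plain two-proportion z-test. Here is a minimal sketch using only the standard library. The traffic numbers in the example are made up for illustration, the function name is ours, and a real test also needs a sample size and stopping rule decided before launch, which this does not cover.

```python
from math import erfc, sqrt


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Compare two conversion rates with a pooled two-proportion z-test.

    conv_a, conv_b: number of converting visitors in each variant.
    n_a, n_b: total visitors in each variant.
    Returns (z, two_sided_p_value).
    """
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value under the standard normal distribution.
    p_value = erfc(abs(z) / sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # HYPOTHETICAL numbers: 5,000 visitors per variant,
    # 200 purchases on design A and 250 on design B.
    z, p = two_proportion_z_test(200, 5000, 250, 5000)
    print(f"z = {z:.2f}, two-sided p = {p:.4f}")
```

A small p-value says the difference in conversion is unlikely to be noise. It says nothing about why one design wins, which is the part the five-user sessions answer.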

Five users, a quiet conference room, and a moderator who knows when to shut up. It's the cheapest research method we sell. It's also the one we'd defend on a witness stand.

Curious what five users would find on your site? Let's set up a round.