Survey sample size
Determining survey sample size depends largely on the confidence
required in the results and the resources allocated for the study.
Because each additional interview adds less statistical confidence
than the one before it, clients often don’t want
to purchase more “confidence” than they need to decide
their next course of action.
Several factors contribute to determining the optimal sample size
for a market research survey. The primary factors are population
size and homogeneity, and desired confidence interval and level.
While population size and homogeneity are characteristics of the
group being studied, the confidence interval and level must be chosen
based on the purpose and budget for the study. A brief explanation
of each factor follows:
Population Size – The number of people in the group that
the survey sample will represent. For example, you might want to
represent a nation of 100 million potential consumers, a city of
100 thousand registered voters, or a local area of 2,000 qualified
customers. Unless the population is very small or a very narrow
confidence interval is selected, population size makes little
difference to the sample size needed.
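The effect of population size can be sketched with a short calculation. The text does not state which formula it uses, so this example assumes the standard sample size formula for a proportion with a finite population correction, an even split (p = 0.5), and a z-score of 1.96 for 95% confidence. The function name `required_sample_size` is hypothetical.

```python
import math

Z_95 = 1.96  # z-score commonly used for a 95% confidence level (assumption)


def required_sample_size(population, margin=0.05, z=Z_95, p=0.5):
    """Completed surveys needed for a finite population (hypothetical helper)."""
    # Sample size for an effectively infinite population.
    n0 = (z ** 2) * p * (1 - p) / margin ** 2
    # Finite population correction: only matters when the population is small.
    return math.ceil(n0 / (1 + (n0 - 1) / population))


# The three example populations from the text, at 95% +/-5%.
for population in (2_000, 100_000, 100_000_000):
    print(f"{population:>11,}: {required_sample_size(population)}")
# prints 323, 383, and 385 respectively
```

Under these assumptions, a thousandfold increase in population from 100 thousand to 100 million adds only two surveys. Rerunning with `margin=0.01` widens the gap between small and large populations.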
Population Homogeneity – The measure of similarity in the group.
A smaller sample is needed to achieve the same confidence if 85%
of the group agrees on an issue than if only 55% agree. Of course,
we often don’t have an estimate of this until after we conduct
the surveys. Further, a large percentage may feel one way on some
questions, while other questions show an almost even split in the
group’s opinions. As such, we assume that the group we are studying
is evenly divided.
This way we achieve the desired confidence for all questions, while
achieving better than desired confidence for questions where a large
percentage of the group is in agreement.
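To illustrate why assuming an even split is the conservative choice, the sketch below compares the surveys needed for different levels of agreement. It assumes the standard large-population formula n = z² · p(1 − p) / e² with z = 1.96 (95% confidence) and e = 0.05; the function name `sample_size_for_split` is hypothetical and the formula is not taken from the text.

```python
import math


def sample_size_for_split(p, margin=0.05, z=1.96):
    """Surveys needed when a share p of a large population agrees (hypothetical helper)."""
    # p * (1 - p) peaks at p = 0.5, so an even split needs the largest sample.
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)


for p in (0.85, 0.55, 0.50):
    print(f"{p:.0%} agreement: {sample_size_for_split(p)} surveys")
# prints 196, 381, and 385 surveys respectively
```

Sizing the sample for p = 0.5 therefore covers every question, and questions with lopsided answers end up with a tighter interval than planned.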
Confidence Interval and Confidence Level – Together these describe
the level of certainty needed from the results. The confidence
interval tells us how close we expect the results of our sample to
be to the true population value. If we choose a confidence interval
of plus or minus five percent, for example, and the survey shows 60%
in favor of a given proposition, then we expect that between 55% and
65% of the total population agrees with the proposition. The
confidence level indicates the likelihood that the population
percentage will be within the selected interval. For example, if we
select a confidence level of 95% and a confidence interval of plus
or minus five percent, then we are 95% certain that the population
percentage is within five percentage points above or below the
sample percentage determined in the survey.
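The relationship between the two terms can be sketched in code. This example assumes a large population, the normal approximation for a proportion, and the even-split assumption described earlier; the function name `confidence_interval` and the sample of 385 completed surveys are illustrative choices, not figures from the text.

```python
import math
from statistics import NormalDist


def confidence_interval(sample_share, n, level=0.95):
    """Return (low, high) bounds around a sample percentage (hypothetical helper)."""
    # The confidence level determines the z-score (about 1.96 for 95%).
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    # Half-width of the interval, using the worst-case even split (p = 0.5).
    margin = z * math.sqrt(0.25 / n)
    return sample_share - margin, sample_share + margin


low, high = confidence_interval(0.60, n=385)
print(f"{low:.1%} to {high:.1%}")  # roughly 55% to 65%
```

Raising `level` widens the interval for the same number of surveys, while raising `n` narrows it.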
The selection of the confidence level and interval depends upon
the purpose of the survey. We most often recommend a confidence
level of 95 percent with a confidence interval of plus or minus
five percent (95% +/-5%) for the total sample. For a close political
race, however, a five percent confidence interval might be too broad.
The following table shows the number of surveys required to achieve
specific levels of confidence based on a known population size.
Increasing the confidence level or narrowing the confidence interval
raises the number of surveys required, and with it the cost of the
study.
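As a rough way to explore this trade-off, the sketch below computes survey counts for several confidence levels and intervals. It is not the author's table: it assumes the standard formula with a finite population correction and an even split, and the population of 5,000 and the function name `surveys_needed` are illustrative.

```python
import math
from statistics import NormalDist


def surveys_needed(population, level, margin, p=0.5):
    """Completed surveys for a given level and interval (hypothetical helper)."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    n0 = z ** 2 * p * (1 - p) / margin ** 2
    # Finite population correction for a known population size.
    return math.ceil(n0 / (1 + (n0 - 1) / population))


population = 5_000  # illustrative population size
for level in (0.90, 0.95, 0.99):
    for margin in (0.05, 0.03):
        n = surveys_needed(population, level, margin)
        print(f"{level:.0%} +/-{margin:.0%}: {n} surveys")
```

Because the required sample grows with the square of the z-score and shrinks with the square of the interval, tightening the interval from five to three points more than doubles the survey count in this example.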
Two final factors need to be considered when determining the sample
size. The first is the number of people who refuse to answer certain
questions. The sample sizes shown in the table above represent the
number of people who answer any given question. If respondents refuse
to answer certain questions, the sample must be increased until
the desired number of answers equals the minimum shown above. As
such, SRA often recommends conducting 400 surveys if the client
wants to achieve 95% +/-5% confidence.
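The adjustment for refusals is simple arithmetic, sketched below. The 5% refusal rate is a made-up planning figure, and both function names are hypothetical; the 357 and 375 figures are the ones used in the town example in this article.

```python
import math


def surveys_to_field(required_answers, refusal_rate):
    """Surveys to conduct so enough people answer every question (hypothetical helper)."""
    # Each survey yields an answer with probability (1 - refusal_rate).
    return math.ceil(required_answers / (1 - refusal_rate))


def refusals_tolerated(surveys_fielded, required_answers):
    """How many refusals a fielded sample can absorb (hypothetical helper)."""
    return max(surveys_fielded - required_answers, 0)


print(surveys_to_field(357, 0.05))   # 376 surveys at an assumed 5% refusal rate
print(refusals_tolerated(375, 357))  # 18 refusals can be absorbed
```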
The final factor in selecting the sample size is the desired level
of confidence in demographic groupings within the total sample.
As an example, consider a study in a town of 5,000 voters that is
divided into five districts of 1,000 voters each. While a sample
of 375 should be enough to obtain 95% +/-5% confidence for the total
sample (assuming no more than 18 refusals and at least 357 responses
to every question), this gives us only 75 surveys per district.
This would result in only 90% +/-9% confidence in survey results
by district, which has little statistical meaning. At least 1,390
surveys (278 per district) would be needed to achieve 95% +/-5%
confidence by district (assuming no refusals), nearly quadrupling
the cost to field the survey.
One alternative would be to conduct 214 interviews in each district,
accepting 90% +/-5% confidence by district, while improving the
confidence in the total sample to 98% +/-3.2%.
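The figures in the town example can be checked with a short script. It assumes the standard formulas for a proportion with a finite population correction and an even split, which the text does not state explicitly but which reproduce its numbers; the function names are hypothetical.

```python
import math
from statistics import NormalDist


def z_score(level):
    """Two-sided z-score for a confidence level such as 0.95."""
    return NormalDist().inv_cdf(1 - (1 - level) / 2)


def sample_size(population, level, margin, p=0.5):
    """Completed surveys needed for a finite population."""
    n0 = z_score(level) ** 2 * p * (1 - p) / margin ** 2
    return math.ceil(n0 / (1 + (n0 - 1) / population))


def margin_of_error(n, population, level, p=0.5):
    """Half-width of the confidence interval for n completed surveys."""
    fpc = math.sqrt((population - n) / (population - 1))
    return z_score(level) * math.sqrt(p * (1 - p) / n) * fpc


town, district, districts = 5_000, 1_000, 5

print(sample_size(town, 0.95, 0.05))                   # 357 for the whole town
print(round(margin_of_error(75, district, 0.90), 3))   # 0.091 with 75 per district
print(districts * sample_size(district, 0.95, 0.05))   # 1390 for 95% +/-5% by district

per_district = sample_size(district, 0.90, 0.05)
print(per_district)                                    # 214 for 90% +/-5% by district
total = districts * per_district
print(round(margin_of_error(total, town, 0.98), 3))    # 0.032 for the total sample
```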