A representative sample is drawn from a population of interest and has demographics and characteristics that match those of the population in as many ways as possible.

## A Problematic Study

Suppose you had a friend who did a study of intelligence and found that 9 out of 10 people are super geniuses. Between your peals of laughter, you would probably think that he must have been the odd one out! Obviously, his study had a problem, but what is the problem, exactly?

You might think that it is because his sample only had 10 people, but if he grabbed 10 people *at random*, what are the odds that 9 of them would be super geniuses? Suppose further that the results of his study were based on a sample of 100 people. 90 super geniuses in a group of 100: pretty impressive, and clearly proof that the study has a problem, right? Not if his sample is Fields Medal winners and Nobel Laureates! (Now you are wondering about the other 10, aren’t you?)
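To put a rough number on "what are the odds," here is a minimal Python sketch. It assumes, purely for illustration, that 1 in 1,000 people in the general population counts as a super genius; that rate and the names `ASSUMED_GENIUS_RATE` and `prob_at_least` are made up for this example and do not come from the study. The calculation uses the binomial formula, which applies only if each person is picked independently and at random.

```python
from math import comb


def prob_at_least(k, n, p):
    """Probability of k or more successes in n independent draws,
    where each draw succeeds with probability p (binomial model)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical base rate: an assumption for illustration, not a measured value.
ASSUMED_GENIUS_RATE = 0.001

# Chance that a truly random sample contains this many super geniuses.
p_small = prob_at_least(9, 10, ASSUMED_GENIUS_RATE)     # 9 or more out of 10
p_large = prob_at_least(90, 100, ASSUMED_GENIUS_RATE)   # 90 or more out of 100

print(f"9+ of 10:   {p_small:.3g}")
print(f"90+ of 100: {p_large:.3g}")
```

Under that assumption both probabilities are vanishingly small, so a result like this points to how the sample was chosen rather than to luck or to sample size.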

## Representative Sample Defined

His study is invalid because the sample he chose did not represent the population to which he is applying the findings.

If he were saying that 9 out of 10 Fields Medal winners and Nobel Laureates are super geniuses, that finding might make sense. But if he tried to apply those results to the regular population, you would counter that he does not have a **representative sample** of the general population. That is, the characteristics of the sample he chose do not match those of the group at large, so the sample does not *represent* the group well. Whenever you want to study a small group and *generalize* from the small group to the larger one, you need to make sure that the small group is just like the larger one.

Otherwise, a difference in the makeup of the two groups, rather than anything true of the larger group, could explain what you observed in the small one.

## Examples

Let’s take a look at a couple of examples.

Example 1: Suppose I wanted to find out if Americans are generally better at basketball than Canadians. I could take 1000 people from each country and run them through tests of basketball skills, and may the best team win! But if 51% of the American population is women and 700 of my sample of 1000 Americans are women, then my sample does not accurately represent the demographics of the USA. In that case, any differences that I would want to attribute to country might not be valid, because the American sample is not a *representative sample*.

Example 2: A government agency wants to assess a new smoking cessation program, so they grab 10,000 Americans at random, being very careful to match the proportions of men and women in the sample to the percentage of each gender within the U.S. population. The agency also matches up the ethnic makeup of the sample to the U.S. population, and likewise for the ages of the participants. This sounds like a nice representative sample, except for one thing: the program is for *smokers*! If the agency wants to apply the findings to smokers, then the sample needs to represent the population of smokers.

Taking 10,000 Americans at random is going to include a bunch of people who do not smoke, which defeats the purpose of the whole study.

Thus, if we want to generalize a finding from a sample to a population, the sample must be *demographically* and *characteristically* representative of the population to which we want to apply the finding.
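One simple way to check the basketball sample from Example 1 is to compare each group's share of the sample with its share of the population. The sketch below does that in Python. The function name `representation_gaps` and the 3-percentage-point tolerance are hypothetical choices for illustration, not a standard rule, and the 49% figure for men is simply what remains after the 51% given in the example.

```python
def representation_gaps(sample_counts, population_shares):
    """For each group, return (share of sample) minus (share of population).
    A positive gap means the group is over-represented in the sample."""
    total = sum(sample_counts.values())
    return {
        group: sample_counts.get(group, 0) / total - share
        for group, share in population_shares.items()
    }


# Figures from Example 1; the share for men is inferred as 1 - 0.51.
population_shares = {"women": 0.51, "men": 0.49}
sample_counts = {"women": 700, "men": 300}

gaps = representation_gaps(sample_counts, population_shares)

# Hypothetical tolerance: flag any group that is off by more than 3 points.
TOLERANCE = 0.03
flagged = [group for group, gap in gaps.items() if abs(gap) > TOLERANCE]

for group, gap in gaps.items():
    print(f"{group}: {gap:+.2f}")
print("Not representative for:", flagged)
```

A real study would run the same comparison for every characteristic that could affect the outcome, such as age or, in Example 2, smoking status.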

## Lesson Summary

Let’s review.

A **representative sample** is demographically and characteristically similar to the population of interest. When the sample matches the population, it can be inferred that any effects shown by the sample are likely to be reflected in the population, too.