We can learn a lot by gathering information from a small subset of a larger group. By sampling this smaller group, scientists can save time and money while still gaining a good understanding of the population as a whole.
Part of the Whole
When I cook, I like to follow recipes, but I also like to make them my own. So I follow the steps – chopping, mixing, stirring, baking, boiling, etc. But along the way, I may add a dash of this or a tablespoon of that to make the dish a bit more interesting.
I don’t just dump in a whole bag of seasoning, though; I add a little until it tastes just right. How do I know when to stop? Well, I keep tasting it until it has the right flavor. These small tastes allow me to test the changes to the dish without having to eat the whole thing. I mean, that would be a pretty silly thing to do, right? I can reasonably assume that my small spoonful is representative of the rest of the pot on the stove – there’s no reason that I have to keep taste testing unless I add something new. Otherwise, I can safely bet that what I have for my meal will taste the same as that small sample on my spoon.
Scientists often make the same types of assumptions about a population, which is all the members of a group being studied. A population can be made up of anything: people, trees, households, cars, bottles of shampoo… whatever it is that you’re studying.
The assumptions about the population are based on a sample, which is a small portion that represents the characteristics of the overall population. Just like my random spoonful of food should represent what the rest of the dish will taste like, a random scientific sample should represent what the rest of the population is like.
Simple Random Sampling
The randomness of sampling is very important. If your sample is not random, it may not fairly represent the characteristics of the population.
Let’s think about this in terms of your dinner. If, for example, the dish you made was a vegetable soup, there are many different characteristics to represent. There’s the broth, the individual vegetables, like carrots, celery and onions, and of course, any cheese or noodles you may have mixed in. If you were to take a spoonful of soup to sample it, a random sample would mean that you just dunked your spoon in randomly, giving each piece of vegetable, cheese and noodle a fair chance of being selected each time.
But if you were to purposefully avoid the carrots each time you sampled, this would not be random. Your sample would not be a true representation of the overall soup population because no carrots would ever be sampled!
One way to avoid biasing your sample like this is to use a technique called simple random sampling. In order for the sample to be random, each individual must have an equal opportunity to be selected, and each individual selection must be independent of the others. What this means is that each individual in the population has a fair chance of being selected but that the selections don’t influence each other in any way.
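To make the idea concrete, here is a minimal Python sketch of simple random sampling. The population of 100 numbered customers and the function name simple_random_sample are made up for illustration. The sketch draws without replacement, so no individual can be picked twice.

```python
import random

# Hypothetical population: 100 numbered customers (illustrative only).
population = list(range(1, 101))

def simple_random_sample(items, n, seed=None):
    """Draw n distinct items, giving every item an equal chance of selection."""
    rng = random.Random(seed)      # a seed makes the draw repeatable
    return rng.sample(items, n)    # sampling without replacement

sample = simple_random_sample(population, 10, seed=42)
print(sample)
```

Because nothing about an individual (like being a carrot) affects the draw, no part of the population is systematically avoided.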
Systematic Sampling
Another useful type of sampling is systematic sampling. This is when samples are selected at specific, predetermined intervals. This is often used when simple random sampling would be too time-consuming.
For example, a grocery store might give a survey to every tenth customer in order to collect information on their shopping experiences. In this way, the sample can still be random, but it is collected at a fixed interval across the population.
The key with systematic sampling is that while the interval is predetermined, the start point must be randomized. You don’t want the first sampling point to always be the first customer of the day because the store is likely to have more shoppers at night than in the morning, which could bias your sample. If, however, you randomly select an hour of the day to start sampling, this helps spread your sample more evenly across the time your store is open.
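The combination of a fixed interval and a random start point can be sketched in a few lines of Python. The list of 100 customers and the function name systematic_sample are assumptions made for this example. Here the random start is a position within the first interval rather than an hour of the day.

```python
import random

def systematic_sample(items, interval, seed=None):
    """Pick a random start within the first interval, then take every interval-th item."""
    rng = random.Random(seed)
    start = rng.randrange(interval)   # random start index: 0 .. interval - 1
    return items[start::interval]     # fixed, predetermined step after that

# Hypothetical day of shoppers, numbered in order of arrival.
customers = list(range(1, 101))
sample = systematic_sample(customers, 10, seed=1)
print(sample)
```

Every selected customer is exactly ten arrivals after the previous one, but which customer comes first is left to chance.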
Stratified Sampling
Sometimes, instead of picking random individuals from the whole population, it’s more appropriate to select from different categories within the population. When this is done, we call it stratified sampling. These categories are called ‘strata,’ hence the name of this type of selection method. Each category, or stratum, is considered to be a sub-population from which individuals are sampled.
Stratified sampling is used when the characteristics of the population can’t be randomly sampled as a whole. Have you ever filled out a survey where you were asked which age range you belonged to? How about your race? You were probably asked about your gender as well, right?
When you select an age range of 20-29, 30-39 or 40-49; check a box for ‘Caucasian,’ ‘Native American’ or ‘Hispanic’; and indicate if you are ‘male’ or ‘female,’ it allows the survey authors to group individuals into certain categories before selecting random individuals for a sample. It probably doesn’t make much sense to lump 20-year-olds with 70-year-olds when asking about health care, nor is it always beneficial to combine males and females when studying jobs, salaries or other employment information. It’s in this way that stratified sampling allows for random samples to be selected, but from smaller, more comparable populations.
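As a rough sketch of this grouping-then-sampling idea, the Python below sorts survey respondents into age-range strata and then draws a random sample from each one. The respondent records, the age_range field and the function name stratified_sample are all hypothetical. Taking the same number from each stratum is just one possible choice.

```python
import random
from collections import defaultdict

def stratified_sample(people, key, per_stratum, seed=None):
    """Group people into strata with key(), then randomly sample within each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for person in people:
        strata[key(person)].append(person)      # build each sub-population
    sample = {}
    for name, members in strata.items():
        k = min(per_stratum, len(members))      # never ask for more than a stratum holds
        sample[name] = rng.sample(members, k)   # random draw inside the stratum
    return sample

# Hypothetical survey respondents, ten in each age range.
ranges = ["20-29", "30-39", "40-49"]
respondents = [{"id": i, "age_range": ranges[i % 3]} for i in range(30)]

sample = stratified_sample(respondents, key=lambda p: p["age_range"],
                           per_stratum=3, seed=7)
for age_range, picked in sample.items():
    print(age_range, [p["id"] for p in picked])
```

Each age range is guaranteed a place in the sample, which a single random draw from all 30 respondents could not promise.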
Cluster Sampling
Similar to stratified sampling, but more broadly categorized, is cluster sampling. This is when populations are divided into clusters, which are then randomly sampled. So, instead of randomly sampling individuals from each subset of the population, the subsets themselves are sampled.
Say, for example, that you want to do a political survey in a large city like Atlanta. It would be very time-consuming and expensive to use simple random sampling, because you might randomly select individuals that are spread throughout the entire city, and you would have to drive to each individual home in order to get your information.
If, however, you used cluster sampling, you would define clusters within the city, say one square mile each, and then randomly select which clusters to visit. And instead of sampling random individuals, you would visit each and every house within each selected cluster. When randomly selected, clusters can be assumed to represent the whole (like the city of Atlanta) but with much less time and effort.
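The city survey can be sketched in Python as well. The 20 clusters of five households each, their names and the function name cluster_sample are invented for illustration. The point to notice is that the random draw picks clusters, not individual homes.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly choose whole clusters, then keep every member of each chosen cluster."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)   # randomness applies to clusters
    return {name: clusters[name] for name in chosen}    # visit every house inside them

# Hypothetical city: 20 one-square-mile clusters with 5 households each.
city = {
    f"cluster_{i}": [f"house_{i}_{j}" for j in range(5)]
    for i in range(20)
}

visited = cluster_sample(city, 4, seed=3)
for name, houses in visited.items():
    print(name, houses)
```

Only four areas of the city need to be visited, yet every household in those areas is surveyed.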
Lesson Summary
As any good chef knows, you only need a small spoonful to understand what an entire pot of soup tastes like. Scientists know this as well, and they use a variety of sampling techniques to make inferences about a population.
In order for a sample to be representative of the population, it must be random. Simple random sampling takes this to heart: in this technique, each individual must have an equal opportunity to be selected, and each individual selection must be independent of the others.
Systematic sampling takes a more ordered approach because samples are selected at specific intervals. As long as the start point is randomly selected, systematic sampling could be every 100th customer, every 3rd tree or every 140th bag of dog food at the factory.
Stratified sampling is used to select from different categories within the population. By grouping individuals into relevant categories (such as age or gender), scientists can randomly select samples from smaller populations that are more similar to each other.
Finally, cluster sampling is used to divide populations into clusters, which are then randomly sampled. Instead of sampling individuals, the clusters themselves are randomly selected to represent the population as a whole.
After completing this lesson, you should be able to describe four sampling techniques that allow scientists to make inferences about a population.