Talking about multiples of standard deviations can get exhausting and confusing. Luckily, z-scores allow us to talk about how far a point is removed from a mean in terms of how many standard deviations away it is.
What Is a Z-Score?
Imagine that you were dealing with a data set that had a standard deviation of 2. Remember that standard deviation is simply a measure of how scattered a collection of data is from the mean.
Much of your data will fall within one standard deviation of the mean, that is, within the mean plus or minus 2. Even more will fall within two standard deviations of the mean, so within the mean plus or minus 4. That notation works pretty well when the standard deviation is a simple number.
After all, 2 is a pretty easy concept to understand. But what if your standard deviation was a very small decimal, as it often is in engineering? Or a less simple number, like 593, as it could be in polling? Wouldn’t it be nice to have a unit that measures the value of something in terms of standard deviations away from the mean? Luckily, we have exactly such a tool: the z-score. The z-score measures how far a value is from the mean, expressed as a number of standard deviations.
Review of Standard Deviation
But wait, how do you find the standard deviation? Chances are you may already remember, but if you don’t, we’ll review that quickly. First, take the mean of all the values in the set.
Then, subtract the mean from each value, and square each difference. Find the mean of those squared differences, then take the square root of that number.
For example, let’s say you wanted to know the population standard deviation of the following test scores: 98, 90, 86, 83, and 70. First, take the mean, which comes out to 85.4. Then subtract that from each score. You get the following values: 12.6, 4.6, 0.6, -2.4, and -15.4. Square each of those, giving you 158.76, 21.16, 0.36, 5.76, and 237.16. Add all those up and find the average of them, which turns out to be 84.64. Then take the square root, which is 9.2.
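The steps above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `population_std` is made up for this example, and it divides by the number of values, which matches the population standard deviation used in the lesson.

```python
import math


def population_std(values):
    """Population standard deviation, following the steps in the lesson."""
    # Step 1: take the mean of all the values.
    mean = sum(values) / len(values)
    # Step 2: subtract the mean from each value and square each difference.
    squared_diffs = [(v - mean) ** 2 for v in values]
    # Step 3: find the mean of those squared differences.
    variance = sum(squared_diffs) / len(values)
    # Step 4: take the square root.
    return math.sqrt(variance)


scores = [98, 90, 86, 83, 70]
sd = population_std(scores)
print(round(sd, 2))  # 9.2
```

Python's standard library also offers `statistics.pstdev`, which should give the same result for this data.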
Why Use Z-Scores?
By far, the biggest advantage of a z-score is its brevity. When I say to include all values between z-scores of -2 and 2, you know that I want to include about 95% of values. That’s because, in a normally distributed data set, about 95% of the data lie within two standard deviations on either side of the mean.
It would be pretty easy to do that with raw values when the standard deviation is a round number, like 2, but could you imagine doing that when the standard deviation is 0.00487? Or when it is 1,381? You could bring out a calculator and find that two standard deviations of 0.00487 is 0.00974, and two standard deviations of 1,381 is 2,762. However, that introduces a great deal of potential for error. Z-scores limit those errors by giving a concise answer. With z-scores, you can worry about other things, not making sure that 0.00974 is exactly double 0.00487.
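To see why a z-score range is easier to work with than raw values, here is a small Python sketch. It checks the share of a normal distribution between z-scores of -2 and 2, and then converts that range back into raw values for the awkward standard deviation of 0.00487. The helper `value_at_z` and the mean of 0.25 are assumptions made up for illustration; the lesson gives no mean for that example.

```python
from statistics import NormalDist


def value_at_z(mean, std_dev, z):
    """Convert a z-score back into a raw value: mean + z * standard deviation."""
    return mean + z * std_dev


# Share of a normal distribution lying between z = -2 and z = 2.
standard_normal = NormalDist(mu=0, sigma=1)
share_within_two = standard_normal.cdf(2) - standard_normal.cdf(-2)
print(f"{share_within_two:.4f}")  # 0.9545

# Hypothetical mean; only the standard deviation comes from the lesson.
mean = 0.25
std_dev = 0.00487
lower = value_at_z(mean, std_dev, -2)
upper = value_at_z(mean, std_dev, 2)
print(f"{lower:.5f} to {upper:.5f}")  # 0.24026 to 0.25974
```

Whatever the mean and standard deviation turn out to be, the range is still described by the same two numbers, -2 and 2.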
How to Calculate a Z-Score
Because a z-score simply expresses a value in terms of the standard deviation, the formula is relatively simple.
First, subtract the mean from the value in question. Then, divide the answer by the standard deviation. If you have a value in question of 55, a mean of 30, and a standard deviation of 20, you would take 55 and subtract 30. That leaves 25. Divide 25 by the standard deviation of 20 and you end up with 1.25. That is the z-score of that data point.
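The two steps translate directly into code. The function below is a minimal sketch; its name and the check for a non-positive standard deviation are additions for illustration, not part of the lesson.

```python
def z_score(value, mean, std_dev):
    """Return how many standard deviations `value` lies from `mean`."""
    if std_dev <= 0:
        # Assumption for this sketch: a z-score is undefined without spread.
        raise ValueError("standard deviation must be positive")
    # Subtract the mean from the value, then divide by the standard deviation.
    return (value - mean) / std_dev


print(z_score(55, 30, 20))  # 1.25
```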
Sometimes, that number may come out negative. That’s fine; a negative z-score just means that the value in question is less than the mean. On a normal distribution curve, that means that it is to the left of the high point of the curve. Likewise, positive z-scores are always to the right of the high point.
It may seem that the math for this sort of thing is pretty simple, but let’s do a few questions to make sure that we understand it. We will do one with a positive z-score and then one with a negative z-score.
First, let’s say you were trying to find the z-score for a student’s book collection. According to your data, the mean book collection is 50, while the standard deviation is 10. The student in question has a collection of 75 books. What is the first thing that we do? First, subtract the mean from the value in question. 75 minus 50 is 25. Then divide that quantity by the standard deviation. 25 divided by 10 is 2.5. That means that this student has an exceptionally large book collection.
But what about his roommate? Let’s say that the mean stays at 50 and the standard deviation stays at 10. However, this student only has 15 books. Obviously, it’s a small number, but what is its z-score? First, subtract the mean from the quantity. This means 15 minus 50, or -35. Divide -35 by 10 and you get -3.5. That is this student’s z-score. It is statistically interesting because -3.5 is so many standard deviations below the mean. Perhaps his academic advisor should have a talk with him?
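Both book-collection examples can be run side by side, as in the sketch below. The dictionary keys and variable names are made up for illustration; the mean of 50, the standard deviation of 10, and the two collection sizes come from the examples above.

```python
MEAN_BOOKS = 50
STD_DEV_BOOKS = 10

# Collection sizes from the two worked examples.
collections = {"student": 75, "roommate": 15}

# Same formula for everyone: (value - mean) / standard deviation.
z_scores = {
    name: (books - MEAN_BOOKS) / STD_DEV_BOOKS
    for name, books in collections.items()
}

for name, z in z_scores.items():
    side = "above" if z > 0 else "below" if z < 0 else "at"
    print(f"{name}: z = {z:+.1f} ({side} the mean)")
# student: z = +2.5 (above the mean)
# roommate: z = -3.5 (below the mean)
```

The sign of each result shows at a glance which side of the mean a collection falls on.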
In this lesson, we learned how to find the z-score of a data point. The z-score is a measure of distance from the mean in terms of standard deviations.
Remember that a standard deviation is just a measure of how scattered a collection of data is. To find the z-score, subtract the mean from the quantity in question, and then divide by the standard deviation of the whole set. Remember that negative z-scores are on the left of the curve, while positive z-scores are on the right.
Standard Deviation Overview
| Term | Description |
|---|---|
| Standard deviation | a measure of how scattered a collection of data is from the mean |
| Z-score | a measure of how far a value is from the mean, expressed as a number of standard deviations |
| Finding standard deviation | take the mean of all the values in the set; subtract the mean from each value, then square each difference; find the mean of those squared differences; then take the square root of that number |
| Calculating z-score | subtract the mean from the value in question; divide the answer by the standard deviation |
When you reach the conclusion of the lesson, display your ability to:
- Define standard deviation
- Solve for standard deviation
- Understand the advantage of using z-scores
- Find the z-score by showing examples