Description
Chamberlain University
MATH 225N: Statistical Reasoning for Health Sciences
Week 3: Quartiles and Box Plots
Question
The five number summary for a set of data is given below.
Min | Q1 | Median | Q3 | Max |
69 | 74 | 81 | 85 | 87 |
What is the interquartile range of the set of data?
Enter just the number as your answer. For example, if you found that the interquartile range was 24, you would enter 24.
Answer Explanation
Correct answers:
- 11
Calculate the interquartile range by finding the difference between Q3 and Q1.
IQR=Q3–Q1
The interquartile range is the difference between Q3 and Q1, so 85−74=11. Therefore, the interquartile range is 11.
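The subtraction above can be checked with a few lines of Python. This is only a minimal sketch; the function name `interquartile_range` and the tuple layout (Min, Q1, Median, Q3, Max) are illustrative choices, not part of the course material.

```python
def interquartile_range(summary):
    """Return Q3 - Q1 for a (min, q1, median, q3, max) tuple."""
    _minimum, q1, _median, q3, _maximum = summary
    return q3 - q1


# Five-number summary from the question above.
iqr = interquartile_range((69, 74, 81, 85, 87))
print(iqr)  # 11
```

Only Q1 and Q3 enter the calculation; the minimum, median, and maximum are ignored.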
Question
The five number summary for a set of data is given below.
Min | Q1 | Median | Q3 | Max |
64 | 73 | 84 | 89 | 96 |
Using the interquartile range, which of the following are outliers? Select all correct answers.
- 38
- 71
- 73
- 119
- 131
Answer Explanation
Correct answer:
38
119
131
Remember that outliers are numbers that are more than 1.5⋅IQR below the first quartile or more than 1.5⋅IQR above the third quartile, where IQR stands for the interquartile range.
The interquartile range is the third quartile minus the first quartile. So we find
IQR=89−73=16
So a value is an outlier if it is less than
Q1−1.5⋅IQR=73−(1.5)(16)=49
or greater than
Q3+1.5⋅IQR=89+(1.5)(16)=113
So we see that 38, 119, and 131 are outliers.
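The 1.5⋅IQR rule used above can be sketched in Python as follows. The helper names `outlier_fences` and `find_outliers` are hypothetical, and the candidate values are the answer choices from the question.

```python
def outlier_fences(q1, q3, k=1.5):
    """Return the (lower, upper) cutoffs Q1 - k*IQR and Q3 + k*IQR."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def find_outliers(values, q1, q3):
    """Keep the values that fall strictly outside the two cutoffs."""
    lower, upper = outlier_fences(q1, q3)
    return [v for v in values if v < lower or v > upper]


fences = outlier_fences(73, 89)
flagged = find_outliers([38, 71, 73, 119, 131], 73, 89)
print(fences)   # (49.0, 113.0)
print(flagged)  # [38, 119, 131]
```

A value equal to a cutoff is not flagged, which matches the "less than" and "greater than" wording of the rule.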
Question
A data set lists the number of lab notes written by students during a biology lab. For this data set, the minimum is 2, the first quartile is 13, the median is 14, the third quartile is 15, and the maximum is 18. Construct a box-and-whisker plot that shows the number of lab notes. Begin by first placing the middle dot on the median. Then work on placing the rest of the points starting with the ones closest to the median.
Answer Explanation
Remember that the box-and-whisker plot represents the five number summary of a set of data. So the left end of the left whisker is the minimum value (2), the left edge of the box is the first quartile (13), the line in the middle of the box is the median (14), the right edge of the box is the third quartile (15), and the right end of the right whisker is the maximum value (18).
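As a rough text-only illustration of the layout described above, the sketch below draws the plot on one line. The function `ascii_boxplot`, its symbols, and the 0 to 20 axis are assumptions made for this example, and it only works for whole-number values.

```python
def ascii_boxplot(minimum, q1, median, q3, maximum, lo=0, hi=20):
    """Draw a one-line box plot, one character per integer from lo to hi."""
    cells = []
    for x in range(lo, hi + 1):
        if x == median:
            cells.append("|")      # line inside the box
        elif x == q1:
            cells.append("[")      # left edge of the box
        elif x == q3:
            cells.append("]")      # right edge of the box
        elif q1 < x < q3:
            cells.append("=")      # interior of the box
        elif minimum <= x <= maximum:
            cells.append("-")      # whiskers
        else:
            cells.append(" ")      # outside the data range
    return "".join(cells)


plot = ascii_boxplot(2, 13, 14, 15, 18)
print(plot)
```

The long run of dashes on the left shows how far the minimum (2) sits from the narrow box between 13 and 15.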
Question
Based on the box-and-whisker plot from the solution above, what is the interquartile range of the data?
Answer Explanation
Correct answers:
- 2
The interquartile range is the difference between the third and first quartile, so it is
15−13=2
Question
Herpetologists frequently examine the reproductive characteristics of the eastern cottonmouth, a once widely distributed snake whose numbers have decreased recently due to the intrusion by humans. Suppose the frequency table summarizes the number of young per litter for a random sample of 15 female cottonmouths in Florida. What is the five-number summary?
Litter Size | Frequency |
7 | 2 |
8 | 3 |
9 | 2 |
10 | 3 |
11 | 3 |
13 | 1 |
14 | 1 |
Answer Explanation
Correct answers:
Min | Q1 | Median | Q3 | Max |
7 | 8 | 10 | 11 | 14 |
Step 1. Sort the data values in order from smallest to largest.
In a frequency table, the litter size column represents the data values and the frequency column represents the number of cottonmouths with that litter size. So, we can generate the data values in the data set then list them from smallest to largest.
7,7,8,8,8,9,9,10,10,10,11,11,11,13,14
Step 2. Find the minimum value, Min, and the maximum data value, Max.
The sorted list makes it easy to identify the minimum value (Min) as 7 and the maximum value (Max) as 14.
Step 3. Find the median of the data set, Q2. The median divides the bottom 50% of the data from the top 50% of the data.
Note: If the number of values in the data set is odd, the median will be the value that is exactly in the middle of the data set. If the number of values in the data set is even, the median will be the mean of the two middle values in the data set.
Since this data set has an odd number of values, the median is the data value that is exactly in the middle of the data set
7,7,8,8,8,9,9,10,10,10,11,11,11,13,14
Referring to the sorted list, we see that the median, Q2, is 10.
Step 4. Find the median of the lower half of the data set. This value is the first quartile, Q1. The first quartile divides the bottom 25% of the data from the top 75%. The lower half of the data set represents the data values that are less than or equal to Q2.
The number of values in the data set is odd, so the median itself is left out and the remaining values split into two equal halves. Therefore, the lower half consists of the numbers
7,7,8,8,8,9,9
We see that the median of the lower half is 8, so Q1 is 8.
Step 5. Find the median of the upper half of the data set. This value is the third quartile, Q3. The third quartile divides the bottom 75% of the data from the top 25%. The upper half of the data set represents the data values that are greater than or equal to Q2.
The number of values in the data set is odd, so the median itself is left out and the remaining values split into two equal halves. Therefore, the upper half consists of the numbers
10,10,11,11,11,13,14
We see that the median of the upper half is 11, so Q3 is 11.
Step 6: Report the five-number summary in the following order: Min, Q1, Median (Q2), Q3, Max
The minimum data value, min, in the data set is 7. The largest number in the data set, max, is 14. The first quartile, Q1, is 8. The median, Q2, is 10. The third quartile, Q3, is 11. Therefore, the five-number summary can be reported as
7,8,10,11,14.
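Steps 1 through 6 can be sketched in Python as shown below. This follows the median-of-halves convention used in this explanation (the median is left out of both halves when the count is odd); other quartile conventions exist and may give different values. The names `median`, `five_number_summary`, and `litter_frequencies` are illustrative.

```python
def median(sorted_values):
    """Median of an already sorted list."""
    n = len(sorted_values)
    mid = n // 2
    if n % 2 == 1:
        return sorted_values[mid]
    return (sorted_values[mid - 1] + sorted_values[mid]) / 2


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves method."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    # With an odd count, skip the middle value so both halves are equal in size.
    upper = data[half + 1:] if n % 2 == 1 else data[half:]
    return data[0], median(lower), median(data), median(upper), data[-1]


# Frequency table from the question: litter size -> number of snakes.
litter_frequencies = {7: 2, 8: 3, 9: 2, 10: 3, 11: 3, 13: 1, 14: 1}
litters = [size for size, count in litter_frequencies.items() for _ in range(count)]
summary = five_number_summary(litters)
print(summary)  # (7, 8, 10, 11, 14)
```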
Question
The U.S. Census Bureau frequently conducts nationwide surveys on the characteristics of U.S. households. Suppose the following data are on the number of people per household for a sample of 20 households. Find the five-number summary for this data set by hand. Report the five-number summary in the following order: Min, Q1, Median (Q2), Q3, Max
Number of People | Frequency |
1 | 4 |
2 | 7 |
3 | 4 |
4 | 2 |
5 | 1 |
6 | 1 |
7 | 1 |
Answer Explanation
Correct answers:
Min | Q1 | Median | Q3 | Max |
1 | 2 | 2 | 3.5 | 7 |
Step 1. Sort the data values in order from smallest to largest.
In the frequency table, the “number of people” column represents the data values and the frequency column represents how many times that response was given. So, we can generate the data values in the data set then list them from smallest to largest.
1,1,1,1,2,2,2,2,2,2,2,3,3,3,3,4,4,5,6,7
Step 2. Find the minimum value, Min, and the maximum data value, Max.
The sorted list makes it easy to identify the minimum value (Min) as 1 and the maximum value (Max) as 7.
Step 3. Find the median of the data set, Q2. The median divides the bottom 50% of the data from the top 50% of the data.
Note: If the number of values in the data set is odd, the median will be the value that is exactly in the middle of the data set. If the number of values in the data set is even, the median will be the mean of the two middle values in the data set.
Since this data set has an even number of values, the median is the average of the two data values in the middle. Referring to the sorted list, we see that the two data values in the middle are 2 and 2.
1,1,1,1,2,2,2,2,2,2,2,3,3,3,3,4,4,5,6,7
The average (or mean) of these two values is
(2+2)/2=4/2=2.
We see that the median, Q2, is 2.
Step 4. Find the median of the lower half of the data set. This value is the first quartile, Q1. The first quartile divides the bottom 25% of the data from the top 75%. The lower half of the data set represents the data values that are less than or equal to Q2.
The number of values in the data set is even, so the lower half of the data set consists of the numbers
1,1,1,1,2,2,2,2,2,2 .
We see that the number of data values in the lower half is even, so there are two numbers in the middle, 2 and 2. To find the median, take the average (or mean) of these two values.
(2+2)/2=4/2=2
Since the median of the lower half is 2, Q1 is 2.
Step 5. Find the median of the upper half of the data set. This value is the third quartile, Q3. The third quartile divides the bottom 75% of the data from the top 25%. The upper half of the data set represents the data values that are greater than or equal to Q2.
The upper half of the data set consists of the numbers
2,3,3,3,3,4,4,5,6,7 .
We see that the number of data values in the upper half is even so there are two numbers in the middle, 3 and 4. To find the median, take the average (or mean) of 3 and 4:
(3+4)/2=7/2=3.5
Since the median of the upper half is 3.5, Q3 is 3.5.
Step 6: Report the five-number summary in the following order: Min, Q1, Median (Q2), Q3, Max
The minimum data value, min, in the data set is 1. The largest number in the data set, max, is 7. The first quartile, Q1, is 2. The median, Q2, is 2. The third quartile, Q3, is 3.5. Therefore, the five-number summary can be reported as
1,2,2,3.5,7.
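For an even number of values, the data set splits cleanly into two halves of equal size. A short sketch using only the Python standard library (`collections.Counter` to expand the frequency table and `statistics.median` for each half) is given below; the variable names are illustrative.

```python
from collections import Counter
from statistics import median

# Frequency table from the question: household size -> number of households.
household_sizes = Counter({1: 4, 2: 7, 3: 4, 4: 2, 5: 1, 6: 1, 7: 1})

data = sorted(household_sizes.elements())  # expand the table into 20 values
half = len(data) // 2                      # 20 values -> two halves of 10
lower, upper = data[:half], data[half:]

summary = (data[0], median(lower), median(data), median(upper), data[-1])
print(summary)  # (1, 2.0, 2.0, 3.5, 7)
```

`statistics.median` averages the two middle values when a list has an even length, which is why Q1 and the median print as 2.0 rather than 2.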
Question
The following data set represents the amount spent, in billions of dollars, each year on national defense by the U.S. government over the course of a 16-year period. Find the five-number summary of this data set. Round all answers to one decimal place.
296.3,335.9,401.2,451.3,479.5,481.7,403.2,409.5,366.9,351.8,348.5,370.1,472.8,546.4,574.6,675.1
Answer Explanation
Correct answers:
Min | Q1 | Median (Q2) | Q3 | Max |
296.3 | 359.4 | 406.4 | 480.6 | 675.1 |
Step 1. Sort the data values in order from smallest to largest.
The following list displays the data values sorted in order from smallest to largest.
296.3,335.9,348.5,351.8,366.9,370.1,401.2,403.2,409.5,451.3,472.8,479.5,481.7,546.4,574.6,675.1
Step 2. Find the minimum value, Min, and the maximum data value, Max.
The sorted list makes it easy to identify the minimum value (Min) as 296.3 and the maximum value (Max) as 675.1.
Step 3. Find the median of the data set, Q2. The median divides the bottom 50% of the data from the top 50% of the data.
Note: If the number of values in the data set is odd, the median will be the value that is exactly in the middle of the data set. If the number of values in the data set is even, the median will be the mean of the two middle values in the data set.
Since this data set has an even number of values, the median is the average of the two data values in the middle. Referring to the sorted list, we see that the two data values in the middle are 403.2 and 409.5.
296.3,335.9,348.5,351.8,366.9,370.1,401.2,403.2,409.5,451.3,472.8,479.5,481.7,546.4,574.6,675.1
The average (or mean) of these two numbers is
(403.2+409.5)/2=812.7/2=406.35
So, the median, Q2, is 406.35, which is 406.4 when rounded to one decimal place.
Step 4. Find the median of the lower half of the data set. This value is the first quartile, Q1. The first quartile divides the bottom 25% of the data from the top 75%. The lower half of the data set represents the data values that are less than or equal to Q2.
The number of values in the data set is even, so the data set can be divided into two halves. The lower half of the data set consists of the numbers
296.3,335.9,348.5,351.8,366.9,370.1,401.2,403.2.
We see that the number of data values in the lower half is even, so there are two numbers in the middle, 351.8 and 366.9. To find the median, take the average (or mean) of these two numbers.
(351.8+366.9)/2=718.7/2=359.35
Since the median of the lower half is 359.35, Q1, rounded to one decimal place, is 359.4.
Step 5. Find the median of the upper half of the data set. This value is the third quartile, Q3. The third quartile divides the bottom 75% of the data from the top 25%. The upper half of the data set represents the data values that are greater than or equal to Q2.
The upper half of the data set consists of the numbers
409.5,451.3,472.8,479.5,481.7,546.4,574.6,675.1.
We see that the number of data values in the upper half is even so there are two numbers in the middle, 479.5 and 481.7. To find the median, take the average (or mean) of these two numbers.
(479.5+481.7)/2=961.2/2=480.6
Since the median of the upper half is 480.6, Q3 is 480.6.
Step 6: Report the five-number summary in the following order: Min, Q1, Median (Q2), Q3, Max
The minimum data value, min, in the data set is 296.3. The largest number in the data set, max, is 675.1. The first quartile, Q1, is 359.35. The median, Q2, is 406.35. The third quartile, Q3, is 480.6. Rounding each value to one decimal place, the five-number summary is
Min | Q1 | Median | Q3 | Max |
296.3 | 359.4 | 406.4 | 480.6 | 675.1 |
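When checking these values in code, the rounding step needs some care: binary floating-point numbers cannot represent every decimal value such as 406.35 exactly, and Python's built-in `round` rounds exact halves to the nearest even digit. The sketch below therefore uses the standard `decimal` module with half-up rounding, which is assumed here to be the rounding rule the question intends. The helper names are illustrative.

```python
from decimal import Decimal, ROUND_HALF_UP


def median(sorted_values):
    """Median of an already sorted list of Decimal values."""
    n = len(sorted_values)
    mid = n // 2
    if n % 2 == 1:
        return sorted_values[mid]
    return (sorted_values[mid - 1] + sorted_values[mid]) / 2


def round_half_up(value, places="0.1"):
    """Round a Decimal to the given step, with ties going up."""
    return value.quantize(Decimal(places), rounding=ROUND_HALF_UP)


raw = ("296.3,335.9,401.2,451.3,479.5,481.7,403.2,409.5,"
       "366.9,351.8,348.5,370.1,472.8,546.4,574.6,675.1").split(",")
data = sorted(Decimal(x) for x in raw)
half = len(data) // 2  # 16 values -> two halves of 8

exact = (data[0], median(data[:half]), median(data), median(data[half:]), data[-1])
rounded = tuple(round_half_up(v) for v in exact)
print([str(v) for v in exact])    # ['296.3', '359.35', '406.35', '480.6', '675.1']
print([str(v) for v in rounded])  # ['296.3', '359.4', '406.4', '480.6', '675.1']
```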
Question
The Wimbledon Championship is one of the oldest, and some would say most prestigious, tennis events in the world. It is one of the four major Grand Slam tournaments in tennis (the US Open, Australian Open, and French Open are the other three). The following data set lists a random sample of the ages of 16 winners of the men’s Wimbledon tennis championship. Find the five-number summary. Do not round your answers.
30,26,25,23,22,21,22,26,27,31,21,22,24,25,24,28
Answer Explanation
Correct answers:
Min | Q1 | Median (Q2) | Q3 | Max |
21 | 22 | 24.5 | 26.5 | 31 |
Step 1. Sort the data values in order from smallest to largest.
The following list displays the data values sorted in order from smallest to largest.
21,21,22,22,22,23,24,24,25,25,26,26,27,28,30,31
Step 2. Find the minimum value, Min, and the maximum data value, Max.
The sorted list makes it easy to identify the minimum value (Min) as 21 and the maximum value (Max) as 31.
Step 3. Find the median of the data set, Q2. The median divides the bottom 50% of the data from the top 50% of the data.
Note: If the number of values in the data set is odd, the median will be the value that is exactly in the middle of the data set. If the number of values in the data set is even, the median will be the mean of the two middle values in the data set.
Since this data set has an even number of values, the median is the average of the two data values in the middle. Referring to the sorted list, we see that the two data values in the middle are 24 and 25.
21,21,22,22,22,23,24,24,25,25,26,26,27,28,30,31
The average (or mean) of these two values is
(24+25)/2=49/2=24.5
So, the median, Q2, is 24.5.
Step 4. Find the median of the lower half of the data set. This value is the first quartile, Q1. The first quartile divides the bottom 25% of the data from the top 75%. The lower half of the data set represents the data values that are less than or equal to Q2.
The number of values in the data set is even, so the data set can be divided into two halves. Therefore, the lower half of the data set consists of the numbers
21,21,22,22,22,23,24,24 .
We see that the number of data values in the lower half is even, so there are two numbers in the middle, 22 and 22. To find the median, take the average (or mean) of these two numbers.
(22+22)/2=44/2=22
Since the median of the lower half is 22, Q1 is 22.
Step 5. Find the median of the upper half of the data set. This value is the third quartile, Q3. The third quartile divides the bottom 75% of the data from the top 25%. The upper half of the data set represents the data values that are greater than or equal to Q2.
The upper half of the data set consists of the numbers
25,25,26,26,27,28,30,31.
We see that the number of data values in the upper half is even, so there are two numbers in the middle, 26 and 27. To find the median, take the average (or mean) of these two numbers.
(26+27)/2=53/2=26.5
Since the median of the upper half is 26.5, Q3 is 26.5.
Step 6: Report the five-number summary in the following order: Min, Q1, Median (Q2), Q3, Max
The minimum data value, min, in the data set is 21. The largest number in the data set, max, is 31. The first quartile, Q1, is 22. The median, Q2, is 24.5. The third quartile, Q3, is 26.5. Therefore, the five-number summary is
Min | Q1 | Median | Q3 | Max |
21 | 22 | 24.5 | 26.5 | 31 |
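Software does not always use the median-of-halves method shown above, so a computed Q3 may differ from 26.5. As a hedged illustration, the sketch below compares the by-hand method with the two interpolation methods offered by `statistics.quantiles` in the Python standard library (available in Python 3.8 and later). The variable names are illustrative.

```python
from statistics import median, quantiles

ages = [30, 26, 25, 23, 22, 21, 22, 26, 27, 31, 21, 22, 24, 25, 24, 28]
data = sorted(ages)
half = len(data) // 2  # 16 values -> two halves of 8

# By-hand method from the steps above: medians of the lower and upper halves.
halves_method = (median(data[:half]), median(data), median(data[half:]))

# Library methods: both interpolate between neighboring data values.
exclusive = quantiles(data, n=4, method="exclusive")
inclusive = quantiles(data, n=4, method="inclusive")

print(halves_method)  # (22.0, 24.5, 26.5)
print(exclusive)      # [22.0, 24.5, 26.75]
print(inclusive)      # [22.0, 24.5, 26.25]
```

All three agree on Q1 and the median for this sample but not on Q3, so an answer for this course should follow the by-hand steps rather than a default software setting.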
Question
Suppose a credit card company has a fraud-detection service that determines if a card has any unusual activity. The company maintains a database of daily charges on a customer’s credit card. The dataset below includes a day’s worth of charges (rounded to the nearest dollar). The customer is contacted to make sure that the credit card has not been compromised if a day’s worth of charges appears unusual. In this case, an unusual charge would be considered an outlier. Determine the amount the daily charges must exceed before the customer is contacted.
153 | 174 | 121 | 178 | 137 | 90 |
89 | 99 | 95 | 122 | 101 | 109 |
31 | 47 | 71 | 209 | 154 | 126 |
- A daily charge that exceeds $248 would be considered unusual.
- A daily charge that exceeds $117 would be considered unusual.
- A daily charge that exceeds −$5 would be considered unusual.
- A daily charge that exceeds $209 would be considered unusual.
Answer Explanation
Correct answer:
A daily charge that exceeds $248 would be considered unusual.
Because we are looking for unusually large charges, only the upper limit for outliers is needed. That limit depends on the interquartile range, so both Q1 and Q3 must still be found.
Step 1: Find the quartiles of the data set. First, order the data set from smallest to greatest.
31,47,71,89,90,95,99,101,109,121,122,126,137,153,154,174,178,209
Since the data set has an even number of values (18), the median is the mean of the two values in the middle, 109 and 121:
(109+121)/2=230/2=115
So the median, Q2, is 115.
The lower half of the data set consists of the values 31,47,71,89,90,95,99,101,109. Since there is an odd number of values, the median of the lower half, Q1, is the value exactly in the middle. So, Q1 is 90.
The upper half of the data set consists of the values 121,122,126,137,153,154,174,178,209. Since there is an odd number of values, the median of the upper half, Q3, is the value exactly in the middle. So, Q3 is 153.
Step 2: Calculate the interquartile range (IQR), where IQR=Q3−Q1. This is the difference between 153 (Q3) and 90 (Q1), so the IQR is 63.
We can skip Step 3 (the lower limit), since only the upper limit is needed.
Step 4: Calculate the upper limit for outliers as UpperLimit=Q3+1.5⋅IQR=153+1.5(63)=153+94.5=247.5. Rounded to the nearest dollar, this is $248.
Since the Upper Limit is $248, we can conclude that any daily charge that exceeds $248 is a potential outlier and can be considered unusual.
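The upper limit can be reproduced with a short Python sketch. It assumes the same median-of-halves quartiles used in this explanation, and the variable names are illustrative.

```python
from statistics import median

charges = [153, 174, 121, 178, 137, 90,
           89, 99, 95, 122, 101, 109,
           31, 47, 71, 209, 154, 126]

data = sorted(charges)
half = len(data) // 2        # 18 values -> two halves of 9
q1 = median(data[:half])     # middle value of the lower half
q3 = median(data[half:])     # middle value of the upper half
upper_limit = q3 + 1.5 * (q3 - q1)

# Charges in this day's data that would trigger a call to the customer.
flagged = [c for c in data if c > upper_limit]

print(q1, q3, upper_limit)  # 90 153 247.5
print(flagged)              # []
```

None of the listed charges exceeds the limit; the largest, 209, is well below 247.5.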
Question
The five-number summary for a set of data is given below.
Min | Q1 | Median | Q3 | Max |
52 | 53 | 60 | 67 | 88 |
What is the interquartile range of the set of data?
- 14
- 10
- 8
- 4
- 12
Answer Explanation
Correct answer:
14
Calculate the interquartile range by finding the difference between Q3 and Q1.
IQR=Q3–Q1
The interquartile range is the difference between Q3 and Q1, so 67−53=14. Therefore, the interquartile range is 14.
Question
The five number summary for a set of data is given below.
Min | Q1 | Median | Q3 | Max |
4 | 55 | 68 | 69 | 137 |
Using the interquartile range, which of the following are outliers? Select all correct answers.
- 4
- 39
- 73
- 100
- 137
Answer Explanation
Correct answer:
4
100
137
Remember that outliers are numbers that are more than 1.5⋅IQR below the first quartile or more than 1.5⋅IQR above the third quartile, where IQR stands for the interquartile range.
The interquartile range is the third quartile minus the first quartile. So we find
IQR=69−55=14
So a value is an outlier if it is less than
Q1−1.5⋅IQR=55−(1.5)(14)=34
or greater than
Q3+1.5⋅IQR=69+(1.5)(14)=90
So we see that 4, 100, and 137 are outliers.
Question
A data set lists the number of times a machine breaks each month in a clothing factory over the past year. For this data set, the minimum is 4, the median is 14, the third quartile is 17, the interquartile range is 6, and the maximum is 18. Construct a box-and-whisker plot that shows the number of times the machine breaks. Begin by first placing the middle dot on the median. Then work on placing the rest of the points starting with the ones closest to the median.
Answer Explanation
Remember that the interquartile range is the third quartile minus the first quartile. Since we know the third quartile is 17, and the interquartile range is 6, we find that the first quartile must be 17−6=11.
To construct the box-and-whisker plot, remember that the minimum value of the data (4) is at the end of the left whisker, the first quartile (11) is the left edge of the box, the median value (14) is the vertical line in the box, the third quartile (17) is the right edge of the box, and the maximum value (18) is the end of the right whisker.
Question
A data set lists the number of extra credit points awarded on midterm scores of 15 students taking a statistics course. For this data set, the minimum is 3, the median is 15, the third quartile is 16, the interquartile range is 4, and the maximum is 19. Construct a box-and-whisker plot that shows the extra credit points awarded. Begin by first placing the middle dot on the median. Then work on placing the rest of the points starting with the ones closest to the median.
Answer Explanation
Remember that the interquartile range is the third quartile minus the first quartile. Since we know the third quartile is 16, and the interquartile range is 4, we find that the first quartile must be 16−4=12.
To construct the box-and-whisker plot, remember that the minimum value of the data (3) is at the end of the left whisker, the first quartile (12) is the left edge of the box, the median value (15) is the vertical line in the box, the third quartile (16) is the right edge of the box, and the maximum value (19) is the end of the right whisker.
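Both box-plot questions above first recover the missing first quartile from Q1=Q3−IQR. A minimal sketch of that step follows; the function name `summary_from_iqr` and the ordering check are illustrative additions, not part of the course material.

```python
def summary_from_iqr(minimum, median, q3, iqr, maximum):
    """Rebuild (min, Q1, median, Q3, max) when Q1 is given only through the IQR."""
    q1 = q3 - iqr
    summary = (minimum, q1, median, q3, maximum)
    # A valid five-number summary never decreases from left to right.
    if list(summary) != sorted(summary):
        raise ValueError(f"values are not in order: {summary}")
    return summary


machine_breaks = summary_from_iqr(minimum=4, median=14, q3=17, iqr=6, maximum=18)
extra_credit = summary_from_iqr(minimum=3, median=15, q3=16, iqr=4, maximum=19)
print(machine_breaks)  # (4, 11, 14, 17, 18)
print(extra_credit)    # (3, 12, 15, 16, 19)
```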