Just like histograms and dot plots, box plots are used with numerical data. So just like with the CO2 emissions data or this quiz scores data, we are going to use box plots to help us graph numerical data. When I was in college, I actually learned box plots under the name box-and-whisker plots. Why? Well, if you look at the graph below, I want you to see that there is one main box and two lines coming out of it, like little whiskers. Kind of looks like a cat.
I want you to see that the box-and-whisker plot has the entire five-number summary in it. When you look at the middle line inside of the box, notice how that represents the median. So if we had a number line drawn underneath this, whatever number that median line corresponds to on the number line is the median. Notice how the left vertical line of the box represents the value of Q1, the lower quartile. Notice how the right vertical line of the box represents Q3, the upper quartile. And remember that the interquartile range represents the middle 50% of the data, meaning the distance between Q1 and Q3. So notice what we've just done: we literally just outlined the box. The three vertical lines constructing this box represent Q1, the median, and Q3. And the horizontal lines closing off this box run from Q1 to Q3, so the width of the box represents the interquartile range.
This is such an elegant picture because it's literally just a rectangle. It's just a rectangle with a line in the middle of it. And yet, it's giving us four huge pieces of information: the median, Q1, Q3, and a visualization of the interquartile range.
So then, what do the whiskers represent? They represent the lower and upper 25% of the data: the data between the minimum and Q1 on one side, and the data between Q3 and the maximum on the other. But you are probably asking me, "Wait, Shannon, Shannon, we had outliers that we just found on the previous page. How am I going to denote those?" Well, in situations where you have outliers, we are going to use dots.
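If you'd like to see the pieces of the box as actual numbers, here is a minimal Python sketch. The scores are made up for illustration (they are not the quiz data from this lesson), the function name `five_number_summary` is hypothetical, and the quartiles are computed as the medians of the lower and upper halves of the sorted data, which is one common convention and may differ slightly from what your calculator or software reports.

```python
from statistics import median

def five_number_summary(values):
    """Return min, Q1, median, Q3, max for a list with at least two numbers.

    Assumption: Q1 and Q3 are the medians of the lower and upper halves
    of the sorted data (the middle value is left out when the count is odd).
    """
    data = sorted(values)
    half = len(data) // 2
    lower_half = data[:half]
    upper_half = data[len(data) - half:]
    return {
        "min": data[0],
        "q1": median(lower_half),
        "median": median(data),
        "q3": median(upper_half),
        "max": data[-1],
    }

# Hypothetical scores, invented only for this example.
scores = [4, 6, 7, 7, 8, 9, 9, 10, 12, 15]

summary = five_number_summary(scores)

# The width of the box is the interquartile range: Q3 minus Q1.
iqr = summary["q3"] - summary["q1"]

print(summary)
print("IQR:", iqr)
```

The three vertical lines of the box would sit at `q1`, `median`, and `q3`, and the box would be `iqr` units wide.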
We are going to use dots to represent an outlier. But the rub then is that the end of this whisker is no longer the maximum value. As we saw in our previous example, one of the outliers, 6.81, is the maximum. So the question then is, "Well, what then is the end of this whisker?" Well, that value is the next largest value that's not an outlier. So can you guys look at this dataset one more time and tell me what is the next largest value that is not one of these blue outliers? You tell me that. What is the next largest value? Yeah, 3.86 is the next largest value that's not an outlier. And so that is what the end of this whisker is going to represent. In cases where you have an outlier or outliers, the ends of the whiskers are the min and max among the values that are not outliers.
All right, I love box-and-whisker plots. I love box plots because they're such an elegant way of letting us know the range where the lowest 25% of the data lives. Remember, between the min and Q1 is where 25% of the data lives. Literally, if you're making a dot plot, it's like 25% of the dots are between the min and the Q1 value. 25% of the data is between Q1 and the median. 25% of the data is between the median and Q3. And 25% of the data is between Q3 and the maximum value. It's such an elegant picture of breaking down each quarter chunk of our data.
And one of the best things about box plots is that they are not independent of the histograms that we learned in Chapter Two. Let's go back and remember: in Chapter Two, we learned how to graph histograms. And now, here in Chapter Three, we are learning about box plots. And yet, what I wanted you guys to see in this graph here is a direct one-to-one correspondence between these histograms and these box plots. I want you guys to remember skewed-right graphs. Remember, skewed-right graphs had a tail on the right.
That tail on the right translates to a long whisker on the right. What I want you to see is that everything has to do with right. When you have a right tail, when you have a long right whisker, they are all related to right-skewed graphs. In the same way, when you're looking at skewed-left graphs and you're looking at tails on the left, you're looking at long whiskers on the left. Again, everything with skewed-left graphs is related to what's on the left. You're looking at left tails, you're looking at long left whiskers. Lastly, symmetric. Symmetric, as you guys can probably guess, means the two whiskers are about the same length.
So what I'm trying to emphasize is that box plots are still giving us a lot of the same information we saw in Chapter Two when we made histograms. The big difference is that a box plot gives us even more information, because it's also giving us that five-number summary of min, Q1, median, Q3, and max.
And so what I wanted to do was graph both the box plot and the histogram of that CO2 emissions per capita data. And I want you guys to tell me what shape of graph we have here. Is it skewed right, symmetric, or skewed left? Think of the right whisker as also including those outliers, just like how we can see this tail on the right in the histogram. So what do you guys think now? It's skewed right. The one on the right, the histogram, shows a shape we already recognize. And we're seeing how that translates directly from the histogram into the box plot.
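To tie the whisker ends and the shape idea together, here is a rough Python sketch. Everything in it is an assumption made for illustration: the data are invented (not the CO2 values), the function names are hypothetical, outliers are flagged with the common 1.5 × IQR fence rule (the rule from the previous page is not restated here, so it may differ), and the "shape" check simply compares how far each side reaches from the box, outliers included. It is a rule of thumb, not a formal test of skewness.

```python
from statistics import median

def quartiles(values):
    """Q1 and Q3 as medians of the lower and upper halves (one common convention)."""
    data = sorted(values)
    half = len(data) // 2
    return median(data[:half]), median(data[len(data) - half:])

def box_plot_parts(values, fence_factor=1.5):
    """Find the whisker ends and the outlier dots.

    Assumption: a value is an outlier if it lies more than
    fence_factor * IQR below Q1 or above Q3.
    """
    q1, q3 = quartiles(values)
    iqr = q3 - q1
    low_fence = q1 - fence_factor * iqr
    high_fence = q3 + fence_factor * iqr
    inside = [v for v in values if low_fence <= v <= high_fence]
    outliers = sorted(v for v in values if v < low_fence or v > high_fence)
    return {
        "q1": q1,
        "q3": q3,
        # Whiskers stop at the smallest and largest values that are not outliers.
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": outliers,
    }

def shape(values, tolerance=0.25):
    """Rough shape guess from how far each side reaches beyond the box.

    The reach on each side counts outliers too. The tolerance is an
    arbitrary illustrative cutoff for calling the two sides "about equal".
    """
    q1, q3 = quartiles(values)
    left_reach = q1 - min(values)
    right_reach = max(values) - q3
    longer = max(left_reach, right_reach)
    if longer == 0 or abs(right_reach - left_reach) <= tolerance * longer:
        return "roughly symmetric"
    return "skewed right" if right_reach > left_reach else "skewed left"

# Hypothetical data with one large value, invented for this example.
sample = [2, 3, 3, 4, 4, 5, 5, 6, 7, 14]

parts = box_plot_parts(sample)
sample_shape = shape(sample)

print(parts)
print(sample_shape)
```

In this made-up sample, 14 is drawn as a dot, the right whisker ends at 7 (the largest value that is not an outlier), and the right side reaches much farther than the left, which matches the long-right-tail picture of a skewed-right graph.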