Next we have something called the five-number summary. The five-number summary is the minimum data point, so the lowest value in the data; then Q1, which separates off the bottom 25 percent, or the bottom quarter, of the data; then the median, which, remember, is the same as Q2; then Q3; and then the maximum. So one, two, three, four, five: that's the five-number summary.

From the five-number summary we can draw something called a box-and-whisker plot. The box is this middle piece here, and it goes from Q1 to Q3. So it starts here at Q1 and it ends here at Q3, and there's a line here in the middle of the box, and that's the median, or Q2. Now one whisker goes down here to the minimum value, and another whisker comes out here to the maximum value, and that's why it's called a box-and-whisker plot. The lines are the whiskers, and the box you can see there. And we always have a scale underneath it showing the values.

In Example 4, the problem is to draw a box-and-whisker plot that represents the data set from Example 1. So I pulled up the five-number summary from Example 1, and if you want to know how to get the calculator to do it, you can look at pages 124 and 125 of your textbook.
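If you would rather check a five-number summary in code than on the calculator, here is a minimal Python sketch. It assumes the median-of-halves way of finding quartiles (Q1 is the median of the lower half, Q3 is the median of the upper half, and the median itself is left out of both halves when the number of entries is odd); other software may use a slightly different quartile rule. The function name and the sample values are made up for illustration. They are not the Example 1 data set.

```python
from statistics import median


def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule."""
    values = sorted(data)
    n = len(values)
    half = n // 2
    lower = values[:half]          # entries below the median position
    upper = values[half + n % 2:]  # entries above the median position
    return (values[0], median(lower), median(values), median(upper), values[-1])


# Hypothetical data, invented for this sketch (not the textbook's data set).
sample = [25, 11, 30, 23, 35, 18, 27, 24, 29, 33, 25]
print(five_number_summary(sample))
```

The box of the plot would run from the second number to the fourth number in that result, with the whiskers reaching out to the first and the last.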
So what we're going to do is draw our box-and-whisker plot. We've found our five-number summary, so the first thing we need is this horizontal scale here. We know the data go from a minimum of 11, so I think I'm going to start at 10, and up to a maximum of 35, so I think I'm going to end at 40. So I know my minimum value is about here, at 11, and my maximum value is about here, at 35, and I want to try and get that lined up. Then the median value, well, here's 25, so that's exactly in the middle of the scale, and it's the middle line of my box. Then I have 23, so that'll be about here, and then 30 is up about here. So here's my box: this line indicates Q1, this line indicates Q3, and this line in the middle is Q2, or the median. Then I draw a line through, so the dot at this end is the minimum and the dot at this end is the maximum.

Now let's talk about percentiles and other fractiles. In addition to quartiles, which divide the data up into four equal parts, you could have deciles, which divide the data up into ten equal parts, or percentiles, which divide the data into 100 equal parts. So quartiles, obviously, four parts; deciles, 10 parts; percentiles, 100 parts. We often use percentiles in education and health, and they are used to identify unusually high or low values. Oftentimes children's growth measurements are expressed in percentiles. So if you're in the 95th percentile, then 95 percent of the population is at or below where you are, and that's unusually high, and the fifth percentile is pretty low.

Notice that the 25th percentile is the first quartile, isn't it? That's the same as Q1, because the quartiles divide the data up into four equal parts. Q2, that's the median, divides the data into the top half and the bottom half, and that's the 50th percentile. And of course Q3, well, that's the 75th percentile, the three-quarter mark.

Be sure you understand what a percentile means. Saying the weight of a six-month-old infant is at the 78th percentile means the infant weighs the same as or more than 78 percent of all six-month-old infants.

In Example 5 we're asked to interpret percentiles. The ogive at the right (remember, an ogive shows cumulative frequencies) represents the cumulative frequency distribution for SAT scores of college-bound students in a recent year. What score represents the 80th percentile? So you're going to read the graph. It says percentile here on the vertical axis, the y-axis, so you want to come over from 80 and see where that meets the graph, right there, and then you're going to have to go down and read what the x-axis is, and that looks like about 1250. So that means that approximately 80 percent of the students had an SAT score of 1250 or less.

In Example 6 we have to find a percentile, and this is the data from Example 2. We want to find what percentile corresponds to $34,000. Remember, the tuition costs were given in thousands of dollars, so $34,000 is the data entry 34. It's right here. So what we have to do is count the number of data entries less than 34: one, two, three, four, five, six, seven, eight. So there are eight data entries less than 34. Now I've got to remember that the total number of all these data points was 25, because in Example 2 there were 25 different colleges that we looked at. So for the percentile of 34, I've got to use a formula now: the number of data entries less than 34, so that's 8, over the total number of data entries, 25, multiplied by 100, because that's the percentile part. I put that in my calculator and I get 32. That means the tuition cost of $34,000 is at the 32nd percentile: it is greater than 32 percent of the tuition costs, counting all of these values up to 34.
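The percentile calculation from Example 6 can be sketched in a few lines of Python. This is only an illustration of the formula used above (number of entries less than the value, over the total number of entries, times 100). The function name is made up, and the tuition list is invented so that it has 25 entries with exactly 8 of them below 34; it is not the actual Example 2 data.

```python
def percentile_of(value, data):
    """Percent of data entries that are strictly less than value."""
    below = sum(1 for entry in data if entry < value)
    return 100 * below / len(data)


# Hypothetical tuition costs in thousands of dollars (not the Example 2 data):
# 8 entries below 34 (26 through 33) and 17 entries from 34 through 50.
tuition = list(range(26, 34)) + list(range(34, 51))

print(len(tuition))                # 25 entries in total
print(percentile_of(34, tuition))  # 100 * 8 / 25
```

With 8 of the 25 entries below 34, this gives 32, the same number the calculator gave.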
When you know the mean and the standard deviation of a data set, you can measure the position of any individual data entry in the data set with something called a standard score, or a z-score. Now this is very important for the rest of the course, so make sure you understand this. The standard score, or z-score, is the number of standard deviations a single data value x lies away from the mean. The value could be less than the mean, in which case the z-score will be negative, or it could be more than the mean, in which case it'll be positive. We use this formula: z equals the data value minus the mean, all divided by the standard deviation, so z = (x - μ) / σ. So remember, a z-score could be positive or negative, and if it's zero, the data value is exactly the same as the average, or the mean.

Usual scores are between -2 and +2, that is, within two standard deviations of the mean. So that means most of the usual scores are in here: on this side is the mean minus two standard deviations, then there's the mean in the middle, and on the other side there's the mean plus two standard deviations, two sigma. Anything that is more than two standard deviations away is unusual, and if you're more than three standard deviations away from the mean, that is very unusual.

In Example 7 we're told that the mean speed of vehicles along a stretch of highway is 56 miles an hour. So let me just write down everything I know. The mean speed is 56 miles per hour. The standard deviation is 4 miles per hour. You measure the speeds of three cars traveling along this stretch, so I'm going to say x1, car 1, is traveling at 62 miles per hour; car 2 is traveling at 47 miles per hour; and car 3, data point 3, is 56 miles per hour. Find the z-score that corresponds to each speed, and assume the distribution of the speeds is approximately bell-shaped. We need this assumption in order to judge which z-scores are usual and which are unusual. And remember, the formula for z-scores is the data value x minus the mean, divided by the standard deviation.

So I'm going to get the z-score for the first data point. I'll just call that z sub 1, and that's going to be x sub 1 minus the mean, all over the standard deviation. So that's 62 minus 56, all divided by 4, and I can do that on my calculator. Now the calculator wants to have parentheses around the entire numerator in order to be correct, so that's what I'm going to do: (62 - 56) divided by 4, and that's 1.5.

Now let me do z sub 2, the second one. That's x sub 2 minus the mean, divided by the standard deviation, and the second one is 47 minus 56, all divided by 4. I'm going to do the same thing with my calculator: (47 - 56), with parentheses around the entire numerator, divided by 4, and I get -2.25.

And z sub 3 I almost don't even have to do. That's x sub 3 minus the mean, divided by the standard deviation sigma, and that is 56 minus 56, which is just zero, isn't it, divided by 4. Now zero divided by four, I can put it in my calculator if I want to be sure: 0 divided by 4 is 0.

So this tells me that the speed of 62 miles an hour is one and a half standard deviations above the mean, which is kind of normal. A speed of 47 miles per hour is 2.25 standard deviations below the mean, which is unusual. And a speed of 56 miles an hour is exactly the mean; it doesn't differ from the average at all.
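Here is a short Python sketch of the Example 7 arithmetic, using the mean of 56, the standard deviation of 4, and the three speeds from the example. The function names are made up for this illustration, and the labels follow the rule of thumb given earlier (more than two standard deviations from the mean is unusual, more than three is very unusual), which assumes an approximately bell-shaped distribution.

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / std_dev  # parentheses keep the whole numerator together


def describe(z):
    """Label a z-score using the two- and three-standard-deviation cutoffs."""
    if abs(z) > 3:
        return "very unusual"
    if abs(z) > 2:
        return "unusual"
    return "usual"


mean_speed = 56  # miles per hour
std_dev = 4      # miles per hour

for speed in (62, 47, 56):
    z = z_score(speed, mean_speed, std_dev)
    print(speed, z, describe(z))
```

Notice that the parentheses in the function do the same job as the parentheses around the numerator on the calculator.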