Understanding Variance and Standard Deviation

If you divide those, you get 215.538. That's sometimes referred to as the variance. The variance is actually the square of the standard deviation, and it is also famous, but now we need to take the square root of that. So the 215 is not the standard deviation; it's sometimes referred to as the variance. Taking the square root of it, we get this number, 14.68, and that's about our standard deviation. Actually, it's pretty close to what the computer gave: I actually put the data into a computer and calculated it, and it came out to 14.681, so I think we did pretty well, especially with all this rounding error.

I go ahead and round it to one more decimal place than the original data. Again, remember the original data was all to the ones place; it had no numbers to the right of the decimal. So I'm going one more place value to the right than the original data. In other words, the original data stopped here, right, at the 4, so I go one more place value to the right, and that's where I round. So I'm going to round that to 14.7, and this would be my standard deviation. This tells me that typical bricks are, on average, about 14.7 kilograms from the mean. So think of the standard deviation as how far typical values are from the mean.

Okay, so we know the shape was normal, we know the mean average of this data set was 36.1 kilograms, and we know the standard deviation, the amount of spread, is 14.7 kilograms. So now we can get to our calculation for what typical values would be in this data set and what the outliers are. Okay, so let's take a look. We said the standard deviation is how far typical values are from the mean. So if you think of the mean as the center, and I go that distance above and below the mean, that's where my typical
values will fall. Usually, for normal data, what we want is within one standard deviation of the mean; those are going to be the typical values in the data set. By the way, this is usually the top, or I should say the middle, 68% of the data. We will get to that later when we get into the empirical rule and things like that, but the middle 68% is usually referred to as typical if you have normal data.

So typical values are between the mean minus the standard deviation and the mean plus the standard deviation. In other words, to get typical, all you do is take the mean and add and subtract the standard deviation. So again, we'll just do this little calculation here: 36.1 was the mean, minus the standard deviation, 14.7, and that gives me 21.4 kilograms. If I go ahead and add them, the mean plus the standard deviation, 36.1 plus 14.7 gives me 50.8 kilograms. So typical weights of the bricks are between 21.4 kilograms and 50.8 kilograms: one standard deviation from the mean, or the middle 68%, usually, in normal data.

Now, which numbers are those? Well, look right here: which numbers are between 21 and 50? It's these ones right here. Visually I can see that these numbers right here are the typical ones, so typical bricks were between these two.

Now what about outliers? Outliers are often referred to as unusual values. So anything that's unusual: if you're analyzing blood pressure, was there anybody whose blood pressure was so high that they should see a doctor? Was there anybody whose blood pressure was dangerously low, so maybe they need to be in a hospital? Those are key things that you always want to get out of your data. We call those outliers or unusual values. So what we said
was one standard deviation is considered typical. If you're two standard deviations or more from the mean, usually that's considered unusual, or an outlier. So high outliers are greater than or equal to the mean plus 2s, or two standard deviations above the mean, and by the way, this is usually the top 2.5 percent of normal data. Low outliers are the bottom 2.5 percent, and again, that's the mean minus 2s, minus two times the standard deviation. So sometimes we refer to these formulas, x-bar plus 2s and x-bar minus 2s, as the outlier cutoffs: where do the outliers start in the graph?

So if we calculate this: 36.1 was the mean. I'm going to do 2 times 14.7 first (always multiply before you add), and then add the 36.1, and I get 65.5 as my cutoff. So if I'm visually looking at the graph, and the dot plot especially is a good thing to look at, 65.5 is right here. Here's my high outlier cutoff. Then I do the same thing for the low outliers: the mean minus two times the standard deviation, so you're multiplying the standard deviation by two before you subtract. So again, 2 times 14.7, subtract, and I get 6.7. That's about right here; this would be the low outlier cutoff.

So basically, anything lower than the low cutoff is an outlier and anything higher than the high cutoff is an outlier. I can see visually that this dot right here, 70, must be an outlier, and this dot right here, at 3 kilograms, must be an outlier. Okay, now, not all data sets have outliers. This one happened to have a high outlier and a low outlier, but not all data sets do. So the 70 would be considered a high outlier because it's higher than the high cutoff. You probably won't have any numbers in the data set that are exactly the high cutoff, so just keep in mind that you have to think
about what numbers in your data set are that value or above, and then for the low cutoff, you're looking for any numbers in the data set that are that value or below. So 3 is going to be a low outlier and 70 would be a high outlier.

Okay, now one thing I did want to mention is that some people think that data is just made up of typical values and outliers. That's actually not the case. This is a very tiny data set, so it's not a good example of real data, where you have thousands of values, but usually you'll have quite a few values in the data set. Again, the middle 68% of the numbers are typical, so that's the very middle, and then the far ends will be the outliers, but you'll also have a section that's in between those. I don't know why, but students always want that one labeled. They're always asking me, "Well, what do I label that? It's not typical and it's not an outlier," and I'm like, exactly. So not all values in a data set are typical or an outlier. There's always that middle zone. I refer to it sometimes as the "could happen" zone: it could happen, it's not an outlier, and it's not typical. So don't think that every number in your data set has to be typical or an outlier.

Okay, so that gives you an idea. Again, if we summarize our brick data here: the shape was normal, or bell-shaped; the center, or average, was the mean of 36.1 kilograms; the typical spread was 14.7 kilograms, so that was the standard deviation; typical brick weights were between 21.4 kilograms and 50.8 kilograms; and we had two outliers, one high outlier at 70 kilograms and one low outlier at 3 kilograms. If I were to summarize what I just said in a little paragraph, that would sometimes be referred to as a little data analysis report. So a statistician might write a little report just reporting out some of the key features of the quantitative
data. Okay, all right, so I'm hoping this was helpful for you. Just remember, normal data goes with the mean and the standard deviation; that's one of the key takeaways. And again, typical is within one standard deviation of the mean, and outliers are two or more standard deviations from the mean. All right, so this has been normal quantitative data analysis, and I'm Mattie show, and I will see you next time.
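
The arithmetic walked through in this lesson can be sketched in a few lines of Python. This is only a minimal illustration, not the method used in the video: the function names `summarize` and `classify` and the zone labels are made up for this example, and because the full brick data set is not listed in the transcript, the sketch starts from the reported summary values (mean 36.1, variance 215.538). The weight 60 is a hypothetical value added to show the in-between zone; 3 and 70 are the two outliers mentioned in the lesson.

```python
import math


def summarize(mean, variance, decimals=1):
    """Standard deviation, typical range, and outlier cutoffs for normal data.

    Follows the rounding used in the lesson: round the standard deviation
    to one more decimal place than the original data, then add/subtract.
    """
    sd = round(math.sqrt(variance), decimals)
    return {
        "sd": sd,
        "typical_low": round(mean - sd, decimals),       # mean - s
        "typical_high": round(mean + sd, decimals),      # mean + s
        "low_cutoff": round(mean - 2 * sd, decimals),    # mean - 2s
        "high_cutoff": round(mean + 2 * sd, decimals),   # mean + 2s
    }


def classify(x, stats):
    """Label a value as an outlier, typical, or in the in-between zone."""
    if x <= stats["low_cutoff"]:
        return "low outlier"
    if x >= stats["high_cutoff"]:
        return "high outlier"
    if stats["typical_low"] <= x <= stats["typical_high"]:
        return "typical"
    return "could happen"  # neither typical nor an outlier


# Summary values reported in the lesson for the brick weights (kilograms).
stats = summarize(mean=36.1, variance=215.538)
print(stats)

# 3 and 70 come from the lesson; 60 is a hypothetical weight.
for weight in (3, 70, 60):
    print(weight, "->", classify(weight, stats))
```

Run as written, this should reproduce the numbers from the lesson: a standard deviation of 14.7, typical values from 21.4 to 50.8, and cutoffs at 6.7 and 65.5, with 3 and 70 labeled as outliers and 60 landing in the "could happen" zone.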
