Well, hello students, and now we begin chapter six: z-scores and the normal curve. So what are our goals for this chapter? You're going to learn how to calculate a z-score, the idea behind a z-score, the distribution of z-scores, and how the standard normal curve is used with standard scores, which are z-scores. We're also going to touch on the sampling distribution of means and the standard error of the mean, something that will become more relevant in future chapters, and the central limit theorem tied to that, along with computing z-scores for sample means to determine relative frequency. So we'll be applying concepts from earlier chapters to this notion of z-scores and z-score distributions.

Some new notation. Absolute value refers to the value of a number regardless of its sign, and it is relevant in this and in future chapters. The absolute value of positive 2 is 2, as is the absolute value of negative 2, so it's the value ignoring the sign. Plus or minus: when you see this symbol, it means plus or minus, so plus or minus one refers to either positive or negative one.

Okay, z-scores. Let's start off with a question. Let's say you learned that you scored a 67 on an exam. How would you interpret that 67? Is that a good score or a not-so-great score? Well, it really depends on where that score lies in the distribution of scores, your score relative to other scores. Is it above or below the mean? What's the range of scores? How tight are the scores, in other words, what is the standard deviation? All of these are questions that can be answered by the use of z-scores. Z-scores are a way of standardizing your scores so you can interpret your raw scores in the context of whatever distribution they are a part of. We'll talk about z-scores relative to normally distributed scores; there's always an assumption
of normal distribution in order to interpret the z-scores. What a z-score means is that it's a number that reflects the distance between a raw score, like your score, and the mean, and the sign of the z-score indicates whether that score is above or below the mean. And again, the standard deviation becomes very important. As I said a moment ago, the standard deviation will tell you something about the shape of the distribution: whether the scores are bundled right around the mean or are more spread out. So a z-score is really a location on a distribution, and it gives us a lot of information about where a score falls on that distribution. But because the assumption is that the scores are normally distributed, it can also give us information about things like percentile and frequency. More on all of that in just a moment.

Let's say we have a distribution that looks like this. Let's step back for a second. Let's say we are a researcher interested in the concept of attractiveness. What do people mean when they say someone is more or less attractive? We're trying to understand that concept, so we have some photos of individuals with particular attributes, and we're asking participants to rate each individual on attractiveness. Here we have three photos of individuals: we have Binky, Biff, and Slug. Binky's attractiveness score is 65, Biff's is 90, and Slug's is 35.
So what you see here is that the scores are normally distributed and that the average is 60. Binky is a little above average, Biff's a lot above, and Slug is below. We can translate each of these raw scores into a z-score, and that can give us more specific information about percentiles, frequencies, and where the person falls relative to the mean.

So there's a formula for calculating a z-score. The formula for transforming a raw score in a sample into a z-score, and this is a sample of participants that we're referring to here, is that z is equal to the raw score minus the mean, over the standard deviation, as we saw in previous chapters. It's as simple as that. Now, instead of a sample we may have a population, and the formula looks similar: the z-score is the raw score minus mu, the population mean, if we do know it for that population, over the population standard deviation.

All right, so let's go back to Binky. What can we say about Binky in terms of the z-score? We're going to do more of these calculations in class, but why don't you pause this video and calculate Binky's z-score with this information. Well, actually, I need to tell you one additional piece of information, don't I? We're missing a standard deviation. So if we assume that the standard deviation in this case is 10, how would you calculate Binky's? All right, the calculation would be Binky's score, 65, minus the mean of 60, over the standard deviation, which I told you is 10.
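As a minimal sketch of the calculation just described, the sample formula can be written as a small Python function. The function name `z_score` and its parameter names are illustrative choices, not something from the slides; the numbers are Binky's score of 65, the mean of 60, and the assumed standard deviation of 10.

```python
def z_score(raw, mean, sd):
    """Return how many standard deviations `raw` lies above (+) or below (-) `mean`."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (raw - mean) / sd


# Binky: raw score 65, mean 60, standard deviation 10 (values from the lecture)
binky_z = z_score(65, mean=60, sd=10)
print(binky_z)
```

The same function works for a population if you pass mu and the population standard deviation instead of the sample values.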
So Binky's z-score then is 0.5. Now, we could calculate the z-score for Slug and Biff as well, but let's hold up on that for a moment; I just want to give you an idea of how you would calculate a z-score. Something else you can do: if you're told someone's z-score in a sample but you don't have their actual score, and you have the mean and the standard deviation of the sample, you can also calculate the raw score for that particular z-score, by multiplying the z-score by the standard deviation and adding the mean. Same thing for the population, but in that case you would use the population standard deviation and mu, the population mean. So that's the basics of how to calculate a z-score.

Now, how do we interpret z-scores? If you have a distribution of scores, you can also have a distribution of z-scores. A z-distribution is the distribution produced by transforming all of the raw scores in your set of data into z-scores. For instance, a z-score distribution might look something like this. You have raw scores on this row right here, and right below you have z-scores. So you have z-scores for all of the scores in the distribution, and just like the raw scores they would be distributed normally, like this.

Now, I asked you to calculate the z-score for Binky. Why don't we do that for Biff and Slug, again with a standard deviation of 10. Pause this video and go ahead and do that for these two individuals. Let's see, we have Biff with a score of 90.
To calculate Biff's z-score, what you should have done is 90, his score, minus the mean, 60, over 10, the standard deviation, and so his z-score would be 3.0. You can see it right here; it falls right on a whole-number z-score. There are z-scores in between as well, but this happens to be right there at 3.0. Now you can also calculate Slug's: you would take 35 minus 60, over 10. 35 minus 60 is negative 25, over 10, so that would be a negative z-score of negative 2.5.

So what does the negative tell us? Negative is not a bad score; it simply means that the z-score is below the mean. If you're measuring something like number of errors, a negative z-score could actually be a good thing, right? Fewer errors than average. So again, the negative sign of the z-score just tells us that it's below the mean, positive would be above the mean, and the size of the z-score, the absolute value, tells us how far away from the mean that particular raw score and its corresponding z-score fall.

This distribution looks very much like a raw score distribution, that normal distribution. A difference, though, is that it's going to have a mean of 0 and a standard deviation of 1. You see the mean is zero and the standard deviation is one for all these z-distributions. The underlying raw score distributions will look different depending on the standard deviation, on whether the scores are clustered around the mean or more spread out.

Here's an example of two different z-distributions, one for a statistics test and a second for an English test, and we have two students, Millie and Althea. Let's take a look at Millie and Althea, at their scores, and at what information is gained by knowing the z-score that corresponds to their raw score on these two exams. Well, first of all, let's take a look at the averages. The average score for the statistics exam was 30
and for English was 40. And you can see that for English the scores are a little more spread out than they are for statistics.

Looking at Millie here for a minute: Millie tells a friend, "I got a 30 in English." Is that a good score or a not-so-good score? We see that 30 is actually below the mean. Her z-score for that score of 30 is negative one, so we understand from the sign that it is below the mean. And Althea says, "I got a 37 in stat." So Althea, in her statistics course, got a 37, up here. Now, 30 and 37 aren't really that different as raw numbers, right? However, Althea's statistics z-score is a positive 1.6, so we can see that it is quite far above the mean. She actually did pretty well in that statistics course, and far better than Millie did in her English.

Now let's compare Althea's scores on her two tests. In English, Althea had a score of 45, and her z-score is a positive 0.5, so she scored above the mean just a bit. Her score in statistics, as we just mentioned, was 37. Well, that's actually less than 45, right? So on the face of it you may think she didn't do as well in statistics compared to English, but actually she did better, and that's because of the distribution of scores and the average in the statistics class versus English. In her stat class, her z-score of positive 1.6 is clearly far better than the positive 0.5. All right, so this just gives you a feel for how the different distributions can be better understood by looking at the z-scores.

Now, there are a couple of things that we can calculate from a z-score distribution. We can calculate the relative frequency, for instance. Relative frequency, the proportion of the time a score occurs, or the proportion of the total area under the curve, is something that we can look at with these z-score distributions. What we look at when we talk about the
distribution of z-scores is something called the standard normal curve. It's a normal distribution, and it's made up of standard scores; z-scores are sometimes called standard scores because they're standardized. A standard normal curve is, in theory, a perfect distribution that completely models that normal distribution we have referred to so many times.

So you can look at the proportion of total area under the standard normal curve, and we've done that in previous chapters when looking at what we call a normal distribution. You can see that we have z-scores now on this axis. The mean is of course zero, as mentioned earlier, and one standard deviation is reflected as a positive one or negative one, depending on the direction that you're looking. Then you have a z of positive two, positive three, negative two, negative three, and by the way, you have z-scores all in between here as well.

The percentage under the curve, the relative frequency of the scores or z-scores, is reflected by these percentages, which mirror what we've discussed for the normal distribution. Between the mean of zero and a z-score of negative one, the scores represent about 0.3413 of the entire 1.0 under the curve. So about 34 percent of the scores, or z-scores, will fall between 0 and negative 1, and another 34 percent between 0 and positive 1.
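The 0.3413 figure can be checked numerically. The sketch below is an illustration rather than anything from the slides: it assumes the usual formula for the cumulative area under the standard normal curve, written with `math.erf` from Python's standard library, and the helper names `normal_cdf` and `area_between` are made up for this example.

```python
import math


def normal_cdf(z):
    """Proportion of the area under the standard normal curve at or below z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def area_between(z_low, z_high):
    """Proportion of the area under the curve between two z-scores."""
    return normal_cdf(z_high) - normal_cdf(z_low)


# The two central slices described in the lecture: mean to -1 and mean to +1.
for low, high in [(-1, 0), (0, 1)]:
    print(f"z from {low:+d} to {high:+d}: {area_between(low, high):.4f}")
```

Both slices come out to about 0.3413, reflecting the symmetry of the curve around the mean. Printed tables sometimes differ from computed values in the last decimal place because of rounding.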
And there are proportions for each area under the curve. Between positive one and positive two you have another 0.1359, and then, as you can see, the frequency goes down as you move out, and you get some pretty small numbers here. It's mirrored, symmetrical, in the negative direction below the mean. So about 47 to 48 percent of the scores fall between z-scores of zero and negative two, and that same percentage falls between zero and positive two. So you get all of these proportions under the standard normal curve, and that can be very valuable information that we will be using throughout the course.

Here's an example: Cubby. Cubby has a score of 80 on that attractiveness measure, and the z-score that corresponds to a score of 80 is two. You can see that all of these folks fall below Cubby: about 97 to 98 percent of the cases are less than Cubby's score of 80.

Okay, now looking again at the distribution of z-scores for the attractiveness study. We've talked about Slug, Binky, and Biff; let's add Elvis to this distribution. Let's say Elvis received a score of 40 on attractiveness, and that corresponds to a z-score of negative two. Because we know about this distribution of z-scores on the standard normal curve, we can make a statement about where our new participant Elvis falls. Again, Elvis received a z-score of negative 2, so let's look here at negative 2. What percentile does that represent? How many cases are below Elvis? To find that, we would simply add together these two numbers, 0.0215 and 0.0013, and the sum is the percentile that describes what the z-score of negative 2.0 represents. So what is that number? Go ahead and calculate that, if you will. Okay, hopefully what you came up with is 0.0228, and again, that was just summing together the two areas to the left of the z-score of negative 2.0: 0.0215 plus 0.0013.
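As a rough check on the Elvis and Cubby figures, the area below a z-score can also be computed directly instead of summing slices from the chart. This is a sketch under the same assumption as any standard normal table (normally distributed scores); the helper name `normal_cdf` is illustrative, while the mean of 60, standard deviation of 10, and raw scores of 40 and 80 are the lecture's values.

```python
import math


def normal_cdf(z):
    """Proportion of the area under the standard normal curve at or below z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


mean, sd = 60, 10  # attractiveness study values from the lecture

elvis_z = (40 - mean) / sd  # -2.0
cubby_z = (80 - mean) / sd  # +2.0

elvis_below = round(normal_cdf(elvis_z), 4)  # proportion of cases below Elvis
cubby_below = round(normal_cdf(cubby_z), 4)  # proportion of cases below Cubby

print(elvis_below)  # 0.0228
print(cubby_below)  # 0.9772
```

The two results add up to 1.0 because z-scores of negative 2 and positive 2 sit at mirror-image positions on a symmetrical curve.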
So hopefully you got that right. Again, that is the percentile for Elvis: about two percent of the cases fall below Elvis.

Now let's say we didn't know Elvis had a score of 40. Another thing you can do is calculate a raw score, as we mentioned earlier, if you have the individual's z-score, the mean, and the standard deviation. So we go all the way back to the earlier formula. You can see here that it's 40, but double check and make sure you understand where the 40 comes from. If we didn't know his score, if we were only given the z-score of negative 2.0, but we knew the standard deviation, which is 10, and the mean, which is 60, then plugging in those numbers we should come up with 40. All right, so hopefully you were able to do that. In case you had any trouble with it, the way we would calculate it is that the score would be equal to negative 2 times 10, which is negative 20, plus the 60, so that's a score of 40. So again, we can see here that it's 40, but if you didn't have that raw score available, you should be able to calculate it.

Okay, one more thing that I would like to say before I move on, and I don't have a slide on this, is that you can also calculate a simple frequency if you know what the sample size is. Let's say we have a sample size of a thousand, and you know that Cubby, or someone, falls at about the 98th percentile. We can take the actual proportion, which in this case was 0.9772, times a thousand, and then we can say that that number, about 977, represents the number of people that scored below Cubby. So that goes back to Cubby here.

All right, now we're going to move to a slightly different topic, and one that really feeds into the chapters moving forward: using z-scores to describe sample means. Here I'm just going to present the concepts to you, and we will do examples of this in future
chapters.

All right, so let's say that we're interested in all college students in the U.S. on some variable, let's say creativity. We want to understand this concept of creativity for college students, and we have a population: all college students in the U.S. To understand the creativity level of that population, we might pull out a sample of about 100 students, give them a creativity test, compute a mean for those 100 students, and put them back in the population. Then we choose another 100, calculate a mean, and put those folks back; choose another 100, calculate a mean, put those folks back; and we continue to do that in an attempt to understand what the true mean is for creativity. What we end up with is a sampling distribution of means. Let's say that the mean for creativity is about 60 on our test. Most groups of 100 would have a mean of about 60, but some would be higher, 65, 67, 70, and some would be lower, 40, 50, 55. So we would end up with a sampling distribution of means, and that is where we are with this lecture now: trying to understand how we would use z-scores for the sampling distribution of means, because we can translate each of those means into a z-score.

Now let me talk about a real-life kind of example with these standardized distributions, and that is entrance exams for college, like SAT exams. The mean for SAT scores is set at 500 and the standard deviation at 100. The scores are standardized so that 500 reflects the average score and 100 reflects the standard deviation. If we know that the mean for the population in SAT scores is 500 and the standard deviation is 100, and we wanted to understand where a group of students falls with regard to SAT scores, we could sample that group, test them, and look at where they fall on the distribution of means with
that mean of 500 for the SAT scores and 100 as the standard deviation. That would give us information not only about where they fall on that distribution but about how they do relative to another group of students, or you could do this for individual students as well. Because the scores are standardized, it gives us a good sense of where the individual, or the group, falls. In this case we're talking about sample means, so it tells us where that sample mean falls relative to others and, in a meaningful way, how far away from the mean it is and whether it is above the mean or below the mean.

The central limit theorem tells us something about that. It specifically applies to the distribution of means. When we form a distribution of means from the samples we've collected, that distribution, whether it's referring to SAT scores or something else, will form an approximately normal distribution. The mu of that distribution is equal to the mu of the underlying raw score population, and there's a standard deviation too, which is related to the standard deviation of the raw score population. The standard deviation of that sampling distribution of means is not called the standard deviation; that name is already taken. It's called the standard error of the mean, and this makes sense because it refers to the amount of error in the mean that we've calculated. Remember, we have a distribution of means, and so there's some error: some means are going to be right there at mu, some means are going to be above, and some means are going to be below. So there's going to be some error.

What does that error look like? Here is the formula for the true standard error of the mean. This is the symbol for the standard error of the mean, and notice that this part is a symbol representing a mean. The standard error of the mean is the standard deviation of the raw score population over the square root of N, and this refers to the N in your sample. Now, a z-score for a sample mean. As I said, for these sample
means on this distribution, you can calculate a z-score corresponding to each sample mean, and the z-score would be equal to that sample mean minus mu, over the standard error of the mean, which we've just calculated. Now, just as with raw scores, we can translate all the sample means in a sampling distribution of means into z-scores, and it will look like that normal z-distribution. You can again talk about percentiles and percentages and areas under the curve, the percentile of scores to the left of a score, and so on and so forth: everything that we did previously with z-scores.

This is going to become very relevant in future chapters, because what we're going to be looking at are means and distributions of means, and we're going to try to answer the question of whether a particular mean falls into a distribution or is so unlikely to be in that distribution that it is actually better explained by another population. So this is an introduction to those concepts. I hope you got a good understanding of z-scores from this lecture, but please take advantage of the class time, because what we will do in class is actually go through some examples of the calculation of z-scores, and that's very important to your understanding of the concept. Okay, see you with the next lecture. Bye.
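Since the lecture promises worked examples only in later chapters, here is a small, hedged sketch of the two sample-mean formulas described above. The population values (mu of 500, standard deviation of 100) are the SAT figures from the lecture; the sample mean of 520 and the sample size of 25 are hypothetical numbers chosen only to make the arithmetic easy, and the function names are illustrative.

```python
import math


def standard_error(pop_sd, n):
    """Standard error of the mean: population standard deviation over sqrt(N)."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    return pop_sd / math.sqrt(n)


def z_for_sample_mean(sample_mean, pop_mean, pop_sd, n):
    """z-score locating a sample mean on the sampling distribution of means."""
    return (sample_mean - pop_mean) / standard_error(pop_sd, n)


# SAT population values from the lecture; sample mean and N are hypothetical.
se = standard_error(pop_sd=100, n=25)                                    # 100 / 5
z = z_for_sample_mean(sample_mean=520, pop_mean=500, pop_sd=100, n=25)   # 20 / 20

print(se)  # 20.0
print(z)   # 1.0
```

Notice that a larger N shrinks the standard error, so the same 20-point difference from mu would produce a larger z-score with a bigger sample.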