Hi, welcome to the video on hypothesis tests for the mean, mu. Recall with me from last time: when we're doing statistics, we often want to know something about a population. This is called a parameter. In this picture, this is the stuff above the red bar. We can't figure this stuff out; it's completely unknown. So what do we do in real life? We take a sample. This is all the stuff below the red bar. We take a sample from our population, we calculate a statistic, and we use that statistic to inform us about the population parameter we're concerned about. Earlier we saw confidence intervals. In that scenario, again, pretend with me that we want to know something about the mean of a population. We would take a sample, calculate x-bar (that's the statistic from our sample), and construct a confidence interval that hopefully captures the population mean. If it was, for example, a 95 percent confidence interval, that's telling us the success rate. In other words, if we were to do this 100 times and have 100 intervals, we can expect about 95 of them to actually contain the population mean, which means we're off five percent of the time. So that was what we did with confidence intervals. Now, in this chapter, we want to talk about this: suppose we have a claim about a population parameter. In this scenario, suppose my claim is that the population mean is 50.
Now, what can I do with my sample? I would have a sample, I would calculate x-bar, and I would compare x-bar to 50 and ask: does this sample support the idea that the population mean is 50, or does it make me wonder, "Hmm, I think something's wrong here"? That's the idea behind hypothesis testing. We're going to use our sample, and a statistic (a test statistic, actually), to ask: if we compare this to our claim, does it seem to support the claim, or does it give us reason to pause and say, "I think something's fishy here"? All right, so let's talk about the requirements that we need for hypothesis testing. Now, I have to admit I misspoke earlier when I said that we need these for hypothesis testing. It turns out we don't necessarily need these requirements, and in later chapters we're actually going to loosen or drop some of them. But in the meantime, while we're learning the process, we want to make it as easy as possible for ourselves, so we're going to assume these requirements. These are the same requirements from the previous chapter, so let's go over them quickly. First, we're going to assume that nothing went wrong with our sample: we have a simple random sample from the population that we wish to study. Second, since we're talking about means here, the population is nice, i.e. it's a normal distribution with a mean mu and a standard deviation sigma. The third requirement is that while we don't know the population mean, we do know sigma, the standard deviation of the population. I know this seems kind of silly. I mean, if you have the standard deviation, shouldn't you know what the mean is? Well, that's a bit of a can of worms, but we're using this assumption while we're learning about the process. Again, I need to repeat this because it's going to become more interesting later on: we're going to get rid
of this condition, and when we do, the question becomes how our process gets modified; it'll add an extra step or two. All right, so let's talk about performing a hypothesis test. To be honest, performing a hypothesis test is very much like a recipe. If you like following instructions, or if you like to cook, you might like this section a lot, because it has very standardized steps that we have to follow. The goal, or as we have written here, our quest, is to decide the fate of a claim that has been made about a population mean by using information gained from a sample. That's the whole goal of this process. Now here comes the recipe for how we do this. Step one: we need to come up with, or find out, what this claim is. We call this the null hypothesis. Notationally, this is what's known as H-naught, and we write mu is equal to some value, mu-naught. On the previous slide we had mu equal to 50, so mu-naught would be 50 here. The idea is that this is the claim that we assume is true. Let me make a little side note here: we actually have claims all around us. If you have a bottle of water or a can of soda, it tells you how much liquid is inside. That is a claim. Twelve ounces: that is a claim. Maybe there's 12 ounces in there, maybe not. Anytime you take medicine, say headache medicine, and the label says what the ingredients are, those are claims. So we have claims all over the place, and that's what we're thinking about here. The claim, or the null hypothesis, is what we're assuming to be true. Now we need to test this thing, so we're going to come up with an alternate hypothesis. The alternate hypothesis can take on several forms. The one that I have written here says: if the claim is that mu is equal to mu-naught, then my alternate is that it's not equal to that. This is
what's known as a two-sided alternate hypothesis. I'll talk more about this in class, but I just wanted to make note of it. We can also have other types of alternate hypotheses. These guys right here are called one-sided, because they're saying "it's bigger" or "it should be smaller," so you're picking one side, in a sense. This one would be like saying, "I actually think mu should be smaller than what the claim says," and this one says, "I think mu should be greater than what the claim says." These one-sided hypotheses are a little bit harder because you need some more information, maybe some type of prior information. When we say mu is just not equal to mu-naught, that's when you don't have any such information and you're just saying it's not what the claim is; you're not saying in which direction it differs. The second thing we need to do is decide how often we're willing to be wrong. That sounds a little strange. As you can see here, we have alpha, the Greek letter alpha. This is the significance level, and it's at 0.05, or five percent. This says that when the claim is actually true, five percent of the time, or one time out of twenty, we're going to wrongly reject it. It's kind of the opposite of confidence intervals, right? Confidence intervals say 95 percent of the time we're successful, and this says five percent of the time we're going to be wrong. We need this because it's our cutoff point. After we do our calculations, we look at this and ask: is what we have really super strange, or is it still okay? Again, we'll talk more about this in class, but for a two-sided test, alpha being 0.05 corresponds to the z value of 1.96, and that gives us our cutoff. What I'm thinking about here is this: picture the normal curve, and put a vertical line at z equal to 1.96 and another vertical line at z equal to
negative 1.96. We know that what's between those two vertical lines is the middle 95 percent. So anything between the lines is in the middle 95 percent, and anything outside of them is in the other five percent: that's the weird, strange stuff in the tails of the normal distribution. So after we've set our hypotheses and decided what our cutoff point is going to be for what's strange and what isn't, that's when you go collect the data. You collect that simple random sample, and then you start to do some calculations. Now, I've got to be honest with you: if you change this order up, and you collect your data and then write your null and alternate hypotheses, you can get statistics to say anything you want. So to be an ethical statistician you should always follow these steps in order. At least, that's my opinion. It's important that we don't come with any preconceived notions. Okay, so step one, we have our claims. Step two tells us where our cutoffs are for whether we keep the claim or reject it. Step three, we go find our sample, do the collection, and then we calculate. The sample statistic we're using here is x-bar, and what we calculate from it is a z-score: z equals x-bar minus mu-naught, divided by sigma over the square root of n. You're going to see this throughout many chapters. It may not be z equal to exactly this, but you're going to see this fraction: observed minus expected, divided by spread. That quantity is something we're going to use in many, many more chapters. So we do observed minus expected, divided by spread, and that's our z test statistic. That's our third step. Our fourth step is that we need to make a conclusion. See how we have z-asterisk here, or z-star, equal to 1.96? The question is: is the absolute value of the z test statistic that we got in step three greater than z
star? This is like asking, on the normal curve (so think about that bell curve): is our z value tucked way out in the tails? Remember with me, 1.96 is about two standard deviations from the middle. If you're way out there under the tail of the curve, on either side, left or right, this means that what you have is something very strange if the claim is true. So if the absolute value of your z is greater than z-star, it's really unlikely, given that the null hypothesis is true, that we would have gotten this z-score. In that case, the answer is yes, and you reject the null hypothesis. If the absolute value is not greater than z-star, then z is somewhere in that middle 95 percent of the normal curve, and it's not that strange that you got this value given that the claim is true. In other words, you don't do anything: it's not that strange, and you keep the null hypothesis. So this one is "yes, it's really strange, we're way out in the tails of the normal curve, we reject the null hypothesis," and this one is "it's not that strange." Now, I would like to make a note of this: notice we're talking about the null hypothesis. We never mentioned the alternate hypothesis. The whole logic of this test is that we assume the null hypothesis is true, and we ask: is our sample strange enough to make us doubt that? That's the strategy here. Some people like to think about it like the way court works in our country, where the assumption is that a person is innocent until proven guilty. It isn't that you prove that they're innocent. You get to assume that they're innocent, and then you have to
prove that they're guilty. That's kind of what's going on here. You assume the null hypothesis is true, and you look at your sample and ask: does it give me enough concern to say I should reject the idea that this is true? Notice it doesn't say anything about the alternate; it's all about the null. Okay, let's take a look at an example. Suppose that you take a simple random sample of the heights of your fellow PLNU students who are doing virtual courses along with you on their couches. When you compute statistics on your sample, you find that your sample size is 25 and that the average of the sample is 67.3 inches. You've also been told the value of one parameter for the entire population of undergraduates enrolled at Point Loma: sigma is equal to 2.65 inches. We're going to use a significance level of 0.05, which means we're going to use the z-score 1.96 to evaluate whether what we have is strange or not. So let's go through our recipe. We did things a little bit backwards here, so ignore that we already took the sample. Step one: suppose somebody makes the claim, "I think the average height of our undergraduate Point Loma students is 69 inches." Oh boy howdy, they're tall. Then we're going to say the alternate hypothesis is that the average is not 69 inches. Step two: remember, this is our significance level, our measuring stick for how weird is too weird. We're going to say that if the absolute value of the z value is greater than 1.96, meaning farther than about two standard deviations from the middle, that's a really weird result, and it makes us doubt the null hypothesis. It makes us doubt that the average is 69 inches. So we've set those two things. In practice, we should now go find our sample, but the previous slide already told us that we had our sample, so we're going to calculate our test statistic, our
z test statistic. This is the observed, that's 67.3, minus the expected, 69, divided by the spread, which is sigma divided by the square root of the sample size: 2.65 over the square root of 25. I strongly recommend that you stop the video here, pull out your calculator, and make sure that you know how to get this number on your calculator. If you struggle with this, please let me know and we can Zoom and figure out what's going on. It's really important; I don't want things to go wrong because you and your calculator are not getting along. So make sure that you can do this. Once you've done this, you get negative 3.208. So the question is: assuming that the average is 69 inches, how strange is our sample given this claim? To test this, we take the absolute value and check whether it's greater than 1.96, and boy howdy, it sure is. Think about this: it says our sample mean is more than three standard deviations away from the center. That is really far out in the tails of the normal curve, really unusual. So we say yes, this gives us enough evidence to say something's funny. We don't believe the claim that it's 69 inches, and so we reject the null hypothesis. In class I'll draw some pictures, talk more about what's really going on here, and give you some of the vocabulary that we use in statistics. But if you wanted to say this to the average person who hasn't taken statistics, the net result is: the average height of Point Loma undergraduates is not 69 inches, or 5'9". That would be the summary in non-statistical terms. Now I just want to draw your attention to some questions that we have down here at the bottom. Again, we'll talk about this in class, but these are good things to think about. I'm going to let you read them and think about them, and next time we see
each other we can go over more of this. Thanks, you guys. Bye!
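
If you'd like to check the calculator work from the height example, here is a minimal Python sketch of the four-step recipe. It assumes the same setup as the video: a two-sided test, a known population sigma, and the cutoff z-star of 1.96 for alpha of 0.05. The function name `z_test_two_sided` and its parameter names are made up for illustration; they are not from the video or from any library.

```python
import math


def z_test_two_sided(x_bar, mu_0, sigma, n, z_star=1.96):
    """Two-sided z test for a mean when the population sigma is known.

    Returns the z test statistic and whether to reject the null hypothesis.
    (Hypothetical helper for illustration only.)
    """
    # Spread of the sample mean: sigma divided by the square root of n.
    standard_error = sigma / math.sqrt(n)
    # Observed minus expected, divided by spread.
    z = (x_bar - mu_0) / standard_error
    # Reject the null hypothesis only if z lands out in the tails.
    reject = abs(z) > z_star
    return z, reject


# Numbers from the height example: n = 25, x-bar = 67.3, sigma = 2.65,
# and the claim (null hypothesis) that mu = 69.
z, reject = z_test_two_sided(x_bar=67.3, mu_0=69, sigma=2.65, n=25)
print(f"z = {z:.3f}")
print("reject the null hypothesis" if reject else "keep the null hypothesis")
```

With the example's numbers this should reproduce the z value of about negative 3.208 and the decision to reject the null hypothesis. Passing the cutoff in as `z_star` keeps step two (choosing the significance level) separate from step three (the calculation), which mirrors the order of the recipe.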