Hello, my statisticians, I hope you're having a great day. Welcome to Mostly Math. I'm Ashley, and we mostly do math on this channel. Because it is very close to AP time, we're just going to do some big-picture reviews. So today we're going to talk about how to choose which inference procedure to use. Is it a hypothesis test? Is it a confidence interval? Are we using z? Are we using t? Are we using proportions or means? Are we doing a difference of means? Are we doing chi-squared? Chi-squared goodness of fit? Or, oh my god, pooled data? There are so many different things; how do you choose which one?

This video is not going to cover what each test does, how to perform it, or which formulas to use. If you want a review of the comparison between the different tests, check out that video. This is a review video, so it assumes you already know the difference between all of these things.

If you just look at the list of inference procedures that are options for AP Stats (there are more out there, but this is what's covered in AP Stats and beginning stats in college), there are a lot of them. Take a look at that list; it's crazy. There are so many things you'd have to choose from, and the list is super daunting. But I have a cheat sheet. I mean, not a cheat sheet; I have a way to kind of cheat the chaos.

When I'm deciding on an inference procedure, I first ask myself: what kind of data do I have? Is the data categorical or is it quantitative? Is it stuff and things and yeses and nos, or is it... oh my goodness, it's windy out there... or is it quantitative: numbers, numerical data? As soon as you have that decision made, you've eliminated half of the tests. So that's question one.

Question two is: what question are we trying to answer? In this case there are three
different things you want to be thinking about. One: is what you're looking at a hypothesis-test type of question, where you basically just get a yes-or-no answer, or really a sufficient-evidence or not-sufficient-evidence answer? If you do a hypothesis test where you start with mu equals zero and you're testing to see if mu is not equal to zero, at the end all you get is either "yep, mu is not zero" or "we don't know anything." If that's your question, then a hypothesis test is great. Two: a confidence interval gives you some information on the range of values, so if you're trying to find out whether the data leans one direction or another, a confidence interval is typically really nice. And three: are you trying to establish a relationship? If so, that's when we get into those funky tests: the test for the slope of a least-squares regression line, or the chi-squared tests. So if it's asking about a relationship, that's going to be your next question.

And then, last but not least, if we're dealing with hypothesis tests or confidence intervals, we want to know: is it one sample or two samples? Are you comparing one sample to a claim, or trying to find information about a single sample, or are you trying to compare two samples to each other?

There are a couple of special cases, and two of them I think are important to refresh yourself on, because AP Stats loves to ask these questions. The first: is it a mean of the differences? This would be paired data, where you're lining up the data, finding the difference within each pair, and then finding the mean of the differences. If you're not sure about that, I've got another video that I just did; it's like five minutes, and you can go watch that right up there. And then, god,
where was I? Yes, difference. Okay, so that is a special case where you have two sets of data and you would think you're doing two-sample, but because you're finding the difference between the two of them, you're actually only dealing with a single sample, which is the set of differences between the two samples.

The other special case is pooled data. This only happens with proportions, and only if you're doing a hypothesis test where you assume that p1 is equal to p2. You're still doing two samples, but you do all of the calculations assuming that p1 equals p2, so you can pool the data, combining the two samples to get a larger sample size, basically.

So those are your special cases. Other than that, it's those three questions: is it categorical or quantitative? Are you looking at a hypothesis test, a confidence interval, or a relationship? And is it one or two samples? If you can get those three questions answered, you should be good to go.

All right, so that's the general overview. If you're good with that and ready to move on, great, love you, have a great day. If you want to see a couple of examples, I have some lovely little examples over here, and we're going to talk about how to decide which test to use for each of them. So let's jump right in. Number one: which brand of double-A batteries lasts longer, Duracell or Energizer?

You should pause the video before we do the examples and try these problems first. The best way to learn is to try it first and see where you messed up. Don't just watch me talk and go through it. Pause it, do your own thing. Okay, now come back.

All right, question one: which brand of double-A batteries lasts longer, Duracell or Energizer? Our first question: is it quantitative or qualitative data? Well, it's asking about how long a battery lasts, so that's going to be measured in, probably, minutes
or hours or days, so that's definitely quantitative: numbers. So question one: quantitative, for sure. Question two: are we trying to establish a relationship between these two things? No, definitely not. It's asking which one lasts longer; it's not asking whether there's a relationship between Duracell and Energizer. That's not a thing; it's a weird question to ask. So that brings us to: is it a hypothesis test, or are we going to do a confidence interval? In this case it's asking which lasts longer. And that brings us to our third question: is it two-sample or one-sample? We're going to have Duracell batteries, and we're also going to have Energizer batteries, and then we're going to compare the two, so probably a two-sample procedure.

So we just need to decide: is it a hypothesis test or a confidence interval? If you were to set up a hypothesis test, your null hypothesis would be that they last the same amount of time, so mu1 equals mu2, and your alternative hypothesis would be either that they're not equal or that one is greater than the other. But you can only conclude what you tested, so if you try to show that one is greater than the other and you picked the wrong one to test, you've put yourself in a bind. And if you just show that they're not equal, that doesn't tell you which one lasts longer. So in this case it's actually best to do a confidence interval. If you're looking at Duracell minus Energizer, D minus E, for how long they last: if the difference between the two of them is positive, that means Duracell tends to last longer; if it's negative, then Energizer tends to last longer. It usually helps to think about where you want to go, like,
where do you want to end up? Okay. All right, so: confidence interval, a two-sample confidence interval for mu1 minus mu2.

The reason we ask the quantitative-versus-categorical question is that if it's categorical, you're only dealing with proportions, and if it's quantitative, you're only dealing with means, except if you're in that weird relationship situation, which is chi-squared and the regression line, but we're going to ignore those for a second. Qualitative, you can only deal with proportions; quantitative, you're dealing with means. So that helps you dial it all in. So for number one we have a two-sample t interval for mu1 minus mu2.

All right, number two: according to a recent survey, a typical senior has 250 contacts in their phone. Is this true at your school? Quantitative or qualitative? Well, we're counting contacts, so quantitative, which means we're doing means. And there's a claim, and we want to test whether your school is in alignment with the claim or different from the claim. There's nothing leaning one direction or the other. They're not asking, "Do the students at your school tend to have more contacts in their phone?" They're just asking: is it different or the same? So this is quantitative data, and at this point we're just deciding between a hypothesis test and a confidence interval. Because we're asking "is it this thing or is it not this thing," it's pretty straightforward to do a one-sample t-test for the mean. Either you reject the null hypothesis and say, "Yeah, we have enough evidence to say that it is different at our school," or you don't have enough evidence to say that it's different, and you have to go with that. That's all it's really asking. So that's number two.

Number three: what
percent of the students at your school are on TikTok? All right, if you think about the study you'd have to do, you'd probably send out a survey: take a little random sample and ask people, "Are you on TikTok or not?" The data you're collecting is either a yes or a no. So is that quantitative or qualitative? That's our first question. It is definitely qualitative; you cannot summarize yes or no except with proportions or percentages. So this is qualitative data.

The next question is: are we doing a hypothesis test, a confidence interval, or a relationship test? It's not asking anything about a relationship. We just want an estimate of what percent of the people at your school are on TikTok. You have no claim to compare to, so a hypothesis test doesn't make sense in this case. And since we're doing qualitative data, the other option besides a confidence interval would be a chi-squared test, but there's no relationship to study. We're just asking a simple question, and the best way to deal with this one is a confidence interval, because you actually get a range. You can say, "We're 95% confident that the true proportion of people at the school who are on TikTok is between 30% and 60%," or whatever it is. So that was number three.

Number four: is there a relationship between how fast someone can run and how long they can survive the zombie apocalypse? All right, back to question number one: is it quantitative or qualitative? Well, what are we measuring? We're measuring how fast someone can run, which is probably miles per hour, meters per second, something like that, and then how long they survive the
zombie apocalypse, which is probably measured in days, months, years, or seconds if they're not very fast. Okay, so we've got two variables, and they're both quantitative. In your head you should be thinking, "All right, this is probably either a two-sample or maybe one of those relationship tests we were talking about." And it does ask, "Is there a relationship?" Huh, shocker. Is there a relationship between these two variables? Well, since we're doing quantitative data, the relationship test is the t-test for the slope of a least-squares regression line. You'd have how fast someone can run mapped to how long they survived the zombie apocalypse: numbers on the x-axis and numbers on the y-axis. Then you can plot some data points, fit a line, and test whether the slope of the line is different from zero. It's pretty simple. I think once you start asking those three questions, it's really not that hard. That's in my head, though; maybe it's still really confusing for you, and if that's the case, I'm really sorry that I'm not helping you.

Okay, moving on. For this one we have either a t interval for the slope or a t-test for the slope, and you could probably do either one, honestly. If you do the hypothesis test, you assume that the slope is zero, you get your answer, and then (if you reject) you say, "There is enough evidence to show that there is a relationship between these two things." Or you can do a confidence interval, and here's the tricky thing: if the confidence interval doesn't contain zero, then you have enough evidence to show that there is a relationship. Nice little tip there.

All right, number five: is there a relationship between a student's political party and... ah, that's the wind; it's crazy out there. Is there a
relationship between a student's political party and what type of college they apply to? All right, let's go back to the drawing board. Question one: is it quantitative or qualitative? Well, we're looking at political party and type of college they apply to, so probably liberal arts college, or D1, D2, or maybe we're talking about law school or medical school, who knows. But they're all categories, so this is definitely qualitative data.

Then, what is the question asking? Is it asking for a yes-or-no answer, like a hypothesis test? Are they looking for an interval, a direction that the data is leaning? Or are they asking for a relationship? In this case, yes, it is in fact asking for a relationship between political party and type of school. When we're dealing with qualitative data and relationships, we're talking chi-squared. So then you just have to decide: is it goodness of fit, homogeneity, or association/independence? Homogeneity tests whether a distribution is the same across groups; that's not what they're asking. Goodness of fit is when you have a claim and ask whether your data fits the claim; this isn't that. We're trying to find out whether there is a relationship between these two things, so we do a chi-squared test for association, because association means relationship. It's the same thing. You got it, good job, woohoo. So that is number five.

Number six: who's more likely to own an iPod, Gen Zs (Gen Zers?) or millennials? I mean, I think we already know the answer to this question, but it's fine; we'll keep going with it for the purpose of this video. Who's more likely to own an iPod? Question number one: is it quantitative or qualitative? Well, you're going to be sending out a survey asking people, "Which generation are you in?" and "Do you own an iPod?" And so their answers are
going to be yes or no, and either Gen Z or millennial. So your data is definitely qualitative, so we're talking proportions. Once you've figured out it's categorical data, not quantitative, we go on to our next question: is it going to be a standard hypothesis test, a confidence interval, or one of the relationship tests? You can immediately eliminate the relationship tests; that's not what it's asking. Then we want to know: hypothesis test or confidence interval? In this case it asks who is more likely to own an iPod, so it's asking for a direction without giving any information about which way it should lean. If the question had said something like, "It's been thought that millennials are more likely to have an iPod than Gen Zers," then you could do a hypothesis test with the alternative that the millennial proportion is greater than the Gen Z proportion. But that's not what the question is asking. It's saying we don't know, and we want to know who is more likely to own one. So the confidence interval is the best option here, because when you do a confidence interval, say Gen Z minus millennials, if you end up with an interval that's all positive, then you know Gen Zers tend to have more iPods, and if it comes out all negative, then you know millennials tend to have more. It just depends on which one you subtract from which.

That leads us to our third question, which I've already kind of answered: is it one or two samples? Obviously this is going to be two samples, because you're going to have to ask both Gen Zers and millennials and then compare the two proportions to each other. So this is a two-sample z interval for p1 minus p2. That's how you'd answer that question.

All right, seven: how long do 16-to-18-year-olds spend doomscrolling each day? This is another
survey we send out, and we ask, "Hey, how long do you spend doomscrolling each day?" And how much does that affect your heart? Because it hurts my heart really bad; I hate it so much. I'm definitely a millennial. I'm sorry if I'm not cool. I've got my side part, I've got my skinny jeans, but I can do math, so I hope you can too after these videos. Anywho, moving on.

So, how long do they spend doomscrolling each day? Question one: is it quantitative or qualitative? Well, we're looking at time, so we're measuring in minutes, hours, whatever, so definitely quantitative. Is it a relationship question? Nope, not at all. So now we just have to decide: is it one- or two-sample, and are we doing a confidence interval or a hypothesis test? We're not comparing two groups, so this is definitely one sample. And the question asks how long they spend. You can't get that information from a hypothesis test; all you can answer is a yes-or-no question about some claimed value, and here there's no claim. So a hypothesis test is not appropriate in this situation, and a confidence interval is the way to go. For number seven we have a one-sample t interval for mu.

All right, eight: are the colors of Skittles equally distributed? What is that question asking? Will you have the same proportion of reds as greens as yellows as blues, or are they different? Do they, for whatever reason, make more reds than greens, or more blues than oranges? So we go back to question one: is it quantitative or qual... I know I sound like a broken record, and you've probably already stopped watching the video, but that's fine; hopefully that means you understand it. Is it quantitative or qualitative? Well, we're talking about the colors of Skittles, so we're doing qualitative, and this is a classic case for a chi-squared test. We just have to decide: is it goodness of fit, homogeneity, or association/independence? Goodness of fit will always
be, you know, "Skittles claims that they do 20% red and 30% blue and 40% orange," and then you have your bag of Skittles and you test whether their claim is accurate for your bag. Here nobody hands us a breakdown like that, but "equally distributed" is itself the claim: every color has the same proportion. It's not asking about an association between two variables. And it's tempting to say homogeneity, because homogeneous means the same, but a homogeneity test compares the distribution across two or more groups, and here we have one sample and one variable. All it's asking is whether the colors are evenly distributed, and that is a chi-squared goodness-of-fit test with equal expected proportions. All right, so number eight: chi-squared goodness-of-fit test.

And then, last but not least, number nine: which brand of razor gives a closer shave? Researchers recruited 25 men to shave one side of their face with razor A and the other side with razor B. I'll give you a second to think about it; pause it if you've got to think about it. This one's a little bit trickier.

This one is not terribly straightforward, and the reason is that they haven't told you how you're measuring the closeness of the shave. If they're measuring, for example, the length of the hair after a full shave, how long the little stubble is, then we're doing quantitative data. But if we're having their partner come and touch their face and say, "Oh, this side is smoother than this side," or give a smoothness level on a scale of one to ten, where the scale is really totally smooth, to a little rough, to ow (it's still categories; it's not really numeric), then we're talking qualitative data. So how you'd answer this question depends on how you measure it. Let's just, for shits and giggles, do quantitative: measuring the length of the hair. Now we have one person shaving half of their face with one razor and the other half with another razor. Okay, in this
situation you could do a two-sample procedure, probably a confidence interval, because you want to find out which one shaves closer. But because you're doing it on the same person, and you're probably randomly assigning which side of the face gets which razor (that's where the random assignment comes in), you can actually do a paired procedure on this one. You take the length of the hair on one side of the face and subtract the length of the hair on the other side, so what you're actually calculating, and what you'd be analyzing, is the difference between razor A and razor B. So even though you think you have two samples, it's a better situation, because you can pair the data, and if you can pair the data, you should; it often makes for a much better study. When you take the differences, you only have one set of data: the differences, with a single mean of the differences. Done that way, you end up with a one-sample t interval for a mean, where the mean is the mean of the differences. You could do it as a two-sample, but guaranteed, on the AP Stats test you'll lose points if you didn't treat it as matched pairs. Guaranteed.

All right, that is it, folks. Man, that was a lot. You probably didn't watch all of it, but that's okay; hopefully you got out of it what you needed, and I hope it helps you pick the right test for problem number six on your AP Stats exam that's coming up. Good luck, friends. All right, bye.
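
If it helps to see the three questions written down as a procedure, here is a minimal sketch in Python. It is not from the video: the function name `choose_procedure`, its parameters, and the exact wording of the returned labels are all made up for illustration. It only encodes the three questions plus the two special cases (paired data and pooled proportions). It does not try to choose among the chi-squared tests, so any categorical relationship question simply comes back as association/independence.

```python
def choose_procedure(data, goal, samples=1, paired=False):
    """Pick an inference procedure from the three review questions.

    Hypothetical helper for illustration only.
    data:    "categorical" or "quantitative"            (question 1)
    goal:    "test", "interval", or "relationship"      (question 2)
    samples: 1 or 2                                     (question 3)
    paired:  True for matched pairs; only used for quantitative data
    """
    if data not in ("categorical", "quantitative"):
        raise ValueError("data must be 'categorical' or 'quantitative'")
    if goal not in ("test", "interval", "relationship"):
        raise ValueError("goal must be 'test', 'interval', or 'relationship'")
    if samples not in (1, 2):
        raise ValueError("samples must be 1 or 2")

    # Relationship questions go to the "funky" tests.
    if goal == "relationship":
        if data == "quantitative":
            return "t test for the slope of the regression line"
        # Simplification: goodness of fit and homogeneity are not modeled.
        return "chi-square test for association/independence"

    if data == "quantitative":
        # Special case 1: paired data collapses to one sample of differences.
        if paired:
            return f"one-sample t {goal} for the mean of the differences"
        if samples == 1:
            return f"one-sample t {goal} for mu"
        return f"two-sample t {goal} for mu1 - mu2"

    # Categorical data means proportions.
    if samples == 1:
        return f"one-sample z {goal} for p"
    if goal == "test":
        # Special case 2: the null assumes p1 = p2, so the data are pooled.
        return "two-sample z test for p1 - p2 (pooled)"
    return "two-sample z interval for p1 - p2"


if __name__ == "__main__":
    # Examples from the video, in order (Skittles is left out because
    # the chi-squared variants are not modeled in this sketch).
    examples = [
        ("batteries", ("quantitative", "interval", 2, False)),
        ("contacts", ("quantitative", "test", 1, False)),
        ("TikTok", ("categorical", "interval", 1, False)),
        ("zombies", ("quantitative", "relationship", 2, False)),
        ("political party", ("categorical", "relationship", 2, False)),
        ("iPod", ("categorical", "interval", 2, False)),
        ("doomscrolling", ("quantitative", "interval", 1, False)),
        ("razors", ("quantitative", "interval", 2, True)),
    ]
    for name, args in examples:
        print(f"{name}: {choose_procedure(*args)}")
```

The order of the checks mirrors the order of the questions: relationship questions are peeled off first, then the data type decides means versus proportions, and the sample count and special cases come last.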