Transcript for:
Understanding Type I and Type II Errors

All right, in this video we are talking about type one and type two errors. A type one error occurs whenever you reject the null hypothesis but you shouldn't have, because H0 is actually true. Your null hypothesis is actually true, so we shouldn't have rejected it. A type two error is the exact opposite of that: you do not reject the null hypothesis but you should have, because your null hypothesis was actually not true.

The probability of a type one error is alpha, and the probability of a type two error is represented by beta. Hopefully alpha looks familiar. It's the same alpha that we've been working with whenever we talk about one minus alpha, our percent confidence. We'll do an example of how you can calculate it. A lot of times it's given to you, and that's just something that we set and say, okay, this is how comfortable we are with a type one error occurring from our statistics, and you just kind of run with it. But we'll show you a way to calculate it too, depending on the context of your problem.

To give some context for what a type one error and a type two error are, let's consider a fire alarm. A type one error would be when the fire alarm goes off but there is no fire. This is if we set our null hypothesis to be that there is no fire. If we end up rejecting that, then we assume, okay, there is a fire, and so the fire alarm goes off. We say, hey, we've got to clear out of this building, but there actually was no fire. So that's an example of a type one error. A type two error would be if the fire alarm does not go off but there is a fire. Again, our null hypothesis is that there is no fire, and we did not reject that. We said, yep, you're right, there is no fire, but there actually was. That's a type two error.

Maybe pause the video here and think through that for a little bit. Really grasping the context behind a type one and type two error is really important for moving forward and actually
understanding the math that goes along with it.

All right, jumping into a type one error problem. Consider the null hypothesis that the average weight of male students in a certain college is 68 kilograms, against the alternative hypothesis that it is unequal to 68. That would mean H0: the average weight, so that's mu, is equal to 68, and H1: mu is not equal to 68. The critical region has been chosen as x-bar less than 67 and x-bar greater than 69. That is somewhat arbitrarily set by the researchers. They're going to say, okay, this is the point at which we say this is no longer true, the point at which we would reject our null hypothesis. This here is a figure depicting that: if our sample mean weight is somewhere between 67 and 69 we're good, but if it's less than 67 or greater than 69 then we're going to reject our null hypothesis.

If we translate that to our normal curve, this should look familiar. We have our mean set at 68, and then we have our reject regions defined by these values of weight, 67 and 69. Our type one error is going to be alpha, and it's going to be divided between these two regions. This is similar to how, when we do a two-tailed test, we use alpha over two. What this graph shows is, assuming that our mean is true, assuming that H0 is true, this is what our curve would look like, and we are going to reject anything that is in our critical region. The areas in which we are rejecting are these shaded regions, so we want to use our Z table to identify how much of our probability is covered in those critical regions.

We were also given a population standard deviation of 3.6 and a sample size of 36. If we do the math here, the probability of a type one error is going to be based on our Z table, because we have our population standard deviation, and we need to find the area under the curve. So we need to find z values for each of these boundaries: z is
equal to what, and z is equal to what. Once we have those z values we can use our Z table and find the area under the curve. I'm going to call the z value at 67 z1 and the z value at 69 z2. So z1 is going to be equal to x-bar minus mu, over sigma over root n. Hopefully that looks familiar. Our x-bar is the 67, which is the point of our reject region, and our mean is 68, since we're assuming that this mean is valid. Then we divide by 3.6 over the square root of 36, so z1 = (67 - 68) / (3.6 / 6), and this gives us a z value of -1.67. For z2, same process: (69 - 68) over 3.6 divided by the square root of 36, which is equal to positive 1.67.

So our alpha is going to be the probability of z less than -1.67, which gives us the left shaded region, plus the probability of z greater than 1.67, which gives us the right shaded region. Adding those two together, because our normal curve is symmetrical, we can actually just do two times the probability of z less than -1.67. Then we go to our Z table and find out what the area under the curve is for this value, which is 0.0475, and we ultimately end up getting 0.095. Therefore the probability of a type one error is 9.5%. What that means is that 9.5% of all of our samples, at a size of 36, this is specific, would lead us to reject our mean of 68 kilograms when in fact it is true.

All right, I'm going to give you some properties, or rules, of hypothesis tests related to type one and type two errors. We will talk about type two errors more in the next video. Those get really in depth, so I wanted to break that out into a separate video. I'm just going to give these properties to you, but it is going to be very important that you sit down and think through the math and the theory related to them. This will likely show up on an exam at some point, and memorizing them is not worth the amount of effort compared to thinking through and actually understanding what they mean. So, some properties. Property number one,
and this is on page 329 of your textbook if you want the full version: type one error and type two error are related. A decrease in the probability of one generally results in an increase in the probability of the other. So they are generally inverse to each other: as one increases the other decreases, and vice versa. All of these are on page 329, let me clarify there.

Number two: the size of the critical region, and therefore the probability of committing a type one error, can always be reduced by adjusting the critical values. Just like I said at the beginning of that example, those critical values were pretty arbitrarily selected. You can change those depending on how you want your study to look. So your critical values directly impact your type one error, and the critical values are chosen by you. You can pick them and move them around depending on how you want your study to look.

Number three: an increase in the sample size n will reduce alpha and beta simultaneously. So as n increases, alpha and beta decrease simultaneously. I've probably said this a hundred times at this point, maybe I haven't, but I'm going to say it again regardless: more samples is always better. They can get really expensive, and that's another thing to consider. You want to weigh the expenses that come with having more samples against the statistics. But when it comes to doing the math, the more samples you have, the more reliable your results and conclusions are going to be. This is just another reason for that: the more samples you have, the lower your alpha and beta, so your type one and type two errors, are going to be. More samples means lower error.

And the last property here: if the null hypothesis is false, beta is a maximum when the true value of a parameter approaches the hypothesized value. The greater the distance between the true value and the hypothesized value, the smaller beta will be. So what that means is, as the
true mean gets further from the hypothesized mean, the smaller, and so the better, beta will be. This also means we are more likely to detect the difference. Think about it, going back up to this example with the weights of the class. If the true mean is 75, let's just say, and we hypothesize that it's 68, all of our data is going to tend towards 75, and so we're going to be pretty quick to notice that, hey, things are not lining up here. If a lot of our data is trending towards 75, any calculation we do is going to show that we are not that close to 68. But if our true mean is 69, our data is only going to be shifted slightly higher, so it's going to be a lot harder to really detect whether this is different from that. We're going to need a lot of data to really support that it is off.

So that's what that last one was talking about. The further away our guess is from our true mean, the more likely we are to detect that we are guessing the wrong number in the first place. It doesn't tell us a ton about the true population, but just keep that in mind: as the true mean gets further from the hypothesized mean, the more likely we are to detect it, and ultimately the smaller, or better, our type two error will be. We'll talk more about type twos in the next video.
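
As a rough check on the weight example and the properties above, here is a minimal Python sketch. It is not from the video or the textbook: the function names type_one_error and type_two_error are made up for illustration, it assumes the sample mean is normally distributed with standard deviation sigma over root n, and it keeps the critical region fixed at 67 and 69 throughout. Because it uses the unrounded z value (about 1.6667) rather than the 1.67 read off the Z table, alpha comes out near 0.096 instead of 0.095.

```python
from math import sqrt
from statistics import NormalDist


def type_one_error(mu0, sigma, n, lower, upper):
    """P(reject H0 | H0 true) when we reject for x-bar < lower or x-bar > upper."""
    # Sampling distribution of x-bar, assuming H0 (mean mu0) is true.
    xbar = NormalDist(mu=mu0, sigma=sigma / sqrt(n))
    return xbar.cdf(lower) + (1 - xbar.cdf(upper))


def type_two_error(true_mu, sigma, n, lower, upper):
    """P(fail to reject H0 | the true mean is true_mu)."""
    # Sampling distribution of x-bar under an assumed true mean.
    xbar = NormalDist(mu=true_mu, sigma=sigma / sqrt(n))
    return xbar.cdf(upper) - xbar.cdf(lower)


# Values from the worked example: H0 mean 68, sigma 3.6, n 36, region 67 to 69.
alpha_36 = type_one_error(68, 3.6, 36, 67, 69)

# Hypothetical larger sample with the same critical region (n = 64 is an assumption).
alpha_64 = type_one_error(68, 3.6, 64, 67, 69)

# Hypothetical true means, close to and far from the hypothesized 68.
beta_near = type_two_error(69, 3.6, 36, 67, 69)
beta_far = type_two_error(75, 3.6, 36, 67, 69)

print(f"alpha with n=36: {alpha_36:.4f}")
print(f"alpha with n=64: {alpha_64:.4f}")
print(f"beta if true mean is 69: {beta_near:.4f}")
print(f"beta if true mean is 75: {beta_far:.2e}")
```

Running it should show alpha dropping when n goes from 36 to 64 with the same critical region, and beta dropping sharply as the assumed true mean moves from 69 out to 75.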