Transcript for:
Understanding Key Statistical Concepts

So, last time we were talking about a number of different things. (I thought I had decided that was too dark. Okay, improved.) Where we had left off was that I had just covered the binomial distribution, and I talked about how the binomial distribution describes a particular number of successes out of a total number of trials. A trial is any kind of thing that we could label as an experiment, in particular what we would label as a Bernoulli experiment. A Bernoulli experiment is one that has only two outcomes, success or failure.

And then we come to the central limit theorem, and I believe I gave at least the rough overview of it. By the way, I believe the link is posted. Did I post it as an announcement, or did I put it in the module? Okay, yeah. So the central limit theorem, as I stated last time, is very, very important. It states that when we want to talk about a sample mean (a sample mean is where we have collected information from a population; a population is just the thing that we're curious about, and the samples are what we have actually gathered as information), then regardless of what the distribution actually looks like, the sample mean should behave like a normal distribution. I had previously talked about the normal distribution, and that's what we are referencing for this. So for instance we have this example where it says a retail store analyzes the average daily sales for 30 days, and thanks to the central limit theorem, even if daily sales fluctuate wildly, the average daily sales over many months will follow a normal distribution. So if we want to look at a particular situation such as this, the information we're to gather from this is that if we wanted
to sample, we could use what's referred to as the random function, which is RAND and then open bracket, close bracket, that is, RAND(), to generate random samples, calculate the mean from that, and then compare with the histogram. Now, we're going to actually look at what that means in more specifics today.

Okay, but let's now go through some of these interactive examples. These interactive examples are for all the topics of this section, and the section that we're working in at the moment is where we're dealing with measures of central tendency and distributions. So these are examples that have to do with all of those content areas.

The first interactive example is right here. A business tracks the number of new customers each month as 50, 55, 60, 65, 70, 75, and 80. What are the mean and standard deviation? Okay, so let's see if we could answer that. Let's bring up our favorite Excel file, and by our favorite Excel file I mean just any old Excel file will do. The way that we're going to do this is that we enter in the values 50, 55, 60, 65, 70, 75, and 80. Whoa, that is not how we put 80. There we go, that is how we put 80. And then write "mean" and "standard deviation". Now, those do not create the mean and standard deviation, but I want those as words so that beside them will go the values. So now, for me to write the mean, I could say simply... if I type that, is that going to work? Well, let's find out. No. Whoa, why? Because that's not what Excel calls it. If we instead type AVERAGE, boom, that works. And if we want to know how to do the standard deviation, the way to do it is to type equals ST... Okay, now the particular one that we're doing is this one; this is the formula that works. Then I can just select the cells that I want to use, close it off with a little bracket, click okay, and I get the answer of 10, which is correct. And if I go
back to here, I can see that that's exactly the answer as presented here. Cool. So that's a worked-out example: if you really wanted to do it by hand, that's how you would do it by hand, but there is absolutely no need for us to do that by hand. And where we need to know how to do it in Excel, we can see that we can also do it in Excel, because I put those directions in there as well. Any questions so far? That was also a bit of a review, because we had done those calculations before, so that was basically a refresher.

Okay, so now let's answer this next part, which is analyzing skewness and kurtosis. Ooh, fun. You can basically think of these examples as example questions I could ask you, and then you could answer. A retail store notices that most sales are either very low or very high, with few in the middle. What does this suggest about the skewness and kurtosis of the sales distribution? I'm going to give you a moment to turn to your partners. Obviously the answer's there, because there is the solution button, but see if you could come up with at least trying to answer as much of that as you could, and then I'll explain what the answer is. So turn to somebody, describe, discuss, consult. And it's been a few days since we've done it, so I'm going to say this is very cool to be able to use the hint for.

Okay, so let's imagine some graph of the sales. If all of them are high or all of them are low, then what would that mean in terms of skewness? If they're high like this, all of them... I'm doing this where this is high, meaning it's on the right-hand side for you; in other words, I'm doing it like stage left, stage right. So if all of the sales are on the high side, what's that tell us about the skew? Is it skewed or is it not skewed? For starters, it's skewed, right? And which
way is it skewed? Skewed towards the positive. What about if all the sales were on the low end? Skewed towards the negative, yeah. Okay, well, now also, what would happen if all of them were towards one side, and that would make it so that there was a tail, like the tail on the right was high? The tails are the extreme values. So if the really high sales have a lot of things, what does that mean for kurtosis? Kurtosis is how big the tail is. If all the sales are high towards an extreme, then that means that the kurtosis is high.

Okay, well, let's see what the actual solution is. If most sales are low with a few high sales, then it's likely right-skewed, which is what we would call positive skewness. And then reverse that if you want to go with negative skewness. The presence of many extreme values suggests high kurtosis, which means heavy tails. And if we actually had specific numbers (at the moment we don't have any numbers), this is how we would go through doing the calculations, and we saw both of those before. So, kind of making sense?

Okay, so now let's go with example three, applying the central limit theorem. Now, I haven't fully talked about the central limit theorem, so I'm going to provide some graphical examples of a lot of these types of things in a few moments. A manager tracks daily sales across 10 stores and notices that sales fluctuate wildly each day. How can the central limit theorem help the manager predict average weekly sales? Let's look at what the hint is; that should give us a good hint, well, hopefully. So: think about what happens when you take the average of multiple days' sales across different stores. I'm just going to go specifically to what it is. The central limit theorem suggests that the distribution of the average daily sales across 10 stores over a week will be approximately normal. And what does that mean? It means that approximately what we want to have is we
want to have it where we have some central value, which is the most frequent, and then decreasing evenly on each side off from that. Cool, that's what the central limit theorem says.

Okay, so that is sort of the plain old notes version. Now here, upon extremely intelligent request, we have some graphics to explain each of these kinds of concepts. So let's look at how to interpret the graphics and what they mean. For starters, if you just look at these as is, it just looks like pictures with absolutely no explanation. That's not the way to look at it. If you click on one of these, it adds text at the bottom. So if I click on low kurtosis, for instance, it puts text at the bottom that explains low kurtosis. So please make sure that you try clicking on things when possible to have an explanation.

Okay, so first up is the normal distribution. A normal distribution is bell-shaped and it's symmetric. Bell-shaped means that it's a curve like that, and the big key feature is that the mean, the median, and the mode are all the same. That's the important stuff to know about a normal distribution, and if you go down to the text right here, it explains that exact same thing. Most of the data values are what we would refer to as clustered around a central value. Awesome.

Now we see the right-skewed distribution. What is a right-skewed distribution? Well, can you see how the mode (the mode is the most frequent value) is to the left of the halfway point? The halfway point is what we would refer to as the median, where half of the things are to the left and half of the things are to the right. Well, that's what it means to be right-skewed. As it says, the tail on the right side is longer, and that means that the mean is greater than the median. That's what it means to be a right-skewed distribution. Left-skewed distribution: well, that's the opposite. That's where there's more things
that can occur on the left, and if you click on that, it'll give you that description, just like so. High kurtosis: if you click on high kurtosis, it will indeed correctly tell you that high-kurtosis distributions have heavy tails and a sharp peak, indicating more frequent extreme values. So the idea is... let me adjust it so it looks the way that I want it to. The idea is that you imagine there being a horizontal line on this graph, like so. For that bottom one, which is the high kurtosis, there's more stuff happening on the edges there than on this picture; there's less stuff happening on the edges on the low-kurtosis picture. What do I mean by the edges? As you get further away from the central point, there's more further away from the central point on this one than there is on this one. Can you graphically see what I mean by that? So more stuff's happening as you move away from the mean. Kind of cool with this?

We also have some additional graphs of high kurtosis and low kurtosis, and similarly, clicking on them brings up little descriptions. Since I just reviewed these, I'm not going to re-review them, because there are actually more interesting pictures when I go to our next section. Our next section is sampling, and sampling is going to be where I'm going to say we will have the best time at hopefully trying to understand what the central limit theorem actually means. So let me get this nice and big on the screen.

Okay, so these are little interactive widgets, and we're going to use these to help to try to reach an understanding of what the central limit theorem means. Now, these don't make sense without me telling you how to interpret them. So let's suppose that we had something that was able to take on the following values: it was allowed to take on the value of zero, and it was allowed to take on the value of one, and
those are the only two numbers that it was ever allowed to be. Now, could you come up with an example of something that would be like that, where it's only allowed to take on the values of zero and one? What's something that kind of sounds like that? Flipping a coin. Now, a coin only takes on the values of heads and tails, but guess what: the fact we call it heads and the fact we call it tails is kind of made up and imaginary, right? So we just as easily could call it zero and one. Bless you.

Okay, so now for zero and one, what's the average value of zero and one? Well, how would you get that answer? You would say we'll take 0 + 1 and divide by two, which is the total number of values. So what's the average value that it takes on? 0.5, right? So here's the thing: if we wanted to know what we should expect to happen between zero and one, the answer actually is one half.

Now let's pretend... okay, this is not a coin. Can we all agree that this is in fact the lid of my coffee cup and it's not a coin? Let's pretend it's a coin. I'm not going to throw it because I don't want it to get germy, but let's pretend that this is a coin, and let's pretend that this is heads. So this is heads and this is tails. When you flip a coin, what is the thing that you should actually expect to happen? Well, the answer is you shouldn't expect it to be heads, and you also shouldn't expect it to be tails. You go, well, hold on a second, I thought those were the only options. Well, what you should actually expect is that, because this side and this side are equally likely, it should land like that every time. Anybody ever flip a coin and it lands exactly like that, perfectly? I did one time. You know how it happened? I flipped a coin on top of a muddy pond, so it landed straight in the mud. It actually happened. It blew my mind. It was insane. I was calling people later that day: you'll never believe what happened. It was awesome. Okay, but the reality is that's actually what should happen, because if this side and this side are equally likely,
it should land like that every single time. But it doesn't, right? And the reason why it doesn't is because things don't actually happen perfectly the way that we imagine them to happen. That's the only reason why that doesn't happen all the time. So that really means that when we get 1/2 as an answer for this kind of question, it does make sense; it's just not practically important, because 1/2 is not one of the options that's allowed, right? It's either zero or it's one, which is like saying it's either heads or it's tails; it's not halfway heads and tails. But that's what the one half means: it means it's halfway between heads and tails.

Okay, now right here we have something that's going to give us an explanation of what the central limit theorem means. This line right here is the value of 1/2, and up here we're going to take samples of this experiment of, quote unquote, tossing a coin, and we're going to average all of the numbers we reach. So this one that's right here means that the first time we flipped the coin we got a one, which we can think of as heads. It doesn't really matter whether we label it heads or tails, but let's consider one heads. I'm now going to increase my sample size ever so slightly, and we'll see if I can drag it over to hopefully get two. Okay, that was not two. There we go. So I've flipped the coin now twice. Where am I getting the twice from? With two numbers: I got a heads, and it turns out the second time I flipped it I got another heads. Now what happens if I take the average of those two numbers? What's the average of one and one? It's one. Let's now add a third number. Turns out I flipped it three times, and each time I got what? Each time I got one. The average of all those numbers is one. Now these red dots are the successive averages. At the moment, are they close to what we said we should get when we flip a coin? Well, we said what we should get is 1/2. At the moment, are they
close to 1/2? Not at all. Okay, now we've added nine numbers in. So the first time we flipped it we got heads, next time heads, next time heads, then we got a tails, then a tails, then a heads, then a tails, then a heads, then a heads. But we did that as numbers, so we got 1, 1, 1, 0, 0, 1, 0, 1, 1. And what's the average when we take those numbers? It's 0.667, which is closer to what we said we should get, right, which is 1/2.

Okay, now we're talking about 36 numbers; I dragged the slider over. Here we see that the averages that we're taking can move up and they can move down, but at the moment are we close to the number we said we should reach? We're much closer to the number we said we should reach. Well, now let's really bump it up a lot. Are we closer to the number we said we should reach? Yep. Now we're a little bit less; we're less than the number, but we're closer than we were before. And so that's the idea: when we take averages of the samples (the samples are the things that we actually collected), we get really close to what we would expect the value to be. This is one of the ideas of what we call the central limit theorem. Now, the other way to look at it is that if we map out the frequency as we increase the sample size, our frequency distribution, which is what this is, will keep getting closer and closer to looking like a normal distribution. Cool, cool, cool.

Okay, so that covers sampling for what we needed to know. Bless you. The basic and most important idea we need to know is that when we take averages, the average makes us get closer to what we would sometimes refer to as our theoretical mean. So just like for a coin flip our theoretical mean is 1/2, as we continue to sample more and more values, we're going to get closer
and closer to that number for what actually happens as our average.

Okay, so we have already talked a little bit about variance, and there are two concepts that are highly related, which are variance and covariance. They both have to do with relationships between variables. Now, variables are one of the things that we use throughout mathematics, and a variable you can simply think of as an unknown. That's what a variable is. But that's not a really easy kind of thing to conceptually understand, so I have a way of describing in simpler terms what a variable is. I have a special analogy for describing variables. Would you like to hear my analogy for describing variables? Okay. Now before I tell it to you, I have to warn you that every analogy is flawed. The problem with using analogies to explain things is that analogies aren't perfect, and you'll be able to poke holes in how it explains the concept. So you just have to suspend that for the time being. And now is where I tell you what the analogy is: variables are cats. Boom, that's the magic reveal.

Okay, how many people own a cat? How many people have ever owned a cat? And how many people have had any kind of serious contact with and/or been around cats? Okay, much higher number. I absolutely love cats, by the way, so you might have picked this up by now. One of the things that is true about cats is that they may or may not greet you when you come home. In fact, most cats will not greet you when you come home. They will acknowledge you, maybe, but they're not really going to greet you. But even for cats that greet you when you come home, the one thing that is fairly universally true is that at some point or another, maybe every single day, maybe multiple times per week, your cat will do something, which is that it will hide somewhere, and you will not have a clue where the cat is, and you will call for the cat. Now here's the thing that new
cat owners really need to know: the cat hears you, and more often than not sees you, and then, even worse, knows that you're calling for it and is choosing to ignore you. Now, it's a really important thing also for anybody who's a new cat owner to realize the cat is not trying to be a jerk. It's actually trying to be very helpful, because the thing about cats is that they happen to think humans are really clumsy giant cats. In most of the animal kingdom, one species doesn't choose to cohabitate with another species, so cats are like, well, the only reason this person is living with me (and that's the way cats see it, by the way; they see you as living with them) is because they must be a cat. And then to add to it, they happen to think you're terrible at catching things, and they think that you probably have the worst senses in the world. So they see it as their job to train you. One of the things that they do is they hide, because they're trying to have you find them. And honestly, a lot of the time they're very judgmental that you didn't find them. They're like, I was right over there, couldn't you smell me? I could smell you. I smelled you as you were walking up to the door. So they're trying to train you to be a better hunter. That's very nice of them; it's very generous.

Well, variables are like cats in the following way: cats could be anywhere. They could be anything, anywhere, who knows. But there are two types of cats, or at least two main varieties of cats. How many have ever seen a Sphynx cat? If I said it's a hairless cat, do you know what I'm talking about? Okay, so that type of cat is the type of variable that we typically use throughout most of math. Most variables that we use in math are like Sphynx cats, and the variables that we typically use in math are what we would refer to as deterministic variables. So for instance, if you remember from algebra class, however anciently long ago that was
for you, if you remember something where you would have had y = x^2, that's an example of what we would call a deterministic variable. The reason why we call them deterministic variables is because if I tell you what the x is, you know exactly what the y is. If I told you that y was equal to x^2 and the x was 3, that means you automatically know that the y is 9. That's just the way it works; everything is determined, and that's why we call it deterministic.

Probabilistic variables don't work that way. What's an example? Okay, well, let's imagine that I had a coin and I put that coin in my pocket. Now, I know the only possibilities on what it can be, correct? I know that it's either heads or tails. But let's pretend I have a coin in my pocket. Can you tell me what it is? No. Can I give you information that would let you know what it was, apart from directly saying at this moment it is tails? No. The only way for you to actually know at any point in time exactly what it is is for it to be exactly, perfectly, totally told to you. There's no information I can give you ahead of time that would let you deduce what it was. And so for that reason, probabilistic variables are not just any old type of cat. I think they're called Persian cats. Has anyone ever seen those extremely fluffy cats? They're just like giant balls of fur. They're adorable. So that's what you have to imagine for probability variables, because the thing about those giant super-fluffy cats is that even when you have them in your hands, you kind of don't know where any of their limbs are. It's like, is that a leg or is that a tail? I can't really tell, because there's too much fur for you to tell at all what's happening. And that's the way it is for a probability variable. If I hold a coin in my hand, is it heads or is it tails? I don't really know. I mean, I know the possibilities. The same thing for the cat: I know the possibilities are it's either a hand or it's, you know,
or it's an elbow or it's a tail or it's the head. I don't really know what I'm grabbing, though. Same thing for the coin: I don't really know what I'm grabbing. And if I really precisely look, I might know for a minute, but the thing about cats is that they move, so the very next second it could be something different. So that's why I analogously say that probability variables are like very fuzzy cats. It's a helpful analogy, or at least so I am told by previous students for the past number of years, and so that's a reference I'll make frequently: probability variables are like very fuzzy cats. It's a statement that makes no sense unless you've heard my explanation.

Okay, so when it comes time to talk about the variance, the variance of a probability variable is kind of like how much the cat moves around, which you could also think of as the range that the cat moves through, but it's really what we would call the spread. Now, the covariance is a related concept. It's a way of connecting, not just when you have one cat, but when you have two cats: how do they connect to each other in terms of the way that they move? So let's suppose that you had a cat over here and a cat over here. I'm totally using this analogy too much, but I'm going to run with it. Let's suppose I had two cats, and every time this cat moved, this cat moved as well. Then that means that the two cats are kind of synced up, right? If this one moves this way, this one also moves this way; if this one jumps this way, this one also jumps this way. And so we would reference those two cats as having a high covariance. But what about if it was now like this, where if this cat jumped this way, it startles this cat, and this one jumps that way, which is the opposite way? That's what we're going to label as negative covariance, meaning they kind of have an opposite reaction from each other. So if we have the two cats right here and this one goes that way, like it jumps over that one, which way
does this one have to go? That way, to kind of flee away. That would be a negative covariance. Okay, so that's the sort of analogous setup to these things. Now let's talk about what these actually have as formal definitions. So, formal definition: the covariance and correlation are two statistical measures that describe the relationship between two variables. The covariance measures how two variables change together; that's like what I was saying in terms of the two cats, how they interact with each other. Well, correlation is going to be a related concept to covariance: correlation standardizes this relationship.

Okay, so what do I mean by standardizing? Well, let's imagine the following. Again, I'm keeping with the cat analogy; just got to roll with it. If anybody happens to be allergic to cats, I can switch to something different, but I don't think that matters because I'm just talking. Okay, so imagine right here I had two kittens. Kittens are cats, right? And let's suppose that this one jumps a little bit that way. It's a kitten, so it's not going to jump that much. And pretend there's a table here, so they're not just jumping in midair. This one jumps a little bit this way, and there's positive covariance between them. If this one jumps this way, which way does this one have to go? Also this way; they track together. What about if there's a negative covariance between them and this one jumps this way? Which way does this one have to jump? That way, right.

Now let's imagine that instead of tiny little kittens, they're giant Siberian tigers. So giant Siberian tiger here, giant tiger here. Let's suppose that there's positive covariance between them. This one jumps 10 feet that way; what should this one do? Jump 10 feet that way, right? Positive covariance. Okay, is there a difference between when we had the kittens and when we had the tigers? Kittens, do you think they're going to jump 10 feet? Probably not. Okay, but we also can say, but
hold on a second, the tigers were proportionally much bigger, right? So we want a way of adjusting: compared to the actual size of it, when was it a big movement versus when was it a small movement? Because for a Siberian tiger, is a 10-foot jump a big jump? I'm not requiring anybody here to be a tiger expert, but is a 10-foot jump a big jump for a Siberian tiger? No, it's not. Would a kitten be the most athletic in the entire world if it jumped 10 feet? Yes. It would also be terrifying as the kitten's owner. You'd be like, huh, and then you'd wonder if you'd accidentally spilled coffee in the kitten's, you know... So absolutely, we need a way of adjusting for when the reactions between the two things are truly big reactions and when they are reactions that aren't in fact that big. That's going to be where we have this thing called the correlation, which talks about the same things as the covariance but adjusts for the size of the variables involved. So conceptually, do these things make sense? We haven't really talked about any kind of numbers with them, but by concept, does it kind of track what I'm talking about? Pretty good.

Okay, so here I give an explanation where I talk about two people walking; that's basically the same thing as the cat analogy I just used. But let's go with what is a business example of these. I just got done talking about cats a whole bunch, and unless you happen to be Petco, you probably don't care that much about cats as a company. So a company might want to understand the relationship between their advertising spending and their sales revenue. Covariance would tell them whether increases in advertising spent (that's a typo; it should be "spend", not "spent") tend to (bless you) result in increases, or subsequently decreases, in sales revenue. So in other words, we would want to know: if I make the advertising more, does that increase sales? That's a pretty
important question, right? That's actually the key of any marketing research: if I spend more on this particular marketing campaign, will it increase sales? That's a really important question.

Okay, so understanding the covariance: there is a formal definition of it, and you can write it out by hand. I will tell you, however, I don't particularly need you to ever write it out by hand. The big thing I will need from you is your ability to do the following, which is to calculate the covariance using Excel. So next time, what are we going to do? Well, next time we're going to generate an example that we will do in Excel, where we will calculate a covariance, and then we will look at a couple of different examples and try to interpret what they actually mean based upon the covariance that we find. Then we're also going to try to understand the correlation by looking at those same examples and computing their correlation. But the big thing is that if you keep the analogies that I used today in your head, it will make all of this make so much more sense. So does that sound like a good plan? I think this is one of the rare instances where you get to be told at the end of a class: please, between now and the next class, think about cats.
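
As a rough illustration of the covariance and correlation ideas from the cat analogy, here is a minimal Python sketch. The lecture itself works in Excel, so this is only a stand-in, not the course's method. The jump distances are made-up numbers, the function names covariance and correlation are chosen just for this sketch, and it assumes the sample version of covariance (dividing by n - 1), which is comparable to what Excel's COVARIANCE.S computes; Excel's CORREL is the comparable function for correlation.

```python
from math import sqrt
from statistics import mean


def covariance(xs, ys):
    """Sample covariance: sum of products of deviations, divided by n - 1."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def correlation(xs, ys):
    """Covariance rescaled by the spread (standard deviation) of each variable."""
    return covariance(xs, ys) / sqrt(covariance(xs, xs) * covariance(ys, ys))


# Hypothetical jump distances in feet (positive = one way, negative = the other).
kitten_a = [0.5, -0.3, 0.8, -0.6, 0.2]
kitten_b = [0.4, -0.2, 0.9, -0.5, 0.1]   # tends to jump the same way as kitten_a
kitten_c = [-x for x in kitten_b]         # jumps the opposite way

# Tigers: the same pattern of movement, but every jump is 10 times bigger.
tiger_a = [10 * x for x in kitten_a]
tiger_b = [10 * x for x in kitten_b]

for label, a, b in [
    ("kittens, same direction", kitten_a, kitten_b),
    ("kittens, opposite direction", kitten_a, kitten_c),
    ("tigers, same direction", tiger_a, tiger_b),
]:
    print(f"{label}: covariance = {covariance(a, b):.3f}, "
          f"correlation = {correlation(a, b):.3f}")
```

With these made-up numbers, the tigers' covariance comes out 100 times the kittens' (each jump is 10 times bigger, and covariance multiplies the two scales together), while the correlation is the same for both pairs. That is the sense in which correlation adjusts for the size of the variables involved.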