Hello, and welcome to this lesson of Mastering Statistics. Here we're going to start to dive into the essential core concepts of statistics that you'll be using for as long as you study statistics. We're going to talk about random variables and discrete probability distributions. You read that title and it looks intimidating: it's not a variable, it's a random variable, and it's not just a probability, it's a discrete probability distribution. These words can scare the life out of a lot of students, and a lot of books make it super hard to understand what these things really mean. You look at equations and formulas and lots of drawings, and then you can just go nuts. I'm going to make it very, very easy for you to understand. Just take a deep breath and realize that this is very simple. By the end of this lesson you'll know exactly what a probability distribution is, what makes it discrete, what a random variable is, and how it is different from a regular variable in algebra.

So we need to go through some definitions. We're going to take a little bit of time with this, so let me write them down, and then I promise I will make sure that everything is absolutely clear by examples.

So here we have the idea of a random variable. You know that in algebra a variable is just something that can change or take on different values; the word variable means that it can change. But in statistics, a random variable is a variable whose value is determined by a random experiment. And when we say experiment in the context of statistics, I'm not talking about beakers and test tubes. I'm talking about when you do something. An experiment might be drawing two cards from a deck of cards. What are the chances that I'm going to get two aces, or two fours, or whatever? There's a way to calculate that. Now, I can repeat that experiment with new decks, or maybe I shuffle the cards back into the deck to make a complete deck, and I can do the experiment again. Well, did I get two fours
that time? No. Okay, I can put them back in and do it again. If I do the experiment enough, with a deck of cards where I draw two cards at a time, eventually I'm going to get two fours. It may not be on the first try, it may not be on the tenth try, it may take me fifty-five tries or 200 tries, but eventually it will happen.

So when I say random variable, it doesn't really mean that it's randomized like a random number generator. It just means that the outcome of the experiment is going to be assigned to a variable. The variable in this case could be: what do I get when I pull two cards from the deck? The value of that variable, quote-unquote, or the outcome of that variable, is determined by a random experiment or some kind of statistical process. It could be flipping coins, it could be pulling cards from a deck, it could be observing a population and asking a survey question. It could be lots of different things. A random variable is just the outcome of an experiment, and you do the experiment a bunch of different times and you get a bunch of different outcomes. So the random part just means that you're doing it over and over again and you don't really know what's going to happen ahead of time.

That's unlike calculus or algebra. If you have a simple algebra equation, you can solve that equation and you get an answer. The answer might be two, right? You solve for the variable and that's the answer. But in statistics the variable really depends on the experiment, and if I do the experiment four times I might get different answers. Now, what I'm interested in in statistics is what happens if I do the experiment a thousand times. Can I draw a conclusion? Is the random variable more likely to take on some values than others if I do this experiment a lot of times? But anyway, the concept of the random variable is just: I do an
experiment, and then I do it again, and then I do it again, and then I do it again. Each time I get a result. Usually I'm looking to see the specific outcome of something, and we're calling that a random variable because it might change each time I do the experiment. I'll give you some more concrete examples in a minute.

Now let me show you the idea of a discrete probability distribution. For probability I'll just write "prob" like that, and for distribution I'll write "dist" like that. So this is discrete probability distribution. It is a table or a formula that lists the probabilities for each outcome of the random variable X.

Now notice also that we're calling the random variable X. In statistics it's usually a capital X. In algebra you have a little x, or you could use y or z. But in statistics, when it's a random variable, which just means it's an outcome of an experiment (that's all that a random variable is), you use a capital X. No cursive x's or anything like that; that's for algebra and calculus. In statistics we use a capital X, and that means a random variable.

Don't worry about the word discrete; I will explain that in a minute. Focus on the probability distribution: it's a table or a formula that lists the probabilities for each outcome of the random variable X. As we talked about with the idea of shuffling a deck of cards, we're going to get different outcomes when we draw two cards. Sometimes we might get two totally unrelated cards. Sometimes we might get a pair, which means the two cards have the same value. Now, there may be different probabilities associated with all those different outcomes, and the distribution of what those probabilities are is what we're talking about. It's called the probability distribution. Most of the time, for a discrete probability distribution, it's represented as a table. Again, I'll talk about the idea of discrete in a second.

So, for a concrete example: I hate just
you know, putting something on the screen forever like this. What I want to do now is give you a real example and show you what a probability distribution is. It's going to be much more concrete than a bunch of words like this.

So, for an example of a discrete probability distribution, let's flip three coins. We're not just flipping one quarter, we're flipping three quarters at the same time. So we go flip, flip, flip, and we observe the results each time. Then we will let the random variable X be the number of heads showing. That's our experiment. Then we're going to say: here is the discrete probability distribution of this experiment.

I'll make sure you understand what's happening. I've got three quarters in my hand. This is the easiest way to do this: I throw them up, they all go up at the same time, and they all land on the ground. Sometimes I'm going to get all heads; maybe I'm really lucky and I get all heads. Sometimes I'm going to get all tails. But most of the time it's going to be a mixture of those two. I'm going to get some heads and some tails. Sometimes I'll get one head and two tails; sometimes I will get one tail and two heads. What we're trying to do is figure out all the different outcomes of this experiment when I have three coins, and what the likelihood is of getting any of these different outcomes, the likelihood being the probability. We did a little probability refresher in the previous section.

The way to do this, a lot of times, especially in the beginning, is to draw a picture. You can't do this with a lot of problems, but you can do it in the beginning to solidify the idea. When I do this experiment, I already said that sometimes, if I'm really lucky, I'm going to get three heads in a row. So let me write up here that this is three heads. Because notice the experiment says I flipped three coins at the same time, and let the random variable X be the number of
heads, the number of heads showing. The random variable that I care about, which means the outcome of the experiment that I care about, is just how many heads are showing. That's what I care about.

So in this case I write down that if the coins fall this way, I get three heads. But of course I could have head, head, tail showing, in which case I only get two heads. I could also have head, tail, head, which will also be two heads showing. Then I can have head, tail, tail, in which I only have one head showing. Then I can have tail, head, head, and I have again two heads showing. Then I can have tail, head, tail, which will only give me one head. Then I can have tail, tail, head, which is one head. And then I can have tail, tail, tail, which is the other way I could get really lucky, let's say, with three tails in a row, and I get zero heads.

Effectively this is a raw representation of the outcome of the experiment. Remember, the random variable I care about is how many heads. I have outlined every outcome that can possibly happen; no other outcomes can happen other than what I've written on the board. There's one, two, three, four, five, six, seven, eight possible outcomes, and for every possible outcome I have figured out what my random variable value would be, which means the outcome of the experiment for whatever I care about.

Now notice that there's only one way in which I can get three heads. Only one way that that can happen. There's also only one way that I can get three tails. So if you're interested in the number of heads, you can tell just by looking at this that the odds of getting three heads are lower than getting one or two heads, because there's only one combination on the board that gives me three heads. But if you look at the number two, I get two heads here, two heads here, and two heads here, so it's a little more likely to get two heads. Also, I get one head here, one head here, and one
head here. That's also pretty likely, because there are more ways in which that can happen. And then getting zero heads is also pretty unlikely, because there's only one way in which that can happen. So you can see that by flipping these three coins, if I'm interested in the number of heads, you can tell just by looking at this that it's going to be more likely to get a mixture of heads and tails, where I have two of one and one of the other.

So this is sort of the raw information. But when we write the probability distribution, the definition says it's a table or a formula that lists the probabilities. I haven't listed any probabilities here, because I just listed the raw outcomes. If I want to write down the actual probabilities, then I might do something like this. I'll write x, where x means the outcome of the experiment; notice this is a lowercase, cursive x. The outcome of the experiment can be 0 heads, 1 head, 2 heads, or 3 heads, and I'll just remind you over here by writing "heads." So I'm going to make a little table here, just like this.

Now, those are the possible outcomes of the experiment. So what would be the probability of my random variable, capital X, equaling lowercase x? This is how you'll often see it written in a book. Basically you're asking: what's the probability that I'm going to get a random variable equal to zero, a random variable equal to one, a random variable equal to two, and a random variable equal to three? We've written all this stuff down on the board, but we didn't write it in terms of probability.

Remember, I said in terms of basic probability that the probability of an outcome is the number of ways in which I can get what I want divided by the number of possible outcomes. When you're flipping one coin, there's only one head and there are two possible outcomes. Here there are eight possible outcomes. So if I want the probability of zero heads, there's only one way in which that can happen, divided by eight possible outcomes. So the probability of getting zero heads is 1/8, and I can convert 1/8 to a decimal
just like, you know, 0.5 or 0.3 or whatever, and I will get that.

What would be the probability of getting 1 head? Well, I get 1 head here, 1 head here, and 1 head here, so that's three ways in which I can get what I'm looking for, which is 1 head, out of eight possible outcomes. So that's 3/8. Again, that's a number less than 1; it's a decimal. What's the probability of getting two heads? I get two heads here, two heads here, and two heads here. Again there are three possible ways to get that, and there are 8 possible ways in which my experiment can end, so that's 3/8. And then finally, what's the probability of getting three heads? I only have one way in which that can happen, and there are eight possible outcomes, so that's 1/8.

This is what we call a probability distribution, and furthermore it's called a discrete probability distribution. I'll explain what that means in a minute. But basically, on your exam or on your quiz, if the teacher gives you a problem like this and says "create a probability distribution," this is what you're trying to write: some sort of table like this, where you list all the different outcomes and the probabilities associated with them.

You could sort of see it from the raw outcomes, but you can look at this table at a glance and really powerfully see that the probabilities of getting one head and two heads are equal, even-steven, at 3/8. The probabilities of getting zero heads and three heads are also equal, which kind of makes intuitive sense as well, but it's a lower probability than that one. So I can look at this as a table, look at all the different possible outcomes, and see the probability of each. Of course, some of them are the same number, but basically there's a probability in each column. So we're looking at the distribution, or the spread, of the probabilities among all the possible outcomes of this
experiment.

Now, what makes it discrete? What does the word discrete mean? I'm not talking about when you tiptoe into the room and you're very discreet about it. I'm talking about it in terms of math: discrete means that something can only take on certain separate values, values you can list out one by one. Those are discrete values. For instance, when you think about integers, those are discrete numbers. One, two, three, four: those are integers. Zero, negative one, negative two: those are also integers. Every whole number like that, negative or positive, we call an integer, and those are discrete numbers.

But when you start dealing with real numbers in everyday life, you can have lots of numbers in between zero and one. In fact, there are infinitely many numbers just between zero and one, and infinitely many just between 3 and 4, because you could have 3.456789 or whatever. You can have infinitely many decimal places to split up everything between a pair of integers on the number line. That would be called continuous. With continuous values, you can take on values that really can have any number of decimal places.

But this is not continuous, because the outcome of the experiment can only have certain values. You can only get zero or one or two or three heads. You can't get two and a half heads or 3.25 heads in this experiment. You just physically can't: you're throwing coins and you look at the results. Because of that, you're only going to have one, two, three, four columns, and so you can only have four probabilities. You can't have anything more or less than that. So this is called a discrete probability distribution. When you have values where you can write everything down in a table and capture it all, and that's all you can get, that's called discrete.

Let me jump ahead just a little bit and tell you that most of the time in
statistics we're not going to be dealing with discrete probability distributions. We're doing this mostly to show you the idea behind what a distribution is. But in real life, what if I wanted to look at the probability of someone having a certain height? What if I took a survey, looked at everybody in the world and their height, and tried to figure out the probability distribution of getting different heights of people? Well, some people are going to be 4 feet tall; that's real short. Some people are going to be 7 feet tall; that's real tall. Some people may be even taller or shorter than that, and some people may be in the middle, more like 5 or 6 feet tall. But there are always going to be people who are not exactly 6 feet tall. They may be 6.125 feet tall. There are always going to be people who are 5.795 feet tall.

In other words, if you try to figure out the distribution of world heights, there are going to be basically infinitely many values in between, because height is not discrete. Height is continuous. You might have somebody who's four feet, five feet, 4.5 feet, 4.75 feet, 4.8 feet. There are many, many values that height can take. So when you look at the real world and you're trying to take surveys of people and do problems with the probability of certain height distributions in the country, or the length of people's fingers, there's going to be an infinite number of answers there, and so it's not discrete. That's the bottom line a lot of the time when you're talking about real-world problems.

So as we go forward, we're going to continue to talk about the probability distribution concept, but the distributions won't be discrete; they'll be what we call continuous, and we'll get to that a little bit later. For now, just remember this is a discrete probability distribution.

There's one thing I want to point out to you
that's extremely important. Here we said that the probability of our random variable having zero heads is 1/8, the probability of getting one head is 3/8, the probability of two heads is 3/8, and the probability of 3 heads is again 1/8. Let me show you something. What is 1/8 plus 3/8 plus 3/8 plus 1/8? Well, you have to bust out your fraction math. Notice that all of the denominators are 8, so that's convenient. When we add fractions with the same denominator, all we do is put 8 on the bottom and add the numerators: 1 plus 3 is 4, 4 plus 3 is 7, 7 plus 1 is 8. So we get 8 over 8, and 8 divided by 8 gives you 1.

So what I'm trying to say, and this is extremely important, is that the sum of all of the probabilities in our distribution equals 1. Let me say that again: the sum of all of the probabilities in our distribution equals 1. That's very, very important. It means that if I do this experiment, I'm always going to get one of these results. If I'm looking at the probability of this happening, or this happening, or this happening, or this happening, and I add them all together, I'm guaranteed to cover every possible outcome, because these are all the outcomes that can happen. So when I throw 3 coins and I get a result, I'm guaranteed to get one of these, because the sum of the probabilities of all the outcomes is 1, which means a 100 percent chance of happening, like a 100 percent chance of rain.

That's going to be true of all probability distributions. I should be able to look at the probabilities and sum them all up, and they should always equal one, because I'm trying to cover all possible outcomes of the experiment, which guarantees that one of them will happen.

All right, let's do another one real quick. We're not going to go into quite as much detail, but I think it's a very good problem, so I want to do it for you. What would be the probability distribution for the random variable X (that's capital X, notice), which would be the sum of two rolled dice, right? Two rolled
dice. When I'm talking about dice, I'm saying one die has six faces, so we can get one, two, three, four, five, or six off of one die. Then I roll a second die, and I can get one, two, three, four, five, or six off of the second die. So I roll a pair like that and I add the two together; that's the sum of the two rolled dice. Sometimes I roll two ones and I'll get an answer of two. Sometimes I roll two sixes and I get an answer of 12. But much more commonly I'll get other combinations, and I'll get some other sum. I'm trying to figure out the probability distribution of getting different sums whenever I roll the dice.

If I wanted to do that, I would create a probability distribution. I'll try to squeeze it in here; it's a lot of answers. Basically, I want the probability of my random variable capital X being equal to each of the possible values. What are the possible values that it can take? Well, if I roll a pair of dice, I can never get a 1 as a sum, because I'm always going to have two dice and I'm always going to have at least a one on each. So you can never get a sum of one, but you can get a sum of two. I can also get a sum of three, and I can get sums of 4, 5, 6, 7, 8, 9, 10, 11, and 12. The maximum sum I can get is 12, if I roll 6 and 6. So whatever combination I get when I roll 2 dice, the sum is really going to be one of these answers. I can never get a 1, and I can never get a 13 or higher.

So let me draw this like this and make it into a little table, because remember, our discrete probability distribution should look like a table almost all the time.

Let's go and look at this. We're not going to work out the answer for every one, but let's look at a couple. Let's look at the number two. How can I get the number two? The only way in which I can get a two is to get a one on one die plus a one on the other die. I can get a one plus one. If this ended up
being 1 plus 2, then it wouldn't add to 2, right? So the only way in which you can get 2 is 1 and 1. As far as combinations go, there's only one way in which I can get this to add up, and there are 36 different combinations of the ways in which the dice can roll. The reason there are 36 of them is that there are 6 faces on one die and 6 faces on the other die, and 6 times 6 means there are 36 ways in which the two dice can land. Out of 36 different possible outcomes, only one will yield a sum of 2, so that's 1/36.

Similarly, look way over here at the number 12. The only way I can get that is 6 plus 6. There's no other way I can add to 12, because the dice only go up to 6. So there's only one possible outcome here to give me a 12, but there are 36 possible ways in which the dice can be rolled, so that's 1/36 as well.

And then we'll just do one last one. If you look at the number 3, how can I get the number 3? Well, I can get 1 plus 2, or I can get 2 plus 1. Those are the only ways in which I can roll this pair of dice to get a sum of 3. I can get a 1 on die A and a 2 on die B, or a 2 on die A and a 1 on die B. So there are only two possible ways I can sum to 3, again out of 36, so that's 2/36.

So you can see what you end up having to do through here. I would have to go look at the number 4 and figure out how I can get two dice to sum to the number four, and you'll find that there are only three ways in which to sum to the number four, out of 36. In fact, I can just show you real quick. How can we get to the number four? Well, we can do one plus three, we can do three plus one, or we can do two plus two. Those are the only ways in which I can add to the number four. So there are three outcomes that yield a sum of four, and there are 36 possible outcomes, so that's 3/36.

So I'll just fill in the rest of the chart. For the number five, there are four ways in which that can
happen out of 36 outcomes. For the number six, there are five ways in which it can happen. For the number seven, there are six ways in which it can happen. For the number eight, there are five ways; for the number nine, there are four ways; for the number ten, there are three ways; and for the number eleven, there are two ways in which it can happen. That one's pretty easy to understand: the only way I can add to eleven is 5 plus 6 or 6 plus 5. There are only two ways in which that can happen, so there are two outcomes.

Notice the denominator is always 36, because we're doing probability. We're asking how many ways my outcome can happen, divided by how many possible outcomes there are, and there are always 36 possible outcomes.

So now I can look at this: this is a discrete probability distribution. This would be the answer. The reason it's discrete is that my experiment can only end by getting one of these answers. It can only sum to 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, or 12. You can't sum to anything else, and you can never have values between 7 and 8 or between 10 and 11 as valid outcomes. You just can't, because this is a discrete experiment with a discrete outcome, as opposed to continuous outcomes like we discussed earlier with heights of people and things like that. We'll get to those a little bit later.

So this would be the answer. And again, as I showed you over here when we talked about the coin probability distribution, I said the sum of the probabilities always has to equal 1. Well, the sum of these should equal 1 as well. If you think about it, all I would have to do is add all of these together. The sum is going to be equal to something over 36, because 36 is the common denominator. So then I have to add one plus two plus three plus four plus five plus six plus five plus four plus three plus two plus one. I promise you, if you add that together you will get 36, and 36 over 36 equals one. That means the sum of
the probabilities of all of the outcomes of the experiment has to give you a hundred percent chance of happening. If I do this experiment, I'm covered by every single possible outcome, because there are no other outcomes. So if I add them all together, I should be covered by all of them.

There's one final thing I'll show you here. We're getting the probability of each possible outcome of our random variable. We're also showing that when we sum all of these together we're covered, because all of those probabilities sum to one, or a hundred percent. The other thing, and we'll get to this a little bit later, is a question like: what would be the probability that the sum of the dice that are rolled is less than or equal to three?

Normally, when I'm asking you a question on a table like this, I'm saying: hey, what's the probability of getting a 7 when you sum them together? What's the probability of getting a 10? What's the probability of getting a 12? But I can also ask things like: what's the probability of getting a sum less than or equal to 9, or a sum less than or equal to 5?

Let's say I wanted to ask you: what's the probability of getting a sum less than or equal to 3? That would mean it would be okay if I get a 3 as a sum or a 2 as a sum. So the probability of getting less than or equal to 3 would be the probability of a 2 plus the probability of a 3, which is 1/36 plus 2/36, or 3 over 36. I would just add the probabilities together. And if I wanted to ask for the probability of getting a sum of 6 or lower, then it would be from here all the way down; I would just add all of those probabilities together.

So looking at a table like this, you can look at individual outcomes and see the probability of each. You can also quickly calculate the probability of getting less than something or greater than something, just by summing the probabilities of the outcomes together.
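If you'd like to check the dice table on a computer, here is a minimal sketch in Python. It assumes two fair six-sided dice, so that all 36 rolls are equally likely; the function names `two_dice_sum_distribution` and `prob_at_most` are made up for this illustration and are not from the lesson.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Assumption: two fair six-sided dice, so all 36 (die A, die B) pairs
# are equally likely.
FACES = range(1, 7)


def two_dice_sum_distribution():
    """Return {sum: probability} for the sum of two rolled dice."""
    outcomes = list(product(FACES, repeat=2))     # all 36 possible rolls
    ways = Counter(a + b for a, b in outcomes)    # ways to reach each sum
    return {s: Fraction(ways[s], len(outcomes)) for s in sorted(ways)}


def prob_at_most(dist, k):
    """P(X <= k): add the probabilities of every outcome that is k or less."""
    return sum((p for x, p in dist.items() if x <= k), Fraction(0))


dist = two_dice_sum_distribution()
for s, p in dist.items():
    # Fraction reduces automatically, so 2/36 is printed as 1/18.
    print(f"P(X = {s}) = {p}")

print("Sum of all probabilities:", sum(dist.values()))
print("P(X <= 3) =", prob_at_most(dist, 3))
```

The last two lines check the two facts from the table: the probabilities add up to 1, and the probability of a sum of 3 or less comes out as 1/12, which is just 3/36 in reduced form. Using `Fraction` instead of decimals is a choice made here so the numbers match the fractions on the board exactly.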
We're going to do a little bit more of that later. This has turned into a lengthy lesson, so I'm going to stop it here. We've covered everything we wanted to cover, and it's so important because the rest of the course is really going to be built upon working with probability distributions and understanding different kinds of distributions. We'll see the details involved in that as we get a little bit farther. But basically, you have to understand backwards and forwards what a probability distribution is, and I hope I've explained that here with a couple of examples. This knowledge, and the mental image that you've built of what a distribution of probabilities is, will carry on as we do all of the different kinds of problems going forward. So engrain this in your head and make sure you understand these concepts. Follow me on to the next lesson, where we will continue working with statistics and distributions and using those concepts to calculate useful quantities in statistics.
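As a way to review the three-coin example from this lesson, here is a small Python sketch that lists all eight outcomes and builds the table of probabilities. It assumes fair coins, so every outcome is equally likely; the function name `heads_distribution` is made up for this illustration.

```python
from fractions import Fraction
from itertools import product


def heads_distribution(num_coins=3):
    """Return {number of heads: probability} for num_coins flipped coins.

    Assumption: fair coins, so each sequence of heads and tails is
    equally likely.
    """
    outcomes = list(product("HT", repeat=num_coins))  # e.g. ('H', 'H', 'T')
    dist = {}
    for outcome in outcomes:
        heads = outcome.count("H")                    # value of X for this outcome
        dist[heads] = dist.get(heads, 0) + Fraction(1, len(outcomes))
    return dict(sorted(dist.items()))


coin_dist = heads_distribution()
for x, p in coin_dist.items():
    print(f"P(X = {x}) = {p}")

print("Sum of all probabilities:", sum(coin_dist.values()))
```

Run as written, this reproduces the table from the board: 1/8, 3/8, 3/8, 1/8 for zero, one, two, and three heads, and the probabilities sum to 1.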