Transcript for:
CDFs and PDFs in Distributions

In this video we're going to talk about cumulative distribution functions. The cumulative distribution function, or CDF, is used to calculate the area under the curve, specifically the area to the left of some point of interest. In other words, it gives us the accumulated probability up to that point. Keep in mind that whenever you have a continuous probability distribution, the probability is equal to the area under the curve, and since the maximum probability is 1, the total area under the curve will be 1 as well.

Now, probability density functions are different from cumulative distribution functions. The PDF, or probability density function, is f(x), and it tells you the shape of the distribution. For instance, let's say we have a uniform distribution. In this case f(x) is a constant value: f(x) = 1/(b - a). It's a number; it doesn't have the variable x in it. The graph is a flat line at that height, and x runs from a to b. So this right here is the PDF, the probability density function, for a uniform distribution.

Now the CDF, the cumulative distribution function, will give us the area under the curve to the left. Let's say the point of interest is x. The CDF will give us the area to the left of x, the region shaded in blue. This region is a rectangle, and the area of a rectangle is the base times the height. Notice that the base is the difference between x and a, so it's x - a. The height is f(x), which is 1/(b - a). So we have this formula: the probability that the random variable capital X is less than or equal to some number x is (x - a)/(b - a), for x between a and b. So this function right here, highlighted in blue, is the CDF, that is, the cumulative
distribution function for a uniform distribution. It gives us the area to the left of some value x. The PDF is f(x), and for a uniform distribution that's 1/(b - a).

Now let's talk about the exponential distribution and discuss the CDF and the PDF for this type of distribution. For the exponential distribution, the maximum value is the y-intercept, and the curve decreases from there as x increases. That maximum value is lambda, the rate parameter. The probability density function for an exponential distribution is the function f(x), which describes the shape of the distribution, and for this particular distribution f(x) = lambda e^(-lambda x). Lambda is the rate parameter; it's 1 divided by the mean. So this is the probability density function. It tells you the shape of the graph, and in addition it gives you the height of the curve above the x-axis at some point x.

Now the CDF gives us the area under the curve to the left of some value x. That area is the probability that X is less than or equal to x, and it's equal to 1 - e^(-lambda x). So that is the cumulative distribution function; it gives us the area under the curve to the left. If we wish to calculate the probability that X is between 0 and x, or let's say between 0 and a, this function will give us that probability, which is equal to the area under the curve.

Now what if we wish to calculate the area to the right as opposed to the area to the left? We know that the total area is 1, so the area to the right plus the area to the left must also be 1. So the area to the right is 1 minus the area to the left, and we know the area to the left is what we see here: 1 - e^(-lambda x). That gives us 1 - (1 - e^(-lambda x)). Now if we distribute the negative sign, it's going to be 1 minus 1, and then these
two negative signs will cancel, giving us a plus sign, so plus e^(-lambda x). Since 1 minus 1 is 0, the area to the right is simply e^(-lambda x). Now this is not the cumulative distribution function, because it doesn't give us the area to the left; it gives us the area to the right. The CDF for an exponential distribution is still 1 - e^(-lambda x).

Now let's say we still have an exponential distribution, but we want to calculate the area between a and b. We want to find the probability that X is somewhere between a and b. How can we calculate the area under the curve between those two points? This area, highlighted in blue, is equal to the probability that X is between a and b, and that is the probability that X is less than b minus the probability that X is less than a. These are both areas to the left, so we can use the CDF to calculate each of them. The probability that X is less than a is the area shaded in red, and the probability that X is less than b is the area shaded in green. If you take the area shaded in green and subtract the area shaded in red, you get the area shaded in blue.

Using the CDF, the area to the left of b is 1 - e^(-lambda b); instead of x, we replace x with b. The area to the left of a is 1 - e^(-lambda a). So the probability that X is between a and b is (1 - e^(-lambda b)) - (1 - e^(-lambda a)), which simplifies to e^(-lambda a) - e^(-lambda b). This is how we can calculate the probability between a and b using the CDF when we have an exponential distribution.

Another thing that you want to keep in mind is that for a continuous probability distribution, the probability that X is greater than or equal to a and less than or equal to b is the same as the probability that X is strictly between a and b. Including or excluding the endpoints doesn't change the answer, so there's no difference here.
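
To make the exponential formulas above concrete, here is a minimal Python sketch. The function names and the numbers (a rate of 0.5, with a = 1 and b = 3) are made up for illustration and are not from the video; it only assumes the formulas as stated: the PDF is lambda e^(-lambda x), the CDF is 1 - e^(-lambda x), and x is not negative.

```python
import math

def exp_pdf(x, lam):
    """Height of the exponential curve at x (taken as 0 for negative x)."""
    return lam * math.exp(-lam * x) if x >= 0 else 0.0

def exp_cdf(x, lam):
    """Area to the left of x: P(X <= x) = 1 - e^(-lam * x)."""
    return 1.0 - math.exp(-lam * x) if x >= 0 else 0.0

def exp_right_tail(x, lam):
    """Area to the right of x: 1 minus the area to the left."""
    return 1.0 - exp_cdf(x, lam)

def exp_between(a, b, lam):
    """Area between a and b: CDF at b minus CDF at a."""
    return exp_cdf(b, lam) - exp_cdf(a, lam)

# Hypothetical example values: rate 0.5, so the mean is 1 / 0.5 = 2.
lam = 0.5
a, b = 1.0, 3.0

print("height at x = 0:", exp_pdf(0.0, lam))        # equals lam, the maximum
print("area left of b:", exp_cdf(b, lam))
print("area right of b:", exp_right_tail(b, lam))   # matches e^(-lam * b)
print("area between a and b:", exp_between(a, b, lam))
```

The area between a and b printed here should agree with e^(-lambda a) - e^(-lambda b), which is just the two CDF values subtracted.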
Another thing that you want to keep in mind when dealing with continuous probability distributions is that the probability that X is a single value, as opposed to an interval of values, is 0. Let's say we want the probability that X is equal to a. That's going to be 0. The only way to get a value greater than 0 is to have a range of x values, because at x = a all you have is a line. A line has height but no width, so it has no area, and the probability that X is exactly a is 0. Just keep that in mind.

To review: remember, the PDF, the probability density function, is f(x). It tells you the shape of the graph, whether it's an exponential distribution, a normal distribution, or a uniform distribution. The CDF, the cumulative distribution function, gives you the area to the left of some value. It can give you the area to the left of a or to the left of b, and that area under the curve is the accumulated probability up to that point. So that's the difference between the PDF and the CDF. Thanks for watching.
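
As a quick check on the review, here is a small Python sketch of the uniform distribution. The function names and the values a = 2, b = 10, x = 4 are hypothetical, chosen only for illustration. It assumes the PDF is 1/(b - a) between a and b, and it also assumes the CDF is 0 to the left of a and 1 to the right of b, which the video does not spell out. The loop at the end shrinks an interval around x to show why the probability of a single value is 0.

```python
def uniform_pdf(x, a, b):
    """Constant height 1 / (b - a) between a and b, 0 elsewhere."""
    return 1.0 / (b - a) if a <= x <= b else 0.0

def uniform_cdf(x, a, b):
    """Area to the left of x: base (x - a) times height 1 / (b - a)."""
    if x <= a:
        return 0.0  # assumption: no area has accumulated before a
    if x >= b:
        return 1.0  # assumption: all of the area lies to the left of b
    return (x - a) / (b - a)

# Hypothetical example values.
a, b = 2.0, 10.0
x = 4.0

print(uniform_pdf(x, a, b))  # 0.125, the height of the rectangle
print(uniform_cdf(x, a, b))  # 0.25, the area to the left of x

# The area over an interval starting at x shrinks toward 0 as the width shrinks.
for width in (1.0, 0.1, 0.001, 0.0):
    area = uniform_cdf(x + width, a, b) - uniform_cdf(x, a, b)
    print("width", width, "area", area)
```

With a width of exactly 0 the two CDF values are identical, so the area, and therefore the probability, is 0.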