Transcript for:
Understanding Probability Distributions(Lecture6 Distributions2)

This will be the second video on distributions. If you've not yet watched the first video on binomial distributions, please go back and do that before jumping into this video. Just a quick summary of our binomial distributions.

These are for determining the probabilities of two mutually exclusive outcomes, and we know that the shape of our histograms, which illustrate the distribution of our respective events, changes in relation to both the number of sampling events that occur and the probabilities of particular events occurring. Now the binomial distribution is just one of many probability distributions, and these follow the same rules as the probabilities that we're already aware of.

Probabilities need to be between 0 and 1, and they're always expressed as a fraction or a percent. One event cannot occur simultaneously with another event. And if we know the probability of one event, well then we can determine the probability of the other event. And the occurrence of one particular event during one trial or one sampling period doesn't influence the occurrence of another event in a different trial or a different sampling period.

We can use our binomial distributions to illustrate some of these key rules. We see that each one of these bars represents a different event, and we know that all these bars are going to be between 0 and 1, and the summation of all the bars equals 1. The same is true regardless of what the value of P is, whether it's higher or lower than 0.5. Now, in some cases, we might not know the value of P or Q before we go out and collect our data. This is pretty frequent when we think about science, in that we don't really know 100% what to expect.
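The rules described above for the binomial bars (each bar between 0 and 1, all bars summing to 1, for any value of P) can be checked numerically. The following is only a minimal sketch: the function name `binomial_pmf` and the choice of 10 trials are assumptions made for illustration and do not come from the lecture slides.

```python
from math import comb

def binomial_pmf(n, p):
    """Return [P(X = 0), ..., P(X = n)] for n trials with success probability p."""
    q = 1 - p  # probability of the other, mutually exclusive outcome
    return [comb(n, k) * p**k * q**(n - k) for k in range(n + 1)]

# n = 10 trials is an arbitrary choice for illustration (an assumption).
for p in (0.2, 0.5, 0.8):
    bars = binomial_pmf(10, p)
    every_bar_valid = all(0 <= bar <= 1 for bar in bars)
    # Print p, whether every bar lies between 0 and 1, and the sum of the bars.
    print(p, every_bar_valid, round(sum(bars), 10))
```

Changing the value of P shifts the shape of the histogram, but the two checks should hold every time.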

We may have some prior knowledge, but oftentimes, especially for new studies, we're not quite sure what to expect. But we can go out and we can collect data. Oftentimes we call this a pilot study. And then we can use this to make predictions about the probability of events occurring in the future.

So let's take an example here to illustrate this point. Let's say that NOAA is interested in making some adjustments to the red snapper fishery in the Gulf of Mexico. The current size limit is 40 centimeters in total length, meaning that fish smaller than this cannot be retained. They can't be kept. But NOAA is interested in increasing the amount of fish that are caught and kept in order to increase participation in the fishery.

Currently, with the size limit of 40 centimeters, 40% of the fish that are caught are retained. But they're interested in increasing this to 45%. They want to know what the new size limit should be in order to account for this change in the amount of fish that are retained when folks go out fishing.

So we can use our distribution here in order to answer this question. Now again, a couple of important things that we need to note: all of our probabilities are going to be between 0 and 1, and the summation of all our probabilities will equal 1. Another important fact that we need to know is the total number of animals that we've caught for a particular sampling distribution. So what we see here on the x-axis is the length of our fish, and on the y-axis is the total number of fish that are caught. So this is the total frequency histogram rather than the relative frequency histogram, but we can use this to determine the probabilities of each event that we're interested in.

Let's say that we're interested in the probability of catching a really, really large red snapper, one that's 70 centimeters or greater. Well, we can take the total number of red snapper in that size range that we've caught, and we can divide this by the total number of fish that were caught, and we can get the probability, which is quite low. We can also get the probability of catching a red snapper that's 55 centimeters in length.

We take the total number of fish of that size, 15, and divide it by the total number of events, the total number of fish, 325, and we get that probability. We can go through and we can do this for any particular size. Now for the question of interest, we're interested in knowing how we should change our size limit to accommodate a greater proportion of our total catch. So we know that the current size limit is 40 centimeters in total length, and so this would include fish in all size classes that are 40 centimeters and larger.

So in order to determine the probability, we would take the summation of each one of those size classes, and then we would divide it by the total number of fish that have been sampled. So in this case, the summation is 130. We would divide that by the total number of fish, 325, and we get, as we would expect, 40%, which as the word problem indicates, is the current proportion of fish that are retained because of the size limit. Now, what NOAA is interested in is increasing this probability, or the likelihood of keeping a fish, from 40% to 45%.
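The two calculations above, the probability of a 55-centimeter fish and the probability of a fish being retained at the current limit, are both a count divided by the total. Here is a minimal sketch using only the numbers stated in the lecture (15, 130, and 325); the helper name `relative_frequency` is an assumption for illustration.

```python
def relative_frequency(count, total):
    """Probability of an event, estimated as its count over all observations."""
    return count / total

TOTAL_FISH = 325  # total fish sampled, as stated in the lecture

p_55cm = relative_frequency(15, TOTAL_FISH)       # 15 fish in the 55 cm class
p_retained = relative_frequency(130, TOTAL_FISH)  # 130 fish at 40 cm or larger

print(f"P(55 cm fish) = {p_55cm:.3f}")
print(f"P(retained at 40 cm limit) = {p_retained:.2f}")
```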

So we need to add some fish that are smaller than 40 centimeters in order to do so. So we can do some simple mathematics and determine the total number of fish needed to meet this 45% expectation, which would be about 146 fish. Now, the next smallest size class, 35 centimeters, has a few more fish than we would like in order to meet that goal. It actually leads to 165 out of 325 total fish being retained if the new size limit was 35 centimeters instead of 40. This would lead to 51% of our fish being retained in the fishery, and maybe NOAA is a bit uncomfortable with this many fish because they don't think it's quite sustainable.
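The "simple mathematics" here can be sketched as follows, assuming only the figures given in the lecture: 325 fish in total, 130 retained at a 40-centimeter limit, and 165 retained at a 35-centimeter limit. The variable names are illustrative, not part of the original example.

```python
TOTAL_FISH = 325        # total fish sampled, as stated in the lecture
TARGET_FRACTION = 0.45  # the retention level NOAA is aiming for

# Fish retained under each candidate size limit (cm), as stated in the lecture.
retained_by_limit = {40: 130, 35: 165}

# Number of retained fish that would correspond to exactly 45% of the catch.
fish_needed = TARGET_FRACTION * TOTAL_FISH
print(f"Fish needed for 45%: {fish_needed:.2f}")

# Fraction of the catch retained under each candidate limit.
fraction_by_limit = {
    limit: kept / TOTAL_FISH for limit, kept in retained_by_limit.items()
}
for limit, fraction in fraction_by_limit.items():
    print(f"{limit} cm limit: {fraction:.1%} retained")
```

Because whole size classes are added at once, the retained fraction jumps from 40% past the 45% target when the limit drops to 35 centimeters.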

Now, the goal of this example is not to test you or to show you information on red snapper size limits, but it's to illustrate some of the utility that we can have when we have a particular distribution in order to not only determine what the probability of a particular event is, but make predictions about what we might see in the future. And this leads into the next part of this example. Now we know that there are a certain number of fish that are sampled during this particular event.

In this case, it's 325 fish. But maybe we're interested in knowing what the probability is of catching fish when we sample a thousand individuals rather than a total of 325 fish.

In this case we're interested in fish that are 30 centimeters in total length. How many of these fish will we catch when there are a thousand individuals sampled compared to 325? Well, we can do this by finding out what the probability is of catching a 30 centimeter fish. Similar to the previous example, we take the total number of 30 centimeter fish, divide it by the total number of fish sampled, and get our likelihood of catching a fish of this size.

In this case it's 13.8%. We then take that 13.8% and we multiply it by the total number of individuals that we're interested in sampling, in this case a thousand. And based on this particular probability distribution, we would say that there's a good likelihood that we're going to have about 138 fish that are 30 centimeters in total length when we sample a thousand individuals. Let's say that we're interested in fish that are less than 30 centimeters in length, because these are the smallest fish and maybe they need a bit more protection. We've still sampled 325 fish, and in this case we're interested in all the fish that are less than 30 centimeters in total length.

So we need to take the summation of their frequencies and divide that by the total number of fish that we've sampled. In this case, this leads to 35.4%. So we can then multiply this by the total number of fish that we're interested in sampling, and we see that we would expect to catch 354 fish that are less than 30 centimeters in total length, based on our current distribution, when we sample 1,000 individuals. Now again, we're not aiming to answer specific ecological questions here, but just getting an understanding of how to go about using some of these distributions to answer different science questions. So in order to determine how many individuals will be in a particular size class, or just how many times different events will occur based on a probability distribution, we simply take the number of samples or the number of observations that are occurring and we multiply that by the probability that a particular event is going to occur, and this will give us the expected number of times that particular event will occur.
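The rule above (expected count equals the number of observations times the probability of the event) can be sketched in a few lines. Note the assumption: the lecture gives only the percentages 13.8% and 35.4%, so the counts of 45 and 115 fish below are back-calculated from those percentages and the 325-fish total, and the function name `expected_count` is hypothetical.

```python
def expected_count(event_count, sampled, new_sample_size):
    """Expected occurrences of an event in a new sample of a given size."""
    probability = event_count / sampled  # relative frequency in the original sample
    return new_sample_size * probability

SAMPLED = 325  # total fish sampled, as stated in the lecture

# Assumption: counts back-calculated from the stated 13.8% and 35.4%.
COUNT_30CM = 45
COUNT_UNDER_30CM = 115

print(round(expected_count(COUNT_30CM, SAMPLED, 1000)))        # 30 cm fish
print(round(expected_count(COUNT_UNDER_30CM, SAMPLED, 1000)))  # fish under 30 cm
```

These are expected values, so an actual sample of 1,000 fish would be unlikely to match them exactly.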

Now for more information I would say check out section 5.7.3 of the text and also check out practice problems 5.11 through 5.18. We will spend some time in class working on this but you should feel pretty comfortable coming into the lecture. Finally, we can also calculate the mean of our probability distribution using the data from our histogram and the probabilities of these events occurring. So in order to do so, we can list out our particular events. So in this case it's the size of our fish and the frequency that these events are occurring.

This should look familiar; this is a frequency distribution table. We then determine the probability of each of these events occurring, as we've been doing. So we simply take the frequency at which a particular event is occurring and divide it by the total number of observations. In this case again we've got 325 fish, so we take the frequency of each one of our respective events and divide it by 325, and we can then determine what the probabilities are for each of these.

Now in order to determine the mean, we take the value for each one of those events, in this case the size of our fish, multiply it by the probability of that particular event occurring, and then take the summation of those products. So we can go through and multiply the x column times the probability of x column to get each product, and then take the summation of these, and this will give us the mean of our particular distribution. In this case the mean size of our fish based on this distribution is a little bit more than 38 centimeters in total length.
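The mean calculation described above (multiply each value by its probability, then sum) can be sketched as below. The full frequency table from the slide is not reproduced in this transcript, so the lengths and counts here are made-up toy values used only to show the steps; they are not the red snapper data and will not give the 38-centimeter result. The function name `distribution_mean` is also an assumption.

```python
def distribution_mean(values, frequencies):
    """Mean of a distribution: the sum of each value times its probability."""
    total = sum(frequencies)
    # Probability of each event is its frequency over the total observations.
    probabilities = [freq / total for freq in frequencies]
    # Multiply the x column by the P(x) column, then take the summation.
    return sum(x * p for x, p in zip(values, probabilities))

# Toy table (hypothetical values, NOT the lecture's histogram).
lengths_cm = [20, 30, 40, 50]
counts = [2, 3, 4, 1]

print(distribution_mean(lengths_cm, counts))
```

With the real table, `lengths_cm` would hold the size classes and `counts` the frequencies read off the histogram.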

Again, we're moving through this a little bit quickly, but we'll cover this a bit in class and you can go back through the example and practice this on your own. In the next lecture we'll discuss the normal distribution, which again is another type of distribution with some special attributes.