Continuing with chapter four, in this video we're going to talk about covariance. So we're down here. Covariance is a measure of the linear association between two random variables.
This word linear is really, really important to the concept. We'll talk about that a little bit more in a second. So like we had with variance, you have the formal equations for discrete and continuous.
And then you have what I would consider the easier equation down here. For covariance, this equation is going to be significantly easier. So we're going to center our work around that.
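For reference, the equations on the slide most likely look something like the following. This is a sketch in standard textbook notation, and it assumes the joint distribution is written f(x, y), which the video does not spell out.

```latex
% Formal definition, discrete case
\sigma_{XY} = E[(X - \mu_X)(Y - \mu_Y)]
            = \sum_{x} \sum_{y} (x - \mu_X)(y - \mu_Y)\, f(x, y)

% Formal definition, continuous case
\sigma_{XY} = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty}
              (x - \mu_X)(y - \mu_Y)\, f(x, y)\, dx\, dy

% The easier (shortcut) equation
\sigma_{XY} = E(XY) - \mu_X \mu_Y
```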
So we're going to jump into an example and then we'll talk more about the theory behind covariance once we kind of understand what it looks like. So we have this table of data looking at the number of blue refills and the number of red refills for a ballpoint pen. We've used this data before. Back in the expected value video, we actually did the expected value of xy for this data and found that that was equal to 3 14ths.
And now it wants us to find the covariance. This is from our expected value video. If you need to revisit that, please do.
So now it wants us to find the covariance of x and y. So our covariance formula, the simple version is sigma x y. And notice that for covariance, this is not sigma squared. That's because we have sigma x y down here.
So the x times y is kind of giving us that squared. But this is how you write covariance. So the covariance is equal to the expected value of x times y minus mu x times mu y.
So we have our expected value of x y right here from our expected value video where we calculated that already. Now we just have to figure out our mu x and mu y. So mu x is going to be equal to the sum over x from 0 to 2, because that's what our x values range from up here.
And inside that sum we're going to take x times g of x. So we are using our marginal distribution of x, multiplied by x. So we're going to be taking 0 times this 5 14ths, and 1 times 15 28ths, and 2 times 3 28ths to find our expected value for x. So I'm going to take a second, pause the video and write that out. Same with the mu of y.
I'm going to go ahead and write that out instead of talking through it. And I'll be right back. All right.
And we're back. Here we have our mean of x equal to three fourths. And then I did the same with our y, took y times our marginal distribution for y, h of y, and got that our mean for y is one half.
So for y, I took these values and multiplied by these values over here. So now we have everything for our formula. So we're just going to go ahead and plug it in. Sigma xy is equal to 3 14ths minus three fourths times one half, which is equal to negative 9 56ths. And that's it.
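If you want to check the arithmetic, here is a minimal sketch in Python using exact fractions. It takes the marginal g of x values read off the table, and it simply reuses the expected value of xy and the mean of y quoted in the video rather than recomputing them. The variable names are just for illustration.

```python
from fractions import Fraction as F

# Marginal distribution g(x) of X, as read off the table in the video.
g = {0: F(5, 14), 1: F(15, 28), 2: F(3, 28)}

# Values quoted in the video rather than recomputed here.
e_xy = F(3, 14)   # E(XY), from the expected value video
mu_y = F(1, 2)    # mean of Y, from summing y times h(y)

# Mean of X: sum of x * g(x) over x = 0, 1, 2.
mu_x = sum(x * p for x, p in g.items())

# Shortcut formula: sigma_xy = E(XY) - mu_x * mu_y.
sigma_xy = e_xy - mu_x * mu_y

print(mu_x)      # 3/4
print(sigma_xy)  # -9/56
```

Working in fractions instead of decimals keeps the result in the same form as the hand calculation.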
So that's our covariance of x and y for this problem. And so what we can take away from this is, if we have random variable x and random variable y, and they move together, whether that's both going up or both going down, we're going to end up with a positive covariance.
If we have a random variable x and random variable y that move opposite of each other, so if x moves positive, y moves negative or vice versa, we're going to end up with a negative sigma xy. So what we can tell is there is some sense of linear association between these variables and they're moving in opposite directions.
So we can also kind of see that in the table here. The larger values of x tend to show up with the smaller values of y, and the larger values of y tend to show up with the smaller values of x.
So we can see that they're moving opposite and that's reflected with our negative covariance value. And then another concept to take away here is if x and y are independent, so we've talked about statistical independence a few times, but if the two variables are independent of each other, we are going to get a covariance of x and y that's equal to zero. However, a covariance equal to zero does not mean they are independent.
And this is where that linear association is super, super important. So if you get a covariance equal to zero, all that means is there's no linear association between the two, but they could still be dependent on one another in a nonlinear fashion. They could have a non-linear association.
So for a non-linear association, they could be related in a bell shape. They could have something like that. But regardless, they could be related in a non-linear fashion.
So if they are independent, we're going to end up calculating that our covariance is equal to zero. But if we find our covariance is equal to zero, it does not mean that they are independent. So make sure that you have that straight in your mind.
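Here is a small sketch of that idea. The example is hypothetical and not from the video: x takes the values negative 1, 0, and 1 with equal probability, and y is x squared, so y depends completely on x, just not linearly.

```python
from fractions import Fraction as F

# Hypothetical example (not from the video): X is -1, 0, or 1 with equal
# probability, and Y is completely determined by X through Y = X**2.
xs = [-1, 0, 1]
p = F(1, 3)

mu_x = sum(x * p for x in xs)          # E(X)
mu_y = sum(x**2 * p for x in xs)       # E(Y)
e_xy = sum(x * x**2 * p for x in xs)   # E(XY) = E(X^3)

sigma_xy = e_xy - mu_x * mu_y

# Independence would require P(X=0, Y=0) = P(X=0) * P(Y=0).
p_joint = p          # X = 0 forces Y = 0, so P(X=0, Y=0) = 1/3
p_product = p * p    # P(X=0) * P(Y=0) = 1/3 * 1/3

print(sigma_xy)              # 0
print(p_joint == p_product)  # False
```

The covariance comes out to zero even though knowing x tells you y exactly, which is the sense in which zero covariance only rules out a linear association.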
And that relates back to the linear association. That's what our covariance tells us. And then lastly, important to know, I'm going to write it up here.
Your sigma xy has units. And the units are going to be your X units times your Y units. Our example problem doesn't have units, so I can't help you there. But just know that it does have units. And because it has units, it cannot be compared universally.
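To see why the units get in the way, here is a minimal sketch with made-up data. The joint distribution, the units (meters and seconds), and the helper name covariance are all assumptions for illustration, not something from the example problem.

```python
from fractions import Fraction as F

def covariance(joint):
    """Covariance from a joint pmf given as {(x, y): probability}."""
    mu_x = sum(x * p for (x, y), p in joint.items())
    mu_y = sum(y * p for (x, y), p in joint.items())
    e_xy = sum(x * y * p for (x, y), p in joint.items())
    return e_xy - mu_x * mu_y

# Hypothetical joint pmf: x is a length in meters, y is a time in seconds.
joint_m = {(1, 2): F(1, 2), (2, 3): F(1, 4), (3, 5): F(1, 4)}

# The same data with x converted to centimeters.
joint_cm = {(100 * x, y): p for (x, y), p in joint_m.items()}

cov_m = covariance(joint_m)    # units: meter-seconds
cov_cm = covariance(joint_cm)  # units: centimeter-seconds

print(cov_m)           # 1
print(cov_cm)          # 100
print(cov_cm / cov_m)  # 100
```

Nothing about the relationship changed, only the units of x, and the covariance got 100 times bigger. So the size of the number alone doesn't tell you how strong the association is.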
So often we want to be able to say, oh, yeah, it has a strong association. But because of these funky units, we can't really tell much from it. So that's going to lead to our next video, where we will talk about how we can understand universally what the association means.