Understanding One-Way ANOVA Basics

Well, hello there. Today we're starting chapter 14. So welcome to In Class Activity 14A. And I know we've talked about two population means before and testing to see if they're equal or if there's a difference between the two of them. Today we're going to expand that a little bit and we're going to start talking about multiple populations, maybe three or four, to see if there's a difference in the means. And this test is called ANOVA, Analysis of Variance.

So here we go. All right, so to get started, let's do the warm-up. A researcher is interested in comparing the average weight loss over a 12-week period between individuals randomly assigned to one of four groups.

So one group just dieted. One group dieted and did assorted cardio routines four days a week. That's a lot. And then one group did cycling and dieting four days a week. And the last group did a combination of dieting and strength training and cardio four days a week.

So we've got four different groups getting different treatments. I'm imagining that, other than that, everything's going to be equal. I imagine they're going to be randomly assigned. I hope so. And the question is: explain why a one-way ANOVA should be considered in this situation.

So one-way ANOVA is perfect for this because you're looking at multiple groups and you're looking at averages. Not sample proportions; you're wondering about the average weight loss in these groups. So that is exactly what the one-way ANOVA test is designed to do.

So I'll just jot that down any which way I can. The researcher is interested in comparing group means, and not just for two groups, but for four groups. So if you've got multiple groups, this is the way to go.

So, what we're going to be working on today, and we'll be looking at this case study to help us understand: determining whether to reject or fail to reject (I say keep) the null hypothesis in a one-way ANOVA must involve more than the comparison of the group means. So it's a little more complicated than it's been in the past, but we'll use technology, and the same beautiful, simple ideas are going to be evident for hypothesis testing.

You're going to be introduced to something called the error sum of squares and the group sum of squares. Those are the other things involved in deciding if the means are equal or not equal. They are statistics that represent the variation within groups and between groups.

So how much noise is going on? How much chaos is there in the group you're studying? And then compare that to how much differentiation, how much variation there is from group to group to group. So don't worry.

We'll have an example. We'll make it better. We're going to discuss variation within groups and between groups.

And we'll be using good old graphical displays that you are actually really familiar with. So let's get to it. The first thing, and this is based on the assumption that you've already done the preview activity, is to state, write the null hypothesis for this situation, and also be sure to define each parameter.

So there are four groups, four samples, representing four populations. So you'd better have four parameters.

And all of this, everything that we're talking about has to do with means, average means. So the four parameters that we're interested in are averages. So we've got these four groups of people up here, boom, these four groups.

And we're going to assume that all of the different treatments, whether it's dieting only, or dieting and cardio, all the way to maybe the most extreme, dieting and a combination of strength training and cardio, we're going to assume they all have the same effect, maybe even no effect, on weight loss over the 12 weeks. So I'm going to say H naught is going to be: the population average for the first group, which we'll say is this one, is equal to the population average for the second and the third and the fourth.

So I need to make sure to define each one of those parameters. So mu one, if we look up here, I'm not going to try to color code it the way I had before. I'll just do different colors. So mu one is the population, the true mean weight loss for those who just diet.

Okay, so mu two is going to be, and I'm going to cheat, true mean weight loss for those who, what do they do? Diet and do cardio. So that's if we could get everybody in the world, or everybody we're interested in, doing this particular diet and cardio for 12 weeks. Oh, now that I look at that, it looks terrible. So I will write it out.

True mean weight loss for those who... you could put this on fast speed. You've probably learned that by now. So what did the third group do? They did cycling.

So, true mean weight loss after the 12 weeks for those who cycled. But don't forget they also dieted too, right? Diet and cycle. Last one, mu four, what did that group do?

They did strength training and cardio four days a week. True mean weight loss for those who diet, strength train (they were very busy), and did cardio, all in the same specific time of, I think they said it was 12 weeks. Yeah.

Okay. But this right here, this is the hypothesis. We just went ahead and said what each of those are. So H naught, just like every null hypothesis, is about equality.

And HA is going to be against that one. So it's a competing idea. And what you want to think about for HA is you want to think, what can I do?

What is the least I can do to break H naught? I want something that shows H naught is not true. And I want it to be the very least I can do.

So do you have to show that every one of those is different from every one of them? No, the least you need to do is to show that one of them is different than the rest of them. So instead of having the rest, we'll just say one is different from another. And the good news is that I know you guys are tired.

It's the end of the semester. You've been working really hard. The null hypothesis and the alternate hypothesis for this test are always going to look the same. That's not going to change.

So here it is. At least two of the means, we won't list them all again, are different from each other. So that's the weakest statement, the least you can do, to show that the null hypothesis is not correct.
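Written out in symbols, the two hypotheses look like this. The subscripts follow the order the groups were defined in (1 = diet only, 2 = diet and cardio, 3 = diet and cycling, 4 = diet, strength training, and cardio). This is just one common way to write it, not necessarily how the activity prints it.

```latex
\begin{align*}
H_0 &: \mu_1 = \mu_2 = \mu_3 = \mu_4 \\
H_a &: \text{at least two of } \mu_1, \mu_2, \mu_3, \mu_4 \text{ are different from each other}
\end{align*}
```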

So I'll just highlight that. And it's highlighted in happy yellow, because if you're doing an analysis of variance, this is the way the hypotheses look. There's no less than or greater than. It's kind of a not equal.

And maybe it's mu1 and mu2 are not equal. Maybe it's mu1 and mu3. Maybe it's mu1 and mu4.

It could be they're all not equal, but at least two of them do not jibe with each other. Okay, so before conducting a formal hypothesis test, which we'll do in the next section, the researcher would like to visually assess the data. So what are some good ways to visually assess the data?

Look at a graphical display. The following box plots and dot plots, so we've got box plots and dot plots, compare the distributions for each group. The sample mean for each group is going to be displayed, as is the grand mean.

And so the grand mean is what you'd get if you just dumped all the results into one long column of data and took the mean of all of them mixed up together, not paying any attention to groups. So we've got the sample mean for each group and the grand mean. Okay. Oh, they say the grand mean is 17.1 pounds.

Wow. That's a lot. That's, that's a lot to lose in 12 weeks. I'm impressed.

That's the mean of all of them. Okay. So here we go. Here is a possible box plot, a possible dot plot.

So it looks like the groups are pretty small. So what do we have here? One, two, three, four, five. Five people just dieted. One, two, three, four, five people did diet and cardio. So it's five, five, and five. Okay, and so my favorite picture is the box plots.

I really like those. And this is a good review. So let me ask you, when you're looking at the box plot, is this right here, is that the mean of the data?

What does the box plot depict? No, it's the median. So these are the individual medians. And we want to know if the means are the same. So the nice thing is that they actually computed the means for us.

And they're above the box plot. So you can see them there. The first group who did just dieting, they lost an average of 12.8 pounds.

The second group did 16.2. The third was 18.3. And the people who did everything, who did cardio and strength training as well as dieting, they lost the most, which was 21.1 pounds. And we're going to keep in mind that if we dumped the whole data set in and didn't pay attention to groups, the overall mean would be that 17.1. So there's the box plot.
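Here is a quick check of that 17.1, as a sketch. With the same number of people in every group (five each, an assumption based on counting the dots), the grand mean works out to the average of the four sample means. The variable names are made up for illustration.

```python
# Sample mean weight loss (pounds) for each group, as read off the plots.
group_means = {
    "diet only": 12.8,
    "diet + cardio": 16.2,
    "diet + cycling": 18.3,
    "diet + strength + cardio": 21.1,
}
group_size = 5  # assumption: five people per group, counted from the dot plot

# With equal group sizes, weighting each mean by its group size gives the
# same answer as dumping all 20 values into one column and averaging.
total_weight_loss = sum(mean * group_size for mean in group_means.values())
total_people = group_size * len(group_means)
grand_mean = total_weight_loss / total_people

print(round(grand_mean, 1))  # prints 17.1
```

If the groups had different sizes, the weighting by group size would matter and a plain average of the four means would no longer match the grand mean.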

I'm going to focus on the box plots. The dot plots are nice, but I really like to look at the box plots because I'm going to have box plots on the final. So it's a good review. So based on these graphs alone, so we haven't learned any fancy hypothesis testing.

We don't have a test statistic yet. We're just looking at the different means. Based on the graphs alone, does it appear that there's visual evidence that the diets differ in the average weight losses?

That is, is there visual evidence to reject the null hypothesis in favor of the alternate hypothesis? So first, yes or no, and then we'll explain. So what do you think? What's your thoughts on this?

I'm going to say yes. Those groups look really different, don't they? Yes, there seems to be evidence.

There seems to be evidence. Okay, now notice I'm not saying to identify the groups. You don't, and in fact, we don't want to do that. So what's the evidence? Let's be specific on this.

The sample means all look different. These all look different to me. So I wonder if that's too light. So I'll do the explanation in a different color.

The sample means are actually all different. So that might indicate that the population means are different.

And what's very interesting is there's no overlap between these groups. If I look at this group, that's the inner 50% of that data set. This is the inner 50% of this one. There doesn't seem to be a lot of overlap.

So the sample means all look very different and there doesn't seem to be much overlap. They all seem like distinct groups. So, it's possible that the last two groups, this group right here and this group right here, maybe they have the same mean, but I'm willing to bet that this group and this group don't. And remember, it only takes one to break it.

But when you reject the null hypothesis, if you can, you don't conclude, you don't go out and just say, it looks like something's different. You don't go out on the limb and conclude more than that. But I'm going to say yes. Um, so the conclusion would be, these are really different groups. So I'm going to reject H0.

Okay. Suppose the results are different than those. We don't want to peek ahead first.

So these results were looking pretty promising, but now we're going to mess with it. Suppose the results are different than presented in question four.

An alternate result is reflected in the following box plot. Okay, so I want to make this bigger so we can see it. And here's the box plot. Oh, that looks kind of different, doesn't it?

So I was talking about not much overlap. Oh, let me see this. There doesn't seem to be much overlap in question four. But in question five, the groups are kind of blending together a little bit more.

And our overall grand mean, if we look, if we extend this, it's definitely possible that the yellow, the brown and the blue group would share the same population mean. Remember, the sample mean might be slightly different than the population mean just by natural fluctuation. And this one right here has a lot of spread to it. So it's conceivable that, if the green sample underperformed a little bit and the blue sample overperformed a little bit, it might look like they're different when they're actually not.

And what's, but what's shocking here. Based on the graphs alone, does it appear there is visual evidence that the diets differ in average weight loss? So remember the average weight loss, the averages are right here. They're not actually in the box plots, but I think the box plots give us a really strong sense of what those averages might be.

So, based on the graphs alone, does it appear that there's visual evidence? I don't want to say there doesn't appear to be, because the green one still looks quite different than the rest of them, maybe, but I'm definitely going to say there's less evidence. You know, it's not like the first one where you're like, oh, I'm absolutely sure.

There is much less evidence indicating that the true population means for the groups are not all equal. There's evidence to show that maybe, in fact, at least two are not equal.

That's the competing idea. What is it that's compelling you to say that? What's the reason?

Why? Let's be specific. What's messing everything up in this box plot is that there's a lot of variation within the groups.

Each group is quite sloppy and they're all spread out. There's quite, I wonder if I'm overstepping, a lot of overlap between the distributions of the values. And by overlap, I mean, look at the brown and yellow.

There's lots of overlap between those two groups. And then the yellow and the green, there's lots of overlap, and the brown and the blue. So it's conceivable that they're all kind of blending together. And that looks quite different from the first one.

So I'd like you to be a little bit of a detective in problem number six. If you can look at them simultaneously, compare and contrast the displays in question four and question five. So specifically, what's similar?

How is this set similar to this set? And then once you're done with that, how are the sets different? So what do they have in common?

And I don't mean, well, they're all box plots because any data can be made, any quantitative data can be made into a box plot. Why don't you pause the camera and really look at this or the video and really look at this and see what characteristics are different and what characteristics are the same. Okay, I hope you paused and you really looked at it.

What's similar? And I apologize, I'm going to shrink it way down if I can. So I can answer this.

Oh, it doesn't look like they're going to let me. So I'm going to come back to it. But what I noticed, shock, shock, shock, is that actually, remember, you're trying to assess if the means might be different, the population means.

Well, in actuality, this sample mean is identical to this sample mean. And this one is identical to this yellow one. And this one is identical to the brown.

And so is this one. It's the same value. So the means are actually all the same, the sample means, in terms of green to green, yellow to yellow, and so on. And not only that, but the overall mean, if you dump everything together from the four groups, all blended together, is identical.

So the means, both the sample means from group to group to group and the overall means, are actually all the same. That's shocking, because you think, oh, the scenarios are so different, but they're actually not. So all corresponding group and grand means are equal. I'm not saying that the green and the yellow are equal. I'm saying the green of question four is equal to the green of question five, and so on.

And the grand is equal to the grand. So that's how they're similar. So what is it that's making them so different? What overwhelming characteristic? If I'm looking just at the box plots, it's going to be the length of the boxes are different.

Every one of the lengths of the boxes for the question four are smaller than the lengths of the boxes on question five. And so that is a measure of variation. So the variation within the groups are different. And consistently, the ones in question five are bigger.

The variations, and I'm going to say IQR is what I looked at, but you could look at the spreads as well. You could look at the spread of the dot plot. See this spread compared to this spread? Way more spread out.
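To make "length of the box" concrete, here is a small sketch that computes the IQR for two made-up groups. The numbers are hypothetical, not the data from the activity: both groups are centered at 12.8, but the second is far more spread out. It uses `statistics.quantiles` from the Python standard library; other software may use a slightly different quartile rule and report slightly different values.

```python
import statistics

def iqr(values):
    """Interquartile range: third quartile minus first quartile."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    return q3 - q1

# Hypothetical weight losses (pounds) with the same mean but different spread.
tight_group = [11.8, 12.3, 12.8, 13.3, 13.8]   # like a short box in question 4
spread_group = [6.8, 9.8, 12.8, 15.8, 18.8]    # like a long box in question 5

print("tight IQR: ", iqr(tight_group))
print("spread IQR:", iqr(spread_group))
```

The centers match, so the only thing separating the two groups is how long the box would be.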

So maybe you'd use the range for the dot plots, but I'm going to stick with the box plots. So the variation, the IQRs, of the groups in question four are smaller than the IQRs in question five. Okay, so question five is way sloppier, way more chaotic.

So which one appeared to provide more convincing evidence? Well, I kind of gave that away before. It's going to be question four. When I look at question four, I'm like, oh yeah, these are really different groups. And question five has actually the same measures of center, but they might not all be different groups. Okay.

So it's question four. I hope you can see that. I'll just put it over here.

Question four provides more support that the population means are different. How they're different, we're not going to go out on a limb and say what that is just yet. So the second part of this question is, wait, what might the differences suggest about making a conclusion about the null hypothesis in this one-way ANOVA? In both sets, the measures of center are actually the same. In the second set, there was more natural variation within each group that could explain the fluctuation.

So let me see if I understand this. What might the differences suggest about making conclusions about the null hypothesis in this one-way ANOVA? Which one provided more supporting evidence?

What might the differences suggest? It just made sense to me before. What are they asking here?

Okay. So, different color, if we look at the display in question five, the difference in the sample means might be explained just by the natural variation, the fluctuations, of the samples. I'm going to say within the samples.

In question four, you can't say that.

You can't go, oh, well, you know, the difference in these means, that could just be explained by the fluctuation in that particular sample that you drew. But when you look here, you've got a sample that's kind of all over the place for each of them.

Maybe if you went back and pulled other samples, they would come closer together. That's more conceivable on the second one. Okay. So now we're going to get into some technical stuff a little bit. The test statistic and the p-value are calculated by considering a ratio.

So we are departing from the model of test statistic that we had up to now. Up to now, the test statistic has been a standard score. It's either a z-score if you're dealing with p-hat, or it's a t-score if you're dealing with x-bars. So now, right at the end of the semester, just for fun, we're going to look at a slightly different test statistic.

And that test statistic is a ratio. I'm going to make that red. No, I'll make that green. It's a ratio.

I hope the color coding doesn't drive you nuts. So it's a ratio, and the ratio is made up of the variation between the groups compared to the variation within each group. That is where you're going to see the difference. That is, when the variation between the groups is significantly greater than the variation within each group, we will reject the null hypothesis and conclude that at least two of the means are different.
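For reference, the ratio described here is usually called the F statistic. The video hasn't named it or shown a formula yet, so take this as a preview under the standard textbook definition, where k is the number of groups and N is the total number of observations. SSG and SSE are the group and error sums of squares mentioned at the start.

```latex
F = \frac{\text{variation between groups}}{\text{variation within groups}}
  = \frac{SSG / (k - 1)}{SSE / (N - k)}
```

A large value of F means the group means are far apart compared with the noise inside each group, which is the situation where the null hypothesis gets rejected.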

And I want to draw a picture of that for you. We kind of already did. But let's say that I draw. And I go, you know, I did some samples. I haven't drawn the boxes out yet, but there are the, maybe we're going to do, what could it be?

Men, women, and children's foot sizes. So there are men's foot sizes. Here are women's foot sizes, averages.

And here are the children's foot sizes. Okay. And you get, you look at that and you go, oh, we haven't got all the results yet, but that's looking like they're very different.

I haven't provided a scale. I've only drawn a little bit of the, and I'm going to do it again over here. Oops, we're getting ready for another one.

Okay. So you look at that and you go, wow, that measure of center, the first one looks really different than the second one, which looks really different than the third one.

And you can get all excited about that. And then when you see the box plot, here's all the men's shoe sizes. Here's all the women's shoe sizes for our sample. And here's all the children's shoe sizes. Okay.

Well, those look like really different groups. Just to finish off that it's a box plot. And remember, it could be a modified box plot.

So we've got one outlier, one man with tiny feet. And then here is the women. And guess who that is? That's Bronwyn with her gigantic feet.

And then the children, that's Teddy and Delia. Big family. They got it from their moms. So you can conclude this right here is a picture of that first paragraph, where the variation between each group, which is the red lines, is huge. And the variation within the group, there's not a lot of variation here.

The variation within each group, there's not a lot. But uh-oh, we go back and we discover that here were the results of the men, women, and children. And there seems to be a significant variation from group to group to group.

But uh-oh, here are our box plots. Hmm. And the children. Is it conceivable that every one of those actually has the same population mean? It's a little more conceivable. So the second picture is this little paragraph right here. Let's read it. However, when there's a significant amount of variation within the groups, that's going to be this, this, and this, relative to the variation between the groups (well, I made it so they had almost the same variation between the groups, those are the red lines), we will have less evidence of a difference.

So, and I'm just going to say, let's say here, in purple, is it possible that all of these groups have the same population mean?

It's possible. We don't see that purple result, but it's possible that the last group had a lower number. Now keep in mind the red lines are medians, but they could be related. But can I draw a purple line through here and have it hit the boxes? I can't. So this one doesn't exist. But here, it's possible they really all equal each other, and so that would support H naught. Okay, so that is a picture of that one.

Okay, so the statistic measuring the variation within the groups is called the error sum of squares. This is calculated by summing the variation within the groups.

The variation within each group can be visualized in the box plot as the size of the boxes. So that's the chaos within the groups. And for the dot plot, it would be the spread of the dots.

The statistic measuring the variation between the groups is called the group sum of squares. And it can be calculated by summing the variation between each of the group means and the grand mean. So you're measuring how far the group means sit from the grand mean.
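In symbols, a standard way to write the two statistics is below. The notation is an assumption, since the activity hasn't shown formulas yet: x_{ij} is the i-th value in group j, \bar{x}_j is the sample mean of group j, n_j is that group's size, \bar{x} is the grand mean, and k is the number of groups.

```latex
\begin{align*}
SSE &= \sum_{j=1}^{k} \sum_{i=1}^{n_j} \left( x_{ij} - \bar{x}_j \right)^2
  && \text{(variation within groups)} \\
SSG &= \sum_{j=1}^{k} n_j \left( \bar{x}_j - \bar{x} \right)^2
  && \text{(variation between groups)}
\end{align*}
```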

So it's not exactly the medians of the box plot at all, but it's helpful. This is a helpful clue that they might all be equal. And this is a helpful clue that maybe one of them is different than the other two.

Okay, so given the information about the error sum of squares, which of the data sets, question four or question five, do you think would have the greater error sum of squares value? So look at the two originals, not what I just drew here, but look at these here. Does question four or question five have the greater, bigger error sum of squares?

Well, it's going to be question five, right? It's going to be this one because the boxes are literally bigger. So I'm going to write there: the data in question five likely has the greater error sum of squares.

And why? Don't overthink it. Because the boxes are more spread out. Spread equals variation. So the group, there's a lot more chaos inside each of the groups.
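Here is a sketch of that comparison. The raw data values are not given in the video, so the two data sets below are hypothetical: each one is built from the four reported sample means plus a set of made-up deviations, small ones to mimic question 4 and large ones to mimic question 5. The function name `error_sum_of_squares` is also just for illustration.

```python
def error_sum_of_squares(groups):
    """Sum of squared distances from each value to its own group mean."""
    sse = 0.0
    for values in groups:
        group_mean = sum(values) / len(values)
        sse += sum((x - group_mean) ** 2 for x in values)
    return sse

# Sample means reported in the video (pounds lost over 12 weeks).
reported_means = [12.8, 16.2, 18.3, 21.1]

# Hypothetical deviations around each mean; both sets sum to zero,
# so the group means stay the same in both data sets.
small_deviations = [-1.0, -0.5, 0.0, 0.5, 1.0]   # tight boxes, like question 4
large_deviations = [-6.0, -3.0, 0.0, 3.0, 6.0]   # long boxes, like question 5

question4_like = [[m + d for d in small_deviations] for m in reported_means]
question5_like = [[m + d for d in large_deviations] for m in reported_means]

print("SSE, tight groups: ", round(error_sum_of_squares(question4_like), 2))
print("SSE, spread groups:", round(error_sum_of_squares(question5_like), 2))
```

The group means are identical in both made-up data sets, so the much larger second value comes entirely from the spread inside each group.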

So I just want to put here, this indicates more variability within each group. Okay, and the last question: given the information about the group sum of squares, which data set do you think would have the greater group sum of squares value? Explain. Let's look again.

Which one? The first one or the second one? So what the group sum of squares indicates, it's measuring the variation between the sample means and the grand mean. Sample means and the grand mean. And if you take a closer look, you see they're identical.

So that was kind of a trick question there. Whatever the measure is, and we haven't shown you how to calculate it, when you're looking at how much the means from each group vary from each other and vary from the overall mean, they're all the same measures. So these values are identical in both questions, because the group sum of squares is calculated from the sample means and the grand mean.

And they are all the same in each set. Okay, so that's a little disappointing. But it's all relative, because you're going to make that ratio of the change from group to group to group divided by the variation within all of the groups. And so that's good.
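Since the group sum of squares only needs the sample means, the group sizes, and the grand mean, it can be computed directly from the numbers on the plots. The sketch below does that, assuming five people per group, and then forms the between-to-within ratio using two hypothetical error sums of squares (10 and 360, made-up values standing in for a tight data set and a spread-out one). Dividing each sum of squares by its degrees of freedom is the standard textbook step, which the video has not covered yet.

```python
reported_means = [12.8, 16.2, 18.3, 21.1]  # sample means from the plots
group_size = 5                             # assumption: five people per group

k = len(reported_means)            # number of groups
n_total = k * group_size           # total number of people
grand_mean = sum(m * group_size for m in reported_means) / n_total

# Group sum of squares: squared distance from each group mean to the
# grand mean, weighted by the group size.
ssg = sum(group_size * (m - grand_mean) ** 2 for m in reported_means)

def f_ratio(ssg, sse, k, n_total):
    """Between-group variation over within-group variation."""
    mean_square_groups = ssg / (k - 1)
    mean_square_error = sse / (n_total - k)
    return mean_square_groups / mean_square_error

# Hypothetical error sums of squares: the same SSG, very different noise.
sse_tight, sse_spread = 10.0, 360.0

print("SSG:", round(ssg, 1))
print("ratio, tight groups: ", round(f_ratio(ssg, sse_tight, k, n_total), 2))
print("ratio, spread groups:", round(f_ratio(ssg, sse_spread, k, n_total), 2))
```

The group sum of squares is the same in both cases, so the whole difference in the ratio comes from the denominator.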

So it'll come out as significant in question four anyway. All right. So this was an introduction.

We're going to get more specific in the next section, but determining whether to reject or fail to reject the null hypothesis in a one-way ANOVA must involve more than a comparison of group means. So we're comparing the group means to the variation within those groups. With the error sum of squares, you're looking at how much spread there is within each of the groups.

And then with the group sum of squares, you're looking at how much the actual means fluctuate from each other and from the grand mean. And we'll get into that later. But right now, what we really want you to be able to do is to write the null hypothesis and write the alternate hypothesis.

So we've discussed the variation within groups and the variation between groups, and that is what's going to be used to make our test statistic. Okay, all right, I will see you in the next video.

Take a break, do the practice, and I'll see you in the next section.
