In this video we're gonna learn how to calculate the one-way analysis of variance, or ANOVA, by hand. Now, this is admittedly a very laborious process. There are lots of steps to follow and lots of formulas to learn, but I'm gonna walk you through all of that. We're gonna go over all the formulas you need, and I'm gonna show you some simple steps that will take a little bit of work to master, but that you can follow to make sure you get the right answer every single time you calculate the one-way analysis of variance.

To start, let's take a look at the test statistic that we're after. This is the F ratio, just in generic terms here. This is our test statistic: the z-test has the z test statistic, the t-test has the t test statistic, and the analysis of variance has the F ratio, or F test statistic. It's called a ratio because, look at the generic formula here, it's a variance divided by a variance. It's an analysis of variances, and specifically it's a ratio of two different variances.

So let's break this down a little bit, because the approach we're taking with the one-way analysis of variance is very different from the z-test and t-test worlds that we were in before. In the numerator we have the variance due to group differences. This is what we really care about, right? We're typically looking at differences between more than two groups, so we want to know: are there significant differences between any of those groups? Do the averages of whatever you're looking at vary significantly between one group and the next? That's what we want to know.

We have to control for something, though. We have to divide by something to see if whatever differences between groups we're seeing are really meaningful, and that's where we get this not really helpful term here: variance due to random chance. This is what we're controlling for, and I'm gonna break that down a little bit. So let me start to translate what you're
seeing here to get us closer and closer to the actual formula for the F ratio. The numerator hasn't changed much, it's the variance between groups, but in the denominator we have the variance within groups. Now imagine you're a researcher who's interested in whether there are significant differences between many groups. You don't really care about whether there are differences within a single group, right? Different individuals who are all in the same group might differ from each other, and some people might be interested in that, but you don't care about that. For our purposes in this study, we're interested in between-group differences. The within-group differences we're gonna simply call error, or random chance: differences we don't care about among individuals within single groups. So this is how we control for, or sort of look at, whether the between-group differences are really meaningful.

So here's how you can interpret the F ratio overall. The variance between groups, which is our numerator, will be small if there are no treatment effects, if there are no differences between groups, leading to a small F statistic and therefore a non-significant p-value. And the variance between groups will be large if there are treatment effects, or if there are differences between groups, leading to a large F statistic and a significant p-value, the p-value of course typically coming from statistical software.

So here's the actual formula, really the first formula in a series of formulas that build on one another. Here's the formula for the F ratio, and as you'll see, this is actually gonna be the final step when we're doing the problem, which we'll get to in a little bit. We have MS between over MS within, two terms you probably haven't seen before, so let me give some context here. MS refers to the mean sum of squares. We've learned a little bit about sums of squares in the past. This is a little different than what we calculated, for example, when doing
sample variance or sample standard deviation. This is a little bit of a different sum of squares, and it's also a mean sum of squares: it's an average sum of squares between groups, and this is how we measure the variance between groups. Then in the denominator we have the mean sum of squares within groups, which is how we measure the variance within groups. So we're just translating what we saw in words on the previous slides into an actual formula we can compute.

But as I said, all these formulas have formulas of their own. MS between is calculated by dividing SS between, your sum of squares between, by df between, your degrees of freedom between. So mean sum of squares is literally a mean sum of squares. Think about what a mean is: you take a sum, you add up some stuff, and you divide by how many. Well, that's kind of what we're doing here. It looks very different on the surface, but that's what we're inspired by. The mean sum of squares is the sum of squares divided by how many, which is the degrees of freedom. So for between and within we have very similar formulas: MS between equals SS between divided by df between, and MS within equals SS within divided by df within. And all four of those terms, SS between and within and df between and within, have their own formulas. Again, this is why it gets a little bit overwhelming.

Let's start with the easy part, the degrees of freedom. This should typically take you 20 or 30 seconds to figure out. On the left here, the degrees of freedom between equals k minus 1, where k is simply the number of groups, or levels of your factor if you want to get technical with the terminology. So if you're analyzing differences between three different groups in this one-way analysis of variance, k equals 3, and your degrees of freedom between would be 3 minus 1, or 2. Again, it takes just a few seconds. Degrees of freedom within is just as easy. n_T here stands for n total, your
total sample size among all groups in your entire study, and you subtract k, the number of groups. So if we had, say, 30 people in our study and there were three groups, your degrees of freedom within would be 30 minus 3, or 27.

So that's the easy part of the problem. Let's get to the hard part, the sums of squares. This is where 95% of the work is. If you can calculate the sums of squares, which is what I'm really gonna focus on and teach you the steps for, the rest of the problem takes about two minutes and it's really easy. You can see just on the surface that it's a little bit more complicated. We're even seeing things we probably haven't seen before, at least in this video series. For example, we have ΣΣ (sigma sigma). We've seen Σ many times before, but we've never seen ΣΣ, so that's something I'm gonna help you figure out how to calculate, again through steps. I would really recommend not trying to go straight for these formulas, because you're just gonna get wrong answers. Follow the steps, because for some of these things, like ΣΣ, it's not really obvious how you would get straight to them.

All right, so let's get down to business: the steps for calculating the sums of squares. Here's the general structure of how these problems work. We're gonna start by calculating some statistics within each group. So just for group 1 we're gonna find a couple of statistics, then just for group 2, then just for group 3 or 4, however many groups you have, just those groups individually. Then once we have those within-group statistics, we'll do the across-group stuff, the between-groups stuff.

So for the first part of these problems, the within-group stuff, here are all the values you need. First of all, you need the sample size within that group. Notice this is not n total, this is just n, so you'll end up with n₁, n₂, n₃, however many groups you have: a sample size for each group. Then you'll have ΣX, which means the sum of all the values in that group. So again, you're gonna have one of
these terms for every single group: the sum of the values of group 1, the sum of the values of group 2, and the sum of the values of group 3.

This third term actually isn't necessary to find F, but it's a really helpful term to find in general for interpretation purposes, as I'll describe here. x-bar is your mean, so the mean of group 1, the mean of group 2, and the mean of group 3 or 4, however many groups you have. This again doesn't go into any of the formulas I've shown you so far, but it's really helpful for figuring out, if there are differences, where those differences are coming from. Let's say, for example, the group 1 mean is 7, the group 2 mean is 7, and the group 3 mean is 84. Then it's pretty obvious from looking at these means that group 3 is driving the differences here, is driving your significant result. So I always recommend looking at the means, but again, if you just want to get the F test statistic, you can skip this step altogether.

Okay, second of all, or fourth of all, I don't know where I got second. For each group, again we're within each group, we already have n and we already have ΣX. You take those two values and you compute this: you take the ΣX you calculated, you square it, and you divide by the number of people in that group. That's (ΣX)²/n, and again you'll have one of these terms for every single group. And then you have Σ(X²), the sum of the squared values within each group. This is why I always recommend, when you're calculating a one-way analysis of variance, that one of your very first steps should be squaring all of the values and creating new columns. If you have three regular columns, three groups of data, create three new columns that are just squared versions of each of your original columns of data. Again, I'll illustrate through a sample problem very soon.

Okay, so once you have these values within each group, it's time to do the across-group stuff, where again everything builds on what comes before it, and the same is true
here. So the first thing you need to find is n_T. Well, this is pretty simple, right? If you have the n for each group, the sample size for each group, then to find the total sample size across all groups you're just going to add up all those sample sizes. This is also where we get ΣΣX. We already have all the ΣX's for each group, so to find ΣΣX we're going to add up all the ΣX's: the sum of each group's values, and then the sum of those sums. Just to visualize this, imagine you have the sum of the values for group 1 plus the sum of the values for group 2 plus the sum of the values for group 3. That's where you get ΣΣX.

Next, this term looks ugly, but look, it's just the things we already calculated before it. We already have ΣΣX. This is one value, by the way. We're no longer in the world where you have a value for every single group; we added all the group values into one single value. So we already have ΣΣX, we're gonna square it, and then we're going to divide by the total sample size. That's (ΣΣX)²/n_T.

Then we have two terms left. To get all of this term here, Σ[(ΣX)²/n], we're going to add up the (ΣX)²/n values for each group. And finally, you might be noticing a theme: in order to get ΣΣ(X²), we're going to add up all of the Σ(X²) values. So we're gonna take each group's value here and add them up into one big value.

Now notice these three terms. Visualize them, take a mental note: these last three terms that we got are the three terms that get plugged into the formulas for the sums of squares between and within. So following these steps naturally leads you to the three terms you need for the SS formulas. By the way, it's three terms, if I haven't mentioned it already, because this term is repeated twice: it's once in the sums of squares between and once in the sums of squares within
formulas. So I would take a screenshot of this slide. I also post a link in the description of this video to a PDF guide that has all of these formulas and how to use them. But I would make a note of all these formulas, because you're not gonna want to memorize these. There are too many, right? What I will mention here is that these formulas build on themselves, so you're typically gonna start at the very end and work your way up. The F ratio is gonna be your very last little division to do. You're not gonna start with this top formula. You're gonna start with the degrees of freedom and then the sums of squares, then you're gonna use those two things to get MS between and MS within, and then you're gonna use those two things to get F. So again, once you've found the sums of squares, all the rest is very easy and quick. You just have to get used to it. But let's do a practice problem to illustrate.

All right, so here you can see I have a bunch of data. I've already gotten us started a little bit. You can see, for example, I have three different groups: I have data for group 1, data for group 2, and data for group 3. I've also squared each of those groups' values, so here's group 1 squared, group 2 squared, and group 3 squared. For example, this first person here in group 1 got a 6 on whatever we're talking about, and their squared value is 36. The first person in group 2 got an 8, and their squared value is 64. And the first person in group 3 got a 9, and their squared value is 81.

What I would recommend is creating three sections on your page where you're gonna list all the statistics within each group, because there are gonna be a lot of statistics floating around and you really want to keep them separate. Just label everything and use subscripts, as I will. I think that really helps keep everything straight, and that's what we're gonna do here. So first of all, let's
start with the within-group statistics. The first thing we needed to find was the sample size for each group. So n₁, the sample size for group 1, equals 8. We have 8 people here, right? 1, 2, 3, 4, 5, 6, 7, and 8. For n₂ we also have 8 people, and for n₃ we have, you guessed it, 8 people.

Okay, now you can also look at the means. I'm just going to go ahead and do that now, because again I think it's kind of useful. For group 1 the mean, if you were to actually calculate this out, is 5.625. For group 2 the mean is, again I'm keeping subscripts straight here, I think that's really helpful, 6.625. And for group 3, x-bar sub 3 equals 8.125. So already it looks like we have some between-group differences, right? Group 3 seems to be higher than group 2 by quite a bit, and group 2 seems to be a little bit higher than group 1 on whatever dimension we're looking at. The analysis of variance is going to ask whether some of those differences might be significant or not.

Okay, so we already have the sample sizes. The next thing in that list of steps I gave you is the sum of each group's values. I'm looking first of all at just ΣX. We will soon have to find Σ(X²) as well, but that's another story. For group 1, if you were to add up 6 plus 6 plus 5 plus 8 plus 7 plus 6 plus 3 plus 4, ΣX would give you 45. If you were to do the same thing in group 2, adding up all the regular X values, not the squared values yet, 8 plus 7 plus 6 plus so on, you would get 53. And finally, the sum of the values in the third group, if you were to add up 9 plus 8 plus 9 plus 9 and so on, would be 65.

All right, we have a couple more terms to find within each group, and then we can get to the between-group differences. We need to find (ΣX₁)²/n₁. This is just taking 45 squared over 8, and if you do this, again it's just the two terms we already have, you're going to get
253.125. Okay, we're going to do the same thing for group 2 here. For (ΣX₂)²/n₂ we're gonna get 53 squared over 8, and this term comes out to 351.125. One more here: (ΣX₃)²/n₃ is going to be 65 squared over 8. See, the math is pretty simple, and it all builds on what came before it, as I've said a few times. It's really just about following the steps. 528.125, that's your last term here.

So we only have one more thing to find within each group before we start adding up stuff across groups: the sum of the squared values for each group. So we have Σ(X²) for group 1. Notice the squared is within the parentheses, which tells you it's the sum of the squared X values. If you were to add up 36 plus 36 plus 25 plus 64 plus 49 and so on, you would get 271. Now let's add up all the squared values for group 2: Σ(X²) for group 2 equals 359. This is just adding 64 and 49 and 36 and so on. And last but not least, Σ(X²) for group 3 is 537.

All right, now we get to the good stuff. Now we can start adding up to get those final terms across groups. The first term we need across groups is n total, n_T. Now that we already have the n's for each group, n_T is very easy to find. It's just the sample size of group 1 plus the sample size of group 2 plus the sample size of group 3: 8 plus 8 plus 8, which comes out to 24.

Next we have that term ΣΣX, which again probably looked foreign in the beginning, but now you should have a guess as to how we find it. We already have ΣX for group 1, ΣX for group 2, and ΣX for group 3, so to find ΣΣX across groups we're going to add those three terms up: 45 + 53 + 65, which comes out to 163.

So now all that's left is to find those three key terms that get plugged into the sums of squares formulas, which
then get plugged into the mean sum of squares formulas, which then get plugged into the F ratio. I told you there are a lot of formulas, but hopefully you see that this math is actually pretty easy and it's really just a matter of practicing it over and over again.

The first key term is going to be (ΣΣX)²/n_T. This should be pretty easy, right? We already have ΣΣX and we already have n total, so to find (ΣΣX)²/n_T we're gonna do 163 squared over 24, and this is going to come out to our first key term, let me clean that up a little bit, our first key term of 1,107.04. The fours aren't cooperating here. There we go.

All right, next we have just two more things. We have Σ[(ΣX)²/n], the sum of the (ΣX)²/n terms. Look here, we already have (ΣX)²/n for each group: here it is for group 1, for group 2, and for group 3. So to find Σ[(ΣX)²/n] we're just going to add those three terms up: 253.125 plus 351.125 plus, I'm getting a little small here, I just want to make sure we have enough room for the rest of the problem, 528.125. This gets us our second key term for the sums of squares formulas, 1,132.38.

All right, last term. You can hopefully guess how to get this one based on everything we've learned so far. We've already found Σ(X²) for each group, so to find ΣΣ(X²) we're going to add up each of those values: Σ(X²) for group 1, 271, plus Σ(X²) for group 2, 359, plus Σ(X²) for group 3, 537, which gets us our last term, 1,167.

Okay, now we have the degrees of freedom, which again should be pretty quick, and then we'll just be plugging in for the rest of our lives here, right? So we have degrees
of freedom between, I'm just gonna put B instead of between to save some time here, which equals k minus 1. Try to remember, what does k equal? The number of groups, right? We have three different groups, so k minus 1 is 3 minus 1, which is 2. And the degrees of freedom within is going to equal n_T minus k. Our total sample size we already have, it's 24, minus the number of groups, which is 3, is gonna equal 21.

Now we can plug into our sums of squares formulas. You might want to look back at your notes, or I guess not necessarily your notes, but either the PDF I have linked below or the earlier slides, to see the formulas for the sums of squares between and within, because I'm not going to rewrite them. For the sum of squares between we're gonna plug in this first term here, 1,132.38, which is what we found right over here, minus 1,107.04, which gets us 25.34. And SS within equals 1,167, again this term right here, I'm just pulling these terms and literally plugging them into the different parts of the formulas, 1,167 minus 1,132.38, which gets us 34.62.

All right, sigh of relief at this point, because we have 99% of the work done. We just have a couple of formulas to plug into and then we're done. Notice we already have the sum of squares between and degrees of freedom between, and the sum of squares within and degrees of freedom within. To find MS between we're gonna take our SS between divided by degrees of freedom between. Again, it's not really a lot of extra math; we're just taking values we already have and dividing them. So for MS between we have 25.34 divided by 2, which comes out to 12.67, and for MS within we have SS within divided by degrees of freedom within, which is 34.62 divided by 21, or 1.64. Congratulations,
you're down to your last formula here. So here we have F equals MS between over MS within, which comes out to 12.67 divided by 1.64, and your F test statistic is 7.70. You're done. This is it. That is the test statistic, or the F ratio, for the one-way analysis of variance.

Now I'll note that you can also calculate the effect size for a one-way analysis of variance. It's very simple to do once you already have all your terms. It's simply taking your sum of squares between, SS between, divided by SS total, which is just between plus within, and that's going to get you a measure of effect size called eta squared. You can look that up if you'd like some more information. I just want to briefly mention it, because note we already have SS between, and we already have SS between and within, which we can add together to find the total. So if you want to calculate the effect size, it's pretty simple to do and it just takes a few seconds. So that's how you calculate the one-way analysis of variance and the effect size for the one-way analysis of variance.
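
If you'd like to check the hand calculation, here is a minimal Python sketch of the same steps: within-group statistics first, then the three key terms, then SS, df, MS, F, and eta squared. The function and variable names (`group_summary`, `one_way_anova`, `term_total`, and so on) are made up for illustration and don't come from the video. Only group 1's raw scores are read out in full, so groups 2 and 3 are entered as the stated n, ΣX, and Σ(X²) totals rather than guessed raw values.

```python
def group_summary(values):
    """Return (n, sum of X, sum of X squared) for one group's raw scores."""
    return len(values), sum(values), sum(v * v for v in values)


def one_way_anova(summaries):
    """One-way ANOVA from per-group (n, sum_x, sum_x_sq) tuples."""
    k = len(summaries)                              # number of groups
    n_total = sum(n for n, _, _ in summaries)       # n_T
    grand_sum = sum(sx for _, sx, _ in summaries)   # sigma sigma X

    # The three key terms from the walkthrough.
    term_total = grand_sum ** 2 / n_total                      # (sum of sums)^2 / n_T
    term_groups = sum(sx ** 2 / n for n, sx, _ in summaries)   # sum of (sum X)^2 / n
    term_raw = sum(sx_sq for _, _, sx_sq in summaries)         # sum of all squared values

    ss_between = term_groups - term_total
    ss_within = term_raw - term_groups
    df_between = k - 1
    df_within = n_total - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "ss_between": ss_between,
        "ss_within": ss_within,
        "df_between": df_between,
        "df_within": df_within,
        "ms_between": ms_between,
        "ms_within": ms_within,
        "F": ms_between / ms_within,
        "eta_squared": ss_between / (ss_between + ss_within),
    }


# Group 1 raw scores as read out in the video; groups 2 and 3 use the
# stated totals (n, sum of X, sum of squared X) because their raw scores
# are only partially listed.
group1 = [6, 6, 5, 8, 7, 6, 3, 4]
summaries = [group_summary(group1), (8, 53, 359), (8, 65, 537)]

result = one_way_anova(summaries)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

Because the code carries full precision instead of rounding at each step, it gives F of roughly 7.68 and eta squared of roughly 0.42, a little lower than the 7.70 from the hand calculation, which appears to come from rounding intermediate values.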