And so we're about to round off Chapter One. What have we done so far? In Section 1.2, we established really foundational words for studying the objects of interest in data: population, sample, and variable, whether numerical or categorical. Baseline ideas. In Section 1.4, we talked about how to organize categorical variables. And lastly, in Section 1.5, the really important question is how. How am I going to collect this data? See, how we collect data plays a very important part in the type of results we get at the end of the whole study. How you collect data will indicate whether two different variables are just somehow related or whether one causes the other.
See, in statistics, we have this thing called causality, where causality is simply those 'if-then' statements. If I do this, will this happen? If I go to Disneyland, will I be happy? Notice those are two separate variables. Where I'm starting is whether or not I go to Disneyland, and where I'm ending is whether or not I'm happy. When it comes to statistics, we will want to study those two variables. The treatment variable is where you start. So, the treatment variable in my example is whether or not I go to Disneyland. If I go to Disneyland, then what's the result? What's the response? What's the outcome? What do I end with? Well, the end is: Will I be happy? That's the outcome variable. And so ultimately, when it comes to discussing case studies, these are the two variables we will want to study: the treatment variable and the outcome variable. So the big question is: why? Why do we want to do that? Well, it's ultimately because we want to see if the treatment variable will cause change in the outcome variable. We want to see if going to Disneyland, my treatment variable, will cause change in my outcome variable: will I be happy? Change.
Change means you're going from one place to another; changing locations means you're going from point A to point B, meaning you are comparing. So, what are we going to compare? Ultimately, we are going to use that treatment to form two groups. You take that treatment, to go to Disneyland or not to go to Disneyland, and break people into two groups: a treatment group, the people who do receive the treatment, who do have that characteristic of interest, like getting to go to Disneyland, and a control group, the group that does not receive the treatment, that does not have that characteristic. We'll talk to people who go to Disneyland; we'll talk to people who don't go to Disneyland. And then we ask both groups the same outcome question: Are you happy? And you will then compare the outcomes of the two groups. That is literally going to be the point of this statistics class: we are going to figure out how to start with a treatment variable, put people into two separate groups, and then ultimately compare their responses.
The treatment variable is always what you start with. The treatment variable will always come from the 'If' part of the statement. Now, in this particular problem, the treatment is literally using online homework. So how do you write that as a variable? Well, you have to write it as a whether-or-not statement. Why? Well, because as we saw from the prompt, some students are getting online homework, but some students are not. So, the treatment variable is whether or not a student is using online homework. Really quick: whether or not, is that a categorical or numerical variable? Yeah, definitely categorical. Yes or no is super important because those who say 'Yes, I'm getting the treatment' go in the treatment group, whereas those who say 'No' are the ones who go in the control group.
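As a minimal sketch of that yes/no split, assuming a small made-up roster (the student labels and the field name `uses_online_homework` are hypothetical, not from the problem itself), the grouping could look something like this in Python:

```python
# Hypothetical roster: the labels and values are invented for illustration only.
students = [
    {"name": "Student A", "uses_online_homework": True},
    {"name": "Student B", "uses_online_homework": False},
    {"name": "Student C", "uses_online_homework": True},
    {"name": "Student D", "uses_online_homework": False},
]

# The treatment variable is categorical (yes/no), so it sorts every student
# into exactly one of the two groups.
treatment_group = [s["name"] for s in students if s["uses_online_homework"]]
control_group = [s["name"] for s in students if not s["uses_online_homework"]]

print("Treatment group:", treatment_group)
print("Control group:", control_group)
```

Every student lands in exactly one group, which is what the 'whether or not' wording is there to guarantee.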
The treatment group is the students who get the thing we actually want to test, which is using online homework. The treatment group will always be the group that gets what we are interested in understanding: so, online homework. Whereas the control group will always be those who don't. So in this case, those who don't use online homework: students who are going to do traditional homework. And so, the outcome variable is ultimately going to be a variable we check for both groups. So what is the outcome we are looking for? For all of the students, what outcome are we looking for? Literally, what do you get at the very end of taking a class? It's your overall grade. The overall grade is going to be your outcome variable. And so, the big question students ask is: Shannon, why didn't you include the word 'improve'? See, the word 'improve' is actually what we'll get at when we compare the grades of both groups. The act of improvement is going to be checking whether the grades of those with online homework are higher than the grades of those with traditional homework. The act of identifying improvement is what we are going to study in chapters seven and eight, because it's going to take a lot more math skills to be able to show with confidence that there was an improvement from the control group to the treatment group. And so when it comes to your outcome variable, your outcome variable is simply the result for a single object of interest. So you go to that student and you ask them: What is your grade? There's no improvement because it's just their grade. You end this class with exactly one grade, whether it be a percentage or a letter grade, depending on if we're looking at percentages or just your transcript. The point is: you only get one. There's no comparison for a single student.
Rather, this word 'improvement' is about comparing the two groups and seeing if the overall grades from one group are higher than the overall grades from the other.
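Here is a rough sketch of that group comparison, assuming made-up grades recorded as percentages (the numbers and the field names `uses_online_homework` and `overall_grade` are hypothetical). Each student contributes exactly one outcome value, and the comparison only happens between the groups. Averaging is just one simple way to line the groups up; showing with confidence that there was an improvement is the chapters seven and eight material.

```python
from statistics import mean

# Hypothetical records: one overall grade (as a percentage) per student.
records = [
    {"uses_online_homework": True, "overall_grade": 88.0},
    {"uses_online_homework": True, "overall_grade": 76.0},
    {"uses_online_homework": True, "overall_grade": 91.0},
    {"uses_online_homework": False, "overall_grade": 82.0},
    {"uses_online_homework": False, "overall_grade": 70.0},
    {"uses_online_homework": False, "overall_grade": 79.0},
]

# Outcome variable, collected separately for each group.
treatment_grades = [r["overall_grade"] for r in records if r["uses_online_homework"]]
control_grades = [r["overall_grade"] for r in records if not r["uses_online_homework"]]

# The comparison is between groups, never within a single student.
treatment_mean = mean(treatment_grades)
control_mean = mean(control_grades)
difference = treatment_mean - control_mean

print(f"Treatment mean: {treatment_mean:.1f}")
print(f"Control mean:   {control_mean:.1f}")
print(f"Difference:     {difference:.1f}")
```

A positive difference in this toy data does not by itself show that online homework caused higher grades; it is only the raw comparison that the later chapters learn how to judge.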