So we want to talk about statistics. I am not a statistician. I've had three statistics classes over the course of my college career.
And not because I've failed stats twice; it's because every program I've been in, you have to take their version of statistics. And it's pretty much all the same stuff. What I'm going to show you today are a couple of statistical tests that are going to be important for us. You don't need to know all the theory behind it, all the math, and how to do the calculations by hand. I'm going to give you a website, and we're going to practice using that website to copy and paste our data and interpret the results.
That's it. Okay, so it should be fairly simple. How many of you have had a stats class in college or in high school?
A few of you. Okay, so I'm going to do it a little different. I'm going to simplify it.
and use words that I think are going to be easier for us to understand what we're doing, because this is not a statistics class. However, it's important to understand why scientists use statistics in their research, and that's kind of what part of what I'm going to talk about today is about. So obviously when we do research, we collect data, and the data that we collect, we have to think about what it means.
Whoops, what's going on here? We have to think about what that data mean, and then when we do that, we're really looking for two things. We're looking for differences in the data, so we have group A and we have group B, and we're kind of comparing the results. Are we seeing the same thing happening in both groups?
That is one kind of statistical test, or a couple kind of statistical tests that we're going to focus on today, actually. We're going to be looking at what's called a t-test and something called an ANOVA. And depending on how much data you have, how many data sets, and how many groups, that essentially determines what test you're going to use.
So we also look for relationships or can look for relationships between data. In other words, what type of relationship is there between variable X and variable Y? How much influence does X have on variable Y?
In other words, is X likely causing a change, or is something else possibly causing a change? And this kind of statistical test is called the correlation and regression test, or at least the relationship test we're going to eventually do. Another point here for number one: not only are we looking for differences, but we can also be looking at changes.
So changes or differences between groups is what we're doing for that test. We will talk about number two here in a few weeks when we get to graphing because it is tied with graphing. And so the statistical tests that we do help us to ask the real fundamental questions.
Are these differences, are these changes, are these relationships real, or are they just due to the randomness of nature? Essentially, those are the two choices.
Either the differences or changes we're seeing from group A and group B are due to just random chance, roll the dice, or there's something specific happening. And so statistics allows us to look at what's the probability that these relationships or associations or differences are real and not due to random chance. And that's what we're going to focus on today with some examples using the data we collected. Okay.
Are there any questions at this point? All right. So let's say we wanted to measure the temperature of every student at Muskegon Community College.
That's a pretty big task. That group at MCC... every single student would be called what we call the population.
But logistically, it's probably not going to be possible to do that. It's very difficult to do. So to solve this problem, scientists or statisticians or researchers use what's called a sample.
So if this box represents our entire population, we're going to take kind of a cross-section or what you should do is take a cross-section of everybody that is in that population, ages, genders, ethnicities, whatever the case is, and you're going to do just a small subset. And just test them. It's much easier to do that.
And then you're going to run your statistics. Now, we're not going to do this, but one of the other things that statistical analysis can do, it allows you to determine the probability that this sample is going to be a reflection of the population as a whole. That's a big thing that they talk about in your statistics classes.
How well does this sample represent the population as a whole? What's the margin of error, and what's the probability that this is really how the population is? We're not really going to get that far.
We're basically just going to be running statistics to look at are the differences real or are they due to randomness? Are the changes we're seeing real or due to randomness? Now when we do our research, obviously we get data and that data can fall into really kind of two categories.
Categories, or what the gentleman in the video referred to as buckets. So you can have group one and group two, and then within each of those buckets you have information about those subjects. Maybe it's their age, maybe it's their test score, okay, maybe it's where they live, their location, or whatever the case is. All of that is data. However, typically the data we put into those categories fall under what we call numeric data: numbers. And within this numeric data, we can have what are called continuous data sets, and we can have what we would call ordinal data sets. Continuous data would be data that you can plot along a number line, like somebody's age, which is kind of chronological, or a test score.
Ordinal data would be like, well, how many people meet this particular criterion: males versus females, or the number of students that got 50% or less on the test, or the number of students that scored 90% or higher on the test, okay? That would be more ordinal data, where it's just a count. Yes, you might be able to plot that number on a number line; you can plot any number on a number line. But when it's referencing just how many of something, that's what we call ordinal data.
Now, when we get that data, part of the analysis in the scientific process is that the kind of data you get, to a degree, determines how you're going to analyze it. And so one of the things we do when we have categorical data is look for percentages. We do counts.
How many fall into this category? And when you graph something like that, if you're graphing percentages, that's where these pie charts come in. So maybe 15% fall into this category, 60% fall into this category, and 25% fall into this category. That's the only thing pie charts are good for: showing percentages.
However, if you wanted to compare average test score for group 1 versus group 2, that's where a column graph would come in. So test score, the average test score for group 1 was here. Let's say that's 90%. And then the average test score for group 2, maybe that's 85. So that's how you display some of this data.
Even though these scores, 90% and 85%, would be continuous data, because we're focusing on each category here, that's why we're choosing a column graph. And we're going to take some time to talk about graphing in three or four weeks or so.
I'll come back to some of that. I don't expect you to understand everything about graphing just yet, but that's how we would graphically represent categories of data. As far as numerical data goes, a lot of times mainly we're talking about continuous data.
Like if we had 50 test scores, 50 students and their test scores on exam one, that's continuous data. There are ways that we can look and analyze that data. And one of the important things to look at is called the spread or the distribution of that data. The measures of central tendency. Now we've done the mean, right?
That's the average. You add up all the numbers and divide by how many? Okay, that gives you the average.
That doesn't tell you everything about the data set. It just tells you what the average is. It doesn't tell you what the highest score was, what the lowest score was, how many scores are in between.
It doesn't tell you where the score in the middle was, and it doesn't tell you what grade occurred most often. What do we call the score in this example that, if we were to line them up in numerical order, is the score right in the middle? What?
The median. That's the number in the middle, the score in the middle. If you were to line them up in numerical order, that's the median. Hopefully that median score, out of all of those, is pretty close to the mean. Now, we're not going to get into when or how to compare those.
That's a statistics thing, but there are ways to look at a set of data. and say, well, the mean is this, but the median is way far away. So what does that tell us about our data?
And what's the one that occurs most often? What? The mode, okay?
That's the score that occurs most often. And these are things that I do when I analyze our exams. I look at the mean, median, mode, look at the distribution. The range simply is looking at the difference between the highest score or the highest value and the lowest.
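If you want to check these measures on a computer, Python's built-in statistics module will do all of them. Here's a minimal sketch, using made-up scores rather than our actual exam data:

```python
import statistics

# Made-up exam scores, just for illustration
scores = [99, 85, 85, 72, 68, 55, 20]

mean = statistics.mean(scores)          # add them up, divide by how many
median = statistics.median(scores)      # the middle score, in numerical order
mode = statistics.mode(scores)          # the score that occurs most often
data_range = max(scores) - min(scores)  # highest value minus lowest value

print(mean, median, mode, data_range)
```

Here the highest score is 99 and the lowest is 20, so the range works out to 79.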
So if the highest score in the test was 99, the lowest was 20, the range in this case would be 79. Now we may do a little bit with range depending on what your project is. I may ask you to find a way to display and represent that data when you do your presentation. Now standard deviation is kind of a measure of the accuracy and precision.
I probably spelled that wrong. I don't have spell check in my hand. In other words, it's a way that you can gauge your repeatability.
Are you getting the same numbers every time? When you compare group A, group B, and group C. Remember I talked about the sponges last time?
when I talked about replicates. So I have three groups of sponges that I grew and there's let's say 15 sponges in each petri dish. What I would do is I would look at how close the size of those sponges are to one another.
Theoretically they should all be pretty close to the same size. So the standard deviation is a way to check how good your repeatability is and how accurate and precise you are. Now the way to represent this data.
Okay, a histogram would be: all right, how many people got this score? How many people got that score? So maybe I have the people that scored 50 to 59 percent, and 60 to 69 percent, and 70 to 79, et cetera.
That's what a histogram is. It's simply showing how many fall within certain categories. Now, a box plot or a box and whisker plot shows you distribution.
kind of what the highest and the lowest is, where the middle 50% of your data fall, and then where your mean is. Now, a box plot is actually a statistical tool. A box plot will tell you if there are any data points that are what we call outliers, that fall outside of where your data really should be. Some of you, I may ask to do a box plot for your final project. I will show you how to do it.
They're very easy. This helps you to see if there are any outliers. And if you have a data point that's way out of whack, why is that?
You can go back to your data, you can go back to your notes, and say, well, this person, this is when a door slammed and they were distracted, so they weren't able to complete what they were supposed to in the amount of time they were given. And so that's why this one is so much larger than the rest.
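For what it's worth, the outlier rule a box plot uses is easy to sketch in Python with the built-in statistics module. This is just a sketch with made-up numbers; the 1.5 × IQR fence is the usual convention:

```python
import statistics

# Made-up measurements with one value way out of whack
data = [4, 5, 5, 6, 6, 7, 7, 8, 20]

# Quartiles: the box in a box plot runs from Q1 to Q3
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1  # spread of the middle 50% of the data

# Common rule: anything beyond 1.5 * IQR past the box is an outlier
low_fence = q1 - 1.5 * iqr
high_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < low_fence or x > high_fence]

print(outliers)  # the 20 is flagged; everything else is inside the fences
```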
Okay. I don't think I really need to spend time talking about calculating the mean; we know how to do that.
As far as standard deviation goes, don't worry about definitions, okay? We're really probably not going to do a whole lot with it. I might ask you to calculate it on your final project.
We'll deal with that at the time. However, I do want to spend a few minutes on this diagram, just giving you an idea of what I mean by precision and accuracy and repeatability, okay? So if this first bullseye here, number one, represents all of the sponges that I grew.
Notice that they're all over the place. There's no consistency in the size of these sponges. And if that were my data, that might tell me that, you know, something didn't go right with this experiment.
You know, let's say my target was this red bullseye. That's the target size of the sponge. Now, maybe that red represents a certain size. That's not really good data. Now, bullseye diagram number two here.
Notice that all of my data points are clustered together. Well, that's pretty good. That means you're doing something right.
You're repeating the same thing and you're getting the same results every time. However, you're just not necessarily within your target for some reason. This isn't necessarily bad; sometimes you don't know what your target is.
It just depends on what you're doing. But if you're looking at repeatability, and you do have a goal in mind, you're obviously not making your goal.
You're very precise. You're getting consistent results. You're just not accurate. You're just not meeting the target. This one over here, you're pretty accurate.
They're all pretty consistent within the first two rings, but they're not really packed closely together. So you're not very precise. This is what you want. All of our data points are clustered closely together. They're at our target area.
where we want them to be. So that says, yeah, we've got pretty good data. We did a good job with our experiment, at least as the initial analysis indicates. So this last one here is kind of what you're likely going to see: you got some pretty good accuracy and precision, but maybe you have a couple of outliers. Depending on what you're doing, a box and whisker plot would show you that, yeah, these are outliers.
That, you know, obviously something went wrong here. Let's try and figure out what. Something you can then add and discuss in your paper. Why do we see these outliers? What do you think happened?
And how can you correct that for next time? Okay. So this is essentially what standard deviation is.
It's looking at accuracy and precision and repeatability. Now, I'm not going to plot all these numbers on the number line, but here's one more example of how this works. Let's say I have two sets of data, and, I know it's weird, but let's say these are test scores from the same test. This is group one, and this is group two, and these are all their test scores, for whatever reason.
When we take the average of those test scores, we find that the average test score is a six for both of them. If that's all the information you have, that group one and group two both averaged six on this test, that doesn't tell you a whole lot about the data.
However, if you knew at least the standard deviation, you'd be able to predict which data set, which set of test scores, is more closely clustered around that mean value and not spread out as far. That's another thing the standard deviation can tell you. So this one has a standard deviation of 2.16, and this one has a standard deviation of 4.5. The lower standard deviation means that each of the individual test scores in the first set is more closely clustered around that mean value than in the second set. The range of set one is 6, and the range of set two is 14. So the data are more spread out in set two, whereas in set one they're closer together.
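You can see that same idea with a quick Python sketch. These numbers are made up rather than the ones on the board, but both sets average six:

```python
import statistics

# Two made-up sets of test scores with the same mean
set_one = [4, 5, 6, 6, 7, 8]    # clustered close to the mean
set_two = [1, 2, 6, 6, 10, 11]  # spread much farther out

print(statistics.mean(set_one), statistics.mean(set_two))    # same mean
print(statistics.stdev(set_one), statistics.stdev(set_two))  # set_two's is larger
```

Same mean, but the bigger standard deviation immediately tells you set_two is more spread out.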
And again, depending on what kind of questions you're asking, what you're doing in your research, that could be an important piece of information. Now, you can compare the standard deviations of two sets of data to help you determine which set of data has less spread and which data are more closely arranged around the mean. But you can only do that if the data are the same kind. Meaning, if group one was Biology 252 exam one and group two was Biology 152 exam one.
Well, I can't compare those data sets at all, because they're two completely different exams. However, if group two was another group of students that took exam one of Biology 252, then I could say, yeah, group one definitely has a smaller range, and its data points are more closely clustered around that average value as opposed to group two. That's the only time you can take two or three or four standard deviations and compare them: for, I wouldn't say accuracy, but distribution, and to help judge where each data point is in reference to the mean. All right, so let's get to some examples and get into our statistics.
So here's an example. And what I'm going to do is I'm going to go through some examples and we're going to get on the computer and we're going to do some things and we're going to kind of go back and forth a little bit. All right, so here's a research question.
Is there a difference in the mean oral temperatures of males and females? Pretty generic question, but it's something we could potentially answer with the data we collected last week. Now, if we were to write the hypothesis in the long form, we would write it in the if-then-because format. We're not going to do that here, but let's just make a prediction that males are going to have a higher temperature, because typically males have more muscle mass.
I think there's some research that may indicate that. Don't know if that's the case, but let's just go with that. So we're going to make just a general hypothesis.
Males, due to higher muscle mass, in general. It's a generalization. So before we do our statistical test, obviously you have to have a research question, and you're going to have to have a research hypothesis. This is what this would be.
This would be a research hypothesis, and I'm calling that to your attention because in a few minutes we're going to talk about what's called the statistical hypothesis. So obviously when we are analyzing this data, we're going to look at, there's our male bucket, female bucket, and then we have their average temperature in each one of these. And so we want to compare those average temperature values that we measured to determine are those differences in the average temperatures due to the fact that one group is a group of males and one group is a group of females? Or are those differences due to simply random chance? Remember, way back at the beginning, I said we're talking about reality or randomness.
And so that's where... Our statistics come in that. So we're going to be comparing these average values.
So let's say that our average male temperature was 99.3 and the females' was 98.8. Now, the chances that both groups are going to have exactly the same mean? Not very likely. And so when we look at those values, 99.3 Fahrenheit and 98.8 Fahrenheit, those are just values.
Yes, one is bigger than the other. But just by looking at those, can we say that, yep, because the male average temperature is higher than the females', this research question is true, that we've proved our research?
No, we can't. Because the statistical test will tell us if those values, if those differences, are due to something, in this case gender, or if those differences are just simply random chance.
So obviously let's say when we did this last week we collected a variety of data, age, gender, oral temperature, tympanic. We also collected the non-contact infrared temperature and so for this particular situation we would need to know male or female and we'd also need to know what their oral temperature was. How would we classify gender as far as data type?
Categorical or numeric? It's a category, it's a bucket. And within that bucket, we have numbers, we have values, oral temperature.
These would be continuous numeric data, because we can plot those on a number line, kind of on a continuum. Now, if we were to count the number of males that had a temperature within this range, within this range, and within this range, that would be ordinal data, okay? That's a different statistical test. But here we're looking at a category and a set of numeric data. So we're looking at categorical data and continuous data. So there's a statistical test that we would do with those data points, and we would get a value, and then based on that we'd have to determine: okay, are these differences due to the fact that one group's male and one's female, or due to randomness?
We could also analyze other combinations of data. We could compare the three temperature measurements from everybody and determine whether there are differences in the oral, tympanic, and infrared temperatures, and whether those differences are due to the fact that we measured them in different places, and different places in the body don't give off as much heat.
We could look at age ranges and temperature. Are there differences in people's body temperatures based on their ages? Okay, so there's other questions we could ask and try and answer with the data we have.
Now, this is a similar chart to what he had in his video. Depending on the data you have, how many categories, and how many sets of numeric data, that's going to determine what test you do. Now, we are not going to do either of these first two; you probably did them in your statistics class, like a chi-square test, that type of thing. You're going to do this one here for homework, and this would be what's called a single-sample t-test, where you're comparing the oral temperatures against a hypothetical value. So if you found a research article from 1950 and they had an average value in there, but they didn't include their original actual data, just the average, you can run this single-sample t-test comparing your data against that average and get a result that might help you figure out whether, yeah, your data are the same
as that average. Now, how that test works, I don't know. You're only going to do one this semester, and it's for homework.
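If you're curious what that homework test looks like on a computer, here's a sketch using Python's scipy library, assuming you have it installed. The temperatures and the 98.6 reference value are made up for illustration, not our class data:

```python
from scipy import stats

# Made-up oral temperatures from a sample of people
oral_temps = [98.2, 98.6, 98.9, 99.1, 98.4, 98.7, 99.0, 98.5]

# Single-sample t-test: compare our sample against a published average
result = stats.ttest_1samp(oral_temps, popmean=98.6)

print(result.pvalue)  # a large p-value: can't say our data differ from 98.6
```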
I just thought I'd throw that in there, something different. What we're going to focus on this semester is what happens when you analyze numeric data, okay, and categories at the same time, like gender versus temperature or age groups and temperatures. So these are where the t-tests are going to come in.
A t-test is when you have two groups, two samples. So you have group A and group B, but the data within each group is the same kind: the same continuous data, like test score or temperature. However, there are ways that we can compare more than two groups, looking at their temperatures and comparing those temperatures. And that's what's called the ANOVA test.
I mentioned that before. That's when you have more than two categories that you're testing. And we're going to do that today as well. And this last one we're not going to do today. This is comparing two sets of numeric data, typically two sets of continuous data, things that can be plotted on a number line.
And that would be the correlation and regression that I mentioned earlier. So if we're comparing age and oral temperature, looking at not age groups, but very specific ages, the results of that test would be able to help us determine whether or not as someone gets older, their age will cause their temperature to go up or down. And we'll work with that in a few weeks. We're going to start with that. So we're not going to do that today.
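To make that chart concrete, here's what an ANOVA like the one we'll do today looks like in Python, again assuming scipy is available. All three temperature lists are made up for illustration:

```python
from scipy import stats

# Made-up temperatures measured three different ways on the same people
oral     = [98.6, 98.9, 98.4, 98.7, 98.8]
tympanic = [98.2, 98.5, 98.1, 98.4, 98.3]
infrared = [97.8, 98.0, 97.7, 98.1, 97.9]

# ANOVA: one category (measurement site) with more than two groups
f_stat, p_value = stats.f_oneway(oral, tympanic, infrared)

print(p_value)  # a tiny p-value: the differences are likely not random
```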
So for the question we asked originally here, I said that males are likely going to have a higher average temperature. But what we need to do every time we do a statistical test is determine what our statistical hypothesis is. There are two of them, maybe three. The statistical hypothesis helps us to determine whether or not our research hypothesis is correct.
So let's just go through this and we'll talk about some examples. So the first one, this H0, is what we call the null hypothesis. And the H1 is called the alternative hypothesis.
Now, just to let you know ahead of time: the result of the statistical test, at least the ones we're doing today, helps us make a decision about whether or not our data show that the difference is real or due to randomness. But we have to have these two hypotheses, because sometimes the difference could be random, and sometimes it could be real.
So the null hypothesis is what we would call kind of the fallback position. And as it relates to what we're talking about, as far as male and female temperature, and for most things, the null hypothesis is going to say: there is no difference between the data sets. Or you could say there is no change between group A and group B, or something like that.
So obviously, if we go back to this, as far as the hypothesis goes, we have three choices as far as answers to this research question, don't we? We have three potential predictions we can make.
We made one. Males are going to have the higher temperature. What's another one?
Females have a higher temperature. What's the third? There is no difference in the temperatures.
That effectively, males and females are not going to have a difference in their temperatures. So based on that, this is where the null hypothesis comes in. The null hypothesis, for our purposes, is always going to be: there is no difference, there is going to be no change, in whatever. Which then makes the alternative hypothesis mean that there is a difference. Now, this is a very generic statement that I'm writing here.
There is a difference, or there is a change, between group A and group B. It just depends on what your question is.
So to write it out fully, we could say for the null hypothesis: there is no difference in the average oral temperatures of males and females. For the alternative, we could write: there is a difference in the average oral temperatures of males and females. Now notice that the alternative hypothesis didn't specifically state that males would have a higher temperature or that females would have a higher temperature.
You don't necessarily have to write it directionally, where you're picking one. Sometimes you do, and that can affect how you interpret your results, so we'll deal with that as it comes up. But for the most part, I would write your alternative hypothesis as: there is a difference between the mean oral temperatures of males and females. Now let me give you another example to maybe help you understand this. So someone goes to trial.
You have the prosecutor presenting evidence to the jury about why this person is guilty and then you have the defense that presents evidence to the jury or rebuts the evidence provided by the prosecutor to show why the defendant is not guilty, right? So when the jury deliberates and they come back, what are the choices that they can come back with? I guess there's three choices.
They could come back guilty or not guilty, or they can come back where they don't agree, and they get a hung jury and a mistrial, and they've got to do it again. Basically, the jury is the statistical test. They're weighing the evidence presented. Our statistical test, the t-test or the ANOVA or the correlation and regression test, is interpreting the evidence, the data that we're providing it, and coming up with a result.
So if the jury comes back and says guilty, it means that the evidence presented is strong enough to show that this defendant committed this crime. That would be the alternative hypothesis. But if the evidence doesn't show that the defendant committed the crime, or there wasn't enough evidence to support that, What does the jury come back with? What does the jury say?
They say not guilty. That's their null hypothesis. That's their fallback.
If the evidence doesn't support the person having committed the crime, well then, they're not guilty. If the data we collect don't support our hypothesis that there is a difference, then we fall back to the null hypothesis.
Okay, any questions about that? So these are going to be important. You do want to make sure that you think about these null and alternative hypotheses.
when you're running your statistical test. Okay, now, the result of the statistical test we get is called a p-value, and that can range anywhere from 0.000000 all the way to 1.000000. That p-value we get is basically the probability that the difference we're seeing is due to random chance rather than real. That's essentially what the p-value is.
What's the likelihood our data are real? What's the likelihood that the differences we see are due to random chance? Hence the p in p-value: probability.
Now, the value that we actually get tells us the level of randomness, the probability of the differences being due to random chance. For example, say we got a p-value of 0.0971. Now, by the way, yes, there's always a zero before the decimal, but you don't have to say it. You don't have to write it. Don't say "zero point zero nine seven one"; I think it just confuses people.
"Point zero nine seven one" is fine. So if we have a p-value of 0.0971, that essentially means there's a 9.71% chance that the differences we're seeing are due to random chance. If we had a p-value of 0.45, that means the probability that the differences we're seeing are due to random chance is 45 percent.
So when you see that p-value, it gives you the answer to the question of what is the probability that these differences we're seeing or the changes we're seeing before and after are due to random chance. What we have to do now is come up with what's called the alpha value. What's an acceptable value of randomness that we're okay with?
It's like gambling. What's the probability that you're going to make money playing blackjack? Same idea.
What are you going to be okay with? Are you okay with a 50% chance that the results are due to random chance? No. 80%? 90%? So I always get the question: well, how do we know?
It's kind of based on your discipline. So in some disciplines, like the social sciences, education, psychology, sociology, it might be 90%, meaning there's a 10% probability of randomness we're okay with. Or 95%, meaning that we're okay with a 5% probability that the results are due to random chance. In some instances, it's 99% or 99.9%, meaning that you're only okay with, let's see, what would that be?
This would be a 1% chance of randomness, and this would be a 0.1%. The probability that the differences are due to random chance is 0.1%. That's a pretty high standard.
Unless you're told otherwise, we're going with this. So basically, if we're going with 95% confidence, that means our alpha is 0.05, and we want a p-value that's going to be less than 0.05, meaning that the probability that the differences we're seeing are due to random chance, that they just happened to happen that way, is 5% or less. We're okay with that.
Because that also means, in a way, that there's a 95% probability that the differences we're seeing are due to the fact, in this situation maybe, that males have a higher body temperature. Because they're male.
There's something about them. So when we do the statistical test, you're going to get p-values and then you have to interpret that. Now based on the p-value you get, that's going to tell you which of these statistical hypotheses to pick. So if that p-value is less than 0.05, you're going to pick the alternative. You're going to go with that.
Yes, there is a difference. The data show that there is a difference between group A and group B, and most likely that difference is due to the fact of whatever. If that p-value is greater than 0.05, like 0.05001,
0.097, 0.064, or 0.888, that's 88%, then you're going to go with the null. That means you don't have enough evidence to say the difference exists. So you have to fall back and say, well, at this point in time, there is no difference.
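That decision rule is simple enough to write down as a few lines of Python, just to make it concrete:

```python
# The decision rule: compare the p-value against our alpha of 0.05
def decide(p_value, alpha=0.05):
    """Pick the statistical hypothesis our data support."""
    if p_value < alpha:
        return "alternative: there is a difference"
    return "null: there is no difference"

print(decide(0.012))   # 1.2% chance of randomness: go with the alternative
print(decide(0.0971))  # 9.71% chance of randomness: fall back to the null
```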
However, maybe we only had 20 people in our study. Let's get 100 people and see what happens. Maybe we just didn't have enough data.
We're actually going to see that today. That sometimes when you have a lot more data, a lot more evidence to provide to the jury, you get a different result. Okay, so we're looking at is there a difference in the male, female, at least that was our original question.
Kind of falls in this. We're going to use a t-test. Okay, because we're just comparing group A and group B. There's one category, gender, and one set of numeric data, temperature.
We're going to do a t-test.
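When we get to the computer, the website will do this for us, but here's roughly what that t-test looks like in Python for anyone curious, assuming scipy is installed. The temperatures are made up, not the data we collected:

```python
from scipy import stats

# Made-up oral temperatures for two groups
males   = [99.3, 99.1, 98.9, 99.5, 99.2, 99.0]
females = [98.8, 98.6, 99.0, 98.7, 98.9, 98.5]

# Two-sample t-test: two groups, the same continuous data in each
t_stat, p_value = stats.ttest_ind(males, females)

if p_value < 0.05:
    print("Go with the alternative: there is a difference.")
else:
    print("Go with the null: no difference shown.")
```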