Transcript for:
Understanding Pseudoreplication

You’ve learned why replication is important. But it’s not always as easy as it looks. Sometimes you think you’re replicating an experiment, but you’re not. You need to worry about pseudoreplication. That’s the topic of this module in Experimental Design. [credits]
In a perfect experiment, we’d measure the entire population of living creatures in the world to see how some factor affects them. That’s never going to happen. It’s just not possible. Because of that, we have to rely on samples to inform our understanding of the population at large. As we learned from our last module, the larger the number of subjects we measure in an experiment, the better. The more times we repeat the experiment on different subjects, the better, too. Pseudoreplication can occur when we think we’re measuring new subjects or taking new measurements, but we’re not. If the measurements and samples are not independent of each other, that can be a real problem.
Let’s go back to our squirrel example again. Are male squirrels bigger than female squirrels? Now, if we take four squirrels and measure them each 25 times, that’s not the same thing as taking one hundred squirrels and measuring them once. In the latter, we’re getting a decent sense of how a bunch of squirrels differ in size. In the former, we’re just collecting a ton of data on four squirrels, who may be outliers. Measuring the same four squirrels over and over doesn’t improve our overall accuracy. It just improves the chances that we’ve measured those four squirrels really accurately. Measurements need to be independent, and in this case they’re not.
Just as important, however, is that the samples for replication of entire experiments need to be independent. If we only use squirrels raised at one facility, for example, then even if we run many experiments on hundreds or thousands of squirrels, it’s likely that the results aren’t replications, but pseudoreplications.
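The squirrel comparison can be sketched as a small simulation. This is only an illustration under made-up assumptions: the mean mass, the squirrel-to-squirrel variation, and the measurement noise below are hypothetical numbers, not data from this module. Both designs collect 100 readings, and the code asks how much the estimated population mean bounces around from one run of the experiment to the next.

```python
import random
import statistics


def spread_of_mean_estimate(n_squirrels, n_measures_each, trials=2000, seed=0):
    """Repeat a simulated study many times and return the standard
    deviation of the estimated population mean across those repeats."""
    rng = random.Random(seed)
    true_mean = 500.0   # hypothetical average squirrel mass, in grams
    between_sd = 50.0   # hypothetical variation from squirrel to squirrel
    within_sd = 5.0     # hypothetical noise in a single measurement
    estimates = []
    for _ in range(trials):
        readings = []
        for _ in range(n_squirrels):
            # Each squirrel has its own true size drawn from the population.
            true_size = rng.gauss(true_mean, between_sd)
            # Repeated measurements of one squirrel share that true size.
            for _ in range(n_measures_each):
                readings.append(rng.gauss(true_size, within_sd))
        estimates.append(statistics.fmean(readings))
    return statistics.stdev(estimates)


# Same total number of readings (100), very different designs.
spread_four = spread_of_mean_estimate(n_squirrels=4, n_measures_each=25)
spread_hundred = spread_of_mean_estimate(n_squirrels=100, n_measures_each=1)

print(f"4 squirrels x 25 measurements: spread = {spread_four:.1f} g")
print(f"100 squirrels x 1 measurement: spread = {spread_hundred:.1f} g")
```

Under these assumptions, the four-squirrel design gives a noticeably less stable estimate even though it has just as many readings, because the repeated measurements are not independent of each other.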
These problematic sources of pseudoreplication come in a number of flavors. As always, let’s consider a mouse model and a human model. With mice, common enclosures can be a problem. Mice living in the same cage might share some common factor that is important. Similarly, people living in the same house or building could have a common problem or factor that is interfering with the results of an experiment. Common environments are a problem. If you collect your mice from the wild, and all of them come from an area where a certain parasite is common, maybe that’s interfering with the experiment. Perhaps you’re studying people who all live in the same climate, or in an area of terrible pollution. This could easily disrupt a study of asthma, for instance. Genetics is a really big source of pseudoreplication. If all of your mice are from the same parents, there might be some factor in there that changes the results of an experiment. The same thing can happen in people. Time can be a factor. Maybe mice act differently in the morning than in the afternoon. You may need to perform experiments at different times of the day, in different seasons of the year, at different times in a mating cycle, perhaps, or at different points in their lives. Are you getting the picture? Too often, when we think we’re replicating, we’re pseudoreplicating. We become convinced we’ve got robust results. But then, someone else does the experiment, and can’t replicate our findings. That’s bad.
What can we do? A number of things. The first is to recognize that this isn’t a problem that can be fixed just with statistics. You need experts in the appropriate scientific fields to do this. Let’s consider the mouse experiment. Experts in mice might know which of the factors I mentioned are important. They might point them out.
They might tell you to make sure that you replicate your experiments with mice from different genetic lines, that you use both males and females, and that you take mice from different environments and locations. In human experiments, this might mean making sure that your experiments involve people of different races, from different socio-economic classes, and from different parts of the country. You might need to make sure your results hold in people of different genders, of different sizes, and of different ages. Maybe you need to make sure that they’re not all sick, or all healthy. Or that they don’t all drink the same amount of alcohol, use the same drugs, or have the same level of physical fitness. Avoiding pseudoreplication is hard. And it’s not always possible to avoid it. But identifying sources is an excellent start. Sometimes you can fix them and sometimes you can’t. But you always need to do the best you can and identify the places where you can’t.
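One rough way to start identifying sources is to list what you know about each subject and see which factors never vary. The sketch below is only an illustration: the function name, the factor names, and the four mice are all hypothetical, and deciding which factors actually matter still takes an expert in the field.

```python
from collections import defaultdict


def single_level_factors(subjects):
    """Return the names of factors that take only one value across all
    subjects. Each of these is a place where a shared condition could be
    hiding behind what looks like replication."""
    levels = defaultdict(set)
    for subject in subjects:
        for factor, value in subject.items():
            levels[factor].add(value)
    return sorted(name for name, values in levels.items() if len(values) == 1)


# Hypothetical records for four mice in one study.
mice = [
    {"cage": "A", "litter": "L1", "sex": "F", "tested": "morning"},
    {"cage": "A", "litter": "L1", "sex": "M", "tested": "morning"},
    {"cage": "B", "litter": "L1", "sex": "F", "tested": "morning"},
    {"cage": "B", "litter": "L1", "sex": "M", "tested": "morning"},
]

flagged = single_level_factors(mice)
print(flagged)  # ['litter', 'tested']
```

Here every mouse comes from the same litter and was tested in the morning, so those two factors are flagged. A list like this doesn’t fix anything by itself, but it shows where the results can’t yet be said to hold.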