But there's one word here that this was ultimately hinging on: we were taking a survey of people. And so one of the big questions we have when it comes to summarizing what's happening in my population is: was this survey done well? Ultimately, surveys can either be done well or done poorly, and a bad survey occurs when there was some sort of bias in how your data was collected, because a biased method will produce an untrue value. So, what are the different biases that can occur when people run a survey? Probably the most common is a sampling bias, which is a way of collecting your data where the sample does not represent the population. Now, there are actually some subgroups that fall under sampling bias, so let's talk about what those subgroups are. The first subgroup is a voluntary response bias: people who have strong feelings about the subject will be the only ones who respond. So for instance, I love boy bands, and if I see some sort of polling about who was the best boy band of all time, I'm gonna go respond to that, because I want to make sure people know I think *NSYNC is the best boy band of all time. Voluntary response: you will only respond if you feel strongly about it. The flip side of that is a non-response bias: people might choose not to respond for whatever reason. They might refuse to answer out of fear of retribution or fear of being exposed, and even wanting to hang up on telemarketers counts as refusing to answer.
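To see why voluntary response matters numerically, here is a minimal sketch in Python. All the numbers are made up for illustration: a hypothetical population where 30% are "in favor," and where people in favor are assumed to be six times more likely to bother answering the poll.

```python
import random

random.seed(0)

# Hypothetical population: 30% are actually "in favor" (1), 70% are not (0).
population = [1] * 3000 + [0] * 7000

# Voluntary response: people with strong feelings (here, those in favor)
# are assumed far more likely to answer the poll than everyone else.
def responds(opinion):
    return random.random() < (0.60 if opinion == 1 else 0.10)

responses = [p for p in population if responds(p)]

true_share = sum(population) / len(population)
survey_share = sum(responses) / len(responses)

print(f"true share in favor:  {true_share:.2f}")
print(f"poll's estimate:      {survey_share:.2f}")
```

The poll's estimate lands far above the true 30%, because the sample over-represents the people who felt strongly enough to respond. That gap is the bias.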
The last subgroup is a convenience bias. Convenience biases are more practically about how you are reaching the people you are trying to survey. So, for instance, let's go back to my silly boy band example, and let's say someone wanted to do research about what is the most popular music in America. But how did they collect this data? Well, they stood outside of a concert hall, say one featuring a boy band. So of course, all the people coming in and out of that concert are going to be like, "Oh, I love boy bands." It's a convenience bias because the sample is made up of people who are easy to reach. Those three are all examples of sampling bias, but there's also one other type of bias, and this bias has to do with the wording. If the survey questions are misleading, use words designed to trigger a specific response or emotion, or are just worded in a confusing manner, we call that a measurement bias. And so what I need us to do is be able to identify when biases occur in a sample, so that frankly, if you see that bias in how your data was collected, you throw that data out the window, you burn it, you never ever use it, because you will automatically realize that sample is tainted. And so that's what we're going to do for example four: we are going to identify possible biases. A student asks all 250 Facebook friends if they prefer Facebook over Twitter. What type of bias do you guys think we have here? This is absolutely a convenience bias, because you are asking about Facebook on Facebook. And more so, those who are on Facebook clearly like Facebook, so they are going to feel strongly about wanting to support it, which is a touch of voluntary response as well. So this is a really good example where two different biases were actually present, but I would definitely agree the number one bias here was convenience bias.
A researcher asks 500 randomly selected people, "Are you in favor of the unfair tax burden that hardworking successful business people have so that the lazy unemployed can receive a paycheck without working?" What type of biases do you guys see here? 100% this is a measurement bias. There are so many trigger words in this wording that are supposed to provoke an emotion and push you to feel one way or another, but I also agree, once again, this is voluntary response, because those who feel strongly about this are going to be the ones who respond. On July 4th, CNN posted a question on their website asking visitors whether they supported the current US military operations. What type of bias are we going to have from this survey? Voluntary response, because those who feel strongly about how the military is being used are going to be the ones who respond. And more so, look at who is conducting this survey and where: it's posted on CNN's own website, so only the people who already visit CNN will ever see the question. And when are they asking it? On July 4th, the most patriotic day of the year. So of course, those who feel strongly in favor are going to be the ones who respond. Ten randomly selected Americans were asked by a researcher, "Do you currently have a sexually transmitted disease?" Which type of bias do you think we'll have here? Definitely non-response. If we're talking about STDs, most people are not going to want to answer this question, for whatever reason. A researcher stood outside a grocery store and asked 250 shoppers, "Do you eat at a restaurant at least three times per week?" This is definitely a convenience bias. But what's interesting about this convenience bias is that it wasn't designed to gather people who are in favor of what you want. See, example A was a good example of a convenience bias where you host your survey somewhere you'll get lots of positive feedback.
But the reason why I like example E is that it's an example of convenience bias where you will get negative feedback. Those who are at a grocery store are obviously buying groceries to cook at home, so in this case, the survey will get a skewed negative response about eating at restaurants. And the point of putting these examples side by side is to emphasize that convenience biases are not just about going to places where you'll get positive responses; they can also produce negative ones. Let's try one more. Gallup, which is a survey company, randomly selected a thousand phone numbers from the Yellow Pages and called to ask if people supported government funding of high-speed rail. What type of bias do we have here? I think most of us feel like non-response is a big one, because if someone randomly calls us, most of us are probably going to hang up on what feels like a telemarketer. But some people will listen, and if they feel strongly about what the government wants to fund, whether in favor or against, they might then choose to respond. There's actually one more hidden bias here, though, and it's the convenience bias. Why? Well, look at how they selected these numbers: from the Yellow Pages. For any of you who don't know what the Yellow Pages are, it's a printed phone book that lists landline numbers. Landlines, meaning not cell phones: the phones in a building that are connected to the wall and to a wiring system linking all the landlines together. And so my question for you: how many of you have a landline? I don't. I only have a cell phone. The majority of us don't have a landline. So this is yet another example of a convenience bias, because this sampling method completely excludes all of us who only have cell phones.
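The phone-book example can be sketched the same way. This is a hypothetical setup with made-up numbers: suppose landline households support rail funding at 30%, cell-only households at 60%, and the phone book can only ever reach the landline group.

```python
import random

random.seed(1)

# Hypothetical population: 2,000 landline households (30% support)
# and 8,000 cell-only households (60% support).  1 = supports rail funding.
landline = [1 if random.random() < 0.30 else 0 for _ in range(2000)]
cell_only = [1 if random.random() < 0.60 else 0 for _ in range(8000)]
population = landline + cell_only

# Sampling from a phone book of landlines can never reach cell-only households.
sample = random.sample(landline, 500)

true_support = sum(population) / len(population)
sampled_support = sum(sample) / len(sample)

print(f"true support:        {true_support:.2f}")
print(f"phone-book estimate: {sampled_support:.2f}")
```

The estimate sits well below the true level of support, not because anyone lied, but because an entire demographic was never reachable in the first place.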
Sometimes a convenience bias means going somewhere you'll get all positive responses, sometimes it means going somewhere you'll get only negative responses, and sometimes it means completely ignoring a huge demographic of your population. Ultimately, the purpose of example four is for you to be able to identify when we have a bias. Why? Because if you read the context of the problem and see a bias, you need to throw whatever result they find out the window. It was a bad survey, it has bias, and it's not going to produce a true value.