Correlation vs. Experimentation

Hi everyone, and welcome to Module 0.4, Correlation vs. Experimentation. Pause here if you need the vocab.

What is a correlation? A correlation is a measure of how two variables are related, linked, or associated with each other. In a correlation, we cannot say that one variable causes the other. There might be a cause-and-effect relationship, but a correlational study will not give us that information. Write this down, it's super important: correlation does not imply causation. We cannot say cause and effect in a correlational study.

For the purpose of this mod, and for all of our mods going forward, we will probably hear this word a lot: variable. A variable is anything that we can measure. We often use correlational studies for things that we can't ethically test in an experiment, for example, pregnancy and tobacco smoking. We cannot ethically require a group of pregnant women to smoke or chew tobacco, so we would have to do something like a correlational study, where we collect data: Do you smoke? If so, how many cigarettes a day? How much do you vape a day? Whatever it is, we get that information and see if there's a correlation. That doesn't mean that tobacco use causes, let's say, premature birth, but there is a correlation. There is an increased risk, perhaps; there's a link between the two.

How do we measure this? What does it look like? We collect our data, and all the dots in the graph you see are different data points. If we're talking about that pregnancy example, imagine that each one of these dots is a data point. When you see all those dots, that is called a scatter plot.

There are positive correlations, negative correlations, and no correlation. What does this look like on a graph? You will need to know that a positive correlation looks like this, where the x and y axes show that as one variable increases, so does the other. So the more tobacco the person consumes, the more likely they are to
have a premature birth, let's say.

No correlation would look like this: the dots are all over the place and there is no line. And by the way, this is a perfect positive correlation: a straight line. In numerical form, that's called a correlation coefficient. So if they don't give you a graph but they say the correlation coefficient is +1, that is the strongest positive correlation you can get numerically. The opposite of that would be zero, which is no correlation. I'm sorry, that's not really the opposite, is it? But say there is no association between the two: you do your study, you think there might be some correlation between these two variables, and there's nothing, the data is all over the place. That would look like this on a graph, and it would be a zero numerically for the correlation coefficient. A negative correlation shows that as one variable goes up, the other variable goes down. Numerically, the strongest negative correlation would be -1.

So this says: which of the following is the strongest correlation coefficient? Pause it if you need to. The answer would be B, right? Negative, I'm sorry, .91. And the weakest, by the way, is .05. The strongest is always whatever is closest to 1 or -1; it doesn't matter if it's positive or negative.

You can pause these and test yourself. As I go through them, you want to be able to say if it's a positive correlation, a negative correlation, or no correlation, and you want to be able to put the variables into a sentence. So pause now if you need to.

This is a negative correlation: as you run more per week, your weight goes down. As one goes up, the other goes down. This one is a positive correlation: the more hours you study, the more your grade goes up. This one is a negative correlation: the more miles traveled, the less gas in the tank. And this one is a positive correlation: the more precipitation, the greater the cucumber yield.

And by the way, if I didn't say that, each one of these dots is a data point. So if they asked you how many people were in this study, you could just count: one, two, three, four, five, six. They
only had six people in the study, right? And they might ask you something like that on a test question.

Sometimes we think that two variables are correlated with each other, but when we actually work through the data, there truly is no correlation there. Here are some examples from your textbook where people believe these things exist because of confirmation bias: they've been told this, and now they look for it. A common example is that sugar increases hyperactivity. "Oh, they ate that cookie, and that's why they're being hyper." There are so many other reasons why they might be hyperactive in that moment, but we connect the two because we already have this illusory thought.

Regression toward the mean is similar, in that again our brain is trying to make meaning and connection out of things when sometimes there is no meaning to it. Sometimes it is just random. An example: your team plays particularly badly one game, so the coach yells at them, and the next game they do much better. They go back to their good playing; they regress toward the mean. You have a really good game, you go back to your average. You have a really bad game, you go back to your average. The data tells us we tend to go back to our average no matter what. But that coach who yelled at the team after they played really badly might think, "Hey, because I yelled at them, that's why they did better," and we might make that association when really it's just regression toward the mean.

In correlational studies we cannot test cause and effect, as I've said a few times now. An obvious example is the third variable problem. This is when a change in an unmeasured or unintended variable is causing a random or coincidental relationship. For example, sunburns increase when ice cream sales increase. There is a correlation between sunburns happening and ice
cream sales, but again, that doesn't mean one causes the other. The third variable here is hot weather. The third variable problem is that we never know if there is a third variable that's really at play. So maybe in our correlational study about pregnancy and smoking there's another variable that we are not aware of. Maybe women who smoke are also more likely to do X, Y, and Z, and that's actually a more direct cause of the premature birth.

We also have the directionality problem. For example, if social media use correlates with teen risk of depression, that may or may not indicate that social media use causes increased risk of depression. There's a lot out there in news headlines saying that increased social media use is correlated with teen risk for depression, and they seem to be indicating that social media might be causing that risk, when there could be many other variables at play. We cannot determine direction. We can't say whether using social media causes depression risk, or whether people who are depressed use social media more. There are so many ways to say it, but we can't tell cause and effect.

Here's just another example of it. You can pause this and talk through it. It's about interest being correlated with academic achievement, but which one came first? Is it interest or academic achievement? That's a directionality problem: which one is causing or affecting the other one? We don't know. Maybe neither. Maybe there's a third variable at play: amount of time studying.

So let's move on to experiments. An experiment is where the investigator manipulates one or more independent variables to observe an effect. This is the only type of research method that can test cause-and-effect relationships, and you'd better have this part in your notes. Correlation cannot. So in this study right here, using this histogram, think about what the cause and effect are. So it says
the effects of caffeine on how many words someone can type per minute. Caffeine and productivity are the variables. One of those is the independent variable. In an experiment, this is the one that is being manipulated; this is our cause. The dependent variable is the one that we are measuring, the one being affected by the independent variable. So dependent is the effect, independent is the cause. Pause if you need to before I say the answer.

This would be: if caffeine consumption, then productivity. You want to say it in your head like that, or out loud like that, so that you can piece together which one is the independent variable and which one is the dependent variable. This is extremely common. We can't say 100%, but there is a very strong likelihood that you will see this in your test questions. So you want to remember: "if" is the independent variable, "then" is the effect, the dependent variable. If caffeine consumption, then productivity. They will ask you what the independent and dependent variables are. Down here you can do the same: does room temperature affect test scores? If room temperature is very hot or very cold, then test scores will be low or high.

In an experiment you can have a confounding variable problem. This is where there is some outside variable that you are not trying to test, but it is influencing your results. In this example they're trying to test the effect of alcohol on heart disease, but there's this other factor, tobacco smoking, that's confounding it. In the room temperature example, let's say they have a really hot room, which could be our experimental room, a really, really cold room, and one that's just right. But let's say they test the groups at different times of day, one really early in the morning and one late at night. That could be a confounding variable that affects test scores.

When we are doing an experiment, we take our random sample that's large and representative, all those good things that we talked about already,
and now we need to do something as a next step: random assignment. This is only true for experiments. If you see this term used, that means the study we're reading is an experiment. The easiest way to say this, if you have to write it out and they ask how you could randomly assign, is that they take the random sample and draw names out of a hat to determine whether each participant is going in the control group or the experimental (treatment) group. It has to be done at random to protect against any confounding variable. Take the test-taking room: let's say you split participants up by who shows up first. Maybe being first to show up is correlated with higher IQ, and that will affect our test scores. To reduce confounding variables, we use random assignment to decide who is going into the experimental group and who is going into the control group.

So what are the experimental and control groups? The experimental group is the group in an experiment that is getting the independent variable. In this case they're testing blood pressure medicine, so the experimental group is the one getting this medicine. If the participant takes this medicine, then their blood pressure will be reduced. So they're going to get the "if," the independent variable, the medicine in this case. The control group is a whole other group to measure against the experimental group: did that independent variable make a difference, or did everyone's blood pressure change? That is called a control group, and in this case they would get a placebo, which is a sugar pill designed not to have any effect. But the person wouldn't know if they're getting the experimental medicine or not.

Why is that important? Why do we want them not to know if they're getting the experimental drug or not? We want to prevent bias, and there are two types of bias we want to prevent. The first is participant bias. We talked about participants in
previous modules wanting to please the researchers, wanting to give the answer that is pleasing to the researcher. To prevent that, we use a single-blind procedure. This is where the participant does not know if they are receiving the placebo or the actual medicine. Single-blind is for participants. Double-blind adds the researcher: the participant still doesn't know, and the researcher who is meeting with that participant, taking data on that participant, and handing them the pills also doesn't know if it is the placebo or the experimental medicine. The purpose of that is to prevent any sort of experimenter bias, where they unintentionally write extra data when a person talks about a reduction in symptoms because it supports the fact that they got the real medicine, or ignore someone in the placebo group or the experimental group because it doesn't fit the narrative they're looking for to prove that the medicine is effective.

What is a placebo? A placebo, like I said earlier, could be a sugar pill; that's the easiest way to explain it. In a control group you might take a sugar pill that you think is meant to reduce arthritic pain, and because you believe you're receiving the experimental medicine, you might actually feel less pain. That is a proven effect: a sugar pill, meaning a fake pill, can have real effects on how much pain you perceive. You can feel better from a sugar pill if you believe it works. So if someone says, "I take this supplement and it works, I feel X, Y, and Z because I take it," that could be the placebo effect. That is why it's so important that we use experimental research. We look at studies to ask: how do we know? Has this been tested in a double-blind procedure? Do we know that you're not just experiencing a placebo effect?

There's also something called the nocebo effect, where instead of, let's say, feeling less pain or less stress, or whatever the positive effect is, you get the negative: nocebo. Have you ever
seen those big lists of side effects you could potentially have on a medication? Maybe it's a really bad headache or a really bad stomachache. Your doctor might not go into all of the possible side effects, because that could actually be harmful to you. If you know that you might get a really bad migraine or a really bad stomachache, you can actually experience those negative side effects because you believe they are side effects you can have. You might actually feel worse.

Finally, when we are looking at research, we want to make sure that our research is valid. Let's say every time we do the experiment it gives us the same answer. Write this down separately: if it gives us the same answer every time, that means the research or the test is reliable. Reliable means consistent. So if every time a mouse gets on a scale it says the mouse weighs 600 lb, that is consistent. The little mouse gets on the scale: 600 lb each time. Wow, it's always hitting that same mark; it's consistent. We need that if the test is going to be valid, if it's going to actually test what it's intended to test. But we know that little mouse isn't 600 lb, so we know the scale is probably not valid. So a test can be reliable, giving you the same IQ score every time, for example, but not be valid; maybe it's not an accurate way to measure your IQ. We want something to be reliable and valid, to actually test what it's intended to test.

Our takeaways. Correlation does not imply causation; remember, cause and effect can only be figured out in an experiment. A correlation coefficient of zero means no correlation. +1 and -1 are equally strong: +1 is a positive correlation, -1 is a negative correlation. The dependent variable is the effect; the independent variable is the cause, the thing that the experimenter is manipulating to see if it has an effect. Experimental versus control groups: remember, the experimental group is the one that
receives the independent variable; the control group does not receive the independent variable. Remember, in an experiment we randomly assign that representative sample into a control or experimental group by drawing names out of a hat or using a random number generator. It's got to be at random. And remember that our brains impact the way we see events, so that's why it's important in research that we protect against those things. That's why we use single- and double-blind experiments, why we use placebos, why we are careful about the nocebo effect, and why we prevent experimenter bias. That's all for 0.4. Thanks, see you next time.
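
A note to accompany the takeaways: the lecture describes the correlation coefficient and random assignment in words only, so what follows is a minimal Python sketch of both, not the method used in the module or the textbook. It assumes the coefficient meant is Pearson's r (the lecture does not name one). The function names pearson_r and random_assign, the participant labels, and all of the numbers are made up for illustration.

```python
import random
from math import sqrt


def pearson_r(xs, ys):
    """Return Pearson's correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of the products of each point's deviations from the two means
    co_deviation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Square roots of the summed squared deviations (the spread of each variable)
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return co_deviation / (spread_x * spread_y)


def random_assign(participants, seed=None):
    """Shuffle the sample (names out of a hat) and split it into two groups."""
    rng = random.Random(seed)  # a random number generator, optionally seeded
    shuffled = list(participants)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    experimental_group = shuffled[:half]
    control_group = shuffled[half:]
    return experimental_group, control_group


# Hypothetical data: six data points each, like six dots on a scatter plot
hours_studied = [1, 2, 3, 4, 5, 6]
grade = [60, 65, 70, 75, 80, 85]        # goes up as hours go up
miles_traveled = [0, 50, 100, 150, 200, 250]
gas_in_tank = [12, 10, 8, 6, 4, 2]      # goes down as miles go up

# A straight rising line gives a coefficient of about +1,
# a straight falling line gives about -1
print(round(pearson_r(hours_studied, grade), 2))
print(round(pearson_r(miles_traveled, gas_in_tank), 2))

# Strength depends on distance from zero, not on the sign
print(abs(-0.91) > abs(0.05))

# Random assignment of a hypothetical sample of six participants
experimental, control = random_assign(["P1", "P2", "P3", "P4", "P5", "P6"], seed=4)
print(experimental, control)
```

Passing a seed makes the split repeatable for checking; leaving it as None gives a fresh draw each time. Because who lands in which group depends only on the shuffle, something like who showed up first cannot line up with group membership except by chance.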
