Hi class. Today we're going to be talking about survey research design. If this sounds familiar, it should: we talked about it in a broader sense in our last lecture, and it's been mentioned previously throughout the course. We're going into more detail because, when you look at social science research, and psychological research in particular, survey research design tends to be one of the most widely used methodologies in modern research, given the methods and goals we often have. So it really should have some added focus in any good research methods in psychology course. We're going to be talking about some different ways of conducting research from a survey research design perspective, and then we're also going to be talking about some of the limitations and problems with it as well.

Surveys are a widely used method of gathering scientific information. Often the purpose of a survey is simply to determine how people feel about a particular issue, such as, let's say, gun control or the performance of a President of the United States. Other surveys may attempt to find out the effect of some event on people's behavior, like the experience of coronavirus on hand-washing, for example. A major function of surveys is also to dispel myths. One such myth is that women whose children have grown up and left home suffer some kind of depression called the empty nest syndrome. Lillian Rubin in 1979 surveyed 160 women in this situation and found that, rather than being depressed, virtually all of them experienced a sense of relief. Other surveys surprise us, such as by indicating how highly related physical punishment and aggression are in children. An international survey of mothers and children from six different countries showed that, in general, as physical punishment increased, so did anxiety and aggression in children, rather than an abatement of aggression.

Now, because survey research is
technical and complex, we give only a brief overview here, but it's still more in-depth than what we went into in our previous lecture. It's really important to have an idea of the techniques, because survey research is used so often.

Designing a questionnaire is a surprisingly complex procedure that involves a great many considerations. It shares many of the considerations of research design, in addition to concerns that are inherent in any written or oral form of communication. Frequently, researchers use existing questionnaires rather than designing their own instruments. Thus they avoid reinventing the wheel, and they can compare their results with those of previous studies using the same instrument. One of the things we're going to be talking about, though, is when that can't happen as easily. If you're looking at a unique topic, if you have variables that have not been studied together before, or if the questionnaire that looks most appropriate was used for a wildly different purpose, which could affect the reliability or validity of your outcome, then you would potentially want to design your own questionnaire. This is something of a laborious task. It's something I did for my master's thesis and dissertation when I was in graduate school, and it's commonly done. But there are some things to consider in research design whether you're using a pre-existing metric or designing your own, because this knowledge needs to be foundational regardless.

The first question to ask when designing a questionnaire, or using a questionnaire that's already out there, is the same as for any research: what do I expect to accomplish? I mention this because beginning researchers sometimes tend to design and administer a questionnaire without thinking through the purpose of doing the survey in the first place. Suppose the students at Valley
College are concerned about campus security, so someone might design and administer a questionnaire that shows that students are in fact concerned about the problem. Well, this is not particularly useful information. What would be useful is information about what specific things could be done to improve campus security: increasing the frequency of patrols by campus police, providing some sort of on-campus escort for people who are taking night classes and walking between their classrooms and their cars, reducing the number of entrances to buildings, or, alternatively, installing electronic security systems. In this way the administration would (a) know what changes to the campus security setup would be acceptable to the community college community and (b) be given some guidance in deciding how best to allocate resources to improve campus security overall.

One thing we really need to consider is that we should try to anticipate questions of interpretation that may arise when we have the data. There may be groups of students who see things a little bit differently based on their experiences or their perspectives. For some, an increased police presence on campus might be experienced as more threatening. It could give rise to more incidents of racial bias on campus, could give rise to more violent acts by people in authority, or could present issues based on the relationship between students and the administration. There could also be differences based on the type of student. University students who come to take a couple of classes, either over a summer session or at night, may be used to certain kinds of security setups, like closed campuses with fences and checkpoints where you need to show ID. Students who are at the college to get two-year degrees and transfer may have a different perspective because they're not so used to that. So it's important to think about different perspectives and different experiences. Answers to
very important questions are not always cut and dried when we talk about survey design, so it's important to try to think outside the box when designing a survey or picking a survey.

Now, when you think about what kind of questions your respondents could answer, it's important to note that survey questions can be divided into two basic categories: open-ended and closed-ended questions. An open-ended question permits the respondents to answer in their own words, and a closed-ended question limits the respondents to alternatives determined in advance by the questionnaire's designers.

Each type of question has advantages and disadvantages. Open-ended questions permit respondents to answer more completely and to reveal the reasoning behind their answers. Using open-ended questions makes it more likely that the questionnaire will discover something not anticipated by its designers, which could be really good or could be really bad. On the bad side, open-ended questions are harder to code. Because answers are in narrative form, it's necessary to categorize responses in some way to summarize the data, to make it analyzable, to quantify it. This must be done after the survey is complete, which makes data analysis a really messy job and makes it likely that you're going to have to break a cardinal rule of research: deciding in advance how you're going to analyze your data. In addition, open-ended questions require more effort from the respondents and are more difficult for less articulate respondents to answer.

The advantages and disadvantages of open-ended questions make them more useful for smaller or pilot (preliminary) studies, which we talked about in the previous lecture. Coding a small number of open-ended surveys may be manageable, whereas hundreds would not be. Imagine if you had 500 respondents and all 500 had a unique answer to one question, and then you had a 30-question
survey, right? So this could get really laborious, really cumbersome, and it probably isn't the best method if you're doing a large-scale study.

On the plus side, trying out a preliminary version of a survey with open-ended questions can determine the range of likely answers. If you have a smaller group, let's say 15, 20, or 30 students, and you give them open-ended questions, then you can categorize those responses. This would permit you to standardize the alternatives for a closed-ended question into a smaller, more limited grouping, because you could see the relationships between answers. You could say, okay, I've got 20 respondents that gave one type of answer and another 20 respondents that gave another type of answer, so how can I word this type of answer as a multiple-choice option? Hopefully that makes sense. If you have any questions about this, feel free to reach out to me. Essentially, you're taking open-ended answers given to you by a small group of respondents and categorizing and grouping them into closed-ended options for a later, more extensive survey designed to be given to a larger group of people.

So think about the advantages of a closed-ended survey, which you might generate instead of an open-ended survey or as a result of doing a smaller open-ended survey. Closed-ended questions have complementary advantages and disadvantages to open-ended ones. They're easier to code and analyze. There are fewer off-the-wall or unique responses, because there is only a limited number of possible responses. When we talk about closed-ended responses, we're talking about a multiple-choice questionnaire, so think of your midterm exam. The alternatives are presented to the respondents so that they don't have to think as hard. The respondents do not need to be as articulate to formulate their answers as they do with open-ended questions, but
there are disadvantages here that we can pretty readily come to see. The disadvantages of closed-ended questions are that the issues being studied may be too complex to reduce to a small set of alternatives, or the respondents may not agree with any of them, resulting in simplistic answers or answers that are just not as accurate. Closed-ended questions tend to put words into the mouths of respondents, suggesting alternatives that respondents might never have come up with themselves. Furthermore, errors can creep into the closed-ended questionnaire. If a respondent misinterprets the question, or if a clerical error is made in coding the data, there may be no way to discover what the fact actually is. To reduce errors, many questionnaires require that each response be recorded in two places, or they include redundant questions worded in slightly different ways to see whether the responses to those duplicate questions are consistent and whether the answers are really true.

So, taking it all together, the flexibility of open-ended questions makes them more useful for small-scale and pilot studies, whereas the standardization of closed-ended questions makes them more suitable for larger studies. Often the two types of questions are mixed in a single study, where respondents may be offered the opportunity to expand on their answers to closed-ended questions. This is actually really useful, because it gives you one group of data from the same respondents that is open-ended and another group of data that is closed-ended. You have data that's easy to analyze and data that's a little bit harder to analyze, and if you have any questions about the reliability and validity of the answers to the closed-ended questions, you can look to the open-ended questions to get a sense of what's going on and what the motivations of the respondents really were. It makes it a huge project, but it makes it a more accurate
one, and certainly that's something we see a lot in large university-funded studies.

Moving forward, let's outline the principles of questionnaire construction, so we can see some of the best ways to do it and also some of the major pitfalls. The principal concern of a survey questionnaire is that the items, meaning the questions you have and the response options you have if it's closed-ended, be unambiguous. Each item should address a single question and do so in a really clear manner. What would this look like? The following item is ambiguous because it's double-barreled, so just hear me out on this one: college students should receive grades in their courses because this prepares them for the competitive world outside of college. This item contains both an opinion about grading and a reason for grading. A person might agree with giving grades but disagree with the reason stated for grading students. It would be better to phrase the item as follows: college students should receive grades in their coursework. It's concrete. Another item could address the desirability of preparing students for a competitive society or a competitive college environment. Having a concrete, unbiased statement allows us to really get a sense of what the respondents actually agree or disagree with. If an item has too many different components, it becomes too ambiguous, and you don't know what they're really agreeing with or what they're really disagreeing with.

The next consideration is to write a question in a way that will not bias the results. Two members of Congress may survey their respective constituents on attitudes towards, let's say, abortion. The first one's newsletter might ask: do you believe in killing unborn babies? The second one's newsletter asks: should
a woman be forced to bear an unwanted child? So even if people in the two congressional districts had identical attitudes towards abortion, the survey results could indicate dramatically opposite attitudes, just based on the way the question is worded. If you're structuring the question in such a way that it leads the respondent towards an answer, then there's too much bias in it, and you're not going to get an accurate assessment of the views of the respondent.

A particularly good example of the effects of bias is given by a questionnaire that was administered by presidential candidate Ross Perot in his 1992 campaign. Perot reported that 99 percent of those who answered his mail-in survey answered positively to the question, "Do you believe that for every dollar of tax increase there should be two dollars in spending cuts, with the savings earmarked for deficit and debt reduction?" When the Yankelovich firm repeated the question with a scientifically chosen sample, agreement dropped to 67 percent. But agreement dropped all the way to 37 percent when they tested a more neutrally worded question: "Would you favor or oppose a proposal to cut spending by $2 for every dollar in new taxes, with the savings earmarked for deficit reduction, even if that means cuts in domestic programs like Medicare and public education?" So when you word a question to lead a respondent towards a specific answer, you're going to be more likely to get the results you're looking for, but the results may be less accurate about what's actually out there. If the goal is accuracy, if we're really doing this in a scientific way, we want to be as careful as possible to avoid biased wording.

The other piece is to make alternatives clear. There's a particular need to write closed-ended questions in such a way that the options are distinctly different from one another and that they cover all possibilities. A philosopher would say that the answers must be mutually
exclusive and exhaustive. Categories are mutually exclusive if no individual case could belong to more than one category at a time. The categories of undergraduate and graduate are mutually exclusive because you can't be both an undergraduate student and a graduate student at the same time. Undergraduate means you haven't received your bachelor's degree yet; graduate student means you have, and you're working towards an advanced degree. You could, though, be both an undergraduate student and a scholarship recipient, so those would not be mutually exclusive categories. Noting those differences is really important when you're designing a question, because you can add noise to your data by having categories that aren't mutually exclusive.

For categories to be exhaustive, which is the other point I mentioned, all cases need to fall into one or another of the alternatives. Using only graduate student and undergraduate student leaves out the possibility that someone has a bachelor's degree but is taking undergraduate courses to prepare for an application to graduate school. I have a few students like that in my courses right now, so shout out to you guys. We might define the category non-degree student for this type of individual. Because of the difficulty of thinking of all alternatives, questions sometimes include the category of "other." This category should be used with care, however, because you get in trouble if "other" turns out to be a popular answer. It's a catch-all for all different kinds of possibilities, and then there's meaningful information embedded in there that you just can't get to, because it's all captured by this "other" category. So that's something to think about as well.

Now, bias often enters when respondents perceive one alternative as more socially acceptable than the other. This phenomenon is called social desirability, and researchers avoid this problem by wording questions so that each alternative appears
equally socially desirable. To go into a little more detail, some personality tests include a set of questions designed to detect whether a person has a tendency to be overly influenced by social desirability. A collection of such items designed to detect dishonest responses is sometimes called a verification key. The Minnesota Multiphasic Personality Inventory, or MMPI, which you should all be at least somewhat familiar with from your abnormal psychology classes, is a widely used personality test. It has a verification key called the lie scale, "lie" as in liar, so it's pretty explicit about what it's looking for. One question on the lie scale might ask whether a person has ever stolen anything, no matter how small. Most people, to answer truthfully, would have to say they had. People who say they have never done so would raise some suspicions, unless their moral compass is really through the roof. These scales can be quite sophisticated, so it may be hard to make yourself look better than you really are without scoring high on the lie scale. That's an important one to keep in mind: if you want to get around this issue, you would include some sort of verification key in your questionnaire.

Another one to be aware of is acquiescence. It may be tempting to format a questionnaire entirely with binary closed-ended questions, which would look like yes/no, agree/disagree, or true/false answers, binary meaning there are only two choices. They're really easy to administer, which would be one reason to do it, but these formats are highly susceptible to bias caused by acquiescence. What does this mean? Participants exhibiting acquiescence will have a tendency to agree with any statement on their inventory or questionnaire regardless of its content. The effect of this bias is that participants will agree with a statement and its opposite. So, for
example, they will respond yes to both "I like fish" and "I don't like fish." Those are two opposite statements, but they'll just have a tendency to say yes to everything. It's estimated that the effect accounts for approximately 10 percent of responses, according to Krosnick's 1999 study, which looked at this very effect. Ten percent is such a significant amount of your data that it would be a big problem you'd have to address later on if you found this was likely to be the case. So binary closed-ended questions can be problematic unless they're a relatively small portion of your survey and unless you have some additional checks on the data.

The answers can take various formats depending on the type of question. As I said, that could be true/false, but it could also be multiple-choice or rating scales. If respondents are asked to indicate their agreement or disagreement with a particular position, this may be achieved with a visual analogue scale, which is often a single line labeled at either end with the minimum and maximum levels of a statement or a sensation. The example given here on your slide is asking patients in a hospital about the intensity, location, onset, duration, variation, and quality of their pain, where zero would be no pain and ten would be unbearable pain. This is similar to the Likert scale, which we're going to be talking about in a little more detail, but it's a visual analogue scale because we also have a visual representation. You can see it's represented kind of simplistically by happy faces or sad faces, which can help people in these kinds of conditions communicate what they're experiencing. Scaling questions can also be really helpful in therapeutic contexts, where you ask patients to rate their degree of depressive symptoms or anxiety symptoms as they move through an experience or in the session. So this can be really helpful on
surveys and in interventions and treatments alike.

So, the Likert scale. We've talked about this, and you've seen this scale before. Rating scales are often called Likert scales, named after the person who made them popular, and they are widely used in survey research. In fact, just anecdotally speaking, I see these as the scales that are used in survey research, and you may find yourself tempted to use them. That's certainly very valid. I've used them in my own research, and this is an example from my own research here. Attitudes elicited by questionnaire items are frequently measured on a four-, seven-, or nine-point scale. Seven categories of agreement are the maximum that can be distinguished on most dimensions, though. We would also want to use as many data points as possible. This is a four-point scale you see here, but you could use a seven-point scale, because that tends to allow more information. For example, the survey item on your screen says, "I usually feel happy before listening to heavy metal or emo music." A one would be strongly disagree and a four would be strongly agree. But what if somebody was more neutral? A scale like this would not capture that, so if you moved towards a seven-point scale you could get more data. There can be some pitfalls, though. People tend to take more neutral positions on something they don't have strong feelings about, so you might end up with a lot of midpoint answers. That's something to think about with Likert scale designs, because it can get you into trouble, but the tendency is to move towards scales with more data points.

Moving forward, branching items are one area where your questionnaire can get a little more complex. When constructing questions, it's often convenient to write branching items that permit respondents to skip inappropriate
items and move through a questionnaire more efficiently. Take the drinking portion of a health behavior questionnaire. If you want to get more information from somebody and skip less pertinent information, then you would have a kind of "skip to the appropriate question" option. For example, if you're asking somebody to classify their intake of alcoholic beverages and they answer that they have never consumed an alcoholic beverage, you wouldn't then ask them how often they consume alcohol, because they don't. So you'd have them skip to the appropriate question. On a paper-and-pencil questionnaire it would look like what's on your screen. On a computerized questionnaire, by contrast, this could be automated: if they give a certain response, it automatically skips them to the appropriate next question and past the inappropriate ones.

It's also important to think about the sequence of survey questions, the order in which you put them. Care should be taken to sequence the items in a way that is unbiased. For example, if you have a campus security survey that asks a large number of questions of different types, the first group of questions might concern demographics. If you're taking a survey of students on your campus, the first thing you want to determine is whether the participant is actually a student. That would be what's referred to as a filter question. If they're not students, for example people who are hanging out on campus, non-matriculated or non-enrolled students, or people auditing a class, you might be less concerned with their opinions, and you can filter their responses out of the survey.

The other question is whether you're going to want to include demographic questions on your survey. There are several reasons why you might want to do this. The first and probably biggest reason is that if you have demographic questions on
your survey, it allows you to group by demographics, which can be useful in making your research more inclusive, because you're speaking to populations that might otherwise be lost in the shuffle. There might also be more richness to the data if you see that there are differences between certain groups of students. Think of those university students who are taking community college classes over the summer. They might have very different responses, and that might be valuable. You might want to get a sense of why that is, and it might also be valuable if you want to filter out those responses.

Another question to consider is how the questionnaire is going to be scored and analyzed. Once again, this should be decided in advance of collecting data for any research project. If you think that non-binary individuals might respond differently than binary individuals, like men or women, that might be important to keep in mind as you think about how you're going to analyze the data. If you're surveying university students and you think that women will respond differently than men, that commuters will respond differently than dorming students, first-year students differently than second-year students, and so on, you need to include questions that permit the classification of the students on these dimensions. It's also worthwhile to decide what kind of statistics you're going to be using. Starting, I believe, next week, we're going to be getting a little more into the statistics side of things, to look at just how we would measure and analyze this. This is really important, because the way you numerically represent the responses of your research participants, which is called coding, is going to have a big impact on what types of statistical analysis you can use. So you want to have an idea of
what you'll do going in. There are certain types of analysis where, for example, if you have a response that is coded as zero versus one versus two versus three, the zero might be considered meaningful by a certain type of analysis and affect the data, whereas by another analysis it might be considered not meaningful and might affect the data in a way that you don't anticipate. So this is something to keep in mind, mathematically speaking.

Moving forward, there are essentially four different modes for administering a survey. Well, not exactly four per se, but there's a group of different modes of administration. There's computerized administration: is it going to take place online or in person, and is it going to involve written responses or clicked answer responses? Are you going to do telephone administration? Are you going to do paper-and-pencil administration? These are all things to consider, and they come with their own unique pitfalls and issues. For example, if you have computerized administration and it's in person, that could be really positive, because you get a sense of what students are doing, or participants, I should say. But if you do online administration, you can't control the environment in which an individual is taking your survey. They may be distracted, they may have other things on their mind, and they may be more error-prone. If they give written responses on paper, what about the legibility of their writing? If they're typing, what about their expertise with typing? Is English their first language, and is the survey executed in English or in their primary language? If you have telephone administration, how can you be sure that the person you're talking to is the same participant as you move forward? You can have somebody else come on the phone. You could have things
that get in the way of the respondents in terms of the connection quality of the phone call, or whether there's interference or static on the line. These are all things to consider that can negatively affect your data and your results.

A principal concern with all methods of administering surveys is the problem of the response rate. We're bombarded by surveys from a great variety of sources, and many of them are actually sales pitches disguised as surveys. Taken all together, more than a third of the American population may refuse to participate in surveys. Response rates vary significantly based on the method of administration. Surveys in magazines may have a 1 or 2 percent response rate, mail-in surveys have return rates between 10 and 50 percent, telephone surveys 80 percent, and face-to-face surveys 90 percent, so it can vary tremendously. Magazines and radio stations often publish questions for their audiences to respond to, but these broadcast surveys really lack reliability. One station says, "Our poll is not a scientific survey but is a rough estimate of the views of our listeners." Well, if it's not scientific, then it might not actually be a rough estimate, especially if somebody can respond to the same survey again and again and again. You may have seen this online; that's when somebody's fishing for a certain kind of response. Another thing to think about is that if response rates can be super low, then targeting where you administer your survey is oftentimes the most helpful, or offering some sort of inducement, like the opportunity to enter a drawing for a gift card. That's something that can make people more likely to participate in your research.

Now, surveys differ greatly in value according to how the respondents are sampled. I'm going to discuss a few
different types of these sampling methodologies that are important to keep in mind when you think about surveys.

Haphazard sampling is one method. Sometimes the surveyor has control over whom to sample but uses haphazard methods of obtaining people. A television station may send a crew out to interview ten people on the street with instructions to include five women and two men, three teenagers and one child. These haphazard samples are almost worthless. Perhaps the most famous haphazard survey was conducted by the now-defunct Literary Digest, which obtained respondents from telephone books and automobile registration lists. This survey predicted that Landon would win the 1936 presidential election over Roosevelt by a landslide. It overlooked the fact, however, that during the Great Depression people who could afford telephones and automobiles were more likely to vote Republican, because they oftentimes had more money. The survey didn't address the inherent bias in its sampling methods, and its sample was not, and you've heard me reference this before, representative or generalizable. That's what's really important to keep in mind: haphazard samples are oftentimes really problematic.

That brings us to purposive samples. Frequently, researchers will base a survey on a sample that is chosen to meet some particular definition. A purposive, or purposeful, sample is selected non-randomly but for some particular reason. A researcher may survey the opinions of presidents of several leading colleges about desirable changes in college curriculum. The opinions of these people may be more valuable than those that would be obtained in a random sample of all college presidents. Purposive samples can almost be considered to constitute a population, for example all presidents of leading colleges. In practice, of course, a researcher frequently does not have access to an entire population, even one as small as the presidents of the
top 50 colleges nor will there necessarily be agreement on which are the top 50 colleges nevertheless a purposive sample is frequently preferable to a random sample if you can get everybody that represents that group one problem though is that a list of leading colleges composed by a researcher is more likely to contain the researcher's own college than would a list compiled by somebody at a different college another problem is that the presidents of leading colleges might not know what the most desirable curriculum would be for students at the colleges that constitute most of the population of colleges out there in the country and also might not reflect let's say community college populations from which a large number of students who ultimately attend top colleges come so you'd have a problem with that type of sampling as well another kind of sampling that's quite acceptable is similar to the purposive or purposeful sample in that it selects a desirable group of people but differs in that it might not come close to sampling all of the population so this is called a convenience sample and a researcher may want to study the effects of integration on social development in school children there may be many appropriate schools to choose among but it's much more convenient to study the one in the researcher's own city even though such a selection is not random one would usually be willing to generalize the results to other schools and similar children and in fact and I mentioned this before especially in my formerly in-person classroom settings most research in psychology is done using convenience sampling meaning that students who are enrolled in introductory psychology courses are the ones who are sampled now it helps if you then randomly select the responses from those samples but for convenience sampling that's not necessarily required now to take a probability sample of a population it's necessary to define the population whenever
you're talking about any kind of sampling you need to know what kind of population you're trying to represent now suppose you want to take a survey of 10% of your research methods class the population in this case is the class the class however contains some individuals who have not yet officially registered for the course let's say it's in the first couple weeks or who will drop before the end of the term so you must develop a definition of the population for the purposes of the survey and this may be different from the actual population for example you might define the population for the purposes of the study as those whose names appear on the official class roster as of a certain date so any who have not yet registered will not be considered even if they're attending class the population that you will work with for your particular study is called a sampling frame to take another example the sampling frame for the purposes of studying the population of Allegheny County Pennsylvania might exclude those who are in jails or mental hospitals because they're brought there based on the location of those facilities and they're not actually residents per se so similarly somebody who shows up for a couple sessions of the class and decides to not enroll or somebody who ends up dropping the class after the first week or two would not really be considered somebody who's taken a research methods class so each individual that falls within the sampling frame is called an element and you would sample a number of elements from the sampling frame now a systematic sample is a probability sample which we're gonna be talking about in a little bit but not a random sample suppose you want to select a sample of 20 students from your research methods class of 80 students your first step would be to obtain a class roster from the instructor myself then you would need to identify each element you could use the students' names but this would require you to have some way of
randomizing their names because it's much more convenient to work with numbers you would identify each element by a number so if there were 80 students in the class they would be numbered from 1 to 80 if you were to choose every 4th name from your class roster you would have a probability sample because 25% of the class would have been selected the sample would not be random however because those whose names were in positions 1 5 9 and so on had a hundred percent chance of being selected and everyone else had a zero chance so this method fails the equal probability part of the definition of random sampling which we're gonna get to but it's a method that you could utilize so what is random sampling right how do we end up defining it well most probability sampling methods rely on random sampling although there are important exceptions as we're going to talk about before we go further it's necessary to really discuss the concept of random sampling now although you may have an idea of random selection the concept is not a simple one first it's necessary to realize that the term random as used in science is a technical one and very different from our everyday use whereas we might say we picked the socks we wore today at random in that usage random means you picked the first pair of socks that fell to your hand right or the first ones that you happened to grab and it implies that you would wear those same socks much more than some of your other pairs if they were let's say at the top of your sock drawer that would happen if you always replaced the laundered socks in the drawer on top of the previously laundered ones and pulled them out again from the top of the pile right so actually there'd be a high probability that you'd be using the same socks again and again so as a first approximation selection is random when it's controlled by chance alone a common example is selecting a state lottery number the authorities would want to be sure that no one will
be able to predict the number better than chance another way to define a random sample is to say that the selection process is random if every member of the population has the same probability of being selected and the selection of one individual is independent of the selection of another the equal probability of selection part may seem obvious but the necessity of independent selection of individuals requires some comment so suppose you have a number of people that attend a party their names are John Marsha Bob Carol Ted and Alice and there are supposed to be two door prizes awarded so they come to the door and you know they have a chance of getting a prize so if the host puts each couple's names on a slip of paper and pulls one paper out of the hat then it's obvious that if John's name is chosen Marsha's will also be if they're a couple right so Marsha's selection was dependent on John's and vice versa so John and Marsha as a couple each had a one in three chance of winning but because only both or neither could win their selection was not random if the host put all the names on separate pieces of paper and pulled two names however then the selection would be random both John and Marsha would have a one in three chance of being selected as before but Marsha's chances of selection would not depend on her partner John's this example is a bit contrived I admit but it's quite possible for structure to exist in some ordering of individuals so you may have been in a group that was being divided into two smaller groups by counting off if you were in one of my in-person classes we've done this before if people tend to sit together in pairs then counting off will result in each member of the pair winding up in a different group this may be the desired result but again it's not random random selection would result in separating some but not all pairs thus any method of selection other than a true random method could result in some nonindependence among
members of the group so exactly how to select elements randomly from a population may require some considerable thought and ingenuity as you can see so some methods that seem random may actually not be random so the basic simple random sample is used when we believe that the population is relatively homogeneous with respect to the question or questions of interest so let's continue with the idea of sampling the selection of twenty students from a class of 80 which is something I mentioned a few minutes ago so after you've obtained the roster and assigned each student a number as before you would next obtain a list of random numbers random number tables are available in many books about research methods and statistics or you can generate a list of random numbers with a computer I think SPSS and even Microsoft Excel may have a random number generator and so you'd then go down the list of numbers looking for numbers between 1 and 80 and each time you see one write it down on a list until you've found 20 numbers these become the identification numbers of the people who are selected for the sample occasionally a number will repeat before you have completed your sample and so you'd simply ignore those numbers because the people they represent are already in the sample and so this is one way of getting somebody to be randomly sampled but if you're surveying a population that has identifiable subgroups that are likely to differ markedly in their responses you can improve the validity of your study by obtaining a stratified random sample so suppose you know that the college has 55 percent women and 45 percent men just for the sake of argument and you have reason to believe that males and females may respond differently on your dependent measure if you look at a simple random sample the ratio of males to females would probably not match the population exactly by stratified random sampling you can ensure that the proportion of men and women in the sample matches the college
population stratified random sampling essentially treats the population as two or more separate subpopulations and creates a separate random sample of each in this case you take one sample from the female subpopulation and one sample from the male subpopulation first you determine how many of each you need because you want your sample to contain one-fourth of the population of 80 students and have the same sex ratio as the population you need one-fourth of the females and one-fourth of the males so if there are 48 females and 32 males in the class your sample would require 12 females and 8 males ok next you number the females from 1 to 48 and select from the random number table in the same manner as I described a couple minutes ago then you number the males from 1 to 32 and repeat the process now you can be sure that the sample exactly matches the population with regard to the ratio of males to females the procedure is still random however because every member of the population had an equal and independent chance of being selected now I'm not going to require you to do this but it is important that if you're designing a data collection method you take this into consideration because there does need to be some degree of randomness in order for the results to be generalizable to the population that you're studying so sometimes stratified random sampling is used to oversample some subgroup of the population that is to purposely include some group at a greater frequency than it is represented in the population so suppose you're interested in comparing the opinions of one group of students to another group of students in some manner let's say first year to second year students you would want to include the same number of first year and second year students in the survey to get as reliable an
estimate of the attitudes of first year as of second year students even though let's say first year students may constitute only 10% of the population if they drop out for some reason like let's say first year students decide to delay their participation in the course or in any courses due to COVID-19 forcing classes to shift fully online but second year students have transfers that are already in progress and need to finish their classes so you would stratify based on class year and include 50% first year students and 50% second year students in your sample the sampling would still be random within the subpopulations even though there'd be this potential difference random samples can be extremely accurate a sample as small as a thousand individuals will allow a survey to estimate attitudes to within plus or minus 3.2 percent for a population as large as that of the United States so there are a lot of reasons to do random sampling because it allows you to make estimates about very large populations now many populations would be impossible or impractical to number so for instance making a list of every person in the United States would be pretty much impossible even a random sampling of all students in a college may be difficult because there is no student directory you may decide to obtain the students from classes rather than taking one tenth of the students in each class sampling every student in one tenth of the classes would be more efficient you would obtain a list of all classes at the college and from this list you would randomly select one tenth of the classes to study this method would produce what's called a cluster sample even though the students that you sample by clustering would probably be more alike than those in a purely random sample because students within the classes are likely to be similar in background the ease of obtaining the sample would permit you to study more individuals and therefore offset the disadvantages of
not having a purely random sample if you wanted to make sure that your sample contains the same proportion of students in particular categories as the college as a whole you could stratify your clusters so you might separate the classes into sciences humanities and so forth as well as into lower and upper division courses if you were at a university or more vocational courses versus more transfer oriented courses at a community college you would then randomly select one tenth of the classes in each category a sophisticated form of cluster sampling is known as multistage sampling so commercial polls such as the Gallup poll use multistage sampling you see this for polls about you know political races that are occurring like our current election which we're leading up towards so first they may randomly select several zip codes and from these zip codes streets are selected randomly and from these streets addresses are selected randomly for practical reasons it is common to select a number of individuals from within a given cluster an example of multistage sampling can be seen in a study of drug use among high school seniors in the United States so Lloyd Johnston Patrick O'Malley and Jerald Bachman in 1991 first selected a number of geographical areas next they selected one or more high schools within each geographical area and then they selected senior students within each high school and this method was much more efficient than trying to make a simple random selection from all high school seniors it's likely that no single list of all such individuals exists and even if it did it would be very inefficient to try to study one or two students in a given school so obtaining access to the students and administering the test is vastly more efficient in groups and in contrast to cluster sampling this study used only certain students from each school so if they had studied all students in a given school they would have had more individuals than they
needed by the time they had obtained enough clusters to be representative of different types of schools and regions throughout the country so although cluster sampling is not as accurate as random sampling because each stage of cluster sampling introduces another source of sampling error it can be very accurate cluster samples are able to determine attitudes within a margin of error of plus or minus about four percent with a sample size of a thousand for the entire US population compared with 3.2 percent for a simple random sample of the same size so this goes to show that you know while random sampling is oftentimes considered to be superior the difference between random sampling error and cluster sampling error you know you're only talking about eight tenths of a percentage point in a lot of cases so these are all things to keep in mind when talking about surveys and I know that this was a relatively long lecture but it's an important one when we think about survey designs because this is the majority of what psychologists do when they're conducting research is they take these elements into account and use them to not only develop their own surveys but to pick a good survey to use if one already exists all right class that's it for this week be sure to respond to your discussion questions that will be posted today if they are not posted already by the time you are watching this video I wish you the absolute best and if you have any questions or concerns please do feel free to reach out to me through your Canvas inbox take care
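As a supplementary note to the lecture, here is a minimal Python sketch of three of the sampling procedures described above, applied to the class of 80 students: the systematic sample (every 4th name), the simple random sample of 20, and the stratified random sample of 12 females and 8 males. It is only an illustration under stated assumptions. The function names, the fixed seed, and the use of Python's standard `random` module in place of a printed random number table are all choices made for this sketch and were not part of the lecture.

```python
import random


def systematic_sample(roster, step, start=0):
    """Take every step-th element, beginning at index `start`.

    This is a probability sample but not a random one: positions on the
    pattern are certain to be chosen and all others have no chance.
    """
    return roster[start::step]


def simple_random_sample(roster, n, rng):
    """Draw n distinct elements, each with an equal and independent chance."""
    return rng.sample(roster, n)


def stratified_sample(strata, fraction, rng):
    """Draw the same fraction at random from each subpopulation (stratum)."""
    sample = {}
    for name, members in strata.items():
        k = round(len(members) * fraction)
        sample[name] = rng.sample(members, k)
    return sample


# Hypothetical setup: the seed is arbitrary and only makes the run repeatable.
rng = random.Random(2024)

# Students are identified by number, 1 to 80, as in the lecture.
roster = list(range(1, 81))

# Every 4th name starting from position 1 gives students 1, 5, 9, and so on.
systematic = systematic_sample(roster, step=4)

# Twenty students chosen by chance alone, with no repeats.
simple = simple_random_sample(roster, 20, rng)

# 48 females and 32 males, each group numbered separately, one-fourth of each.
strata = {"female": list(range(1, 49)), "male": list(range(1, 33))}
stratified = stratified_sample(strata, 0.25, rng)

print(systematic[:3], len(systematic))
print(len(simple), len(stratified["female"]), len(stratified["male"]))
```

The call to `rng.sample` never returns the same student twice, which corresponds to ignoring repeated numbers when reading down a random number table.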