Transcript for:
Understanding Dependent-Groups Designs

Title: Ch.12_ Dependent-Samples T (1)

Research Methods, Statistics, and Applications, Third Edition
Chapter 12: Dependent-Groups Designs
Stephen Lippi, Ph.D.
Spring 2025, Mercer University

# Repeated-Measures Designs

- A repeated-measures design, or within-subjects design, is one in which the dependent variable (DV) is measured two or more times for each individual in a single sample.
- The same group of subjects is used in all of the treatment conditions.
- ADVANTAGE: we use exactly the same individuals in all treatment conditions, so there is no risk that participants in one treatment group differ from participants in the other.

# Dependent-Groups

Dependent-groups design:

- Powerful design; we decrease random error in participant characteristics.
- Sometimes, selecting related samples can be more practical.
- Selecting related samples minimizes standard error.
- Computing difference scores prior to computing your test statistic eliminates the between-persons source of error, which reduces the estimate of standard error.
- Increase in power.
- Reducing our estimate of standard error increases the value of the test statistic.

# Designs with Dependent Groups

- Dependent-groups designs: the participants in different conditions are related or are the same people.
- Participants can be related in two ways:
  1. They are observed in multiple conditions (repeated-measures design): subjects experience every condition in a study. This is the most common related-samples design.
  2. They are matched, experimentally or naturally, based on the common characteristics or traits that they share.

## Repeated-Measures and Matched-Subjects Designs

- In a matched-subjects study, each individual in one sample is matched with an individual in the other sample.
- The matching is done so that the two individuals are equivalent (or nearly equivalent) with respect to a specific variable that the researcher would like to control.
- In repeated-measures or matched-subjects designs, we're comparing two treatment conditions (typically a pre- vs. post-design).
- Two scores are obtained for each individual or each pair of subjects.

# Matched v. Rpt. Measures

- Matched designs
  - Measuring some trait before matching
  - Intelligence
  - Genetics
- Repeated measures
  - Within-subject
  - Pre-post or multiple levels; same individuals in both treatment conditions

## Comparing Repeated- and Independent-Measures Designs

- A repeated-measures design typically requires fewer subjects than an independent-measures design.
- The repeated-measures design is especially well suited for studying learning, development, or other changes that take place over time.
- The primary advantage of a repeated-measures design is that it reduces or eliminates problems caused by individual differences.
- Individual differences are characteristics such as age, IQ, gender, and personality that vary from one individual to another.
- These individual differences can influence the scores obtained in a research study, and they can affect the outcome of a hypothesis test.
- Within-groups designs are more powerful than between-groups designs.
  - It is easier to detect differences between conditions.
  - Reason: we're keeping extraneous differences between participants constant across all conditions (since they're the same people).
- Power is the probability that a study will show a statistically significant result when an IV truly has an effect in the population.
- Within-groups designs also generally require fewer participants overall.

# Time-Related Factors and Order Effects

- The primary disadvantage of a
repeated-measures design is that the structure of the design allows factors other than the treatment effect to cause a participant's score to change from one treatment to the next.
- Specifically, in a repeated-measures design, each individual is measured in two different treatment conditions, often at two different times.
- The order in which treatments are presented can also influence scores!
- One way to deal with order effects is to counterbalance the order of presentation of treatments.
- That is, the participants are randomly divided into two groups, with one group receiving treatment 1 followed by treatment 2, and the other group receiving treatment 2 followed by treatment 1.
- The goal of counterbalancing is to distribute any outside effects evenly over the two treatments.

# Avoiding Order Effects

- Counterbalancing: the levels of the IV are presented to participants in different orders/sequences.
- The idea behind counterbalancing is that any order effects should cancel each other out when all the data are collected.
- Two types:
  - Full counterbalancing occurs when all the possible condition orders are presented. Example: with two conditions, there are two orders; with three conditions, there are six orders.
  - Partial counterbalancing (example: Latin square) occurs when only some of the possible condition orders are used. Example: a researcher could present a randomized order for each participant or use a Latin square (in which each condition appears in each position at least once).

# Counterbalancing Examples

- Full: in a repeated-measures design with 3 conditions (A, B, C), we get a total of 6 possible sequences.
- Partial: only some of the possible condition orders are represented; present the conditions in a randomized order for every subject or use a Latin square design.

## Disadvantages of within-groups designs

Three main disadvantages:

1. Potential for order effects. Solution: counterbalancing.
2. Might not be practical or possible. Example: comparing methods of how to
teach children to ride a bike.
3. Experiencing all levels of the IV changes the way participants act (demand characteristics). Demand characteristics (aka experimental demand) occur when participants pick up on cues that lead them to guess the experimenter's hypothesis.

# Other potential disadvantages

- Dependent-groups designs may not be effective in situations where participants are changed in some permanent way.
- Carryover effect: the impact of a treatment lasts longer than the time between conditions.
- Practice effect: participants' performance improves as a result of repetition.
- Fatigue effect: participants' performance worsens as a result of repetition.

## The t statistic for a repeated-measures research design

- The t statistic for a repeated-measures design is structurally similar to our one-sample t test!
- The major distinction of the related-samples t test is that it is based on difference scores rather than raw scores (X values).

# Difference Values

- Difference score: the score or value obtained by subtracting one score from another.
- In a related-samples t test, this is obtained prior to computing the test statistic.
- We subtract pairs of scores first, then compute the test statistic, which eliminates the source of error associated with observing different participants in each group or treatment.

# Formulas and Calculations

- $M_D$ = mean difference, or the average difference between the scores of matched pairs or the scores for the same participants across two conditions; computed by subtracting one score of each matched or repeated pair from the other, summing the differences, and dividing by the number of pairs.
- $s_{MD}$ = standard error of the mean difference, or the standard deviation of the sampling distribution of mean differences when $\mu_D = 0$.

## The t statistic for a repeated-measures research design

The single-sample t statistic formula is used when developing the dependent-samples t test!
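As a compact reference, the quantities described on the Formulas and Calculations slide can be written out as below. This is a sketch in the notation of the chapter's worked example, where $D$ is a difference score and $n_D$ is the number of difference scores; the symbol $s_D$ (standard deviation of the difference scores) is taken from that example rather than from the slide itself.

$$
\begin{aligned}
M_D &= \frac{\sum D}{n_D} \\
s_D &= \sqrt{\frac{\sum D^2 - \frac{(\sum D)^2}{n_D}}{n_D - 1}} \\
s_{MD} &= \frac{s_D}{\sqrt{n_D}}
\end{aligned}
$$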
- For the repeated-measures design, the sample data we use are the difference scores, identified by the letter D rather than X.
- We use Ds in the formula to emphasize that we are dealing with difference scores instead of X values.
- The population mean that is of interest to us is the population mean difference, and we identify this parameter with the symbol $\mu_D$.
- With this formula, we compute the estimated standard error exactly the same way as it is computed for the single-sample t test: we compute the variance for the sample of D scores.

## Hypothesis Tests for the Repeated-Measures Design

- In a repeated-measures study, each individual is measured in two different treatment conditions, and we are interested in whether there is a systematic difference between the scores in the first treatment condition and the scores in the second treatment condition.
- A difference score is computed for each person.
- The hypothesis test uses the difference scores from the sample to evaluate the overall mean difference, $\mu_D$, for the entire population.
- The hypothesis test follows the same four-step process that we've used for other tests:
  1. State the hypotheses.
  2. Select alpha and locate the critical region.
  3. Calculate the t statistic.
  4. Make a decision!
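The four steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the course's own procedure: the function name `related_samples_t`, the dictionary keys, and the five pairs of scores are all hypothetical, and the critical value is assumed to have been looked up in a t table beforehand (2.776 for a two-tailed test at alpha = .05 with df = 4).

```python
import math
import statistics


def related_samples_t(scores_1, scores_2, t_critical):
    """Two-tailed related-samples t test on paired scores (illustrative sketch).

    t_critical is the positive critical value from a t table for the
    chosen alpha and df = n_D - 1 (steps 1 and 2 are done by the caller).
    """
    if len(scores_1) != len(scores_2):
        raise ValueError("each participant needs one score per condition")

    # Difference score for each participant (condition 1 minus condition 2).
    diffs = [x1 - x2 for x1, x2 in zip(scores_1, scores_2)]
    n_d = len(diffs)

    mean_diff = statistics.mean(diffs)        # M_D
    sd_diff = statistics.stdev(diffs)         # s_D, using n_D - 1
    se_mean_diff = sd_diff / math.sqrt(n_d)   # s_MD

    # Step 3: t statistic under H0: mu_D = 0.
    t_obtained = (mean_diff - 0) / se_mean_diff

    # Step 4: reject H0 when |t| exceeds the critical value.
    reject_null = abs(t_obtained) > t_critical

    return {
        "df": n_d - 1,
        "M_D": mean_diff,
        "s_D": sd_diff,
        "s_MD": se_mean_diff,
        "t": t_obtained,
        "reject_null": reject_null,
    }


# Hypothetical scores for five participants (not data from the slides).
condition_1 = [10, 12, 9, 14, 11]
condition_2 = [12, 15, 10, 15, 14]
result = related_samples_t(condition_1, condition_2, t_critical=2.776)
print(result)
```

Passing the critical value in as an argument keeps the sketch free of third-party libraries and mirrors the table lookup used in the chapter.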
## The Hypotheses for a Related-Samples t test

- Our goal is to use the sample of difference scores to answer questions about the general population.
- We want to know whether there is any difference between the two treatment conditions for the general population.
- We are interested in difference scores.
- We want to know what would happen if every individual in the population were measured in two treatment conditions ($X_1$ and $X_2$) and a difference score (D) were computed for everyone.
- Our null hypothesis states that the mean difference for the general population is zero.
- The alternative hypothesis states that there is a treatment effect that causes the scores in one treatment condition to be systematically higher (or lower) than the scores in the other condition.

## Assumptions of the Related-samples t test

The related-samples t statistic requires two basic assumptions:

1. The observations within each treatment condition must be independent. This refers to the scores within each treatment.
2. The population distribution of difference scores (D values) must be normal.

# Related-samples t DF

- The degrees of freedom for the related-samples t test: $df = n_D - 1$
- Very similar to the one-sample t test!

## Directional Hypotheses and one-tailed tests

- In many repeated-measures and matched-subjects studies, the researcher has a specific prediction concerning the direction of the treatment effect.
- This kind of directional prediction can be incorporated into the statement of the hypotheses, resulting in a directional, or one-tailed, hypothesis test.

# Our T statistic

The t statistic for a related-samples t test is:

$$t = \frac{M_D - \mu_D}{s_{MD}}$$

- The larger our test statistic is in absolute value, the less likely a sample mean difference would occur if the null hypothesis were true.
- The estimated standard error for difference scores, $s_{MD}$, is placed in the denominator.
- The standard error for a distribution of mean difference scores ($s_{MD}$) is computed as $s_{MD} = s_D / \sqrt{n_D}$.
- $M_D$ = sample mean difference; $\mu_D$ = population mean difference.

# Related-Samples t Example

- A study was conducted to test whether or not caffeine influences heart rate.
- Participants are given decaf coffee in one session and coffee with caffeine in a second session.
- We look at the difference in heart rate when given decaf coffee versus regular coffee.
- Do heart rates (number of beats per minute) differ at a 0.05 level of significance?

## Step 1: State your Hypotheses

- $H_0: \mu_D = 0$ (there is no mean difference in heart rate when given decaf or caffeinated coffee)
- $H_1: \mu_D \neq 0$ (there is a mean difference in heart rate in the presence of caffeine)

## Step 2: Set your Criteria for Rejection and Dfs

- Level of significance is 0.05.
- $df = n_D - 1 = 10 - 1 = 9$
- Locate 9 df on the t table.
- Critical values are ±2.262.
- Two-tailed test (hence the +/-).

## Step 3: Calculate your Test Statistic

You need to calculate/know a few things:

- Number of difference scores ($n_D$)
- Difference scores
- Sum of your difference scores ($\sum D$)
- Sum of the squared difference scores ($\sum D^2$)
- Mean of difference scores ($M_D = \sum D / n_D$)
- Variance of difference scores ($s_D^2$)
- SD of difference scores ($s_D = \sqrt{s_D^2}$)

Calculations:

- $M_D = \sum D / n_D = -81/10 = -8.1$
- $SS = \sum D^2 - (\sum D)^2 / n_D = 1{,}399 - (-81)^2/10 = 1{,}399 - 656.1 = 742.9$
- $s_D^2 = SS / (n_D - 1) = 742.9/(10 - 1) = 742.9/9 = 82.54$
- $s_D = \sqrt{82.54} = 9.09$
- Standard deviation for difference scores = 9.09

## Test Statistic Time!
What we now have:

- $M_D = -8.1$
- $s_D = 9.09$
- $n_D = 10$

Now we compute the estimated standard error for difference scores:

$$s_{MD} = \frac{s_D}{\sqrt{n_D}} = \frac{9.09}{\sqrt{10}} = 2.87$$

Compute the test statistic:

$$t = \frac{M_D - \mu_D}{s_{MD}} = \frac{-8.1 - 0}{2.87} = -2.82$$

## Step 4: Make a Decision

- Compare our obtained value to the critical value.
- Reject the null hypothesis if the obtained value exceeds the critical value in absolute value.
- $t_{obtained} = -2.82$, which is more extreme than the lower critical value (-2.262).
- It falls within the rejection region, so the decision is to reject the null.
- Participants had a significantly higher heart rate when given caffeine than when given decaf, t(9) = -2.82, p < 0.05.
- Participant heart rate was significantly affected by caffeine, t(9) = -2.82, p < 0.05.

## Effect Size and Confidence intervals for the repeated-measures t

- The most commonly used measures of effect size are Cohen's d and $r^2$, the percentage of variance accounted for.
- The size of the treatment effect also can be described with a confidence interval estimating the population mean difference, $\mu_D$.

# Effect size

- We want to determine the magnitude or strength of the effect of the new intervention.
- We can compute Cohen's d, which is the effect size typically reported for dependent-samples t tests.
- Cohen's d tells us the magnitude of our effect in standard deviation terms.
- Estimated Cohen's d = $M_D / s_D$
- Estimated Cohen's d = -8.1 / 9.09 = -0.89 (pretty strong effect)

# Confidence Intervals

- The confidence interval describes the limits within which the mean difference between two related populations is likely to be contained.
- In our example, students were given decaf at one time point and caffeine at another; heart rate in bpm was recorded.
- $M_D = -8.1$ and $s_{MD} = 2.87$.
- Find the 95% CI for these data:

$$M_D \pm t(s_{MD}) = -8.1 \pm 2.262(2.87)$$

$$-14.59 \le \mu_D \le -1.61$$

Note: there is not a 0 within our confidence interval!
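The arithmetic in the caffeine example can be re-checked from the summary values given on the slides (sum of D = -81, sum of D squared = 1,399, n_D = 10, critical t = 2.262). The raw heart-rate scores are not included in this transcript, so the sketch below starts from those sums; the variable names are hypothetical. Because it carries full precision instead of the rounded $s_{MD} = 2.87$, the confidence limits it prints can differ from the slide values in the second decimal place.

```python
import math

# Summary values taken from the worked example in the slides.
n_d = 10            # number of difference scores
sum_d = -81         # sum of the difference scores
sum_d_sq = 1399     # sum of the squared difference scores
t_critical = 2.262  # two-tailed critical value, alpha = .05, df = 9

mean_diff = sum_d / n_d                      # M_D
ss = sum_d_sq - sum_d ** 2 / n_d             # sum of squares of D
variance = ss / (n_d - 1)                    # s_D squared
sd_diff = math.sqrt(variance)                # s_D
se_mean_diff = sd_diff / math.sqrt(n_d)      # s_MD

t_obtained = (mean_diff - 0) / se_mean_diff  # test of H0: mu_D = 0
cohens_d = mean_diff / sd_diff               # estimated Cohen's d

# 95% confidence interval for the population mean difference.
margin = t_critical * se_mean_diff
ci_lower = mean_diff - margin
ci_upper = mean_diff + margin

print(f"M_D = {mean_diff:.2f}, s_D = {sd_diff:.2f}, s_MD = {se_mean_diff:.2f}")
print(f"t({n_d - 1}) = {t_obtained:.2f}, d = {cohens_d:.2f}")
print(f"95% CI: [{ci_lower:.2f}, {ci_upper:.2f}]")
```

Both limits are negative, which matches the note that zero falls outside the interval and agrees with the decision to reject the null hypothesis.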
- Our 95% CI is the range of values that we can be 95% confident contains the true mean difference in the population.
- We are 95% confident that the mean difference in the population falls within this range because 95% of all sample mean differences we could have selected from this population fall within the range of sample mean differences we specified.

## Another Example!