Transcript for:
Understanding Variability and Its Measures

CHAPTER 4: VARIABILITY

VARIABILITY: means that the scores are not all the same.
* Provides a quantitative measure of the differences between scores in a distribution and describes the degree to which the scores are spread out or clustered together.
* Defined in terms of distance. It describes the distribution: it tells how much distance to expect between one score and another.
* Measures how well an individual score represents the entire distribution.
* PURPOSE: to obtain an objective measure of how the scores are spread out in a distribution.
* If the scores in a distribution are all the same, then there is no variability.

THREE MEASURES OF VARIABILITY:

* RANGE: the obvious first step toward defining and measuring variability.
* The distance covered by the scores in a distribution, from the smallest score to the largest score.
* The most obvious way to describe how spread out the scores are: simply find the distance between the maximum and the minimum scores.
* The difference between the largest score (Xmax) and the smallest score (Xmin): Range = Xmax - Xmin.
* Works well for variables with precisely defined upper and lower boundaries.
* When the scores are measurements of a continuous variable, the range can be defined as the difference between the upper real limit (URL) for the largest score (Xmax) and the lower real limit (LRL) for the smallest score (Xmin): Range = URL for Xmax - LRL for Xmin.
* DISADVANTAGE: it is completely determined by the two extreme values and ignores the other scores in the distribution. Thus, a distribution with one unusually large (or small) score will have a large range even if the other scores are all clustered close together.
* It often does not give an accurate description of the variability for the entire distribution, so it is considered a crude and unreliable measure of variability.

* STANDARD DEVIATION: the most commonly used and the most important measure of variability.
* Uses the mean of the distribution as a reference point and measures variability by considering the distance between each score and the mean.
* Provides a measure of the standard, or average, distance from the mean, and describes whether the scores are clustered closely around the mean or are widely scattered.
* Primarily a descriptive measure: it describes how variable, or how spread out, the scores are in a distribution.
* The square root of the variance provides a measure of the standard, or average, distance from the mean: SD = √variance.

DEVIATION: the distance from the mean for each individual score.
* deviation score = X - μ
* The sign tells the direction from the mean, that is, whether the score is located above (+) or below (-) the mean.
* Because the sum of the deviations is always zero, the mean of the deviations is also zero and is of no value as a measure of variability.

VARIANCE: also known as the mean squared deviation, or the mean of the squared deviations. It is the average squared distance from the mean.
* It results in a measure of variability based on squared distances.
* Variance = SS/N.

FOR EXAMPLE: Compute the variance and standard deviation for the following set of N = 6 scores: 12, 0, 1, 7, 4, and 6. Mean: μ = 30/6 = 5.

X     X - μ   (X - μ)²
12      7       49
 0     -5       25
 1     -4       16
 7      2        4
 4     -1        1
 6      1        1

SS = 49 + 25 + 16 + 4 + 1 + 1 = 96. Variance: 96/6 = 16. SD: √16 = 4. Therefore, the variance is 16 and the standard deviation is 4.

SUM OF SQUARES (SS): the sum of the squared deviation scores. It has two formulas that produce the same answer:
* DEFINITIONAL FORMULA: the symbols in the formula literally define the process of adding up the squared deviations: SS = ∑(X - μ)²
* COMPUTATIONAL FORMULA: performs calculations with the scores (not the deviations) and therefore minimizes the complications of decimals and fractions: SS = ∑X² - (∑X)²/N

POPULATION VARIANCE: represented by the symbol σ² (sigma squared) and equals the mean squared distance from the mean.
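The worked example above can be checked with a short script. This is a minimal sketch in plain Python (no libraries), showing that the definitional and computational SS formulas give the same answer:

```python
# Worked example from the text: N = 6 scores.
scores = [12, 0, 1, 7, 4, 6]
N = len(scores)
mean = sum(scores) / N                      # 30 / 6 = 5

# Definitional formula: SS = sum of squared deviations from the mean.
ss_definitional = sum((x - mean) ** 2 for x in scores)

# Computational formula: SS = sum(X^2) - (sum(X))^2 / N, using raw scores.
ss_computational = sum(x ** 2 for x in scores) - sum(scores) ** 2 / N

variance = ss_definitional / N              # population variance = SS / N
sd = variance ** 0.5                        # SD = square root of variance

print(ss_definitional, ss_computational, variance, sd)
# prints: 96.0 96.0 16.0 4.0
```

Both formulas yield SS = 96, so the variance is 96/6 = 16 and the standard deviation is √16 = 4, matching the table above.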
* It is obtained by dividing the sum of squares by N: σ² = SS/N

POPULATION STANDARD DEVIATION: represented by the symbol σ (sigma) and equals the square root of the population variance: σ = √σ², or √(SS/N)

INFERENTIAL STATISTICS: use the limited information from samples to draw general conclusions about populations.
* The goal is to detect meaningful and significant patterns in research results.
* HIGH VARIABILITY: tends to obscure any patterns that might exist.
* LOW VARIABILITY: means that existing patterns can be seen clearly.

SS FOR A SAMPLE:
* SS (definitional) = ∑(X - M)²
* SS (computational) = ∑X² - (∑X)²/n

SAMPLE VARIANCE: represented by the symbol s² and equals the mean squared distance from the mean. It is obtained by dividing the sum of squares by n - 1: s² = SS/(n - 1)
* Often called the estimated population variance.

SAMPLE STANDARD DEVIATION: represented by the symbol s and equals the square root of the sample variance: s = √(SS/(n - 1))
* Often called the estimated population standard deviation.

DEGREES OF FREEDOM: determine the number of scores in the sample that are independent and free to vary. Defined as df = n - 1.
* Dividing SS by df produces an unbiased estimate of the corresponding population variance.
* UNBIASED: if the average value of the statistic is equal to the population parameter.
* BIASED: if the average value of the statistic either underestimates or overestimates the corresponding population parameter.

TRANSFORMATION OF SCALES:
* ADDING A CONSTANT TO EACH SCORE DOES NOT CHANGE THE STANDARD DEVIATION: the mean moves along with the scores when the constant is added, but the variability does not change because each of the deviation scores (X - μ) stays the same.
* MULTIPLYING EACH SCORE BY A CONSTANT CAUSES THE STANDARD DEVIATION TO BE MULTIPLIED BY THE SAME CONSTANT: multiplying each score causes each distance from the mean to be multiplied, so the standard deviation is also multiplied by the same amount.

FUNCTIONS OF THE STANDARD DEVIATION:
1. DESCRIBING THE ENTIRE DISTRIBUTION: research reports typically summarize the data by reporting only the mean and the standard deviation.
2. DESCRIBING THE LOCATION OF INDIVIDUAL SCORES.

ERROR VARIANCE: this term is used to indicate that the sample variance represents unexplained and uncontrolled differences between scores.
* As the error variance increases, it becomes more difficult to see any systematic differences or patterns that might exist in the data.
* High variance can make it difficult or impossible to see a mean difference between two sets of scores, or to see any other meaningful patterns in the results from a research study.
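As a sketch of why dividing by df = n - 1 makes the sample variance unbiased, the following Python snippet uses a hypothetical tiny population of three scores: averaging s² over every possible sample of size n = 2 (drawn with replacement) recovers the population variance σ² exactly.

```python
from itertools import product

def sample_variance(sample):
    """s^2 = SS / (n - 1), with SS = sum of squared deviations from M."""
    n = len(sample)
    m = sum(sample) / n
    ss = sum((x - m) ** 2 for x in sample)
    return ss / (n - 1)        # divide by df = n - 1, not n

# Hypothetical population of N = 3 scores (illustrative data only).
population = [0, 3, 9]
mu = sum(population) / len(population)                               # mu = 4
sigma_sq = sum((x - mu) ** 2 for x in population) / len(population)  # 42/3 = 14

# All 9 possible samples of size n = 2, drawn with replacement.
samples = list(product(population, repeat=2))
avg_s_sq = sum(sample_variance(s) for s in samples) / len(samples)

print(sigma_sq, avg_s_sq)   # prints: 14.0 14.0
```

The average of s² across all possible samples equals σ² exactly, which is what UNBIASED means above; dividing SS by n instead would systematically underestimate σ².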