Understanding Simple and Multiple Regression Analysis

Oct 2, 2024

Assessing the Fit of the Simple Linear Regression Model

Introduction to Sum of Squares

  • Sum of Squares Error (SSE):

    • Measure of error in using the estimated regression equation to predict values of the dependent variable.
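    • Calculated as: ( SSE = \sum_{i=1}^{10} E_i^2 ), where ( E_i = Y_i - \hat{Y}_i ) is the residual for observation ( i ) and the example data set has 10 observations.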
    • Example value: 8.0288.
  • Total Sum of Squares (SST):

    • Measures how tightly the observations cluster around the mean ( \bar{Y} ): ( SST = \sum_{i=1}^{n} (Y_i - \bar{Y})^2 ) over the ( n ) observations.
    • Also called Sum of Squares Total.
    • Example value: 23.9.
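
The two definitions above can be sketched in a few lines of Python. The data here are hypothetical and chosen only so the arithmetic is easy to follow; they are not the 10-observation data set behind the example values in these notes. The line is fit with the standard closed-form least-squares slope and intercept.

```python
# Hypothetical data for illustration only (not the data set from the notes).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Closed-form least-squares estimates for simple linear regression.
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
slope = s_xy / s_xx
intercept = y_bar - slope * x_bar

# Predicted values and residuals E_i = Y_i - Y_hat_i.
y_hat = [intercept + slope * xi for xi in x]
residuals = [yi - yhi for yi, yhi in zip(y, y_hat)]

sse = sum(e ** 2 for e in residuals)        # error around the regression line
sst = sum((yi - y_bar) ** 2 for yi in y)    # spread around the mean

print(f"SSE = {sse:.4f}")  # SSE = 2.4000
print(f"SST = {sst:.4f}")  # SST = 6.0000
```

For these made-up points SSE is well below SST, which is the pattern expected whenever the regression line describes the data better than the mean alone.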

Regression Line and Deviations

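  • Each observation's deviation can be measured two ways: from the mean ( \bar{Y} ), or from the predicted value ( \hat{Y} ) on the regression line.
  • When the model fits well, points cluster closer to the regression line than to the horizontal mean line ( \bar{Y} ).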

SSR, SSE, and SST Relationship

  • Sum of Squares Regression (SSR):
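    • Sum of squared differences between each predicted value ( \hat{Y}_i ) and the mean ( \bar{Y} ): ( SSR = \sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2 ).
    • ( SST = SSR + SSE ), so with the example values ( SSR = 23.9 - 8.0288 = 15.8712 ).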
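
A short derivation shows why the three sums of squares add up. It assumes the line was fit by least squares with an intercept, in which case the cross-product term is zero; for other fitting methods the identity need not hold. The index ( n ) denotes the number of observations.

```latex
\begin{align*}
Y_i - \bar{Y} &= (\hat{Y}_i - \bar{Y}) + (Y_i - \hat{Y}_i) \\
\sum_{i=1}^{n} (Y_i - \bar{Y})^2
  &= \sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2
   + \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2
   + 2 \sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})(Y_i - \hat{Y}_i) \\
SST &= SSR + SSE + 0
\end{align*}
```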

Coefficient of Determination (( R^2 ))

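  • ( R^2 = SSR / SST ): the proportion of the total variation in ( Y ) that the regression explains; a measure of goodness of fit.
  • Takes values between 0 and 1; values closer to 1 indicate a better fit.
  • Example: ( R^2 = 0.6641 ) implies that 66.41% of the variability in ( Y ) is explained by the model.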
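
As a quick check, the example figure can be reproduced from the two sums of squares quoted earlier in these notes (SSE = 8.0288, SST = 23.9). The variable names are illustrative only.

```python
# Example values quoted earlier in the notes.
sse = 8.0288
sst = 23.9

# SST = SSR + SSE, so SSR is what is left after removing the error part.
ssr = sst - sse

# Two equivalent ways to write the coefficient of determination.
r_squared = ssr / sst
r_squared_alt = 1 - sse / sst

print(f"SSR = {ssr:.4f}")        # SSR = 15.8712
print(f"R^2 = {r_squared:.4f}")  # R^2 = 0.6641
```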

Multiple Regression Model

Definition and Components

  • Extends simple linear regression to two or more independent variables.
  • Dependent variable ( Y ).
  • Multiple independent variables ( X_1, X_2, ..., X_q ).

Interpretation of Coefficients

  • The coefficient on ( X_i ) is the change in the mean of ( Y ) for a one-unit change in ( X_i ), holding the other independent variables constant.

Estimation and Equation

  • Estimated equation: ( \hat{Y} = B_0 + B_1X_1 + B_2X_2 + ... + B_qX_q ).
  • The coefficients are obtained by least squares, which chooses the values that minimize the sum of squares error (SSE).
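
The least-squares step can be sketched with NumPy's `np.linalg.lstsq`. This is a minimal illustration with two independent variables and hypothetical data, not the course's own procedure; the names `x1`, `x2`, and `coef` are assumptions made for the example. A column of ones is added so that the first coefficient plays the role of ( B_0 ).

```python
import numpy as np

# Hypothetical data for illustration only (not from the notes).
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0])
y = np.array([9.1, 7.8, 19.2, 17.9, 29.1, 27.8])

# Design matrix: a column of ones for the intercept, then X_1 and X_2.
X = np.column_stack([np.ones_like(x1), x1, x2])

# Least squares picks the coefficients that minimize SSE.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

y_hat = X @ coef
residuals = y - y_hat

sse = float(residuals @ residuals)
sst = float(((y - y.mean()) ** 2).sum())
r_squared = 1 - sse / sst

print("B_0, B_1, B_2 =", np.round(coef, 4))
print("SSE =", round(sse, 4), " R^2 =", round(r_squared, 4))
```

Each fitted slope in `coef` is read as described above: the estimated change in the mean of ( Y ) for a one-unit change in that variable, with the other variable held constant.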

Practical Application

Using Excel for Regression

  • Adding a trendline to an Excel scatter chart lets you display ( R^2 ) on the chart.
  • The Regression tool in the Analysis ToolPak add-in handles multiple regression.

Interpretation of Results

  • Output includes coefficients and ( R^2 ).
  • Graphical representation is limited to models with at most two independent variables.

Conclusion

  • Explored how deviations and fit affect regression analysis.
  • Introduction to multiple regression and its calculation.
  • Next topic: inference in regression.