Regression Analysis for Body Fat Measurement

Oct 2, 2024

Lecture 44: From Data to Decisions

Introduction

  • Focus on using Excel and R for multiple regression.
  • Calculate goodness of fit to assess model performance.
  • Use body fat data as an example for model building.

Measuring Body Fat

  • Reliable method: measure body density using hydrostatic weighing.
    • Displacement of water determines volume.
    • Use Siri equation to calculate percentage body fat.
  • Challenge: Density measurement is complex.
  • Easier measures: Circumference of neck, chest, abdomen, etc.
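
The conversion from density to percent body fat can be sketched in R. This assumes the commonly cited form of the Siri equation (495 / density - 450, with density in g/cm^3); the function names are made up for illustration and are not from the lecture.

```r
# Siri equation: percent body fat from body density (g/cm^3).
# siri_percent_fat is a hypothetical helper name, not from the lecture.
siri_percent_fat <- function(density) {
  495 / density - 450
}

# Density is mass divided by volume; in hydrostatic weighing the volume
# comes from the water displaced by the body.
body_density <- function(mass_g, volume_cm3) {
  mass_g / volume_cm3
}

# Illustrative input only: a density of 1.05 g/cm^3.
siri_percent_fat(1.05)
```

Under this form of the equation, a density of 1.1 g/cm^3 maps to about 0% fat, and lower densities map to higher fat percentages.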

Model Building

  • Goal: Use regression to predict body fat percentage from easier measures.

Using Excel for Regression

  • Example 1: Abdomen circumference as predictor for body fat.
    • Scatter plot and linear regression line.
    • Goodness of fit: R squared = 0.66.
  • Example 2: Weight as predictor.
    • R squared = 0.37.
    • Explanation: Weight also depends on height, which is not directly related to body fat, so weight alone is a weaker predictor.
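
The trendline and R squared shown on a scatter plot come from ordinary least squares. The sketch below reproduces that arithmetic in R on synthetic data; the numbers are made up for illustration and are not the lecture's body fat data, so the resulting R squared will not match the values above.

```r
# Synthetic stand-in data (NOT the lecture's data set).
set.seed(1)
n <- 100
abdomen <- rnorm(n, mean = 92, sd = 10)            # cm
bodyfat <- -35 + 0.6 * abdomen + rnorm(n, sd = 4)  # percent

# Least-squares line: slope and intercept in closed form.
slope <- cov(abdomen, bodyfat) / var(abdomen)
intercept <- mean(bodyfat) - slope * mean(abdomen)

# R squared: share of the variance in body fat explained by the line.
fitted_values <- intercept + slope * abdomen
ss_res <- sum((bodyfat - fitted_values)^2)
ss_tot <- sum((bodyfat - mean(bodyfat))^2)
r_squared <- 1 - ss_res / ss_tot

c(intercept = intercept, slope = slope, r_squared = r_squared)
```

For a single predictor, this R squared equals the squared correlation between the predictor and the response.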

Goodness of Fit Measures

  • Adjusted R squared: Adjusts for number of predictors.
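  • Akaike Information Criterion (AIC): Penalizes added complexity; lower values indicate a better model.
  • Bayesian Information Criterion (BIC): Also penalizes complexity, and more heavily than AIC for all but very small samples; lower is better.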
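
For reference, the standard textbook definitions of these three measures are given below. The symbols are assumptions of this sketch rather than notation from the lecture: n is the number of observations, p the number of predictors (excluding the intercept), k the number of estimated parameters, and L-hat the maximized likelihood.

```latex
\begin{aligned}
R^2_{\text{adj}} &= 1 - (1 - R^2)\,\frac{n - 1}{n - p - 1} \\
\mathrm{AIC} &= 2k - 2\ln\hat{L} \\
\mathrm{BIC} &= k\ln n - 2\ln\hat{L}
\end{aligned}
```

The BIC penalty per parameter, ln n, exceeds the AIC penalty of 2 once n is 8 or more.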

Multiple Regression

  • Combine abdomen circumference and weight.
    • AIC improved (decreased) from 800 to 756; lower is better.
    • Adjusted R squared increased.
  • Combine chest circumference and weight.
    • AIC is worse (higher) than for chest alone: weight is not a significant predictor once chest is in the model.
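
A comparison of this kind can be sketched in R by fitting both models and tabulating AIC and adjusted R squared side by side. The data below are synthetic and the coefficients used to generate them are arbitrary, so the numbers will not reproduce the 800 and 756 quoted above.

```r
# Synthetic stand-in data (NOT the lecture's data set).
set.seed(1)
n <- 100
abdomen <- rnorm(n, mean = 92, sd = 10)                        # cm
weight  <- 80 + 0.9 * (abdomen - 92) + rnorm(n, sd = 8)        # kg
bodyfat <- -35 + 0.9 * abdomen - 0.2 * weight + rnorm(n, sd = 4)

# One-predictor and two-predictor models.
fit_abdomen <- lm(bodyfat ~ abdomen)
fit_both    <- lm(bodyfat ~ abdomen + weight)

# AIC() accepts several models and returns one row per model.
comparison <- AIC(fit_abdomen, fit_both)
comparison$adj_r2 <- c(summary(fit_abdomen)$adj.r.squared,
                       summary(fit_both)$adj.r.squared)
print(comparison)
```

The model with the lower AIC (and, usually, the higher adjusted R squared) is the preferred one.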

Multicollinearity

  • Correlation between predictors makes individual coefficient estimates less stable.
  • Abdomen circumference and weight are correlated, so their separate effects on body fat are hard to interpret.
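
One simple way to see the effect is to check the correlation between the predictors and then watch how the standard error of the abdomen coefficient changes when weight is added. This is a sketch on synthetic data in which the two predictors are correlated by construction; it is not the lecture's data.

```r
# Synthetic stand-in data with correlated predictors (NOT the lecture's data).
set.seed(1)
n <- 100
abdomen <- rnorm(n, mean = 92, sd = 10)                   # cm
weight  <- 80 + 0.9 * (abdomen - 92) + rnorm(n, sd = 8)   # kg
bodyfat <- -35 + 0.6 * abdomen + rnorm(n, sd = 4)         # percent

# Correlation between the two predictors.
r <- cor(abdomen, weight)

# Standard error of the abdomen coefficient, alone and alongside weight.
se_alone <- summary(lm(bodyfat ~ abdomen))$coefficients["abdomen", "Std. Error"]
se_both  <- summary(lm(bodyfat ~ abdomen + weight))$coefficients["abdomen", "Std. Error"]

c(correlation = r, se_alone = se_alone, se_both = se_both)
```

When predictors are strongly correlated, the standard error of a shared coefficient typically grows once both are in the model.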

Using R for Regression

  • Load data from CSV into a data frame.
  • Plot the data and fit a single-predictor regression using the lm() function.
  • Confidence Intervals: Calculated using confint().
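
The workflow above might look roughly like this. The file, column names, and data are assumptions: the sketch writes synthetic values to a temporary CSV so that it runs without the lecture's data file.

```r
# Synthetic stand-in data (NOT the lecture's data set), written to a
# temporary CSV so the example runs without any external file.
set.seed(1)
n <- 100
abdomen <- rnorm(n, mean = 92, sd = 10)            # cm
bodyfat <- -35 + 0.6 * abdomen + rnorm(n, sd = 4)  # percent
csv_path <- tempfile(fileext = ".csv")
write.csv(data.frame(abdomen, bodyfat), csv_path, row.names = FALSE)

# Load the CSV into a data frame.
d <- read.csv(csv_path)

# Fit a single-predictor regression, then plot the data with the fitted line.
fit <- lm(bodyfat ~ abdomen, data = d)
plot(bodyfat ~ abdomen, data = d)
abline(fit)

# 95% confidence intervals for the intercept and the slope.
ci <- confint(fit)
print(ci)
```

confint() defaults to a 95% level; a different level can be requested with its level argument.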

Calculating Information Criteria in R

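  • Manually, by applying the AIC formula to the fitted model.
  • Built-in functions: AIC() and extractAIC().
  • The two functions return different values because they differ by an additive constant; comparisons between models fitted to the same data are unaffected.
  • Calculate BIC using AIC() with k = log(n), where n is the number of observations.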
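
These calculations can be checked against each other in a short sketch. It assumes the usual Gaussian log-likelihood for a linear model and uses synthetic data, not the lecture's data.

```r
# Synthetic stand-in data (NOT the lecture's data set).
set.seed(1)
n <- 100
abdomen <- rnorm(n, mean = 92, sd = 10)            # cm
bodyfat <- -35 + 0.6 * abdomen + rnorm(n, sd = 4)  # percent
fit <- lm(bodyfat ~ abdomen)

# Manual AIC: 2k - 2 * log-likelihood, with the error variance counted
# as an estimated parameter alongside the regression coefficients.
rss <- sum(residuals(fit)^2)
k <- length(coef(fit)) + 1
loglik <- -n / 2 * (log(2 * pi) + log(rss / n) + 1)
aic_manual <- 2 * k - 2 * loglik

# Built-in versions.
aic_builtin <- AIC(fit)
aic_extract <- extractAIC(fit)[2]   # second element is the criterion value

# The two built-ins differ only by a constant that depends on n.
offset <- aic_builtin - aic_extract

# BIC via AIC() with a log(n) penalty per parameter.
bic_via_aic <- AIC(fit, k = log(n))

c(manual = aic_manual, AIC = aic_builtin, extractAIC = aic_extract,
  offset = offset, BIC = bic_via_aic)
```

For an lm fit the offset works out to n * (log(2 * pi) + 1) + 2, which is the same for every model fitted to the same data, so both functions rank models identically.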

Adjusted R Squared in R

  • Part of summary() output for regression model.
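
A minimal sketch of pulling the value out of the summary object and checking it against the textbook formula, on synthetic data rather than the lecture's data:

```r
# Synthetic stand-in data (NOT the lecture's data set).
set.seed(1)
n <- 100
abdomen <- rnorm(n, mean = 92, sd = 10)                           # cm
weight  <- 80 + 0.9 * (abdomen - 92) + rnorm(n, sd = 8)           # kg
bodyfat <- -35 + 0.9 * abdomen - 0.2 * weight + rnorm(n, sd = 4)  # percent
fit <- lm(bodyfat ~ abdomen + weight)

# Both R squared values are stored in the summary object.
s <- summary(fit)
r2 <- s$r.squared
adj_r2 <- s$adj.r.squared

# Manual check: p is the number of predictors, excluding the intercept.
p <- length(coef(fit)) - 1
adj_manual <- 1 - (1 - r2) * (n - 1) / (n - p - 1)

c(r_squared = r2, adjusted = adj_r2, manual = adj_manual)
```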

Multiple Regression with R

  • Add variables using + in lm().
  • Example: regress body fat on abdomen and weight.
  • Estimated parameters: the intercept and the coefficients for abdomen and weight.
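
A sketch of the two-predictor fit follows. The column names and data are assumptions (synthetic values, not the lecture's data set), and the measurements used for the prediction are illustrative only.

```r
# Synthetic stand-in data (NOT the lecture's data set).
set.seed(1)
n <- 100
abdomen <- rnorm(n, mean = 92, sd = 10)                           # cm
weight  <- 80 + 0.9 * (abdomen - 92) + rnorm(n, sd = 8)           # kg
bodyfat <- -35 + 0.9 * abdomen - 0.2 * weight + rnorm(n, sd = 4)  # percent
d <- data.frame(abdomen, weight, bodyfat)

# Predictors are added to the model formula with +.
fit <- lm(bodyfat ~ abdomen + weight, data = d)

# Three estimated parameters: intercept, abdomen, weight.
print(coef(fit))

# Predicted body fat for a hypothetical new measurement.
new_person <- data.frame(abdomen = 95, weight = 85)
prediction <- predict(fit, newdata = new_person)
print(prediction)
```

Each coefficient is the expected change in body fat per unit change in that predictor with the other predictor held fixed.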

Conclusion

  • Next lecture: Addressing multicollinearity in multiple regression.