Understanding Correlation and Analytical Tests (Lecture 12: Correlation 1)

Jan 22, 2025

Lecture Notes: Correlation and Analytical Tests

Overview

  • Discussion on analytical tests for data across different groups or populations.
  • Focus on continuous numerical data associated with categories (e.g., time periods, species, gender).
  • Importance of assessing relationships between variables when the data are not divided into distinct groups.

Key Concepts

Hypothesis Testing and Chi-Squared Test

  • Example: Hypothesis on temperature's effect on phytoplankton biomass.
    • Null hypothesis: Same mean biomass across temperatures.
    • Alternative hypothesis: Increased temperature leads to more biomass.
  • Importance of treating temperature as a continuous variable.

Data Visualization

  • Use of scatter plots to represent two continuous numerical variables.
  • Examples include the relationship between an animal's length and its weight, and between the size of an animal and the size of its prey.

Correlation

  • Measures strength of association between two continuous, numerical variables.
  • Variables measured in x-y pairs for each data point.
  • Goal: determine whether a relationship exists and, if so, how strong it is.

Examples

  • Positive direct relationship: More humans, more pollution.
  • Negative relationship: Higher temperatures are associated with fewer sharks in South Florida.

Correlation vs. Regression

  • Correlation: Determines degree of association.
  • Regression: Describes how one variable depends on another, usually under an assumed cause-and-effect relationship (the fit alone does not prove causation).
  • Key terms: Response variable and predictor variable.
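The distinction can be made concrete with a small sketch. It assumes hypothetical body-size and prey-size values (made up for illustration, not data from the lecture) and uses the ordinary least-squares slope as the regression summary. Correlation treats the two variables symmetrically, while the regression slope changes depending on which variable is taken as the response.

```python
import math

def deviation_sums(xs, ys):
    """Return (Sxy, Sxx, Syy): sums of cross-products and squared deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy, sxx, syy

# Hypothetical x-y pairs, one pair per animal (illustrative values only)
body_size = [10, 12, 15, 18, 20]
prey_size = [2, 3, 3, 5, 6]

sxy, sxx, syy = deviation_sums(body_size, prey_size)

# Correlation: no response/predictor roles, so swapping the variables changes nothing
r_body_prey = sxy / math.sqrt(sxx * syy)
r_prey_body = sxy / math.sqrt(syy * sxx)

# Regression: the least-squares slope depends on which variable is the response
slope_prey_on_body = sxy / sxx   # prey size as response, body size as predictor
slope_body_on_prey = sxy / syy   # body size as response, prey size as predictor

print(f"r = {r_body_prey:.3f} (same in either order: {r_prey_body:.3f})")
print(f"slope, prey on body: {slope_prey_on_body:.3f}")
print(f"slope, body on prey: {slope_body_on_prey:.3f}")
```

In this sketch the two slopes differ because each is scaled by the variability of its own predictor, whereas r is scaled by both variables at once.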

Correlation Focus

  • Determine if variables are related and how strongly.
  • Example: The relationship between body size and prey size.

Strength of Relationship

  • Correlation coefficient (r): Indicates strength of linear relationship.
  • Values range from -1 (perfect negative relationship) to 1 (perfect positive relationship); the sign gives the direction.
  • r=0 indicates no linear relationship.
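The range of r can be illustrated with a minimal sketch. It assumes the standard sample formula for Pearson's r (sum of cross-products of deviations divided by the square root of the product of the sums of squared deviations); the function name `pearson_r` and the toy values are hypothetical and chosen only to hit the three landmark cases.

```python
import math

def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient for paired observations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 pairs")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations, and sums of squared deviations
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when either variable is constant")
    return sxy / math.sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]
r_pos = pearson_r(x, [2, 4, 6, 8, 10])   # points fall exactly on a rising line
r_neg = pearson_r(x, [10, 8, 6, 4, 2])   # points fall exactly on a falling line
r_zero = pearson_r(x, [2, 1, 0, 1, 2])   # symmetric V shape: no linear trend

print(r_pos, r_neg, r_zero)
```

The V-shaped case is worth noting: y clearly depends on x there, yet r is essentially zero, which is why r=0 is read as "no linear relationship" and not "no relationship".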

Pearson’s Correlation Coefficient

  • The population correlation is a parameter; it is usually estimated by the coefficient r calculated from sample data.
  • The sample value r serves as the test statistic for correlation.

Important Notes

  • The slope of the line does not determine the strength of the correlation; what matters is how tightly the data points cluster around the line.
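A short sketch of the slope point, using made-up values and the standard sample formula for r (the helper `pearson_r` and the scatter offsets are hypothetical): two lines with very different slopes both give r of 1 when the points sit exactly on the line, while adding scatter around the steep line lowers r.

```python
import math

def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient for paired observations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

x = [1, 2, 3, 4, 5, 6]

shallow = [0.1 * xi for xi in x]    # slope 0.1, every point on the line
steep = [10.0 * xi for xi in x]     # slope 10, every point on the line

# Same steep trend, with hypothetical scatter above and below the line
offsets = [8, -8, 8, -8, 8, -8]
noisy = [10.0 * xi + d for xi, d in zip(x, offsets)]

r_shallow = pearson_r(x, shallow)
r_steep = pearson_r(x, steep)
r_noisy = pearson_r(x, noisy)

print(f"shallow line: r = {r_shallow:.3f}")
print(f"steep line:   r = {r_steep:.3f}")
print(f"steep + scatter: r = {r_noisy:.3f}")
```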
  • Next lecture will cover calculation of correlation coefficient.

Preparatory Work

  • Review section 13.1 of the textbook for background information.