Happiness Factors and PCA Analysis

Aug 26, 2024

Notes on Happiness and PCA Analysis

Overview of Happiness Factors (2021 UN Report)

  • Six Factors Analyzed:
    • GDP
    • Social Support
    • Life Expectancy
    • Freedom
    • Generosity
    • Perceptions of Corruption

Challenges in Visualization

  • Difficulty in visualizing six dimensions simultaneously.
  • Possible approach: Selecting 2-3 factors for analysis (e.g., GDP, Social Support, Life Expectancy).
  • Risk: Losing information from other important factors.

Principal Component Analysis (PCA)

  • Purpose: Combine the original factors into new, uncorrelated factors ranked by importance.
    • The new factors are called Principal Components.
    • The first few components capture most of the variation in the data, so they give a faithful low-dimensional representation.

Selection of Principal Components

  • Simplified example with first three columns of data.
  • First Component Selection:
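    • PCA seeks the line through the data for which projecting the points onto it preserves the most information (the spread of the points).
    • Projecting onto a freely chosen line generalizes projecting onto a coordinate axis, which simply keeps one factor and drops the others.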
  • Projection Explained:
    • Projecting a point (x) onto a unit vector (u) gives a new point x' = (x . u) u on the line spanned by u.
    • The inner product x . u gives the signed length of x'; for a fixed x it is largest when u is parallel to x.
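
The projection step can be sketched in a few lines of NumPy. This is a minimal illustration rather than code from the original material: the function name project_onto_unit_vector and the sample point (3, 4) are assumptions chosen for the example.

```python
import numpy as np

def project_onto_unit_vector(x, u):
    """Return (signed length, projected point) of x along direction u."""
    u = np.asarray(u, dtype=float)
    u = u / np.linalg.norm(u)          # normalise so u is a unit vector
    length = float(np.dot(x, u))       # inner product: signed length along u
    return length, length * u          # x' = (x . u) u

# Hypothetical sample point, chosen only for illustration.
x = np.array([3.0, 4.0])

# Projecting onto the first coordinate axis keeps only one coordinate.
len_axis, x_axis = project_onto_unit_vector(x, [1.0, 0.0])

# Projecting onto a direction parallel to x keeps the full length of x.
len_par, x_par = project_onto_unit_vector(x, x)
```

Here the axis projection keeps only the first coordinate, while the parallel direction recovers x itself. This is the sense in which the inner product is largest when x and u are parallel.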

Mathematical Optimization

  • Maximization Problem:
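    • PCA seeks the unit vector (u) that maximizes the sum of squared inner products between u and the mean-centered data points.
    • The unit-length constraint on u is handled with the method of Lagrange multipliers.
  • Simplifying the objective expresses it in terms of the covariance matrix (C).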
  • Eigenvectors and Eigenvalues:
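    • The optimal direction (u) satisfies the eigenvalue equation: C * u = lambda * u.
    • The eigenvalue (lambda) measures the information (variance) preserved along u, so the first component is the eigenvector with the largest eigenvalue.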
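
The steps above can be written out as a short derivation. It is a sketch under stated assumptions: the data points x_1, ..., x_n are taken to be mean-centered, and the covariance matrix is normalised by 1/n (using 1/(n-1) instead rescales the eigenvalues but leaves the eigenvectors unchanged).

```latex
\begin{aligned}
&\max_{u}\ \sum_{i=1}^{n} (x_i^{\top} u)^2
  \quad \text{subject to} \quad u^{\top} u = 1, \\[4pt]
&\sum_{i=1}^{n} (x_i^{\top} u)^2
  = u^{\top} \Big( \sum_{i=1}^{n} x_i x_i^{\top} \Big) u
  = n\, u^{\top} C u,
  \qquad C = \frac{1}{n} \sum_{i=1}^{n} x_i x_i^{\top}, \\[4pt]
&\mathcal{L}(u, \lambda) = u^{\top} C u - \lambda \,(u^{\top} u - 1), \\[4pt]
&\nabla_u \mathcal{L} = 2 C u - 2 \lambda u = 0
  \quad \Longrightarrow \quad C u = \lambda u, \\[4pt]
&u^{\top} C u = \lambda\, u^{\top} u = \lambda .
\end{aligned}
```

The last line shows why the eigenvalue measures preserved information: at any eigenvector the objective equals lambda (up to the constant factor n), so the maximum is reached at the eigenvector with the largest eigenvalue.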

Principal Components Interpretation

  • First Component:

    • Represents combined contributions of original factors; labeled as Power.
    • High positions: Countries like Norway, Iceland (high happiness).
    • Low positions: Countries like Niger.
  • Second Component:

    • Orthogonal to the first component; maximizes the same quantity subject to that orthogonality constraint.
    • Labeled as Balance: Difference between individualistic and social factors.
    • Projection results: Happiest countries are the most balanced.
    • Example: Singapore has high GDP (Power) but lower happiness due to individualism.
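
The orthogonality of the first two components can be checked numerically. The sketch below uses synthetic data as a stand-in, not the happiness dataset, and the variable names are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data: 200 samples, 3 features (NOT the happiness data).
latent = rng.normal(size=(200, 1))
X = np.hstack([
    latent + 0.3 * rng.normal(size=(200, 1)),
    0.8 * latent + 0.5 * rng.normal(size=(200, 1)),
    rng.normal(size=(200, 1)),
])

Xc = X - X.mean(axis=0)                 # centre each column
C = Xc.T @ Xc / len(Xc)                 # covariance matrix (1/n normalisation)

# eigh is meant for symmetric matrices; it returns eigenvalues in ascending order.
eigvals, eigvecs = np.linalg.eigh(C)
order = np.argsort(eigvals)[::-1]       # re-sort to descending importance
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

u1, u2 = eigvecs[:, 0], eigvecs[:, 1]   # first and second principal directions
scores = Xc @ eigvecs[:, :2]            # 2-D coordinates of every sample

dot_u1_u2 = float(u1 @ u2)                   # close to 0: directions are orthogonal
score_cov = scores.T @ scores / len(scores)  # close to diagonal: scores are uncorrelated
```

The rows of scores are the two-dimensional positions one would plot, with the first column playing the role of the first component and the second column the role of the second.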

Eigenvalue Analysis

  • Importance of Components:
    • PCA eigenvectors are orthogonal; each eigenvalue, as a share of the sum of all eigenvalues, shows that component's importance.
    • Power explains about 85% of the variance in the data.
    • Balance explains about 10% of the variance in the data.
    • The remaining components explain the rest.
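
Percentages like these come from dividing each eigenvalue by the sum of all eigenvalues. The following is a minimal sketch on synthetic data, so it does not reproduce the figures quoted above; the helper name explained_variance_ratio is an assumption.

```python
import numpy as np

def explained_variance_ratio(X):
    """Fraction of total variance carried by each principal component."""
    C = np.cov(X, rowvar=False)             # np.cov centres the columns itself
    eigvals = np.linalg.eigvalsh(C)[::-1]   # eigenvalues, largest first
    return eigvals / eigvals.sum()          # shares sum to 1

# Synthetic example: one dominant direction plus small noise in 3 features.
rng = np.random.default_rng(1)
t = rng.normal(size=(500, 1))
X = t @ np.array([[2.0, 1.0, 0.5]]) + 0.2 * rng.normal(size=(500, 3))

ratios = explained_variance_ratio(X)
cumulative = np.cumsum(ratios)              # variance kept by the first k components
```

Reading cumulative from left to right shows how much variance survives when only the first one, two, or three components are kept.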

Conclusion

  • PCA is a powerful tool for analyzing high-dimensional data.
  • For further exploration, references are available in the description.