Principal Component Analysis (PCA) Overview
Jul 13, 2024
Principal Component Analysis (PCA) Lecture Notes
Review of the Previous Video
Main Purpose of PCA
: Reduce the dimensionality of data.
Benefit
: Can give better results for machine learning algorithms.
Today's Learning Goals
Explore the mathematical aspects of PCA.
Describe the problem and solution.
Demonstrate with code.
What Kind of Problem Does PCA Solve?
Example: Data Reduction from 2D to 1D
Goal
: Project data onto a single linear vector/axis.
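Formula
: Projection formula: the projection of a data point x onto a unit vector U is the dot product U · x.
Find the unit vector (U) that maximizes the variance of these projections.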
How to do it
:
Project all data points onto a unit vector.
Calculate the variance for each projection.
Choose the unit vector with the maximum variance.
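The three steps above can be sketched as a brute-force search in 2D. This is only an illustration of the idea, not the method used in the lecture: the function name `best_direction_2d`, the angle grid, and the synthetic data are all assumptions. The unit vector the notes call U appears as `u`, and each projection is the dot product of a centered point with `u`.

```python
import numpy as np

def best_direction_2d(points, n_angles=180):
    """Brute-force search for the unit vector that maximizes projected variance."""
    centered = points - points.mean(axis=0)
    best_u, best_var = None, -np.inf
    # A direction and its opposite give the same variance, so [0, pi) is enough.
    for theta in np.linspace(0.0, np.pi, n_angles, endpoint=False):
        u = np.array([np.cos(theta), np.sin(theta)])  # candidate unit vector
        projections = centered @ u                    # coordinate of each point along u
        var = projections.var(ddof=1)                 # sample variance of the projections
        if var > best_var:
            best_u, best_var = u, var
    return best_u, best_var

# Hypothetical 2D data lying roughly along the line y = 2x.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
points = np.column_stack([x, 2.0 * x + rng.normal(scale=0.1, size=200)])

u, var = best_direction_2d(points)
```

In practice the search is unnecessary, because the maximizing direction is the leading eigenvector of the covariance matrix, but the loop makes the "maximize the variance" objective concrete.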
Covariance and Variance
Covariance
Definition
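: Measures how two features vary together: positive when they tend to increase together, negative when one tends to decrease as the other increases. The covariance of a feature with itself is its variance.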
Calculation
:
Covariance Matrix
:
Variances on the diagonal, covariances between pairs of features off the diagonal.
Symmetric, because cov(X, Y) = cov(Y, X).
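For reference, the standard definitions can be written as follows. The notes do not preserve the formula used in the lecture, so the sample normalization 1/(n-1) is an assumption here (some texts use 1/n), and the feature names F_1, F_2 are borrowed from the practical section of these notes.

```latex
\operatorname{cov}(X, Y) = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}),
\qquad
\operatorname{var}(X) = \operatorname{cov}(X, X)

\Sigma =
\begin{pmatrix}
\operatorname{var}(F_1) & \operatorname{cov}(F_1, F_2) \\
\operatorname{cov}(F_2, F_1) & \operatorname{var}(F_2)
\end{pmatrix}
```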
Eigenvectors and Eigenvalues
Their Importance
Eigenvectors
: Nonzero vectors that stay on the same line when transformed by a matrix; they are only scaled, not rotated.
Eigenvalues
: Indicate how much the corresponding eigenvector is stretched or shrunk.
Choosing Unique Vectors
: In PCA, select the eigenvector of the covariance matrix with the largest eigenvalue; it points along the direction of maximum variance.
The eigenvectors with the largest eigenvalues become the principal components.
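A small check of these definitions, assuming a made-up 2×2 symmetric matrix in place of a real covariance matrix. It uses `np.linalg.eigh`, the routine named later in these notes, and verifies that multiplying by the matrix only scales each eigenvector by its eigenvalue.

```python
import numpy as np

# A small symmetric matrix standing in for a covariance matrix (illustrative values).
cov = np.array([[4.0, 2.0],
                [2.0, 3.0]])

# eigh is intended for symmetric matrices. It returns eigenvalues in ascending
# order and the matching eigenvectors as the columns of the second result.
eigenvalues, eigenvectors = np.linalg.eigh(cov)

# Check the defining property cov @ v == eigenvalue * v for every pair.
for lam, v in zip(eigenvalues, eigenvectors.T):
    assert np.allclose(cov @ v, lam * v)

# The first principal component is the eigenvector with the largest eigenvalue.
first_pc = eigenvectors[:, np.argmax(eigenvalues)]
```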
PCA Step-by-step Process
Problem Formulation
Reduce a 3D dataset to 2D.
Centering Data
: Subtract the mean.
Covariance Matrix Calculation
: Calculate the variances and covariances.
Extract Eigenvalues and Eigenvectors
: From the covariance matrix.
Projections
: Project data onto new axes.
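The steps above can be put together in a short NumPy sketch. The function name `pca_project` and the random 3D data are assumptions made for illustration; only the sequence of steps comes from the notes.

```python
import numpy as np

def pca_project(X, n_components=2):
    """Project the rows of X onto the top n_components principal axes."""
    X_centered = X - X.mean(axis=0)                  # 1. center each feature
    cov = np.cov(X_centered, rowvar=False)           # 2. covariance matrix (features in columns)
    eigenvalues, eigenvectors = np.linalg.eigh(cov)  # 3. eigen-decomposition (ascending order)
    order = np.argsort(eigenvalues)[::-1]            #    reorder from largest to smallest
    components = eigenvectors[:, order[:n_components]]
    projected = X_centered @ components              # 4. project onto the new axes
    return projected, eigenvalues[order]

# Hypothetical 3D dataset with correlated features.
rng = np.random.default_rng(42)
mixing = np.array([[2.0, 0.0, 0.0],
                   [0.5, 1.0, 0.0],
                   [0.1, 0.2, 0.3]])
X = rng.normal(size=(100, 3)) @ mixing

X_2d, sorted_eigenvalues = pca_project(X, n_components=2)
```

A useful sanity check is that the sample variance of each projected column equals the corresponding eigenvalue, which is what "the eigenvalue measures the variance along that axis" means in practice.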
Practical Implementation
Prepare the Dataset
: A data frame with three columns F1, F2, F3.
Plot 3D Data.
Centering Data
: Using preprocessing.scale (scikit-learn), which by default also scales each feature to unit variance.
Extract Covariance Matrix
: Using NumPy.
Eigenvalues and Eigenvectors
: Using np.linalg.eigh.
Data Projection
: Create new data via dot product.
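A sketch of how these implementation steps might look end to end. The data values are invented (the notes only say the data frame has columns F1, F2, F3), the use of `np.cov` for the covariance step is an assumption, since the notes just say NumPy, and the plotting step is omitted. `preprocessing.scale` is taken to be the scikit-learn function.

```python
import numpy as np
import pandas as pd
from sklearn import preprocessing

# Hypothetical data: three features, two of them correlated.
rng = np.random.default_rng(1)
f1 = rng.normal(size=50)
df = pd.DataFrame({
    "F1": f1,
    "F2": 0.8 * f1 + rng.normal(scale=0.3, size=50),
    "F3": rng.normal(size=50),
})

# preprocessing.scale centers each column; by default it also scales to unit variance.
scaled = preprocessing.scale(df.to_numpy())

# Covariance matrix of the features (rowvar=False: columns are the variables).
cov = np.cov(scaled, rowvar=False)

# eigh returns eigenvalues in ascending order, so reverse the columns
# and keep the first two to get the two leading principal axes.
eigenvalues, eigenvectors = np.linalg.eigh(cov)
top2 = eigenvectors[:, ::-1][:, :2]

# Project the 3D data onto the 2D subspace with a dot product.
projected = pd.DataFrame(scaled.dot(top2), columns=["PC1", "PC2"])
```

Because `preprocessing.scale` standardizes as well as centers, this is PCA on standardized features; passing `with_std=False` would center only.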
Summary
PCA reduces the dimensionality of data while retaining the directions of greatest variance, which makes it useful in practical applications.
The lecture also discusses real-world applications of data reduction.