Machine Learning Algorithms: K-Means Clustering

Jul 10, 2024

Overview

Categories of ML Algorithms:
1. Supervised Learning
2. Unsupervised Learning
3. Reinforcement Learning
Today's focus: Unsupervised Learning (specifically K-Means Clustering)

K-Means Clustering

Objective: Identify clusters in a dataset without class labels.
Steps:
1. Choose k, the number of clusters.
2. Place k centroids at random positions in the feature space (2D in this example).
3. Assign each data point to its nearest centroid, forming the initial clusters.
4. Move each centroid to the mean position of the data points assigned to it.
5. Repeat steps 3 and 4 until no data point changes cluster.
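
The steps above can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the function name `kmeans`, the `max_iters` and `seed` parameters, and the sample points are all assumptions made for the example. It also assumes the random centroids are drawn from the data points themselves, which is one common way to initialize them.

```python
import math
import random


def kmeans(points, k, max_iters=100, seed=0):
    """Cluster a list of coordinate tuples into k groups.

    Returns (centroids, labels), where labels[i] is the index of the
    centroid assigned to points[i].
    """
    rng = random.Random(seed)
    # Step 2: pick k distinct data points as the starting centroids
    # (an assumed initialization strategy, see the note above).
    centroids = rng.sample(points, k)
    labels = None

    for _ in range(max_iters):
        # Step 3: assign each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Step 5: stop once no point changes cluster.
        if new_labels == labels:
            break
        labels = new_labels

        # Step 4: move each centroid to the mean of its assigned points.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))

    return centroids, labels


# Made-up 2D data with two well-separated groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5),
          (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, labels = kmeans(points, k=2)
print(centroids)
print(labels)
```

Which group ends up as cluster 0 depends on the random start, so only the grouping itself is meaningful, not the label numbers.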

Visual Example

Given a 2D dataset with no class labels.
Visual examination suggests two clusters.
Initial random centroids -> Assign points to closest centroid -> Adjust centroids -> Repeat till convergence.
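
To make the loop concrete, here is a small trace on made-up numbers. The four points and the two starting centroids are hypothetical, chosen so that both centroids start inside the same group and one point has to switch clusters before the assignments settle.

```python
import math

# Hypothetical 2D points forming two visually obvious groups.
points = [(1.0, 1.0), (2.0, 1.0), (9.0, 9.0), (10.0, 9.0)]
# Hypothetical starting centroids, both inside the lower-left group.
centroids = [(1.0, 1.0), (2.0, 1.0)]

history = []   # (labels, centroids) after each round
labels = None
for _ in range(10):  # safety cap; this data settles much sooner
    # Assign each point to its closest centroid.
    new_labels = [
        min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
        for p in points
    ]
    if new_labels == labels:
        break  # convergence: no point changed cluster
    labels = new_labels
    # Adjust each centroid to the mean of its assigned points.
    for j in range(len(centroids)):
        members = [p for p, lab in zip(points, labels) if lab == j]
        if members:
            centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    history.append((labels, list(centroids)))
    print(f"round {len(history)}: labels={labels}, centroids={centroids}")
```

In the first round the point (2.0, 1.0) is grouped with the two far-away points because it sits on the second centroid. Once that centroid moves toward the upper-right group, the point switches to the first cluster in round two, and the next assignment pass finds nothing left to change.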

Choosing `k` with Elbow Method

Elbow Method: Helps determine the optimal k.
- Calculate the Sum of Squared Errors (SSE) for a range of values of k.
- Plot SSE vs. k and look for the "elbow": the point where SSE stops dropping sharply as k increases.
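
A rough sketch of the SSE calculation, using only the standard library. Everything named here is an assumption for illustration: the helper `kmeans_sse`, the synthetic three-group dataset, the range of k from 1 to 6, and the choice to keep the best of ten random starts per k (a single random start can land in a poor clustering and distort the curve). SSE is taken as the sum of squared distances from each point to its assigned centroid.

```python
import math
import random


def kmeans_sse(points, k, seed=0, max_iters=100):
    """Run a basic k-means and return the SSE of the final clustering."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # assumed init: k random data points
    labels = None
    for _ in range(max_iters):
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    # SSE: squared distance from every point to its assigned centroid.
    return sum(math.dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))


# Synthetic data: three groups of 30 points each (made up for this sketch).
rng = random.Random(42)
centers = [(0.0, 0.0), (10.0, 0.0), (5.0, 8.0)]
points = [
    (cx + rng.gauss(0, 1), cy + rng.gauss(0, 1))
    for cx, cy in centers
    for _ in range(30)
]

sse_by_k = {}
for k in range(1, 7):
    # Keep the lowest SSE over several random starts.
    sse_by_k[k] = min(kmeans_sse(points, k, seed=s) for s in range(10))
    print(k, round(sse_by_k[k], 1))
```

The printed pairs are the values to plot. With groups this well separated, SSE would typically fall steeply up to k = 3 and only slightly afterwards, which is the bend the method looks for.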
