Machine Learning Algorithms: K-Means Clustering

Jul 10, 2024

Overview

  • Categories of ML Algorithms:
    1. Supervised Learning
    2. Unsupervised Learning
    3. Reinforcement Learning
  • Today's focus: Unsupervised Learning (specifically K-Means Clustering)

K-Means Clustering

  • Objective: Identify clusters in a dataset without class labels.
  • Steps:
    1. Choose k (the number of clusters).
    2. Randomly place k centroids in the feature space (2D in the example here).
    3. Assign each data point to the nearest centroid, forming initial clusters.
    4. Move each centroid to the mean position of the data points assigned to it.
    5. Repeat steps 3 and 4 until no data point changes cluster.
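
A minimal NumPy sketch of these steps, not a reference implementation. The function name `k_means`, the `max_iters` and `seed` parameters, the use of Euclidean distance, and the choice to initialise centroids from randomly chosen data points are all assumptions made for illustration; the notes only say the centroids are set randomly.

```python
import numpy as np

def k_means(points, k, max_iters=100, seed=0):
    """Cluster `points` (n_samples x n_features) into k groups."""
    rng = np.random.default_rng(seed)
    # Step 2: use k distinct data points as the initial centroids
    # (an assumed way of "randomly setting" them).
    start = rng.choice(len(points), size=k, replace=False)
    centroids = points[start].astype(float)
    labels = np.full(len(points), -1)

    for _ in range(max_iters):
        # Step 3: assign each point to its nearest centroid (Euclidean distance).
        distances = np.linalg.norm(
            points[:, None, :] - centroids[None, :, :], axis=2
        )
        new_labels = distances.argmin(axis=1)

        # Step 5: stop once no point changes cluster.
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels

        # Step 4: move each centroid to the mean of its assigned points.
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:  # an empty cluster keeps its old centroid
                centroids[j] = members.mean(axis=0)

    return centroids, labels
```

Calling `k_means(points, k=2)` on an array of shape (n, 2) returns the final centroids and one cluster index per point. The `max_iters` cap is only a safety net in case the assignments are slow to settle.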

Visual Example

  • Given a 2D dataset with no class labels.
  • Visual examination suggests two clusters.
  • Initial random centroids -> assign points to closest centroid -> adjust centroids -> repeat until convergence.
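
In place of the picture, the same walk-through can be traced numerically. The six points and the two starting centroids below are made up for illustration and are not the dataset from the lecture. The starting centroids are hand-picked rather than random so that the first pass produces a visibly wrong grouping.

```python
import numpy as np

# Hypothetical 2D dataset: two visually separate groups of three points.
points = np.array([[1.0, 1.0], [1.0, 2.0], [2.0, 1.0],
                   [8.0, 8.0], [8.0, 9.0], [9.0, 8.0]])
# Hand-picked (not random) starting centroids, both inside the first group.
centroids = np.array([[1.0, 1.0], [1.0, 2.0]])

history = []   # (labels, centroids) recorded after every pass
labels = None
while True:
    # Assign each point to its closest centroid.
    distances = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    new_labels = distances.argmin(axis=1)
    if labels is not None and np.array_equal(new_labels, labels):
        break  # converged: no point changed cluster
    labels = new_labels
    # Adjust each centroid to the mean of its points
    # (this toy data never produces an empty cluster).
    centroids = np.array([points[labels == j].mean(axis=0) for j in range(2)])
    history.append((labels.copy(), centroids.copy()))

for step, (lab, cen) in enumerate(history, start=1):
    print(f"pass {step}: labels={lab.tolist()} centroids={cen.round(2).tolist()}")
```

The first pass puts the point (1, 2) in the same cluster as the far group, which drags that centroid to (6.5, 6.75). The second pass corrects the assignment, and a third pass changes nothing, so the loop stops with the centroids at roughly (1.33, 1.33) and (8.33, 8.33).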

Choosing k with Elbow Method

  • Elbow Method: A heuristic for choosing a suitable k.
    • Calculate the Sum of Squared Errors (SSE), i.e. the total squared distance from each point to its assigned centroid, for varying values of k.
    • Plot SSE vs. k and look for the