K-Means Clustering Algorithm
Jul 19, 2024
Introduction
Goal:
Divide the dataset into k clusters using K-Means with Euclidean distance.
Data Points:
A1, A2, A3, B1, B2, B3, C1, C2.
Initial Centroids:
A1, B1, C1 (provided)
If none are provided, any k data points can be chosen as the initial centroids.
Steps to Apply K-Means Algorithm
Calculate distances:
Use the Euclidean distance formula to compute the distance from each data point to each centroid.
Euclidean Distance: \( \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2} \)
Assign Clusters: Assign each point to the cluster whose centroid is at the smallest distance.
Update Centroids:
Compute each new centroid as the mean of the points currently assigned to its cluster.
Repeat:
Recalculate distances with the new centroids and reassign points, repeating until convergence (see the sketch after this list).
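A minimal Python sketch of the per-iteration steps is shown below. The function names, the (x, y) tuple representation, and the use of math.dist are choices made here for illustration, not part of the original notes; the repeat-until-convergence loop is sketched under Following Iterations.

```python
from math import dist  # Euclidean distance: sqrt((x2 - x1)^2 + (y2 - y1)^2)

def assign_clusters(points, centroids):
    """Steps 1-2: label each point with the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda k: dist(p, centroids[k]))
            for p in points]

def update_centroids(points, labels, k):
    """Step 3: each new centroid is the mean of the points assigned to it."""
    new_centroids = []
    for j in range(k):
        members = [p for p, label in zip(points, labels) if label == j]
        # Assumes no cluster ends up empty, which holds for this example.
        new_centroids.append((sum(x for x, _ in members) / len(members),
                              sum(y for _, y in members) / len(members)))
    return new_centroids
```

One call to assign_clusters followed by update_centroids corresponds to one iteration of the algorithm as described above.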
Distance Calculation Example
Formula:
\[ \text{Distance} = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2} \]
Example:
Distance between A1 (2, 10) and centroid (2, 10)
\[ \sqrt{(2-2)^2 + (10-10)^2} = 0 \]
Example:
Distance between A1 (2, 10) and a point at (2, 5):
\[ \sqrt{(2-2)^2 + (10-5)^2} = 5 \]
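As a quick sanity check, Python's math.dist reproduces both results:

```python
from math import dist

print(dist((2, 10), (2, 10)))  # 0.0 -> A1 against a centroid at the same location
print(dist((2, 10), (2, 5)))   # 5.0 -> A1 against a point at (2, 5)
```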
First Iteration
Cluster assignments based on the smallest distances:
A1 to Cluster 1
A2, C1 to Cluster 3
A3, B1, B2, B3, C2 to Cluster 2
New Centroids Calculation:
Cluster 1: A1
Cluster 2: Mean of A3, B1, B2, B3, C2
Cluster 3: Mean of A2, C1
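The sketch below makes the centroid update concrete by averaging the coordinates of each cluster's members. Only A1's coordinates (2, 10) appear in these notes; the remaining coordinates are assumed from the commonly used version of this exercise, so treat the printed values as illustrative.

```python
# Assumed coordinates: only A1's appear in the notes; the rest are taken from
# the commonly used version of this exercise.
points = {
    "A1": (2, 10), "A2": (2, 5), "A3": (8, 4), "B1": (5, 8),
    "B2": (7, 5), "B3": (6, 4), "C1": (1, 2), "C2": (4, 9),
}

def centroid(names):
    """Mean of the coordinates of the named points."""
    xs = [points[n][0] for n in names]
    ys = [points[n][1] for n in names]
    return (sum(xs) / len(xs), sum(ys) / len(ys))

print(centroid(["A1"]))                          # Cluster 1: (2.0, 10.0)
print(centroid(["A3", "B1", "B2", "B3", "C2"]))  # Cluster 2: (6.0, 6.0)
print(centroid(["A2", "C1"]))                    # Cluster 3: (1.5, 3.5)
```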
Following Iterations
Recalculate distances using the new centroids.
Reassign clusters based on the smallest distance.
Update centroids as necessary.
Continue until cluster assignments do not change.
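Putting the assignment and update steps into a loop gives the stopping rule described here: iterate until no point changes cluster. The sketch below is a self-contained illustration (it inlines the two steps rather than reusing the helpers above) and assumes 2-D points as (x, y) tuples and that no cluster ever becomes empty.

```python
def kmeans(points, centroids):
    """Repeat assignment and centroid updates until labels stop changing."""
    labels = None
    while True:
        # Assignment step: index of the nearest centroid for each point
        # (comparing squared distances picks the same nearest centroid).
        new_labels = [
            min(range(len(centroids)),
                key=lambda k: (p[0] - centroids[k][0]) ** 2
                              + (p[1] - centroids[k][1]) ** 2)
            for p in points
        ]
        # Convergence: stop once assignments no longer change.
        if new_labels == labels:
            return labels, centroids
        labels = new_labels
        # Update step: each centroid becomes the mean of its assigned points.
        centroids = [
            (sum(p[0] for p, lab in zip(points, labels) if lab == k) / labels.count(k),
             sum(p[1] for p, lab in zip(points, labels) if lab == k) / labels.count(k))
            for k in range(len(centroids))
        ]
```

Calling kmeans with the eight points and the initial centroids A1, B1, C1 runs the whole procedure described in this section.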
Convergence
Final clusters are established when assignments stop changing.
Example final clusters:
Cluster 1: A1, B1, C2
Cluster 2: A3, B2, B3
Cluster 3: A2, C1
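A converged solution can be checked directly: every point should already be nearest to the centroid of its own final cluster, so another assignment pass would change nothing. The coordinates below are assumed (only A1's are given in these notes), taken from the commonly used version of this exercise.

```python
from math import dist

# Assumed coordinates (only A1's appear in the notes).
points = {
    "A1": (2, 10), "A2": (2, 5), "A3": (8, 4), "B1": (5, 8),
    "B2": (7, 5), "B3": (6, 4), "C1": (1, 2), "C2": (4, 9),
}
clusters = {
    1: ["A1", "B1", "C2"],
    2: ["A3", "B2", "B3"],
    3: ["A2", "C1"],
}

# Centroid of each final cluster = mean of its members' coordinates.
centroids = {
    c: (sum(points[n][0] for n in names) / len(names),
        sum(points[n][1] for n in names) / len(names))
    for c, names in clusters.items()
}

# Stability check: no point would switch clusters in another assignment pass.
for c, names in clusters.items():
    for n in names:
        nearest = min(centroids, key=lambda j: dist(points[n], centroids[j]))
        assert nearest == c, f"{n} would move from cluster {c} to {nearest}"
print("Assignments are stable; the clustering has converged.")
```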
Conclusion
The K-Means algorithm effectively partitions data points into clusters using iterative distance recalculation and centroid updates.