Understanding Hierarchical Clustering
Dec 4, 2024
Hierarchical Clustering Intuition
Overview
Discusses the mathematics behind hierarchical clustering
Comparison with k-means clustering: both are unsupervised machine learning techniques, but k-means requires choosing the number of clusters up front, while hierarchical clustering builds a dendrogram and lets you choose afterwards
Hierarchical clustering involves building a dendrogram
Hierarchical Clustering Process
Initial Step:
Start with each data point as an individual cluster
Finding Nearest Points:
Determine the two nearest points or clusters at each step
Use a dendrogram to visualize the hierarchy:
X-axis: points
Y-axis: distance at which clusters merge
Steps to Build Dendrogram:
Combine nearest points (e.g., P1 and P2)
Determine the next nearest points or clusters (e.g., P3 and P4)
Continue merging until all points belong to a single cluster (see the sketch below)
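A minimal sketch of this merge process using scipy's hierarchy tools; the five 2D points and the single-linkage choice are illustrative assumptions, not taken from the lecture:

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import linkage, dendrogram

# Hypothetical 2D coordinates standing in for P1..P5
points = np.array([
    [1.0, 1.0],   # P1
    [1.2, 1.1],   # P2 (nearest to P1, so merged first)
    [5.0, 5.0],   # P3
    [5.1, 4.8],   # P4 (nearest to P3, merged next)
    [9.0, 1.0],   # P5
])

# Single linkage merges the two clusters with the smallest
# point-to-point Euclidean distance at every step
Z = linkage(points, method="single", metric="euclidean")

# X-axis: points, Y-axis: distance at which each merge happened
dendrogram(Z, labels=["P1", "P2", "P3", "P4", "P5"])
plt.xlabel("points")
plt.ylabel("distance")
plt.show()
```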
Distance Calculation
Uses Euclidean distance:
Formula: \( d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2} \)
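As a quick sanity check of the formula, a small sketch; the sample points (1, 1) and (4, 5) are made up:

```python
import math

def euclidean_distance(p, q):
    """Straight-line distance between two 2D points."""
    return math.sqrt((q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2)

# sqrt((4 - 1)^2 + (5 - 1)^2) = sqrt(9 + 16) = 5.0
print(euclidean_distance((1, 1), (4, 5)))  # 5.0
```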
Determining Number of Clusters
Find the longest vertical line in the dendrogram that is not crossed by any (extended) horizontal merge line
Draw a horizontal cut through that line; the number of vertical lines the cut intersects is the number of clusters
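Once a cut height has been read off the dendrogram, scipy's fcluster can turn it into cluster labels; the points and the threshold of 3.0 below are hypothetical values for illustration:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Same hypothetical points as in the dendrogram sketch above
points = np.array([[1.0, 1.0], [1.2, 1.1],
                   [5.0, 5.0], [5.1, 4.8],
                   [9.0, 1.0]])
Z = linkage(points, method="single", metric="euclidean")

# Cut the tree at distance 3.0 (a height chosen by eye from the
# dendrogram); every merge above the cut is undone
labels = fcluster(Z, t=3.0, criterion="distance")
print(labels)  # e.g., [1 1 2 2 3] -> three clusters
```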
Practical Implementation
Use sklearn's AgglomerativeClustering to implement hierarchical clustering based on Euclidean distance
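A minimal sklearn sketch, assuming the same toy points and the three clusters read off the dendrogram above:

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering

# Same hypothetical points as above
points = np.array([[1.0, 1.0], [1.2, 1.1],
                   [5.0, 5.0], [5.1, 4.8],
                   [9.0, 1.0]])

# n_clusters comes from reading the dendrogram; single linkage and the
# default Euclidean metric match the distance used in the sketches above
model = AgglomerativeClustering(n_clusters=3, linkage="single")
labels = model.fit_predict(points)
print(labels)  # e.g., [0 0 1 1 2]
```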
Key Concepts
Unsupervised Learning:
Focus on grouping data points based on similarity (distance)
Dendrogram:
A visual representation of the clustering hierarchy
Conclusion
Hierarchical clustering groups data points by repeatedly merging the closest clusters, with closeness typically measured by Euclidean distance.
The longest-vertical-line heuristic helps determine the number of clusters.
Additional Notes
Review related topics such as k-means and Euclidean distance for better comprehension.
Consider using sklearn for practical hierarchical clustering applications.
Keep learning and exploring machine learning techniques!