# Clustering

## Algorithms

- [[k-means|K-Means]]
- [[dbscan|DBSCAN]]
- To reduce dimensions
  - [[pca|PCA]]
  - [[t-sne|t-SNE]]

## Evaluation

- High intra-cluster similarity, low inter-cluster similarity
- External
  - Precision, Recall
  - F1 score
- Internal Consistency
  - Sum of Squared Errors (SSE)
  - BetaCV: compactness and separability, the ratio of mean intra-cluster distance to mean inter-cluster distance (lower is better)
  - _Silhouette Coefficient_
    - Cohesion and separation
    - Ranges from $-1$ to $1$
    - Calculate the average coefficient over all points; roughly $0.7$ is strong, $0.5$ is reasonable, $0.2$ is poor
    - $a(i)$ = mean distance from point $i$ to all other points in the same cluster, $b(i)$ = mean distance from point $i$ to all points in the nearest other cluster

$$ s(i) = \frac{b(i) - a(i)}{\max(a(i), b(i))} $$
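
The silhouette formula can be sketched in plain Python as below. This is a minimal illustration, not a reference implementation: it assumes Euclidean distance, and the names `silhouette_values` and `silhouette_coefficient`, the toy `points`, and the convention of scoring a singleton cluster as $0$ are all choices made for this example.

```python
from math import dist
from statistics import mean


def silhouette_values(points, labels):
    """Return s(i) for every point, given one cluster label per point."""
    # Group point indices by cluster label.
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("the silhouette needs at least two clusters")

    scores = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            # Assumed convention: a point alone in its cluster gets s(i) = 0.
            scores.append(0.0)
            continue
        # a(i): mean distance to the other points in the same cluster.
        a = mean(dist(points[i], points[j]) for j in own)
        # b(i): smallest mean distance to the points of any other cluster.
        b = min(
            mean(dist(points[i], points[j]) for j in members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return scores


def silhouette_coefficient(points, labels):
    """Average s(i) over all points."""
    return mean(silhouette_values(points, labels))


# Toy data: two tight pairs of points that lie far apart.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = silhouette_coefficient(points, [0, 0, 1, 1])  # pairs kept together
bad = silhouette_coefficient(points, [0, 1, 0, 1])   # pairs split apart
print(good, bad)
```

With the pairs kept together the average lands in the "strong" range; splitting them across clusters makes $a(i) > b(i)$ for every point, so the average turns negative.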