# Clustering
## Algorithms
- [[k-means|K-Means]]
- [[dbscan|DBSCAN]]
- To reduce dimensions
  - [[pca|PCA]]
  - [[t-sne|t-SNE]]
## Evaluation
- Goal: high intra-cluster similarity, low inter-cluster similarity
- External (compares clusters against ground-truth labels)
  - Precision, Recall
  - F1 score
- Internal Consistency
  - Sum of Squared Errors (SSE)
  - BetaCV: compactness and separability, the ratio of the mean intra-cluster
    distance to the mean inter-cluster distance (lower is better)
  - _Silhouette Coefficient_
    - Cohesion and separation
    - Ranges from $-1$ to $1$
    - Average the coefficient over all points; as a rule of thumb, $0.7$ is
      strong, $0.5$ is reasonable, $0.2$ is poor
    - $a(i)$ = mean distance from point $i$ to all other points in the same
      cluster, $b(i)$ = mean distance from point $i$ to all points in the
      nearest other cluster
$$
s(i) = \frac{b(i) - a(i)}{\max(a(i), b(i))}
$$
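
The silhouette definition above can be sketched in plain Python. This is a minimal illustration, not a reference implementation. It assumes Euclidean distance, points given as coordinate tuples, and the common convention that a point alone in its cluster gets $s(i) = 0$. The function name `silhouette_values` and the sample data are hypothetical.

```python
import math


def silhouette_values(points, labels):
    """Return the silhouette coefficient s(i) of every point (Euclidean distance)."""
    clusters = set(labels)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    values = []
    for i, (p, own) in enumerate(zip(points, labels)):
        # Mean distance from point i to the members of each cluster,
        # leaving point i itself out of its own cluster.
        mean_dist = {}
        for c in clusters:
            d = [
                math.dist(p, q)
                for j, (q, lab) in enumerate(zip(points, labels))
                if lab == c and j != i
            ]
            if d:
                mean_dist[c] = sum(d) / len(d)

        if own not in mean_dist:
            # Assumed convention: a singleton cluster has s(i) = 0.
            values.append(0.0)
            continue

        a = mean_dist[own]  # cohesion: a(i)
        b = min(v for c, v in mean_dist.items() if c != own)  # separation: b(i)
        denom = max(a, b)
        # Guard against all-identical points, where both distances are 0.
        values.append((b - a) / denom if denom > 0 else 0.0)
    return values


# Two tight, well-separated clusters (made-up sample data).
points = [(0, 0), (0, 1), (10, 0), (10, 1)]
labels = [0, 0, 1, 1]
s = silhouette_values(points, labels)
print(sum(s) / len(s))  # average silhouette coefficient of the clustering
```

The loop is quadratic in the number of points, which is fine for a small example. For real datasets a vectorized library routine is likely the better choice.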