Data Mining: Model-Based Clustering
Maximization Step
objects and
iterations
Conceptual Clustering
Conceptual clustering
Ck)
Sibling nodes at given level form a partition
Category Utility
processed
Can use adaptive strategy
PROCLUS (PROjected CLUStering)
A dimension-reduction subspace clustering technique
Finds an initial approximation of the clusters in the high-dimensional space
Avoids generating a large number of overlapping clusters of lower dimensionality
Finds the best set of medoids by a hill-climbing process (similar to CLARANS)
Uses the Manhattan segmental distance measure
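The Manhattan segmental distance can be illustrated with a short sketch. It is the Manhattan distance restricted to a chosen set of dimensions and divided by the number of those dimensions, so distances measured in subspaces of different sizes stay comparable. The function name and the sample points below are assumptions made for illustration.

```python
def manhattan_segmental_distance(x, y, dims):
    """Average absolute difference between x and y over the dimensions in dims."""
    if not dims:
        raise ValueError("dims must contain at least one dimension index")
    return sum(abs(x[d] - y[d]) for d in dims) / len(dims)


# Illustrative points (not from the slides).
p = [1.0, 5.0, 9.0]
q = [4.0, 5.0, 1.0]

# Restricted to dimensions 0 and 1: (3 + 0) / 2
print(manhattan_segmental_distance(p, q, [0, 1]))  # 1.5
```

Dividing by the number of dimensions is what allows a medoid with two relevant dimensions to be compared fairly with one that has five.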
Initialization Phase
Greedy algorithm selects a set of initial medoids that are far apart
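One common way to pick points that are far apart is a farthest-first greedy pass: start from a random point, then repeatedly add the point whose distance to its nearest already-chosen point is largest. The sketch below is a minimal version of that idea, not necessarily the exact procedure from the slides. It assumes plain Manhattan distance over all dimensions and runs on the full point list for simplicity; `greedy_far_apart` and its parameters are hypothetical names.

```python
import random


def manhattan(x, y):
    """Manhattan distance over all dimensions."""
    return sum(abs(a - b) for a, b in zip(x, y))


def greedy_far_apart(points, k, seed=0):
    """Return indices of up to k points chosen farthest-first."""
    if not points or k <= 0:
        return []
    rng = random.Random(seed)
    first = rng.randrange(len(points))
    chosen = [first]
    # nearest[i] = distance from point i to its closest chosen point
    nearest = [manhattan(p, points[first]) for p in points]
    while len(chosen) < min(k, len(points)):
        candidates = [i for i in range(len(points)) if i not in chosen]
        nxt = max(candidates, key=lambda i: nearest[i])
        chosen.append(nxt)
        nearest = [min(nearest[i], manhattan(points[i], points[nxt]))
                   for i in range(len(points))]
    return chosen


# Three well-separated groups (illustrative data).
pts = [(0, 0), (0, 1), (10, 10), (10, 11), (20, 0)]
print(greedy_far_apart(pts, 3))
```

With three well-separated groups and k = 3, each chosen index falls in a different group regardless of the random starting point.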
Iteration Phase
Selects a random set of k medoids
Replaces bad medoids
For each medoid, a set of dimensions is chosen along which the average distances are small
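The dimension-selection idea can be sketched as follows: for one medoid, compute the average absolute distance of its nearby points along each dimension and keep the dimensions where that average is smallest. This is a simplified illustration under stated assumptions. It takes the neighboring points and the number of dimensions to keep as given, and it ranks by raw averages, whereas the full algorithm may normalize these values and balance the choice across medoids. The name `pick_dimensions` is hypothetical.

```python
def pick_dimensions(medoid, neighbors, num_dims):
    """Return the num_dims dimension indices with the smallest average
    absolute distance between the medoid and its neighboring points."""
    if not neighbors:
        raise ValueError("neighbors must not be empty")
    n = len(medoid)
    avg = [sum(abs(p[d] - medoid[d]) for p in neighbors) / len(neighbors)
           for d in range(n)]
    ranked = sorted(range(n), key=lambda d: avg[d])
    return sorted(ranked[:num_dims])


# Neighbors are tight along dimensions 0 and 2, spread out along 1.
medoid = (0.0, 0.0, 0.0)
neighbors = [(0.1, 5.0, 0.2), (-0.1, -4.0, 0.1), (0.2, 6.0, -0.3)]
print(pick_dimensions(medoid, neighbors, 2))  # [0, 2]
```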
Refinement Phase
Computes new dimensions for each medoid based on the clusters found, reassigns points to medoids, and removes outliers
Frequent Pattern based Clustering
Frequent patterns may also form clusters
Instead of growing clusters dimension by dimension, sets of frequent itemsets are determined
Two common techniques
Frequent term-based text Clustering
Clustering by Pattern similarity
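The core idea behind frequent term-based text clustering can be sketched briefly: treat each document as a set of terms, find term sets that occur in at least a minimum number of documents, and take the documents covered by each frequent term set as a candidate cluster. The sketch below is a minimal brute-force version under stated assumptions. It enumerates term sets up to a small size instead of using a proper frequent-itemset miner, and it omits the later step of selecting a final set of clusters from the candidates. The function name, parameters, and sample documents are illustrative.

```python
from itertools import combinations


def frequent_term_sets(docs, min_support, max_size=2):
    """Map each frequent term set (a sorted tuple) to its cover, i.e. the
    indices of the documents that contain every term in the set."""
    vocab = sorted(set().union(*docs))
    result = {}
    for size in range(1, max_size + 1):
        for terms in combinations(vocab, size):
            cover = frozenset(i for i, d in enumerate(docs)
                              if set(terms) <= d)
            if len(cover) >= min_support:
                result[terms] = cover
    return result


# Illustrative documents, already reduced to term sets.
docs = [
    {"data", "mining", "cluster"},
    {"data", "mining"},
    {"cluster", "medoid"},
    {"cluster", "medoid", "data"},
]
for terms, cover in frequent_term_sets(docs, min_support=2).items():
    print(terms, sorted(cover))
```

Each cover is described by its term set, so a candidate cluster comes with a readable label such as ("data", "mining").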