ML Paper Review
INSTITUTE OF TECHNOLOGY
SCHOOL OF COMPUTING
DEPARTMENT OF COMPUTER SCIENCE (POSTGRADUATE)
Paper Review for the "Machine Learning" Course
Title: "K-Multiple-Means: A Multiple-Means Clustering Method with Specified K Clusters"
Hayu Bekele (SGS/0039/12)
INTRODUCTION
Machine learning is a field of study devoted to systems that perform tasks such as prediction, diagnosis, and object recognition, tasks generally considered part of artificial intelligence. In machine learning these tasks are associated with pattern recognition, which falls into three learning types: supervised learning, unsupervised learning, and semi-supervised learning. The paper under review is concerned with unsupervised learning, most commonly realized as clustering. Clustering is a central concept in data science, and its form may differ according to the purpose of the grouping. The idea of clustering is this: given a set of unlabeled objects, group the objects so that objects with high similarity end up in the same group. According to the paper, although there are several types of clustering, k-means has been the most common clustering algorithm for over a decade. The paper describes clearly how k-means works and gives particular emphasis to Fuzzy C-Means, a variant of k-means, in order to compare it with the proposed work (K-Multiple-Means).
The paper presents K-Multiple-Means as advantageous over other clustering approaches, including nonlinear clustering methods (kernel-based clustering and spectral clustering) and multiple-prototype methods (class centers or class means). The main motivations for the algorithm are the need for sub-clusters within a cluster and the problem of overlap between different categories.
The statement of the problem
Owing to the robustness of the k-means algorithm (whether hard or fuzzy), it has attracted much attention in the data science community and produced impressive results. However, k-means is only well suited to hyper-spherical clusters: its squared-error criterion makes it difficult to capture non-convex patterns. Besides, many applications require subclasses of each class, which is hard to obtain with a single prototype per cluster. Nonlinear methods are raised as an option: spectral clustering, which performs a low-dimensional embedding of the similarity matrix of the data points, and kernel-based clustering, which maps the data into a feature space where a linear partitioning method can be applied. As discussed in the paper, these two methods address the limitation of linear separation by allowing clusters to be separated by hyper-surfaces. However, it is still difficult to design the construction of the data graph or the kernel. As a further alternative, the paper surveys the research area of 'class means', which focuses on multiple prototypes, to find the gaps leading to the proposed method, K-Multiple-Means (KMM). In contrast to a single nonlinear prototype, these class means (also known as class centers) can adjust to non-spherical shapes by allowing more than one representative per class. Even though multi-prototype k-means methods can detect non-spherical clusters, the paper states that work in this direction is still limited. As motivation for their proposed algorithm, the authors therefore identified the following gaps in the k-means-type algorithms described above.
The gaps identified in the paper
Drawing on several references, the authors describe the two main types of k-means algorithms (single-prototype and multi-prototype) in order to identify and address the existing gaps.
• In single-prototype k-means methods, the squared-error criterion works well for hyper-spherical clusters but prevents the algorithm from capturing non-convex patterns.
• In multi-prototype k-means methods, non-spherical clusters can be detected correctly, but the existing work is still insufficient.
• Most multi-prototype methods consist of merge and split steps based on an agglomerative strategy, and this strategy encounters difficulties in choosing what to merge or split.

In response to these gaps, the proposed method is designed so that:

• Each object is allowed to have memberships in its neighboring sub-clusters rather than a distinct membership in one single sub-cluster.
• The partition is based on both the distribution of the multiple prototypes and the distribution of the data points.
• The assignments are updated iteratively, whether the sub-cluster assignment of each data point or the cluster assignment of each prototype.
The aim of the paper
The 'K-Multiple-Means' (KMM) method is thus proposed to extend k-means, especially multi-prototype k-means, making it more flexible by providing the additional features listed above.
The KMM method is built on the paper's second equation, a regularized objective. The second term of that equation acts as a regularizer controlled by the parameter γ: if γ is zero, the term vanishes and the equation reduces to the k-means squared-error criterion, which yields hard clustering. As γ grows larger, every data point becomes connected to all m prototypes with the same probability 1/m. Between these extremes, the connection of a data point x_i to a neighboring prototype a_j is held with probability s_ij.
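A sketch of that second equation, reconstructed from the description above (and assuming the standard KMM formulation), is

\[
\min_{S,A}\ \sum_{i=1}^{n}\sum_{j=1}^{m}\Big(\lVert x_i-a_j\rVert_2^2\, s_{ij}+\gamma\, s_{ij}^2\Big)
\quad \text{s.t.}\ \forall i:\ s_i^{\top}\mathbf{1}=1,\ \ 0\le s_{ij}\le 1,
\]

where a_1, ..., a_m are the m prototype means and s_ij is the probability that x_i is connected to prototype a_j. With γ = 0 the second (regularization) term disappears, and as γ grows every s_ij is pushed toward 1/m.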
The following is the step-wise derivation used in the paper to assign neighboring prototypes to a data point x_i. First, the assignment of neighboring prototypes to x_i is written in vector form, as follows.
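The equation itself is not reproduced here; as a sketch, assuming the squared distances d_ij = ||x_i - a_j||_2^2 are collected into a vector d_i in R^m, the problem decouples over the data points and each row s_i of S solves

\[
\min_{s_i^{\top}\mathbf{1}=1,\ s_{ij}\ge 0}\ \Big\lVert s_i+\frac{d_i}{2\gamma}\Big\rVert_2^2 .
\]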
Next, referring to Feiping Nie, Xiaoqian Wang, and Heng Huang, "Clustering and Projected Clustering with Adaptive Neighbors" (2014), the above problem is solved with a closed-form solution.
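As a sketch of that closed form (assuming, as in the cited work, that each x_i is connected to its p nearest prototypes and the distances are sorted so that d_{i1} ≤ d_{i2} ≤ ... ≤ d_{im}):

\[
s_{ij}=\frac{d_{i,p+1}-d_{ij}}{p\,d_{i,p+1}-\sum_{h=1}^{p} d_{ih}}\quad\text{for } j\le p,
\qquad s_{ij}=0\quad\text{for } j>p ,
\]

which also suggests the per-point setting γ_i = (p/2) d_{i,p+1} - (1/2) Σ_{h=1}^{p} d_{ih}, so that each point has exactly p neighboring prototypes.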
Then, by updating S and iteratively updating the prototypes a_j, the derivation arrives at the following equation, which achieves the ideal neighbor assignment.
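The prototype update is not spelled out in this review; as a sketch, fixing S and minimizing the same objective over A gives each prototype as the weighted mean of the points connected to it:

\[
a_j=\frac{\sum_{i=1}^{n} s_{ij}\, x_i}{\sum_{i=1}^{n} s_{ij}} .
\]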
However, this technique alone still does not make it easy to obtain exactly the specified number of clusters. To overcome this challenge, the authors proposed an optimization strategy. In this method, a block matrix built from S and its transpose is used, together with the corresponding degree matrix, to form a normalized Laplacian matrix whose important property is captured by the following theorem.
“The multiplicity k of the eigenvalue 0 of the normalized Laplacian matrix LS is equal to the number of
connected components in the bipartite graph associated with S.”
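A sketch of the construction behind this theorem, assuming the standard bipartite-graph setup: the affinity matrix of the bipartite graph between data points and prototypes, and its normalized Laplacian, are

\[
P=\begin{pmatrix}\mathbf{0} & S\\ S^{\top} & \mathbf{0}\end{pmatrix}\in\mathbb{R}^{(n+m)\times(n+m)},
\qquad
L_S=I-D^{-1/2}\,P\,D^{-1/2},
\]

where D is the diagonal degree matrix of P.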
According to the theorem stated above, the following problem is formed.
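A sketch of that problem, consistent with the theorem (exactly k connected components means the eigenvalue 0 of L_S has multiplicity k, i.e. rank(L_S) = n + m - k):

\[
\min_{S,A}\ \sum_{i,j}\Big(\lVert x_i-a_j\rVert_2^2\, s_{ij}+\gamma\, s_{ij}^2\Big)
\quad \text{s.t.}\ s_i^{\top}\mathbf{1}=1,\ s_{ij}\ge 0,\ \operatorname{rank}(L_S)=n+m-k .
\]

The rank constraint is hard to handle directly, so it is relaxed by penalizing the sum of the k smallest eigenvalues of L_S, which by Ky Fan's theorem equals min_{F^T F = I} Tr(F^T L_S F) over F in R^{(n+m)×k}; this is where the auxiliary matrix F mentioned below comes from.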
As proposed in the optimization strategy, the terms S, F, and A in the above equation are updated iteratively. This is done by alternating between "fix A, update S and F" and "fix S and F, update A".
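To make the alternating scheme concrete, here is a minimal Python sketch of the loop structure. It is not the authors' exact algorithm: the S-step uses the closed-form adaptive-neighbor weights from the 2014 reference, and the spectral step (F) and the rank constraint on the bipartite Laplacian are omitted for brevity; the function name kmm_sketch and the parameter p (number of neighboring prototypes) are illustrative.

```python
import numpy as np

def kmm_sketch(X, m, p=5, n_iter=20, seed=0):
    """Simplified alternating optimization in the spirit of KMM:
    repeat { fix A, update S } and { fix S, update A }.
    The F-update and rank constraint are omitted, so this does NOT
    enforce exactly k final clusters. Assumes m > p."""
    n, _ = X.shape
    rng = np.random.default_rng(seed)
    A = X[rng.choice(n, size=m, replace=False)]  # init m prototypes from the data

    for _ in range(n_iter):
        # Fix A, update S: squared distances from every point to every prototype.
        D = ((X[:, None, :] - A[None, :, :]) ** 2).sum(axis=2)  # shape (n, m)
        S = np.zeros((n, m))
        for i in range(n):
            order = np.argsort(D[i])
            d = D[i, order]
            # Closed-form adaptive-neighbor weights over the p nearest prototypes:
            # s_ij = (d_{i,p+1} - d_ij) / (p * d_{i,p+1} - sum of p nearest).
            num = d[p] - d[:p]
            den = p * d[p] - d[:p].sum()
            S[i, order[:p]] = num / max(den, 1e-12)
        # Fix S, update A: each prototype is the weighted mean of its points.
        A = (S.T @ X) / np.maximum(S.sum(axis=0), 1e-12)[:, None]
    return S, A
```

For example, S, A = kmm_sketch(X, m=30, p=5) on data X of shape (n, d) yields soft assignments S and prototypes A; the paper's full method would additionally partition the bipartite graph defined by S into exactly k clusters.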
Analysis
The main analysis in the paper is theoretical, establishing the connection between k-means clustering and K-Multiple-Means. Notably, the feature KMM adds over k-means clustering is the regularization parameter γ: the (squared-error) objective of KMM becomes the same as that of k-means clustering when γ = 0. This shows the linkage between k-means and K-Multiple-Means clustering.
Experiments
In the paper two basic experiments has been applied. The first experiment was ‘experiment on
synthetic data set’. In this experiment multi exemplar affinity propagation (MEAP) algorithm
and the variant of MEAP K-MEAP ware conducted. The second is ‘experiment on real
benchmark data sets. In this experiment six world data sets “Abalone, Ecoli, HTRU2, Palm,
BinAlpha, Wine” are conducted.
To be honest, it is a little difficult for me to find the weaknesses of this paper and criticize it; however, the mathematical operations, the derivations from one equation to the next, and some of the theorems are hard to understand easily.
CONCLUSION
In the paper, the K-Multiple-Means method is proposed to group data points with multiple sub-cluster means into the specified k clusters. The proposed method formalizes the multiple-means clustering problem as an optimization problem and updates the partitions of the m sub-cluster means and the k clusters by an alternating optimization strategy. In each iteration, the data points with multiple means are grouped based on the partition of a bipartite graph associated with the similarity matrix. A theoretical analysis of the connection between the KMM method and k-means clustering is given, and extensive experiments were conducted to demonstrate the effectiveness of the algorithm.