User profiles for Andrea Vattani
Andrea Vattani: ML Leadership at Reddit, Computer Science PhD from UC San Diego. Verified email at cs.ucsd.edu. Cited by 1964.
K-means requires exponentially many iterations even in the plane
A Vattani - Proceedings of the twenty-fifth annual symposium on …, 2009 - dl.acm.org
The k-means algorithm is a well-known method for partitioning n points that lie in the d-dimensional
space into k clusters. Its main features are simplicity and speed in practice. …
Scalable k-means++
Over half a century old and showing no signs of aging, k-means remains one of the most
popular data processing algorithms. As is well-known, a proper initialization of k-means is …
Fast greedy algorithms in mapreduce and streaming
Greedy algorithms are practitioners’ best friends—they are intuitive, are simple to implement,
and often lead to very good solutions. However, implementing greedy algorithms in a …
Hartigan's method: k-means clustering without voronoi
M Telgarsky, A Vattani - Proceedings of the thirteenth …, 2010 - proceedings.mlr.press
Hartigan’s method for $k$-means clustering is the following greedy heuristic: select a point,
and optimally reassign it. This paper develops two other formulations of the heuristic, one …
The hardness of k-means clustering in the plane
A Vattani - Manuscript, accessible at http://cseweb.ucsd.edu …, 2009 - cseweb.ucsd.edu
We show that k-means clustering is an NP-hard optimization problem, even for instances in
the plane. Specifically, the hardness holds for $k = \Theta(n^{\epsilon})$, for any $\epsilon > 0$, where $n$ is the number …
Finding red balloons with split contracts: robustness to individuals' selfishness
The present work deals with the problem of information acquisition in a strategic networked
environment. To study this problem, Kleinberg and Raghavan (FOCS 2005) introduced the …
Near-optimal bounds for cross-validation via loss stability
…, S Vassilvitskii, A Vattani - International …, 2013 - proceedings.mlr.press
Multi-fold cross-validation is an established practice to estimate the error rate of a learning
algorithm. Quantifying the variance reduction gains due to cross-validation has been …
Learning mixtures of Gaussians using the k-means algorithm
One of the most popular algorithms for clustering in Euclidean space is the $k$-means
algorithm; $k$-means is difficult to analyze mathematically, and few theoretical guarantees are …
Hiring a secretary from a poset
The secretary problem lies at the core of mechanism design for online auctions. In this work
we study the generalization of the classical secretary problem in a setting where there is only …
Preserving personalized pagerank in subgraphs
Choosing a subgraph that can concisely represent a large real-world graph is useful in many
scenarios. The usual strategy employed is to sample nodes so that the induced subgraph …