Toyota Technological Institute at Chicago (TTIC)
Performance of Johnson-Lindenstrauss Transform for k-Means and k-Medians Clustering
Consider an instance of Euclidean k-means or k-medians clustering. We show that the cost of the optimal solution is preserved up to a factor of (1+ε) under a projection onto a random O(log(k/ε)/ε^2)-dimensional subspace. Further, the cost of every clustering is preserved within (1+ε). More generally, our result applies to any dimension reduction satisfying a mild sub-Gaussian-tail condition. Our bound on the dimension is nearly optimal. Additionally, our result applies to Euclidean k-clustering with the distances raised to the p-th power for any constant p.
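The phenomenon in the abstract can be checked numerically: project a point set with a scaled Gaussian matrix and compare the k-means cost of a fixed clustering before and after projection. This is an illustrative sketch only, not the construction from the paper; the data, the fixed labeling, and the target dimension (20, chosen for illustration rather than set to O(log(k/ε)/ε²)) are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 300, 50, 3
X = rng.normal(size=(n, d))           # synthetic points (illustrative)
labels = rng.integers(0, k, size=n)   # an arbitrary fixed clustering

def kmeans_cost(P, labels, k):
    # k-means cost of a clustering: total squared distance to cluster centroids
    return sum(((P[labels == j] - P[labels == j].mean(axis=0)) ** 2).sum()
               for j in range(k))

target_dim = 20  # illustrative; the theorem uses O(log(k/eps)/eps^2) dimensions
# Scaled Gaussian projection: preserves squared distances in expectation
G = rng.normal(size=(d, target_dim)) / np.sqrt(target_dim)
Y = X @ G

ratio = kmeans_cost(Y, labels, k) / kmeans_cost(X, labels, k)
print(round(ratio, 3))  # close to 1: the clustering's cost is nearly preserved
```

With this seed the ratio lands within a few percent of 1, matching the abstract's claim that the cost of every clustering (not just the optimum) is preserved up to a (1+ε) factor.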
For k-means, our result resolves an open problem posed by Cohen, Elder, Musco, Musco, and Persu (STOC 2015); for k-medians, it answers a question raised by Kannan.
Based on joint work with Konstantin Makarychev and Ilya Razenshteyn.
Speaker Bio: Yury Makarychev received his MS degree in Mathematics from Moscow State University in 2001 and his PhD degree in Computer Science from Princeton University in 2008. He spent the following two years as a Postdoctoral Researcher at Microsoft Research in Redmond, WA and Cambridge, MA.
Yury’s research interests include combinatorial optimization, approximation algorithms, semi-definite programming, and low-distortion metric embedding. He has recently worked on developing approximation algorithms for unique games, constraint satisfaction, graph partitioning, and vertex ordering problems, investigating tradeoffs between local and global properties of metric spaces, and studying lift-and-project hierarchies of mathematical relaxations.