|   | CMU-CS-02-193 Computer Science Department
 School of Computer Science, Carnegie Mellon University
 
    
     
 CMU-CS-02-193
 
Adventures in Ultra-Small-Space Clustering 
Adam Meyerson, Liadan O'Callaghan*, Serge Plotkin* 
December 2002  
CMU-CS-02-193.psCMU-CS-02-193.pdf
 Keywords: Algorithms, approximation, clustering, sampling,
k-median, facility location
 We give a sampling-based algorithm for the k-Median problem,
with running time where [.....] k is the desired cluster number and
is a confidence parameter. This is the first k-Median algorithm
with fully polynominial running time that is independent of n,
the size of the data set. It gives a solution that is, with high
probability, an O(1) approximation if each cluster in the optimal
solution has [....] points. We give near-matching lower bounds showing that
this assumption about cluster size is necessary. We also present a
related algorithm for finding a clustering that excludes a small number
of outliers.
 
13 pages  
* Department of Computer Science, Stanford University
 |