Vectors, Sampling and Massive Data

Tuesday, November 1, 2011 - 4:30pm
Klaus 1116
Ravi Kannan, Principal Researcher
Microsoft Research India
Modeling data as high-dimensional (feature) vectors is a staple in Computer Science; its use in ranking web pages reminds us again of its effectiveness. Algorithms from Linear Algebra (LA) provide a crucial toolkit. But for modern problems with massive data, these algorithms may take too long. Random sampling to reduce the size of the data suggests itself. I will give a from-first-principles description of the LA connection, then discuss sampling techniques developed over the last decade for vectors, matrices and graphs. Besides saving time, sampling leads to sparsification and compression of data.

Reception in the Atrium of the Klaus building at 4:00pm.