Friday, 01 March 2013 at 11:30am
Room 1400 Biomedical and Physical Sciences Bldg.
Refreshments at 11:30
Speaker: Mauro Maggioni, Departments of Mathematics, Computer Science, and Electrical and Computer Engineering, Duke University
Title: Multiscale Geometric Methods for Data in High Dimensions
Abstract:
We discuss recent work on multiscale geometric analysis applied to
high-dimensional data sets. A first application is the estimation of the
intrinsic dimension of noisy data; a second is the construction of
data-driven dictionaries for efficient sparse representations of data sets
and of a novel geometric multiresolution analysis framework for encoding data.
Finally, we discuss the problem of estimating a probability measure in high
dimensions, whose support is (nearly) low-dimensional and has some geometric
structure, for example that of a manifold, or a union of hyperplanes. We
construct a multiscale geometric tree decomposition of the data and use this
decomposition to construct an increasing family of approximation “spaces” in
the space of probability measures, parametrized by certain subtrees of the
multiscale tree, and perform a multiscale bias-variance tradeoff using this
family of approximation spaces. We obtain finite-sample results that
guarantee that with high probability the Wasserstein distance between the
(random) measure estimated by our algorithm and the true measure is small,
depending on the number of samples, a measure of complexity of the models
we use (typically this depends only on the intrinsic dimension and not on
the ambient dimension!), and a notion of “regularity” of the true measure.
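
For readers unfamiliar with the Wasserstein distance mentioned in the abstract, the following is a minimal sketch in the simplest possible setting: two samples of equal size on the real line, each point carrying equal weight. In that case the 1-Wasserstein distance reduces to the average gap between the sorted samples. The function name wasserstein1_1d and the restriction to one dimension are assumptions made for illustration only; the talk concerns measures in high dimensions, and this sketch is not the speaker's estimator.

```python
def wasserstein1_1d(xs, ys):
    """W1 distance between two empirical measures on the real line.

    Assumes both samples have the same size and uniform weights, in which
    case an optimal coupling matches the i-th smallest point of one sample
    to the i-th smallest point of the other.
    """
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("samples must be non-empty and of equal size")
    paired = zip(sorted(xs), sorted(ys))
    return sum(abs(a - b) for a, b in paired) / len(xs)


# Shifting every sample point by 1 moves the empirical measure a distance of 1.
sample = [0.0, 2.0, 1.0]
shifted = [x + 1.0 for x in sample]
print(wasserstein1_1d(sample, shifted))
```

In higher dimensions no such sorting shortcut exists, which is one reason estimates whose accuracy depends on the intrinsic rather than the ambient dimension are of interest.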