Published: Dec. 5, 2017

Data Science Meets Geometry

Every minute, humankind produces roughly 2,000 terabytes of data, and learning from this data has the potential to improve many aspects of our lives. Doing so requires exploiting the geometric structure hidden within the data. Our overview of models in the data and computational sciences starts with the ubiquitous linear subspace model: we will present an alternative formulation of Principal Component Analysis (PCA, a statistical tool for detecting linear structure in data) and describe its remarkable implications. We will also present state-of-the-art algorithms for streaming and distributed PCA, accompanied by an example from structural health monitoring.
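To make the linear subspace model concrete, here is a minimal sketch (not from the talk itself) of detecting linear structure with PCA, computed via the singular value decomposition of a centered data matrix; the synthetic data and the `direction` vector are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data lying near a 1-D subspace of R^3, plus small noise.
t = rng.standard_normal(200)
direction = np.array([3.0, 1.0, 2.0]) / np.sqrt(14.0)  # unit vector
X = np.outer(t, direction) + 0.05 * rng.standard_normal((200, 3))

# PCA via SVD of the centered data matrix.
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
explained = s**2 / np.sum(s**2)  # fraction of variance per component

print(np.round(explained, 3))    # most variance in the first component
print(Vt[0])                     # recovered subspace direction (up to sign)
```

Because the data concentrate near a one-dimensional subspace, the leading singular vector `Vt[0]` recovers the hidden direction up to sign, which is exactly the "linear structure" PCA detects.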

Next up is learning from data with nonlinear structure. In this part, we will discuss the powerful continuum-of-subspaces model and (known) manifold models, reinforced with applications in super-resolution and nonlinear time-series analysis. Sometimes, however, the nonlinear structure must itself be learned from the data. In the last part of the talk, we will discuss manifold learning from incomplete data and the training of neural networks, both of which lead to highly nonconvex optimization programs. Finally, we will list a number of pressing challenges in the data and computational sciences and lay out a path toward addressing them.
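As a toy illustration of this kind of nonconvex training problem (an assumption for exposition, not a method from the talk), the sketch below fits a one-hidden-layer neural network to noisy samples of a nonlinear curve by plain gradient descent on a squared-error loss; the architecture and hyperparameters are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(1)

# Noisy samples of a nonlinear curve: y = sin(x).
x = np.linspace(-3.0, 3.0, 200)[:, None]
y = np.sin(x) + 0.05 * rng.standard_normal(x.shape)

# One-hidden-layer tanh network; training this is a nonconvex program.
h = 16
W1 = 0.5 * rng.standard_normal((1, h)); b1 = np.zeros(h)
W2 = 0.5 * rng.standard_normal((h, 1)); b2 = np.zeros(1)
lr = 0.1

for _ in range(3000):
    z = np.tanh(x @ W1 + b1)        # hidden activations
    err = (z @ W2 + b2) - y         # residual on squared-error loss
    # Backpropagation of the loss gradient.
    gW2 = z.T @ err / len(x); gb2 = err.mean(axis=0)
    dz = (err @ W2.T) * (1.0 - z**2)
    gW1 = x.T @ dz / len(x); gb1 = dz.mean(axis=0)
    W1 -= lr * gW1; b1 -= lr * gb1
    W2 -= lr * gW2; b2 -= lr * gb2

mse = float(np.mean((np.tanh(x @ W1 + b1) @ W2 + b2 - y) ** 2))
print(round(mse, 4))  # residual error after training
```

Even this tiny example has a nonconvex loss surface in the weights, which is the source of the optimization challenges the talk addresses for manifold learning and neural-network training at scale.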

Tuesday, December 5, 2017 at 11:00am to 12:00pm

Engineering Center, ECCR 257 - Newton Lab
1111 Engineering Drive, Boulder, CO 80309