Course Overview: This will be a Trinity term special topic course for the MMSC.
Lecturer(s):
Prof. Per-Gunnar Martinsson
Course Synopsis: The course will describe and analyze algorithms designed to handle large, high-dimensional datasets and to take full advantage of modern computing hardware. The common theme of the methods discussed is a reliance on randomized projections and randomized sampling to reduce the effective dimensionality of problems without undue loss of accuracy. The course will discuss the mathematical theory underlying these methods, which involves techniques from random matrix theory, numerical linear algebra, probability, and functional analysis. It will also discuss practical issues that are important for attaining high computational speed, such as blocking of algorithms, minimizing communication, and handling data that is stored on slow memory (hard drives) or on a distributed-memory system.
For each algorithm considered, we will discuss some relevant application areas and describe the mathematical modelling involved. Examples of problems to be discussed include statistical data analysis (PCA in particular), linear regression problems, clustering, spectral methods for analyzing graphs, and many more.
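To give a flavour of the randomized-projection idea, the sketch below computes an approximate truncated SVD (the computation underlying PCA) by first compressing the matrix with a Gaussian random matrix. It is an illustrative example only, not taken from the course materials: the function name randomized_svd, the oversampling parameter, the seeds, and the synthetic rank-5 test matrix are all assumptions chosen for the demonstration, and NumPy is assumed to be available.

```python
import numpy as np


def randomized_svd(A, rank, oversample=10, seed=0):
    """Approximate rank-`rank` SVD of A via a Gaussian random projection.

    Hypothetical helper for illustration; `oversample` extra columns are
    drawn to make the sketch more likely to capture the dominant range of A.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    k = min(rank + oversample, n)

    # Random projection: sample the column space of A with k Gaussian vectors.
    Omega = rng.standard_normal((n, k))
    Y = A @ Omega

    # Orthonormal basis for the sampled column space (reduced QR).
    Q, _ = np.linalg.qr(Y)

    # Project A onto that basis; B has only k rows, so its SVD is cheap.
    B = Q.T @ A
    U_small, s, Vt = np.linalg.svd(B, full_matrices=False)

    # Lift the left singular vectors back to the original space and truncate.
    U = Q @ U_small
    return U[:, :rank], s[:rank], Vt[:rank, :]


# Synthetic test matrix of exact rank 5 (an assumption for the demo).
rng = np.random.default_rng(1)
A = rng.standard_normal((500, 5)) @ rng.standard_normal((5, 200))

U, s, Vt = randomized_svd(A, rank=5)
rel_err = np.linalg.norm(A - (U * s) @ Vt) / np.linalg.norm(A)
print(f"relative reconstruction error: {rel_err:.2e}")
```

The only full-size operations here are the two matrix products with A. The QR and SVD factorizations act on matrices with just k columns or rows, which is where the reduction in effective dimensionality pays off.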