Randomized Algorithms for Matrix Computations

$249,979FY2016MPSNSF

University Of Colorado At Boulder, Boulder CO

Investigators

Abstract

This project will develop mathematical techniques for accelerating computational tasks such as simulating electromagnetic scattering, medical imaging, extracting useful information from large datasets, machine learning, and many others. In all these computations, the step that tends to be the most time-consuming, and which therefore limits how large problems can be solved, concerns the manipulation of large square or rectangular arrays of numbers, called "matrices". Many of the matrices that arise in practical applications have redundancies, and can be compressed to enable them to be stored using less space. Using the compressed format, computations involving the matrix can also be greatly accelerated. The problems that will be addressed are deterministic in nature, but the algorithms that will be developed are probabilistic. It turns out that by exploiting certain mathematical properties of large ensembles of independent random numbers, one can build algorithms for compressing matrices that are much faster than traditional deterministic techniques. The new randomized algorithms can in theory fail, but the likelihood of failure can be shown to be lower than 1 time out of 10,000,000,000 runs in typical applications. Randomized algorithms of this type have recently attracted much interest due to the fact that they perform particularly well on emerging computing platforms such as mobile computing (where conserving energy is the key priority), computing using graphical processor units (where the vast numbers of computational cores create challenges), and distributed memory parallel computers. The methods also perform very well when applied to massively large datasets that must be stored on hard drives, or on large server farms. The project will train one doctoral student, and will lead to the release of a publicly available software package that implements the methods that will be developed. From a technical point of view, the objective of the project is to develop efficient algorithms for factorizing matrices and for solving large linear systems of algebraic equations. The algorithms will be based on randomized sampling, and will exploit remarkable mathematical properties of random matrices and random orthogonal projections. Such randomized algorithms require less communication than traditional methods, which makes them particularly attractive for modern applications involving multicore processors, distributed computing, out-of-core computing, etc. Specifically, the project will address the following problems: (1) Computing full matrix factorizations (e.g. the so called "column pivoted QR factorization") which are core building blocks in scientific computing. Preliminary numerical experiments demonstrate speed-ups of close to an order of magnitude compared to state-of-the-art software packages. (2) Solving linear systems involving many unknowns and many equations. We expect to achieve substantial practical acceleration, and are cautiously optimistic about the possibility to develop solvers with substantially better asymptotic complexity than the cubic complexity achieved by standard techniques. (3) Developing randomized methods for accelerating computational simulations of phenomena such as electro-statics, composite materials, biochemical processes, slow fluid flows, Gaussian processes in 2 and 3 dimensions, etc. Technically, this will be achieved by developing randomized methods for compressing so called "data-sparse" or "rank-structured" matrices.

View original record on NSF Award Search →