Nonlinear partial differential equations and continuum limits for large discrete sorting problems
University Of Minnesota-Twin Cities, Minneapolis MN
Investigators
Abstract
The goal of this project is to study algorithms for sorting large amounts of high-dimensional data. Sorting, or ordering, data is one of the most fundamental problems in computational science, and in today's data-driven world, there is a need to develop new algorithms and tools for handling massive amounts of data in novel ways. Many sorting algorithms are computationally intensive, but have a highly predictable structure when applied to very large amounts of data. A deep mathematical understanding of this structure will lead to new insights that have the potential to significantly improve performance. Applications of sorting are ubiquitous in science and engineering, and include the analysis of DNA sequences, sorting of hits in web searches, and fingerprint identification. A significant improvement in any sorting algorithm would have a broad impact on many fields of science and engineering. This project will study two algorithms for sorting multivariate data: non-dominated sorting, and convex hull ordering. Non-dominated sorting is fundamental in multi-objective optimization, which is commonly used in scientific and engineering contexts. It has recently been shown that non-dominated sorting of random points in Euclidean space has a continuum limit that corresponds to solving a Hamilton-Jacobi equation. The first objective of this project is to study the regularity of viscosity solutions of this Hamilton-Jacobi equation, and to develop highly accurate numerical schemes for approximating its solution. The second, and main objective, is to study convex hull ordering, which is widely used in robust statistics. It is conjectured that convex hull ordering has a continuum limit that corresponds to affine invariant curvature motion. This project aims to study, and prove rigorously, this conjectured continuum limit. This result provides an asymptotic distributional theory for convex hull ordering, which is an open problem in robust statistics. Another goal of this project is to exploit this continuum limit to develop a fast algorithm for approximate convex hull ordering that can handle massive amounts of data.
View original record on NSF Award Search →