OAC Core: Small: Collaborative Research: Scalable distributed algorithms for tree structured astronomical data
University Of Illinois At Urbana-Champaign, Urbana IL
Investigators
Abstract
Spatial astronomical data is often extremely large and it is highly non-uniformly distributed. Algorithms that deal with such data have to be parallelized over large distributed memory supercomputers to deal with its size. The non-uniformity in the spatial distribution can be extreme, with some regions of space having million times more particles than other similar size regions. This creates significant challenges for scalable and efficient performance, as well as for the productive programming of such algorithms. Yet, the field of computational astronomy increasingly needs such scalable algorithms in the coming era. The raw computing capability unleashed by modern PetaFLOP/s and ExaFLOP/s computers, respectively, executing up to quadrillions and quintillions of calculations per second, is making it potentially feasible to get answers via simulations to some fundamental questions in the field, including those of galaxy formation and the properties of dark matter and dark energy. As the Large Synoptic Survey Telescope maps out the entire visible sky every few nights, it is expected to generate more than 10 terabytes per day, and this data needs to be analyzed in a timely fashion to fulfill its scientific goals of discovering hazardous asteroids, new minor planets, and exploding stars. This project provides new techniques and tools for researchers to use for high-performance simulations of non-uniform data. This enables previously untenable computer simulations to be done by astrophysicists, unlocking new insights and answering questions about the nature of the cosmos. The results are also used as case studies and educational material in classes taught by the investigators. Additionally, the project aim to involve women and undergraduate students in performing this research, continuing their experience of having done so in the past. This project thus aligns with the NSF's mission: to promote the progress of science and to advance the national health, prosperity and welfare. This project aims at developing novel parallel algorithms, data structures, and application demonstrations for computational problems involving data organized into hierarchical trees. A canonical example of such a domain is astronomical data, where particles representing clustered mass (stars or galaxies) are spread over the space of a simulation box or survey field in a highly non-uniform manner. Organizing them into trees, with multiple alternative tree organizations possible, including k-d trees, octrees, space-filling-curve based trees, etc., allows the efficient computation of various quantities such as gravitational forces, densities (and therefore hydrodynamics), two-point or three-point correlations, etc. The optimum choice of tree structure and algorithm depends both on the problem and the parameters of the parallel machine. The research methods used will include complexity analysis and, more significantly, empirical comparisons over a range of possible application scenarios including particle distributions and classes of traversals and algorithms. This will include formulation of algorithms and their implementations on parallel machines. The main outcomes of this project will be research papers describing effective algorithms and comparison and evaluation of particle decomposition techniques and tree types. This project is funded by the Office of Advanced Cyberinfrastructure in the Directorate for Computer and Information Science and Engineering and the Division of Astronomical Sciences in the Directorate for Mathematical & Physical Sciences. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
View original record on NSF Award Search →