SHF: Medium: A DSL for Data Visualization and Analysis in Imaging-Based Science and Scientific Computing
University Of Chicago, Chicago IL
Investigators
Abstract
This research project is focused on the design, implementation, and application of Diderot, a high-level domain-specific programming language for analyzing and visualizing 3D and 4D (spatio-temporal) data, such as produced by imaging modalities and computational simulations on finite-element meshes. Many areas of science now require computing with digital representations of the objects and systems being studied, but creating new research software to visualize and understand the data is a major bottleneck. With new ways of volumetrically scanning specimens and increasingly nuanced models of dynamic systems, research software is becoming more complex, and the computational expense of running the software is significantly increasing. Diderot addresses this urgent problem by simplifying the process of creating new parallel software to visualize and process complex scientific data. The intellectual merits are how Diderot will accelerate expressing mathematical ideas in working code, expand the kinds of computational representations that Diderot understands, leverage the power of large supercomputers, and help programmers debug their programs. The project's broader significance and importance are derived in part from the kinds of data that Diderot will handle, such as the high-resolution images produced by a microCT scanner at Argonne National Lab (used daily by scientists from all over the world), or the latest generation of light-sheet microscopes used in developmental biology to investigate fundamental questions about how cells organize themselves into tissues. The project's significance also derives from the range of people who will use Diderot, from mathematicians writing specialized finite element method programs, to college students getting their first taste of scientific computation by working with real-world microscope data capturing the formation of the nervous system in a fish embryo. Computation is an increasingly important tool for science, but the semantic gap between the available computational tools and scientific reasoning is often large. One pressing research topic within computer science is how to create programming tools that can track the shift in parallel computing hardware from traditional general-purpose CPUs to heterogeneous processors with high-performance accelerators like GPUs. This proposal seeks to accelerate research at the intersection of computation and science by exploiting domain-specific language (DSL) technology to bridge this semantic gap and to transform how scientists use software to understand data from measurement and simulation. The PIs will build on the parallel DSL Diderot that they have designed. Preliminary experience with Diderot demonstrates that it is possible to write visualization and analysis algorithms in a very mathematical programming notation that has performant parallel implementations on a variety of parallel hardware. The project will build on these preliminary results in several ways: the PIs will extend Diderot to support a wider range of data models and to provide a richer set of computation tools for analyzing data; the PIs will work on scaling Diderot to handle larger data sets and larger parallel platforms; and the PIs will explore techniques and tools to better support domain-specific software development. These research thrusts will involve close collaboration between the areas of programming languages and image analysis. The design of language features will be driven by the needs of image analysis algorithms, as well as the foundational principles of programming languages. The design and implementation of Diderot will be evaluated using the latest image-analysis algorithms from the literature, as well as being used to prototype new algorithms.
View original record on NSF Award Search →