SI2-SSE: The Next Generation of the Montage Mosaic Engine
California Institute Of Technology, Pasadena CA
Investigators
Abstract
Images produced by the new generation of astronomical instruments are addressing fundamental questions about the Universe, such as the formation of the very first galaxies after the Big Bang and the very first stages of star formation in massive dust clouds in our Galaxy. Exploiting this new generation of data is difficult because the data sets these instruments produce are complex and large enough to demand new approaches to data processing, and such approaches lag far behind developments in instrumentation. A growing community is working to rectify this state of affairs. This project will deliver software tools that aggregate data from the new instruments into images of large-scale regions of the sky so that astronomers can fully study scientific questions such as those identified above. This approach of studying aggregated images, or mosaics, is a powerful tool in astronomy. The project will deliver the next generation of an existing mosaic-building engine, Montage, which is in wide use in astronomy and in educational activities. It will support processing of the new data sets so that they can be visualized in immersive tools such as the World Wide Telescope, which is widely used in developing innovative approaches to education, and so that they can generate data used by Citizen Science services such as Zooniverse. Montage will be bundled with a set of tools that enable astronomers to process massive collections of images on powerful "cloud computing" platforms. These tools will be applicable to many data-intensive problems in fields such as earthquake prediction, DNA sequencing, and climate modeling. Finally, Montage is in wide use in developing and testing national cyberinfrastructure to benefit the U.S. science community. We anticipate that the next-generation Montage will be used in the same way to develop ever more powerful cyberinfrastructure as data volumes grow rapidly in all fields.
In greater detail, the project will deliver the next generation of the Montage image mosaic engine, which will offer new capabilities that respond to the changing astronomy data and computing landscapes. These capabilities, requested by the user community, are: 1. Support for mosaicking of data cubes, now routinely generated by modern instrumentation; 2. Support for two widely used sky-partitioning schemes, HEALPix and TOAST; 3. An API to enable users to call Montage directly in Python and other languages. The memory management and subsetting techniques developed to support mosaicking will be available for others to use and extend. Support for HEALPix will enable integration and analysis of far-infrared, cosmic background data sets with other image data sets. Support for TOAST will enable essentially any image data set to be incorporated into the World Wide Telescope (WWT). Montage will be bundled with a turnkey package of open source tools that provision resources and run applications on cloud platforms. This package will build on knowledge gained in creating data products at scale with cloud platforms. These tools will bring cloud computing to scientists who have little system configuration knowledge, the lack of which is one of the biggest barriers to entry; the tools are general purpose and will be applicable to data-intensive applications in many fields. Thus Montage will provide powerful new capabilities to astronomers, to projects analyzing data at scale to create new data products, and to scientists in data-intensive fields outside astronomy. The next-generation toolkit will inherit the sustainable Montage architecture, which has attracted a large user base among astronomers, E/PO specialists, and computer technologists. Montage is written in C, is portable across all common Unix platforms, is highly scalable, and is delivered as components that are easy to incorporate into pipelines and processing environments. Montage is the only mosaic engine with all these characteristics.
The project will use the evolutionary delivery lifecycle model. The code will be made accessible on GitHub and released as Open Source code under a BSD 3-clause license. A Users' Panel will advise on detailed specifications.