CDS&E: AGNs Amidst the Data Deluge

$179,482 · FY2014 · MPS · NSF

Drexel University, Philadelphia PA

Abstract

In the fifty years since the discovery of quasars, a picture of these enigmatic objects has emerged wherein quasars are supermassive black holes residing at the centers of large galaxies. The black hole may have a mass of millions to billions of times that of the Sun. As this picture implies, the quasar phenomenon is one and the same as that of the objects referred to as "active galactic nuclei." All are characterized by enormous luminosity, such that the quasar frequently outshines the combined light of the billions of stars in its host galaxy. The luminosity of the quasar allows it to be seen at very large distances, at which the object is unresolved; that is, its image looks like that of a star in our own galaxy. Over the decades, quasars have been discovered by a variety of techniques, each of which tends to select objects with a limited range of characteristics, such as color or radio emission, that differentiate them from the huge number of foreground Galactic stars. As astronomy enters the "Big Data" era, it is poised to consider massive data sets, the volumes of which are far larger than any considered previously. Specifically, large imaging surveys of the sky such as those produced by Pan-STARRS and the Large Synoptic Survey Telescope will contain millions of quasars. This project will develop algorithms and techniques for sifting through these data sets to pull out the quasars, in order to study how the properties of the quasar population evolve with cosmic time. The project will create the largest sample of quasars assembled to date, using modern statistical techniques and the simultaneous combination of color, variability, and astrometric information available in existing imaging surveys. Quasar selection on each of these parameters is currently performed separately, and generally with suboptimal algorithms.
Only through simultaneous consideration of heterogeneous data types and the use of modern statistical methods will next-generation imaging surveys achieve large quasar samples that are both optimally complete and uncontaminated, enabling them to fulfill their promise and maximize their science output. This work is at the crossroads of astronomy, statistics, and computer science, as it deals with the problem of object classification in massive astronomical data sets that would be intractable with standard methods. It will be possible to extend this work to the next generation of imaging surveys. A proof of concept of this work is in hand, establishing both that the project is well conceived and that the required resources are in place. The algorithms will be tested on existing surveys for which confirming spectroscopic data exist.
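The abstract does not specify which statistical methods the project uses. As a rough illustration of what "simultaneous combination" of color, variability, and astrometric information could mean, the sketch below swaps in a simple Gaussian naive Bayes classifier, which is an assumption rather than the project's method. Every name (`MODELS`, `PRIORS`, `quasar_probability`) and every number is a hypothetical placeholder, not a value measured from any survey.

```python
import math

# Hypothetical per-class Gaussian models: feature -> (mean, standard deviation).
# All numbers are illustrative placeholders, not survey measurements.
MODELS = {
    "quasar": {
        "color": (0.2, 0.3),           # e.g. a color index in magnitudes
        "variability": (0.25, 0.10),   # e.g. light-curve scatter in magnitudes
        "proper_motion": (0.0, 1.0),   # e.g. mas/yr; distant quasars show none
    },
    "star": {
        "color": (1.0, 0.5),
        "variability": (0.03, 0.05),
        "proper_motion": (8.0, 5.0),
    },
}

# Assumed class priors: foreground stars greatly outnumber quasars.
PRIORS = {"quasar": 0.05, "star": 0.95}


def log_gauss(x, mean, std):
    """Log of the Gaussian probability density at x."""
    return -0.5 * ((x - mean) / std) ** 2 - math.log(std * math.sqrt(2.0 * math.pi))


def quasar_probability(obj, features=("color", "variability", "proper_motion")):
    """Posterior probability that obj is a quasar, using the chosen features.

    Features are treated as conditionally independent given the class
    (the naive Bayes assumption), so their log-likelihoods simply add.
    """
    log_post = {}
    for cls, model in MODELS.items():
        total = math.log(PRIORS[cls])
        for name in features:
            mean, std = model[name]
            total += log_gauss(obj[name], mean, std)
        log_post[cls] = total
    # Normalize in log space to avoid underflow.
    peak = max(log_post.values())
    norm = sum(math.exp(v - peak) for v in log_post.values())
    return math.exp(log_post["quasar"] - peak) / norm


# An object whose color alone is ambiguous, but which varies and does not move.
ambiguous = {"color": 0.6, "variability": 0.25, "proper_motion": 0.0}
p_color_only = quasar_probability(ambiguous, features=("color",))
p_joint = quasar_probability(ambiguous)
print(f"color only: {p_color_only:.3f}, joint: {p_joint:.3f}")
```

In this toy setup, an object that color alone would leave among the stars is classified confidently once variability and proper motion are considered at the same time, which is the kind of gain the abstract attributes to joint selection.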
