IIBR Informatics: A generalized modeling framework for integrating multi-species data sources to estimate biodiversity processes

$782,676 | FY2020 | BIO | NSF

Michigan State University, East Lansing MI

Abstract

Biodiversity is linked to the health and integrity of ecosystems, with species varying in their contributions to ecosystem functions. It is critical to assess the status and dynamics of whole communities of species, not just those species that have large amounts of data. This project develops ‘integrated community models’, a statistical modeling framework that uses multi-species data sources simultaneously to estimate the status, trends, and dynamics of biodiversity. The objective is to create a flexible infrastructure for estimating species and community processes that can incorporate multiple data types on multiple species, using simulations and empirical case studies on animal communities including birds, small mammals, and butterflies. Estimates of species distributions, abundances, and demographic rates form the basis of scientific understanding of biodiversity dynamics and community responses to external threats, delivering critical information for biological conservation. The development of integrated community models will enable researchers to obtain detailed inferences on species and communities across spatiotemporal scales during an era of accelerated biodiversity loss. This project also provides training to graduate students and postdoctoral scholars in hierarchical statistical modeling and creates a K-12 outreach module to teach middle school students about biodiversity conservation.

The integrated community modeling framework uses a hierarchical approach that merges single-species integrated models (which combine multiple data sources on a target species) and hierarchical community models (which estimate multi-species occurrence or abundance patterns, but only from a single data source). Although there have been recent advances in single-species integrated models and hierarchical community models, both approaches have shortcomings: the former is limited to a single species, whereas the latter fails to take advantage of the benefits gained from merging multiple data sources and data types (e.g., simultaneous estimation of abundance and demographic rates, increased spatiotemporal coverage). By bridging the gap between single-species integrated models and hierarchical community models, integrated community models leverage the capabilities of both and overcome traditionally narrow inferences (in terms of space, time, and information gained) on biodiversity parameters. The modeling framework uses each of the available data sources to inform various components of the underlying biological process model through hierarchical observation models linked together with a joint likelihood. The biological process models for communities can range from simple (e.g., estimates of species occurrence) to complex (e.g., estimates of species survival, reproduction, and abundance) and depend on both the biology of the taxonomic group and the quantity and type of available data. This project advances the fields of population and community ecology because it allows scientists to take advantage of multiple data sources (despite differences in sampling protocols and in spatiotemporal data structures and quantities), leading to increased accuracy and precision of species-level dynamics and biodiversity metrics (e.g., richness, composition). The results of the project will be made available at https://ezipkin.github.io.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
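
As an illustration of what "observation models linked together with a joint likelihood" can mean in practice, the sketch below combines two hypothetical data sources that share one species-level parameter. It is not the project's model. Everything in it is an assumption made for illustration: the function names, the choice of a Poisson model for count surveys, a detection/non-detection model with detection probability 1 - exp(-lambda), and a normal community-level distribution on log expected abundance.

```python
import math


def poisson_logpmf(k, rate):
    # Log probability of observing count k under a Poisson with mean `rate`.
    return k * math.log(rate) - rate - math.lgamma(k + 1)


def bernoulli_logpmf(y, p):
    # Log probability of a 0/1 outcome y with success probability p.
    return math.log(p) if y == 1 else math.log(1.0 - p)


def normal_logpdf(x, mu, sigma):
    # Log density of a normal distribution with mean mu and std. dev. sigma.
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)


def joint_log_likelihood(log_lambda, mu, sigma, counts, detections):
    """Sum the log-likelihood contributions of two data sources per species.

    log_lambda : dict species -> log expected abundance (shared parameter)
    mu, sigma  : community-level mean and spread of log_lambda (assumed normal)
    counts     : dict species -> list of counts (hypothetical data source A)
    detections : dict species -> list of 0/1 records (hypothetical data source B)
    """
    total = 0.0
    for species, ll in log_lambda.items():
        lam = math.exp(ll)
        # Community level: species effects share a common distribution,
        # so data-poor species borrow information from data-rich ones.
        total += normal_logpdf(ll, mu, sigma)
        # Data source A: counts inform lambda directly.
        total += sum(poisson_logpmf(k, lam) for k in counts.get(species, []))
        # Data source B: detection of at least one individual, P = 1 - exp(-lambda).
        p_detect = 1.0 - math.exp(-lam)
        total += sum(bernoulli_logpmf(y, p_detect) for y in detections.get(species, []))
    return total


# Usage: species "b" has no count data but still contributes through source B.
example = joint_log_likelihood(
    log_lambda={"a": 0.0, "b": -0.5},
    mu=0.0,
    sigma=1.0,
    counts={"a": [1, 0, 2]},
    detections={"a": [1, 1], "b": [0, 1, 0]},
)
```

Because both observation models depend on the same species-level parameter, each data source constrains it, and the log-likelihoods simply add. A real analysis would typically fit such a model with dedicated hierarchical-modeling software rather than hand-written functions.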
