
Collaborative Research: IIBR Informatics: Keeping up with the genomes - Continual Learning of Metagenomic Data

$249,706 · FY2020 · BIO · NSF

Rowan University, Glassboro NJ

Abstract

Microbiomes are communities of microscopic organisms that are found everywhere on Earth and play important roles, such as helping to digest food in the gut. In the intestines, they can produce vitamins (good) or toxins (bad), so we need to understand which organisms and genes are present in these microscopic communities. This project uses artificial intelligence (AI) to identify the organisms that live in microbiomes and their genes. Existing efforts have been hampered by the very rapidly growing amount of data, which often needs to be repeatedly re-analyzed as new data become available. Such a process is not only inefficient but also increasingly unsustainable, even with growing computational resources. The proposed approach is distinctive because it uses less computing power: instead of continuously re-entering massive amounts of data, the proposed state-of-the-art system can recall and reuse prior information without re-entering or re-analyzing prior data, saving substantial computing time and ultimately money. The goal is to find AI methods that achieve the best cost savings without sacrificing accuracy. Many unidentified organisms are also found in microbiome experiments, but they are discarded and never used to identify the same organisms in other experiments. An AI-based approach will keep, remember, and reuse their information in case those organisms show up again in later experiments, eventually helping in their identification. If an organism is identified in the future, the method can automatically update old data and the knowledge base effectively and efficiently. This project will develop a dynamic, scalable, semi-supervised learning framework that continually updates a classification model using large amounts of unlabeled experimental data.
In addition to creating richer models that can leverage both reference and experimental data, the primary innovation is that the model will identify unknown organisms and proteins and integrate them into the reference database for future model updates. This framework will be validated on the hundreds of metagenomic studies (composed of potentially thousands of samples) submitted annually to the microbiome computing website MG-RAST. Scientists use MG-RAST to upload microbiome data in order to study and improve agriculture, diagnosis, medicine, biofuel production, and a variety of other applications in which microorganisms play a major role. This work will contribute to the training of college students in artificial intelligence and its application to the microbiome. Results will be shared broadly with other educators and researchers through summer workshops. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
