GrantIndex

RII Track-4:@NASA: Automating Character Extraction for Taxonomic Species Descriptions Using Neural Networks, Transformer, and Computer Vision Signal Processing Architectures

$23,928 | FY2024 | O/D | NSF

University of Puerto Rico Mayaguez, Mayaguez, PR

Abstract

This project would provide a fellowship to an associate professor and training for a graduate student at the University of Puerto Rico Mayaguez. Arthropoda, a group that includes insects, spiders, and millipedes, is Earth's most diverse phylum with over 1.01 million known species. With an estimated 7 million species yet to be discovered, and a discovery rate of just 7,000 species per year, it would take around 850 years to identify them all, and each species description currently takes an average of 21 years. This project aims to expedite this process through "Descriptron", a groundbreaking artificial intelligence tool leveraging machine learning and computer vision to accelerate species descriptions and taxonomic key generation. In collaboration with NASA Marshall Space Flight Center, the project will utilize advanced imaging technology to automate the capture and description of arthropod morphological features, reducing human error and ensuring reproducible results. The implications extend across ecology, evolutionary biology, and developmental biology. Emphasizing the role of citizen science, the project involves the wider community in data annotation via iNaturalist, fostering public participation in scientific discovery. This endeavor advances our understanding of biodiversity in our own backyards and accelerates the identification of undiscovered life on Earth.

Panarthropoda, encompassing Onychophora, Tardigrada, Chelicerata, Myriapoda, and Pancrustacea, is Earth's largest and most diverse clade, with an estimated 7 million species yet to be discovered. Through "Descriptron", an artificial intelligence (AI) pipeline, this project will significantly accelerate taxonomic species descriptions and key generation using state-of-the-art transformers, convolutional neural networks, and computer vision techniques.
Key to this endeavor is a strategic collaboration with NASA's Marshall Space Flight Center, which will provide advanced imaging technology for Descriptron's development. Advanced imaging techniques will greatly speed up the creation of the novel training data needed to automate instance segmentation and text description of arthropod morphological features, reducing human error and ensuring highly reproducible, objective results. By creating a library of models for sclerites and descriptive terms including color, texture, and shape, Descriptron will automate the process of producing a skeletonized taxonomic species description. This project leverages citizen science by engaging the broader community in the data annotation process via the iNaturalist platform. This approach not only facilitates public understanding and appreciation of biodiversity but also contributes essential data to the project. The use of Descriptron promises wide-reaching impacts across fields that depend upon accurate morphological data, such as ecology, evolutionary biology, and developmental biology. By effectively involving citizen scientists and accelerating taxonomic discovery, this project holds substantial potential to advance our understanding of Earth's biodiversity and expedite the biodiscovery process.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
