SHF: Small: Methods, Workflows, and Data Commons for Reducing Training Costs in Neural Architecture Search on High-Performance Computing Platforms

$623,999 | FY2022 | CSE | NSF

University of Tennessee Knoxville, Knoxville, TN

Investigators

Michela Taufer (contact), Catherine Schuman, Silvina Caino-Lores

Abstract

Neural networks are powerful artificial-intelligence models that automatically capture knowledge embedded in scientific data. Scientists can use this knowledge to solve problems in domains such as physics, materials science, neuroscience, and medical imaging. Finding accurate neural networks for a specific scientific dataset or particular problem comes at a high training cost: it requires searching among thousands of neural networks on a large number of high-performance-computing resources. This project delivers methods, workflows, and a data commons for reducing the training cost of neural networks. The methods are based on parametric modeling and enable rapid search termination early in the training process, making the search process faster and cheaper. The workflows decouple the search from the accuracy prediction of neural networks for different datasets and problems. The data commons shares the full provenance of the neural networks so other scientists can deploy the neural networks in their own research. Advances in neural-network research have a far-reaching impact on many scientific applications. Accurate neural networks can be used to extract structural information from raw microscopy data, predict performance of business processes, analyze cancer pathology data, map protein sequences to folds, and predict soil moisture or crop yield. The researchers' efforts to build a broader community of high-performance-computing experts also have a far-reaching impact on the efficient design and use of artificial-intelligence products. The team of researchers promotes increased participation of students through mentoring. The researchers also develop curricula tailored for graduate and undergraduate students across scientific domains beyond the department of computer science.
This project addresses the urgent need to reduce the use of high-performance-computing resources for the training of neural networks, while assuring explainable, reproducible, and near-optimal neural networks. To this end, the team of researchers proposes a flexible fitness-prediction method that uses parametric modeling to predict the future fitness of neural networks and allow for early termination of the training process. Through this project, the researchers create an index of effective parametric functions for a diverse suite of fitness curves, including edge cases in the modeling (e.g., neural networks that never learn or neural networks that experience a learning delay). The researchers transform neural-architecture search implementations from tightly coupled, monolithic software tools embedding both search and prediction into a flexible, modular workflow in which search and prediction are decoupled. Project workflows enable users to reduce training cost, increase neural-architecture search throughput, and adapt fitness predictions to different fitness measurements, datasets, and problems. The researchers build a searchable and reusable neural-network data commons of record trails that capture the neural network's lifespan through generation, training, and validation stages, recording the neural network architecture, the training dataset, and loss and accuracy values throughout each stage. The neural-network data commons enables users to study the evolution of neural-network performance during training and identify relationships between a neural network's architecture and its performance on a given dataset with specific properties, ultimately supporting effective searches for accurate neural networks across a spectrum of real-world scientific datasets. Furthermore, the data commons provides the scientific community with a resource to study the relationships between datasets, network architectures, and performance.
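The idea of predicting future fitness from a partial training curve and stopping early can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the project: the saturating-exponential form f(e) = a - b*exp(-c*e), the grid of candidate rates, the convergence rule, and all function names are hypothetical. It shows only the general pattern of fitting a parametric function to the epochs observed so far, extrapolating to the final epoch, and stopping once successive extrapolations agree.

```python
import math

def fit_saturating_curve(epochs, fitness, rates=None):
    """Fit f(e) = a - b*exp(-c*e): scan c on a grid, solve a and b in closed form.

    The functional form and the grid of rates are illustrative assumptions.
    """
    if rates is None:
        rates = [0.01 * k for k in range(1, 301)]  # hypothetical candidate values for c
    n = len(epochs)
    y_mean = sum(fitness) / n
    best = None
    for c in rates:
        # For a fixed c the model is linear in z = exp(-c*e): y = a + slope*z, slope = -b.
        z = [math.exp(-c * e) for e in epochs]
        z_mean = sum(z) / n
        var = sum((zi - z_mean) ** 2 for zi in z)
        if var == 0.0:
            continue
        slope = sum((zi - z_mean) * (yi - y_mean) for zi, yi in zip(z, fitness)) / var
        a = y_mean - slope * z_mean
        b = -slope
        sse = sum((a - b * zi - yi) ** 2 for zi, yi in zip(z, fitness))
        if best is None or sse < best[0]:
            best = (sse, a, b, c)
    if best is None:
        raise ValueError("could not fit: need at least two distinct epochs")
    return best[1], best[2], best[3]

def predict_fitness(params, epoch):
    """Evaluate the fitted curve at a (possibly future) epoch."""
    a, b, c = params
    return a - b * math.exp(-c * epoch)

def should_stop_early(history, final_epoch, window=3, tolerance=0.5):
    """Return (stop, prediction) for a list of per-epoch fitness values.

    Training stops when the last `window` extrapolations to `final_epoch`,
    each fitted on one more epoch than the previous, differ by at most `tolerance`.
    """
    if len(history) < window + 2:
        return False, None  # too few points for every fit to use at least three epochs
    predictions = []
    for k in range(len(history) - window + 1, len(history) + 1):
        params = fit_saturating_curve(list(range(1, k + 1)), history[:k])
        predictions.append(predict_fitness(params, final_epoch))
    stop = max(predictions) - min(predictions) <= tolerance
    return stop, predictions[-1]

# Usage: ten observed epochs of a synthetic accuracy curve, extrapolated to epoch 25.
observed = [90.0 - 60.0 * math.exp(-0.3 * e) for e in range(1, 11)]
stop, predicted_final = should_stop_early(observed, final_epoch=25)
```

A flat history, such as a network that never learns, yields b = 0 under this fit, so the extrapolation equals the observed plateau and the search can discard the network after a few epochs. A learning delay would need a different parametric function, which is the kind of case an index of parametric functions would cover.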
To assess robustness for different datasets, the project considers both well-known benchmark datasets and real-world scientific datasets: protein diffraction patterns from X-ray electron laser beams in protein structural analysis, crop-scouting images from drones in precision farming, and forestry-scouting drone images for wildfire prevention. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
