Collaborative Research: III: Medium: New Machine Learning Empowered Nanoinformatics System for Advancing Nanomaterial Design
University of Pittsburgh, Pittsburgh, PA
Abstract
The research objective of this proposal is to address the computational challenges in nanomaterial data analysis, or nanoinformatics, for predicting nanomaterial properties. Nanomaterials are very small materials that can be used in a variety of applications, including nanomedicine development. The vast quantities of existing experimental data require new nanoinformatics approaches and toolkits for data extraction, analysis, and sharing. Such tools can help guide the safe design of next-generation nanomedicines that have desirable therapeutic activities and limited side effects. However, two critical limitations currently hinder the use of machine learning approaches in nanoinformatics modeling studies. First, most existing data available for modeling are based on a limited number of nanomaterials, for which experimental characterization of chemical properties is also limited. Second, despite significant efforts from various researchers, the available modeling approaches are applicable only to a small, specified set of nanomaterials and have rarely been used to design nanomaterials. This project will address the computational challenges in large-scale nanomaterial data mining and in the development and validation of an automated informatics framework that digitalizes nanostructures, identifies molecular markers, and supports fast nanomaterial retrieval and integrative analysis. This project will also facilitate the development of novel educational tools to enhance several current courses at Rutgers University, University of Pittsburgh, and University of Minnesota. The investigators will engage minority students and under-served populations in research activities to give them better exposure to cutting-edge scientific research.
In this project, a novel machine-learning-based nanoinformatics framework will be developed to integrate new digital nanostructure representations with emerging key computational techniques. The project focuses on designing principled machine learning and data science algorithms for analyzing large-scale nanomaterial data, with the aim of creating new informatics toolkits that facilitate nanomedicine-based treatments and new nanomaterial design. Specifically, the project will pursue the following research goals: 1) develop new computational tools to automate nanostructure digitalization; 2) design an interpretation method to enhance deep-learning-based predictive models; 3) build a new cross-modal deep hashing network for fast and accurate nanomaterial data retrieval; and 4) evaluate the proposed methods and system using real large-scale nanomaterial data and release the database and nanoinformatics toolkits to the public. Unlike most existing nanoinformatics strategies, which perform modeling and analysis at a small scale, this project will provide promising new directions for the analysis of large-scale, complex nanomaterial data by addressing critical data-intensive analysis issues, including efficiency, scalability, and interpretability. The investigations combine rigorous theoretical analysis with emerging application studies and will contribute to both academic research and potential commercial products. This project will advance and extend the relationship between engineering innovation and computational analysis, and it holds great promise for nanomaterial and nanomedicine development. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
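As a rough illustration of why hashing supports fast retrieval (research goal 3 above), the sketch below maps descriptor vectors to short binary codes and ranks database entries by Hamming distance to a query code. It is not the project's method. A fixed random-hyperplane projection stands in for the learned cross-modal deep hashing network, only a single modality is shown, and all function names and the notion of a numeric "descriptor vector" are assumptions made for this example.

```python
import random


def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random hyperplanes (assumed stand-in for a learned hash network)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def hash_code(vector, hyperplanes):
    """Pack the sign of each projection into one integer, one bit per hyperplane."""
    code = 0
    for plane in hyperplanes:
        dot = sum(v * w for v, w in zip(vector, plane))
        code = (code << 1) | (1 if dot >= 0 else 0)
    return code


def hamming(a, b):
    """Number of bit positions in which two integer codes differ."""
    return bin(a ^ b).count("1")


def retrieve(query_code, database, k=3):
    """Return the names of the k entries whose codes are closest to the query.

    `database` maps a hypothetical nanomaterial name to its binary code.
    Ties keep the insertion order of `database` because sorted() is stable.
    """
    ranked = sorted(database.items(), key=lambda item: hamming(query_code, item[1]))
    return [name for name, _ in ranked[:k]]
```

Comparing two codes costs one XOR and a bit count, regardless of how large the original descriptor was, which is the efficiency argument behind hashing-based retrieval. In a cross-modal setting, separate learned encoders for each data type would map into the same code space so that a query in one modality can retrieve entries from another.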