CISE-MSI: DP: IIS:III: Deep Learning Based Automated Concept and Caption Generation of Medical Images Towards Developing an Effective Decision Support System (DSS)

$439,817 | FY2022 | CSE | NSF

Morgan State University, Baltimore MD

Investigators

Md M Rahman (contact), Ming-Hsing Chiu, Yi Chung Chen

Abstract

This award is funded in whole or in part under the American Rescue Plan Act of 2021 (Public Law 117-2). Identifying and labeling important features in medical images such as X-rays and ultrasounds is fundamental both to diagnosis itself and to building libraries of images that support education, training, and auditing of medical quality. This work is time-consuming even for trained experts, making it an impactful and important problem domain to study for researchers in computer vision, machine learning (ML), and natural language processing (NLP). These artificial intelligence (AI)-based techniques have made great progress in object recognition and labeling for everyday camera images; however, medical images pose additional challenges because of the need to account for detail and relationships between substructures in the image, the need to generate captions that apply not just to the whole image but to these important substructures, and the need to handle noise and artifacts created in medical image processing. Further, the tolerance for error is low; interpretations need to be coherent and both grammatically and semantically correct in order to be useful. This project focuses on the intersection of biomedical informatics and imaging science, working to develop high-quality datasets of human-annotated visual concepts in images that appear in public collections such as open access biomedical journals, then using those datasets to train novel vision, ML, and NLP algorithms. The work will support multi-institutional research and educational collaborations between three minority-serving institutions, providing advanced research and classroom training in AI, ML, and cloud computing to students from groups historically underrepresented in computing.
To improve image interpretation and retrieval effectiveness, this project will (1) create a crowdsourcing-based annotation system to clinically annotate important regions of interest (ROIs) of images; (2) advance object detection models to segment images and map medical image ROIs; (3) advance multilabel concept classification techniques by considering correlations between concepts; and (4) apply contextualized embeddings via deep language models to generate the captions. The proposed approaches will be evaluated through comparison with current methods on benchmark datasets, including the ones constructed for this project. The end goal is the development of an AI-based prototype that helps physicians focus on interesting image regions, find relevant comparison images, and describe findings in correct and standard ways, all of which can reduce medical errors and benefit both medical departments and society by reducing the cost per exam. In addition to the research objectives, the project will implement a research-education medical AI training program including cloud-enabled classrooms, cross-institutional mentoring, and a partnership with an existing industry internship “pathway to success” initiative to build the science and technology workforce of the future. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
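
As an illustration of what "considering correlations between concepts" in objective (3) could mean in practice, the following is a minimal sketch, not the project's method. It assumes each image carries a set of concept labels and estimates how often one concept appears given another from simple co-occurrence counts. The concept names, the function names, and the choice of a conditional-frequency estimate are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import combinations


def concept_cooccurrence(annotations):
    """Count single concepts and unordered concept pairs over labeled images."""
    single = Counter()
    pair = Counter()
    for concepts in annotations:
        unique = sorted(set(concepts))  # sort so each pair has one canonical order
        single.update(unique)
        pair.update(combinations(unique, 2))
    return single, pair


def conditional_probability(single, pair, given, target):
    """Estimate P(target | given) as count(given and target) / count(given)."""
    if single[given] == 0:
        return 0.0
    key = tuple(sorted((given, target)))
    return pair[key] / single[given]


# Hypothetical image-level concept annotations (illustrative only).
annotations = [
    {"chest", "x-ray", "lung"},
    {"chest", "x-ray", "heart"},
    {"abdomen", "ultrasound", "liver"},
    {"chest", "x-ray", "lung", "heart"},
]

single, pair = concept_cooccurrence(annotations)
print(conditional_probability(single, pair, "chest", "lung"))
print(conditional_probability(single, pair, "chest", "liver"))
```

Estimates of this kind could, under these assumptions, be used to adjust the independent per-concept scores of a multilabel classifier, for example by raising the score of a concept that frequently co-occurs with one already predicted with high confidence.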
