NSF FDA/SiR: Development of eeDAP microscopy platform software, validation data, and statistical methods to assess performance of candidate Software as a Medical Device (SaMD)

$200,000 | FY2023 | ENG | NSF

Yale University, New Haven CT

Investigators

Abstract

Computational pathology has recently exploded with the advent of sophisticated quantitative imaging microscopes and machine learning (ML) analysis software. FDA scientists have indicated that there is a need for statistical methods and relevant data for designing studies of readers skilled in the art of histology slide review (e.g., pathologists, immunologists, pathology extenders) and of AI/ML models. Therefore, this NSF/FDA Scholar-in-Residence project focuses on developing blueprints for components of reader studies in which readers interpret medical images with and without AI/ML model outputs. The work includes design of software for an FDA-designed optical and digital microscope, validation data, and statistical methods used for assessment of AI/ML models. The first objective of this NSF/FDA Scholar-in-Residence project is to validate the existing MATLAB-based Evaluation Environment for Digital and Analog Pathology (eeDAP) platform as a tool for reader studies and to develop free open-source software tools to control the eeDAP platform. The team will build a “bridge” between the precisionFDA portal, multiple expert readers’ computers, and the local eeDAP unit. Histology slide annotation software and Python microscope hardware control scripts will be integrated in a client-server model that uses containerization, workload orchestration, and overlay networks to create an application that scales across multiple devices. This will facilitate interactions with the precisionFDA portal and with experts during discussions, crowdsourcing, and teaching. The second objective of the project is to measure, account for, and develop strategies to reduce reader variability. The methods threshold the data and separately evaluate agreement above and below the threshold for each member of an expert panel.
The test statistic is the agreement between the end user and each expert minus the agreement between each pair of experts, averaged over the experts. Models of this test statistic will be used to generate hypothesis tests. Multiple thresholds and corresponding hypothesis tests will be treated sequentially to investigate performance across the data range. These tools are needed to facilitate reproducible, generalizable, statistically efficient, and practical device assessments in this space for use in evaluation of software planned for use in clinical care. Other diagnostic imaging tests suffer from the same challenges. Therefore, methods developed in this proposal have applicability well beyond digital pathology. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
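The test statistic described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the agreement measure, the score scale, or the model behind the hypothesis tests, so everything concrete here is an assumption made for illustration: agreement is taken to be the fraction of cases an expert places on one side of the threshold that the other rater places on the same side, scores are hypothetical values on a 0 to 100 scale, and the function names are invented. No hypothesis test is modeled.

```python
def conditional_agreement(rater, reference, threshold, above=True):
    """Assumed agreement measure (not taken from the award abstract).

    Among the cases the reference expert scores on one side of the
    threshold, return the fraction the rater scores on the same side.
    Returns None when the reference has no cases on that side.
    """
    if above:
        hits = [r >= threshold for r, ref in zip(rater, reference) if ref >= threshold]
    else:
        hits = [r < threshold for r, ref in zip(rater, reference) if ref < threshold]
    if not hits:
        return None
    return sum(hits) / len(hits)


def agreement_difference(user, experts, threshold, above=True):
    """User-vs-expert agreement minus expert-vs-expert agreement,
    averaged over the experts used as reference."""
    if len(experts) < 2:
        raise ValueError("need at least two experts to form expert pairs")
    diffs = []
    for k, reference in enumerate(experts):
        user_agree = conditional_agreement(user, reference, threshold, above)
        if user_agree is None:
            # This expert has no cases on this side of the threshold,
            # so no rater can be compared against them here.
            continue
        peers = [
            conditional_agreement(other, reference, threshold, above)
            for j, other in enumerate(experts)
            if j != k
        ]
        diffs.append(user_agree - sum(peers) / len(peers))
    if not diffs:
        raise ValueError("no expert has cases on this side of the threshold")
    return sum(diffs) / len(diffs)


# Hypothetical scores for four cases from three experts and one end user.
experts = [
    [10, 40, 60, 90],
    [20, 60, 70, 80],
    [5, 30, 40, 95],
]
user = [15, 35, 65, 85]

# Step through several thresholds in turn, evaluating each side separately.
for threshold in (25, 50, 75):
    above = agreement_difference(user, experts, threshold, above=True)
    below = agreement_difference(user, experts, threshold, above=False)
    print(f"threshold={threshold}: above={above:+.3f}, below={below:+.3f}")
```

Under these assumptions, a value near zero means the end user agrees with the experts about as well as the experts agree with each other, and a negative value means the end user agrees less. A different agreement measure would change the numbers but not the structure of the calculation.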
