
PRIMES: A Biological and Socio-Environmental Approach to Machine Learning for Equitable and Proactive Cancer and Health Screening

$278,537 · FY2023 · MPS · NSF

Northeastern Illinois University, Chicago IL


Abstract

This project is a collaboration with the Institute for Mathematical and Statistical Innovation (IMSI). It begins with the PI's participation in the IMSI fall 2023 long program, entitled Algebraic Statistics and Our Changing World, and continues with a series of three scientific projects, along with educational and outreach activities, aimed at empowering individuals to assess their cancer and health risks effectively, thereby enabling proactive detection of cancer and other diseases at earlier stages. Using electronic medical records of patients in Chicago related to colorectal cancer, lung cancer, and postpartum health outcomes, along with socio-environmental information, a novel and equitable machine learning methodology will be tested and compared with current, broadly used algorithms, not only to predict cancer and health outcomes but also to study the effect of exposure to violence on health. The full mathematical and statistical investigation in each of the three research projects (colorectal cancer, lung cancer, and postpartum health outcomes) will not only advance healthcare predictive modeling but also inform the effective, accurate, and unbiased use of machine learning for predictive modeling in healthcare and beyond. Participation in the IMSI long program will expose the PI to state-of-the-art research in related fields and to ideas for future work, and will provide time for discussion with workshop participants, with the potential to develop new scientific collaborations and enhance the research of his students and collaborators. The scientific and educational activities will improve the well-being of individuals in society, reduce inequities in society, reduce health distrust among underserved communities, and increase the number of women and underrepresented minorities in STEM in general, and in mathematics and statistics in particular.
Emerging scientific evidence suggests that societal and neighborhood-level factors can elicit a toxic and sustained stress response that promotes biological changes associated with the development of cancers. Using electronic medical records of patients in Chicago related to colorectal cancer, lung cancer, and postpartum health outcomes, along with socio-environmental information, the PI will perform a full investigation of a machine learning classification methodology, the triple discriminant scoring methodology, and compare its performance with existing, broadly used techniques such as Extreme Gradient Boosting and Neural Networks. Our preliminary predictive modeling of colorectal adenomas with the triple discriminant scoring methodology, which relies on a large number of simulations and optimization tests across all possible subsets of data variables, showed robustness against changes in the training data distribution, unlike Extreme Gradient Boosting. The three research projects, i.e., predicting colorectal cancer, lung cancer, and postpartum health outcomes, will provide three large electronic medical records data sets on which we will conduct a comprehensive mathematical, statistical, and empirical investigation of various machine learning classification methods, in order to establish an effective, accurate, and unbiased use of machine learning in healthcare and beyond. In addition, the PI will participate in the Institute for Mathematical and Statistical Innovation (IMSI) fall 2023 long program, entitled Algebraic Statistics and Our Changing World, to harness and expand his interdisciplinary research, develop new research collaborations, and enhance the research of his students and collaborators.
Finally, the scientific and educational activities will improve the well-being of individuals in society, reduce inequities in society, reduce health distrust among underserved communities, and increase the number of women and underrepresented minorities in STEM in general, and in mathematics and statistics in particular. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
