Short Course in Data Science for Environmental Public Health
Fred Hutchinson Cancer Center, Seattle WA
Investigators
Abstract
Nearly all science fields have been drastically changed by the big data era, and datasets have the potential to create major breakthroughs in environmental health. However, professional development in data science lags behind for environmental health researchers and practitioners. What training does exist primarily benefits research-intensive institutions. We propose the Short Course in Data Science for Environmental Public Health to bridge this education gap. Through the Fred Hutch Cancer Center Data Science Lab, we will leverage our combined 25-plus-year track record of developing educational materials, scalable courses, and scalable research experiences, and of building communities around data science education, to create this multimodal course. The program, which will empower 30 learners annually, begins with a two-week online course that solidifies R programming foundations. These two weeks will combine didactic lectures on best practices with hands-on lab activities to practice and ingrain programming skills, a model for which the lead instructors have earned recognition for excellence in teaching and which they have successfully used to train over 100 professional learners. Participants will practice new skills one topic at a time to make the content more manageable. This foundation will prepare participants for a three-day in-person intensive "Code-a-thon" where they work on authentic environmental health projects. The Code-a-thon will allow participants to practice data ethics skills in peer code review, reproducibility, and transparency in a supportive environment. Additionally, to ensure that we are responsive to the needs of all participants, we will give learners a mechanism to provide anonymous feedback throughout and beyond the program. To create scalability, we will adapt a companion Massive Open Online Course (MOOC) so that potentially thousands of participants can benefit.
We will also harness the strengths of in-person instruction by creating a yearly training for instructors hoping to reproduce this course in their own institution or community. These efforts will be bolstered by an online data community where participants can support, troubleshoot, and collaborate with peers, as well as monthly reminder newsletters to help participants retain what they learn. We will work with our existing network of faculty from a variety of institutions to recruit researchers and faculty from primarily education-focused institutions, such as community colleges, to participate in the live course. The course will be open to anyone and offered for free to help break down barriers to participation. Throughout, learners will work with relevant health datasets with the ultimate goal of understanding and addressing grand challenges in environmental health.