Forging Consensus: A Data-Driven Framework for Studying Scientific Consensus and Debate
Northeastern University, Boston MA
Investigators
Abstract
Healthy debate between scientists drives scientific progress: competition between ideas and hypotheses encourages the search for new evidence and motivates scientists to reckon with their different, often diverging theories and worldviews. Understanding science, therefore, requires understanding where and why scientific debates occur. However, the study of scientific debate has historically been hindered by a lack of accepted data and methodological approaches. This project will overcome this challenge using a data-driven approach that leverages increasingly available data on scholarly activity to identify debates across millions of published scientific documents and to provide a quantitative score of the level of debate surrounding a research topic. The data and techniques developed for this project will be publicly released and will provide a foundation for a new science of scientific debate; their potential will be demonstrated by applying them to the study of the evolution of scientific debate within COVID-19 research. A strong understanding of scientific debates, including where they occur and why, is needed to inform theoretical models of science, new tools for improving the accessibility of the scientific literature, and better decisions in science policy and governance that will accelerate the pace of scientific discovery. This project addresses three primary objectives. First, using a variety of heuristic and machine-learning-based techniques, a corpus of exemplar debates will be automatically identified from among millions of scientific publications indexed in major bibliographic and full-text databases. Second, we will use this corpus to develop and rigorously validate a suite of topic-level quantitative indicators of debate that apply state-of-the-art techniques to publication metadata, citation linkages, and full-text information.
The most successful of these indicators will be combined into a mathematical model and used to infer a single debate score for each topic. These indicators will, for the first time, facilitate the empirical study of the incidence of debate and the evolution of consensus across all of science. Finally, we will demonstrate the potential of these indicators by using them to address policy-relevant research questions about the impacts of the COVID-19 pandemic on science; specifically, we will investigate the role of consensus in the societal use of knowledge and the incidence of fake news, and whether practices that accelerated science during the pandemic also accelerated consensus formation. The research will contribute to several fields, including the science of science, science communication, and public policy. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.