Collaborative Research: Explaining Differential Success in Biodiversity Knowledge Commons
Purdue University, West Lafayette IN
Investigators
Abstract
Scientists increasingly rely on community-governed digital portals to store and access open data. While these digital portals have improved access to data and other scientific products, they are expensive to implement, challenging to maintain, and often fail to gain uptake among key stakeholders. Like those working in other areas of the digital economy, scientists have increasingly adopted platforms to implement and tailor portals to particular communities and needs, a model that lowers infrastructure costs and enables benefits of scale across networked portals. Yet portals built from the same platform vary widely in their outcomes. Participation in any particular portal is often short-lived and the impacts of data portals on the scientific process are challenging to evaluate; portal usage typically fails to map onto traditional measures of research productivity such as publications and citation counts. This project is the first to systematically investigate scientific data portals built from a common platform in order to understand portal communities and outcomes. The research design, which compares biodiversity data portals, will inform studies of other platforms and digital knowledge commons, such as open source software and peer-production communities. Specific findings and recommendations on the effective design and use of these portals will be shared with biodiversity portal stakeholders in order to improve uptake of and access to these important species data and to facilitate science and decision making around environmental change. This project uses fuzzy-set Qualitative Comparative Analysis (fsQCA) to analyze open data portals as a kind of knowledge commons that imposes minimal restrictions on access or reuse.
The study sample comprises the 37 active and 4 inactive biodiversity data portals built from the Symbiota platform, one of the largest and earliest scientific data platforms still under continual development, with hundreds of participating biodiversity collections and several dozen individually managed portals. In 2020, Symbiota portals collectively provided access to over 60 million biodiversity data records and accounted for 90% of Web traffic accessing specimens digitized through the NSF’s Advanced Digitization of Biodiversity Collections (ADBC) program, which has invested over $50 million in this area to date. The project synthesizes multiple types and sources of quantitative and qualitative data to identify why some of these portals achieve sustained growth and others do not. To do so, the project collects and analyzes up to ten years of analytics and other information from the portals, including tracked usage data, community-building activities, features of portal governance, and resource inputs, as well as observation data collected during portal site visits and interviews with a stratified sample of portal managers. The fsQCA approach enables comparative inferences about those portal features most likely to foster productive and sustained outcomes, including collective benefit and building inclusive scientific communities. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
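As a rough illustration of the kind of comparison fsQCA supports, the sketch below computes the standard fuzzy-set consistency and coverage scores for one condition as sufficient for one outcome. The membership scores, the condition name (`strong_governance`), and the outcome name (`sustained_growth`) are hypothetical values invented for this example; they are not project data, and the abstract does not specify how the project calibrates or scores its portals.

```python
def consistency(condition, outcome):
    """Degree to which membership in the condition is a subset of the outcome.

    Standard fuzzy-set formula: sum(min(x_i, y_i)) / sum(x_i).
    """
    overlap = sum(min(x, y) for x, y in zip(condition, outcome))
    return overlap / sum(condition)


def coverage(condition, outcome):
    """Share of the outcome accounted for by the condition.

    Standard fuzzy-set formula: sum(min(x_i, y_i)) / sum(y_i).
    """
    overlap = sum(min(x, y) for x, y in zip(condition, outcome))
    return overlap / sum(outcome)


# Hypothetical fuzzy membership scores (0 = fully out, 1 = fully in)
# for five imaginary portals. These numbers are assumptions.
strong_governance = [0.9, 0.8, 0.2, 0.6, 0.1]
sustained_growth = [1.0, 0.7, 0.3, 0.8, 0.4]

print(f"consistency: {consistency(strong_governance, sustained_growth):.3f}")
print(f"coverage:    {coverage(strong_governance, sustained_growth):.3f}")
```

A consistency score near 1 would suggest that portals with the condition also tend to show the outcome, while coverage indicates how much of the outcome that condition accounts for. A full fsQCA would also calibrate raw measures into membership scores and evaluate combinations of conditions, which this sketch omits.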