Collaborative Research: Metadata Portal for the Social Sciences
National Opinion Research Center, Chicago IL
Investigators
Abstract
This is a pilot project to provide enhanced access to two seminal social science data resources, the American National Election Studies (ANES) and the General Social Survey (GSS), by creating structured, machine-actionable metadata and a portal populated with new tools for data discovery and analysis. The project will also analyze the current workflows that produce the ANES and GSS data and make recommendations for transitioning to metadata-driven processes that streamline data production and guard against the metadata loss that currently occurs. A team including representatives from the ANES, GSS, and ICPSR will develop rich metadata in Data Documentation Initiative (DDI) XML, a well-defined standard for the social sciences. This work will entail retrofitting all existing documentation, which currently exists in disparate formats, to a uniform XML structure. A sample of the ANES and GSS documentation will be enriched with detailed provenance, universe, and other contextual information that accumulates across the data lifecycle. This sample will also include information to facilitate comparison and harmonization. All documentation created for the project will be freely available. The Metadata Portal for the Social Sciences will demonstrate DDI-based open-source tools for advanced searching, dynamic codebooks, question banks, harmonization, and other functions. The portal will also feature links to bibliographic citations for both surveys and will provide opportunities for researchers and others to comment and interact. An important aspect of the project will be to re-envision the workflows currently used to produce the ANES and the GSS and to lay the groundwork for new metadata-driven workflows that realize a more seamless "interview to internet" process based on DDI and the Generic Statistical Business Process Model (GSBPM).
This project will facilitate transparent and user-friendly access to the ANES and the GSS to enable expert use of the data as well as exploration by more novice users. Production transparency, meaning a clear account of how the data came into being, is essential, and this project will provide structured metadata on the provenance of variables as well as detailed universe statements that let users understand the routing patterns for specific respondents and the sources of missing data. Many researchers are interested in determining the comparability of data items and questions over time, and this project will demonstrate ways to assess comparability for these key time series. In general, the project will provide more information about the ANES and the GSS data than researchers have had access to in the past. Improving the ANES and GSS workflows will lead to the automated capture of more metadata "upstream" that can be made available across the data lifecycle. Improved access to data through better search, extraction, and analysis tools will enable greater participation across all segments of society interested in democratic process and social trends. The demonstration portal and its tools will illustrate the potential of using structured documentation as a foundation for tool development and will be extensible to other surveys, leading to improved accessibility for other social science data resources. The large base of metadata and the open-source applications developed for this project will encourage software developers to create new ways to access ANES, GSS, and other data. New workflows will focus on metadata re-use over the data lifecycle, leading to greater efficiencies and cost savings in creating DDI metadata for all social science data projects.