
"Ensuring Legacy Data Access & Dissemination: Occupational Coding In The General Social Survey"

$500,000 | FY2011 | SBE | NSF

National Opinion Research Center, Chicago IL

Abstract

SES-1123510
Michael Hout, University of California, Berkeley, and National Opinion Research Center (NORC)
Peter Marsden, Harvard University and National Opinion Research Center (NORC)
Ensuring Legacy Data Access & Dissemination: Occupational Codes in the General Social Survey
In the past thirty years, changes in technology, business, and government practice have substantially altered the American occupational structure. Our project provides a foundation for understanding the consequences of new occupations for the current economy and contemporary society, and preserves unique data that are key to documenting these fundamental historical changes. Specifically, this project modernizes the occupational and industry data in the General Social Survey (GSS) from the 1970s to the present.
The project has several goals. They dovetail with recent key NSF recommendations that encourage large infrastructure data sources such as the GSS to facilitate increased data access and dissemination. This can be done by presenting data and metadata according to a well-defined protocol, which will allow desirable modes of data access, search, downloads, and documentation. The project also meets the NSF challenge to retrofit historical or legacy data and metadata to become machine readable. This could open up vast amounts of data for dissemination and analysis once issues of confidentiality and disclosure are resolved.
To accomplish these goals, the project will (1) retrieve GSS respondents' detailed verbatim descriptions of their work activities, occupations, and industries from the physical questionnaire manuscripts of early GSS waves, (2) convert them into machine-readable form, (3) recode them to reflect the 2010 occupation and 2007 industry categories developed by the U.S. Census, and (4) attach external data such as socioeconomic scores and prestige assessments to the recoded categories.
The intellectual merit of digitizing occupational information and recoding occupational and industry categories in the process is that it enables researchers to use the full potential of the occupation and industry information recorded in the GSS over time. Doing so will enhance the value of the GSS as a resource for comparative and contemporary research on social inequality, mobility, and other fields, and will preserve its growing value as a historical database describing trends in U.S. society over two generations. By converting handwritten text into machine-readable text, the project ensures the longevity of such legacy data and also develops an archive of verbatim descriptions that will allow future researchers to code them using other standards, including U.S. Census standards that may become available in upcoming decades.
Broader Impacts
The GSS is a public resource as well as a scientific one. Public media, especially newspapers, make extensive use of the GSS. By improving the quality of occupational and industry information in the GSS and ensuring that it is coded consistently over time, this project will help journalists and citizens make sense of social trends and patterns. High schools and colleges also make extensive use of the GSS as a teaching tool. Teachers and students will get more out of these exercises with the new data products this project will produce, whose data will reflect contemporary distinctions among occupations and industries as accurately and precisely as possible.
