Construction of a Panel Database from Linked Population and Agricultural Census Records, 1850-1880

$151,516 · FY2004 · SBE · NSF

Vanderbilt University, Nashville TN

Investigators

Abstract

This project will construct an unbalanced panel from existing micro-level federal census data. Specifically, it will link the records for individuals, families and households from the 1850, 1860, 1870 and 1880 federal censuses of population for a sample of townships in the northern United States to the records of the censuses of agriculture for these communities in those census years, which are the only years for which comprehensive agricultural census data are available. The core of the panel is an expanded version of the widely used Bateman-Foust sample from the 1860 censuses of agriculture and population, extended to include more western agriculture (including the West Coast) and to improve coverage of agriculture in the Northeast. Temporal coverage has been expanded to include 1850 as well as 1870 and 1880. Virtually all the individual data have been collected. This proposal underwrites the record linkage, the creation of the panel and its distribution.

Intellectual Merit: Not only is 1850-80 the only period for which a panel of linked data between the census of population and the census of agriculture can be created, but it is also a period of fundamental economic and social change in an area that has been largely ignored in recent scholarship: the agrarian northern part of the United States. The project therefore promises to reopen scholarship in this area by providing a new, large-scale, user-friendly resource. Among the changes taking place in the communities in the panel are initial European settlement, a spreading transportation network, growing urbanization, and rural stagnation and depopulation. The panel provides much greater detail (albeit for just a limited number of communities) plus an essential time-series dimension not found in other micro-datasets such as the IPUMS. However, it is a complement to, rather than a substitute for, these other samples. Indeed, to enhance complementarity, we will recode using IPUMS codes and construct the same variables (e.g., "own children").
The principal challenge in this project is the linkage of the individual census databases to one another. Specifically, individuals in the census of population in each of the 140 communities will be linked across census years if they remain in those communities. In addition, farmers in each year will be linked to the farms that they operated in the township. Record linkage is primarily through the names of the individuals, supplemented by other information. This task will confront the same kinds of challenges faced by all who assemble and maintain records from assorted sources, whether in biomedical records, taxation or security services, as well as by historians and genealogists. These challenges include data discrepancies and transcription errors that obscure and confuse true links. Although linkage is partially computerized, using various phonetic encoding schemes and taking account of the effects of aging on ink and paper, we have found that human involvement is unavoidable. Indeed, it is critical to success and is the prime consideration behind this funding request.

Broader Impact: While the PI has his own research agenda for these data, the primary objective remains the creation of the panel database as a public good. Numerous scholars have indicated their interest in these data and in their potential value for training graduate students and providing material for graduate theses. Many of the individual samples have already been made available through the PI, USGenWeb and various state and county historical and genealogical societies. Indeed, USGenWeb has formed a special projects taskforce to handle these data. The rest of the individual data will be made available shortly, as they are checked. The completed panel will be archived with ICPSR and also made available on the PI's web page, where other databases that the PI has helped collect already reside.
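
As an illustration of the partially computerized, name-based linkage step described in the abstract, the following minimal Python sketch blocks records on a Soundex code of the surname and then filters on first initial and implied age. It is not the project's actual procedure: the abstract does not name the phonetic schemes used, so American Soundex is an assumption, and the field names (`id`, `surname`, `given`, `age`), the tolerance values and the sample records are all hypothetical.

```python
from collections import defaultdict

# Letter-to-digit table for American Soundex (assumed scheme, not from the source).
_CODES = {}
for _letters, _digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                         ("L", "4"), ("MN", "5"), ("R", "6")):
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex(name):
    """Return the four-character American Soundex code for a name."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    out = [letters[0]]
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate repeated codes
        code = _CODES.get(ch, "")
        if code and code != prev:
            out.append(code)
        prev = code  # a vowel resets prev, so a repeated code is kept
    return ("".join(out) + "000")[:4]


def candidate_links(earlier, later, year_gap=10, age_tolerance=2):
    """Map each earlier-census record id to ids of plausible later matches."""
    # Block the later census by surname Soundex so only similar names are compared.
    blocks = defaultdict(list)
    for rec in later:
        blocks[soundex(rec["surname"])].append(rec)

    links = {}
    for rec in earlier:
        expected_age = rec["age"] + year_gap
        links[rec["id"]] = [
            other["id"]
            for other in blocks.get(soundex(rec["surname"]), [])
            if other["given"][:1].upper() == rec["given"][:1].upper()
            and abs(other["age"] - expected_age) <= age_tolerance
        ]
    return links


# Hypothetical records for one township in two census years.
census_1850 = [
    {"id": "1850-1", "surname": "Meyer", "given": "Johann", "age": 30},
    {"id": "1850-2", "surname": "Smith", "given": "Mary", "age": 25},
]
census_1860 = [
    {"id": "1860-7", "surname": "Myer", "given": "John", "age": 41},
    {"id": "1860-8", "surname": "Smyth", "given": "Martha", "age": 34},
    {"id": "1860-9", "surname": "Smith", "given": "Mary", "age": 36},
]

links = candidate_links(census_1850, census_1860)
for source_id, matches in links.items():
    status = "unique" if len(matches) == 1 else "needs human review"
    print(source_id, matches, status)
```

In this sketch, a record with exactly one candidate could be linked automatically, while a record with zero or several candidates would be passed to a human reviewer, which mirrors the abstract's point that human involvement is unavoidable.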
