This data linkage project develops the record linkage methods necessary to create an unprecedented data resource that covers the American population over seven decades. Specifically, the project develops new strategies for placing unique protected identification keys (PIKs) on twentieth-century census records, evaluates the results of this process, and subsequently optimizes the data for population and health research. These strategies will facilitate the linkage of census, survey, and administrative records, with the goal of creating an integrated database for life-course and intergenerational analyses of health and well-being.
Within a secure data environment, the Census Bureau assigns PIKs to recent census and survey data, which allows it to uniquely identify and link individuals across data sources in order to improve data quality and program efficiency while maintaining confidentiality. This project proposes research to obtain PIK rates on 1940 census data that approach the Bureau's success on recent data. If successful, by matching 1940 cross-sectional data with recent cross-sectional and panel data, this work will allow the research community to (1) construct longitudinal data on individuals over long periods of time; (2) assemble longitudinal data on related individuals (siblings, and parents and children) over long periods of time; and (3) establish data on multiple generations of families (dynasties). These data and analyses will inform future efforts to develop a data infrastructure program that links a range of data sources on individuals and families over long periods of time to study life-cycle and intergenerational issues.