The urge to merge: linking vital statistics records and Medicaid claims

Med Care. 1994 Oct;32(10):1004-18.

Abstract

This paper describes a procedure used to link Medicaid claims data to California vital statistics records for very low birthweight infants. The linkage involved about 53,000 infants born from 1980 to 1987 and 1.46 million claims for delivery/birth-related hospital admissions during the same period. Because the two data files did not share a unique identifier, record linkage required combining evidence across several linking variables: delivery hospital, delivery/birth date or hospitalization period, names, mother's age, and zip code. To combine the various pieces of evidence, we used record linkage theory to compute scores that measure the likelihood of a match, i.e., that two records correspond to the same delivery. These scores appropriately weight the various pieces of evidence for or against a match. Implementation required dealing with large amounts of missing data in one of the files, errors and variations in reported names, and the need to minimize the number of incorrect links. The approach applies to a wide range of linkage problems. The ability to combine existing datasets to form new datasets containing analysis variables from each facilitates analyses that would otherwise be impossible, or prohibitively expensive.
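
The scoring idea in the abstract, weighting each piece of evidence for or against a match, can be illustrated with a minimal sketch in the style of standard probabilistic record linkage (log-likelihood-ratio agreement weights). This is an assumption about the general technique, not the authors' implementation: the abstract does not give the actual probabilities, the name-comparison rules, or how missing data were handled. The field names, the `FIELD_PROBS` values, and the choice to let a missing field contribute zero evidence are all hypothetical.

```python
import math

# Hypothetical (m, u) probabilities per linking variable (illustrative only):
#   m = P(field agrees | the two records are the same delivery)
#   u = P(field agrees | the two records are different deliveries)
FIELD_PROBS = {
    "hospital": (0.98, 0.01),
    "date": (0.95, 0.003),
    "surname": (0.90, 0.001),
    "mother_age": (0.92, 0.05),
    "zip": (0.85, 0.002),
}


def field_weight(m, u, agrees):
    """Log2 likelihood ratio for one field: positive on agreement, negative on disagreement."""
    if agrees:
        return math.log2(m / u)
    return math.log2((1 - m) / (1 - u))


def match_score(rec_a, rec_b, probs=FIELD_PROBS):
    """Sum the per-field weights for a candidate pair of records.

    Assumption: a field missing from either record adds no evidence
    (weight 0) rather than counting as a disagreement.
    """
    score = 0.0
    for field, (m, u) in probs.items():
        a, b = rec_a.get(field), rec_b.get(field)
        if a is None or b is None:
            continue
        score += field_weight(m, u, a == b)
    return score


if __name__ == "__main__":
    birth = {"hospital": "H01", "date": "1985-03-02", "surname": "GARCIA",
             "mother_age": 24, "zip": "90001"}
    claim = {"hospital": "H01", "date": "1985-03-02", "surname": "GARCIA",
             "mother_age": 24, "zip": None}  # zip missing on the claim
    print(match_score(birth, claim))
```

In practice a candidate pair would be accepted as a link only when its score exceeds a threshold chosen to keep incorrect links rare; the threshold is likewise not stated in the abstract.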

Publication types

  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Bias
  • Birth Certificates*
  • California / epidemiology
  • Databases, Factual*
  • Death Certificates*
  • Delivery, Obstetric / economics
  • Delivery, Obstetric / statistics & numerical data
  • Delivery, Obstetric / trends
  • Female
  • Fetal Death / epidemiology
  • Health Care Costs / statistics & numerical data
  • Health Care Costs / trends
  • Hospitalization / economics
  • Hospitalization / statistics & numerical data
  • Hospitalization / trends
  • Humans
  • Infant Mortality*
  • Infant, Low Birth Weight*
  • Infant, Newborn
  • Insurance Claim Reporting
  • Likelihood Functions
  • Medicaid / statistics & numerical data*
  • Medical Record Linkage / methods*
  • Pregnancy
  • Reproducibility of Results
  • Risk Factors
  • Survival Rate
  • United States