Table 4

Additional variables used to inform match determinations in hard cases

| Consideration | Intuition | Place of application* |
| --- | --- | --- |
| Rarity of name in the population (see section IX of the online supplementary appendix) | Two records with same name but minor discrepancies on another link variable are more likely to be the same if first, middle or last name is uncommon. | Step B: substep 2, substep 3(2)(d). Step C: substep 2. Step D: name bins 3, 4, 8. |
| Geodistance between discrepant addresses | Persons who move addresses are more likely to relocate near (eg, same city or county) than far (eg, distant city or county). | Step A: all name bins. Step B: substep 3(2). Step C: substep 1. Step D: auto rule-in bin B; routing rule for name bin 11; name bins 1, 5. |
| Geodistance + rurality | All else equal, two records that match on all variables except address are more likely to be true matches if both are in the same sparsely populated area than if both are in the same densely populated area. | Manual review only. |
| Time interval between discrepant dates of birth | When errors in (or intended alternate uses of) birth dates occur, the conflicting dates are more likely to be proximate than distant. | Step D: blocking key; auto rule-in bin D. |
*Refers to locations in the charts of linkage algorithms provided in section V of the online supplementary appendix.
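
The table gives the intuition behind each auxiliary variable but not how it is computed. As an illustration only, the following minimal Python sketch shows one way three of these quantities could be derived for a pair of records. Every function name here is hypothetical, and the specific choices (a simple population share for name rarity, haversine distance between geocoded addresses, and an absolute day count between birth dates) are assumptions, not the authors' method.

```python
from collections import Counter
from datetime import date
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def name_rarity(name, population_names):
    """Share of the population carrying `name` (case-insensitive).

    Smaller values mean a rarer name. Hypothetical helper.
    """
    counts = Counter(n.lower() for n in population_names)
    return counts[name.lower()] / len(population_names)


def geodistance_km(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance in km between two geocoded addresses."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def dob_gap_days(dob_a, dob_b):
    """Absolute number of days between two discrepant dates of birth."""
    return abs((dob_a - dob_b).days)
```

A reviewer or rule could then treat a pair more favourably when the name share is low, the address distance is short, or the birth-date gap is small. Any thresholds for doing so would need to come from the linkage algorithm charts in the supplementary appendix, not from this sketch.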