Abstract
Addresses captured in administrative health data can facilitate the analysis of place-based exposures and their influence on health. When patients change address, hospital records are updated only when they next use health services, so the precise timing of address changes must be inferred. There is therefore a need to determine the temporal accuracy of address recording in these data. Using deidentified data, we compared Lower Super Output Areas (LSOAs) derived from English hospital records (NHS_E, n = 40,102) with linked data on addresses recorded in the UK Longitudinal Linkage Collaboration cohorts (Cohorts, n = 40,963) from January 1989 to April 2023. We compared the accuracy of three methods for estimating the timing of changes in LSOA: 1) inferring the end date of the current address as the start date of the next reported address minus 1 day (N-1 method); 2) using the median date between the current and next start dates as the end date for the current address and as the updated start date for the next address (Median method); 3) generating the address end date as a function of a beta distribution defined in terms of the current and next start dates, assuming that most people update their address soon after they move (Random method). In total, 39,216 (95.7%) Cohort members had at least one matching LSOA reported in both Cohorts and NHS_E data. Of these matching LSOAs, 47% were recorded in the hospital data within the two years before or after the dates recorded in Cohorts. All three methods demonstrated LSOA agreement of approximately 78%, with negligible differences between methods. Methods for estimating the timing of changes in hospital-recorded LSOA, based on address, perform similarly when compared with LSOA of residence in Cohorts data, despite different updating methods. Researchers should consider the assumptions and implications of each method and justify their approach. More accurate and standardised methods are needed for recording address across systems.