Abstract
BACKGROUND: Coherence across sites in multicenter datasets is a key data quality dimension for reliable health data reuse, as unexpected heterogeneity in the data can bias analyses and limit the generalizability of results.
OBJECTIVE: This work aims to characterize and label the data coherence across sites in the first European multicenter dataset on cancer prevention and early detection among the homeless population in Europe: coadapting and implementing the health navigator model. The dataset was created to support research on the disparities in health and health care access that people experiencing homelessness face because of barriers such as unstable housing, limited resources, and social stigma.
METHODS: The dataset comprises 652 cases: 142 from Austria, 158 from Greece, 197 from Spain, and 155 from the United Kingdom. All participants met the classifications of the European Typology of Homelessness and Housing Exclusion. In this longitudinal study, questionnaires were collected at baseline, after 4 weeks, and at the end of the intervention. The 180-question survey covered sociodemographic data, overall health, mental health, empowerment, and interpersonal communication. Data variability was assessed using information theory and geometric methods to analyze discrepancies in distributions and completeness across the dataset.
RESULTS: Substantial variability was observed among the 4 pilot countries, both in the overall analysis and within specific domains. Measures of health care empowerment, quality of life, and interpersonal communication showed the greatest discrepancies among pilot sites, whereas the health domain was the exception and remained comparatively consistent. Spain exhibited the most pronounced differences, characterized by a high number of missing values related to interpersonal communication and the use of health care services.
CONCLUSIONS: Data from the health domain may be comparable across the 4 countries; however, the other questionnaires showed substantial differences and therefore require independent, country-specific analyses. This study underscores the heterogeneity among people experiencing homelessness and the critical need for data quality assessments to inform future research and policymaking in this field.
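
ILLUSTRATIVE SKETCH: The abstract does not name the specific information-theoretic metric used. As one plausible illustration only, the snippet below compares the distribution of a single categorical survey item between two sites using the Jensen-Shannon distance, treating missing answers as their own category so that differences in completeness also contribute. The choice of metric, the function names (`distribution`, `js_distance`), and the sample answers are all assumptions made for this example and are not taken from the study.

```python
import math
from collections import Counter


def distribution(values, categories):
    """Relative frequency of each category; None is counted as 'missing'."""
    counts = Counter("missing" if v is None else v for v in values)
    total = sum(counts.values())
    return [counts.get(c, 0) / total for c in categories]


def js_distance(p, q):
    """Jensen-Shannon distance (base-2 logs), bounded between 0 and 1."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Kullback-Leibler divergence; terms with zero mass contribute nothing.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return math.sqrt(max(0.0, (kl(p, m) + kl(q, m)) / 2))


# Hypothetical answers to one survey item at two sites (not study data).
site_a = ["yes", "no", "yes", "yes", None]
site_b = ["no", None, None, "no", "yes"]
categories = ["yes", "no", "missing"]

d = js_distance(distribution(site_a, categories),
                distribution(site_b, categories))
print(f"Jensen-Shannon distance between sites: {d:.3f}")
```

A distance of 0 means the two sites have identical answer distributions for that item, and 1 means they share no categories at all. Repeating the comparison for every pair of sites and every item would give a pairwise dissimilarity matrix, which is the kind of input that geometric methods can then summarize.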