Abstract
Objective: Cognitive tasks in the National Health and Aging Trends Study (NHATS) are used to classify probable dementia in Medicare beneficiaries. Whether these tasks measure the same latent construct with comparable precision across English-proficiency groups has not been established. Although the NHATS is offered in English and Spanish versions, it remains unclear whether translation alone achieves measurement equivalence. This study tests measurement invariance of the NHATS cognitive screening battery across English-language-proficient (ELP) and limited-English-proficient (LEP) groups to determine whether noninvariance produces measurement bias that could misclassify probable dementia and distort prevalence estimates.
Methods: Data from Round 12 (2021) of the NHATS (N = 5,628) were analyzed using multi-group confirmatory factor analysis that incorporated the complex survey design. Factor loadings were constrained to be equal across the ELP and LEP groups, and group differences in the factor (latent) variance and the item residual variances were used to assess measurement noninvariance.
Results: All standardized indicators loaded positively on the cognition factor (p < .001). The factor (latent) variance was larger in the ELP group (1.80) than in the LEP group (1.49). Residual (error) variances were higher in the LEP group, especially for clock drawing (ELP 0.87 vs. LEP 1.38) and name recall (ELP 1.07 vs. LEP 2.63), indicating lower measurement precision in the LEP group.
Discussion: These findings indicate that the NHATS cognitive screening battery does not measure the same construct with equal precision across English-proficiency groups, suggesting that translation alone is insufficient to ensure measurement invariance. The larger factor (latent) variance in the ELP group and the higher residual (error) variances in the LEP group suggest lower measurement reliability in the LEP group and possible cultural bias in item interpretation. Such differences may lead to misclassification of probable dementia and distort population estimates. These results highlight the need for culturally and linguistically adapted cognitive screening tools to improve the validity of dementia surveillance and research.
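To illustrate why a smaller factor variance combined with a larger residual variance implies lower precision, the sketch below computes the share of an item's model-implied variance that is attributable to the latent factor, loading² × factor variance / (loading² × factor variance + residual variance). The variances are those reported above. The loading value is a hypothetical placeholder, because the abstract does not report loadings; the resulting proportions are therefore illustrative only and are not results of the study. The function and variable names are likewise illustrative.

```python
def item_reliability(loading, factor_var, residual_var):
    """Proportion of an item's model-implied variance explained by the factor."""
    true_var = loading ** 2 * factor_var
    return true_var / (true_var + residual_var)


# Variances as reported in the abstract.
FACTOR_VAR = {"ELP": 1.80, "LEP": 1.49}
RESIDUAL_VAR = {
    "clock drawing": {"ELP": 0.87, "LEP": 1.38},
    "name recall": {"ELP": 1.07, "LEP": 2.63},
}

# ASSUMPTION: the abstract does not report loadings. A common loading of 1.0
# is a hypothetical placeholder, held equal across groups to mirror the
# equality constraint described in the Methods.
ASSUMED_LOADING = 1.0


def reliability_by_group(loading=ASSUMED_LOADING):
    """Return {item: {group: reliability}} under the assumed common loading."""
    return {
        item: {
            group: item_reliability(loading, FACTOR_VAR[group], resid[group])
            for group in ("ELP", "LEP")
        }
        for item, resid in RESIDUAL_VAR.items()
    }


if __name__ == "__main__":
    for item, groups in reliability_by_group().items():
        print(item, {g: round(r, 2) for g, r in groups.items()})
```

Under any common positive loading, both items show a lower factor-explained share of variance in the LEP group than in the ELP group, which is the pattern the Results describe as lower measurement precision.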