Weighting the United States All of Us Research Program data to known population estimates using raking



Abstract

BACKGROUND: The All of Us Research Program aims to collect longitudinal health-related data from a million individuals in the United States. An inherent challenge of a non-probability sampling strategy through voluntary participation in All of Us is that findings may not be nationally representative for addressing health and health care at the population level. We generated survey weights for the All of Us data that can be used to address the challenge.

RESEARCH DESIGN: We developed raked weights using demographic, health, and socioeconomic variables available in both the 2020 National Health Interview Survey (NHIS) and All of Us. We then compared the unweighted and weighted prevalence of a set of health-related variables (health behaviors, health conditions, and health insurance coverage) estimated from All of Us data with the weighted prevalence estimates obtained from NHIS data.

SUBJECTS: The sample included 100,391 All of Us participants 18 years of age and older with complete data collected between May 2017 and January 2022 across the United States.

RESULTS: Final variables in the raking procedure included age, sex, race/ethnicity, region of residence, annual household income, and home ownership. The mean percentage difference between known proportions obtained from the NHIS and All of Us was reduced by 18.89% for health-related variables after applying the raked weights.

CONCLUSIONS: Raking improved the comparability of prevalence estimates obtained from All of Us to known national prevalence estimates. Refining the process of variable selection for raking may further improve the comparability between All of Us and nationally representative data.
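For readers unfamiliar with raking (iterative proportional fitting), the following is a minimal sketch of the general technique, not the authors' implementation. It adjusts weights one variable at a time until the weighted sample margins match target population proportions. The function names (`rake`, `weighted_share`), the two variables, the toy sample, and the target proportions are all hypothetical and chosen only for illustration; the study itself raked on six variables using NHIS margins.

```python
from collections import defaultdict


def rake(rows, margins, max_iter=100, tol=1e-8):
    """Iterative proportional fitting (raking).

    rows:    list of dicts, one per respondent, mapping variable -> category.
    margins: dict mapping variable -> {category: target population proportion}.
    Returns one weight per row; weights sum to roughly len(rows).
    """
    weights = [1.0] * len(rows)
    for _ in range(max_iter):
        max_change = 0.0
        for var, targets in margins.items():
            # Current weighted total in each category of this variable.
            totals = defaultdict(float)
            for row, w in zip(rows, weights):
                totals[row[var]] += w
            grand = sum(totals.values())
            # Rescale so each category's weighted share equals its target.
            for i, row in enumerate(rows):
                cat = row[var]
                factor = targets[cat] * grand / totals[cat]
                max_change = max(max_change, abs(factor - 1.0))
                weights[i] *= factor
        # Stop once a full pass over all variables barely changes any weight.
        if max_change < tol:
            break
    return weights


def weighted_share(rows, weights, var, cat):
    """Weighted proportion of respondents whose `var` equals `cat`."""
    total = sum(weights)
    return sum(w for r, w in zip(rows, weights) if r[var] == cat) / total


# Hypothetical volunteer sample that over-represents older women.
rows = (
    [{"age": "18-44", "sex": "F"}] * 10
    + [{"age": "18-44", "sex": "M"}] * 20
    + [{"age": "45+", "sex": "F"}] * 40
    + [{"age": "45+", "sex": "M"}] * 30
)

# Hypothetical population margins (illustrative values, not NHIS estimates).
margins = {
    "age": {"18-44": 0.50, "45+": 0.50},
    "sex": {"F": 0.51, "M": 0.49},
}

weights = rake(rows, margins)
print(weighted_share(rows, weights, "age", "18-44"))
print(weighted_share(rows, weights, "sex", "F"))
```

Raking matches only the marginal distributions of the chosen variables, not their joint distribution, which is one reason the choice of raking variables affects how closely weighted estimates track national figures.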
