POPDx: an automated framework for patient phenotyping across 392 246 individuals in the UK Biobank study

POPDx:英国生物银行研究中392246名个体患者表型分析的自动化框架

阅读:1

Abstract

OBJECTIVE: For the UK Biobank, standardized phenotype codes are associated with patients who have been hospitalized but are missing for many patients who have been treated exclusively in an outpatient setting. We describe a method for phenotype recognition that imputes phenotype codes for all UK Biobank participants. MATERIALS AND METHODS: POPDx (Population-based Objective Phenotyping by Deep Extrapolation) is a bilinear machine learning framework for simultaneously estimating the probabilities of 1538 phenotype codes. We extracted phenotypic and health-related information of 392 246 individuals from the UK Biobank for POPDx development and evaluation. A total of 12 803 ICD-10 diagnosis codes of the patients were converted to 1538 phecodes as gold standard labels. The POPDx framework was evaluated and compared to other available methods on automated multiphenotype recognition. RESULTS: POPDx can predict phenotypes that are rare or even unobserved in training. We demonstrate substantial improvement of automated multiphenotype recognition across 22 disease categories, and its application in identifying key epidemiological features associated with each phenotype. CONCLUSIONS: POPDx helps provide well-defined cohorts for downstream studies. It is a general-purpose method that can be applied to other biobanks with diverse but incomplete data.

特别声明

1、本页面内容包含部分的内容是基于公开信息的合理引用;引用内容仅为补充信息,不代表本站立场。

2、若认为本页面引用内容涉及侵权,请及时与本站联系,我们将第一时间处理。

3、其他媒体/个人如需使用本页面原创内容,需注明“来源:[生知库]”并获得授权;使用引用内容的,需自行联系原作者获得许可。

4、投稿及合作请联系:info@biocloudy.com。