A data-fusion approach to identifying developmental dyslexia from multi-omics datasets

利用多组学数据集进行数据融合以识别发育性阅读障碍

阅读:1

Abstract

This exploratory study tested and validated the use of data fusion and machine learning techniques to probe high-throughput omics and clinical data with a goal of exploring the etiology of developmental dyslexia. Developmental dyslexia is the leading learning disability in school aged children affecting roughly 5-10% of the US population. The complex biological and neurological phenotype of this life altering disability complicates its diagnosis. Phenome, exome, and metabolome data was collected allowing us to fully explore this system from a behavioral, cellular, and molecular point of view. This study provides a proof of concept showing that data fusion and ensemble learning techniques can outperform traditional machine learning techniques when provided small and complex multi-omics and clinical datasets. Heterogenous stacking classifiers consisting of single-omic experts/models achieved an accuracy of 86%, F1 score of 0.89, and AUC value of 0.83. Ensemble methods also provided a ranked list of important features that suggests exome single nucleotide polymorphisms found in the thalamus and cerebellum could be potential biomarkers for developmental dyslexia and heavily influenced the classification of DD within our machine learning models.

特别声明

1、本页面内容包含部分的内容是基于公开信息的合理引用;引用内容仅为补充信息,不代表本站立场。

2、若认为本页面引用内容涉及侵权,请及时与本站联系,我们将第一时间处理。

3、其他媒体/个人如需使用本页面原创内容,需注明“来源:[生知库]”并获得授权;使用引用内容的,需自行联系原作者获得许可。

4、投稿及合作请联系:info@biocloudy.com。