Dynamic Factor Analysis for Sparse and Irregular Longitudinal Data: An Application to Metabolite Measurements in a COVID-19 Study

Abstract

Factor analysis (FA) can be used to identify key biomarkers in biological processes by assuming that latent biological pathways (statistically, "latent factors") drive the activity of measurable biomarkers ("observed variables"). However, biological pathways often interact, so the classical FA assumption of independence between factors is questionable. Motivated by sparsely and irregularly collected longitudinal measurements of metabolites in a COVID-19 study, we propose a dynamic factor analysis model that accounts for cross-correlations between pathways via a multi-output Gaussian process (MOGP) prior on the factor trajectories. To mitigate overfitting caused by the sparsity of longitudinal measurements, we introduce a roughness penalty on the MOGP hyperparameters and allow for non-zero mean functions. We also propose a scalable stochastic expectation maximization (StEM) algorithm that, in simulations, is 20 times faster than a previously proposed Monte Carlo expectation maximization algorithm and provides more accurate and stable MOGP hyperparameter estimates. In the motivating COVID-19 study, our methodology identifies a kynurenine pathway that affects the clinical severity of patients with COVID-19 and uncovers the role of the biomarker taurine. Our R package DFA4SIL implements the proposed method.
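
To make the generative structure described above concrete, the following is a minimal simulation sketch in Python, not the DFA4SIL implementation. The abstract does not specify the MOGP kernel, so the sketch assumes an intrinsic coregionalization form: a factor correlation matrix combined with a squared-exponential kernel over time through a Kronecker product. The names `se_kernel`, `simulate_dfa`, `loadings`, and `factor_corr`, and all numeric values, are illustrative assumptions. The roughness penalty, non-zero mean functions, and StEM estimation are not shown.

```python
import numpy as np

def se_kernel(t1, t2, length_scale=1.0):
    """Squared-exponential kernel between two vectors of time points (assumed kernel)."""
    d = t1[:, None] - t2[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

def simulate_dfa(times, loadings, factor_corr, length_scale=1.0, noise_sd=0.1, seed=0):
    """Simulate one subject from a dynamic factor model with cross-correlated factors.

    times       : (q,) irregular observation times
    loadings    : (p, k) loading matrix linking k latent factors to p biomarkers
    factor_corr : (k, k) cross-correlation matrix between factors (assumed structure)
    """
    rng = np.random.default_rng(seed)
    q = len(times)
    p, k = loadings.shape
    K_t = se_kernel(times, times, length_scale)
    # Joint covariance of all factor trajectories: block (i, j) is factor_corr[i, j] * K_t.
    # A small jitter keeps the matrix numerically positive definite.
    cov = np.kron(factor_corr, K_t) + 1e-8 * np.eye(k * q)
    # Draw the stacked trajectories (factor-major order) and reshape to (k, q).
    eta = rng.multivariate_normal(np.zeros(k * q), cov).reshape(k, q)
    # Observed biomarkers are loadings times factors plus independent Gaussian noise.
    y = loadings @ eta + noise_sd * rng.standard_normal((p, q))
    return eta, y

# Six irregularly spaced visits for one hypothetical patient.
times = np.sort(np.random.default_rng(1).uniform(0, 10, size=6))
# Five hypothetical biomarkers loading on two hypothetical pathways.
loadings = np.array([[1.0, 0.0], [0.8, 0.0], [0.0, 1.0], [0.0, 0.7], [0.5, 0.5]])
# Non-zero off-diagonal entries encode interaction between the two pathways.
factor_corr = np.array([[1.0, 0.6], [0.6, 1.0]])
eta, y = simulate_dfa(times, loadings, factor_corr)
```

Setting `factor_corr` to the identity matrix recovers independent factor trajectories, which corresponds to the classical independence assumption that the abstract calls into question.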
