Gradient Retention Time Modeling in Ion Chromatography through Ensemble Machine Learning-Powered Quantitative Structure-Retention Relationships

Abstract

Quantitative structure-retention relationships (QSRRs) are a popular modeling approach in ion chromatography for predicting retention time from molecular structure. They are often coupled with solvent strength models to extend predictions to other isocratic chromatographic conditions. While this approach has achieved reasonable success, inconsistencies in the solvent strength model may propagate to the QSRR models and amplify their errors. In this work, we incorporate information on the isocratic conditions directly into the QSRR model to reduce error propagation and build global models. Four machine learning approaches that can account for both global and local sources of variability in chromatographic retention were evaluated and compared: random forest regression, gradient boosting regression (GBR), extreme gradient boosting (xgBoost), and adaptive boosting (AdaBoost). A partial least-squares model was built as a baseline. GBR and xgBoost showed the best predictive ability among the evaluated models, with root-mean-square errors (RMSEs) of isocratic retention of 0.025 (+0.009, -0.006) and 0.025 (+0.008, -0.006), respectively. The developed QSRR models were then incorporated into an isocratic-to-gradient model to predict gradient retention. The GBR and xgBoost QSRR models outperformed the other models, with RMSEs of gradient retention of 0.358 (+0.199, -0.107) and 0.385 (+0.387, -0.139) min, respectively. This approach demonstrates the benefits of incorporating the eluent composition into prediction models and has the potential to extend to other chromatographic techniques.
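The abstract does not spell out the isocratic-to-gradient step, so the following is only a minimal sketch of one common formulation, not the authors' implementation. It assumes the fundamental gradient equation, in which the analyte elutes once the integral of dt / (t0 · k(t)) reaches 1, and it ignores system dwell time. The function `gradient_retention_time`, the stand-in isocratic model `toy_log_k` (a linear log k versus log eluent concentration relation with made-up coefficients), and the gradient profile are all hypothetical; in practice `log_k_fn` would be replaced by a trained QSRR model that takes the eluent concentration as an input.

```python
import math


def gradient_retention_time(log_k_fn, conc_fn, t0, dt=0.001, t_max=120.0):
    """Estimate gradient retention time (min) from an isocratic model.

    log_k_fn: maps eluent concentration -> log10 of the retention factor k
              (a placeholder for a trained QSRR model; hypothetical here).
    conc_fn:  maps time (min) -> eluent concentration at that time.
    t0:       column void time (min).

    Numerically solves  integral_0^{tR - t0} dt / (t0 * k(c(t))) = 1
    with the midpoint rule; dwell time is ignored (an assumption).
    """
    area = 0.0
    t = 0.0
    while t < t_max:
        # Retention factor at the midpoint of the current time step.
        k = 10.0 ** log_k_fn(conc_fn(t + 0.5 * dt))
        step = dt / (t0 * k)
        if area + step >= 1.0:
            # Interpolate inside the step where the integral reaches 1.
            t += dt * (1.0 - area) / step
            return t0 + t
        area += step
        t += dt
    raise ValueError("analyte did not elute within t_max")


def toy_log_k(conc, a=2.0, b=1.0):
    """Made-up isocratic model: log10 k = a - b * log10(conc)."""
    return a - b * math.log10(conc)


def linear_gradient(t):
    """Hypothetical gradient: 5 mM rising by 2.25 mM/min, capped at 50 mM."""
    return min(50.0, 5.0 + 2.25 * t)


if __name__ == "__main__":
    t_r = gradient_retention_time(toy_log_k, linear_gradient, t0=1.0)
    print(f"predicted gradient retention time: {t_r:.3f} min")
```

With a constant `conc_fn`, the routine reduces to the isocratic relation tR = t0 · (1 + k), which gives a quick sanity check before plugging in a real model.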
