Explainable rule-based prediction of cultivation media for microbes



Abstract

Knowledge of microbial growth preferences remains dispersed, often confined to research articles or held by human experts, so designing new experiments relies heavily on manual expertise and literature searches. Previous computational efforts have predicted media from phylogenetic similarity or used genomic data for trait modeling, but their predictions often lack a transparent biological rationale or depend on biased features (e.g., incomplete genome annotations). To address this need for greater interpretability, we used the recently introduced KG-Microbe knowledge graph, a harmonized resource of microbial organismal traits and other properties, to explain growth media preferences. We developed a simple, rule-based classifier from these traits and compared its performance and interpretive power with those of a high-performing black-box model. Although the black-box model showed slightly higher overall predictive performance, the rule-based system is transparent and generates verifiable, biologically plausible rules, which makes it a more sustainable and insightful framework. To explore feature importance, we applied SHAP to the black-box model and compared the results with a rule-based feature-importance method. Finally, drawing on the resulting rule set, together with insights from a large language model (LLM) and domain expertise, we propose strategies to advance microbial research. Code, models, and results are available at https://github.com/culturebotai/microbe-rules.
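To make the idea of a trait-based rule concrete, the following is a minimal sketch of how single-trait "IF trait THEN medium" rules could be mined and applied. It is not the authors' implementation: the abstract does not specify the rule-learning algorithm, so the one-trait rule form, the `min_support` and `min_precision` thresholds, the function names, and the toy trait and medium labels are all assumptions chosen for illustration, and nothing here is drawn from KG-Microbe itself.

```python
from collections import Counter, defaultdict

# Toy, made-up records: (set of organism traits, growth medium label).
# Trait and medium names are illustrative only, not taken from KG-Microbe.
TOY_DATA = [
    ({"halophile", "aerobe"}, "marine_broth"),
    ({"halophile", "mesophile"}, "marine_broth"),
    ({"halophile", "aerobe", "mesophile"}, "marine_broth"),
    ({"anaerobe", "mesophile"}, "anaerobic_medium"),
    ({"anaerobe", "thermophile"}, "anaerobic_medium"),
    ({"aerobe", "mesophile"}, "nutrient_agar"),
    ({"aerobe", "thermophile"}, "nutrient_agar"),
]


def learn_rules(records, min_support=2, min_precision=0.9):
    """Mine single-trait rules (hypothetical helper, not the paper's method).

    For each trait, take the medium it most often co-occurs with and keep the
    rule only if the trait is seen often enough (support) and the rule is
    right often enough on the training records (precision).
    """
    by_trait = defaultdict(Counter)
    for traits, medium in records:
        for trait in traits:
            by_trait[trait][medium] += 1

    rules = []
    for trait, counts in by_trait.items():
        medium, hits = counts.most_common(1)[0]
        support = sum(counts.values())
        precision = hits / support
        if support >= min_support and precision >= min_precision:
            rules.append((trait, medium, precision, support))

    # Most reliable rules first; trait name breaks ties deterministically.
    rules.sort(key=lambda r: (-r[2], -r[3], r[0]))
    return rules


def predict(rules, traits, default=None):
    """Return (medium, explanation) from the first rule whose trait matches."""
    for trait, medium, _precision, _support in rules:
        if trait in traits:
            return medium, f"IF {trait} THEN {medium}"
    return default, "no rule fired"


rules = learn_rules(TOY_DATA)
for trait, medium, precision, support in rules:
    print(f"IF {trait} THEN {medium} (precision={precision:.2f}, n={support})")
print(predict(rules, {"halophile", "thermophile"}))
```

Each prediction comes with the rule that produced it, so a domain expert can check the stated rationale directly. That is the kind of transparency the abstract contrasts with a black-box model, whose feature contributions have to be recovered after the fact with a tool such as SHAP.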
