SR-LLM: An incremental symbolic regression framework driven by LLM-based retrieval-augmented generation


Abstract

Symbolic regression (SR) has regained research prominence as advances in deep learning accelerate the search for analytical models from observational data. However, the vast search space often prevents existing algorithms from yielding complex analytical expressions. We present SR-LLM, an SR framework that integrates a retrieval-augmented generation mechanism based on large language models (LLMs) to achieve incremental learning. Specifically, the framework leverages accumulated prior knowledge and past exploration results stored in external knowledge bases, retrieving the information most relevant to the current regression task. It first composes this prior information into small symbolic groups with the assistance of LLMs, then uses deep reinforcement learning to combine these groups into complex yet interpretable analytic expressions that humans can readily understand. This capacity for efficient knowledge reuse allows the framework to integrate all previous human experience and exploration results, effectively learning by standing on the shoulders of giants. To validate the proposed method, we not only test the framework on popular symbolic regression benchmarks but also extend it to a domain where the explicit optimal model remains controversial: how to analytically describe human car-following behavior from observed vehicle trajectories. Experiments confirm that our method outperforms existing approaches on standard benchmarks, successfully rediscovers well-known traditional car-following models, and discovers new models from empirical trajectory data, achieving both goodness of fit and interpretability.
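The retrieve-then-compose loop described in the abstract can be sketched minimally. Everything below (the tag-based retrieval, the grid search standing in for the reinforcement-learning combiner, and all names such as `KNOWLEDGE_BASE`) is an illustrative assumption, not the paper's implementation:

```python
# Hypothetical sketch of SR-LLM's retrieve-then-compose idea:
# retrieve relevant symbolic groups from a knowledge base, then
# search for a combination that fits the observed data.
import math

# Toy external knowledge base: symbolic "groups" (sub-expressions)
# from prior tasks, each tagged with keywords for retrieval.
KNOWLEDGE_BASE = [
    {"name": "sin(x)",  "expr": lambda x: math.sin(x),  "tags": {"periodic", "oscillation"}},
    {"name": "x^2",     "expr": lambda x: x ** 2,       "tags": {"polynomial", "growth"}},
    {"name": "exp(-x)", "expr": lambda x: math.exp(-x), "tags": {"decay", "relaxation"}},
]

def retrieve(query_tags, k=2):
    """Rank knowledge-base entries by tag overlap with the task description
    (a crude stand-in for LLM-based semantic retrieval)."""
    ranked = sorted(KNOWLEDGE_BASE,
                    key=lambda e: len(e["tags"] & query_tags),
                    reverse=True)
    return ranked[:k]

def compose_and_score(groups, xs, ys):
    """Grid-search coefficients for a linear mix of two retrieved groups
    (a stand-in for the paper's deep-RL combination step)."""
    best, best_err = None, float("inf")
    for a in (0.5, 1.0, 2.0):
        for b in (0.0, 0.5, 1.0):
            f = lambda x, a=a, b=b: a * groups[0]["expr"](x) + b * groups[1]["expr"](x)
            err = sum((f(x) - y) ** 2 for x, y in zip(xs, ys))
            if err < best_err:
                best, best_err = (a, b), err
    return best, best_err

# Synthetic task whose ground truth is 2*sin(x).
xs = [0.1 * i for i in range(20)]
ys = [2.0 * math.sin(x) for x in xs]
groups = retrieve({"periodic", "growth"})
coeffs, err = compose_and_score(groups, xs, ys)
print(coeffs, err)
```

In this toy run, retrieval surfaces the periodic group first, and the search recovers the coefficient pair that reproduces the ground truth. The real framework replaces both stand-ins: an LLM performs the retrieval and composition, and reinforcement learning searches the combination space.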
