Optimal Query Selection Using Multi-Armed Bandits



Abstract

Query selection for latent variable estimation is conventionally performed by choosing observations with low noise or by optimizing information-theoretic objectives that reduce the estimated uncertainty given the current best estimate. In these approaches, the system typically makes its decision using the currently available information about the state. However, trusting the current best estimate leads to poor query selection when the true state is far from that estimate, which degrades both the speed and the accuracy of the latent variable estimation procedure. We introduce a novel sequential adaptive action value function for query selection using the multi-armed bandit (MAB) framework, which allows us to find a tractable solution. For this adaptive sequential query selection method, we analytically show (i) a performance improvement in query selection for a dynamical system and (ii) the conditions under which the model outperforms competitors. We also present favorable empirical assessments of the method's performance relative to alternative methods, using both Monte Carlo simulations and human-in-the-loop experiments with a brain-computer interface (BCI) typing system in which the language model provides the prior information.
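To make the contrast in the abstract concrete, the following toy sketch compares a greedy selector that trusts its current estimate with a generic UCB-style bandit rule that adds an exploration bonus. This is not the action value function proposed in the paper, which the abstract does not specify. The two hypothetical queries, their values, the misleading prior, the noise-free rewards, and the constant `c` are all assumptions chosen for illustration.

```python
import math

def select_greedy(estimates, counts, t):
    # Trust the current best estimate: pick the query that looks best right now.
    return max(range(len(estimates)), key=lambda a: estimates[a])

def select_ucb(estimates, counts, t, c=1.0):
    # UCB-style rule (illustrative): current estimate plus an exploration bonus
    # that shrinks as a query is asked more often. The +1 terms keep the bonus
    # finite for queries that have not been asked yet.
    def score(a):
        return estimates[a] + c * math.sqrt(math.log(t + 1) / (counts[a] + 1))
    return max(range(len(estimates)), key=score)

def run(select, true_values, prior, horizon):
    estimates = list(prior)          # prior acts as one pseudo-observation
    counts = [0] * len(prior)
    total = 0.0
    for t in range(1, horizon + 1):
        a = select(estimates, counts, t)
        reward = true_values[a]      # noise-free reward, an assumption for reproducibility
        counts[a] += 1
        # Running average over the prior and all observed rewards for query a.
        estimates[a] += (reward - estimates[a]) / (counts[a] + 1)
        total += reward
    return total, counts

# Hypothetical setup: query 1 is truly better, but the prior says otherwise.
true_values = [0.5, 0.8]
prior = [0.5, 0.1]
horizon = 50

greedy_total, greedy_counts = run(select_greedy, true_values, prior, horizon)
ucb_total, ucb_counts = run(select_ucb, true_values, prior, horizon)

print("greedy:", greedy_total, greedy_counts)
print("ucb:   ", ucb_total, ucb_counts)
```

In this setup the greedy selector never asks query 1, because its prior estimate stays below that of query 0 and is never corrected. The UCB-style rule eventually tries query 1, revises its estimate upward, and accumulates more reward. A realistic query selection problem would have noisy observations and a state-dependent reward, neither of which this sketch models.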
