A micro-genesis account of longer-form reinforcement learning in structured and unstructured environments

结构化和非结构化环境中长篇强化学习的微观发生机制

阅读:1

Abstract

We explored the possibility that in order for longer-form expressions of reinforcement learning (win-calmness, loss-restlessness) to manifest across tasks, they must first develop because of micro-transactions within tasks. We found no evidence of win-calmness or loss-restlessness when wins could not be maximised (unexploitable opponents), nor when the threat of win minimisation was presented (exploiting opponents), but evidence of win-calmness (but not loss-restlessness) when wins could be maximised (exploitable opponents).

特别声明

1、本页面内容包含部分的内容是基于公开信息的合理引用;引用内容仅为补充信息,不代表本站立场。

2、若认为本页面引用内容涉及侵权,请及时与本站联系,我们将第一时间处理。

3、其他媒体/个人如需使用本页面原创内容,需注明“来源:[生知库]”并获得授权;使用引用内容的,需自行联系原作者获得许可。

4、投稿及合作请联系:info@biocloudy.com。