Deep Bayesian Reinforcement Learning

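Basic Information

Grant number:
522237-2018
Principal investigator:
McLeod, Robert
Amount:
approximately $7,900
Host institution:
University of Manitoba
Host institution country:
Canada
Program:
Engage Plus Grants Program
Fiscal year:
2018
Funding country:
Canada
Project period:
2018-01-01 to 2019-12-31
Project status:
Completed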

Source:
https://www.nserc-crsng.gc.ca/ase-oro/Details-Detailles_eng.asp?id=659378
Keywords:
Deep Bayesian Reinforcement Learning

Project Abstract

The proposed Engage Plus grant will continue to explore the utility of Bayesian reinforcement learning (BRL) and integrate it into Sightline's high-performance computing (HPC) framework. In general, reinforcement learning (RL) is a class of machine learning (ML) methods in which a software agent interacts with an unknown environment, with the goal of learning or finding a policy that optimizes some performance metric. Underlying models include Markov decision processes and their variants. BRL is typically competitive and unsupervised, with the objective of estimating a random variable by producing a probability distribution for it. Although well thought out and well designed, BRL has not been widely applied or adopted. It is still in relative infancy in spite of demonstrated successes, and this aspect is of direct interest to Sightline, which is approaching machine learning from the application perspective. It is anticipated that combining powerful models for policy approximation, such as those associated with deep probabilistic networks, with a BRL approach can facilitate better exploration-exploitation trade-offs in policy optimization. This Engage Plus opportunity will provide Sightline with another ML offering to add to its suite of analytics tools and will explore the underutilized and untapped potential of BRL.
