Task-general reinforcement learning algorithms
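Basic information
- Grant number: 2426703
- Principal investigator:
- Amount: --
- Host institution:
- Host institution country: United Kingdom
- Project category: Studentship
- Fiscal year: 2020
- Funding country: United Kingdom
- Duration: 2020 to (no data)
- Project status: Completed
- Source:
- Keywords:
Project abstract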
This project falls within the EPSRC Information and Communication Technologies (ICT) research area. The goal of the project is to develop algorithms capable of extracting task-general structure from data and using that structure to learn efficiently on novel tasks. These algorithms would enable deployment of reinforcement learning agents in the real world, where they could acquire new skills from little experience. This data-efficient skill acquisition could lead to more economical automation of industrial processes and household routines. While task generalization in reinforcement learning is not a new research topic, tackling it effectively requires zooming out from the single-task perspective that has been prevalent in the field recently. The first step away from that perspective is to treat generalization performance on novel tasks as a key measure of success. This is adopted as a central evaluation criterion for the algorithms developed in this project. The evaluation will consist of gathering both empirical and theoretical support for the algorithms. The outcomes of the development and evaluation are published in the top conferences and academic journals of the artificial intelligence and machine learning communities.
One way to improve task generalization in reinforcement learning is to consider a higher-level learning problem, called meta-learning, which aims to learn the learning algorithm itself with the explicit objective of fast learning on novel tasks. Meta-learning is a promising tool for task generalization because it turns generalization itself into a learning problem, and so can leverage the strength of deep learning when abundant data is available. While meta-learning has seen a surge of interest and many exciting contributions in the past few years, the generalization performance of these approaches on genuinely novel tasks outside the training task distributions has garnered only limited attention. This lack of attention serves as a signpost guiding this research project into the relatively unexplored territory of tackling the questions of task generalization explicitly.
Concretely, in this project new algorithms and training environments are developed for task-general reinforcement learning. To develop new algorithms, novel meta-parameterizations of reinforcement learning agents and of the algorithms themselves will be considered. The generalization performance of reinforcement learning agents relies not only on the training algorithm but also on the training environments and datasets. Therefore, new generalization-focused training environments will have to be developed.
Project outputs
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants
An implantable biosensor microsystem for real-time measurement of circulating biomarkers
- Grant number: 2901954
- Fiscal year: 2028
- Funding amount: --
- Project category: Studentship
Exploiting the polysaccharide breakdown capacity of the human gut microbiome to develop environmentally sustainable dishwashing solutions
- Grant number: 2896097
- Fiscal year: 2027
- Funding amount: --
- Project category: Studentship
A Robot that Swims Through Granular Materials
- Grant number: 2780268
- Fiscal year: 2027
- Funding amount: --
- Project category: Studentship
Likelihood and impact of severe space weather events on the resilience of nuclear power and safeguards monitoring.
- Grant number: 2908918
- Fiscal year: 2027
- Funding amount: --
- Project category: Studentship
Proton, alpha and gamma irradiation assisted stress corrosion cracking: understanding the fuel-stainless steel interface
- Grant number: 2908693
- Fiscal year: 2027
- Funding amount: --
- Project category: Studentship
Field Assisted Sintering of Nuclear Fuel Simulants
- Grant number: 2908917
- Fiscal year: 2027
- Funding amount: --
- Project category: Studentship
Assessment of new fatigue capable titanium alloys for aerospace applications
- Grant number: 2879438
- Fiscal year: 2027
- Funding amount: --
- Project category: Studentship
Developing a 3D printed skin model using a Dextran - Collagen hydrogel to analyse the cellular and epigenetic effects of interleukin-17 inhibitors in
- Grant number: 2890513
- Fiscal year: 2027
- Funding amount: --
- Project category: Studentship
Understanding the interplay between the gut microbiome, behavior and urbanisation in wild birds
- Grant number: 2876993
- Fiscal year: 2027
- Funding amount: --
- Project category: Studentship
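Similar National Natural Science Foundation of China (NSFC) grants
Toward a general theory of intermittent aeolian and fluvial nonsuspended sediment transport
- Grant number:
- Approval year: 2022
- Funding amount: CNY 550,000
- Project category:
Research on a new class of continued-fraction dynamical systems
- Grant number: 11361025
- Approval year: 2013
- Funding amount: CNY 330,000
- Project category: Regional Science Fund Project
Effects and mechanisms of general anesthetics acting on GABAA receptors in the reproductive system on male reproductive function
- Grant number: 30901390
- Approval year: 2009
- Funding amount: CNY 200,000
- Project category: Young Scientists Fund Project
General chromatic numbers and game chromatic numbers of graphs
- Grant number: 10771035
- Approval year: 2007
- Funding amount: CNY 180,000
- Project category: General Program
Research on the expression profiles and regulatory networks of G-protein-related genes in the brain under the action of general anesthetics
- Grant number: 30371375
- Approval year: 2003
- Funding amount: CNY 200,000
- Project category: General Program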
Similar overseas grants
Prospective Validation of Neurophysiologic Outcome Prediction in Acute Brain Injury
- Grant number: 10584338
- Fiscal year: 2023
- Funding amount: --
- Project category:
Collaborative Research: DMS/NIGMS 2: Novel machine-learning framework for AFMscanner in DNA-protein interaction detection
- Grant number: 10797460
- Fiscal year: 2023
- Funding amount: --
- Project category:
Quantifying the cognitive processes supporting computations of stochasticity and volatility in humans
- Grant number: 10732422
- Fiscal year: 2023
- Funding amount: --
- Project category:
Investigating mechanisms mediating enhanced THC reinforcement by nicotine
- Grant number: 10739859
- Fiscal year: 2023
- Funding amount: --
- Project category:
CRCNS: Computational Foundations for Externalizing/Internalizing Psychopathology
- Grant number: 10831117
- Fiscal year: 2023
- Funding amount: --
- Project category:
Evaluating the Efficacy of Telehealth-Delivered Brief Family Involved Treatment (B-FIT) for Alcohol Use Disorder among Veterans
- Grant number: 10705831
- Fiscal year: 2022
- Funding amount: --
- Project category:
Towards Fully Integrated Deep Learning and Reinforcement Learning for General Spatial Domains.
- Grant number: RGPIN-2018-04381
- Fiscal year: 2022
- Funding amount: --
- Project category: Discovery Grants Program - Individual