General forward model

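Basic information

Grant number:
2109450
Principal investigator:
Funding amount:
--
Host institution:
Queen Mary University of London
Host institution country:
United Kingdom
Project category:
Studentship
Fiscal year:
2018
Funding country:
United Kingdom
Start and end dates:
2018 to no data
Project status:
Completed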

Source:
https://gtr.ukri.org/projects?ref=studentship-2109450
Keywords:
General forward model

Project abstract

Research Context

Artificial intelligence (AI) algorithms such as deep learning have been used successfully in many fields, including computer vision, natural language processing and decision-making systems. AI has been particularly successful in learning how to play games such as the ancient Chinese game of Go, Atari games such as Breakout and Pong, and more recently complex games such as Starcraft and DOTA2. Many successful algorithms require an ability to look forward in time effectively and to plan next moves based on the outcome of these internal experiments. This usually requires access to a perfect underlying model of the system, or a very accurate hand-crafted replica of it. There have been several recent attempts to build models that predict possible future outcomes and plan accordingly without access to the underlying model.

Aims and Objectives

The aim of this research is to create an efficient and generic way for simulated environments to be accurately predicted. This research will address several questions:

Can predictive models of the simulated environment improve the results of training?
Are accurate models of the environment necessary to train AI algorithms to perform complex tasks, or can models that effectively "filter" unnecessary information be used?
In games with several players, whose actions may or may not be predictable, can predictive models be used to create better adversaries?
Can predictive models of simulated environments be used to understand risk?

Potential Applications and Benefits

Using games as a test-bed for research has advantages in safety, cost and control of time. For example, when self-driving cars are trained to avoid collisions in a simulated environment, the cars can make mistakes without endangering the public; they can exist entirely in software, so no physical components need to be maintained or replaced; and simulations can be 'sped up', meaning that several hours of training time can be compressed into minutes.

If there are ways to predict outcomes in real-world environments, then the applications of this research are widespread. Being able to accurately predict the outcomes of actions several seconds into the future would provide a large advantage for safety-critical applications. Additionally, in environments with hard-to-predict parameters, AI algorithms that can plan ahead for best- and worst-case scenarios would be able to understand the underlying risk in their actions.
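The idea of looking forward in time and planning from "internal experiments" can be illustrated with a minimal sketch. It is not the project's method: the one-dimensional environment, the names `GOAL`, `ACTIONS`, `forward_model`, `rollout_return` and `plan_next_action`, and the use of exhaustive search over short action sequences are all assumptions made for illustration. The hand-written `forward_model` stands in for a model that would, in the setting described above, be learned from data.

```python
from itertools import product

# Hypothetical toy setup: the agent moves along a line and wants to reach GOAL.
GOAL = 5
ACTIONS = (-1, 0, 1)


def forward_model(state, action):
    """Predict the next state and reward for taking `action` in `state`.

    Assumption: this hand-written function is a stand-in for a learned model.
    """
    next_state = state + action
    reward = -abs(GOAL - next_state)  # closer to the goal is better
    return next_state, reward


def rollout_return(state, plan):
    """Simulate a sequence of actions inside the model and sum the predicted rewards."""
    total = 0
    for action in plan:
        state, reward = forward_model(state, action)
        total += reward
    return total


def plan_next_action(state, horizon=3):
    """Try every action sequence of length `horizon` and return the first action of the best one."""
    best_plan = max(
        product(ACTIONS, repeat=horizon),
        key=lambda plan: rollout_return(state, plan),
    )
    return best_plan[0]


if __name__ == "__main__":
    state = 0
    for _ in range(5):
        # Re-plan at every step, then apply only the first planned action.
        state += plan_next_action(state)
    print(state)
```

Exhaustive search is only feasible here because the action set and horizon are tiny; with larger problems, sampling-based or tree-search planners would typically replace the `product` enumeration.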
