Machine Learning for Game World Generation using Intrinsic Motivation as an Objective Function.

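Basic Information

Grant number:
2119222
Principal investigator:
Amount:
--
Host institution:
Goldsmiths University of London
Host institution country:
United Kingdom
Project category:
Studentship
Fiscal year:
2018
Funding country:
United Kingdom
Start and end dates:
2018 to (no data)
Project status:
Not concluded

Source:
https://gtr.ukri.org/projects?ref=studentship-2119222
Keywords:
Machine Learning Game World Generation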

Project Abstract

New forms of machine learning are increasingly important in games research, and generative models demonstrate great potential for content creation. Gatys et al. [1] demonstrate a method for texture synthesis using convolutional neural networks that exceeded the state of the art in the fidelity of the textures generated. Chen et al. [2] develop a method for rendering high-definition photographic images from semantic pixel labels using cascaded refinement networks. Tompson et al. [3] train a convolutional neural network to model complex fluid simulations, allowing them to approximate such simulations more efficiently. Bansal et al. [4] train reinforcement learning agents in competitive self-play environments, resulting in complex emergent behaviours in otherwise simple environments. When such systems are computationally efficient enough to run in real time, they will transform the way video games are both produced and played.

Variational, adversarial and autoregressive systems can be very successful at performing inference on the statistical distribution of high-dimensional data. This allows models to generate content with sophisticated representations. However, they are not very good at novel content generation.

Schmidhuber et al. [5] propose a reinforcement learning agent motivated not by external reward but by intrinsic motivation. The agent is driven by Compression Progress.
It constantly samples the world, trying to compress it and inferring patterns and regularities in the data, but it is motivated to seek out data that is novel "as long as the algorithmic regularity that makes it simple has not yet been fully assimilated by the adaptive observer who is still learning to compress the data better", which they define as 'the time-dependent subjective Interestingness'.

I will develop new kinds of machine learning systems driven by intrinsic motivation (such as time-dependent subjective interestingness [5]), optimised towards creating systems that display complex emergent behaviour. Defining the structure and complexity of emergent behaviour is inherently subjective [6]. An agent that has this subjective measure of novelty (with respect to other structures in an underlying system) and an intrinsic motivation to seek it out, in the form of an objective function, should be able to detect or encourage the development of complex systems that produce more novel outcomes.

Research Questions
- Can Intrinsic Motivation be used to develop efficient procedural content generation systems that can be deployed in live game environments for texture, model, and world generation?
- Can Intrinsic Motivation be used to develop agent behaviour policies or physical simulations that behave in complex, unpredictable ways?
- What kinds of machine learning techniques are best suited to implementing this approach? (Reinforcement learning, variational inference, differentiable neural computers)
- Can this technique be augmented with existing generative systems to encourage sampling more interesting content?

Research Plan
I will begin by designing systems that use objective functions to create agents driven by compression progress. These will then be applied to problems of procedural content generation and complex agent behaviour.
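As a rough illustration of the compression-progress objective described above, the following minimal sketch rewards an observer for each bit it saves on its own history after learning from a new observation. It is not the formulation in [5] nor the project's method: the Laplace-smoothed frequency model and the names `AdaptiveCompressor`, `compression_progress` and `rewards_for_stream` are assumptions introduced purely for illustration.

```python
import math
from collections import Counter


class AdaptiveCompressor:
    """Toy adaptive model (assumed for illustration): Laplace-smoothed symbol frequencies."""

    def __init__(self, alphabet_size):
        self.alphabet_size = alphabet_size
        self.counts = Counter()
        self.total = 0

    def code_length(self, symbols):
        """Ideal code length, in bits, of `symbols` under the current model."""
        bits = 0.0
        for s in symbols:
            p = (self.counts[s] + 1) / (self.total + self.alphabet_size)
            bits -= math.log2(p)
        return bits

    def update(self, symbol):
        """Learn from one observed symbol."""
        self.counts[symbol] += 1
        self.total += 1


def compression_progress(compressor, history, observation):
    """Intrinsic reward: bits saved on history + observation by learning from the observation."""
    data = history + [observation]
    before = compressor.code_length(data)
    compressor.update(observation)
    after = compressor.code_length(data)
    return before - after


def rewards_for_stream(stream, alphabet_size):
    """Feed a stream of symbols to a fresh compressor and collect the intrinsic rewards."""
    compressor = AdaptiveCompressor(alphabet_size)
    history, rewards = [], []
    for obs in stream:
        rewards.append(compression_progress(compressor, history, obs))
        history.append(obs)
    return rewards


if __name__ == "__main__":
    # A perfectly regular stream: the reward is positive but shrinks as the pattern is learned.
    for r in rewards_for_stream("aaaaaaaa", alphabet_size=2):
        print(round(r, 3))
```

On a constant stream the reward stays positive but decays toward zero, mirroring the idea that a regularity stops being interesting once the observer has assimilated it. In a reinforcement learning setting, a quantity of this kind would stand in for the external reward signal.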
At the end of this process I will create a game that implements and demonstrates these methods for procedural content generation, driven by machine learning.

[1] Gatys, Leon, Alexander S. Ecker, and Matthias Bethge. "Texture synthesis using convolutional neural networks." In Advances in Neural Information Processing Systems, pp. 262-270. 2015.
[2] Chen, Qifeng, and Vladlen Koltun. "Photographic image synthesis with cascaded refinement networks." In The IEEE International Conference on Computer Vision (ICCV), vol. 1. 2017.
[3] Tompson, Jonathan, Kristofer Schlachter, Pablo Sprech
