Evolving under tasks of incomplete information: streaming and self play

Basic information

Grant number:
451239-2013
Principal investigator:
Heywood, Malcolm
Amount:
$64,100
Host institution:
Dalhousie University
Host institution country:
Canada
Program:
Collaborative Research and Development Grants
Fiscal year:
2015
Funding country:
Canada
Duration:
2015-01-01 to 2016-12-31
Project status:
Completed

Source:
https://www.nserc-crsng.gc.ca/ase-oro/Details-Detailles_eng.asp?id=575395
Keywords:
Evolving under tasks of incomplete information

Project abstract

Tasks that are only partially observable present unique challenges for machine learning in general and genetic programming (GP) in particular. The notion of a partially observable application will be considered in two specific application contexts of relevance to the partner organization. The first context is streaming data analysis under limited label availability. The goal is to begin with a small subset of labelled data representing the current status of a task, and then to recognize when the underlying nature of the non-stationary process generating the data stream has changed. On recognizing such a change in the stream's behaviour, the current GP solutions should selectively request labels for the data corresponding to the change. Decisions then need to be made about which of the GP individuals currently employed for detecting change and labelling the stream should be updated or replaced. The second application context is learning to construct non-player characters (NPCs) through self play in a game of incomplete information. An NPC is a computer-based entity that appears in a game to provide additional interest for human players. In this work we are particularly interested in NPCs that appear in the game of poker. The goal, however, is not to produce the strongest possible player, but to provide players who are complementary to the human players currently participating. Poker is an interesting task domain because it too is based on incomplete information (no knowledge of the other players' cards), it is stochastic (the order in which cards appear cannot be predicted), and the capability of opponent players is unknown. The stochastic nature of the task, the lack of complete information, and the non-stationary nature of the strategies adopted by other players all make the poker task particularly challenging from the perspective of machine learning.
