Research on Sequential Decision Making of Ad Hoc Agents Based on Autonomous Learning



Final Report

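Grant number: 61502322
Project category: Young Scientists Fund
Funding: CNY 200,000 (20.0万元)
Principal investigator: Chen Yingke (陈盈科)
Host institution: Sichuan University (四川大学)
Discipline: F06. Artificial Intelligence
Year concluded: 2018
Year approved: 2015
Project status: Concluded
Participants: Sang Yongsheng (桑永胜), Guo Jixiang (郭际香), Wang Yangxu (汪洋旭), Zhou Yao (周尧), Yan Ming (严明), Wang Lituan (王利团), He Tao (何涛), Fan Rong (范荣)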

Keywords: model learning; decision-making methods; uncertainty


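Abstract (translated from Chinese)

Research on multi-agent decision making often assumes that agents accomplish predefined tasks through communication and coordination. This assumption does not hold for multi-agent systems in which agents compete. Developing agents that can adapt in unknown decision environments, namely Ad hoc Agents, is therefore a challenging emerging problem in multi-agent research. This project will propose a new framework, based on the autonomous learning and decision making of individual agents, for formulating and solving sequential decision problems involving multiple Ad hoc Agents. The main research content is as follows: using machine learning methods, enable an Ad hoc Agent to autonomously construct, from interaction data, models that accurately characterize the behavior of other agents, and to update its own decision model; on this basis, extend the learning algorithms for individual agents' behavior models to learning the abstract behavior of agent populations; and finally, build an Ad hoc Agent simulation platform set in an unmanned aerial vehicle simulation scenario. The project aims to construct a new type of Ad hoc Agent that can autonomously discover and respond appropriately to the behavior of unfamiliar agents, providing a theoretical basis and practical guidance for applying multi-agent technology to more complex and changeable real-world scenarios.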

English Abstract

Multi-agent decision-making techniques typically assume cooperative agents that accomplish predefined tasks through communication and coordination. These techniques, however, do not apply to decision problems involving competitive agents. It is challenging to develop an adaptive agent, namely an Ad hoc agent, that can construct and solve decision problems in an environment shared with other agents whose relationships to it are unknown. This project will solve sequential decision-making problems involving Ad hoc agents from the perspective of an individual agent. A subject agent will learn the behavior of other Ad hoc agents by adapting machine learning techniques and will update its own decision models accordingly. The project will extend learning algorithms that construct a behavioral model of a single agent to learn the behavioral patterns of a population of agents. Based on an unmanned aerial vehicle scenario, the project will build a platform for simulating interactions, performing learning, and conducting evaluation for Ad hoc agents. In summary, this project will develop a new type of Ad hoc agent that can actively explore an environment shared with other unknown agents. The research outcomes will facilitate applications of multi-agent technologies in complex problem domains and provide theoretical guarantees and practical guidelines.

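Research on multi-agent decision making often assumes that agents accomplish predefined tasks through communication and coordination. This assumption does not hold for multi-agent systems in which agents compete. Developing agents that can adapt in unknown decision environments, namely Ad hoc Agents, is therefore a challenging emerging problem in multi-agent research. This project proposed a new framework, based on the autonomous learning and decision making of individual agents, for solving sequential decision problems involving multiple Ad hoc Agents. The main research content was as follows: using machine learning methods, the project enabled an Ad hoc Agent to autonomously construct, from interaction data, models that accurately characterize the behavior of other agents, and applied model checking techniques to analyze agent behavior; drawing on game theory, it studied how the types of Bayesian agents affect the decision process. In addition, the project extended decision methods designed for individual agents to agent populations, achieving interaction among a thousand agents. Besides building a platform for Ad hoc Agent interaction, learning, and verification set in an unmanned aerial vehicle simulation scenario, the project used a multiplayer online mobile game as a real-world testbed to verify how agents respond rationally to the behavior of human players.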

Journal Papers


Decision-Theoretic Planning Under Anonymity in Agent Populations
DOI: 10.1613/jair.5449
Published: 2017-08
Journal: Journal of Artificial Intelligence Research
Impact factor: 5
Authors: Ekhlas Sonu; Yingke Chen; Prashant Doshi
Corresponding author: Prashant Doshi

A Brain-inspired SLAM System Based on ORB Features
DOI: 10.1007/s11633-017-1090-y
Published: 2017-10-01
Journal: International Journal of Automation and Computing
Impact factor: 4.3
Authors: Zhou, Sun-Chun; Yan, Rui; Tang, Huajin
Corresponding author: Tang, Huajin

Can bounded and self-interested agents be teammates? Application to planning in ad hoc teams
DOI: 10.1007/s10458-016-9354-4
Published: 2016-11
Journal: Autonomous Agents and Multi-Agent Systems
Impact factor: 1.9
Authors: Muthukumaran Chandrasekaran; Prashant Doshi; Yifeng Zeng; Yingke Chen
Corresponding author: Yingke Chen

Approximating behavioral equivalence for scaling solutions of I-DIDs
DOI: 10.1007/s10115-015-0912-x
Published: 2016
Journal: Knowledge and Information Systems
Impact factor: 2.7
Authors: Zeng Yifeng; Doshi Prashant; Chen Yingke; Pan Yinghui; Mao Hua; Chandrasekaran Muthukumaran
Corresponding author: Chandrasekaran Muthukumaran

Learning Deterministic Probabilistic Automata from a Model Checking Perspective
DOI: 10.1007/s10994-016-5565-9
Published: 2016-11
Journal: Machine Learning
Impact factor: 7.5
Authors: Hua Mao; Yingke Chen; Manfred Jaeger; Thomas D. Nielsen; Kim G. Larsen; Brian Nielsen
Corresponding author: Brian Nielsen
