
Assistive sequential decision making framework

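Basic Information

Grant number:
RGPIN-2019-05460
Principal investigator:
Lee, ChiGuhn
Amount:
$18,900
Host institution:
University of Toronto
Host institution country:
Canada
Program:
Discovery Grants Program - Individual
Fiscal year:
2019
Funding country:
Canada
Project period:
2019-01-01 to 2020-12-31
Project status:
Completed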

Source:
https://www.nserc-crsng.gc.ca/ase-oro/Details-Detailles_eng.asp?id=681374
Keywords:
Assistive sequential decision making framework

Project Abstract

This research focuses on developing algorithms to maximize the impact of smart technologies when they are integrated into business processes. We will specifically investigate team decision-making problems in which only a subset of decision makers is capable of making optimal decisions toward the overarching goal of the entire team. The agents, whether optimal or sub-optimal, make decisions over time as the system transitions between states. Upon reaching a decision, each agent receives a reward conditional on the system state and on the actions taken by that agent and by all other agents.

The proposed decision process is useful in a wide range of applications, from manufacturing to supply chain management to military operations. For instance, it could maximize the productivity of a robotic assembly cell where two robots and a human operator work together to assemble a large automotive component. Rather than separating robots from human operators for safety, the proposed decision process could help the robots plan their paths of movement so that they complement the human operators' work while keeping those operators safe.

The proposed assistive decision process extends Markov decision processes to multi-agent scenarios and to reinforcement learning. The presence of multiple decision makers poses challenges such as computational complexity, stability of the process, and behavioural modeling of human decision makers. Because of these challenges, the current literature in this area remains sparse despite the topic's growing importance as society becomes increasingly automated. This program will tackle these three challenges using dimension reduction, optimal trade-offs between exploration and exploitation, and inverse reinforcement learning.

This research is timely given the rapid adoption of artificial intelligence and autonomous systems. The assistive decision process can serve as the mathematical backbone of control systems to overcome the challenges that currently limit the use of autonomous multi-agent systems. This work will thus help Canada become a leader in autonomous manufacturing, transportation, surveillance and security, and communications.
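
The setting described in the abstract (a team reward that depends on the state and on every agent's action, with some agents acting sub-optimally) can be illustrated with a toy sketch. The code below is not the project's algorithm: it assumes a hypothetical two-state, two-agent problem with made-up numbers, treats the human's sub-optimal policy as already known (the abstract proposes inferring such behaviour through inverse reinforcement learning), and uses plain value iteration to compute the assistive agent's best response.

```python
# Toy assistive decision problem. All names and numbers are hypothetical
# and chosen only for illustration; nothing here comes from the project.
GAMMA = 0.9          # discount factor (assumed)
STATES = (0, 1)      # state 1 is the "productive" state
ACTIONS = (0, 1)     # both agents share the same action set

# HUMAN_POLICY[s][b]: probability that the human picks action b in state s.
# It is fixed and possibly sub-optimal; the assistive agent plans around it.
HUMAN_POLICY = ((0.8, 0.2), (0.3, 0.7))


def next_state_probs(s, a, b):
    """Next-state distribution given assistive action a and human action b."""
    # Coordinated actions make the productive state much more likely.
    p_good = 0.9 if a == b else 0.2
    return (1.0 - p_good, p_good)


def team_reward(s, a, b):
    """Shared reward; in this toy it depends only on the current state."""
    return 1.0 if s == 1 else 0.0


def induced_model(s, a):
    """Average over the human's action to obtain a single-agent MDP."""
    reward = 0.0
    probs = [0.0 for _ in STATES]
    for b in ACTIONS:
        weight = HUMAN_POLICY[s][b]
        reward += weight * team_reward(s, a, b)
        for s2, p in enumerate(next_state_probs(s, a, b)):
            probs[s2] += weight * p
    return reward, probs


def q_value(s, a, values):
    """One-step lookahead value of assistive action a in state s."""
    reward, probs = induced_model(s, a)
    return reward + GAMMA * sum(p * values[s2] for s2, p in enumerate(probs))


def best_response(iterations=500):
    """Value iteration on the induced MDP, then greedy policy extraction."""
    values = [0.0 for _ in STATES]
    for _ in range(iterations):
        values = [max(q_value(s, a, values) for a in ACTIONS) for s in STATES]
    policy = [max(ACTIONS, key=lambda a: q_value(s, a, values)) for s in STATES]
    return values, policy


values, policy = best_response()
print("state values:", values)
print("assistive policy:", policy)
```

In this toy, the assistive agent ends up choosing, in each state, the action the human is most likely to take, because coordination raises the chance of reaching the productive state. Averaging out the partner's action is only one simple way to reduce the team problem to a single-agent one; it does not address the scalability or stability challenges the abstract names.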
