Prior knowledge elicitation and policy explanation for decision-theoretic planning and learning

Basic information

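Grant number:
312388-2008
Principal investigator:
Poupart, Pascal
Amount:
$18,200
Host institution:
University of Waterloo
Host institution country:
Canada
Program:
Discovery Grants Program - Individual
Fiscal year:
2012
Funding country:
Canada
Start and end dates:
2012-01-01 to 2013-12-31
Project status:
Completed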

Source:
https://www.nserc-crsng.gc.ca/ase-oro/Details-Detailles_eng.asp?id=503415
Keywords:
Prior knowledge elicitation; policy explanation

Project abstract

Consider spoken-dialogue managers, mobile robot controllers, automated monitoring/prompting systems for seniors with dementia or any other complex system that must accomplish a fairly complicated task. The conception of such systems is particularly challenging due to the noisy nature of the sensors (e.g., noisy speech recognition, noisy sonars) as well as the uncertain and interdependent effects of system actions (e.g., uncertain effect of prompts on seniors, interdependent and noisy motor controls in robotics). As a result, it is generally impossible to design complex robust systems by hand coding control policies. The fields of decision-theoretic planning and learning have made significant advances in the development of automated techniques to generate robust control policies that could revolutionize the next generation of computer systems. Instead of programming a policy directly, an algorithm is used to optimize a policy based on a model or simulator of the system and its environment. However, eliciting the domain knowledge necessary to specify a model or simulator, and validating/explaining the resulting policy are two major bottlenecks ignored by the research community that are holding back the adoption of this disruptive technology. Knowledge elicitation and policy explanation are particularly challenging since non-technical domain experts tend to have partial and imprecise knowledge, and often need high-level explanations of the policy where technical details are abstracted away to better convey the intuition. 
Hence, the objectives of this research are i) to design general and principled techniques to elicit and encode partial/imprecise domain knowledge about the system, the environment and the desired policy, ii) to develop algorithms that can exploit as much domain knowledge as possible to improve scalability, and iii) to create generic tools to validate and explain the decisions made by a policy at an appropriate level for developers and non-technical experts.

Project outputs

Journal articles (0)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Other publications by Poupart, Pascal

Online Structure Learning for Feed-Forward and Recurrent Sum-Product Networks

DOI:
Publication year:
2018
Venue:
Advances in Neural Information Processing Systems (NeurIPS)
Impact factor:
0
Authors:
Kalra, Agastya; Rashwan, Abdullah; Hsu, Wei-Shou; Poupart, Pascal; Doshi, Prashant; Trimponias, Georgios
Corresponding author:
Trimponias, Georgios