User's Intention Understanding Using Multi-modal Information for Intelligent Interfaces

Basic Information

  • Grant Number: 13680471
  • Principal Investigator: TSURUTA Naoyuki
  • Amount: $10,900
  • Host Institution:
  • Host Institution Country: Japan
  • Project Category: Grant-in-Aid for Scientific Research (C)
  • Fiscal Year: 2001
  • Funding Country: Japan
  • Duration: 2001 to 2003
  • Project Status: Completed

Project Abstract

This research developed a framework for user's intention understanding using multi-modal information, enabling a natural, robust, and intelligent man-machine interface. In general, the intention understanding process consists of the following three stages: (A) tracking of humans walking around the system; (B) detection of a human approaching the system, and confirmation of his/her intention to use it; (C) understanding of the user's intention during man-machine dialogue. Previous research focused on only one of these stages, not on the transitions between them, so no natural dialogue interface had yet been developed. This research focused on (B), (C), and the transitions between them, reusing existing results for (A), and produced the following three results. (1) When the system detects an approaching person, it gathers information using a new active-vision method and confirms his/her intention implicitly. This implicit confirmation enables a natural transition from (B) to (C). (2) For stage (C), a combination of vision-based lip-reading and context analysis with traditional spoken-language recognition was proposed, enabling high recognition accuracy. Using the proposed methods for (B) and (C), a very robust dialogue system could be developed. (3) Recognition accuracy for dialogues, however, while very high, was not perfect. A touch-panel device with on-screen menus was therefore additionally introduced, and a new modal-switching method was proposed. With this method, the user can communicate with the system through audio-visual dialogue as often as possible while overall task success remains guaranteed.
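The abstract describes an architecture rather than an implementation. The following minimal Python sketch illustrates the described flow under stated assumptions: all names, weights, and thresholds are hypothetical and not taken from the project. It shows the three stages with the implicit (B)-to-(C) transition, a simple linear fusion of spoken-language, lip-reading, and context scores (the project reports a combination of the three, but not this exact formula), and the modal switch that falls back to the touch-panel menus when audio-visual recognition cannot be trusted.

```python
from enum import Enum, auto

class Stage(Enum):
    TRACKING = auto()     # (A) track people walking around the system
    CONFIRMING = auto()   # (B) detect approach, confirm intention implicitly
    DIALOGUE = auto()     # (C) man-machine dialogue

def next_stage(stage: Stage, approaching: bool, intent_confirmed: bool) -> Stage:
    """Advance through the three stages; the (B)->(C) step models the
    implicit confirmation described in result (1)."""
    if stage is Stage.TRACKING and approaching:
        return Stage.CONFIRMING
    if stage is Stage.CONFIRMING and intent_confirmed:
        return Stage.DIALOGUE
    return stage

def fuse_scores(speech: float, lips: float, context: float,
                weights: tuple[float, float, float] = (0.5, 0.3, 0.2)) -> float:
    """Hypothetical weighted fusion of speech-recognition, lip-reading,
    and context-analysis confidences (result (2)); weights are illustrative."""
    w_s, w_l, w_c = weights
    return w_s * speech + w_l * lips + w_c * context

def choose_modality(fused: float, threshold: float = 0.8) -> str:
    """Modal switching (result (3)): prefer audio-visual dialogue whenever
    recognition is confident enough, otherwise fall back to the
    touch-panel menus so that success is still guaranteed."""
    return "audio_visual" if fused >= threshold else "touch_panel"

if __name__ == "__main__":
    stage = Stage.TRACKING
    stage = next_stage(stage, approaching=True, intent_confirmed=False)  # -> CONFIRMING
    stage = next_stage(stage, approaching=True, intent_confirmed=True)   # -> DIALOGUE
    print(stage, choose_modality(fuse_scores(0.9, 0.7, 0.8)))            # audio_visual
```

The key design point sketched here is that the dialogue modality is chosen per utterance: the user stays in audio-visual dialogue as often as the fused confidence allows, and the menu is only a safety net.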

Project Outcomes

  • Journal articles: 25
  • Monographs: 0
  • Research awards: 0
  • Conference papers: 0
  • Patents: 0
S. Takahashi, T. Morimoto, S. Maeda, N. Tsuruta: "Robust Speech Understanding Based on Expected Discourse Plan", Proc. of EUROSPEECH, Vol. 1, pp. 661-664 (2003)
Naoyuki Tsuruta, Tarek El Tobely, Yuichiro Yoshiki: "A Randomized Self-organizing Map for Gesture Recognition", Int. Journal of Japan Society for Fuzzy Theory and Systems, Vol. 14, No. 1, pp. 82-87 (2002)
N. Tsuruta, Y. Yoshiki, T. El Tobely: "A Randomized Hypercolumn Model and Gesture Recognition", Connectionist Models of Neurons, Learning Processes, and Artificial Intelligence (IWANN 2001), Springer, LNCS 2084, pp. 235-242 (2001)

Other Grants by TSURUTA Naoyuki

Development and Practice of a New Activity Type Information Processing Education from Elementary School to High School
  • Grant Number: 23501038
  • Fiscal Year: 2011
  • Funding Amount: $10,900
  • Project Category: Grant-in-Aid for Scientific Research (C)
ON A MODELING AND APPLICATION OF HIGH LEVEL VISUAL SEARCH
  • Grant Number: 08680403
  • Fiscal Year: 1996
  • Funding Amount: $10,900
  • Project Category: Grant-in-Aid for Scientific Research (C)

Similar Overseas Grants

Flexible fMRI-Compatible Neural Probes with Organic Semiconductor based Multi-modal Sensors for Closed Loop Neuromodulation
  • Grant Number: 2336525
  • Fiscal Year: 2024
  • Funding Amount: $10,900
  • Project Category: Standard Grant
Collaborative Research: NCS-FR: Individual variability in auditory learning characterized using multi-scale and multi-modal physiology and neuromodulation
  • Grant Number: 2409652
  • Fiscal Year: 2024
  • Funding Amount: $10,900
  • Project Category: Standard Grant
Imaging for Multi-scale Multi-modal and Multi-disciplinary Analysis for EnGineering and Environmental Sustainability (IM3AGES)
  • Grant Number: EP/Z531133/1
  • Fiscal Year: 2024
  • Funding Amount: $10,900
  • Project Category: Research Grant
High speed multi modal in-situ Transmission Electron Microscopy platform
  • Grant Number: LE240100060
  • Fiscal Year: 2024
  • Funding Amount: $10,900
  • Project Category: Linkage Infrastructure, Equipment and Facilities
MUSE: Multi-Modal Software Evolution
  • Grant Number: EP/W015927/2
  • Fiscal Year: 2024
  • Funding Amount: $10,900
  • Project Category: Research Grant
Multi-scale, multi-modal X-ray imaging using speckle
  • Grant Number: DE220101402
  • Fiscal Year: 2024
  • Funding Amount: $10,900
  • Project Category: Discovery Early Career Researcher Award
Multi-modal electron microscopy of 3D racetrack memory
  • Grant Number: EP/X025632/1
  • Fiscal Year: 2024
  • Funding Amount: $10,900
  • Project Category: Research Grant
NSF-SNSF: Rapid Beamforming for Massive MIMO using Machine Learning on RF-only and Multi-modal Sensor Data
  • Grant Number: 2401047
  • Fiscal Year: 2024
  • Funding Amount: $10,900
  • Project Category: Standard Grant
Enhancing STEM Success: A Multi-modal Investigating of Spatial Reasoning and Training in Undergraduate Education
  • Grant Number: 2300785
  • Fiscal Year: 2023
  • Funding Amount: $10,900
  • Project Category: Continuing Grant
Multi-modal non-invasive biomarker screening for high-risk undiagnosed liver disease
  • Grant Number: 10073169
  • Fiscal Year: 2023
  • Funding Amount: $10,900
  • Project Category: Collaborative R&D