Study on Integrated Processing of Speech and Gesture in Multimodal Communication

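Basic information

  • Grant number:
    10480083
  • Principal investigator:
  • Amount:
    $58,900
  • Host institution:
  • Host institution country:
    Japan
  • Grant category:
    Grant-in-Aid for Scientific Research (B)
  • Fiscal year:
    1998
  • Funding country:
    Japan
  • Project period:
    1998 to 2000
  • Project status:
    Completed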

Project abstract

The purpose of this research is to develop a multimodal communication system that can recognize multimodal information such as speech and gesture in natural dialogue, understand human intention by integrating that information, and respond to the human appropriately.

First, it is necessary to clarify how human intention is understood through the integration of multimodal information and how responses are produced across multiple modalities. We therefore analyzed the acoustic features of speech, such as fillers, and the roles of gestures, such as head movement, in a variety of natural human dialogues.

Next, we studied speech and gesture recognition algorithms, which are the fundamental techniques for a multimodal communication system. We propose a recombination strategy for multi-band automatic speech recognition that gives more accurate recognition, especially in noisy acoustic environments. We also propose a speech decoder in which the language models are modified to handle the timing of turn-taking and in which speaker models are also used. For gesture recognition, we apply a new pattern matching method, the Partly-Hidden Markov model, in which the first state is hidden and the second is observable. We further propose face extraction and pose detection methods for recognizing head movement.

Finally, we implemented the multimodal communication model in a human-machine dialogue system. The system uses a generalization method that takes into account the trade-off between the variety of dialogue and the ease of describing rules, and it provides a domain-independent platform. It also has a spoken dialogue control model for improving dialogue efficiency and a dialogue management model for detecting misunderstanding in spoken dialogue systems.

Project outcomes

Journal papers (88)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
K. Aoyama, K. Shirai: "Controlling Non-verbal Information in Speaker-change for Spoken Dialogue" 2000 IEEE International Conference on Systems Man and Cybernetics (SMC2000). 1354-1359 (2000)
Shigeki Okawa et al.: "A Recombination Strategy for Multi-band Speech Recognition Based on Mutual Information Criterion" Proc. of EUROSPEECH'99. Vol. 2. 603-606 (1999)
M. Yokoyama, K. Shirai: "Use of Non-Verbal Information in Communication between Human and Robot" Proc. of International Conference on Spoken Language Processing (ICSLP). 2351-2354 (1998)
H. Kikuchi, K. Shirai: "Controlling Dialogue Strategy According to Performance of Processes" ESCA Workshop, Session 5.2. 85-88 (1999)
K. Aoyama, K. Shirai: "Designing a Domain Independent Platform of Spoken Dialogue System" Proc. of International Conference on Spoken Language Processing (ICSLP). (CD-ROM). (2000)

Other grants by SHIRAI Katsuhiko

A Study on a framework of spontaneous communication depending on dialogue situation
  • Grant number:
    17300066
  • Fiscal year:
    2005
  • Funding amount:
    $58,900
  • Grant category:
    Grant-in-Aid for Scientific Research (B)
Construction of Multimodal Emotion Representation Model for Computer Animation
  • Grant number:
    14208031
  • Fiscal year:
    2002
  • Funding amount:
    $58,900
  • Grant category:
    Grant-in-Aid for Scientific Research (A)
Research on application to language education of multimodal ICAI system
  • Grant number:
    07458075
  • Fiscal year:
    1995
  • Funding amount:
    $58,900
  • Grant category:
    Grant-in-Aid for Scientific Research (B)
Studies on CAD system of Application Specific VLSI Circuits for Signal Processing
  • Grant number:
    03452174
  • Fiscal year:
    1993
  • Funding amount:
    $58,900
  • Grant category:
    Grant-in-Aid for General Scientific Research (B)
Co-Operative Study on Modeling and Machine Implementation of Spoken Language Conversation
  • Grant number:
    02305010
  • Fiscal year:
    1990
  • Funding amount:
    $58,900
  • Grant category:
    Grant-in-Aid for Co-operative Research (A)
Application of an Intelligent CAI System with Graphical and Voice Media to Education in University for Developmental Scientifical Research
  • Grant number:
    01880035
  • Fiscal year:
    1989
  • Funding amount:
    $58,900
  • Grant category:
    Grant-in-Aid for Developmental Scientific Research
Research on the Architecture of the Speech Recognition System and the Computer Aided Design System for Signal Processing LSIs.
  • Grant number:
    61460135
  • Fiscal year:
    1986
  • Funding amount:
    $58,900
  • Grant category:
    Grant-in-Aid for General Scientific Research (B)

Similar overseas grants

WORKSHOP: Doctoral Consortium at the International Conference on Automatic Face and Gesture Recognition
  • Grant number:
    2315559
  • Fiscal year:
    2022
  • Funding amount:
    $58,900
  • Grant category:
    Standard Grant
WORKSHOP: Doctoral Consortium at the International Conference on Automatic Face and Gesture Recognition
  • Grant number:
    2026967
  • Fiscal year:
    2020
  • Funding amount:
    $58,900
  • Grant category:
    Standard Grant
3D gesture recognition interface with millimeter wave radar
  • Grant number:
    543146-2019
  • Fiscal year:
    2019
  • Funding amount:
    $58,900
  • Grant category:
    Alexander Graham Bell Canada Graduate Scholarships - Master's
WORKSHOP: Doctoral Consortium at the IEEE International Conference on Automatic Face and Gesture Recognition (FG 2018)
  • Grant number:
    1829167
  • Fiscal year:
    2018
  • Funding amount:
    $58,900
  • Grant category:
    Standard Grant
Visual speech and facial gesture recognition for children with CP and complex communication needs
  • Grant number:
    525729-2018
  • Fiscal year:
    2018
  • Funding amount:
    $58,900
  • Grant category:
    University Undergraduate Student Research Awards
Custom Gesture Recognition Solution
  • Grant number:
    530410-2018
  • Fiscal year:
    2018
  • Funding amount:
    $58,900
  • Grant category:
    Applied Research and Development Grants - Level 1
Eye-gaze gesture recognition using images of the lateral view of the eye
  • Grant number:
    17K01594
  • Fiscal year:
    2017
  • Funding amount:
    $58,900
  • Grant category:
    Grant-in-Aid for Scientific Research (C)
CRII: CSR: Pervasive Gesture Recognition Using Ambient Light
  • Grant number:
    1565609
  • Fiscal year:
    2016
  • Funding amount:
    $58,900
  • Grant category:
    Standard Grant
Gesture Recognition for Automotive Applications (GRAAppl)
  • Grant number:
    132057
  • Fiscal year:
    2016
  • Funding amount:
    $58,900
  • Grant category:
    Feasibility Studies
Posture and Gesture Recognition for Wireless Musical Gloves
  • Grant number:
    132056
  • Fiscal year:
    2015
  • Funding amount:
    $58,900
  • Grant category:
    Feasibility Studies