A multi-modal sensor fusion architecture for audio-visual speech understanding

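Basic information

  • Grant number:
    184129-2007
  • Principal investigator:
  • Amount:
    $19.7K
  • Host institution:
  • Host institution country:
    Canada
  • Program:
    Discovery Grants Program - Individual
  • Fiscal year:
    2007
  • Funding country:
    Canada
  • Period:
    2007-01-01 to 2008-12-31
  • Status:
    Completed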

Project abstract

Despite considerable technical advances, modern speech recognition systems remain challenged in noisy environments. Significant research has addressed the effect of noise and word ambiguity on recognition performance. For speech recognition systems to be of practical use in noisy settings, such as automobiles, crowded areas, and/or simultaneous human-computer discourse applications, the issue of robustness must be addressed. The applicant maintains that this may be accomplished by using other sensing modalities to complement the acoustic speech signal. One example of this strategy is to fuse visual lip movements and expressions with the acoustic signal so as to maximize the information gathered about the words uttered and to minimize the impact of acoustic noise.

Understanding speech from visual information is an attractive approach that has captured the interest of many researchers seeking to improve speech recognition performance. Beyond improving recognition performance, this approach may also make it possible for speech-impaired people to interact with devices in human-machine interfacing applications.

In this research project, the applicant investigates the integration of audio features and visual lip-movement features (during speech signal production) to detect and recognize spoken phrases. A multi-modal fusion system will be designed. The system should be able to acquire the visual and acoustic signals of speech and fuse them to detect and recognize words spoken by a speaker in a noisy environment. The system will also be tuned and tested in situations where the speaker is speech impaired. The proposed research project is expected to result in the training of 6 highly qualified personnel in human-machine interfacing, an area of strategic importance to the Canadian economy.

Project outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other publications by Basir, Otman

Semantic understanding of general linguistic items by means of fuzzy set theory
  • DOI:
    10.1109/tfuzz.2006.889817
  • Published:
    2007-10-01
  • Journal:
  • Impact factor:
    11.9
  • Authors:
    Khoury, Richard; Karray, Fakhri; Basir, Otman
  • Corresponding author:
    Basir, Otman
Wideband L-Shaped Circular Polarized Monopole Slot Antenna
Exchange strategies for multiple Ant Colony System
  • DOI:
    10.1016/j.ins.2006.09.016
  • Published:
    2007-03-01
  • Journal:
  • Impact factor:
    8.1
  • Authors:
    Ellabib, Issmail; Calamai, Paul; Basir, Otman
  • Corresponding author:
    Basir, Otman
Feature-Selected Tree-Based Classification
  • DOI:
    10.1109/tsmcb.2012.2237394
  • Published:
    2013-12-01
  • Journal:
  • Impact factor:
    11.8
  • Authors:
    Freeman, Cecille; Kulic, Dana; Basir, Otman
  • Corresponding author:
    Basir, Otman
Farthest point distance: A new shape signature for Fourier descriptors
  • DOI:
    10.1016/j.image.2009.04.001
  • Published:
    2009-08-01
  • Journal:
  • Impact factor:
    3.5
  • Authors:
    El-ghazal, Akrem; Basir, Otman; Belkasim, Saeid
  • Corresponding author:
    Belkasim, Saeid

Other grants held by Basir, Otman

Coordination and Cooperation in Self-Driving Vehicles
  • Grant number:
    RGPIN-2018-04342
  • Fiscal year:
    2019
  • Amount:
    $19.7K
  • Program:
    Discovery Grants Program - Individual
Multimodal Mobility Modeling and Traffic Profiling in Cyber-Physical Systems
  • Grant number:
    184129-2012
  • Fiscal year:
    2017
  • Amount:
    $19.7K
  • Program:
    Discovery Grants Program - Individual
Multimodal Mobility Modeling and Traffic Profiling in Cyber-Physical Systems
  • Grant number:
    184129-2012
  • Fiscal year:
    2015
  • Amount:
    $19.7K
  • Program:
    Discovery Grants Program - Individual
Multimodal Mobility Modeling and Traffic Profiling in Cyber-Physical Systems
  • Grant number:
    184129-2012
  • Fiscal year:
    2014
  • Amount:
    $19.7K
  • Program:
    Discovery Grants Program - Individual
Multimodal Mobility Modeling and Traffic Profiling in Cyber-Physical Systems
  • Grant number:
    184129-2012
  • Fiscal year:
    2013
  • Amount:
    $19.7K
  • Program:
    Discovery Grants Program - Individual
Multimodal Mobility Modeling and Traffic Profiling in Cyber-Physical Systems
  • Grant number:
    184129-2012
  • Fiscal year:
    2012
  • Amount:
    $19.7K
  • Program:
    Discovery Grants Program - Individual
A multi-modal sensor fusion architecture for audio-visual speech understanding
  • Grant number:
    184129-2007
  • Fiscal year:
    2011
  • Amount:
    $19.7K
  • Program:
    Discovery Grants Program - Individual
A multi-modal sensor fusion architecture for audio-visual speech understanding
  • Grant number:
    184129-2007
  • Fiscal year:
    2010
  • Amount:
    $19.7K
  • Program:
    Discovery Grants Program - Individual
A multi-modal sensor fusion architecture for audio-visual speech understanding
  • Grant number:
    184129-2007
  • Fiscal year:
    2009
  • Amount:
    $19.7K
  • Program:
    Discovery Grants Program - Individual
A multi-modal sensor fusion architecture for audio-visual speech understanding
  • Grant number:
    184129-2007
  • Fiscal year:
    2008
  • Amount:
    $19.7K
  • Program:
    Discovery Grants Program - Individual

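Similar grants from the National Natural Science Foundation of China (NSFC)

Deep mining techniques for heterogeneous medical imaging data and precise prediction of major central nervous system diseases
  • Grant number:
    61672236
  • Approval year:
    2016
  • Amount:
    CNY 640K
  • Program:
    General Program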

Similar overseas grants

NSF-SNSF: Rapid Beamforming for Massive MIMO using Machine Learning on RF-only and Multi-modal Sensor Data
  • Grant number:
    2401047
  • Fiscal year:
    2024
  • Amount:
    $19.7K
  • Program:
    Standard Grant
Reconfigurable 3D Origami Probes for Multi-modal Neural Interface
  • Grant number:
    10738994
  • Fiscal year:
    2023
  • Amount:
    $19.7K
  • Program:
CAREER: Enhancing ambient capacitive sensing through improved resolution and multi-modal sensor fusion
  • Grant number:
    2237945
  • Fiscal year:
    2023
  • Amount:
    $19.7K
  • Program:
    Continuing Grant
A Multi-Modal Combination Intervention to Promote Cognitive Function in Older Intensive Care Unit Survivors
  • Grant number:
    10662893
  • Fiscal year:
    2023
  • Amount:
    $19.7K
  • Program:
SCH: Artificial Intelligence enabled multi-modal sensor platform for at-home health monitoring of patients
  • Grant number:
    10816667
  • Fiscal year:
    2023
  • Amount:
    $19.7K
  • Program:
A Multi-Modal Wearable Sensor for Early Detection of Cognitive Decline and Remote Monitoring of Cognitive-Motor Decline Over Time
  • Grant number:
    10765991
  • Fiscal year:
    2023
  • Amount:
    $19.7K
  • Program:
Functionalized Multi-Modal Tetrode Arrays for Real-Time, Site-Specific Neurochemical Monitoring
  • Grant number:
    10759908
  • Fiscal year:
    2023
  • Amount:
    $19.7K
  • Program:
A Multi-Modal Remote Monitoring Platform for Frontotemporal Lobar Degeneration Syndromes
  • Grant number:
    10597923
  • Fiscal year:
    2022
  • Amount:
    $19.7K
  • Program:
A Multi-Modal Remote Monitoring Platform for Frontotemporal Lobar Degeneration Syndromes
  • Grant number:
    10707379
  • Fiscal year:
    2022
  • Amount:
    $19.7K
  • Program:
Collaborative Research: IIBR Multidisciplinary: mSAIL (Michigan Small Animal Integrated Logger): a milligram-scale, multi-modal sensor and analytics monitoring platform
  • Grant number:
    2043017
  • Fiscal year:
    2021
  • Amount:
    $19.7K
  • Program:
    Standard Grant