Multi-Modal Blind Source Separation for Robot Audition
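Basic information
- Grant number: EP/H012842/1
- Principal investigator:
- Amount: $146,900
- Host institution:
- Host institution country: United Kingdom
- Project type: Research Grant
- Fiscal year: 2009
- Funding country: United Kingdom
- Duration: 2009 to (no data)
- Project status: Completed
- Source:
- Keywords:
Project abstract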
This proposal draws on expertise in blind source separation and multimodal (audio-visual) speech processing within the Centre for Vision, Speech and Signal Processing at the University of Surrey. The objective is to separate the target speech from multiple competing sound sources in room environments, and thereby ultimately to make progress towards automatic machine perception of auditory scenes in an uncontrolled natural environment. The fundamental novelty of this work is to exploit visual cues to enhance the operation of frequency-domain blind source separation algorithms. Such audio-visual processing is targeted at mitigating the permutation problem, the underdetermined problem (i.e., when the number of sources is greater than the number of microphones), and the reverberation problem, which currently limit the practical applicability of blind source separation algorithms. The focus of the work is therefore on the signal processing algorithms and software tools that can be used to perform automatic separation of sound signals, e.g., for a robot. The body of work in this proposal is underpinned by the substantial experience of the investigators, two from the areas of blind source separation and digital speech processing, and one from the area of computer vision and pattern recognition. The outcomes of the proposed research will be of considerable value to the UK defence industry, especially in the areas of target separation, detection and multi-path mitigation (or dereverberation), with applications in, for example, human-robot interaction, security surveillance and human-computer interaction.
Project outputs
Journal papers (10)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Bimodal coherence based scale ambiguity cancellation for target speech extraction and enhancement
- DOI: 10.21437/interspeech.2010-192
- Published: 2010-09
- Journal:
- Impact factor: 0
- Authors: Qingju Liu; Wenwu Wang; P. Jackson
- Corresponding author: Qingju Liu; Wenwu Wang; P. Jackson
Joint Mixing Vector and Binaural Model Based Stereo Source Separation
- DOI: 10.1109/taslp.2014.2320637
- Published: 2014-09
- Journal:
- Impact factor: 0
- Authors: Atiyeh Alinaghi; P. Jackson; Qingju Liu; Wenwu Wang
- Corresponding author: Atiyeh Alinaghi; P. Jackson; Qingju Liu; Wenwu Wang
Interference Reduction in Reverberant Speech Separation With Visual Voice Activity Detection
- DOI: 10.1109/tmm.2014.2322824
- Published: 2014
- Journal:
- Impact factor: 7.3
- Authors: Liu Q
- Corresponding author: Liu Q
Source Separation of Convolutive and Noisy Mixtures Using Audio-Visual Dictionary Learning and Probabilistic Time-Frequency Masking
- DOI: 10.1109/tsp.2013.2277834
- Published: 2013-11
- Journal:
- Impact factor: 5.4
- Authors: Qingju Liu; Wenwu Wang; P. Jackson; M. Barnard; J. Kittler; J. Chambers
- Corresponding author: Qingju Liu; Wenwu Wang; P. Jackson; M. Barnard; J. Kittler; J. Chambers
Other publications by Wenwu Wang
Characterization of Cu-doped Cd1-xZnxTe thin films sputtered from multiple targets
- DOI:
- Published: 2014
- Journal:
- Impact factor: 4
- Authors: Zhe Zhu; Lili Wu; Wei Li; Lianghuan Feng; Jingquan Zhang; Wenwu Wang; Guanggen Zeng; Dan Leng
- Corresponding author: Dan Leng
Fast Iterative Shrinkage for Signal Declipping and Dequantization
- DOI:
- Published: 2018
- Journal:
- Impact factor: 0
- Authors: Lucas Rencker; F. Bach; Wenwu Wang; Mark D. Plumbley
- Corresponding author: Mark D. Plumbley
A Polarization-Switching, Charge-Trapping, Modulated Arithmetic Logic Unit for In-Memory Computing Based on Ferroelectric Fin Field-Effect Transistors
- DOI: 10.1021/acsami.1c20189
- Published: 2022
- Journal:
- Impact factor: 0
- Authors: Zhaohao Zhang; Yanna Luo; Yan Cui; Hong Yang; Qingzhu Zhang; Gaobo Xu; Zhenhua Wu; Jinjuan Xiang; Qianqian Liu; Huaxiang Yin; Shujuan Mao; Xiaolei Wang; Junjie Li; Yongkui Zhang; Qing Luo; Jianfeng Gao; Wenjuan Xiong; Jinbiao Liu; Yongliang Li; Junfeng Li; Jun Luo; Wenwu Wang
- Corresponding author: Wenwu Wang
Association Loss for Visual Object Detection
- DOI: 10.1109/lsp.2020.3013160
- Published: 2020-07
- Journal:
- Impact factor: 3.9
- Authors: Dongli Xu; Jian Guan; Pengming Feng; Wenwu Wang
- Corresponding author: Wenwu Wang
Intelligent Signal Processing Mechanisms for Nuanced Anomaly Detection in Action Audio-Visual Data Streams
- DOI: 10.1109/icassp.2018.8461595
- Published: 2018
- Journal:
- Impact factor: 0
- Authors: J. Kittler; Ioannis Kaloskampis; Cemre Zor; Yang Xu; Y. Hicks; Wenwu Wang
- Corresponding author: Wenwu Wang
Similar overseas grants
Cross-modal motion responses in blind and deaf humans
- Grant number: 9264530
- Fiscal year: 2015
- Funding amount: $146,900
- Project type:
Cross-modal motion responses in blind and deaf humans
- Grant number: 8486050
- Fiscal year: 2013
- Funding amount: $146,900
- Project type:
Cross-modal motion responses in blind and deaf humans
- Grant number: 8704941
- Fiscal year: 2013
- Funding amount: $146,900
- Project type:
Dancing Dots Music Touch TTT: Multi-modal Teaching System for Blind musicians. T
- Grant number: 7928397
- Fiscal year: 2010
- Funding amount: $146,900
- Project type:
Sensory Cortical Organization and Cross-Modal Plasticity in Blind Humans
- Grant number: 9113167
- Fiscal year: 2009
- Funding amount: $146,900
- Project type:
Sensory cortical organization and cross-modal plasticity in blind subjects
- Grant number: 7895576
- Fiscal year: 2009
- Funding amount: $146,900
- Project type:
Sensory Cortical Organization and Cross-Modal Plasticity in Blind Humans
- Grant number: 8514241
- Fiscal year: 2009
- Funding amount: $146,900
- Project type:
Sensory Cortical Organization and Cross-Modal Plasticity in Blind Humans
- Grant number: 8691821
- Fiscal year: 2009
- Funding amount: $146,900
- Project type:
Sensory cortical organization and cross-modal plasticity in blind subjects
- Grant number: 7450018
- Fiscal year: 2009
- Funding amount: $146,900
- Project type:
Optimizing Multi-Modal Displays for Blind and Visually-Impaired Technology Users
- Grant number: 7567792
- Fiscal year: 2008
- Funding amount: $146,900
- Project type: