Unsupervised audio-visual geometry calibration of distributed microphone arrays


Basic information

Project summary

The overall goal of the project is to simplify the installation of audio-visual sensor networks. This is achieved by developing algorithms that automatically determine the positions of the distributed microphone arrays from reverberant speech input. Correspondences are then established between events sensed by the acoustic sensor network and those sensed by a sensor network of another modality. This other modality is a multi-camera network whose camera positions are assumed to be known. Once the correspondences have been detected, the relative geometry of the acoustic sensor network can be mapped to the given coordinate system of the camera network.

In the first project phase we developed algorithms for microphone array position self-calibration in 2D. They are immune to reverberation and operate on speech input, so no artificial calibration signals are required.

The requested project extension is devoted to the following objectives:

1) A correspondence will be developed between the relative geometry of the acoustic sensor network and a global coordinate system given by a multi-camera system. This will be achieved by mapping speaker trajectories obtained from the acoustic subsystem to the trajectories of their visual correlates, i.e., faces or persons.
2) The developed geometry self-calibration system will be embedded in an ambient communication system for evaluation purposes.

In the ambient communication system an acoustic beamformer extracts the speech signal, while the active camera system tracks the speaker. In this test bed the interaction between user and system can be studied in real time and under realistic environmental conditions, eventually leading to an optimization of the calibration system.
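As an illustration of the trajectory-mapping idea in the summary, the following is a minimal sketch of how a 2D speaker trajectory in the acoustic network's relative frame could be aligned with the same trajectory in the camera frame. It is not the project's method: it assumes the two trajectories are already paired in time, that the frames differ only by a rotation and a translation (no scale change or reflection), and the function names `fit_rigid_2d` and `apply_rigid_2d` are hypothetical.

```python
import math


def fit_rigid_2d(src, dst):
    """Least-squares rotation + translation that maps src points onto dst points.

    src, dst: equal-length lists of (x, y) tuples, assumed to be paired in time.
    Returns (theta, tx, ty) such that dst is approximately R(theta) * src + (tx, ty).
    """
    n = len(src)
    if n != len(dst) or n < 2:
        raise ValueError("need at least two paired points")

    # Centroids of both trajectories.
    sx = sum(p[0] for p in src) / n
    sy = sum(p[1] for p in src) / n
    dx = sum(p[0] for p in dst) / n
    dy = sum(p[1] for p in dst) / n

    # Sums of dot and cross products of the centred point pairs.
    dot = 0.0
    cross = 0.0
    for (ax, ay), (bx, by) in zip(src, dst):
        ax, ay = ax - sx, ay - sy
        bx, by = bx - dx, by - dy
        dot += ax * bx + ay * by
        cross += ax * by - ay * bx

    # The angle maximising cos(theta) * dot + sin(theta) * cross.
    theta = math.atan2(cross, dot)
    c, s = math.cos(theta), math.sin(theta)

    # Translation that maps the rotated source centroid onto the target centroid.
    tx = dx - (c * sx - s * sy)
    ty = dy - (s * sx + c * sy)
    return theta, tx, ty


def apply_rigid_2d(points, theta, tx, ty):
    """Apply the fitted rotation and translation to a list of (x, y) points."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in points]
```

Under these assumptions, the transform fitted on the speaker trajectory could then be applied to the estimated microphone array positions with `apply_rigid_2d` to express them in the camera coordinate system.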

Project outcomes

Journal papers (3)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Multi-speaker tracking using multiple distributed microphone arrays
Geometry calibration of multiple microphone arrays in highly reverberant environments

Other grants of Professor Dr.-Ing. Reinhold Häb-Umbach

Coordination Funds
  • Grant number:
    318726059
  • Fiscal year:
    2016
  • Funding amount:
    --
  • Project category:
    Research Units
Source separation and noise reduction for automatic speech recognition in dynamic acoustic scenarios
  • Grant number:
    316471544
  • Fiscal year:
    2016
  • Funding amount:
    --
  • Project category:
    Research Grants (Transfer Project)
Sound recognition with limited supervision over sensor networks
  • Grant number:
    318489874
  • Fiscal year:
    2016
  • Funding amount:
    --
  • Project category:
    Research Units
Bayesian Learning of a Hierarchical Representation of Language from Raw Speech
  • Grant number:
    260050394
  • Fiscal year:
    2014
  • Funding amount:
    --
  • Project category:
    Priority Programmes
Bayesian feature enhancement for large vocabulary speech recognition in the presence of noise and reverberation
  • Grant number:
    235486169
  • Fiscal year:
    2013
  • Funding amount:
    --
  • Project category:
    Research Grants
Sparse Coding Approaches to Language Acquisition
  • Grant number:
    200293401
  • Fiscal year:
    2011
  • Funding amount:
    --
  • Project category:
    Priority Programmes
Ein integrierter Ansatz zur Störgeräuschunterdrückung und blinden Trennung von Sprachsignalen
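An integrated approach to noise suppression and blind separation of speech signals
  • Grant number:
    193484692
  • Fiscal year:
    2010
  • Funding amount:
    --
  • Project category: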
    Research Grants
Ein systematischer Ansatz zur Ausnutzung von Korrelationen aufeinander folgender Merkmalsvektoren in der automatischen Spracherkennung
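A systematic approach to exploiting correlations between successive feature vectors in automatic speech recognition
  • Grant number:
    61519056
  • Fiscal year:
    2008
  • Funding amount:
    --
  • Project category: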
    Research Grants
Blinde adaptive akustische Strahlformung und Quellentrennung für einen sich bewegenden Sprecher in nichtstationärer akustischer Umgebung
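Blind adaptive acoustic beamforming and source separation for a moving speaker in a non-stationary acoustic environment
  • Grant number:
    21317402
  • Fiscal year:
    2006
  • Funding amount:
    --
  • Project category: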
    Research Grants
Schätzung und Verwendung von weichen Merkmalsvektoren bei Spracherkennung über Telekommunikationssysteme
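Estimation and use of soft feature vectors in speech recognition over telecommunication systems
  • Grant number:
    5418396
  • Fiscal year:
    2004
  • Funding amount:
    --
  • Project category: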
    Research Grants

Similar overseas grants

Multisensory Augmented Reality as a bridge to audio-only accommodations for inclusive STEM interactive digital media
  • Grant number:
    10693600
  • Fiscal year:
    2023
  • Funding amount:
    --
  • Project category:
EduSay™ - developing a digital, audio-visual and kinesthetic English pronunciation training programme for international students and professionals; upskilling communications for education, employability, UK productivity and integration
  • Grant number:
    10063001
  • Fiscal year:
    2023
  • Funding amount:
    --
  • Project category:
    Collaborative R&D
Empowering Archivists: Applying New Tools and Approaches for Better Representation of Women in Audio-Visual Collections
  • Grant number:
    AH/Y007328/1
  • Fiscal year:
    2023
  • Funding amount:
    --
  • Project category:
    Research Grant
User-centric Audio-Visual Scene Understanding for Augmented Reality Smart Glasses in the Wild
  • Grant number:
    23K16912
  • Fiscal year:
    2023
  • Funding amount:
    --
  • Project category:
    Grant-in-Aid for Early-Career Scientists
Maps as a service: A systematic approach to the production of tactile and audio/vibrational maps for visually impaired users
  • Grant number:
    10720207
  • Fiscal year:
    2023
  • Funding amount:
    --
  • Project category:
Audio-visual poetics for the environmental pollutions: A research on the documentaries and expressions of "Kogai" films
  • Grant number:
    22H00613
  • Fiscal year:
    2022
  • Funding amount:
    --
  • Project category:
    Grant-in-Aid for Scientific Research (B)
Using eye tracking to examine audio-visual rhythm perception in infants
  • Grant number:
    572614-2022
  • Fiscal year:
    2022
  • Funding amount:
    --
  • Project category:
    University Undergraduate Student Research Awards
Emotional McGurk: Developing a novel tool to examine audio-visual integration of affective signals
  • Grant number:
    574638-2022
  • Fiscal year:
    2022
  • Funding amount:
    --
  • Project category:
    University Undergraduate Student Research Awards
Neural Rendering of object-based audio-visual scenes
  • Grant number:
    2644080
  • Fiscal year:
    2022
  • Funding amount:
    --
  • Project category:
    Studentship
Ghosts amongst us: an audio-visual exploration of haunting in Palestine
  • Grant number:
    2733997
  • Fiscal year:
    2022
  • Funding amount:
    --
  • Project category:
    Studentship