Study on Computational Auditory Scene Analysis for Humanoids by Active Audition
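Basic information
- Grant number: 15200015
- Principal investigator:
- Amount: $328,600
- Host institution:
- Host institution country: Japan
- Project category: Grant-in-Aid for Scientific Research (A)
- Fiscal year: 2003
- Funding country: Japan
- Project period: 2003 to 2006
- Project status: Completed
- Source:
- Keywords:
Project abstract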
Robot audition is the capability of a humanoid to hear sounds with its own microphones (ears) mounted on its body. Since humanoids usually hear a mixture of sounds in the real world, Computational Auditory Scene Analysis (CASA), whose essential functions are sound source localization, sound source separation, and recognition of the separated sounds, is required to realize the capability of listening to several things simultaneously, like "Shotoku-Taishi" (Prince Shotoku). We obtained the following research results:
1) CASA functions with less prior information: The missing-feature-based approach integrated sound source localization (MUSIC or a steered beamformer), sound source separation (Geometric Source Separation or Independent Component Analysis), and automatic speech recognition (Multi-band Julius or CTK) by developing automatic missing-feature mask generation. The whole system was implemented on the FlowDesigner architecture, and it recognized three simultaneous speech signals with a latency of 1.9 s. This result confirmed the validity of our approach on different humanoids, including SIG2, Robovie-R2, and ASIMO.
2) Distance-based behavior selection: An interaction strategy based on the distance between the humanoid and people, following proxemics, was devised to select an appropriate interaction partner. The system, implemented on the SIG2 humanoid, was demonstrated for three months at the Kyoto University Museum to confirm its effectiveness in multi-person interaction.
3) Robust face tracking: Face tracking based on color-target detection with a nearest-neighbor classifier was developed to improve the tracking of moving talkers.
4) Music information technologies: Technologies for polyphonic music, including musical instrument recognition, drum sound extraction, and singer recognition, were developed so that humanoids can hear music.
5) Spoken dialogue: A user model and recovery from speech recognition errors were developed to improve the usability of multi-domain spoken dialogue systems.
6) Environmental sounds: An automatic onomatopoeia recognition system was developed to use environmental sounds in humanoid-human interaction.
Future work includes the design and development of robot audition based on CASA.
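The steered-beamformer localization named in result 1 can be illustrated with a minimal two-microphone delay-and-sum sketch. This is not the project's implementation: the function names, the integer-lag search, the 16 kHz sample rate, the 0.2 m microphone spacing, and the sign convention (a positive lag means the right microphone hears the sound later) are all assumptions chosen for illustration.

```python
import math
import random


def delay_and_sum_power(left, right, lag):
    """Mean power of left[n] + right[n + lag] over the overlapping samples."""
    total, count = 0.0, 0
    for n in range(len(left)):
        m = n + lag
        if 0 <= m < len(right):
            s = left[n] + right[m]
            total += s * s
            count += 1
    return total / count if count else 0.0


def estimate_lag(left, right, max_lag):
    """Scan integer lags and return the one that maximizes the steered output power."""
    return max(range(-max_lag, max_lag + 1),
               key=lambda lag: delay_and_sum_power(left, right, lag))


def lag_to_angle_deg(lag, sample_rate, mic_distance, speed_of_sound=343.0):
    """Convert an inter-microphone lag (samples) to a far-field angle from broadside."""
    sin_theta = lag * speed_of_sound / (sample_rate * mic_distance)
    sin_theta = max(-1.0, min(1.0, sin_theta))  # clamp against rounding overshoot
    return math.degrees(math.asin(sin_theta))


# Synthetic check: white noise that reaches the right microphone 5 samples late.
rng = random.Random(0)
source = [rng.gauss(0.0, 1.0) for _ in range(2000)]
true_lag = 5
left = source
right = [0.0] * true_lag + source[:-true_lag]  # right[n] = source[n - true_lag]

estimated = estimate_lag(left, right, max_lag=10)
angle = lag_to_angle_deg(estimated, sample_rate=16000, mic_distance=0.2)
print(estimated, round(angle, 1))
```

When the candidate lag matches the true delay, the two channels add coherently and the output power peaks; at other lags the noise is uncorrelated and the power stays lower. A real system would use more microphones, fractional delays, and frequency-domain processing.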
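The distance-based behavior selection in result 2 can be sketched as follows. The abstract does not state the actual selection rule, so everything here is hypothetical: the zone boundaries are the approximate values commonly quoted for Hall's proxemic zones, and the `Person` type, the `select_partner` function, and its tie-breaking order (closer zone first, then active speakers, then shorter distance) are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

# Approximate upper bounds (metres) of Hall's proxemic zones (assumed values).
ZONES = (("intimate", 0.45), ("personal", 1.2), ("social", 3.6))
ZONE_RANK = {"intimate": 0, "personal": 1, "social": 2, "public": 3}


def zone_of(distance_m: float) -> str:
    """Map a distance to the name of its proxemic zone."""
    for name, upper in ZONES:
        if distance_m < upper:
            return name
    return "public"


@dataclass
class Person:
    name: str
    distance_m: float
    is_speaking: bool = False


def select_partner(people: Sequence[Person]) -> Optional[Person]:
    """Pick a partner: closer zone first, then speakers, then the nearest person.

    People in the public zone are ignored (a hypothetical policy).
    """
    candidates = [p for p in people if zone_of(p.distance_m) != "public"]
    if not candidates:
        return None
    return min(candidates,
               key=lambda p: (ZONE_RANK[zone_of(p.distance_m)],
                              not p.is_speaking,
                              p.distance_m))


people = [Person("A", 4.0, True), Person("B", 1.0), Person("C", 2.5, True)]
partner = select_partner(people)
print(partner.name if partner else None)
```

Here person A is speaking but stands in the public zone and is ignored, while B is preferred over C because the personal zone ranks ahead of the social zone.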
Project outcomes
Journal papers (323)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Sound and Visual Tracking for Humanoid Robot
- DOI: 10.1023/b:apin.0000021417.62541.e0
- Publication date: 2001-06
- Journal:
- Impact factor: 5.3
- Authors: Hiroshi G. Okuno; K. Nakadai; T. Lourens; H. Kitano
- Corresponding authors: Hiroshi G. Okuno; K. Nakadai; T. Lourens; H. Kitano
Instrument Identification in Polyphonic Music: Feature Weighting to Minimize Influence of Sound Overlaps
- DOI: 10.1155/2007/51979
- Publication date: 2007
- Journal:
- Impact factor: 1.9
- Authors: Tetsuro Kitahara; Masataka Goto; Kazunori Komatani; T. Ogata; Hiroshi G. Okuno
- Corresponding authors: Tetsuro Kitahara; Masataka Goto; Kazunori Komatani; T. Ogata; Hiroshi G. Okuno
Drum Sound Recognition for Polyphonic Audio Signals by Adaptation and Matching of Spectral Templates with Harmonic Structure Suppression
- DOI:
- Publication date: 2007
- Journal:
- Impact factor: 0
- Authors: Kazuyoshi Yoshii; et al.
- Corresponding authors: et al.
Computational Auditory Scene Analysis and Its Application to Robot Audition : Five Years Experience
- DOI:
- Publication date: 2007
- Journal:
- Impact factor: 0
- Authors: Hiroshi G. Okuno
- Corresponding author: Hiroshi G. Okuno
Pitch-dependent identification of musical instrument sounds
- DOI:
- Publication date: 2005
- Journal:
- Impact factor: 0
- Authors: Tetsuro Kitahara; et al.
- Corresponding authors: et al.
Other grants by OKUNO Hiroshi
Development of Robot Audition based on Computational Auditory Scene Analysis
- Grant number: 19100003
- Fiscal year: 2007
- Funding amount: $328,600
- Project category: Grant-in-Aid for Scientific Research (S)
Automatic Transformation of GDA Document Tag and Development of Its Applications
- Grant number: 13558037
- Fiscal year: 2001
- Funding amount: $328,600
- Project category: Grant-in-Aid for Scientific Research (B)
Musical Information Processing by using Sound Ontology
- Grant number: 12480090
- Fiscal year: 2000
- Funding amount: $328,600
- Project category: Grant-in-Aid for Scientific Research (B)
Molecular chaperones in male infertility
- Grant number: 10671471
- Fiscal year: 1998
- Funding amount: $328,600
- Project category: Grant-in-Aid for Scientific Research (C)
Molecular biological analysis of urothelial cancers using short-term cultures of urinary exfoliated cells
- Grant number: 08671813
- Fiscal year: 1996
- Funding amount: $328,600
- Project category: Grant-in-Aid for Scientific Research (C)
Similar overseas grants
Excellence in Research: Incorporating Attention into Computational Auditory Scene Analysis Using Spectral Clustering with Focal Templates
- Grant number: 2100874
- Fiscal year: 2021
- Funding amount: $328,600
- Project category: Standard Grant
Computational auditory scene analysis as causal inference
- Grant number: 1921501
- Fiscal year: 2019
- Funding amount: $328,600
- Project category: Standard Grant
Bayesian prediction for computational auditory scene analysis
- Grant number: 510708-2017
- Fiscal year: 2017
- Funding amount: $328,600
- Project category: University Undergraduate Student Research Awards
Applying structure in computational auditory scene analysis
- Grant number: 475019-2015
- Fiscal year: 2017
- Funding amount: $328,600
- Project category: Alexander Graham Bell Canada Graduate Scholarships - Doctoral
Applying structure in computational auditory scene analysis
- Grant number: 475019-2015
- Fiscal year: 2016
- Funding amount: $328,600
- Project category: Alexander Graham Bell Canada Graduate Scholarships - Doctoral
A study on the structure creation of activation support of acoustic measurement environment based on computational auditory scene analysis
- Grant number: 16H02911
- Fiscal year: 2016
- Funding amount: $328,600
- Project category: Grant-in-Aid for Scientific Research (B)
Applying structure in computational auditory scene analysis
- Grant number: 475019-2015
- Fiscal year: 2015
- Funding amount: $328,600
- Project category: Alexander Graham Bell Canada Graduate Scholarships - Doctoral
Computational Auditory Scene Analysis Using Active Audio-Visual Integration in a Dynamically Changing Environment
- Grant number: 22700165
- Fiscal year: 2010
- Funding amount: $328,600
- Project category: Grant-in-Aid for Young Scientists (B)
Development of Robot Audition based on Computational Auditory Scene Analysis
- Grant number: 19100003
- Fiscal year: 2007
- Funding amount: $328,600
- Project category: Grant-in-Aid for Scientific Research (S)
Computational Auditory Scene Analysis algorithms for improving speech communication in complex acoustic environments (B02)
- Grant number: 406030471
- Fiscal year:
- Funding amount: $328,600
- Project category: Collaborative Research Centres