
Study on Computational Auditory Scene Analysis for Humanoids by Active Audition


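Basic Information

Grant number:
15200015
Principal investigator:
OKUNO Hiroshi
Amount:
$328,600
Host institution:
Kyoto University
Host institution country:
Japan
Project category:
Grant-in-Aid for Scientific Research (A)
Fiscal year:
2003
Funding country:
Japan
Project period:
2003 to 2006
Project status:
Completed

Source:
https://kaken.nii.ac.jp/en/grant/KAKENHI-PROJECT-15200015/
Keywords: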
Robot Audition; Computational Auditory Scene Analysis; Audio-Visual Integration; Music Information Processing; Automatic Onomatopoeia Recognition; Missing Feature Theory; Automatic Missing Feature Mask Generation; Genetic Algorithm; ミッシングフィーチャ色

Project Abstract

Robot audition is the capability of a humanoid to hear sounds with its own microphones (ears) mounted on its body. Since humanoids usually hear a mixture of sounds in the real world, Computational Auditory Scene Analysis (CASA), whose essential functions are sound source localization, sound source separation, and recognition of the separated sounds, is required to realize the capability of listening to several things simultaneously, like "Shotoku-Taishi" (Prince Shotoku). We obtained the following research results:
1) CASA functions with less prior information: The missing-feature-based approach integrated sound source localization (MUSIC or a steered beamformer), sound source separation (Geometric Source Separation or Independent Component Analysis), and automatic speech recognition (Multi-band Julius or CTK) by developing automatic missing feature mask generation. The whole system was implemented on the FlowDesigner architecture, and recognition of three simultaneous speech signals was performed with a latency of 1.9 sec. This result confirmed the validity of our approach on different humanoids, including SIG2, Robovie-R2, and ASIMO.
2) Distance-based behavior selection: An interaction strategy based on the distance between the humanoid and people, following Proxemics, was devised to select an appropriate interaction partner. This system, implemented on the SIG-2 humanoid, was demonstrated for three months at the Kyoto University Museum to confirm its effectiveness in multi-person interaction.
3) Robust face tracking was developed based on Color-target Detection Based on Nearest Neighbor Classifier to improve the performance of moving-talker tracking.
4) Music information technologies for polyphonic music, including musical instrument recognition, drum sound extraction, and singer recognition, were developed so that humanoids can hear music.
5) A user model and recovery from speech recognition errors were developed to improve the usability of a multi-domain spoken dialogue system.
6) An automatic onomatopoeia recognition system was developed to use environmental sounds in humanoid-human interaction.
Future work includes the design and development of robot audition based on CASA.
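As a rough illustration of the missing feature mask idea named in result 1), the sketch below marks each time-frequency feature of a separated signal as reliable (1) or unreliable (0) by thresholding a local signal-to-interference ratio. This is only a minimal sketch under stated assumptions: the abstract does not describe how the project generated its masks, and the function name `missing_feature_mask`, the `threshold_db` parameter, and the assumption that an interference-energy estimate is available per feature are all hypothetical.

```python
import math


def missing_feature_mask(target_energy, interference_energy, threshold_db=0.0):
    """Return a binary mask with 1 where a time-frequency feature looks reliable.

    Hypothetical helper: both inputs are lists of frames, each frame a list of
    non-negative energies per frequency band. The decision rule (a simple
    threshold on the local ratio in dB) is an assumption for illustration.
    """
    mask = []
    for target_frame, noise_frame in zip(target_energy, interference_energy):
        row = []
        for s, n in zip(target_frame, noise_frame):
            # A small floor avoids log(0) and division by zero.
            ratio_db = 10.0 * math.log10(max(s, 1e-12) / max(n, 1e-12))
            row.append(1 if ratio_db > threshold_db else 0)
        mask.append(row)
    return mask


def masked_squared_distance(features, template, mask):
    """Squared distance computed over reliable features only."""
    total = 0.0
    for f_row, t_row, m_row in zip(features, template, mask):
        for f, t, m in zip(f_row, t_row, m_row):
            if m:
                total += (f - t) ** 2
    return total
```

A recognizer in this style would score each candidate with `masked_squared_distance` (or a likelihood restricted in the same way), so that features dominated by leakage from other talkers do not contribute to the decision.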


Project Outputs

Journal articles (323)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Sound and Visual Tracking for Humanoid Robot

DOI:
10.1023/b:apin.0000021417.62541.e0
Publication date:
2001-06
Journal:
Applied Intelligence
Impact factor:
5.3
Authors:
Hiroshi G. Okuno; K. Nakadai; T. Lourens; H. Kitano
Corresponding authors:
Hiroshi G. Okuno; K. Nakadai; T. Lourens; H. Kitano

Instrument Identification in Polyphonic Music: Feature Weighting to Minimize Influence of Sound Overlaps

DOI:
10.1155/2007/51979
Publication date:
2007
Journal:
EURASIP Journal on Advances in Signal Processing
Impact factor:
1.9
Authors:
Tetsuro Kitahara; Masataka Goto; Kazunori Komatani; T. Ogata; Hiroshi G. Okuno
Corresponding authors:
Tetsuro Kitahara; Masataka Goto; Kazunori Komatani; T. Ogata; Hiroshi G. Okuno

Drum Sound Recognition for Polyphonic Audio Signals by Adaptation and Matching of Spectral Templates with Harmonic Structure Suppression
