Establishment of Speech Communication under Very Heavy Environmental Noise

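Basic Information

Grant number:
15500137
Principal investigator:
UCHINO Eiji
Amount:
$19,800
Host institution:
Yamaguchi University
Host institution country:
Japan
Project category:
Grant-in-Aid for Scientific Research (C)
Fiscal year:
2003
Funding country:
Japan
Project period:
2003 to 2006
Project status:
Completed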

Source:
https://kaken.nii.ac.jp/en/grant/KAKENHI-PROJECT-15500137/
Keywords:
heavy noise; speech communication; bone conduction voice; air conduction voice; speech conversion; codebook; twin units SOM; self-organizing network

Project Summary

A bone conduction microphone, which eliminates surrounding noise, is often used in extremely noisy environments such as ship engine rooms or airport runways. It detects the vibration of bones such as the jaw and converts that vibration to voice. Unfortunately, the quality of the voice obtained from this microphone is too poor for smooth communication. The aim of this research is therefore to develop an algorithm that converts a bone conduction voice into an air conduction voice, in order to provide smooth voice communication in extremely noisy environments. The results of this research are as follows.

1. Voice Conversion by the Proposed TW-SOM
A new type of self-organizing map with twin units (TW-SOM), which can describe a nonlinear input-output relation with high accuracy, was proposed and applied to voice conversion. Concretely, TW-SOM learns the nonlinear relation between the bone conduction voice and the air conduction voice by means of its twin units. After learning, a bone conduction voice applied to TW-SOM is converted to the corresponding air conduction voice.

2. Verification of the Effectiveness of the Proposed Method and Application to Actual Ship-Handling Words
The effectiveness of the proposed voice conversion method was verified on actual ship-handling words by comparison with the conventional SOM and other competing neural network methods. It was also confirmed that the proposed method is better suited to hardware implementation than the other conventional methods.

3. Examination of Applicability to Other Fields
The key idea of the codebook used in the proposed method was successfully applied to image expansion to obtain a clear image. TW-SOM is a general method for precisely describing various nonlinear mappings, including voice conversion. Its applicability to other fields is left for future study.
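
The summary does not give the internal details of TW-SOM, so the following is only a minimal sketch of the general idea of learning an input-output mapping with paired codebook vectors, not the authors' algorithm. It assumes that each map unit holds two vectors, one in the input space (standing in for bone conduction features) and one in the output space (standing in for air conduction features), that the winner is chosen on the input side only, and that both vectors are pulled toward the training pair. The class name `PairedSOM`, all parameters, and the sine-curve toy data are hypothetical.

```python
import math
import random


class PairedSOM:
    """1-D self-organizing map whose units each hold a pair of vectors.

    Hypothetical illustration: w_in[i] lives in the input feature space,
    w_out[i] in the output feature space. This is an assumed design, not
    the TW-SOM described in the project summary.
    """

    def __init__(self, n_units, dim_in, dim_out, seed=0):
        rng = random.Random(seed)
        self.w_in = [[rng.uniform(-1, 1) for _ in range(dim_in)]
                     for _ in range(n_units)]
        self.w_out = [[rng.uniform(-1, 1) for _ in range(dim_out)]
                      for _ in range(n_units)]

    def winner(self, x):
        """Index of the unit whose input-side vector is closest to x."""
        def dist2(w):
            return sum((wi - xi) ** 2 for wi, xi in zip(w, x))
        return min(range(len(self.w_in)), key=lambda i: dist2(self.w_in[i]))

    def train(self, pairs, epochs=50, lr0=0.5, seed=0):
        """Fit both codebooks to (input, output) training pairs."""
        rng = random.Random(seed)
        n = len(self.w_in)
        sigma0 = n / 2.0
        for epoch in range(epochs):
            frac = epoch / epochs
            lr = lr0 * (1.0 - frac)                  # decaying learning rate
            sigma = max(sigma0 * (1.0 - frac), 0.5)  # shrinking neighborhood
            order = list(pairs)
            rng.shuffle(order)
            for x, y in order:
                c = self.winner(x)
                for i in range(n):
                    # Gaussian neighborhood around the winner on the 1-D grid.
                    h = lr * math.exp(-((i - c) ** 2) / (2.0 * sigma ** 2))
                    self.w_in[i] = [w + h * (xi - w)
                                    for w, xi in zip(self.w_in[i], x)]
                    self.w_out[i] = [w + h * (yi - w)
                                     for w, yi in zip(self.w_out[i], y)]

    def convert(self, x):
        """Map an input vector to the output-side vector of its winner."""
        return list(self.w_out[self.winner(x)])


def demo():
    """Learn y = sin(pi * x) on [-1, 1] and return the mean squared error."""
    xs = [i / 50.0 for i in range(-50, 51)]
    pairs = [([x], [math.sin(math.pi * x)]) for x in xs]
    som = PairedSOM(n_units=20, dim_in=1, dim_out=1)
    som.train(pairs)
    return sum((som.convert(x)[0] - y[0]) ** 2 for x, y in pairs) / len(pairs)
```

In a real voice conversion setting the scalar toy inputs would be replaced by frame-level spectral features of time-aligned bone conduction and air conduction recordings; that feature extraction and alignment is outside this sketch. Because conversion here is a nearest-neighbor lookup in a fixed codebook, the output is piecewise constant, which is one reason a method aiming at high accuracy would need something more refined than this baseline.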
