STUDIES OF AN ADVANCED AUDITORY MODEL AND THE APPLICATION TO IMPROVE THE ROBUSTNESS OF CONTINUOUS SPEECH RECOGNITION

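Basic Information

Grant number: 10650358
Principal investigator: TANIGUCHI Shuji
Amount: $22,400
Host institution: Fukui University
Host institution country: Japan
Project category: Grant-in-Aid for Scientific Research (C)
Fiscal year: 1998
Funding country: Japan
Project period: 1998 to 2000
Project status: Completed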

Project Summary

Our final goal is to develop a reliable continuous speech recognition system based on a model of the human auditory system. To that end, we carried out the following studies.

(1) Starting from a subword-unit-based isolated word recognizer (VQ-SWR) that we developed previously, which uses discrete hidden Markov models (DHMMs) as its recognition tool, we worked to improve robustness to speakers and to some environmental noises. The findings can be summarized as follows.

[1] We developed a new recognizer in which the DHMMs are replaced with semi-continuous HMMs. Experimental results showed a considerable improvement in speaker independence.

[2] We developed a new subword-unit-based isolated word recognizer, built on the VQ-SWR, that incorporates a multiparty and a speaker adaptation step. It is made up of DHMMs and a learning vector quantizer (LVQ) that incorporates feedback of information on the classification of the input subword, obtained from the output of the LVQ. Experimental results showed that the new recognizer outperforms the conventional VQ-SWR, including in robustness to speakers and to stationary noise.

(2) Aiming at higher word recognition rates and higher noise robustness than the VQ-SWR, we proposed a new recognizer (CM-RN-SWR) made up of a model of the human cochlea called "a nonlinear feedback model for cochlea" (NLF-COM), a simple multi-layer recurrent neural network (RNN) with self-loop feedback connections, and DHMMs for words. The NLF-COM, which we developed previously, serves as a model of the human auditory system and as a kind of spectrum analyzer for speech sounds. The RNN, also developed previously, serves as a subword recognizer. Experimental results showed that, in speaker-dependent applications, recognition accuracy for clean speech and for speech in the presence of pseudo-white noise is considerably improved in comparison with the VQ-SWR.
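
The summary does not give implementation details. As a rough illustration of the general idea behind a vector-quantization front end feeding discrete HMMs, the following is a minimal sketch and not the authors' VQ-SWR. It assumes word-level models scored directly (the subword-unit layer, LVQ, semi-continuous HMMs, NLF-COM, and RNN are all omitted), and every name and number in it (`quantize`, `log_forward`, `recognize`, the toy codebook and model parameters) is hypothetical.

```python
import math

def quantize(frames, codebook):
    """Map each feature vector to the index of its nearest codeword
    (squared Euclidean distance)."""
    def nearest(frame):
        return min(
            range(len(codebook)),
            key=lambda k: sum((x - c) ** 2 for x, c in zip(frame, codebook[k])),
        )
    return [nearest(f) for f in frames]

def log_forward(symbols, start, trans, emit):
    """Log-likelihood of a symbol sequence under a discrete HMM,
    computed with the scaled forward algorithm."""
    if not symbols:
        return 0.0
    n = len(start)
    alpha = [start[i] * emit[i][symbols[0]] for i in range(n)]
    log_prob = 0.0
    for t in range(len(symbols)):
        if t > 0:
            # Propagate through the transitions, then weight by the emission.
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][symbols[t]]
                for j in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence is impossible under this model
        log_prob += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return log_prob

def recognize(frames, codebook, word_models):
    """Return the word whose discrete HMM gives the highest log-likelihood."""
    symbols = quantize(frames, codebook)
    return max(word_models, key=lambda w: log_forward(symbols, *word_models[w]))

# Toy data (hypothetical): a 2-codeword codebook and two 2-state left-to-right models.
codebook = [[0.0, 0.0], [1.0, 1.0]]
left_to_right = [[0.5, 0.5], [0.0, 1.0]]
word_models = {
    "low_then_high": ([1.0, 0.0], left_to_right, [[0.9, 0.1], [0.1, 0.9]]),
    "high_then_low": ([1.0, 0.0], left_to_right, [[0.1, 0.9], [0.9, 0.1]]),
}
frames = [[0.1, -0.1], [0.2, 0.0], [0.9, 1.1], [1.0, 0.8]]
best = recognize(frames, codebook, word_models)
```

Here the four frames quantize to the symbol sequence 0, 0, 1, 1, which the "low_then_high" model explains far better than the other, so `best` is `"low_then_high"`. A real system would train the codebook and HMM parameters from speech data rather than fix them by hand.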
