A Study for Utilizing the Linguistic Information in Phoneme Recognition to Understand Continuous Speech

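Basic Information

Grant number:
03452173
Principal investigator:
KIDO Ken'iti
Amount:
$43,500
Host institution:
Chiba Institute of Technology
Host institution country:
Japan
Project category:
Grant-in-Aid for General Scientific Research (B)
Fiscal year:
1991
Funding country:
Japan
Project period:
1991 to 1993
Project status:
Completed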

Project Abstract

In this study, we proposed two higher-performance phoneme recognition methods and a continuous speech recognition method that utilizes the linguistic information around the target phoneme.

First, we proposed the MR-HMM (Multi-Resolution HMM), based on the wavelet transform, which can control the time-frequency resolution. WTD (Wavelet transform Tree Data) was proposed to represent the time-frequency space of the scalogram obtained through the wavelet transform. Using this WTD structure, we proposed the State Merge Algorithm for training the MR-HMM, which enables a high recognition rate.

Next, we proposed a phoneme recognition method that uses nine acoustic features in addition to the cepstrum parameters, which are the most popular but are not sufficient on their own. In general, using several kinds of acoustic parameters requires analyzing which parameters suit the recognition of each specific phoneme. The proposed method, however, makes it possible to use several kinds of parameters without that analysis. We proposed the Membership Scale so that the linear discriminant method, which is designed for two-category discrimination, can be applied to multi-category discrimination. With this method, the linguistic recognition stage can obtain the reliability of the results from the acoustic recognition stage.

Finally, we proposed a new linguistic recognition method that uses the co-occurrence relationships of the words in a sentence. This method does not use grammatical knowledge, so it can handle task-free speech. By combining this linguistic recognition method with the acoustic recognition methods described above, misrecognition in the acoustic recognition stage can be controlled by the linguistic recognition stage. The experimental results confirmed the effectiveness of the proposed recognition methods.
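
The abstract does not give the scoring formulas, so the following is only a minimal sketch of the general idea behind the last step: word candidates from an acoustic stage carry a reliability value, and a grammar-free linguistic stage prefers hypotheses whose words tend to co-occur in the same sentence. Everything here is an assumption made for illustration, including the function names (`cooccurrence_counts`, `linguistic_score`, `best_sentence`), the log-based scores, the `weight` parameter, and the toy corpus. It is not the project's actual method.

```python
import math
from collections import Counter
from itertools import combinations, product


def cooccurrence_counts(sentences):
    """Count how often each unordered word pair appears in the same sentence."""
    pair_counts = Counter()
    for words in sentences:
        for a, b in combinations(sorted(set(words)), 2):
            pair_counts[(a, b)] += 1
    return pair_counts


def linguistic_score(words, pair_counts):
    """Sum log(1 + count) over all word pairs in a hypothesis (assumed scoring)."""
    return sum(
        math.log1p(pair_counts[tuple(sorted((a, b)))])
        for a, b in combinations(words, 2)
    )


def best_sentence(candidates_per_slot, pair_counts, weight=1.0):
    """Pick the word sequence with the best combined score.

    candidates_per_slot: one list per word position, each holding
    (word, reliability) pairs with reliability in (0, 1].
    """
    best_words, best_score = None, -math.inf
    for hypothesis in product(*candidates_per_slot):
        words = [w for w, _ in hypothesis]
        acoustic = sum(math.log(r) for _, r in hypothesis)
        score = acoustic + weight * linguistic_score(words, pair_counts)
        if score > best_score:
            best_words, best_score = words, score
    return best_words


# Toy training sentences (hypothetical data).
corpus = [
    ["open", "the", "door"],
    ["close", "the", "door"],
    ["read", "the", "book"],
]
counts = cooccurrence_counts(corpus)

# The acoustic stage slightly prefers "book" over "door" in the last slot.
slots = [
    [("open", 0.6)],
    [("the", 0.9)],
    [("book", 0.55), ("door", 0.45)],
]

print(best_sentence(slots, counts, weight=0.0))  # acoustic evidence only
print(best_sentence(slots, counts, weight=1.0))  # with co-occurrence evidence
```

With `weight=0.0` the acoustically preferred candidate wins. With `weight=1.0` the co-occurrence term overrides the small acoustic margin, which is the kind of correction the abstract attributes to the linguistic stage. The exhaustive search over `product` is only practical for very short candidate lists.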
