Word Recognition using A Two-Dimensional Mel-cepstrum under Noisy Environments.
Basic information
- Grant number: 63550253
- Amount: $13,400
- Host institution country: Japan
- Project category: Grant-in-Aid for General Scientific Research (C)
- Fiscal year: 1988
- Funding country: Japan
- Period: 1988 to 1989
- Project status: Completed
Project abstract
The purpose of this research is to offer a new method for word recognition in noisy environments. The study uses white noise generated by computer simulation and colored noise recorded in Nagoya Station. A speaker-independent method for recognizing the ten Japanese digits using a two-dimensional mel-cepstrum (TDMC) is proposed. The TDMC is defined as the two-dimensional Fourier transform of mel-frequency-scaled logarithmic spectra in the frequency and time domains, and it consists of average features and dynamic features of the two-dimensional mel-log spectra. The experimental results are as follows.
1. Speech analysis-synthesis system using a TDMC and its estimation: The structure of a speech analysis-synthesis system using a TDMC is proposed in order to study the size of the TDMC needed to synthesize good-quality speech. It is shown that the frequency of the required area of the TDMC is less than about 10 Hz.
2. Reference patterns robust to variation in the signal-to-noise ratio (SNR) of input speech: A single set of TDMCs of noise-added reference patterns with a desired SNR is used for word recognition in noisy environments. Experimental results show that recognition using this reference pattern set is more effective than a usual method.
3. Distance measures for word recognition robust to variation in the SNR of input speech: Distance measures using a combination of dynamic and average features of the TDMC are proposed. It is shown that dynamic features are more important than average features for word recognition in noisy environments.
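The TDMC definition in the abstract (a two-dimensional Fourier transform of mel-log spectra over time and frequency) can be illustrated with a minimal NumPy sketch. This is not the project's implementation: the array layout (frames by mel bands), the function names `tdmc` and `split_features`, the frame rate, and the convention that the 0 Hz temporal row holds the average features while the low nonzero temporal frequencies hold the dynamic features are all assumptions made for illustration.

```python
import numpy as np

def tdmc(mel_log_spec):
    """Return the 2-D DFT of a mel-log spectrogram.

    mel_log_spec: array of shape (n_frames, n_mel_bands); rows are time
    frames, columns are mel-scaled log-spectrum bands (assumed layout).
    """
    return np.fft.fft2(np.asarray(mel_log_spec, dtype=float))

def split_features(coeffs, frame_rate_hz, max_mod_hz=10.0):
    """Split TDMC rows into average and dynamic parts (assumed convention).

    Row 0 (temporal frequency 0 Hz) is treated as the average features;
    rows with 0 < |temporal frequency| <= max_mod_hz as dynamic features.
    """
    n_frames = coeffs.shape[0]
    # Temporal frequency in Hz of each row of the 2-D DFT.
    mod_freqs = np.abs(np.fft.fftfreq(n_frames, d=1.0 / frame_rate_hz))
    average = coeffs[0]
    dynamic = coeffs[(mod_freqs > 0) & (mod_freqs <= max_mod_hz)]
    return average, dynamic

# Toy input: 20 frames x 8 bands of random values standing in for real data.
rng = np.random.default_rng(0)
spec = rng.standard_normal((20, 8))
average, dynamic = split_features(tdmc(spec), frame_rate_hz=100.0)
print(average.shape, dynamic.shape)
```

The `max_mod_hz` default of 10 Hz echoes the abstract's statement that the required area of the TDMC lies below about 10 Hz; reading that figure as a temporal-frequency cutoff is an interpretation, not something the abstract states outright.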
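Item 2 of the abstract relies on reference patterns built from speech with noise added at a desired SNR. A common way to prepare such material is to scale the noise so that the speech-to-noise power ratio matches a target value in decibels. The sketch below assumes that approach; the function name `add_noise_at_snr` and the test signal are hypothetical, and the abstract does not say how the project mixed its noise.

```python
import numpy as np

def add_noise_at_snr(speech, noise, snr_db):
    """Mix noise into speech so the result has the requested SNR in dB.

    Assumes noise is at least as long as speech and is not all zeros.
    """
    speech = np.asarray(speech, dtype=float)
    noise = np.asarray(noise, dtype=float)
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[: len(speech)]
    p_speech = np.mean(speech ** 2)   # average speech power
    p_noise = np.mean(noise ** 2)     # average noise power
    # Gain that makes p_speech / (gain**2 * p_noise) equal 10**(snr_db / 10).
    gain = np.sqrt(p_speech / (p_noise * 10.0 ** (snr_db / 10.0)))
    return speech + gain * noise

# Toy example: a sine wave standing in for speech, white noise at 10 dB SNR.
t = np.arange(1000) / 1000.0
clean = np.sin(2 * np.pi * 5 * t)
white = np.random.default_rng(1).standard_normal(1000)
noisy = add_noise_at_snr(clean, white, snr_db=10.0)
```

The same function would apply to recorded colored noise, since only the average power of the noise segment enters the gain.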
Project outputs
Journal papers (23)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Tadashi Kitamura: Proceedings of the 1988 Autumn Meeting of the Acoustical Society of Japan, Oct. 1988, pp. 59-60 (1988).
Tadashi Mizutani, Tadashi Kitamura: "A Study of Reference Patterns and Distance Measures for Digit Speech Recognition in Noise." IEICE Speech Research Group Technical Materials, SP88-121, pp. 39-45 (1989).
Tadashi Mizutani, Tadashi Kitamura: "On Methods Making Reference Patterns and Distance Measures in Digit speech Recognition in Noisy Environments." IEICE Technical Report SP88-121, pp.39-45, 1988.
Tadashi Kitamura, Etsuro Hayahara: "Speaker-Dependent Digit Word Recognition in Noisy Environments Using Dynamic Features of A Two-Dimensional Mel-Cepstrum." Trans.IEICE, Vol.J72-D-II, No.8, pp.1242-1247, 1989.
Yoshinori Asamura, Tadashi Kitamura: "Speech Analysis-Synthesis System Using A Two-Dimensional Mel-Cepstrum" IEICE Technical Report SP88-47, pp.17-24, 1988.
Other grants held by KITAMURA Tadashi
Subunit modeling for Japanese sign language recognition based on stochastic model
- Grant number: 22500506
- Fiscal year: 2010
- Funding amount: $13,400
- Project category: Grant-in-Aid for Scientific Research (C)
Speech synthesis based on eigenvoices: aiming to realize diverse voice qualities
- Grant number: 12680380
- Fiscal year: 2000
- Funding amount: $13,400
- Project category: Grant-in-Aid for Scientific Research (C)
Person Recognition by Multi-modal Information
- Grant number: 09680394
- Fiscal year: 1997
- Funding amount: $13,400
- Project category: Grant-in-Aid for Scientific Research (C)
The Studies on the folklore concerning cattle raising in Chugoku mountain area
- Grant number: 09610316
- Fiscal year: 1997
- Funding amount: $13,400
- Project category: Grant-in-Aid for Scientific Research (C)
The Formation and Diffusion of the Folk Cultures in Chugoku Mountains
- Grant number: 05610250
- Fiscal year: 1993
- Funding amount: $13,400
- Project category: Grant-in-Aid for General Scientific Research (C)
On the Value Orientations of Okinawa in the Light of Dynamics of "Monchu" System.
- Grant number: 60510151
- Fiscal year: 1985
- Funding amount: $13,400
- Project category: Grant-in-Aid for General Scientific Research (C)
Similar overseas grants
Investigating the relationship between auditory discrimination and word recognition using Japanese pitch accent
- Grant number: 23K00490
- Fiscal year: 2023
- Funding amount: $13,400
- Project category: Grant-in-Aid for Scientific Research (C)
Effects of word frequency and phonological neighborhood density on word recognition by native speakers of English and Japanese
- Grant number: 22K00564
- Fiscal year: 2022
- Funding amount: $13,400
- Project category: Grant-in-Aid for Scientific Research (C)
Behavioral and Neural Measures of Spoken Word Recognition in Late Language Emergence
- Grant number: 10437317
- Fiscal year: 2022
- Funding amount: $13,400
Growing up around different accents: the effect of speech variability on infant word recognition
- Grant number: 2738524
- Fiscal year: 2022
- Funding amount: $13,400
- Project category: Studentship
Orthographic and Semantic Representations: Consolidation and Role in Visual Word Recognition
- Grant number: RGPIN-2018-03758
- Fiscal year: 2021
- Funding amount: $13,400
- Project category: Discovery Grants Program - Individual
How variable input shapes word recognition
- Grant number: RGPIN-2016-04511
- Fiscal year: 2021
- Funding amount: $13,400
- Project category: Discovery Grants Program - Individual
Word recognition in dual language learners: The mechanisms underlying listening and reading in two languages
- Grant number: 10404052
- Fiscal year: 2021
- Funding amount: $13,400
Capacity limits in the neural circuitry of visual word recognition
- Grant number: 10296072
- Fiscal year: 2021
- Funding amount: $13,400
Word recognition in dual language learners: The mechanisms underlying listening and reading in two languages
- Grant number: 10217506
- Fiscal year: 2021
- Funding amount: $13,400
Capacity limits in the neural circuitry of visual word recognition
- Grant number: 10330043
- Fiscal year: 2021
- Funding amount: $13,400