High-quality Speech Synthesis based on Accurate Analysis Method and Statistical Method
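Basic information
- Grant number: 12480079
- Principal investigator:
- Amount: $64,000
- Host institution:
- Host institution country: Japan
- Project category: Grant-in-Aid for Scientific Research (B)
- Fiscal year: 2000
- Funding country: Japan
- Period: 2000 to 2002
- Project status: Completed
- Source:
- Keywords: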
Project abstract
The original research plan aimed to realize high-quality speech synthesis by using accurate pole-zero information of the vocal transfer function for segmental feature generation and by applying functional model constraints for prosodic feature generation. It was accomplished with the following results:
1. Successive approximation was applied to ARX analysis, enabling accurate pole-zero estimation. The method was combined with our previously developed terminal analogue synthesizer to construct an analysis-synthesis workbench, which we used to improve the quality of liquid sounds.
2. A speech synthesizer that is a hybrid of terminal analogue synthesis and waveform concatenation was developed, and high-quality speech synthesis was realized.
3. A method for stable formant extraction was developed, based on AR-HMM modeling in which the source waveform is represented by an HMM. Speech synthesis experiments showed that the method could generate high-quality speech even for a large change in F0 (fundamental frequency).
4. By adding the natural waveform of junction periods, in the spectral domain and with appropriate weighting, to the concatenated speech, we realized smooth spectral transitions. We also developed a method that effectively reduces the corpus size for concatenative synthesis using VQ weighted according to the frequency.
5. After developing an HMM speech synthesizer, we investigated the amount of data needed for speaker adaptation from the viewpoint of speech quality. Good quality was obtainable with 10 or more sentences.
6. F0 contour generation was realized by estimating the parameters of the generation process model using statistical methods. High speech quality was achieved from only a small speech corpus by using linguistic information such as the direct modification relations between words. We also succeeded in estimating accent phrase boundaries from text using the same statistical framework. Furthermore, F0 contour generation and phoneme length estimation were realized for emotional speech with good results.
7. A method for automatically estimating the commands of the F0 contour generation process model was realized. Using this method, a prosodic corpus was built; this corpus is indispensable for the F0 contour generation described above.
8. A rule for controlling mora duration in dialogue-like speech synthesis was constructed. A speech synthesis experiment showed the validity of the rule.
Project results
Journal articles (142)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
広瀬啓吉: "音声情報処理におけるパラ・非言語情報"日本音響学会秋季講演論文集. I. 243-246 (2002)
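Keikichi Hirose: "Para- and non-linguistic information in speech information processing," Proceedings of the Acoustical Society of Japan Autumn Meeting, I, 243-246 (2002).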
広瀬啓吉: "音声合成研究への招待 -自由な合成の実現に向けて-"情報処理. 43・3. 321-324 (2002)
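Keikichi Hirose: "An invitation to speech synthesis research: toward the realization of free synthesis," Joho Shori (Information Processing), 43(3), 321-324 (2002).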
Nobuyuki Nishizawa, Nobuaki Minematsu, and Keikichi Hirose: "Formant speech synthesis partly using waveform concatenative synthesis -Experimental study on VCV sounds-"IEICE Technical Report. SP2001-20. 35-42 (2001)
江藤雅哉: "生成過程モデルと統計的手法による統語構造を考慮した基本周波数パターンの生成"電子情報通信学会技術研究報告(音声研究会). 17-22 (2002)
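Masaya Eto: "Generation of fundamental frequency contours considering syntactic structure using the generation process model and statistical methods," IEICE Technical Report (Speech Research Group), 17-22 (2002).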
西沢信行: "Development of a formant-based analysis-synthesis system and generation of high quality liquid sounds of Japanese"Proc.International Conf.on Spoken Language Processing. 1. 725-728 (2000)
Nobuyuki Nishizawa: "Development of a formant-based analysis-synthesis system and generation of high quality liquid sounds of Japanese," Proc. International Conf. on Spoken Language Processing, 1, 725-728 (2000).
Other grants by HIROSE Keikichi
Pronunciation education system based on the systematization of non-mother-tongue speech prosody using generation process model and speech synthesis
- Grant number: 24652115
- Fiscal year: 2012
- Funding amount: $64,000
- Project category: Grant-in-Aid for Challenging Exploratory Research
Advanced method of prosody control in statistical-based speech synthesis using generation process model of fundamental frequency contours
- Grant number: 24300068
- Fiscal year: 2012
- Funding amount: $64,000
- Project category: Grant-in-Aid for Scientific Research (B)
Expressive Multi-language Speech Synthesis Based on the Generation Process Model and Its Use for Automatic Speech Translation
- Grant number: 21300061
- Fiscal year: 2009
- Funding amount: $64,000
- Project category: Grant-in-Aid for Scientific Research (B)
Synthesis of speech in any speaking styles based on corpus-based generation of prosodic features using the generation process model
- Grant number: 17300055
- Fiscal year: 2005
- Funding amount: $64,000
- Project category: Grant-in-Aid for Scientific Research (B)
Naturally Sounding Speech Synthesis and Recognition Based on the Formulation of Prosody
- Grant number: 09480061
- Fiscal year: 1997
- Funding amount: $64,000
- Project category: Grant-in-Aid for Scientific Research (B)
Development of Spoken Dialogue System for Japanese and Chinese
- Grant number: 08558028
- Fiscal year: 1996
- Funding amount: $64,000
- Project category: Grant-in-Aid for Scientific Research (A)
Formulation of Prosodic Features of Speech and its Application to Continuous Speech Recognition
- Grant number: 06452397
- Fiscal year: 1994
- Funding amount: $64,000
- Project category: Grant-in-Aid for Scientific Research (B)
Rule-Synthesis of Spoken Sentences for the Speech Dialogue Systems
- Grant number: 03452288
- Fiscal year: 1991
- Funding amount: $64,000
- Project category: Grant-in-Aid for General Scientific Research (B)
Development of Output System of Announcing Speech with Input of Kanji-Kana Sentences
- Grant number: 01850073
- Fiscal year: 1989
- Funding amount: $64,000
- Project category: Grant-in-Aid for Developmental Scientific Research (B)