Synthesis of speech in any speaking styles based on corpus-based generation of prosodic features using the generation process model

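Basic information

Grant number:
17300055
Principal investigator:
HIROSE Keikichi
Amount:
$107,900
Host institution:
The University of Tokyo
Host institution country:
Japan
Project category:
Grant-in-Aid for Scientific Research (B)
Fiscal year:
2005
Funding country:
Japan
Period:
2005 to 2007
Project status:
Completed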

Project abstract

Research was conducted to establish a corpus-based speech synthesis method that is based on the generation process model of fundamental frequency contours and can generate high-quality speech in any speaking style. The original research plan was fulfilled with the following results:

1. A method was developed to predict the command parameters of the generation process model using binary decision trees whose inputs include linguistic information obtained by parsing the text, and thereby to synthesize fundamental frequency contours. Integrated prosodic control was realized by combining this method with other binary-decision-tree methods that predict pause positions and lengths and phoneme durations. The validity of the method was shown through speech synthesis experiments in various styles, including emotional speech. A method was also developed to automatically extract the command parameters from observed fundamental frequency contours using binary decision trees. Extraction accuracy was shown to increase when linguistic information of the text was added to the inputs of the trees.

2. Binary decision trees were constructed to predict how the phrase and accent commands of utterances with a specific focus deviate from those of utterances without one. Their inputs are the accent types and sentence positions of the focused words, together with the command values of the corresponding parts of the utterances without a specific focus. Appropriate focus control was realized by modifying the phrase and accent commands predicted by the method in item 1 according to the predicted deviations.

3. A two-step method was developed for generating fundamental frequency contours of Standard Chinese. It first generates phrase components in a corpus-based way, and then generates tone components in a corpus-based way. The method is highly flexible in synthesizing fundamental frequency contours. As an example of flexible control, it was shown that proper focus control could be realized with a simple set of rules.

4. Speech synthesis systems were constructed for Japanese and Chinese by integrating the methods developed in items 1 and 2 above with HMM speech synthesis. It was shown that our system could produce synthetic speech with higher naturalness than a "full" HMM synthesizer, in which prosodic control is done within the HMM framework. It was also shown that our system could produce synthetic speech in various styles.

5. Spoken dialogue systems for road guidance and TV program guidance were constructed using the above speech synthesis systems. The validity of the developed speech synthesis method was demonstrated through experiments on controlling the speaking style of reply speech depending on the user's characteristics and situation.
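The abstract refers to phrase and accent commands of the generation process model without writing the model out. The sketch below assumes the commonly published form of that model (often called the Fujisaki model), in which the logarithm of the fundamental frequency is a baseline plus the responses to impulse-like phrase commands and step-like accent commands. The constants `ALPHA`, `BETA`, and `GAMMA`, the function names, and the command values in the usage lines are illustrative assumptions, not values or code from the project.

```python
import math

# Assumed, commonly quoted constants; not taken from the project itself.
ALPHA = 3.0   # natural angular frequency of the phrase control mechanism (1/s)
BETA = 20.0   # natural angular frequency of the accent control mechanism (1/s)
GAMMA = 0.9   # ceiling applied to the accent response


def phrase_response(t, alpha=ALPHA):
    """Impulse response of the phrase control mechanism (zero before onset)."""
    if t < 0:
        return 0.0
    return alpha * alpha * t * math.exp(-alpha * t)


def accent_response(t, beta=BETA, gamma=GAMMA):
    """Step response of the accent control mechanism, clipped at gamma."""
    if t < 0:
        return 0.0
    return min(1.0 - (1.0 + beta * t) * math.exp(-beta * t), gamma)


def f0_contour(times, base_f0, phrase_cmds, accent_cmds):
    """Return F0 values (Hz) at the given times (s).

    phrase_cmds: list of (onset_time, magnitude)
    accent_cmds: list of (onset_time, offset_time, amplitude)
    """
    contour = []
    for t in times:
        log_f0 = math.log(base_f0)
        for onset, magnitude in phrase_cmds:
            log_f0 += magnitude * phrase_response(t - onset)
        for onset, offset, amplitude in accent_cmds:
            log_f0 += amplitude * (accent_response(t - onset)
                                   - accent_response(t - offset))
        contour.append(math.exp(log_f0))
    return contour


# Hypothetical commands: one phrase command at 0 s, one accent from 0.3 s to 0.7 s.
times = [i * 0.01 for i in range(200)]
contour = f0_contour(times, 120.0, [(0.0, 0.5)], [(0.3, 0.7, 0.4)])
```

In a setup like the one the abstract describes, the command timings and magnitudes passed to `f0_contour` would come from the decision-tree predictors rather than being written by hand as they are here.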

Project outcomes

Journal articles (0)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Estimation of intonation variation with constrained tone transformations

DOI:
Publication year:
2005
Journal:
Proc. 9th European Conference on Speech Communication and Technology (INTERSPEECH) CD-ROM
Impact factor:
0
Authors:
Keikichi Hirose; Yusuke Furuyama; Nobuaki Minematsu; 広瀬啓吉; Qinghua Sun; Jinfu Ni
Corresponding author:
Jinfu Ni

日本語テキスト音声合成用アクセント結合規則の改良

Improvement of accent concatenation rules for Japanese text-to-speech synthesis

DOI:
Publication year:
2005
Journal:
日本音響学会講演論文集 CD-ROM
Impact factor:
0
Authors:
Keikichi Hirose; Yusuke Furuyama; Nobuaki Minematsu; 広瀬啓吉; Qinghua Sun; Jinfu Ni; 黒岩龍
Corresponding author:
黒岩龍

Constrained tone transformation technique for separation and combination of Mandarin tone and intonation

DOI:
Publication year:
2006
Journal:
Journal of the Acoustical Society of America 119(3)
Impact factor:
0
Authors:
Corinne Touati; Atsushi Inoie; Hisao Kameda; H. Kameda; Jinfu Ni
Corresponding author:
Jinfu Ni

道案内音声対話システムへの概念音声合成に基づく応答生成手法の実装とその評価

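Implementation and evaluation of a response generation method based on concept-to-speech synthesis for a road-guidance spoken dialogue system

DOI:
Publication year:
2007
Journal:
情報処理学会論文誌 48
Impact factor:
0
Authors:
Qinghua Sun; Keikichi Hirose; Nobuaki Minematsu; 八木裕司
Corresponding author:
八木裕司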

Prosody in spoken language technologies (Special Lecture)

DOI:
Publication year:
2007
Journal:
Proceedings of International Workshop on Nonlinear Circuits and Signal Processing (NCSP2007) CD-ROM
Impact factor:
0
Authors:
Keikichi Hirose; Qinghua Sun; Nobuaki Minematsu; 八木裕司
Corresponding author:
Keikichi Hirose