Synthesis of speech in any speaking styles based on corpus-based generation of prosodic features using the generation process model

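Basic information

  • Grant number:
    17300055
  • Principal investigator:
  • Amount:
    $107,900
  • Host institution:
  • Host institution country:
    Japan
  • Project category:
    Grant-in-Aid for Scientific Research (B)
  • Fiscal year:
    2005
  • Funding country:
    Japan
  • Period:
    2005 to 2007
  • Project status:
    Completed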

Project abstract

Research was conducted to establish a corpus-based speech synthesis method that is based on the generation process model of fundamental frequency contours and can generate high-quality speech in any speaking style. The original research plan was fulfilled, with the following results:
1. A method was developed to predict the command parameters of the generation process model using binary decision trees, whose inputs include linguistic information obtained by parsing the text, and thus to synthesize fundamental frequency contours. Combining this method with other binary-decision-tree methods that predict pause positions and lengths and phoneme durations yielded an integrated method of prosodic control. The validity of the method was shown through speech synthesis experiments in various styles, including emotional speech. A method was also developed to automatically extract the command parameters from observed fundamental frequency contours using binary decision trees. Extraction accuracy increased when linguistic information about the text was added to the inputs of the trees.
2. Binary decision trees were constructed to predict how the phrase and accent commands of utterances with a specific focus deviate from those of utterances without one. Their inputs are the accent types and sentence positions of the focused words, and the command values of the corresponding parts of the utterances without specific focus. Appropriate focus control was realized by modifying the phrase and accent commands predicted by the method in section 1 according to the predicted deviations.
3. A two-step method was developed for generating fundamental frequency contours of Standard Chinese. It first generates phrase components and then tone components, both in a corpus-based way. The method is highly flexible in synthesizing fundamental frequency contours. As an example of flexible control, it was shown that proper focus control could be realized with a simple set of rules.
4. Speech synthesis systems were constructed for Japanese and Chinese by integrating the methods developed in sections 1 and 2 above with HMM speech synthesis. The systems produced synthetic speech with higher naturalness than a "full" HMM synthesizer, in which prosodic control is done within the HMM framework. It was also shown that the systems could produce synthetic speech in various styles.
5. Spoken dialogue systems for road guidance and TV program guidance were constructed using the above speech synthesis systems. The validity of the developed speech synthesis method was demonstrated through experiments on controlling the speaking style of reply speech depending on the user's character and situation.
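The abstract does not spell out the generation process model itself. As a rough illustration only, the sketch below uses the formulation commonly cited for this model (often called the Fujisaki model), in which log F0 is a baseline value plus phrase components (responses to impulse-like phrase commands) and accent components (responses to stepwise accent commands). The constants ALPHA, BETA and GAMMA, all function names, and the command values in the example are assumptions chosen for illustration, not values taken from this project.

```python
import math

# Assumed, commonly quoted constants; the project's actual values are not given.
ALPHA = 3.0   # phrase control time constant (1/s)
BETA = 20.0   # accent control time constant (1/s)
GAMMA = 0.9   # ceiling of the accent response


def phrase_response(t, alpha=ALPHA):
    """Impulse response of the phrase control mechanism (zero before onset)."""
    if t < 0:
        return 0.0
    return alpha * alpha * t * math.exp(-alpha * t)


def accent_response(t, beta=BETA, gamma=GAMMA):
    """Step response of the accent control mechanism, clipped at gamma."""
    if t < 0:
        return 0.0
    return min(1.0 - (1.0 + beta * t) * math.exp(-beta * t), gamma)


def f0_contour(times, base_f0, phrase_cmds, accent_cmds):
    """Return F0 (Hz) at each time in `times`.

    phrase_cmds: list of (onset_s, magnitude)
    accent_cmds: list of (onset_s, offset_s, amplitude)
    """
    contour = []
    for t in times:
        log_f0 = math.log(base_f0)
        for onset, magnitude in phrase_cmds:
            log_f0 += magnitude * phrase_response(t - onset)
        for onset, offset, amplitude in accent_cmds:
            log_f0 += amplitude * (
                accent_response(t - onset) - accent_response(t - offset)
            )
        contour.append(math.exp(log_f0))
    return contour


if __name__ == "__main__":
    # Hypothetical commands: one phrase command and one accent command.
    times = [i * 0.05 for i in range(21)]  # 0.0 s to 1.0 s
    contour = f0_contour(times, 120.0, [(0.0, 0.5)], [(0.2, 0.5, 0.4)])
    for t, f0 in zip(times, contour):
        print(f"{t:.2f} s  {f0:.1f} Hz")
```

In the approach the abstract describes, the command lists passed to a function like `f0_contour` would be predicted from linguistic information by the binary decision trees rather than written by hand.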

Project outputs

Journal papers (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Estimation of intonation variation with constrained tone transformations
日本語テキスト音声合成用アクセント結合規則の改良
Improvement of accent concatenation rules for Japanese text-to-speech synthesis
  • DOI:
  • Published:
    2005
  • Journal:
  • Impact factor:
    0
  • Authors:
    Keikichi Hirose;Yusuke Furuyama;Nobuaki Minematsu;Keikichi Hirose;Keikichi Hirose;広瀬啓吉;Keikichi Hirose;Keikichi Hirose;Qinghua Sun;Jinfu Ni;Keikichi Hirose;黒岩 龍
  • Corresponding author:
    黒岩 龍
Constrained tone transformation technique for separation and combination of Mandarin tone and intonation
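道案内音声対話システムへの概念音声合成に基づく応答生成手法の実装とその評価
Implementation and evaluation of a response generation method based on concept-to-speech synthesis for a road guidance spoken dialogue system
  • DOI:
  • Published:
    2007
  • Journal:
  • Impact factor:
    0
  • Authors:
    Qinghua Sun;Keikichi Hirose;Nobuaki Minematsu;八木裕司
  • Corresponding author:
    八木裕司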
Prosody in spoken language technologies (Special Lecture)

Other grants by HIROSE Keikichi

Pronunciation education system based on the systematization of non-mother tongue speech prosody using generation process model and speech synthesis
  • Grant number:
    24652115
  • Fiscal year:
    2012
  • Funding amount:
    $107,900
  • Project category:
    Grant-in-Aid for Challenging Exploratory Research
Advanced method of prosody control in statistical-based speech synthesis using generation process model of fundamental frequency contours
  • Grant number:
    24300068
  • Fiscal year:
    2012
  • Funding amount:
    $107,900
  • Project category:
    Grant-in-Aid for Scientific Research (B)
Expressive Multi-language Speech Synthesis Based on the Generation Process Model and Its Use for Automatic Speech Translation
  • Grant number:
    21300061
  • Fiscal year:
    2009
  • Funding amount:
    $107,900
  • Project category:
    Grant-in-Aid for Scientific Research (B)
High-quality Speech Synthesis based on Accurate Analysis Method and Statistical Method
  • Grant number:
    12480079
  • Fiscal year:
    2000
  • Funding amount:
    $107,900
  • Project category:
    Grant-in-Aid for Scientific Research (B)
Naturally Sounding Speech Synthesis and Recognition Based on the Formulation of Prosody
  • Grant number:
    09480061
  • Fiscal year:
    1997
  • Funding amount:
    $107,900
  • Project category:
    Grant-in-Aid for Scientific Research (B)
Development of Spoken Dialogue System for Japanese and Chinese
  • Grant number:
    08558028
  • Fiscal year:
    1996
  • Funding amount:
    $107,900
  • Project category:
    Grant-in-Aid for Scientific Research (A)
Formulation of Prosodic Features of Speech and its Application to Continuous Speech Recognition
  • Grant number:
    06452397
  • Fiscal year:
    1994
  • Funding amount:
    $107,900
  • Project category:
    Grant-in-Aid for Scientific Research (B)
Rule-Synthesis of Spoken Sentences for the Speech Dialogue Systems
  • Grant number:
    03452288
  • Fiscal year:
    1991
  • Funding amount:
    $107,900
  • Project category:
    Grant-in-Aid for General Scientific Research (B)
Development of Output System of Announcing Speech with Input of Kanji-Kana Sentences
  • Grant number:
    01850073
  • Fiscal year:
    1989
  • Funding amount:
    $107,900
  • Project category:
    Grant-in-Aid for Developmental Scientific Research (B)

Similar overseas grants

Automatic Estimation of Fundamental Frequency Contour Parameters and Automatic Acquisition of Generative rules
  • Grant number:
    11480090
  • Fiscal year:
    1999
  • Funding amount:
    $107,900
  • Project category:
    Grant-in-Aid for Scientific Research (B)