Naturally Sounding Speech Synthesis and Recognition Based on the Formulation of Prosody
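Basic information
- Grant number: 09480061
- Principal investigator:
- Amount: $52,500
- Host institution:
- Host institution country: Japan
- Project category: Grant-in-Aid for Scientific Research (B)
- Fiscal year: 1997
- Funding country: Japan
- Period: 1997 to 1999
- Project status: Completed
- Source:
- Keywords: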
Project abstract
The study aimed to formulate the relationship between the prosodic features of speech and linguistic and para-/non-linguistic information, and to realize advanced speech synthesis technologies. Its results include the following:
1. Automatic extraction of phrase component onsets from fundamental frequency (F0) contours became more accurate by suppressing accent components through low-pass filtering of the contours and by taking their deviations. Accuracy improved further in an automatic prosodic labeling method that uses knowledge of prosody obtainable from linguistic information as constraints on F0 parameter estimation.
2. Mora duration rules were constructed for dialogue-like speech synthesis. These rules basically modify each mora duration of reading-style speech to that of dialogue-like speech on a prosodic-phrase basis, where the prosodic phrases are defined by the F0 contours.
3. Prosodic features of speech with various attitudes/emotions were analyzed. It was found that a speaker selectively controls several prosodic cues to express the degree of an attitude/emotion. A perceptual experiment also showed that control of segmental features is indispensable for realizing emotional speech.
4. A method was developed to represent the F0 contours of prosodic words by codes in mora units and to model their transitions statistically (statistical model of moraic transition). Detection rates of 70-75% were achieved for prosodic word boundaries, with insertion errors of 11-15%. Applied to continuous speech recognition, the method improved mora recognition rates by a few percent. A method was also developed to generate sentence F0 contours from accent types and phrase boundary positions.
5. A prosodic-feature-based method was developed for dynamic pruning in the beam search of large-vocabulary continuous speech recognition. It was shown that the search space could be reduced to a quarter without degrading recognition rates. The method widens the beam at prosodic boundaries and narrows it between boundaries. A method was also developed to select phoneme models with various context dependencies using prosodic boundary information.
6. Based on these results, a spoken dialogue system for academic information retrieval was developed and evaluated.
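Result 5 of the abstract describes a beam that is wide at prosodic boundaries and narrow between them. The sketch below illustrates only that idea and is not the project's implementation: the function names `beam_width_schedule` and `prune`, the parameters `wide`, `narrow`, and `margin`, and the representation of a hypothesis as a `(label, log_score)` pair are all assumptions made for this example.

```python
def beam_width_schedule(num_frames, boundary_frames, wide=400, narrow=100, margin=2):
    """Return one beam width per frame.

    Frames within `margin` frames of a prosodic boundary get the `wide`
    beam; every other frame gets the `narrow` beam. All values here are
    illustrative assumptions, not figures from the project.
    """
    widths = [narrow] * num_frames
    for b in boundary_frames:
        lo = max(0, b - margin)
        hi = min(num_frames, b + margin + 1)
        for t in range(lo, hi):
            widths[t] = wide
    return widths


def prune(hypotheses, width):
    """Keep the `width` best hypotheses; a higher log score is better."""
    ranked = sorted(hypotheses, key=lambda h: h[1], reverse=True)
    return ranked[:width]


if __name__ == "__main__":
    # Ten frames with one hypothetical prosodic boundary at frame 4.
    schedule = beam_width_schedule(10, [4], wide=8, narrow=2, margin=1)
    print(schedule)
    # At each frame a decoder would call prune(active_hypotheses, schedule[t]).
    print(prune([("a", -1.0), ("b", -3.0), ("c", -2.0)], schedule[0]))
```

In a real decoder the boundary frames would come from a prosodic boundary detector, and the two widths would be tuned so that the average beam is smaller than a fixed beam of equal accuracy.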
Project outcomes
Journal articles (98)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Atsuhiro Sakurai: "Detecting accent sandhi in Japanese using a superpositional F0 model" Proc. European Conf. on Speech Communication and Technology. 4. 1863-1866 (1999)
Keikichi Hirose and Shinya Kiriyama: "Generation of speech reply in a spoken dialogue system for literature retrieval" Proc. ESCA Tutorial and Research Workshop on Interactive Dialogue in Multi-Modal Systems. 29-32 (1999)
Keikichi Hirose and Jinsong Zhang: "Tone recognition of Chinese continuous speech using tone critical segments" Proc. European Conference on Speech Communication and Technology. 2. 879-882 (1999)
Shi-wook Lee and Keikichi Hirose: "Dynamic beam search strategy using prosodic-syntactic information" Proc. IEEE Workshop on Automatic Speech Recognition and Understanding. 189-192 (1999)
Keikichi Hirose: "Generation of dialogue speech" in Spoken Dialogue between Man and Machine, Chapter 4. Ohm. 67-80 (1998)
Other grants of HIROSE Keikichi
Pronunciation education system based on the systematization of non-mother tongue speech prosody using generation process model and speech synthesis
- Grant number: 24652115
- Fiscal year: 2012
- Funding amount: $52,500
- Project category: Grant-in-Aid for Challenging Exploratory Research
Advanced method of prosody control in statistical-based speech synthesis using generation process model of fundamental frequency contours
- Grant number: 24300068
- Fiscal year: 2012
- Funding amount: $52,500
- Project category: Grant-in-Aid for Scientific Research (B)
Expressive Multi-language Speech Synthesis Based on the Generation Process Model and Its Use for Automatic Speech Translation
- Grant number: 21300061
- Fiscal year: 2009
- Funding amount: $52,500
- Project category: Grant-in-Aid for Scientific Research (B)
Synthesis of speech in any speaking style based on corpus-based generation of prosodic features using the generation process model
- Grant number: 17300055
- Fiscal year: 2005
- Funding amount: $52,500
- Project category: Grant-in-Aid for Scientific Research (B)
High-quality Speech Synthesis based on Accurate Analysis Method and Statistical Method
- Grant number: 12480079
- Fiscal year: 2000
- Funding amount: $52,500
- Project category: Grant-in-Aid for Scientific Research (B)
Development of Spoken Dialogue System for Japanese and Chinese
- Grant number: 08558028
- Fiscal year: 1996
- Funding amount: $52,500
- Project category: Grant-in-Aid for Scientific Research (A)
Formulation of Prosodic Features of Speech and its Application to Continuous Speech Recognition
- Grant number: 06452397
- Fiscal year: 1994
- Funding amount: $52,500
- Project category: Grant-in-Aid for Scientific Research (B)
Rule-Synthesis of Spoken Sentences for the Speech Dialogue Systems
- Grant number: 03452288
- Fiscal year: 1991
- Funding amount: $52,500
- Project category: Grant-in-Aid for General Scientific Research (B)
Development of Output System of Announcing Speech with Input of Kanji-Kana Sentences
- Grant number: 01850073
- Fiscal year: 1989
- Funding amount: $52,500
- Project category: Grant-in-Aid for Developmental Scientific Research (B)
Similar overseas grants
Development of Speech Synthesis System for Controlling Speaker Identity through Text Prompts and Visual Interfaces
- Grant number: 23K20017
- Fiscal year: 2023
- Funding amount: $52,500
- Project category: Grant-in-Aid for Research Activity Start-up
Emotional Text-to-Speech Synthesis with Verbal Speech and Nonverbal Vocalizations
- Grant number: 23KJ0828
- Fiscal year: 2023
- Funding amount: $52,500
- Project category: Grant-in-Aid for JSPS Fellows
Everyday conversation speech synthesis
- Grant number: 22K12107
- Fiscal year: 2022
- Funding amount: $52,500
- Project category: Grant-in-Aid for Scientific Research (C)
Sustainably Developable Speech Synthesis Based on Continual Learning
- Grant number: 21K21305
- Fiscal year: 2021
- Funding amount: $52,500
- Project category: Grant-in-Aid for Research Activity Start-up
Exemplar-based Expressive Speech Synthesis
- Grant number: EP/V046772/1
- Fiscal year: 2021
- Funding amount: $52,500
- Project category: Research Grant
Collaborative Research: RI: Medium: Flexible Deep Speech Synthesis through Gestural Modeling
- Grant number: 2106928
- Fiscal year: 2021
- Funding amount: $52,500
- Project category: Standard Grant
Can AI Rakugoka entertain people? - Improved expressiveness of rakugo speech synthesis and automatic generation of storytelling
- Grant number: 21K19808
- Fiscal year: 2021
- Funding amount: $52,500
- Project category: Grant-in-Aid for Challenging Research (Exploratory)
Collaborative Research: RI: Medium: Flexible Deep Speech Synthesis through Gestural Modeling
- Grant number: 2106930
- Fiscal year: 2021
- Funding amount: $52,500
- Project category: Standard Grant
Language-independent, multi-modal, and data-efficient approaches for speech synthesis and translation
- Grant number: 21K11951
- Fiscal year: 2021
- Funding amount: $52,500
- Project category: Grant-in-Aid for Scientific Research (C)
Speech Synthesis based on the sense of physical and psychological distance from the user
- Grant number: 21K17784
- Fiscal year: 2021
- Funding amount: $52,500
- Project category: Grant-in-Aid for Early-Career Scientists


















