Hybrid Speech Synthesis for Voice Output Communication Aids
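Basic information
- Grant number: 7156322
- Principal investigator:
- Amount: $373,500
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2004
- Funding country: United States
- Project period: 2004-04-01 to 2008-07-31
- Project status: Completed
- Source:
- Keywords:
Project abstract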
DESCRIPTION (provided by applicant): NovaSpeech proposes to develop an innovative perceptually-oriented hybrid approach to unconstrained speech synthesis for generating individualized, customized voices of either gender and any age. The system will provide human-sounding, intelligible, and mimetic speech, yet have small storage requirements, be able to support the cost-efficient addition of new voices, and be suitable for implementation on virtually any hardware platform. As a result, the technology will be well-suited to virtually any unlimited vocabulary synthesis application, but be of special benefit to speech-impaired individuals, who have a particularly great need for natural-sounding, individualized voices on a broad range of devices. With the hybrid system, individuals who know they will lose their voice due to illness or surgery will be able to cost-efficiently capture and utilize their pre-injury voice in a voice output communication aid; and all speech-impaired users will be able to obtain reliable, appropriate, individualized voices that can grow with them as they mature and age. No existing synthesis approach meets these needs, with each type of technology trading off one desirable property for another, be it low storage requirements for natural voice quality, or human voice quality for flexibility. The hybrid approach overcomes these limitations by integrating, in a novel and principled way, the best features of two well-known synthesis techniques: corpus-based waveform concatenation and rule-based formant synthesis. Capitalizing on a number of important perceptual principles, the system will prestore only a small number of intrinsic units, such as stressed vowels, from the target speaker, and synthesize other, adaptable units by rule. Thus with only a small prestored speech corpus, and a common set of rules across voices, it will produce speech that sounds like the intended speaker. 
In its proposed Phase II project, NovaSpeech will develop a complete hybrid prototype text-to-speech (TTS) system for eight voices in General American English: six base voices (male and female children, adults, and elderly adults) and the voices of two speakers who know they will lose their ability to speak naturally as a result of future laryngectomies. Year 1 will focus on exploring possible system architectures, implementing rules for adaptable units, and exploring through perceptual experiments possible strategies for storing and selecting intrinsic units. Year 2 will focus on implementing a fully functional hybrid TTS prototype for the six base voices. By month six of year 2 at the latest, the company will verify the ability to quickly add new voices by implementing the voices of the laryngectomy patients, providing them with functional systems for their voices, and obtaining feedback from them and those who know them about the quality of the voices and system features. The ultimate objective of the hybrid project is to improve the naturalness and mimetic quality of speech synthesized from unrestricted symbolic input, with the particular goal of enhancing the utility and flexibility of voice output communication aids for speech-impaired individuals.
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (1)
Other grants by SUSAN R HERTZ
Expressive Speech Synthesis for Speech-Generating Devices
- Grant number: 8903390
- Fiscal year: 2015
- Funding amount: $373,500
- Project category:
Hybrid Speech Synthesis for Voice Output Communication Aids
- Grant number: 7271981
- Fiscal year: 2004
- Funding amount: $373,500
- Project category:
Hybrid Synthesis For Voice Output Communication Aids
- Grant number: 6790229
- Fiscal year: 2004
- Funding amount: $373,500
- Project category:
OPTIMIZATION OF SPEECH SYNTHESIS SOFTWARE FOR VOCAL COMM
- Grant number: 3494754
- Fiscal year: 1991
- Funding amount: $373,500
- Project category:
CUSTOMIZED SYNTHETIC VOICES FOR SPEECH-IMPAIRED PERSONS
- Grant number: 2125980
- Fiscal year: 1990
- Funding amount: $373,500
- Project category:
CUSTOMIZED SYNTHETIC VOICES FOR SPEECH-IMPAIRED PERSONS
- Grant number: 3507166
- Fiscal year: 1990
- Funding amount: $373,500
- Project category:
CUSTOMIZED SYNTHETIC VOICES FOR SPEECH-IMPAIRED PERSONS
- Grant number: 3494678
- Fiscal year: 1990
- Funding amount: $373,500
- Project category: