Generating Personalized Synthetic Speech for Progressive Dysarthria Using Severity-Appropriate Adaptation Strategies for Neural Text-to-Speech and Voice Conversion
Basic information
- Grant number: 10525903
- Principal investigator:
- Award amount: approx. $226,300
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2022
- Funding country: United States
- Project period: 2022-07-01 to 2024-06-30
- Project status: Completed
- Source:
- Keywords: Address; Age; American; Amyotrophic Lateral Sclerosis; Area; Artificial Intelligence; Auditory; Augmentative and Alternative Communication; Biometry; Body part; Caregivers; Characteristics; Climacteric; Communication; Communication impairment; Complex; Computer software; Data; Deterioration; Development; Devices; Dysarthria; Etiology; Exhibits; Eye; Fingers; Generations; Goals; Head; Impairment; Individual; Judgment; Lead; Learning; Mainstreaming; Methods; National Institute on Deafness and Other Communication Disorders; Nerve Degeneration; Neurodegenerative Disorders; Neurologic Effect; Outcome; Output; Partner Communications; Patients; Performance; Persons; Plant Roots; Population; Quality of life; Research; Sampling; Science; Self-Help Devices; Severities; Shapes; Social Identification; Social Interaction; Source; Specialist; Speech; Speech Disorders; Speech Intelligibility; Stroke; Symptoms; System; Techniques; Technology; Text; Training; Trauma; United States National Institutes of Health; Voice; Withdrawal; base; communication device; computer science; deep learning; deep neural network; experience; improved; innovation; mobile computing; nervous system disorder; novel; oral communication; preservation; relating to nervous system; sex; software systems; sound; speech synthesis; vector
PROJECT SUMMARY
More than 2 million Americans have a complex communication disorder that impairs their ability to talk. The loss
of speech is among the most debilitating effects of neurological diseases such as amyotrophic lateral sclerosis (ALS),
in which 95% of patients progressively lose their ability to speak and become trapped in a state of isolation. Communication
devices with electronic voice output allow patients to augment or replace verbal communication as their speech
deteriorates. The text (alphabet, messages) available on these devices is accessed directly using functioning
body parts (fingers, head, eyes), and the selected text is converted to speech through text-to-speech (TTS)
technology. Electronic TTS voices available on current devices have limited options in terms of age, sex, and/or
dialect, which diminishes the experience of a genuine discourse because neither the user nor their
communication partner can relate to the device voice. Voice is an integral part of a person’s identity, and without
a voice that captures this identity, users tend to withdraw from interactions, greatly reducing their quality of life
and leading to low acceptance of the technology. Personalized TTS voice options are a critical need for the ALS
population in order for them to be able to communicate freely in the face of major life changes.
The long-term goal of this research is software-based, high-performance personalized speech synthesis that can
be used on mobile platforms and commercial speech devices by people with communication disorders. Our
short-term goal is to investigate innovative methods that leverage state-of-the-art, end-to-end neural TTS to
generate intelligible, natural, and personalized synthetic speech for people who already exhibit speech loss from
ALS. Neural TTS significantly outperforms previous generations of TTS technology and has lowered
the barrier to developing high-quality TTS systems. While it is clearly desirable to use neural TTS, the need for large
quantities of high-quality speech data prohibits training such a system directly for those with ALS. We address
this problem through our two specific aims in this exploratory project: (i) adapt neural TTS output by using voice
conversion to personalize TTS voice options for ALS and (ii) adapt neural TTS input features and network
parameters to personalize TTS voice options for ALS. Our methods for both aims will preserve TTS speech
intelligibility and naturalness while enhancing voice similarity, by using modest amounts of speech data from
persons with ALS.
Our adapted neural TTS system is expected to generate personalized synthetic speech that has the voice
characteristics of individual ALS users along with intelligibility and naturalness to promote communication and
listening comfort. The project goals align with NIH-NIDCD’s priority area related to “Advancing Research in Novel
Augmentative and Alternative Communication (AAC) Approaches”. The project outcomes are expected to
provide a significant number of people who have communication disorders from varying etiologies (ALS, stroke,
trauma) with personalized vocal expression and social identity.
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants by Mili Kuruvilla-Dugdale
Generating Personalized Synthetic Speech for Progressive Dysarthria Using Severity-Appropriate Adaptation Strategies for Neural Text-to-Speech and Voice Conversion
- Grant number: 10656540
- Fiscal year: 2022
- Award amount: approx. $226,300
- Project category:
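Similar grants from the National Natural Science Foundation of China (NSFC)
Targeted delivery of carbon monoxide to regulate the AGE-RAGE cascade and promote diabetic wound healing
- Grant number: JCZRQN202500010
- Approval year: 2025
- Award amount: CNY 0
- Project category: Provincial/municipal project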
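p-Coumaric acid inhibits the AGE-RAGE-Ang-1 pathway to ameliorate impaired hippocampal angiogenesis and exert anti-Alzheimer's disease effects
- Grant number: 2025JJ70209
- Approval year: 2025
- Award amount: CNY 0
- Project category: Provincial/municipal project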
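Role and molecular mechanism of the AGE-RAGE pathway in regulating fibrosis progression in chronic pancreatitis
- Grant number:
- Approval year: 2024
- Award amount: CNY 0
- Project category: General Program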
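Sweet tea (tiancha) inhibits the AGE-RAGE pathway, enhances synaptic plasticity, and improves depression-like behavior in mice
- Grant number: 2023JJ50274
- Approval year: 2023
- Award amount: CNY 0
- Project category: Provincial/municipal project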
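The basic formula of the Mongolian medicine Eerdun-Wurile regulates the AGE-RAGE signaling pathway to improve postoperative cognitive dysfunction
- Grant number:
- Approval year: 2022
- Award amount: CNY 330,000
- Project category: Regional Science Fund Project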
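Role and mechanism of the Bushen Jianpi Quyu formula in regulating the AGE/RAGE signaling pathway in impaired bone marrow mesenchymal stem cell function in aplastic anemia
- Grant number:
- Approval year: 2022
- Award amount: CNY 520,000
- Project category: General Program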
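Regulatory role and mechanism of lncRNA GAS5 on genes of the AGE-RAGE signaling pathway in type 2 diabetic atherosclerosis
- Grant number: n/a
- Approval year: 2022
- Award amount: CNY 100,000
- Project category: Provincial/municipal project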
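Building a probe-omics approach around the GLP1-Arginine-AGE/RAGE axis to explore how Da Chaihu decoction treats different diseases with the same therapy
- Grant number: 81973577
- Approval year: 2019
- Award amount: CNY 550,000
- Project category: General Program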
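Association of polymorphisms in microRNA-encoding genes of the AGE/RAGE pathway with type 2 diabetes complicated by coronary heart disease
- Grant number: 81602908
- Approval year: 2016
- Award amount: CNY 180,000
- Project category: Young Scientists Fund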
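Mechanism by which hyperglycemia activates the synovial AGE-RAGE-PKC axis and increases susceptibility to osteoarthritis
- Grant number: 81501928
- Approval year: 2015
- Award amount: CNY 180,000
- Project category: Young Scientists Fund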
Similar overseas grants
Queer and Environmental Melancholia in American Coming-of-age Fiction: Narratives of Loss and Resistance in the Anthropocene
- Grant number: 2883761
- Fiscal year: 2023
- Award amount: approx. $226,300
- Project category: Studentship
The Representations of "Nature" by 19th Century American Women Poets: Perspectives in the Age of "War
- Grant number: 22K00434
- Fiscal year: 2022
- Award amount: approx. $226,300
- Project category: Grant-in-Aid for Scientific Research (C)
Representations of Waste People in the New World: American National Identity in the Age of the Nation-State and Beyond
- Grant number: 22K00491
- Fiscal year: 2022
- Award amount: approx. $226,300
- Project category: Grant-in-Aid for Scientific Research (C)
The Work of Art in the Age of Empathy: Analyzing American and Soviet Culture during the Interwar Period
- Grant number: 20J40040
- Fiscal year: 2020
- Award amount: approx. $226,300
- Project category: Grant-in-Aid for JSPS Fellows
The American Public Broadcasting in the Internet Age: How they adopt the System, Mission, and Regulations during the IT Revolution?
- Grant number: 20K13715
- Fiscal year: 2020
- Award amount: approx. $226,300
- Project category: Grant-in-Aid for Early-Career Scientists
Latin American Antiracism in a 'Post-Racial' Age
- Grant number: ES/N012747/1
- Fiscal year: 2017
- Award amount: approx. $226,300
- Project category: Research Grant
The Philosophy of May Massee, an Editor who Brought about the Golden Age of American Picture Books
- Grant number: 16K02512
- Fiscal year: 2016
- Award amount: approx. $226,300
- Project category: Grant-in-Aid for Scientific Research (C)
Spaces of Education: Pedagogical Writing and Social Practice in the Age of American Romanticism
- Grant number: 323813051
- Fiscal year: 2016
- Award amount: approx. $226,300
- Project category: Research Grants
Collaborative Research: American Innovations in an Age of Discovery: Teaching Science and Engineering through 3D-printed Historical Reconstructions
- Grant number: 1510289
- Fiscal year: 2015
- Award amount: approx. $226,300
- Project category: Continuing Grant
Collaborative Research: American Innovations in an Age of Discovery: Teaching Science and Engineering through 3D-printed Historical Reconstructions
- Grant number: 1511155
- Fiscal year: 2015
- Award amount: approx. $226,300
- Project category: Continuing Grant