HCC: Small: Modeling Acoustic and Articulatory Features for Hybrid Synthesis

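Basic Information

Award number:
1116799
Principal investigator:
Rupal Patel
Amount:
$180,900 (approx.)
Awardee institution:
Northeastern University
Institution country:
United States
Award type:
Standard Grant
Fiscal year:
2011
Funding country:
United States
Project period:
2011-10-01 to 2014-09-30
Project status:
Completed

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=1116799&HistoricalAwards=false
Keywords:
HCC Small Modeling Acoustic Articulatory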

Project Abstract

In recent decades synthetic speech has become a ubiquitous and increasingly seamless aspect of human-machine interfaces. Although cars, microwaves, phones, and kiosks all "talk" in human-like ways, the naturalness and personality of these voices fall short of human expression. While this may not matter for many text-to-speech (TTS) applications, over two million Americans with severe speech-motor impairments require assistive communication aids with TTS output. Concatenative TTS synthesizers yield highly intelligible voices, yet many assistive devices rely on small footprint, formant synthesis that sounds robotic and has poor intelligibility. Moreover, the choice of voices on conventional devices is limited and does not reflect the user; it is not uncommon for a child to use the same voice her whole life and for her peers to share that same voice even when using different devices. This lack of attention to the individuality of synthetic voices has consequences on adoption of assistive technology as an extension of the user, and may adversely impact societal attitudes toward the user group.
In her prior work the PI began to address these issues by adapting a concatenative synthesizer constructed from acoustic recordings of a healthy talker using vocal source characteristics obtained from a target talker with speech impairment. The adapted voice was highly intelligible and conveyed the target user's identity, yet it also retained substantial elements of the healthy talker's identity due to the influence of vocal tract filter characteristics. This suggests that personalized speech synthesis may be more successful utilizing an alternative approach, in which acoustic and articulatory data from healthy talkers are combined with both source and filter characteristics from target talkers to generate an individualized voice.
In this project, the PI will develop hybrid statistical parametric synthesis techniques to model vocal tract and source characteristics of impaired talkers, with the goal of generating highly intelligible and personalized synthetic speech. The PI envisages a future where source and filter parameters of a Hidden Markov Model (HMM) based synthesizer can be adapted to model a child user's vocal tract and modified over time to "grow" with his maturing vocal system, fostering a stronger personal connection between the user and the communication device.
Broader Impacts: This project strives to make communication accessible and socially fulfilling by designing an enabling technology that blurs the line between system and user. The human voice is not merely a signal; it has an individualized and personal quality that impacts how others perceive us and how we interact with those around us. The ultimate goal of this work is to afford users of TTS the same ownership and individuality as the natural voice. Project outcomes will have broad impact both on users of assistive aids and able-bodied users of TTS technologies. The research may also lead to a novel and innovative means of assessing the nature and articulatory locus of speech impairment, by comparing model parameters to impaired productions. The interdisciplinary nature of this research will promote teaching, training and learning in computer science and in speech and hearing sciences.

Project Outcomes

Journal articles (0)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Other Publications by Rupal Patel

Su1096 - 96-Hour Esophageal PH Monitoring: The Tiebreaker for Abnormal Demeester Score and Symptom Index

DOI:
10.1016/s0016-5085(18)31852-3
Year:
2018
Journal:
Gastroenterology
Impact factor:
29.4
Authors:
Rupal Patel;Ambuj Kumar;J. Jacobs;Soojong Chae;J. Richter
Corresponding author:
J. Richter

Mental health and the Gujarati communities: a case study of Leicester

DOI:
Year:
2018
Journal:
Impact factor:
0
Authors:
Rupal Patel
Corresponding author:
Rupal Patel

Dialect identification, intelligibility ratings, and acceptability ratings of dysarthric speech in two American English dialects.

DOI:
10.1080/02699206.2023.2301337
Year:
2024
Journal:
Clinical linguistics & phonetics
Impact factor:
1.2
Authors:
Jacqueline S. Laures;Caitlin Ray Rogers;H. Griffey;Kenneth G Rice;S. Russell;Michael Frankel;Rupal Patel
Corresponding author:
Rupal Patel

Clinical Features, Manometry, Timed Barium Esophagram, and Treatment Outcomes of Patients With Functional and Anatomic Esophagogastic Junction Outflow Obstruction

DOI:
Year:
Journal:
Impact factor:
0
Authors:
S. Clayton;Rupal Patel;J. Richter
Corresponding author:
J. Richter

The Future Present

DOI:
10.1044/leader.ftr1.18012013.36
Year:
2013
Journal:
The ASHA Leader
Impact factor:
0
Authors:
J. Stone;E. Rubel;R. Hillman;M. Cutter;Shannon C. Mauszycki;R. Shannon;J. Fridriksson;B. M. Law;N. Dronkers;Rupal Patel;E. Haacke
Corresponding author:
E. Haacke