Adapting a Text-to-Speech Synthesizer to Convey User Identity

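Basic Information

Award number:
0712821
Principal investigator:
Rupal Patel
Amount:
--
Institution:
Northeastern University
Institution country:
United States
Award type:
Continuing Grant
Fiscal year:
2007
Funding country:
United States
Period:
2007-07-15 to 2011-06-30
Status:
Completed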

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=0712821&HistoricalAwards=false
Keywords:
Adapting Text Speech Synthesizer Convey

Project Abstract

This project will advance computerized speech synthesis methods so that they can better approximate the unique vocal characteristics of individual human speakers. Voice quality is unique to each individual and thus is inextricable from personality, self-image, and the perceptions of others. We can alter our voice to sound like others, attract attention, project confidence, convey authority, and perform countless other functions. This flexibility of the natural voice is inconceivable using even state-of-the-art text-to-speech (TTS) synthesis. While voice quality may not matter for many text-to-speech applications, it is essential for assistive communication aids, which are meant to be an extension of the user. Over two million Americans have severe speech and motor impairments that require them to use an assistive communication aid, many of whom use TTS to speak on their behalf. The speech output options on commercially available devices are extremely limited. Moreover, the synthetic voices are not representative of the user along basic dimensions such as age, gender, rate of speech, and voice quality, thus drawing unnecessary attention and detracting from the content of the spoken message as well as impeding social integration. This project aims to harness the residual vocal control in the productions of individuals with severe speech impairment in order to adapt a text-to-speech synthesizer such that the resultant voice resembles that of the user. Conventional methods of voice morphing cannot be applied directly given the severity of the user's speech impairment. Recent empirical work suggests that children and adults with severe speech impairment retain the ability to control fundamental frequency, accent, rhythm, and speaking rate, which are among the many acoustic cues that signal speaker identity.
This research will leverage this preserved ability toward building an adaptive text-to-speech synthesizer that conveys the user's identity without degradation of intelligibility. Identity-bearing vocal cues will be elicited from children with severe speech impairment and used to adapt age- and gender-matched concatenative synthetic voices using novel voice transformation techniques. Usability tests will be conducted to assess the impact of user identity adaptation on TTS intelligibility, naturalness, and acceptability, the results of which will provide insights for iterative design of TTS adaptation. The research will have broader impact on users of assistive aids and able-bodied users of communication technologies that use speech synthesis. This project strives to make communication accessible and socially fulfilling by designing an enabling technology in which the line between system and user is blurred. The ultimate goal is to afford users of speech synthesis technology the same ownership and individuality as the natural voice. The interdisciplinary nature of this work will promote teaching, training, and learning in computer science and in speech and hearing sciences.

Project Outcomes

Journal articles (0)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Other publications by Rupal Patel

Su1096 - 96-Hour Esophageal pH Monitoring: The Tiebreaker for Abnormal DeMeester Score and Symptom Index

DOI:
10.1016/s0016-5085(18)31852-3
Published:
2018
Journal:
Gastroenterology
Impact factor:
29.4
Authors:
Rupal Patel;Ambuj Kumar;J. Jacobs;Soojong Chae;J. Richter
Corresponding author:
J. Richter

Mental health and the Gujarati communities: a case study of Leicester

DOI:
Published:
2018
Journal:
Impact factor:
0
Authors:
Rupal Patel
Corresponding author:
Rupal Patel

Dialect identification, intelligibility ratings, and acceptability ratings of dysarthric speech in two American English dialects.

DOI:
10.1080/02699206.2023.2301337
Published:
2024
Journal:
Clinical Linguistics & Phonetics
Impact factor:
1.2
Authors:
Jacqueline S. Laures;Caitlin Ray Rogers;H. Griffey;Kenneth G Rice;S. Russell;Michael Frankel;Rupal Patel
Corresponding author:
Rupal Patel

Clinical Features, Manometry, Timed Barium Esophagram, and Treatment Outcomes of Patients With Functional and Anatomic Esophagogastric Junction Outflow Obstruction

DOI:
Published:
Journal:
Impact factor:
0
Authors:
S. Clayton;Rupal Patel;J. Richter
Corresponding author:
J. Richter

The Future Present

DOI:
10.1044/leader.ftr1.18012013.36
Published:
2013
Journal:
The ASHA Leader
Impact factor:
0
Authors:
J. Stone;E. Rubel;R. Hillman;M. Cutter;Shannon C. Mauszycki;R. Shannon;J. Fridriksson;B. M. Law;N. Dronkers;Rupal Patel;E. Haacke
Corresponding author:
E. Haacke