SBIR Phase II: VocaliD - Infusing Unique Vocal Identities into Synthesized Speech

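Basic information

Award number: 1555608
Principal investigator: Rupal Patel
Amount: approx. $747,800
Awardee organization: VOCALID INC
Awardee country: United States
Award type: Standard Grant
Fiscal year: 2016
Funding country: United States
Period: 2016-04-01 to 2020-09-30
Status: Completed

Source: https://www.nsf.gov/awardsearch/showAward?AWD_ID=1555608&HistoricalAwards=false
Keywords: SBIR Phase II VocaliD Infusing

Project abstract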

The broader impact/commercial potential of this Small Business Innovation Research (SBIR) Phase II project is to offer custom crafted digital voices for text-to-speech applications. Each one of us has a unique voiceprint - an essential part of our self-identity. Though the quality of text-to-speech technology has improved, voice options remain limited. For the 2.5 million Americans (and tens of millions worldwide) living with voicelessness who rely on devices to talk, access to a custom digital voice is a game changer. It's the difference between a functional solution and being heard, uniquely, as oneself. Enhanced opportunities for social connection increase quality of life, independence, and access to educational and vocational resources that can narrow the gap between those with and without disability. This immediate unmet societal need, coupled with the increasing proliferation of devices that speak to us and for us, creates a compelling, timely and significant commercial opportunity for high quality, personalized digital voices that can be produced at scale. By leveraging the company's crowdsourced human voicebank and proprietary voice matching and blending algorithms the technology has the potential to empower everyone to express themselves through their own voice.

This Small Business Innovation Research Phase II project builds on the company's NSF-funded research and Phase I results that support feasibility and commercialization of a customized voice building technology. The text-to-speech market, encompassing assistive technologies, enterprise and consumer applications, is currently valued at around $1B and is rapidly growing and ripe for innovation. To create custom voices, the company leverages the source-filter theory of speech production. From those who are unable or unwilling to record several hours of speech the company extracts a brief vocal sample - even a single vowel contains enough 'vocal DNA' to seed the personalization process. Identity cues of the source are then combined with filter properties of a demographically and acoustically matched donor in the company's voicebank. The result is a voice that captures the vocal identity of the recipient but the clarity of the donor. Phase II technical objectives address the need for 1) customer-driven voice customization, 2) quality assurance of crowdsourced recordings, 3) voice aging algorithms, and 4) targeted donor recruitment algorithms. These advances will help secure the assistive technology beachhead and spur innovations for broader applications such as virtual reality, personal robotics, and digital persona for the Internet of Things.
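
The source-filter idea mentioned in the abstract can be illustrated with a toy example. The sketch below is not the company's proprietary matching and blending method; it only shows the textbook model, in which a periodic excitation (the "source", here carrying a hypothetical recipient pitch) is shaped by a cascade of resonators (the "filter", here using illustrative formant values standing in for a donor's vocal tract). All function names and numeric values are assumptions made for illustration.

```python
import math


def pulse_train(f0_hz, duration_s, sample_rate):
    """Source: one unit impulse per pitch period (a crude glottal excitation)."""
    n = round(duration_s * sample_rate)
    period = round(sample_rate / f0_hz)
    return [1.0 if i % period == 0 else 0.0 for i in range(n)]


def resonator(signal, freq_hz, bandwidth_hz, sample_rate):
    """Filter: a two-pole resonator approximating a single formant."""
    r = math.exp(-math.pi * bandwidth_hz / sample_rate)  # pole radius, < 1
    theta = 2 * math.pi * freq_hz / sample_rate          # pole angle
    a1 = 2 * r * math.cos(theta)
    a2 = -r * r
    out = []
    y1 = y2 = 0.0
    for x in signal:
        y = x + a1 * y1 + a2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out


def synthesize(f0_hz, formants, duration_s=0.2, sample_rate=16000):
    """Pass the source through each formant resonator, then peak-normalize."""
    signal = pulse_train(f0_hz, duration_s, sample_rate)
    for freq_hz, bandwidth_hz in formants:
        signal = resonator(signal, freq_hz, bandwidth_hz, sample_rate)
    peak = max(abs(s) for s in signal)
    return [s / peak for s in signal]


# Hypothetical inputs: a pitch taken from the "recipient" and rough
# /a/-like formants (frequency, bandwidth in Hz) taken from a "donor".
recipient_f0 = 210.0
donor_formants = [(730.0, 90.0), (1090.0, 110.0), (2440.0, 170.0)]
voice = synthesize(recipient_f0, donor_formants)
```

Changing `recipient_f0` alters the perceived pitch while the formant list keeps the vowel quality, which is the separation of source and filter that the abstract relies on. A real system would estimate both parts from recordings rather than hard-code them.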

Project outcomes

Journal articles (0)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Other publications by Rupal Patel

Su1096 - 96-Hour Esophageal pH Monitoring: The Tiebreaker for Abnormal DeMeester Score and Symptom Index

DOI: 10.1016/s0016-5085(18)31852-3
Published: 2018
Journal: Gastroenterology
Impact factor: 29.4
Authors: Rupal Patel; Ambuj Kumar; J. Jacobs; Soojong Chae; J. Richter
Corresponding author: J. Richter

Mental health and the Gujarati communities: a case study of Leicester

DOI: not listed
Published: 2018
Journal: not listed
Impact factor: 0
Authors: Rupal Patel
Corresponding author: Rupal Patel

Dialect identification, intelligibility ratings, and acceptability ratings of dysarthric speech in two American English dialects.

DOI: 10.1080/02699206.2023.2301337
Published: 2024
Journal: Clinical Linguistics & Phonetics
Impact factor: 1.2
Authors: Jacqueline S. Laures; Caitlin Ray Rogers; H. Griffey; Kenneth G Rice; S. Russell; Michael Frankel; Rupal Patel
Corresponding author: Rupal Patel

Clinical Features, Manometry, Timed Barium Esophagram, and Treatment Outcomes of Patients With Functional and Anatomic Esophagogastric Junction Outflow Obstruction

DOI: not listed
Published: not listed
Journal: not listed
Impact factor: 0
Authors: S. Clayton; Rupal Patel; J. Richter
Corresponding author: J. Richter

The Future Present

DOI: 10.1044/leader.ftr1.18012013.36
Published: 2013
Journal: The ASHA Leader
Impact factor: 0
Authors: J. Stone; E. Rubel; R. Hillman; M. Cutter; Shannon C. Mauszycki; R. Shannon; J. Fridriksson; B. M. Law; N. Dronkers; Rupal Patel; E. Haacke
Corresponding author: E. Haacke