
HCC: Medium: Synthesis and Perception of Speaker Identity

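Basic Information

Award Number:
0964468
Principal Investigator:
Alexander Kain
Amount:
$914,800
Institution:
Oregon Health & Science University
Institution Country:
United States
Award Type:
Standard Grant
Fiscal Year:
2010
Funding Country:
United States
Project Period:
2010-05-15 to 2015-04-30
Project Status:
Completed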

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=0964468&HistoricalAwards=false
Keywords:
HCC Medium Synthesis Perception Speaker

Project Abstract

This proposal addresses the problem of synthesizing speaker identity when only a small training sample is available. To synthesize speaker identity from a small training corpus, the project will address problems including voice conversion methods and trainable, abstract parameterizations of the prosodic patterns that characterize a speaker. The project falls into the general category of building Text-to-Speech (TTS) synthesis systems that generate speech sounding like that of a specific individual (Speaker Identity Synthesis, or SIS). Systems of this kind have numerous applications, including the creation of personalized voices for individuals with neurodegenerative disorders who anticipate becoming users of Speech Generating Devices (SGDs) in the future, as well as many other applications in the consumer products and entertainment industries. Consumer products such as navigation systems and mobile phones that make use of linguistic information about the generated utterance are rapidly being developed. The project will also provide new tools and data for studying human perception of speaker identity. The tools developed in the process and the associated perceptual studies are also relevant to the assessment of speaker recognition systems, and the project provides a new generation of concise, trainable characterizations of a speaker's prosodic patterns that can be incorporated into these systems. The proposed study will elucidate the trade-offs and algorithmic issues of the proposed SIS systems, and the proposed work is likely to have a strong intellectual impact on the field of speech synthesis.
