User Adaptation of AAC Device Voices


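Basic Information

  • Grant number:
    7219057
  • Principal investigator:
  • Amount:
    $150,100
  • Host institution:
  • Host institution country:
    United States
  • Project category:
  • Fiscal year:
    2007
  • Funding country:
    United States
  • Project period:
    2007-01-01 to 2008-06-30
  • Project status:
    Completed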

Project Abstract

DESCRIPTION (provided by applicant): A wide range of individuals cannot communicate by voice. Voice-enabled Augmentative and Alternative Communication (AAC) devices are often the only channel through which these individuals can communicate. While many voice-enabled AAC devices are currently available, they lack the important ability to generate customized speech that mimics aspects of the user's past or intermittently available speech. Modern "concatenative" speech synthesis technology can mimic a given speaker's voice by excising speech fragments from a recorded speech database ("acoustic inventory") and recombining them into output speech using sophisticated algorithms. It requires, however, a large amount of recordings and highly consistent pronunciation from the speaker. Many AAC users cannot meet these requirements because they have already lost the capability to speak or cannot speak with adequate consistency of pronunciation.

A new type of technology, voice transformation (VT), can transform speech spoken by a "source" speaker into speech that is perceived as spoken by a specific "target" speaker. To tune the transformation system, parallel "training recordings" of the same text are needed from the source and target speakers. The amount of training recordings required is far less than what is needed for a high-quality acoustic inventory. We propose to use VT in combination with speech synthesis to convert the synthesis system's acoustic inventory into an acoustic inventory that mimics the target speaker's voice. The training recordings can consist of old home videos, or fragmented recordings produced during periods of intact speech, provided that they contain at least one sample of each phoneme.

In Phase I, we will develop and evaluate a VT-based synthesis system. The project will use high-quality and home-video-quality recordings from male and female adults and children to create limited acoustic inventories (adequate to generate a specific set of test sentences) and VT training recordings. Perceptual experiments will be conducted to evaluate voice quality and perceived speaker identity. Phase II will focus on developing complete acoustic inventories for several canonical speakers, selected to cover a range of speaker characteristics, and on producing portable, user-friendly software. The anticipated commercial offering consists of (i) software components to be licensed to AAC vendors and (ii) a service consisting of collection and processing of recordings and creation of personalized acoustic inventories.

Speech communication ability is impaired or absent in millions of Americans due to neurological disorders and diseases and to trauma, including autism, Parkinson's disease, and stroke. Augmentative and Alternative Communication (AAC) devices that are operated via switches, keyboards, and a broad range of other input devices, and that have synthetic speech as output, are often the only means by which these individuals can communicate. Without AAC devices, these individuals may suffer from severe social and psychological isolation, and may be unable to lead productive lives. A psychologically important feature that no currently available system has is the ability to speak with the user's voice, i.e., the ability to produce speech that mimics the individual's pre-morbid speech or speech that the individual may be able to produce intermittently. The proposed project will use voice transformation (VT) technology to accomplish this goal. VT technology requires recordings of the user to be available, but there is substantial flexibility as to the nature and quantity of these recordings; they may consist of home videos or of fragmentary speech, provided that at least some samples are available of each speech sound in the language.

The goal of the application is to develop a synthetic voice for an AAC system that sounds like the individual using the system (before they lost the ability to speak), without requiring very much recorded data from the original talker. The system works by first creating a synthetic "base" voice (or set of base voices) using professional actors who must provide a fairly large inventory of speech data. Using the base voice and a small sample from the target talker (i.e., containing at least one instance of each phoneme), a new synthetic voice is created by essentially modulating parameters in the base voice so that it takes on characteristics of the target talker. The ability to create a voice that sounds like the original talker without much data from the original talker would be a significant advantage.
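
The adaptation step described in the abstract (tune a mapping on a small set of source and target training recordings, then apply it to the base voice's whole acoustic inventory) can be illustrated with a minimal sketch. The abstract does not say which mapping the project uses, so this example substitutes a very simple one, per-dimension mean and variance matching, purely for illustration. All function names, the two-dimensional feature frames, and the unit labels are hypothetical. The first helper checks the abstract's stated requirement that the target recordings contain at least one sample of each phoneme.

```python
from statistics import mean, pstdev


def missing_phonemes(training_labels, phoneme_set):
    """Return the phonemes that have no sample in the target's training recordings."""
    return sorted(set(phoneme_set) - set(training_labels))


def fit_mean_variance_map(source_frames, target_frames):
    """Fit one (scale, offset) pair per feature dimension.

    Each pair maps a source value x to scale * x + offset so that the mapped
    source data has the same mean and standard deviation as the target data.
    This is an assumed stand-in for the project's actual VT mapping.
    """
    params = []
    for dim in range(len(source_frames[0])):
        src = [frame[dim] for frame in source_frames]
        tgt = [frame[dim] for frame in target_frames]
        src_std = pstdev(src)
        # Fall back to a pure shift when the source dimension is constant.
        scale = pstdev(tgt) / src_std if src_std > 0 else 1.0
        params.append((scale, mean(tgt) - scale * mean(src)))
    return params


def transform_inventory(inventory, params):
    """Apply the fitted map to every frame of every unit in the inventory."""
    return {
        unit: [[a * x + b for x, (a, b) in zip(frame, params)] for frame in frames]
        for unit, frames in inventory.items()
    }


# Toy data (hypothetical): two-dimensional feature frames.
source_training = [[1.0, 10.0], [3.0, 20.0]]
target_training = [[2.0, 5.0], [6.0, 15.0]]
base_inventory = {"a-b": [[2.0, 15.0]], "b-a": [[1.0, 10.0], [3.0, 20.0]]}

gaps = missing_phonemes(["a", "b", "a"], ["a", "b", "k"])
params = fit_mean_variance_map(source_training, target_training)
personalized = transform_inventory(base_inventory, params)
print(gaps)
print(personalized)
```

The mapping is fitted once on the small training set and then reused for every unit in the much larger inventory, which is why the target speaker needs to supply far fewer recordings than a full inventory would require. A practical VT system would likely use richer, frame-aligned mappings than this global statistic match.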

Project Outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other Grants by Jan van Santen

Voice Transformation for Dysarthria - Phase I
  • Grant number:
    7162050
  • Fiscal year:
    2006
  • Funding amount:
    $150,100
  • Project category:

Similar Overseas Grants

Developing a Young Adult-Mediated Intervention to Increase Colorectal Cancer Screening among Rural Screening Age-Eligible Adults
  • Grant number:
    10653464
  • Fiscal year:
    2023
  • Funding amount:
    $150,100
  • Project category:
Doctoral Dissertation Research: Estimating adult age-at-death from the pelvis
  • Grant number:
    2316108
  • Fiscal year:
    2023
  • Funding amount:
    $150,100
  • Project category:
    Standard Grant
Determining age dependent factors driving COVID-19 disease severity using experimental human paediatric and adult models of SARS-CoV-2 infection
  • Grant number:
    BB/V006738/1
  • Fiscal year:
    2020
  • Funding amount:
    $150,100
  • Project category:
    Research Grant
Transplantation of Adult, Tissue-Specific RPE Stem Cells for Non-exudative Age-related macular degeneration (AMD)
  • Grant number:
    10294664
  • Fiscal year:
    2020
  • Funding amount:
    $150,100
  • Project category:
Sex differences in the effect of age on episodic memory-related brain function across the adult lifespan
  • Grant number:
    422882
  • Fiscal year:
    2019
  • Funding amount:
    $150,100
  • Project category:
    Operating Grants
Modelling Age- and Sex-related Changes in Gait Coordination Strategies in a Healthy Adult Population Using Principal Component Analysis
  • Grant number:
    430871
  • Fiscal year:
    2019
  • Funding amount:
    $150,100
  • Project category:
    Studentship Programs
Transplantation of Adult, Tissue-Specific RPE Stem Cells as Therapy for Non-exudative Age-Related Macular Degeneration AMD
  • Grant number:
    9811094
  • Fiscal year:
    2019
  • Funding amount:
    $150,100
  • Project category:
Study of pathogenic mechanism of age-dependent chromosome translocation in adult acute lymphoblastic leukemia
  • Grant number:
    18K16103
  • Fiscal year:
    2018
  • Funding amount:
    $150,100
  • Project category:
    Grant-in-Aid for Early-Career Scientists
Doctoral Dissertation Research: Literacy Effects on Language Acquisition and Sentence Processing in Adult L1 and School-Age Heritage Speakers of Spanish
  • Grant number:
    1823881
  • Fiscal year:
    2018
  • Funding amount:
    $150,100
  • Project category:
    Standard Grant
Adult Age-differences in Auditory Selective Attention: The Interplay of Norepinephrine and Rhythmic Neural Activity
  • Grant number:
    369385245
  • Fiscal year:
    2017
  • Funding amount:
    $150,100
  • Project category:
    Research Grants