Exemplar-based Expressive Speech Synthesis

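Basic information

  • Grant number:
    EP/V046772/1
  • Principal investigator:
  • Amount:
    $ 278,100
  • Host institution:
  • Host institution country:
    United Kingdom
  • Project type:
    Research Grant
  • Fiscal year:
    2021
  • Funding country:
    United Kingdom
  • Start and end dates:
    2021 to no data
  • Project status:
    Completed

Project abstract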

Synthetic voices are becoming ubiquitous: 'smart' speakers at home, announcement systems on public transport, and voice-enabled assistants on call lines. There is strong public demand for 'smarter' assistants capable of laughing at our jokes; interacting with our children as encouraging and empathetic tutors; calling to check up on our parents; providing a reassuring 'ear' for an isolated person; and offering calming and supportive virtual therapy. To support current and future applications, voice synthesis technology needs to satisfy a number of requirements. First, it needs to be customisable for rapid research and development; second, it needs to be able to produce any spoken content, including expressive voice characteristics. However, none of the current synthesis technologies can satisfy all of these requirements simultaneously. For instance, current non-machine-learning approaches allow pre-recorded phrases to be combined efficiently into complete sentences, but any missing phrases must be recorded first, which limits their flexibility and efficiency. Current machine learning models, on the other hand, can seamlessly synthesise any spoken content, but creating such models is a very costly, time-consuming and computationally demanding process. Furthermore, these models offer very limited control over voice characteristics and lack interpretability, both of which are highly desirable in research and commercial settings.

The objective of this project is to develop computationally efficient, customisable, expressive and interpretable speech synthesis by drawing on the concept of 'exemplars' from cognitive science.

In cognitive science, the notions of 'exemplars' and 'prototypes' form part of a prominent view of how humans categorise concepts. In particular, exemplar theory argues that singular examples, rather than prototypes (an average of examples), form the basic building blocks of how we understand and interact with the world. The key argument in favour of exemplar theory is our ability as humans to solve complex tasks based on just a few examples, which makes the theory appealing for applications that involve complex phenomena or require high computational efficiency. Furthermore, expressive speech synthesis combines expressivity and speech production, two complex phenomena that remain poorly understood. Unlike prototype theory, exemplar theory makes it possible, at least theoretically, to produce expressive speech provided that at least one recording of the desired spoken content and one recording featuring the desired expressivity are available. Lastly, adopting exemplar theory promotes transparency in the decision-making process through the use of real examples that can be inspected, modified, replaced, added, etc. within the task.

The objective will be achieved through three innovative means: i) formulating a methodological framework for exemplar-based speech synthesis; ii) building an exemplar-based representation of speech expressivity from pre-recorded examples; and iii) presenting a novel methodology for integrating this expressivity-based representation into the framework of i).

Project outputs

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other publications by Anton Ragni

Adapting Pretrained Models for Adult to Child Voice Conversion
Training Data Augmentation for Dysarthric Automatic Speech Recognition by Text-to-Dysarthric-Speech Synthesis
  • DOI:
  • Year published:
    2024
  • Journal:
  • Impact factor:
    0
  • Authors:
    Wing;Mattias Cross;Anton Ragni;Stefan Goetze
  • Corresponding author:
    Stefan Goetze
Non-Intrusive Speech Intelligibility Prediction for Hearing-Impaired Users using Intermediate ASR Features and Human Memory Models
  • DOI:
    10.48550/arxiv.2401.13611
  • Year published:
    2024
  • Journal:
  • Impact factor:
    0
  • Authors:
    Rhiannon Mogridge;George Close;Robert Sutherland;Thomas Hain;Jon Barker;Stefan Goetze;Anton Ragni
  • Corresponding author:
    Anton Ragni

Similar NSFC grants

Data-driven Recommendation System Construction of an Online Medical Platform Based on the Fusion of Information
  • Grant number:
  • Year approved:
    2024
  • Funding amount:
    CNY (amount not listed)
  • Project type:
    Research Fund for International Young Scientists
Exploring the Intrinsic Mechanisms of CEO Turnover and Market Reaction: An Explanation Based on Information Asymmetry
  • Grant number:
    W2433169
  • Year approved:
    2024
  • Funding amount:
    CNY (amount not listed)
  • Project type:
    Research Fund for International Scientists
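In situ dynamic study of the nucleation and growth mechanism of TCP phases in advanced Re- and Ru-containing nickel-based single-crystal superalloys
  • Grant number:
    52301178
  • Year approved:
    2023
  • Funding amount:
    CNY 300,000
  • Project type:
    Young Scientists Fund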
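Effect of chemical inhomogeneity on irradiation behaviour in NbZrTi-based multi-principal-element alloys
  • Grant number:
    12305290
  • Year approved:
    2023
  • Funding amount:
    CNY 300,000
  • Project type:
    Young Scientists Fund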
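Population-based epidemiological study of how the ocular surface microbiota affects the occurrence of dry eye in diabetic patients
  • Grant number:
    82371110
  • Year approved:
    2023
  • Funding amount:
    CNY 490,000
  • Project type:
    General Program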
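Evolution mechanism of irradiation-induced dislocation loops in the nickel-based UNS N10003 alloy and its effect on mechanical properties
  • Grant number:
    12375280
  • Year approved:
    2023
  • Funding amount:
    CNY 530,000
  • Project type:
    General Program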
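Structural characteristics and structure-property relationships of CuAgSe-based thermoelectric materials
  • Grant number:
    22375214
  • Year approved:
    2023
  • Funding amount:
    CNY 500,000
  • Project type:
    General Program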
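Big-data-based quantitative study of the impact of urbanisation on seasonal influenza transmission in China and its mechanisms
  • Grant number:
    82003509
  • Year approved:
    2020
  • Funding amount:
    CNY 240,000
  • Project type:
    Young Scientists Fund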

Similar overseas grants

Next-Generation Expressive Personalized Voices for Speech-Generating Devices
  • Grant number:
    10547241
  • Fiscal year:
    2022
  • Funding amount:
    $ 278,100
  • Project type:
Embedding Expressive Functions into Soft Objects and Designing Interactions Based on EHD Fluid Control
  • Grant number:
    21H04882
  • Fiscal year:
    2021
  • Funding amount:
    $ 278,100
  • Project type:
    Grant-in-Aid for Scientific Research (A)
Writing to Heal: A Culturally Based Brief Expressive Writing Intervention for Chinese Immigrant Breast Cancer Survivors
  • Grant number:
    10375527
  • Fiscal year:
    2021
  • Funding amount:
    $ 278,100
  • Project type:
Writing to Heal: A Culturally Based Brief Expressive Writing Intervention for Chinese Immigrant Breast Cancer Survivors
  • Grant number:
    10675433
  • Fiscal year:
    2021
  • Funding amount:
    $ 278,100
  • Project type:
Philosophical Study on the Appropriate Forms of Punishment Based on the Expressive Theory of Retribution
  • Grant number:
    19K12932
  • Fiscal year:
    2019
  • Funding amount:
    $ 278,100
  • Project type:
    Grant-in-Aid for Early-Career Scientists
A Study of Narrative Techniques for Representing Reality/Fictitiousness: Based on an Empirical Analysis of Faulkner's Expressive Use of Punctuation Signs
  • Grant number:
    17K02578
  • Fiscal year:
    2017
  • Funding amount:
    $ 278,100
  • Project type:
    Grant-in-Aid for Scientific Research (C)
Development of an educational program of expressive writing for emotional awareness in elementary school: Based on integration of psychology and Japanese philology
  • Grant number:
    16K17318
  • Fiscal year:
    2016
  • Funding amount:
    $ 278,100
  • Project type:
    Grant-in-Aid for Young Scientists (B)
Efficient statistical parsing and decoding for expressive grammar formalisms based on tree automata
  • Grant number:
    252303250
  • Fiscal year:
    2014
  • Funding amount:
    $ 278,100
  • Project type:
    Research Grants
Psychological Internal State Estimation based on Expressive Tempos and Rhythms of Facial Expressions
  • Grant number:
    25330325
  • Fiscal year:
    2013
  • Funding amount:
    $ 278,100
  • Project type:
    Grant-in-Aid for Scientific Research (C)
ExODA: Integrating Description Logics and Database Technologies for Expressive Ontology-Based Data Access
  • Grant number:
    EP/H051511/1
  • Fiscal year:
    2011
  • Funding amount:
    $ 278,100
  • Project type:
    Research Grant