SCRIPT: Speech Synthesis for Spoken Content Production

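Basic information

  • Grant number:
    EP/P011586/1
  • Principal investigator:
  • Amount:
    $679,500
  • Host institution:
  • Host institution country:
    United Kingdom
  • Project type:
    Research Grant
  • Fiscal year:
    2016
  • Funding country:
    United Kingdom
  • Start and end dates:
    2016 to (no data)
  • Project status:
    Completed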

Project abstract

The cost of producing dynamically-updated media content - such as online video news packages - across multiple languages is very high. Maintaining substantial teams of journalists per language is expensive and inflexible. Modern media organisations like the BBC or the Financial Times need a more agile approach: they must be able to react quickly to changing world events (e.g., breaking news or emerging markets), dynamically allocating their limited resources in response to external demands. Ideally, they would like to create 'pop-up' services and products in previously-unsupported languages, then to scale them up or down later.

The government has set the BBC a target of reaching a global audience of 500 million people by 2022, compared with today's 308 million. The only way to reach such a huge audience is through new language services and efficient production techniques. Text-to-speech - which automatically produces speech from text - offers an attractive solution to this challenge, and the BBC have identified computer assisted translation and text-to-speech as key technologies that will provide them with new ways of creating and reversioning their content across many languages.

This project's objectives are to push text-to-speech technology towards "broadcast quality" computer-generated speech (i.e., good enough for the BBC to broadcast) in many languages, and to make it cheap and easy to add more languages later. We will do this by combining and extending several distinct pieces of our previous basic research on text-to-speech. We will use the latest data-driven machine learning techniques, and extend them to produce much higher quality output speech. At the same time, we will enable the possibility of human control over the speech. This will allow the user (e.g., a BBC journalist) to adjust the speech to make sure the quality and the speaking style is right for their purposes (e.g., correcting the pronunciation of a difficult word, or putting emphasis in the right place).

The technology we will create for the likes of the BBC will also enable smaller companies and other organisations, state bodies, charities, and individuals to rapidly create high-quality spoken content, in whatever language or domain they are operating. We will work with other types of organisation during the project, to make sure that the technology we create has broad appeal and will be useful to a wide range of companies and individuals.

Project outputs

Journal articles (8)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Improving speech synthesis with discourse relations
  • DOI:
  • Publication date:
    2019
  • Journal:
  • Impact factor:
    0
  • Authors:
    Aubin A
  • Corresponding author:
    Aubin A
A Hierarchical Encoder-Decoder Model for Statistical Parametric Speech Synthesis
  • DOI:
    10.21437/interspeech.2017-628
  • Publication date:
    2017-05
  • Journal:
  • Impact factor:
    0
  • Authors:
    S. Ronanki;O. Watts;Simon King
  • Corresponding authors:
    S. Ronanki;O. Watts;Simon King
Exemplar-based speech waveform generation for text-to-speech
  • DOI:
  • Publication date:
    2018
  • Journal:
  • Impact factor:
    0
  • Authors:
    Cassia Valentini Botinhao
  • Corresponding author:
    Cassia Valentini Botinhao
Exemplar-based Speech Waveform Generation
  • DOI:
    10.21437/interspeech.2018-1857
  • Publication date:
    2018
  • Journal:
  • Impact factor:
    0
  • Authors:
    Watts O
  • Corresponding author:
    Watts O
Learning Interpretable Control Dimensions for Speech Synthesis by Using External Data
  • DOI:
    10.21437/interspeech.2018-2075
  • Publication date:
    2018-09
  • Journal:
  • Impact factor:
    0
  • Authors:
    Zack Hodari;O. Watts;S. Ronanki;Simon King
  • Corresponding authors:
    Zack Hodari;O. Watts;S. Ronanki;Simon King

Other publications by Simon King

Improved average-voice-based speech synthesis using gender-mixed modeling and a parameter generation algorithm considering GV
  • DOI:
  • Publication date:
    2007
  • Journal:
  • Impact factor:
    0
  • Authors:
    Junichi Yamagishi;Takao Kobayashi;Steve Renals;Simon King;Heiga Zen;Tomoki Toda;Keiichi Tokuda
  • Corresponding author:
    Keiichi Tokuda
Estimating the spectral envelope of voiced speech using multi-frame analysis
  • DOI:
    10.21437/eurospeech.2003-27
  • Publication date:
    2003
  • Journal:
  • Impact factor:
    0
  • Authors:
    Y. Shiga;Simon King
  • Corresponding author:
    Simon King
Using HMM-based Speech Synthesis to Reconstruct the Voice of Individuals with Degenerative Speech Disorders
  • DOI:
  • Publication date:
    2012
  • Journal:
  • Impact factor:
    0
  • Authors:
    C. Veaux;J. Yamagishi;Simon King
  • Corresponding author:
    Simon King
Mel cepstral coefficient modification based on the Glimpse Proportion measure for improving the intelligibility of HMM-generated synthetic speech in noise
Robust TTS Duration Modelling Using DNNs
  • DOI:
  • Publication date:
    2017
  • Journal:
  • Impact factor:
    0
  • Authors:
    G. Henter;S. Ronanki;O. Watts;M. Wester;Zhizheng Wu;Simon King
  • Corresponding author:
    Simon King

Other grants held by Simon King

Automatic target cost and database design for unit-selection speech synthesis
  • Grant number:
    EP/E031447/1
  • Fiscal year:
    2007
  • Funding amount:
    $679,500
  • Project type:
    Research Grant
Automatically-determined Unit Inventories for Unit Selection Text-to-Speech Synthesis
  • Grant number:
    EP/D058139/1
  • Fiscal year:
    2006
  • Funding amount:
    $679,500
  • Project type:
    Research Grant

Similar overseas grants

Development of Speech Synthesis System for Controlling Speaker Identity through Text Prompts and Visual Interfaces
  • Grant number:
    23K20017
  • Fiscal year:
    2023
  • Funding amount:
    $679,500
  • Project type:
    Grant-in-Aid for Research Activity Start-up
Emotional Text-to-Speech Synthesis with Verbal Speech and Nonverbal Vocalizations
  • Grant number:
    23KJ0828
  • Fiscal year:
    2023
  • Funding amount:
    $679,500
  • Project type:
    Grant-in-Aid for JSPS Fellows
Everyday conversation speech synthesis
  • Grant number:
    22K12107
  • Fiscal year:
    2022
  • Funding amount:
    $679,500
  • Project type:
    Grant-in-Aid for Scientific Research (C)
Sustainably Developable Speech Synthesis Based on Continual Learning
  • Grant number:
    21K21305
  • Fiscal year:
    2021
  • Funding amount:
    $679,500
  • Project type:
    Grant-in-Aid for Research Activity Start-up
Collaborative Research: RI: Medium: Flexible Deep Speech Synthesis through Gestural Modeling
  • Grant number:
    2106928
  • Fiscal year:
    2021
  • Funding amount:
    $679,500
  • Project type:
    Standard Grant
US-German Research Proposal: ADaptive low-latency SPEEch Decoding and synthesis using intracranial signals (ADSPEED)
  • Grant number:
    2011595
  • Fiscal year:
    2021
  • Funding amount:
    $679,500
  • Project type:
    Continuing Grant
Can AI Rakugoka entertain people? - Improved expressiveness of rakugo speech synthesis and automatic generation of storytelling
  • Grant number:
    21K19808
  • Fiscal year:
    2021
  • Funding amount:
    $679,500
  • Project type:
    Grant-in-Aid for Challenging Research (Exploratory)
Exemplar-based Expressive Speech Synthesis
  • Grant number:
    EP/V046772/1
  • Fiscal year:
    2021
  • Funding amount:
    $679,500
  • Project type:
    Research Grant
Collaborative Research: RI: Medium: Flexible Deep Speech Synthesis through Gestural Modeling
  • Grant number:
    2106930
  • Fiscal year:
    2021
  • Funding amount:
    $679,500
  • Project type:
    Standard Grant
Language-independent, multi-modal, and data-efficient approaches for speech synthesis and translation
  • Grant number:
    21K11951
  • Fiscal year:
    2021
  • Funding amount:
    $679,500
  • Project type:
    Grant-in-Aid for Scientific Research (C)