Data-driven articulatory modelling: foundations for a new generation of speech synthesis

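Basic information

Grant number:
EP/E027741/1
Principal investigator:
Steve Renals
Funding amount:
$365,500
Host institution:
University of Edinburgh
Host institution country:
United Kingdom
Project type:
Research Grant
Fiscal year:
2006
Funding country:
United Kingdom
Start and end dates:
2006 to (end date not available)
Project status:
Completed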

Source:
https://gtr.ukri.org/projects?ref=EP%2FE027741%2F1
Keywords:
Data driven articulatory modelling foundations

Project abstract

Technology to automatically generate artificial speech (speech synthesis) has come to sound natural enough within the past five years that its use has widened dramatically. Leaders in industry have integrated text-to-speech (TTS) systems into useful real-world applications, such as automated call-centres and call routing, telephone-based information systems (e.g. telephone banking or news services), readers for the visually impaired, and hands-free interfaces, such as car navigation systems.

However, in spite of this success, state-of-the-art TTS systems are still severely limited in terms of control. In short, we can readily control what synthesisers say, but not how they say it. Therefore, although such systems are suitable for giving factual information in speech form, they are completely inadequate where a high level of expressiveness is required. By expressiveness we mean the ability to indicate questions or emphasis on selected words, or to convey emotion. Furthermore, the process of generating new synthetic voices is costly and labour-intensive. It is the aim of this project to develop an alternative to current speech synthesis technology with a comparable level of intelligibility and naturalness, but which affords far greater flexibility and control.

Unit selection uses large collections of pre-recorded speech to perform synthesis by merely gluing together appropriate fragments in sequence. There is in effect little or no modelling of speech involved. In contrast, this project aims to develop a new model which is trained on pre-recorded speech and interprets it in a novel way: on the basis of its underlying articulation. The aim of this model is to produce synthetic speech which not only retains the qualities of the original speech used for training, but which also is much more versatile and therefore has the potential to be used in new and exciting ways.

Project outputs

Journal articles (8)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Predicting tongue shapes from a few landmark locations

DOI:
Publication date:
2008
Journal:
Impact factor:
0
Authors:
C Qin
Corresponding author:
C Qin

Preliminary inversion mapping results with a new EMA corpus

DOI:
10.21437/interspeech.2009-724
Publication date:
2009
Journal:
Impact factor:
0
Authors:
Korin Richmond
Corresponding author:
Korin Richmond

Announcing the Electromagnetic Articulography (Day 1) Subset of the mngu0 Articulatory Corpus

DOI:
10.21437/interspeech.2011-316
Publication date:
2011-08
Journal:
Impact factor:
0
Authors:
Korin Richmond;P. Hoole;Simon King
Corresponding author:
Korin Richmond;P. Hoole;Simon King

HMM-based speech synthesiser using the LF-model of the glottal source

DOI:
10.1109/icassp.2011.5947405
Publication date:
2011-05
Journal:
2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Impact factor:
0
Authors:
João P. Cabral;S. Renals;J. Yamagishi;Korin Richmond
Corresponding author:
João P. Cabral;S. Renals;J. Yamagishi;Korin Richmond

Glottal spectral separation for parametric speech synthesis

DOI:
10.21437/interspeech.2008-176
Publication date:
2008-09
Journal:
Speech Commun.
Impact factor:
0
Authors:
João P. Cabral;S. Renals;Korin Richmond;J. Yamagishi
Corresponding author:
João P. Cabral;S. Renals;Korin Richmond;J. Yamagishi


Other publications by Steve Renals

Are extractive text summarisation techniques portable to broadcast news?

DOI:
10.1109/asru.2003.1318489
Publication date:
2003
Journal:
2003 IEEE Workshop on Automatic Speech Recognition and Understanding (IEEE Cat. No.03EX721)
Impact factor:
0
Authors:
Heidi Christensen;Y. Gotoh;B. Kolluru;Steve Renals
Corresponding author:
Steve Renals