Natural Speech Technology

Basic Information

  • Grant number:
    EP/I031022/1
  • Principal investigator:
  • Amount:
    $7.946 million
  • Host institution:
  • Host institution country:
    United Kingdom
  • Project type:
    Research Grant
  • Fiscal year:
    2011
  • Funding country:
    United Kingdom
  • Start and end dates:
    2011 to (no data)
  • Project status:
    Completed

Project Summary

Humans are highly adaptable, and speech is our natural medium for informal communication. When communicating, we continuously adjust to other people, to the situation, and to the environment, using previously acquired knowledge to make this adaptation seem almost instantaneous. Humans generalise, enabling efficient communication in unfamiliar situations and rapid adaptation to new speakers or listeners. Current speech technology works well for certain controlled tasks and domains, but is far from natural, a consequence of its limited ability to acquire knowledge about people or situations, to adapt, and to generalise. This accounts for the uneasy public reaction to speech-driven systems. For example, text-to-speech synthesis can be as intelligible as human speech, but lacks expression and is not perceived as natural. Similarly, the accuracy of speech recognition systems can collapse if the acoustic environment or task domain changes, conditions which a human listener would handle easily.

Research approaches to these problems have hitherto been piecemeal, and as a result progress has been patchy. In contrast, NST will focus on the integrated theoretical development of new joint models for speech recognition and synthesis. These models will allow us to incorporate knowledge about the speakers, the environment, the communication context and awareness of the task, and will learn and adapt from real-world data in an online, unsupervised manner. This theoretical unification is already under way within the NST labs and, combined with our record of turning theory into practical state-of-the-art applications, will enable us to bring a naturalness to speech technology that is not currently attainable.

The NST programme will yield technology which (1) approaches human adaptability to new communication situations, (2) is capable of personalised communication, and (3) takes account of speaker intention and expressiveness in speech recognition and synthesis. This is an ambitious vision. Its success will be measured in terms of how the theoretical development reshapes the field over the next decade, the take-up of the software systems that we shall develop, and the impact of our exemplar interactive applications.

We shall establish a strong User Group to maximise the impact of the project, with members concerned with clinical applications as well as with more general speech technology. Members of the User Group include Toshiba, EADS Innovation Works, Cisco, Barnsley Hospital NHS Foundation Trust, and the Euan MacDonald Centre for MND Research. An important interaction with the User Group will be validating our systems on their data and tasks, which will be discussed at an annual user workshop.
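
The summary above describes models that learn and adapt from real-world data in an online, unsupervised manner. As a point of reference only, the sketch below in Python illustrates one very simple, well-known form of online, unsupervised acoustic adaptation (streaming cepstral mean normalisation); the class name OnlineCMN and the parameter values are illustrative assumptions and do not represent the joint models developed in the NST programme.

import numpy as np

class OnlineCMN:
    # Streaming cepstral mean normalisation: tracks and removes a slowly
    # varying channel/environment offset from incoming feature frames,
    # using no transcripts or speaker labels.
    def __init__(self, dim, alpha=0.995):
        self.alpha = alpha          # forgetting factor; closer to 1 adapts more slowly
        self.mean = np.zeros(dim)   # running estimate of the feature mean
        self.initialised = False

    def update(self, frame):
        # Update the running mean from each unlabelled frame as it arrives,
        # then return the normalised frame for the recogniser.
        if not self.initialised:
            self.mean = frame.astype(float).copy()
            self.initialised = True
        else:
            self.mean = self.alpha * self.mean + (1.0 - self.alpha) * frame
        return frame - self.mean

# Usage: feed feature frames (e.g. MFCC vectors) one at a time; the
# normaliser follows slow changes in the acoustic environment online.
cmn = OnlineCMN(dim=13)
for frame in np.random.randn(200, 13):   # stand-in for a stream of MFCC frames
    normalised = cmn.update(frame)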

Project Outcomes

Journal articles (10)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Multi-reference WER for evaluating ASR for languages with no orthographic rule
  • DOI:
  • Publication date:
    2015
  • Journal:
  • Impact factor:
    0
  • Authors:
    Ali A
  • Corresponding author:
    Ali A
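
The output listed directly above concerns a multi-reference word error rate (WER) for languages without standardised spelling. As a rough illustration only, and not the exact formulation in that paper, the sketch below assumes the metric scores a hypothesis against several acceptable reference spellings of the same utterance and keeps the lowest WER; the example strings are invented.

def word_edit_distance(ref_words, hyp_words):
    # Standard dynamic-programming Levenshtein distance over word sequences.
    m, n = len(ref_words), len(hyp_words)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if ref_words[i - 1] == hyp_words[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # substitution
    return d[m][n]

def multi_reference_wer(references, hypothesis):
    # Score the hypothesis against every acceptable reference spelling
    # and keep the most favourable (lowest) word error rate.
    hyp_words = hypothesis.split()
    best = float("inf")
    for ref in references:
        ref_words = ref.split()
        wer = word_edit_distance(ref_words, hyp_words) / max(len(ref_words), 1)
        best = min(best, wer)
    return best

# Two invented spellings of the same utterance; the hypothesis matches one of them.
refs = ["ya3ni ma fi mushkila", "yaani ma fee mushkila"]
print(multi_reference_wer(refs, "ya3ni ma fi mushkila"))   # 0.0
print(multi_reference_wer(refs, "yaani ma fi mushkila"))   # 0.25 (one substitution out of four words)
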
Reactive accent interpolation through an interactive map application
A system for automatic alignment of broadcast media captions using weighted finite-state transducers
Synthesis and evaluation of conversational characteristics in HMM-based speech synthesis
  • DOI:
    10.1016/j.specom.2011.08.001
  • Publication date:
    2012-02
  • Journal:
  • Impact factor:
    0
  • Authors:
    Sebastian Andersson;J. Yamagishi;R. Clark
  • Corresponding author:
    Sebastian Andersson;J. Yamagishi;R. Clark
A flexible front-end for HTS
  • DOI:
    10.21437/interspeech.2014-320
  • Publication date:
    2014
  • Journal:
  • Impact factor:
    0
  • Authors:
    M. Aylett;R. Dall;Arnab Ghoshal;G. Henter;Thomas Merritt
  • Corresponding author:
    M. Aylett;R. Dall;Arnab Ghoshal;G. Henter;Thomas Merritt

Other publications by Steve Renals

Are extractive text summarisation techniques portable to broadcast news?
Linear regression based on the variational Bayesian method for HMM-based speech synthesis
  • DOI:
  • Publication date:
    2012
  • Journal:
  • Impact factor:
    0
  • Authors:
    Kei Hashimoto;Junichi Yamagishi;Peter Bell;Simon King;Steve Renals;Keiichi Tokuda
  • Corresponding author:
    Keiichi Tokuda
Improved average-voice-based speech synthesis using gender-mixed modeling and a parameter generation algorithm considering GV
  • DOI:
  • Publication date:
    2007
  • Journal:
  • Impact factor:
    0
  • Authors:
    Junichi Yamagishi;Takao Kobayashi;Steve Renals;Simon King;Heiga Zen;Tomoki Toda;Keiichi Tokuda
  • Corresponding author:
    Keiichi Tokuda
Speech synthesis technology for patients with speech disorders: Voice banking and reconstruction
  • DOI:
    10.20697/jasj.67.12_587
  • Publication date:
    2011
  • Journal:
  • Impact factor:
    0
  • Authors:
    Junichi Yamagishi;Christophe Veaux;S. King;Steve Renals
  • Corresponding author:
    Steve Renals
Exploring the style-technique interaction in extractive summarization of broadcast news

Other grants held by Steve Renals

SpeechWave
  • Grant number:
    EP/R012180/1
  • Fiscal year:
    2018
  • Funding amount:
    $7.946 million
  • Project type:
    Research Grant
Ultrax2020: Ultrasound Technology for Optimising the Treatment of Speech Disorders.
  • Grant number:
    EP/P02338X/1
  • Fiscal year:
    2017
  • Funding amount:
    $7.946 million
  • Project type:
    Research Grant
Ultrax: Real-time tongue tracking for speech therapy using ultrasound
  • Grant number:
    EP/I027696/1
  • Fiscal year:
    2011
  • Funding amount:
    $7.946 million
  • Project type:
    Research Grant
MultiMemoHome: Multimodal Reminders Within the Home
  • Grant number:
    EP/G060614/1
  • Fiscal year:
    2009
  • Funding amount:
    $7.946 million
  • Project type:
    Research Grant
Data-driven articulatory modelling: foundations for a new generation of speech synthesis
  • Grant number:
    EP/E027741/1
  • Fiscal year:
    2006
  • Funding amount:
    $7.946 million
  • Project type:
    Research Grant

Similar Overseas Grants

Disrupter or enabler? Assessing the impact of using automatic speech recognition technology in interpreter-mediated legal proceedings
  • Grant number:
    2889440
  • Fiscal year:
    2023
  • Funding amount:
    $7.946 million
  • Project type:
    Studentship
Establishment of intraoperative education model using speech recognition and language information processing technology
  • Grant number:
    23K16281
  • Fiscal year:
    2023
  • Funding amount:
    $7.946 million
  • Project type:
    Grant-in-Aid for Early-Career Scientists
Collaborative Research: Improving Speech Technology for Better Learning Outcomes: The Case of AAE Child Speakers
  • Grant number:
    2202049
  • Fiscal year:
    2022
  • Funding amount:
    $7.946 million
  • Project type:
    Standard Grant
Collaborative Research: Improving speech technology for better learning outcomes: the case of AAE child speakers
  • Grant number:
    2202467
  • Fiscal year:
    2022
  • Funding amount:
    $7.946 million
  • Project type:
    Standard Grant
Collaborative Research: Improving Speech Technology for Better Learning Outcomes: The Case of AAE Child Speakers
  • Grant number:
    2202474
  • Fiscal year:
    2022
  • Funding amount:
    $7.946 million
  • Project type:
    Standard Grant
SBIR Phase II: Software Technology for Improved Perception of Speech/Audio to Self Personalize Hearing Aids/Devices
  • Grant number:
    2154649
  • Fiscal year:
    2022
  • Funding amount:
    $7.946 million
  • Project type:
    Cooperative Agreement
Collaborative Research: Improving speech technology for better learning outcomes: the case of AAE child speakers
  • Grant number:
    2202585
  • Fiscal year:
    2022
  • Funding amount:
    $7.946 million
  • Project type:
    Standard Grant
Speech recognition technology for language documentation: a case study on Sakhalin Ainu
  • Grant number:
    22K17952
  • Fiscal year:
    2022
  • Funding amount:
    $7.946 million
  • Project type:
    Grant-in-Aid for Early-Career Scientists
The effects of telepractice technology on dysarthric speech evaluation
  • Grant number:
    10383726
  • Fiscal year:
    2021
  • Funding amount:
    $7.946 million
  • Project type:
The effects of telepractice technology on dysarthric speech evaluation
  • Grant number:
    10196408
  • Fiscal year:
    2021
  • Funding amount:
    $7.946 million
  • Project type: