Natural Speech Technology

Basic Information

  • Grant number:
    EP/I031022/1
  • Principal investigator:
  • Amount:
    $7.946 million
  • Host institution:
  • Host institution country:
    United Kingdom
  • Project type:
    Research Grant
  • Fiscal year:
    2011
  • Funding country:
    United Kingdom
  • Start and end dates:
    2011 to (no data)
  • Project status:
    Completed

Project Summary

Humans are highly adaptable, and speech is our natural medium for informal communication. When communicating, we continuously adjust to other people, to the situation, and to the environment, using previously acquired knowledge to make this adaptation seem almost instantaneous. Humans generalise, enabling efficient communication in unfamiliar situations and rapid adaptation to new speakers or listeners. Current speech technology works well for certain controlled tasks and domains, but is far from natural, a consequence of its limited ability to acquire knowledge about people or situations, to adapt, and to generalise. This accounts for the uneasy public reaction to speech-driven systems. For example, text-to-speech synthesis can be as intelligible as human speech, but lacks expression and is not perceived as natural. Similarly, the accuracy of speech recognition systems can collapse if the acoustic environment or task domain changes, conditions which a human listener would handle easily.

Research approaches to these problems have hitherto been piecemeal, and as a result progress has been patchy. In contrast, NST will focus on the integrated theoretical development of new joint models for speech recognition and synthesis. These models will allow us to incorporate knowledge about the speakers, the environment, the communication context and awareness of the task, and will learn and adapt from real-world data in an online, unsupervised manner. This theoretical unification is already under way within the NST labs and, combined with our record of turning theory into practical state-of-the-art applications, will enable us to bring a naturalness to speech technology that is not currently attainable.

The NST programme will yield technology which (1) approaches human adaptability to new communication situations, (2) is capable of personalised communication, and (3) takes account of speaker intention and expressiveness in speech recognition and synthesis. This is an ambitious vision. Its success will be measured in terms of how the theoretical development reshapes the field over the next decade, the take-up of the software systems that we shall develop, and the impact of our exemplar interactive applications.

We shall establish a strong User Group to maximise the impact of the project, with members concerned with clinical applications as well as with more general speech technology. Members of the User Group include Toshiba, EADS Innovation Works, Cisco, Barnsley Hospital NHS Foundation Trust, and the Euan MacDonald Centre for MND Research. An important interaction with the User Group will be validating our systems on their data and tasks, which will be discussed at an annual user workshop.
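
The summary above describes models that learn and adapt from real-world data in an online, unsupervised manner. As a point of reference only, the sketch below in Python illustrates one very simple, well-known form of online, unsupervised acoustic adaptation (streaming cepstral mean normalisation); the class name OnlineCMN and the parameter values are illustrative assumptions and do not represent the joint models developed in the NST programme.

import numpy as np

class OnlineCMN:
    # Streaming cepstral mean normalisation: tracks and removes a slowly
    # varying channel/environment offset from incoming feature frames,
    # using no transcripts or speaker labels.
    def __init__(self, dim, alpha=0.995):
        self.alpha = alpha          # forgetting factor; closer to 1 adapts more slowly
        self.mean = np.zeros(dim)   # running estimate of the feature mean
        self.initialised = False

    def update(self, frame):
        # Update the running mean from each unlabelled frame as it arrives,
        # then return the normalised frame for the recogniser.
        if not self.initialised:
            self.mean = frame.astype(float).copy()
            self.initialised = True
        else:
            self.mean = self.alpha * self.mean + (1.0 - self.alpha) * frame
        return frame - self.mean

# Usage: feed feature frames (e.g. MFCC vectors) one at a time; the
# normaliser follows slow changes in the acoustic environment online.
cmn = OnlineCMN(dim=13)
for frame in np.random.randn(200, 13):   # stand-in for a stream of MFCC frames
    normalised = cmn.update(frame)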

Project Outcomes

Journal articles (10)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Multi-reference WER for evaluating ASR for languages with no orthographic rule
  • DOI:
  • Publication date:
    2015
  • Journal:
  • Impact factor:
    0
  • Authors:
    Ali A
  • Corresponding author:
    Ali A
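
The output listed directly above concerns a multi-reference word error rate (WER) for languages without standardised spelling. As a rough illustration only, and not the exact formulation in that paper, the sketch below assumes the metric scores a hypothesis against several acceptable reference spellings of the same utterance and keeps the lowest WER; the example strings are invented.

def word_edit_distance(ref_words, hyp_words):
    # Standard dynamic-programming Levenshtein distance over word sequences.
    m, n = len(ref_words), len(hyp_words)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if ref_words[i - 1] == hyp_words[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # substitution
    return d[m][n]

def multi_reference_wer(references, hypothesis):
    # Score the hypothesis against every acceptable reference spelling
    # and keep the most favourable (lowest) word error rate.
    hyp_words = hypothesis.split()
    best = float("inf")
    for ref in references:
        ref_words = ref.split()
        wer = word_edit_distance(ref_words, hyp_words) / max(len(ref_words), 1)
        best = min(best, wer)
    return best

# Two invented spellings of the same utterance; the hypothesis matches one of them.
refs = ["ya3ni ma fi mushkila", "yaani ma fee mushkila"]
print(multi_reference_wer(refs, "ya3ni ma fi mushkila"))   # 0.0
print(multi_reference_wer(refs, "yaani ma fi mushkila"))   # 0.25 (one substitution out of four words)
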
Reactive accent interpolation through an interactive map application
A system for automatic alignment of broadcast media captions using weighted finite-state transducers
Synthesis and evaluation of conversational characteristics in HMM-based speech synthesis
  • DOI:
    10.1016/j.specom.2011.08.001
  • Publication date:
    2012-02
  • Journal:
  • Impact factor:
    0
  • Authors:
    Sebastian Andersson;J. Yamagishi;R. Clark
  • Corresponding author:
    Sebastian Andersson;J. Yamagishi;R. Clark
A flexible front-end for HTS
  • DOI:
    10.21437/interspeech.2014-320
  • Publication date:
    2014
  • Journal:
  • Impact factor:
    0
  • Authors:
    M. Aylett;R. Dall;Arnab Ghoshal;G. Henter;Thomas Merritt
  • Corresponding author:
    M. Aylett;R. Dall;Arnab Ghoshal;G. Henter;Thomas Merritt

Other publications by Steve Renals

Are extractive text summarisation techniques portable to broadcast news?
Linear regression based on the variational Bayesian method for HMM-based speech synthesis
  • DOI:
  • Publication date:
    2012
  • Journal:
  • Impact factor:
    0
  • Authors:
    Kei Hashimoto;Junichi Yamagishi;Peter Bell;Simon King;Steve Renals;Keiichi Tokuda
  • Corresponding author:
    Keiichi Tokuda
Improved average-voice-based speech synthesis using gender-mixed modeling and a parameter generation algorithm considering GV
  • DOI:
  • Publication date:
    2007
  • Journal:
  • Impact factor:
    0
  • Authors:
    Junichi Yamagishi;Takao Kobayashi;Steve Renals;Simon King;Heiga Zen;Tomoki Toda;Keiichi Tokuda
  • Corresponding author:
    Keiichi Tokuda
Speech synthesis technology for patients with speech disorders: Voice banking and reconstruction
  • DOI:
    10.20697/jasj.67.12_587
  • Publication date:
    2011
  • Journal:
  • Impact factor:
    0
  • Authors:
    Junichi Yamagishi;Christophe Veaux;S. King;Steve Renals
  • Corresponding author:
    Steve Renals
Exploring the style-technique interaction in extractive summarization of broadcast news

Other grants held by Steve Renals

SpeechWave
  • Grant number:
    EP/R012180/1
  • Fiscal year:
    2018
  • Funding amount:
    $7.946 million
  • Project type:
    Research Grant
Ultrax2020: Ultrasound Technology for Optimising the Treatment of Speech Disorders.
  • Grant number:
    EP/P02338X/1
  • Fiscal year:
    2017
  • Funding amount:
    $7.946 million
  • Project type:
    Research Grant
Ultrax: Real-time tongue tracking for speech therapy using ultrasound
  • Grant number:
    EP/I027696/1
  • Fiscal year:
    2011
  • Funding amount:
    $7.946 million
  • Project type:
    Research Grant
MultiMemoHome: Multimodal Reminders Within the Home
  • Grant number:
    EP/G060614/1
  • Fiscal year:
    2009
  • Funding amount:
    $7.946 million
  • Project type:
    Research Grant
Data-driven articulatory modelling: foundations for a new generation of speech synthesis
  • Grant number:
    EP/E027741/1
  • Fiscal year:
    2006
  • Funding amount:
    $7.946 million
  • Project type:
    Research Grant

Similar Overseas Grants

Disrupter or enabler? Assessing the impact of using automatic speech recognition technology in interpreter-mediated legal proceedings
  • Grant number:
    2889440
  • Fiscal year:
    2023
  • Funding amount:
    $7.946 million
  • Project type:
    Studentship
Establishment of intraoperative education model using speech recognition and language information processing technology
  • Grant number:
    23K16281
  • Fiscal year:
    2023
  • Funding amount:
    $7.946 million
  • Project type:
    Grant-in-Aid for Early-Career Scientists
Collaborative Research: Improving Speech Technology for Better Learning Outcomes: The Case of AAE Child Speakers
  • Grant number:
    2202049
  • Fiscal year:
    2022
  • Funding amount:
    $7.946 million
  • Project type:
    Standard Grant
Collaborative Research: Improving speech technology for better learning outcomes: the case of AAE child speakers
  • Grant number:
    2202467
  • Fiscal year:
    2022
  • Funding amount:
    $7.946 million
  • Project type:
    Standard Grant
Collaborative Research: Improving Speech Technology for Better Learning Outcomes: The Case of AAE Child Speakers
  • Grant number:
    2202474
  • Fiscal year:
    2022
  • Funding amount:
    $7.946 million
  • Project type:
    Standard Grant
SBIR Phase II: Software Technology for Improved Perception of Speech/Audio to Self Personalize Hearing Aids/Devices
  • Grant number:
    2154649
  • Fiscal year:
    2022
  • Funding amount:
    $7.946 million
  • Project type:
    Cooperative Agreement
Collaborative Research: Improving speech technology for better learning outcomes: the case of AAE child speakers
  • Grant number:
    2202585
  • Fiscal year:
    2022
  • Funding amount:
    $7.946 million
  • Project type:
    Standard Grant
Speech recognition technology for language documentation: a case study on Sakhalin Ainu
  • Grant number:
    22K17952
  • Fiscal year:
    2022
  • Funding amount:
    $7.946 million
  • Project type:
    Grant-in-Aid for Early-Career Scientists
The effects of telepractice technology on dysarthric speech evaluation
  • Grant number:
    10383726
  • Fiscal year:
    2021
  • Funding amount:
    $7.946 million
  • Project type:
The effects of telepractice technology on dysarthric speech evaluation
  • Grant number:
    10196408
  • Fiscal year:
    2021
  • Funding amount:
    $7.946 million
  • Project type: