RI: Small: Low-Latency and High-Quality Simultaneous Translation
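Basic information
- Award number: 2009071
- Principal investigator:
- Amount: $450,000
- Host institution:
- Host institution country: United States
- Project type: Standard Grant
- Fiscal year: 2020
- Funding country: United States
- Period: 2020-08-15 to 2024-07-31
- Project status: Completed
- Source:
- Keywords:
Project abstract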
Simultaneous language translation (interpretation) is widely used in many settings, including multilateral organizations such as the United Nations, international summits and conferences, and legal proceedings. However, concurrent perception and production in two languages makes this task extremely challenging and exhausting for humans. The number of professional simultaneous interpreters worldwide is extremely limited, and they have to work in groups of two or three, where each interpreter can sustain the work for only about 15-30 minutes. There is therefore a critical need to develop simultaneous translation techniques that reduce the burden on human interpreters and make this service more accessible and affordable. Simultaneous translation is also notoriously difficult for machines, and accomplishing it consistently and reliably is considered one of the holy grails of Artificial Intelligence. Various methods have been proposed to solve this problem, but they share three major limitations: (a) their translation model is still a full-sentence translation model; (b) they cannot achieve short latencies such as the "3-second delay" common in human interpretation; and (c) their systems are complicated and difficult to train. This project therefore aims to develop new algorithms, techniques, and datasets for high-quality simultaneous machine translation with minimum delay (low latency). The technologies developed by this project will make simultaneous translation more affordable and accessible, which will improve the efficiency of human communication across linguistic barriers. This project also supports STEM education of underrepresented minorities (who do not speak English natively) by recruiting them in machine translation studies.
Based on the principal investigator's successful prior work, the key idea in this project is to discard the conventional full-sentence translation paradigm and the classical sequence-to-sequence framework, which process the full input sentence before starting to translate and are thus ill-suited to simultaneous translation. Instead, this project adopts a "prefix-to-prefix" framework, which starts translating after processing only a few input words, mimicking human interpreters. Though extremely simple, this framework achieves low latency and high translation quality. Using this framework, this project aims to (1) develop an algorithm to detect and fix anticipation mistakes on the fly, and explore new evaluation metrics that work for translations with revisions; (2) develop dynamic and flexible translation strategies to balance quality and latency; (3) construct better training data for simultaneous translation by revising the reference translations in a parallel text to remove unnecessary reorderings; and (4) apply the prefix-to-prefix framework to incremental text-to-speech synthesis (TTS), thus completing the end-to-end simultaneous speech-to-speech pipeline, improving its quality and latency, and comparing it with human simultaneous interpreters.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
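As an illustration of the "prefix-to-prefix" idea in the abstract (start translating after only a few input words, then keep emitting output as more input arrives), the following is a minimal sketch, not the project's actual system. It assumes the simplest possible schedule: wait for a fixed number of source words, then emit one target word per newly read source word. The names `prefix_to_prefix`, `toy_step`, `lag`, `max_len`, and `EOS` are hypothetical, and the "model" is a stand-in that merely copies and upper-cases source words.

```python
from typing import Callable, Iterable, Iterator, List, Tuple

EOS = "</s>"  # hypothetical end-of-sentence marker

# Hypothetical model interface: given the source prefix read so far and the
# target prefix emitted so far, return the next target word (or EOS).
StepFn = Callable[[List[str], List[str]], str]


def prefix_to_prefix(
    source: Iterable[str],
    step: StepFn,
    lag: int = 3,
    max_len: int = 200,
) -> Iterator[Tuple[str, int]]:
    """Yield (target_word, source_words_read) pairs while the source arrives.

    Assumed fixed-lag schedule: wait for `lag` source words, then emit one
    target word for every further source word; flush the rest at the end.
    """
    src: List[str] = []
    tgt: List[str] = []
    for word in source:
        src.append(word)
        if len(src) < lag:
            continue  # still waiting for the first few source words
        out = step(src, tgt)  # the model only sees a prefix of the source
        if out == EOS:
            return
        tgt.append(out)
        yield out, len(src)
    # The source has ended: finish the sentence with the full input visible.
    while len(tgt) < max_len:
        out = step(src, tgt)
        if out == EOS:
            return
        tgt.append(out)
        yield out, len(src)


def toy_step(src: List[str], tgt: List[str]) -> str:
    """Stand-in "model": copies source word i, upper-cased, as target word i."""
    return src[len(tgt)].upper() if len(tgt) < len(src) else EOS


if __name__ == "__main__":
    for target_word, words_read in prefix_to_prefix("a b c d e".split(), toy_step):
        print(target_word, "emitted after reading", words_read, "source words")
```

In this sketch the first target word appears once three source words have been read, rather than after the whole sentence, which is the latency benefit the abstract describes. A real system would replace `toy_step` with a trained translation model and, per aim (2), could vary the lag dynamically.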
Project outcomes
Journal articles (6)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Improving Simultaneous Translation by Incorporating Pseudo-References with Fewer Reorderings
- DOI: 10.18653/v1/2021.emnlp-main.473
- Published: 2021
- Journal:
- Impact factor: 0
- Authors: Chen, Junkun; Zheng, Renjie; Kita, Atsuhito; Ma, Mingbo; Huang, Liang
- Corresponding author: Huang, Liang
Direct Simultaneous Speech-to-Text Translation Assisted by Synchronized Streaming ASR
- DOI: 10.18653/v1/2021.findings-acl.406
- Published: 2021-06
- Journal:
- Impact factor: 0
- Authors: Junkun Chen; Mingbo Ma; Renjie Zheng; Liang Huang
- Corresponding authors: Junkun Chen; Mingbo Ma; Renjie Zheng; Liang Huang
Other publications by Liang Huang
Gaussian orthogonal ensemble statistics in graphene billiards with the shape of classically integrable billiards
- DOI: 10.1103/physreve.94.062214
- Published: 2016
- Journal:
- Impact factor: 0
- Authors: Pei Yu; Zi-Yuan Li; Hong-Ya Xu; Liang Huang; Barbara Dietz; Celso Grebogi; Ying-Cheng Lai
- Corresponding author: Ying-Cheng Lai
Safety evaluation method of tubing strings in high-pressure, high-temperature and high-yield gas wells based on FIV analysis
- DOI: 10.1016/j.engfailanal.2020.105044
- Published: 2020-10
- Journal:
- Impact factor: 4
- Authors: Xiaoqiang Guo; Jun Liu; Liming Dai; Liang Huang; Anchao Wei; Dake Fang; Linlin Zeng
- Corresponding author: Linlin Zeng
Inhomogeneous deformation behaviors of oblique hole-flanging parts during electromagnetic forming
- DOI: 10.1016/j.jmapro.2019.12.047
- Published: 2020-04
- Journal:
- Impact factor: 6.2
- Authors: Hongliang Su; Liang Huang; Jianjun Li; Fei Ma; Huijuan Ma; Pan Huang; Hui Zhu; Fei Feng
- Corresponding author: Fei Feng
Semantic model of ship behaviour based on ontology engineering
- DOI: 10.1049/joe.2018.8329
- Published: 2018-10
- Journal:
- Impact factor: 0
- Authors: Yimeng Zhang; Yuanqiao Wen; Fan Zhang; Chunhui Zhou; Lei Du; Liang Huang; Changshi Xiao
- Corresponding author: Changshi Xiao
A Review of the Application of Steel Slag in CO2 Fixation
- DOI: 10.1002/cben.202000021
- Published: 2021
- Journal:
- Impact factor: 4.8
- Authors: Junya Wang; Mi Zhong; Pengfei Wu; Shikun Wen; Liang Huang; Ping Ning
- Corresponding author: Ping Ning
Other grants by Liang Huang
MFB: Better Homologous Folding using Computational Linguistics and Deep Learning
- Award number: 2330737
- Fiscal year: 2024
- Amount: $450,000
- Project type: Standard Grant
RI: Small: Fast and Accurate Natural Language Parsing and Generation by Marrying Deep Learning with Dynamic Programming
- Award number: 1817231
- Fiscal year: 2018
- Amount: $450,000
- Project type: Continuing Grant
EAGER: Collaborative Research: Scaling Up Discriminative Learning for Natural Language Understanding and Translation
- Award number: 1656051
- Fiscal year: 2015
- Amount: $450,000
- Project type: Standard Grant
EAGER: Collaborative Research: Scaling Up Discriminative Learning for Natural Language Understanding and Translation
- Award number: 1449278
- Fiscal year: 2014
- Amount: $450,000
- Project type: Standard Grant
SBIR Phase II: Amphiphilic Copolymers as Thickening Agents for Personal Care Products
- Award number: 1430647
- Fiscal year: 2014
- Amount: $450,000
- Project type: Standard Grant
SBIR Phase I: Amphiphilic Copolymers as Thickening Agents for Personal Care Products
- Award number: 1248253
- Fiscal year: 2013
- Amount: $450,000
- Project type: Standard Grant
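Similar grants from the National Natural Science Foundation of China
Forensic application of circadian small RNAs to estimating the time of bloodstain formation
- Award number:
- Approval year: 2024
- Amount: CNY 0
- Project type: Provincial/municipal project
Mechanism by which tRNA-derived small RNA upregulates the YBX1/CCL5 pathway in bortezomib-induced chronic pain
- Award number: n/a
- Approval year: 2022
- Amount: CNY 100,000
- Project type: Provincial/municipal project
Small RNA regulation of the type I-F CRISPR-Cas adaptive immune response and its molecular mechanism
- Award number: 32000033
- Approval year: 2020
- Amount: CNY 240,000
- Project type: Young Scientists Fund project
Mechanisms by which small RNAs regulate the biocontrol function of Bacillus amyloliquefaciens FZB42
- Award number: 31972324
- Approval year: 2019
- Amount: CNY 580,000
- Project type: General Program
Mechanism by which Streptococcus mutans small RNAs link LuxS quorum sensing to biofilm formation
- Award number: 81900988
- Approval year: 2019
- Amount: CNY 210,000
- Project type: Young Scientists Fund project
Molecular mechanism of pigeon milk secretion analyzed by small RNA sequencing
- Award number: 31802058
- Approval year: 2018
- Amount: CNY 260,000
- Project type: Young Scientists Fund project
Function and mechanism of key gut bacterial small RNAs in the onset and progression of Crohn's disease
- Award number: 31870821
- Approval year: 2018
- Amount: CNY 560,000
- Project type: General Program
Pathogenic mechanism of rice grassy stunt virus regulated by small RNA-mediated DNA methylation
- Award number: 31772128
- Approval year: 2017
- Amount: CNY 600,000
- Project type: General Program
Immune regulation mechanism of acupuncture treatment for Hashimoto's thyroiditis based on small RNA-seq
- Award number: 81704176
- Approval year: 2017
- Amount: CNY 200,000
- Project type: Young Scientists Fund project
Regulation of small RNA biogenesis by rice OsSGS3 and OsHEN1 and its effect on disease resistance
- Award number: 91640114
- Approval year: 2016
- Amount: CNY 850,000
- Project type: Major Research Plan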
Similar overseas grants
CIF: Small: Learning Low-Dimensional Representations with Heteroscedastic Data Sources
- Award number: 2331590
- Fiscal year: 2024
- Amount: $450,000
- Project type: Standard Grant
A small steps, low-literacy, breakfast-focused dietary self-management intervention for adults with poorly controlled type 2 diabetes
- Award number: 10417553
- Fiscal year: 2023
- Amount: $450,000
- Project type:
NeTS: Small: Low Latency Uplink Communications in Low Earth Orbit (LEO) Satellite Networks with Chirp Permutation Multiple Access (CPMA)
- Award number: 2312113
- Fiscal year: 2023
- Amount: $450,000
- Project type: Standard Grant
AF: Small: Low-Degree Methods for Optimization in Random Structures. Power and Limitations
- Award number: 2233897
- Fiscal year: 2023
- Amount: $450,000
- Project type: Standard Grant
SHF: Small: Efficient, Deterministic and Formally Certified Methods for Solving Low-dimensional Linear Programs with Floating-point Precision
- Award number: 2312220
- Fiscal year: 2023
- Amount: $450,000
- Project type: Standard Grant
Low FODMAP food in irritable bowel syndrome and the involvement of small bowel bacterial overgrowth
- Award number: 23K10824
- Fiscal year: 2023
- Amount: $450,000
- Project type: Grant-in-Aid for Scientific Research (C)
SHF: Small: Rethinking Virtualization at the Edge to Support Highly-efficient and Low-power Applications
- Award number: 2210744
- Fiscal year: 2022
- Amount: $450,000
- Project type: Standard Grant
Design of future low Earth orbit small satellites by combining aerodynamic force and solar radiation pressure
- Award number: 22J13958
- Fiscal year: 2022
- Amount: $450,000
- Project type: Grant-in-Aid for JSPS Fellows
SBIR Phase II: Internal Combustion Engines as Small Scale Chemical Plants for Compact, Low Cost Gas-to-Liquids Systems to Reduce Methane Flaring
- Award number: 2136751
- Fiscal year: 2022
- Amount: $450,000
- Project type: Cooperative Agreement
Collaborative Research: SHF: Small: Exploiting Performance Correlations for Accurate and Low-cost Performance Testing for Serverless Computing
- Award number: 2155096
- Fiscal year: 2022
- Amount: $450,000
- Project type: Standard Grant