Discriminative Phrase-Based Statistical Machine Translation
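Basic information
- Grant number: EP/D074959/1
- Principal investigator:
- Amount: $343,100
- Host institution:
- Host institution country: United Kingdom
- Project category: Research Grant
- Fiscal year: 2007
- Funding country: United Kingdom
- Period: 2007 to (no data)
- Project status: Completed
- Source:
- Keywords: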
Project abstract
Statistical Machine Translation (SMT) has improved greatly over the last decade. A striking property of these systems is that they make minimal use of linguistic knowledge about translation. All knowledge about how to translate sentences is gathered in a data-driven manner from parallel corpora (sentences paired with their translations).
In tandem with this observation, projecting ahead, we can see that the volume of parallel corpora available for training will not increase at a substantial rate. This suggests that further progress in SMT will come from better modelling of the existing data, which means bringing linguistics to the translation problem.
For some languages, linguistic constraints are easily obtained. For other languages, this information is less widely available. We intend to see whether translation can be improved even when using impoverished knowledge sources.
To carry out this integration successfully, we need a flexible framework. We shall extend an existing approach (which yields state-of-the-art results) using techniques from discriminative machine learning (maximum entropy). These approaches will not only allow us to integrate linguistics into the translation process easily, but should also allow us to improve upon the state of the art simply through better modelling. Better modelling brings serious scaling problems, which we have experience in tackling.
The language pairs we shall investigate will include German-English, Arabic-English and Chinese-English.
Finally, we shall compete in international Machine Translation evaluation exercises. This will involve automatic and manual evaluation of our translation quality, and will allow comparison of our approaches with those of other groups.
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants by M Osborne
ReDites: Real Time, Detection, Tracking, Monitoring and Interpretation of Events in Social Media
- Grant number: EP/L010690/1
- Fiscal year: 2013
- Funding amount: $343,100
- Project category: Research Grant
CROSS: Real-time Story Detection Across Multiple Massive Streams
- Grant number: EP/J020664/1
- Fiscal year: 2012
- Funding amount: $343,100
- Project category: Research Grant
Similar overseas grants
Proposal of a New Index to Measure L2 Learners' English Producing Skills: Through an Analysis of the Noun Phrase Development Process
- Grant number: 23K00705
- Fiscal year: 2023
- Funding amount: $343,100
- Project category: Grant-in-Aid for Scientific Research (C)
A study of nominal phrase structure in Russian: with a focus on event nominals
- Grant number: 23K12155
- Fiscal year: 2023
- Funding amount: $343,100
- Project category: Grant-in-Aid for Early-Career Scientists
COLLABORATIVE RESEARCH: Social-Emotional Analysis of the Language Environment (SEAL): Key Word & Phrase Spotting in Early Childhood Care Settings
- Grant number: 2234916
- Fiscal year: 2023
- Funding amount: $343,100
- Project category: Standard Grant
COLLABORATIVE RESEARCH: Social-Emotional Analysis of the Language Environment (SEAL): Key Word & Phrase Spotting in Early Childhood Care Settings
- Grant number: 2235041
- Fiscal year: 2023
- Funding amount: $343,100
- Project category: Standard Grant
'Prisoner of Christ': What did this phrase mean to Paul and other early Christians, and what might it mean today?
- Grant number: 2710979
- Fiscal year: 2022
- Funding amount: $343,100
- Project category: Studentship
Compiling a JFS word and phrase list using corpora
- Grant number: 21K00620
- Fiscal year: 2021
- Funding amount: $343,100
- Project category: Grant-in-Aid for Scientific Research (C)
Doctoral Dissertation Research: Subjacency, the Empty Category Principle (ECP), and the nature of constraints on phrase movement
- Grant number: 2116270
- Fiscal year: 2021
- Funding amount: $343,100
- Project category: Standard Grant
A Comparative Syntactic Study of Noun Phrase Structure and Agreement Phenomena
- Grant number: 20K00679
- Fiscal year: 2020
- Funding amount: $343,100
- Project category: Grant-in-Aid for Scientific Research (C)
Development of Japanese phrase final intonation's teaching guideline for Korean learners
- Grant number: 20K13074
- Fiscal year: 2020
- Funding amount: $343,100
- Project category: Grant-in-Aid for Early-Career Scientists
Semantic Studies of French Noun Phrase Expressions and Their Application to Contrastive Studies
- Grant number: 20K00569
- Fiscal year: 2020
- Funding amount: $343,100
- Project category: Grant-in-Aid for Scientific Research (C)