Computational Aspects of Discourse Analysis
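Basic information
- Grant number: RGPIN-2014-05540
- Principal investigator:
- Amount: $23,300
- Host institution:
- Host institution country: Canada
- Project category: Discovery Grants Program - Individual
- Fiscal year: 2016
- Funding country: Canada
- Period: 2016-01-01 to 2017-12-31
- Project status: Completed
- Source:
- Keywords:
Project abstract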
MOTIVATION
"Writing an NSERC DG application takes time... Jane wrote hers in two months."
In a coherent text, textual units are not understood in isolation but in relation to each other, through discourse relations that may or may not be explicitly marked. The fact that Jane wrote her application in two months "illustrates" that writing an NSERC DG application takes a long time! Research on discourse analysis tries to model such relations, which allows us to interpret the text and understand the communicative purpose of its units. This, in turn, is useful for many Natural Language Processing applications such as automatic summarization, text simplification...
LONG TERM OBJECTIVE
Discourse coherence refers to the logical connections between textual units (for example, "illustration", "result", "purpose"...). Although much work has been done in recent years, computational discourse analysis is still in its infancy and many important questions remain open. The long term objective of this research program is to explore computational aspects of automatic discourse analysis.
SCIENTIFIC APPROACH
This research will explore 3 specific questions:
1) The effect of genre on discourse tagging: Here, we will explore how the textual genre (e.g., political, procedural, scientific...) affects the use of discourse relations. There are significant differences in the linguistic realization of relations across domains and genres that need to be captured and modeled. The types of questions we will address include: What is the influence of the textual genre on the usage of relations and the choice of discourse markers? What is the interaction between global discourse structures and local relations? Can we identify stereotypical patterns of local relations in particular textual genres? This work will identify and measure the correlation between linguistic features (lexical, contextual, syntactic and semantic information) and textual genres; these correlations will be used to tailor discourse taggers to a specific genre.
2) Unsupervised discourse tagging across languages: Here we will investigate how the usage and linguistic realization of relations vary across languages. We will perform a cross-lingual comparison of discourse relations in English and French, and focus on two questions: Is the usage of discourse relations language-independent? How do discourse relations align across languages? This work will result in a large cross-lingual resource, which we will make available to the research community and which will be used to induce discourse parsers in one language from parsers in another language.
3) The use of discourse analysis for text simplification: Here we will explore how discourse analysis can be used to simplify texts automatically, making them accessible to a wider audience regardless of language skills. In particular, we will address two questions: Can we achieve text simplification by signaling implicit relations more explicitly? Can we achieve text simplification by pruning less informative text spans?
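As a rough illustration of question 1, a genre comparison of this kind could start from simple counts of explicit discourse markers. The sketch below is not the project's method: the marker lexicon `MARKERS`, the relation labels, the function `relation_profile`, and the two toy texts are all assumptions invented for this example.

```python
import re
from collections import Counter

# Hypothetical marker lexicon: a few explicit connectives, each mapped to
# one relation label. A real lexicon would be larger and ambiguous.
MARKERS = {
    "because": "cause",
    "for example": "illustration",
    "however": "contrast",
    "so that": "purpose",
    "therefore": "result",
}

def relation_profile(text):
    """Return the relative frequency of each relation signalled by an explicit marker."""
    lowered = text.lower()
    counts = Counter()
    for marker, relation in MARKERS.items():
        # Whole-word match so a marker is not found inside a longer word.
        pattern = r"\b" + re.escape(marker) + r"\b"
        counts[relation] += len(re.findall(pattern, lowered))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {rel: n / total for rel, n in counts.items() if n}

# Toy documents labelled with a genre (invented for this sketch).
corpus = {
    "procedural": "Stir the mix so that it thickens. Cover it because it dries fast.",
    "scientific": "The effect was small. However, it was stable; therefore we kept it.",
}

profiles = {genre: relation_profile(text) for genre, text in corpus.items()}
for genre, profile in profiles.items():
    print(genre, profile)
```

Counting surface markers only captures explicitly marked relations; implicit relations, which the abstract also mentions, would need a trained classifier rather than a lexicon lookup.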
SIGNIFICANCE OF THE WORK
Given the current state of the art in Natural Language Processing research, work on computational aspects of discourse analysis has now become possible and is attracting growing interest in the research community. Addressing this issue will allow us to go beyond understanding the literal meaning of a text and interpret its "deep" meaning or communicative intention. Concretely, our work can be beneficial to all sectors of the Canadian language industry: speech processing, machine-aided translation, content management, language e-learning... With all these potential applications, I am confident that our work can easily be transferred to the Canadian language industry.
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants by Kosseim, Leila
Computational Discourse Analysis
- Grant number: RGPIN-2020-05542
- Fiscal year: 2022
- Amount: $23,300
- Project category: Discovery Grants Program - Individual
Computational Discourse Analysis
- Grant number: RGPIN-2020-05542
- Fiscal year: 2021
- Amount: $23,300
- Project category: Discovery Grants Program - Individual
Computational Discourse Analysis
- Grant number: RGPIN-2020-05542
- Fiscal year: 2020
- Amount: $23,300
- Project category: Discovery Grants Program - Individual
Computational Aspects of Discourse Analysis
- Grant number: RGPIN-2014-05540
- Fiscal year: 2019
- Amount: $23,300
- Project category: Discovery Grants Program - Individual
Computational Aspects of Discourse Analysis
- Grant number: RGPIN-2014-05540
- Fiscal year: 2017
- Amount: $23,300
- Project category: Discovery Grants Program - Individual
Using semantic similarity to improve the automatic mining of known attack patterns from security-related events
- Grant number: 500825-2016
- Fiscal year: 2016
- Amount: $23,300
- Project category: Engage Grants Program
Computational Aspects of Discourse Analysis
- Grant number: RGPIN-2014-05540
- Fiscal year: 2015
- Amount: $23,300
- Project category: Discovery Grants Program - Individual
Computational Aspects of Discourse Analysis
- Grant number: RGPIN-2014-05540
- Fiscal year: 2014
- Amount: $23,300
- Project category: Discovery Grants Program - Individual
Answering opinion questions from blogs
- Grant number: 222852-2008
- Fiscal year: 2013
- Amount: $23,300
- Project category: Discovery Grants Program - Individual
Answering opinion questions from blogs
- Grant number: 222852-2008
- Fiscal year: 2012
- Amount: $23,300
- Project category: Discovery Grants Program - Individual
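Similar grants from the National Natural Science Foundation of China
Research on dependability- and security-oriented Aspects modeling and integrated development methods for component-based software
- Grant number: 60503032
- Approval year: 2005
- Amount: 230,000 yuan
- Project category: Young Scientists Fund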
Similar overseas grants
Aspects of the Discourse on Women in Late 18th Century Britain: Utilitarian Feminism.
- Grant number: 19K01570
- Fiscal year: 2019
- Amount: $23,300
- Project category: Grant-in-Aid for Scientific Research (C)
Computational Aspects of Discourse Analysis
- Grant number: RGPIN-2014-05540
- Fiscal year: 2019
- Amount: $23,300
- Project category: Discovery Grants Program - Individual
Computational Aspects of Discourse Analysis
- Grant number: RGPIN-2014-05540
- Fiscal year: 2017
- Amount: $23,300
- Project category: Discovery Grants Program - Individual
Social and Pedagogical Aspects of Mindfulness Training, a discourse analysis
- Grant number: 1946246
- Fiscal year: 2017
- Amount: $23,300
- Project category: Studentship
CAREER: Characterizing Critical Aspects of Mathematics Classroom Discourse
- Grant number: 1649979
- Fiscal year: 2016
- Amount: $23,300
- Project category: Continuing Grant
Prague and Dublin: discourse spaces on ghosts and media. Aspects of "translation" between cultures
- Grant number: 16H03398
- Fiscal year: 2016
- Amount: $23,300
- Project category: Grant-in-Aid for Scientific Research (B)
Computational Aspects of Discourse Analysis
- Grant number: RGPIN-2014-05540
- Fiscal year: 2015
- Amount: $23,300
- Project category: Discovery Grants Program - Individual
Computational Aspects of Discourse Analysis
- Grant number: RGPIN-2014-05540
- Fiscal year: 2014
- Amount: $23,300
- Project category: Discovery Grants Program - Individual
CAREER: Characterizing Critical Aspects of Mathematics Classroom Discourse
- Grant number: 1149313
- Fiscal year: 2012
- Amount: $23,300
- Project category: Continuing Grant
CAREER: Characterizing Critical Aspects of Mathematics Classroom Discourse
- Grant number: 1265677
- Fiscal year: 2012
- Amount: $23,300
- Project category: Continuing Grant