Applied computational models of discourse, argument, and text
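Basic information
- Grant number: RGPIN-2014-06020
- Principal investigator:
- Amount: $39,300
- Host institution:
- Host institution country: Canada
- Project category: Discovery Grants Program - Individual
- Fiscal year: 2014
- Funding country: Canada
- Project period: 2014-01-01 to 2015-12-31
- Project status: Completed
- Source:
- Keywords:
Project abstract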
Our research is in computational linguistics (CL), natural language processing (NLP), and their applications. Themes that run through the work are (i) computationally determining the structure of discourse and argumentation; (ii) computational notions of paraphrase and of the semantic distance between words, larger linguistic expressions, or documents; and (iii) the use of statistical classification methods in text analysis. We propose the following new research:
(1) Finding the structure and framing of discourse and arguments: Understanding a speaker's argument entails understanding not only what is presented as evidence and what as conclusion but also how the conclusion follows from the evidence, what the unstated premises (enthymemes) are, and what the implicit framing of the issue is. We will look in particular at opinionated texts in which argumentation structure is fairly explicit on the surface but determining enthymemes and framing is crucial to full understanding. For automatic text analysis, we take quantifiable semantic characteristics of the speaker's presentation of a position as indicators or proxies of the framing, which can then be interpreted qualitatively. In a simple analysis, this could be merely a statistical analysis of the key concepts of the text, as denoted by content words and significant collocations, something like a topic model. Here, however, we propose a novel, more sophisticated analysis in which we also look at the actual argumentation structures and discourse relationships of the text and how the concepts adduced by the lower-level linguistic components are used in these structures. This will draw on and extend our recent work on discourse parsing and the identification of argumentation schemes in text. Although these are difficult tasks for which the state of the art is far from perfect, we hypothesize that typical political speech contains a sufficiently well-cued discourse structure that the analyses that we can achieve, although still quite imperfect, will be usefully indicative of issue framing.
(2) Finding precedent scientific literature: Researchers often have difficulty searching for past research relevant to, or precedent to, their new or proposed research, and often resort simply to Google keyword searches, which are rarely adequate. We will develop methods for searching scientific literature that use semantic and structural relationships to find publications that are possibly relevant to a new text. We will concentrate in particular on the legacy literature of biodiversity, for which conventional keyword searches are almost invariably insufficient because in this literature, more so than in most other fields of science, related concepts are often described or explained in different terms, or in completely different conceptual frameworks, from those of contemporary research. As a result, relevant legacy publications, or even whole literatures, may remain hidden to term-based methods. This goal will not be reached in five years, but it motivates the next stage of our work because it requires bringing together many of the methods of natural language processing developed in our own and other researchers' work of the past decade or more: (a) the recognition of paraphrase and of textual entailment, and measurement of semantic similarity at the sentence level and above; (b) the automatic analysis of the structure and argumentation of scholarly papers and scientific discourse (this is a point of overlap with (1) above, but in scientific texts we expect the micro-structure to be less explicit and the macro-structure more explicit than in opinion texts).
(3) We will continue our work on automatic authorship identification; on characterizing aphasic speech; and on the philosophy of CL.
Project outputs
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Hirst, Graeme
Evaluating WordNet-based measures of lexical semantic relatedness
- DOI: 10.1162/coli.2006.32.1.13
- Published: 2006-03-01
- Journal:
- Impact factor: 9.3
- Authors: Budanitsky, Alexander; Hirst, Graeme
- Corresponding author: Hirst, Graeme
Automatically determining cause of death from verbal autopsy narratives
- DOI: 10.1186/s12911-019-0841-9
- Published: 2019-07-09
- Journal:
- Impact factor: 3.5
- Authors: Jeblee, Serena; Gomes, Mireille; Hirst, Graeme
- Corresponding author: Hirst, Graeme
Providing Care Beyond Therapy Sessions With a Natural Language Processing-Based Recommender System That Identifies Cancer Patients Who Experience Psychosocial Challenges and Provides Self-care Support: Pilot Study.
- DOI: 10.2196/35893
- Published: 2022-07-29
- Journal:
- Impact factor: 2.8
- Authors: Leung, Yvonne W.; Park, Bomi; Heo, Rachel; Adikari, Achini; Chackochan, Suja; Wong, Jiahui; Alie, Elyse; Gancarz, Mathew; Kacala, Martyna; Hirst, Graeme; de Silva, Daswin; French, Leon; Bender, Jacqueline; Mishna, Faye; Gratzer, David; Alahakoon, Damminda; Esplen, Mary Jane
- Corresponding author: Esplen, Mary Jane
Rhetorical structure and Alzheimer's disease
- DOI: 10.1080/02687038.2017.1355439
- Published: 2018-01-01
- Journal:
- Impact factor: 2
- Authors: Abdalla, Mohamed; Rudzicz, Frank; Hirst, Graeme
- Corresponding author: Hirst, Graeme
Therapist Feedback and Implications on Adoption of an Artificial Intelligence-Based Co-Facilitator for Online Cancer Support Groups: Mixed Methods Single-Arm Usability Study.
- DOI: 10.2196/40113
- Published: 2023-06-09
- Journal:
- Impact factor: 2.8
- Authors: Leung, Yvonne W.; Ng, Steve; Duan, Lauren; Lam, Claire; Chan, Kenith; Gancarz, Mathew; Rennie, Heather; Trachtenberg, Lianne; Chan, Kai P.; Adikari, Achini; Fang, Lin; Gratzer, David; Hirst, Graeme; Wong, Jiahui; Esplen, Mary Jane
- Corresponding author: Esplen, Mary Jane
Other grants held by Hirst, Graeme
Applied computational models of discourse, argument, and text
- Grant number: RGPIN-2014-06020
- Fiscal year: 2019
- Funding amount: $39,300
- Project category: Discovery Grants Program - Individual
Applied computational models of discourse, argument, and text
- Grant number: RGPIN-2014-06020
- Fiscal year: 2018
- Funding amount: $39,300
- Project category: Discovery Grants Program - Individual
Applied computational models of discourse, argument, and text
- Grant number: RGPIN-2014-06020
- Fiscal year: 2017
- Funding amount: $39,300
- Project category: Discovery Grants Program - Individual
Applied computational models of discourse, argument, and text
- Grant number: RGPIN-2014-06020
- Fiscal year: 2016
- Funding amount: $39,300
- Project category: Discovery Grants Program - Individual
Applied computational models of discourse, argument, and text
- Grant number: RGPIN-2014-06020
- Fiscal year: 2015
- Funding amount: $39,300
- Project category: Discovery Grants Program - Individual
Digging into Linked Parliamentary Data (DiLiPad)
- Grant number: 451265-2013
- Fiscal year: 2014
- Funding amount: $39,300
- Project category: Discovery Frontiers - Digging into Data
Automatically categorizing forum posts
- Grant number: 477227-2014
- Fiscal year: 2014
- Funding amount: $39,300
- Project category: Engage Grants Program
Nuances of meaning, paraphrase, and argument identification in applications of natural language processing
- Grant number: 201-2009
- Fiscal year: 2013
- Funding amount: $39,300
- Project category: Discovery Grants Program - Individual
Digging into Linked Parliamentary Data (DiLiPad)
- Grant number: 451265-2013
- Fiscal year: 2013
- Funding amount: $39,300
- Project category: Discovery Frontiers - Digging into Data
Nuances of meaning, paraphrase, and argument identification in applications of natural language processing
- Grant number: 201-2009
- Fiscal year: 2012
- Funding amount: $39,300
- Project category: Discovery Grants Program - Individual
Similar grants from the National Natural Science Foundation of China
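Mathematical modeling of flow-field disturbances caused by object motion
- Grant number: 51072241
- Approval year: 2010
- Funding amount: CNY 100,000
- Project category: Special Fund Project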
Computational Methods for Analyzing Toponome Data
- Grant number: 60601030
- Approval year: 2006
- Funding amount: CNY 170,000
- Project category: Young Scientists Fund
Similar overseas grants
Computational models, algorithms and methods for comparative genomics, applied to pathogens and anopheles mosquitoes genomes
- Grant number: RGPIN-2017-03986
- Fiscal year: 2022
- Funding amount: $39,300
- Project category: Discovery Grants Program - Individual
Computational models, algorithms and methods for comparative genomics, applied to pathogens and anopheles mosquitoes genomes
- Grant number: RGPIN-2017-03986
- Fiscal year: 2021
- Funding amount: $39,300
- Project category: Discovery Grants Program - Individual
Computational models, algorithms and methods for comparative genomics, applied to pathogens and anopheles mosquitoes genomes
- Grant number: RGPIN-2017-03986
- Fiscal year: 2020
- Funding amount: $39,300
- Project category: Discovery Grants Program - Individual
Applied computational models of discourse, argument, and text
- Grant number: RGPIN-2014-06020
- Fiscal year: 2019
- Funding amount: $39,300
- Project category: Discovery Grants Program - Individual
Computational models, algorithms and methods for comparative genomics, applied to pathogens and anopheles mosquitoes genomes
- Grant number: RGPIN-2017-03986
- Fiscal year: 2019
- Funding amount: $39,300
- Project category: Discovery Grants Program - Individual
Applied computational models of discourse, argument, and text
- Grant number: RGPIN-2014-06020
- Fiscal year: 2018
- Funding amount: $39,300
- Project category: Discovery Grants Program - Individual
Computational models, algorithms and methods for comparative genomics, applied to pathogens and anopheles mosquitoes genomes
- Grant number: RGPIN-2017-03986
- Fiscal year: 2018
- Funding amount: $39,300
- Project category: Discovery Grants Program - Individual
Applied computational models of discourse, argument, and text
- Grant number: RGPIN-2014-06020
- Fiscal year: 2017
- Funding amount: $39,300
- Project category: Discovery Grants Program - Individual
Computational models, algorithms and methods for comparative genomics, applied to pathogens and anopheles mosquitoes genomes
- Grant number: RGPIN-2017-03986
- Fiscal year: 2017
- Funding amount: $39,300
- Project category: Discovery Grants Program - Individual
Applied computational models of discourse, argument, and text
- Grant number: RGPIN-2014-06020
- Fiscal year: 2016
- Funding amount: $39,300
- Project category: Discovery Grants Program - Individual