Applied computational models of discourse, argument, and text

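Basic information

Grant number:
RGPIN-2014-06020
Principal investigator:
Hirst, Graeme
Amount:
$39,300
Host institution:
University of Toronto
Host institution country:
Canada
Program:
Discovery Grants Program - Individual
Fiscal year:
2018
Funding country:
Canada
Start and end dates:
2018-01-01 to 2019-12-31
Project status:
Completed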

Source:
https://www.nserc-crsng.gc.ca/ase-oro/Details-Detailles_eng.asp?id=653622
Keywords:
Applied computational models discourse argument

Project abstract

Our research is in computational linguistics (CL), natural language processing (NLP), and their applications. Themes that run through the work are (i) computationally determining the structure of discourse and argumentation; (ii) computational notions of paraphrase and of the semantic distance between words, larger linguistic expressions, or documents; and (iii) the use of statistical classification methods in text analysis. We propose the following new research:

(1) Finding the structure and framing of discourse and arguments: Understanding a speaker's argument entails understanding not only what is presented as evidence and what as conclusion but also how the conclusion follows from the evidence, what the unstated premises (enthymemes) are, and what the implicit framing of the issue is. We will look in particular at opinionated texts in which argumentation structure is fairly explicit on the surface but determining enthymemes and framing is crucial to full understanding.

For automatic text analysis, we take quantifiable semantic characteristics of the speaker's presentation of a position as indicators or proxies of the framing, which can then be interpreted qualitatively. In a simple analysis, this could be merely a statistical analysis of the key concepts of the text, as denoted by content words and significant collocations, something like a topic model. Here, however, we propose a novel, more sophisticated analysis in which we also look at the actual argumentation structures and discourse relationships of the text and how the concepts adduced by the lower-level linguistic components are used in these structures. This will draw on and extend our recent work on discourse parsing and the identification of argumentation schemes in text. Although these are difficult tasks for which the state of the art is far from perfect, we hypothesize that typical political speech contains a sufficiently well-cued discourse structure that the analyses that we can achieve, although still quite imperfect, will be usefully indicative of issue framing.

(2) Finding precedent scientific literature: Researchers often have difficulty searching for past research relevant to, or precedent to, their new or proposed research, and often resort simply to Google keyword searches, which are rarely adequate. We will develop methods for searching scientific literature that use semantic and structural relationships to find publications that are possibly relevant to a new text. We will concentrate in particular on the legacy literature of biodiversity, for which conventional keyword searches are almost invariably insufficient because in this literature, more so than in most other fields of science, related concepts are often described or explained in different terms, or in completely different conceptual frameworks, from those of contemporary research. As a result, relevant legacy publications, or even whole literatures, may remain hidden to term-based methods.

This goal will not be reached in five years, but it motivates the next stage of our work because it requires bringing together many of the methods of natural language processing developed in our own and other researchers' work of the past decade or more: (a) the recognition of paraphrase and of textual entailment, and measurement of semantic similarity at the sentence level and above; (b) the automatic analysis of the structure and argumentation of scholarly papers and scientific discourse (this is a point of overlap with (1) above, but in scientific texts we expect the micro-structure to be less explicit and the macro-structure more explicit than in opinion texts).

(3) We will continue our work on automatic authorship identification; on characterizing aphasic speech; and on the philosophy of CL.
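
As an illustration only, the "simple analysis" mentioned under (1), counting content words and scoring significant collocations, might be sketched as follows. This is not the project's method: the stopword list, the sample text, the function names, and the choice of pointwise mutual information (PMI) as the collocation score are all assumptions made for this sketch, and it does not attempt the discourse or argumentation analysis the abstract proposes.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch).
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "that",
             "we", "on", "for", "our", "will", "not"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def content_word_counts(tokens):
    """Count the tokens that are not in the stopword list."""
    return Counter(t for t in tokens if t not in STOPWORDS)


def collocations_by_pmi(tokens, min_count=2):
    """Rank adjacent content-word pairs by pointwise mutual information.

    Pairs are taken over the whole token stream, so they may span
    sentence boundaries; a real system would segment sentences first.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = len(tokens)
    n_bi = max(len(tokens) - 1, 1)
    scores = {}
    for (w1, w2), count in bigrams.items():
        if count < min_count or w1 in STOPWORDS or w2 in STOPWORDS:
            continue
        p_pair = count / n_bi
        p_w1 = unigrams[w1] / n_uni
        p_w2 = unigrams[w2] / n_uni
        scores[(w1, w2)] = math.log2(p_pair / (p_w1 * p_w2))
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Invented sample text, used only to exercise the functions.
sample = ("Carbon tax policy will raise costs. "
          "The carbon tax is a burden on families. "
          "Families oppose the carbon tax.")
sample_tokens = tokenize(sample)
print(content_word_counts(sample_tokens).most_common(3))
print(collocations_by_pmi(sample_tokens))
```

Counts of this kind only surface the key concepts of a text; they say nothing about how those concepts function as evidence or conclusion, which is the gap the proposed analysis is meant to address.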

Project outputs

Journal articles (0)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Other publications by Hirst, Graeme

Evaluating WordNet-based measures of lexical semantic relatedness

DOI:
10.1162/coli.2006.32.1.13
Publication date:
2006-03-01
Journal:
COMPUTATIONAL LINGUISTICS
Impact factor:
9.3
Authors:
Budanitsky, Alexander; Hirst, Graeme
Corresponding author:
Hirst, Graeme

Automatically determining cause of death from verbal autopsy narratives

DOI:
10.1186/s12911-019-0841-9
Publication date:
2019-07-09
Journal:
BMC MEDICAL INFORMATICS AND DECISION MAKING
Impact factor:
3.5
Authors:
Jeblee, Serena; Gomes, Mireille; Hirst, Graeme
Corresponding author:
Hirst, Graeme

Providing Care Beyond Therapy Sessions With a Natural Language Processing-Based Recommender System That Identifies Cancer Patients Who Experience Psychosocial Challenges and Provides Self-care Support: Pilot Study.

DOI:
10.2196/35893
Publication date:
2022-07-29
Journal:
JMIR CANCER
Impact factor:
2.8
Authors:
Leung, Yvonne W.; Park, Bomi; Heo, Rachel; Adikari, Achini; Chackochan, Suja; Wong, Jiahui; Alie, Elyse; Gancarz, Mathew; Kacala, Martyna; Hirst, Graeme; de Silva, Daswin; French, Leon; Bender, Jacqueline; Mishna, Faye; Gratzer, David; Alahakoon, Damminda; Esplen, Mary Jane
Corresponding author:
Esplen, Mary Jane

Rhetorical structure and Alzheimer's disease

DOI:
10.1080/02687038.2017.1355439
Publication date:
2018-01-01
Journal:
APHASIOLOGY
Impact factor:
2
Authors:
Abdalla, Mohamed; Rudzicz, Frank; Hirst, Graeme
Corresponding author:
Hirst, Graeme

Therapist Feedback and Implications on Adoption of an Artificial Intelligence-Based Co-Facilitator for Online Cancer Support Groups: Mixed Methods Single-Arm Usability Study.

DOI:
10.2196/40113
Publication date:
2023-06-09
Journal:
JMIR CANCER
Impact factor:
2.8
Authors:
Leung, Yvonne W.; Ng, Steve; Duan, Lauren; Lam, Claire; Chan, Kenith; Gancarz, Mathew; Rennie, Heather; Trachtenberg, Lianne; Chan, Kai P.; Adikari, Achini; Fang, Lin; Gratzer, David; Hirst, Graeme; Wong, Jiahui; Esplen, Mary Jane
Corresponding author:
Esplen, Mary Jane