Annotating Reference and Coreference In Dialogue Using Conversational Agents in games

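Basic information

  • Grant number:
    EP/W001632/1
  • Principal investigator:
  • Amount:
    $1.3906 million
  • Host institution:
  • Host institution country:
    United Kingdom
  • Project type:
    Research Grant
  • Fiscal year:
    2022
  • Funding country:
    United Kingdom
  • Duration:
    2022 to (no data)
  • Project status:
    Not yet concluded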

Project abstract

The development of modern neural network architectures such as the encoder/decoder model and the Transformer has brought about an explosion of interest in neural models for AI systems able to engage in conversations (also known as conversational agents), reflected by a spike in published work, dedicated workshops, and industry-sponsored competitions and grants. While at first these models were applied to simple chatbots, the focus of research has been shifting towards conversational agents capable of engaging in more complex, task-oriented dialogue such as restaurant booking or question answering. But the results on these tasks show that while end-to-end architectures without dedicated models for semantic interpretation can work well for chatbots, conversational agents carrying out more complex tasks require a greater ability to handle such aspects of interpretation, and some form of modelling of context. Among the aspects of natural language interpretation that require more advanced architectures are COREFERENCE and REFERENCE. For an example of the importance of coreference in dialogue, consider the following excerpt from a real-life chat conversation, where both participants continually use anaphoric expressions such as BOTH, THEY, IT, etc. to refer to previously introduced entities such as Google or Microsoft.
A: Are you a fan of Google or Microsoft?
B: Both are excellent technology they are helpful in many ways. For the security purpose both are super.
A: I'm not a huge fan of Google, but I use it a lot because I have to. I think they are a monopoly in some sense.
B: Google provides online related services and products, which includes search engine and cloud computing.
A: Yeah, their services are good. I'm just not a fan of intrusive they can be on our personal lives
Enriching conversational agents with the ability to carry out these forms of interpretation raises two issues. First, developing models for these tasks requires specific training data: most deep-learning architectures are trained on large amounts of freely available written text. Training a coreference resolver on written text and domain-adapting it to dialogue, however, has proven ineffective, as coreference in dialogue involves different phenomena and is more involved than coreference in text. Second, the developed architectures require specific modules that enable them to interpret coreference and reference.
Our group has pioneered the use of Games-With-A-Purpose (GWAPs) to collect data for NLP, resulting in the largest NLP dataset collected using GWAPs or indeed crowdsourcing. But there is a fundamental difference between conversation and written text: the latter is designed to be read by third parties, whereas research has shown that overhearers of a conversation only acquire a partial understanding of what was said.
OUR PROPOSED SOLUTION to the problem of creating large annotated datasets of coreference and reference interpretation in conversation is to collect the judgments for anaphoric and referential information via GAMES IN WHICH CONVERSATIONAL AGENTS INTERACT WITH HUMAN PLAYERS AND EVOLVE BY ACQUIRING INFORMATION FROM THEM. This idea builds on recent work by Facebook and Microsoft, among others, that pioneered the use of conversational agents in games to collect data about dialogue, and on that of Hockenmaier and her lab. Our agents will be deployed in gaming platforms such as LIGHT and MINECRAFT in collaboration with these labs. But whereas in previous work conversational agents only interact with the aim of improving their end-to-end behavior, in the proposed project we will develop artificial agents able to improve their ability to interpret coreference and reference by collecting judgments about these aspects of interpretation via CLARIFICATION QUESTIONS put to the players at appropriate moments, which can also be used to annotate a dataset.

Project outputs

Journal articles (10)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
The CODI-CRAC 2022 Shared Task on Anaphora, Bridging, and Discourse Deixis in Dialogue
  • DOI:
  • Publication year:
    2022
  • Journal:
  • Impact factor:
    0
  • Authors:
    Yu, J
  • Corresponding author:
    Yu, J
Aggregating crowdsourced and automatic judgments to scale up a corpus of anaphoric reference for fiction and Wikipedia texts
  • DOI:
  • Publication year:
    2023
  • Journal:
  • Impact factor:
    0
  • Authors:
    Yu, J
  • Corresponding author:
    Yu, J
LingoTowns: A Virtual World For Natural Language Annotation and Language Learning
  • DOI:
    10.1145/3505270.3558323
  • Publication year:
    2022
  • Journal:
  • Impact factor:
    0
  • Authors:
    Madge C
  • Corresponding author:
    Madge C
Coreference Annotation of an Arabic Corpus using a Virtual World Game
  • DOI:
    10.18653/v1/2022.wanlp-1.37
  • Publication year:
    2022
  • Journal:
  • Impact factor:
    0
  • Authors:
    Aliady W
  • Corresponding author:
    Aliady W
ARCIDUCA: Annotating Reference and Coreference In Dialogue Using Conversational Agents in games
  • DOI:
  • Publication year:
    2022
  • Journal:
  • Impact factor:
    0
  • Authors:
    Poesio, M
  • Corresponding author:
    Poesio, M

Other publications by Massimo Poesio

Justified Sloppiness In Anaphoric Reference
  • DOI:
    10.1007/978-1-4020-5958-2_2
  • Publication year:
    2008
  • Journal:
  • Impact factor:
    1.5
  • Authors:
    Massimo Poesio;Uwe Reyle;R. Stevenson
  • Corresponding author:
    R. Stevenson
The provision of corrective feedback in a spoken dialogue CALL system
Bias decreases in proportion to the number of annotators
  • DOI:
  • Publication year:
    2005
  • Journal:
  • Impact factor:
    0
  • Authors:
    Ron Artstein;Massimo Poesio
  • Corresponding author:
    Massimo Poesio
Ambiguity, Underspecification and Discourse Interpretation
  • DOI:
  • Publication year:
    2007
  • Journal:
  • Impact factor:
    0
  • Authors:
    Massimo Poesio
  • Corresponding author:
    Massimo Poesio
State-of-the-art NLP Approaches to Coreference Resolution: Theory and Practical Recipes

Other grants held by Massimo Poesio

Creating anaphorically annotated resources through semantic wikis (AnaWiki)
  • Grant number:
    EP/F00575X/1
  • Fiscal year:
    2007
  • Funding amount:
    $1.3906 million
  • Project type:
    Research Grant

Similar overseas grants

Multidisciplinary analysis of financial reference points and wellbeing
  • Grant number:
    DP240101927
  • Fiscal year:
    2024
  • Funding amount:
    $1.3906 million
  • Project type:
    Discovery Projects
6G-REFERENCE: 6G haRdware Enablers For cEll fRee cohEreNt Communications & sEnsing
  • Grant number:
    10096702
  • Fiscal year:
    2024
  • Funding amount:
    $1.3906 million
  • Project type:
    EU-Funded
Mutated human oncogene recombinant nucleosomes as reference materials for liquid biopsy
  • Grant number:
    10090714
  • Fiscal year:
    2024
  • Funding amount:
    $1.3906 million
  • Project type:
    Collaborative R&D
CADMap: Creating Mapped Solid Models of Deformed As-Manufactured Geometries that Link to an Original Reference Design
  • Grant number:
    2332264
  • Fiscal year:
    2023
  • Funding amount:
    $1.3906 million
  • Project type:
    Standard Grant
QT Gravity for the Global Geodetic Reference Frame
  • Grant number:
    EP/X036359/1
  • Fiscal year:
    2023
  • Funding amount:
    $1.3906 million
  • Project type:
    Research Grant
QT Gravity for the Global Geodetic Reference Frame
  • Grant number:
    EP/X036332/1
  • Fiscal year:
    2023
  • Funding amount:
    $1.3906 million
  • Project type:
    Research Grant
Quantum mechanics in rotating reference frames
  • Grant number:
    2888161
  • Fiscal year:
    2023
  • Funding amount:
    $1.3906 million
  • Project type:
    Studentship
Literature and War: The Yugoslavia conflict in German literature with special reference to the texts by Peter Handke and Saša Stanišić
  • Grant number:
    23K00442
  • Fiscal year:
    2023
  • Funding amount:
    $1.3906 million
  • Project type:
    Grant-in-Aid for Scientific Research (C)
A reference-free computational algorithm for comprehensive somatic mosaic mutation detection
  • Grant number:
    10662755
  • Fiscal year:
    2023
  • Funding amount:
    $1.3906 million
  • Project type:
Development and Production of Standardized Reference Diets for Zebrafish Research
  • Grant number:
    10823702
  • Fiscal year:
    2023
  • Funding amount:
    $1.3906 million
  • Project type: