Unsupervised Background Knowledge for Language Understanding
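Basic information
- Grant number: MR/T042001/1
- Principal investigator:
- Amount: $1.4312 million
- Host institution:
- Host institution country: United Kingdom
- Project type: Fellowship
- Fiscal year: 2021
- Funding country: United Kingdom
- Duration: 2021 to (no data)
- Project status: Ongoing
- Source:
- Keywords:
Project abstract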
Significant progress in Artificial Intelligence (AI) has been made in recent years, raising high expectations of what this technology can offer us in the future. However, many challenges must still be addressed before this promise can be turned into a reality, and one of them is Natural Language Processing (NLP). If a computer is ever to understand humans in a natural way and to demonstrate the level of intelligence that we would normally expect, then the problem of Language Understanding must be solved. Making computers understand natural languages is a non-trivial task. Current approaches to language understanding rely on end-to-end supervised learning, exemplified in recent years by deep learning techniques. Typically, a corpus of relevant text is collected and then used to train the computer to perform a certain task. However, this approach may have several problems; for example, the words extracted and used to train a computer often have implicit meanings and can be ambiguous. Consider the following two sentences: (1) We found many birds during our visit to the zoo: eagles, parrots, cranes... (2) The crane was hurt and could barely move. A computer will not be able to understand from these training examples that there are in fact two types of crane (bird and machine), and that only one type of crane (bird) can get hurt. It is widely recognised that handling word ambiguity and, more broadly, understanding what words mean, is a significant challenge in NLP. For instance, Google Translate, widely considered the state of the art in machine translation, fails to translate these two sentences correctly even into closely related languages such as Spanish. Generally speaking, current techniques are hard to generalise across different tasks and domains, especially in applications requiring language understanding.
The proposed research intends to develop theories and novel solutions to bridge this gap by combining and leveraging lexical resources and unsupervised techniques for analysing text corpora, thereby learning the much-needed, but not-explicitly-available background knowledge. Our goal is then to seamlessly integrate this background knowledge into real-world applications for more accurate language understanding. We will exploit these techniques in different languages, making them directly applicable in important multilingual NLP tasks, including lower-resourced languages such as Welsh, and in domains with direct societal impact such as social media and health care.
Project outputs
Journal articles (10)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Robust Hate Speech Detection in Social Media: A Cross-Dataset Empirical Evaluation
- DOI: 10.18653/v1/2023.woah-1.25
- Published: 2023
- Journal:
- Impact factor: 0
- Authors: Antypas D
- Corresponding author: Antypas D
Generative Language Models for Paragraph-Level Question Generation
- DOI: 10.48550/arxiv.2210.03992
- Published: 2022-10
- Journal:
- Impact factor: 0
- Authors: Asahi Ushio; Fernando Alva-Manchego; José Camacho-Collados
- Corresponding authors: Asahi Ushio; Fernando Alva-Manchego; José Camacho-Collados
Politics, Sentiment and Virality: A Large-Scale Multilingual Twitter Analysis in Greece, Spain and United Kingdom
- DOI: 10.2139/ssrn.4166108
- Published: 2022
- Journal:
- Impact factor: 0
- Authors: Dimosthenis Antypas; A. Preece; José Camacho-Collados
- Corresponding authors: Dimosthenis Antypas; A. Preece; José Camacho-Collados
Assessing the Limits of the Distributional Hypothesis in Semantic Spaces: Trait-based Relational Knowledge and the Impact of Co-occurrences
- DOI: 10.18653/v1/2022.starsem-1.15
- Published: 2022
- Journal:
- Impact factor: 0
- Authors: Anderson M
- Corresponding author: Anderson M
Guiding Generative Language Models for Data Augmentation in Few-Shot Text Classification
- DOI:
- Published: 2021-11
- Journal:
- Impact factor: 0
- Authors: A. Edwards; Asahi Ushio; José Camacho-Collados; Hélène de Ribaupierre; A. Preece
- Corresponding authors: A. Edwards; Asahi Ushio; José Camacho-Collados; Hélène de Ribaupierre; A. Preece
Other publications by Jose Camacho Collados
AI for Analyzing Mental Health Disorders Among Social Media Users: Quarter-Century Narrative Review of Progress and Challenges
- DOI: 10.2196/59225
- Published: 2024-01-01
- Journal:
- Impact factor: 6.000
- Authors: David Owen; Amy J Lynham; Sophie E Smart; Antonio F Pardiñas; Jose Camacho Collados
- Corresponding author: Jose Camacho Collados
Federated Learning for Exploiting Annotators’ Disagreements in Natural Language Processing
- DOI: 10.1162/tacl_a_00664
- Published: 2024
- Journal:
- Impact factor: 10.9
- Authors: Nuria Rodríguez; Eugenio Martínez Cámara; Jose Camacho Collados; M. V. Luzón; Francisco Herrera
- Corresponding author: Francisco Herrera
Similar international grants
Leveraging Background Knowledge for Identification and Estimation of Causal Effects in the Presence of Latent Variables
- Grant number: 2210210
- Fiscal year: 2022
- Funding amount: $1.4312 million
- Project type: Continuing Grant
Scientific background knowledge and research transfer as exemplified by the Egyptologist Georg Steindorff (1861-1951)
- Grant number: 233145017
- Fiscal year: 2012
- Funding amount: $1.4312 million
- Project type: Research Grants
Effects of background knowledge on human spatial reasoning (R 2)
- Grant number: 5403632
- Fiscal year: 2003
- Funding amount: $1.4312 million
- Project type: CRC/Transregios
ITR: Knowledge-Enhanced Discovery System (KEDS): Incorporating Background Knowledge for Scientific Discovery
- Grant number: 0325329
- Fiscal year: 2003
- Funding amount: $1.4312 million
- Project type: Continuing Grant
Inductive learning in the presence of background knowledge
- Grant number: 2480-1996
- Fiscal year: 2000
- Funding amount: $1.4312 million
- Project type: Discovery Grants Program - Individual
CAUSAL BACKGROUND KNOWLEDGE EFFECT ON CATEGORIZATION
- Grant number: 6258292
- Fiscal year: 2000
- Funding amount: $1.4312 million
- Project type:
Inductive learning in the presence of background knowledge
- Grant number: 2480-1996
- Fiscal year: 1999
- Funding amount: $1.4312 million
- Project type: Discovery Grants Program - Individual
CAUSAL BACKGROUND KNOWLEDGE EFFECT ON CATEGORIZATION
- Grant number: 2696658
- Fiscal year: 1998
- Funding amount: $1.4312 million
- Project type:
EFFECTS OF CAUSAL BACKGROUND KNOWLEDGE ON CATEGORIZATION
- Grant number: 6185823
- Fiscal year: 1998
- Funding amount: $1.4312 million
- Project type:
Inductive learning in the presence of background knowledge
- Grant number: 2480-1996
- Fiscal year: 1998
- Funding amount: $1.4312 million
- Project type: Discovery Grants Program - Individual