A Unified Model of Compositional and Distributional Semantics: Theory and Applications

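Basic information

  • Grant number:
    EP/I037512/1
  • Principal investigator:
  • Amount:
    $440,100
  • Host institution:
  • Host institution country:
    United Kingdom
  • Project type:
    Research Grant
  • Fiscal year:
    2012
  • Funding country:
    United Kingdom
  • Start and end dates:
    2012 to (no data)
  • Project status:
    Completed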

Project abstract

The notion of meaning is central to many areas of Computer Science, Artificial Intelligence (AI), Linguistics, Philosophy, and Cognitive Science. A formal, mathematical account of the meaning of natural language utterances is crucial to AI, since an understanding of natural language (i.e. languages such as English, German, and Chinese) is at the heart of much intelligent behaviour. More specifically, Natural Language Processing (NLP), the branch of AI concerned with the computer processing, analysis and generation of text, requires a model of meaning for many of its tasks and applications.

There have been two main approaches to modelling the meaning of language in NLP, in order that a computer can gain some "understanding" of the text. The first, the so-called compositional approach, is based on classical ideas from Philosophy and Mathematical Logic. Using a well-known principle from the 19th-century logician Frege (that the meaning of a phrase can be determined from the meanings of its parts and how those parts are combined), logicians have developed formal accounts of how the meaning of a sentence can be determined from the relations of words in a sentence. This idea culminated famously in Linguistics in the work of Richard Montague in the 1970s. The compositional approach addresses a fundamental problem in Linguistics: how it is that humans are able to generate an unlimited number of sentences using a limited vocabulary. We would like computers to have a similar capacity.

The second, more recent, approach to modelling meaning in NLP focuses on the meanings of the words themselves. This is the so-called distributional approach to modelling word meanings and is based on the ideas of the "structural" linguists such as Firth from the 1950s. This idea is also sometimes related to Wittgenstein's philosophy of "meaning as use". The idea is that the meanings of words can be determined by considering the contexts in which words appear in text. For example, if we take a large amount of text and see which words appear close to the word "dog", and do a similar thing for the word "cat", we will see that the contexts of "dog" and "cat" tend to share many words (such as walk, run, furry, pet, and so on). If, on the other hand, we see which words appear in the context of the word "television", we will find less overlap with the contexts for "dog". Mathematically we represent the contexts in a vector space, so that word meanings occupy positions in a geometrical space. We would expect to find that "dog" and "cat" are much closer in the space than "dog" and "television", indicating that "dog" and "cat" are closer in meaning than "dog" and "television".

The two approaches to meaning can be roughly characterized as follows: the compositional approach is concerned with how meanings combine, but has little to say about the individual meanings of words; the distributional approach is concerned with word meanings, but has little to say about how those meanings combine. Our ambitious proposal is to exploit the strengths of the two approaches by developing a unified model of distributional and compositional semantics. Our proposal has a central theoretical component, drawing on models of semantics from Theoretical Computer Science and Mathematical Logic. This central component will inform, be driven by, and be evaluated on tasks and applications in NLP and Information Retrieval, and also data drawn from empirical studies in Cognitive Science (the computational study of the mind). Hence we aim to make the following fundamental contributions:
1. advance the theoretical study of meaning in Linguistics, Computer Science and Artificial Intelligence;
2. develop new meaning-sensitive approaches to NLP applications which can be robustly applied to naturally occurring text.
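The distributional idea in the abstract (count the words that appear near "dog", "cat" and "television", then compare the resulting vectors) can be sketched in a few lines of Python. This is only an illustrative sketch, not the project's method: the six-sentence corpus, the window size of 3, the use of raw counts, and the helper names `context_vector` and `cosine` are all assumptions made here for the example.

```python
import math
from collections import Counter

# Toy corpus invented for illustration; it is not data from the project.
CORPUS = [
    "the furry dog likes to run and walk",
    "the furry cat likes to walk and sleep",
    "a dog is a popular pet",
    "a cat is a quiet pet",
    "the television shows the evening news",
    "we watch a film on the television screen",
]


def context_vector(target, sentences, window=3):
    """Count the words found within `window` tokens of each use of `target`."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.split()
        for i, token in enumerate(tokens):
            if token != target:
                continue
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            # Add every neighbour in the window except the target itself.
            counts.update(
                t for j, t in enumerate(tokens[lo:hi], start=lo) if j != i
            )
    return counts


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[w] * v[w] for w in u if w in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


dog = context_vector("dog", CORPUS)
cat = context_vector("cat", CORPUS)
television = context_vector("television", CORPUS)

sim_dog_cat = cosine(dog, cat)
sim_dog_television = cosine(dog, television)
print(f"dog vs cat:        {sim_dog_cat:.3f}")
print(f"dog vs television: {sim_dog_television:.3f}")
```

On this toy corpus "dog" ends up closer to "cat" than to "television". With raw counts, frequent function words such as "the" and "a" carry much of the similarity; a realistic setup would use a far larger corpus and some form of weighting, which this sketch leaves out.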

Project outputs

Journal articles (10)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Low-Rank Tensors for Verbs in Compositional Distributional Semantics
  • DOI:
    10.3115/v1/p15-2120
  • Publication date:
    2015-07
  • Journal:
  • Impact factor:
    0
  • Authors:
    Daniel Fried;T. Polajnar;S. Clark
  • Corresponding author:
    Daniel Fried;T. Polajnar;S. Clark
Learning Neural Audio Embeddings for Grounding Semantics in Auditory Perception
  • DOI:
    10.1613/jair.5665
  • Publication date:
    2017-12
  • Journal:
  • Impact factor:
    0
  • Authors:
    Douwe Kiela;S. Clark
  • Corresponding author:
    Douwe Kiela;S. Clark
Quantum Physics and Linguistics - A Compositional, Diagrammatic Discourse
  • DOI:
    10.1093/acprof:oso/9780199646296.003.0013
  • Publication date:
    2013
  • Journal:
  • Impact factor:
    0
  • Authors:
    CLARK S
  • Corresponding author:
    CLARK S
A Type-Driven Tensor-Based Semantics for CCG
  • DOI:
    10.3115/v1/w14-1406
  • Publication date:
    2014-04
  • Journal:
  • Impact factor:
    0
  • Authors:
    Jean Maillard;S. Clark;Edward Grefenstette
  • Corresponding author:
    Jean Maillard;S. Clark;Edward Grefenstette
Adaptive Communication: Languages with More Non-Native Speakers Tend to Have Fewer Word Forms.
  • DOI:
    10.1371/journal.pone.0128254
  • Publication date:
    2015
  • Journal:
  • Impact factor:
    3.7
  • Authors:
    Bentz C;Verkerk A;Kiela D;Hill F;Buttery P
  • Corresponding author:
    Buttery P

Other publications by Stephen Clark

From Conceptual Spaces to Quantum Concepts: Formalising and Learning Structured Conceptual Models
  • DOI:
    10.48550/arxiv.2401.08585
  • Publication date:
    2023
  • Journal:
  • Impact factor:
    0
  • Authors:
    Sean Tull;R. A. Shaikh;Sara Sabrina Zemljič;Stephen Clark
  • Corresponding author:
    Stephen Clark
Estimating local car ownership models
  • DOI:
    10.1016/j.jtrangeo.2006.02.014
  • Publication date:
    2007-05
  • Journal:
  • Impact factor:
    6.1
  • Authors:
    Stephen Clark
  • Corresponding author:
    Stephen Clark
MICROSCOPIC MODELLING OF TRAFFIC MANAGEMENT MEASURES FOR GUIDED BUS OPERATION
  • DOI:
  • Publication date:
    1999
  • Journal:
  • Impact factor:
    0
  • Authors:
    R. Liu;Stephen Clark;F. Montgomery;D. Watling
  • Corresponding author:
    D. Watling
Leg posture characteristics in children with Cerebral Palsy during walking and running
  • DOI:
    10.1016/0021-9290(93)90489-2
  • Publication date:
    1993-03-01
  • Journal:
  • Impact factor:
  • Authors:
    Pekka Luhtanen;Esko Mälkiä;Juhani Huhtinen;Pauli Rintala;Stephen Clark
  • Corresponding author:
    Stephen Clark
A classification for English primary schools using open data
  • DOI:
    10.18335/region.v7i2.326
  • Publication date:
    2020
  • Journal:
  • Impact factor:
    2.1
  • Authors:
    Stephen Clark;N. Lomax;M. Birkin
  • Corresponding author:
    M. Birkin

Other grants held by Stephen Clark

EPSRC-SFI: Non-Equilibrium Steady-States of Quantum many-body systems: uncovering universality and thermodynamics (QuamNESS)
  • Grant number:
    EP/T028424/1
  • Fiscal year:
    2020
  • Funding amount:
    $440,100
  • Project type:
    Research Grant
Emerging correlations from strong driving: a tensor network projection variational Monte Carlo approach to 2D quantum lattice systems
  • Grant number:
    EP/P025110/2
  • Fiscal year:
    2018
  • Funding amount:
    $440,100
  • Project type:
    Research Grant
Emerging correlations from strong driving: a tensor network projection variational Monte Carlo approach to 2D quantum lattice systems
  • Grant number:
    EP/P025110/1
  • Fiscal year:
    2017
  • Funding amount:
    $440,100
  • Project type:
    Research Grant
Accurate and Efficient Parsing of Biomedical Text
  • Grant number:
    EP/E035698/1
  • Fiscal year:
    2007
  • Funding amount:
    $440,100
  • Project type:
    Research Grant
Collaborative Research: Systems of Ordinary Differential Equations - Inverse and Non-Self-Adjoint Problems
  • Grant number:
    0405528
  • Fiscal year:
    2004
  • Funding amount:
    $440,100
  • Project type:
    Continuing Grant

Similar grants from the National Natural Science Foundation of China

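Developing an augmented reality model for AI-guided decisions on atrial septal puncture location, using SAM (Segment Anything Model) on real-time intraoperative imaging
  • Grant number:
  • Approval year:
    2024
  • Funding amount:
    CNY 0
  • Project type:
    Provincial or municipal project
Development of a Linear Stochastic Model for Wind Field Reconstruction from Limited Measurement Data
  • Grant number:
  • Approval year:
    2020
  • Funding amount:
    CNY 400,000
  • Project type:
Using an agent-based model to study the effect of perioperative single-dose dexamethasone on surgical incision healing and its mechanism
  • Grant number:
    81771933
  • Approval year:
    2017
  • Funding amount:
    CNY 500,000
  • Project type:
    General Program
A multilevel-model-based prediction model for amenorrhea induced by Tripterygium glycosides in women of childbearing age
  • Grant number:
    81503449
  • Approval year:
    2015
  • Funding amount:
    CNY 180,000
  • Project type:
    Young Scientists Fund
Building an early risk assessment model for postmenopausal osteoporosis that combines disease and syndrome differentiation, based on a non-homogeneous Markov model
  • Grant number:
    30873339
  • Approval year:
    2008
  • Funding amount:
    CNY 320,000
  • Project type:
    General Program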

Similar overseas grants

Advanced modeling of 3-D compositional distribution in the Japan crust for geoneutrino flux prediction to constrain the bulk silicate Earth compositional model
  • Grant number:
    23H01280
  • Fiscal year:
    2023
  • Funding amount:
    $440,100
  • Project type:
    Grant-in-Aid for Scientific Research (B)
Model-based clustering of high dimensional discrete data and compositional data
  • Grant number:
    RGPIN-2021-03812
  • Fiscal year:
    2022
  • Funding amount:
    $440,100
  • Project type:
    Discovery Grants Program - Individual
Compositional Causal Model-based Reinforcement Learning
  • Grant number:
    RGPIN-2020-06904
  • Fiscal year:
    2022
  • Funding amount:
    $440,100
  • Project type:
    Discovery Grants Program - Individual
Compositional Causal Model-based Reinforcement Learning
  • Grant number:
    RGPIN-2020-06904
  • Fiscal year:
    2021
  • Funding amount:
    $440,100
  • Project type:
    Discovery Grants Program - Individual
Model-based clustering of high dimensional discrete data and compositional data
  • Grant number:
    RGPIN-2021-03812
  • Fiscal year:
    2021
  • Funding amount:
    $440,100
  • Project type:
    Discovery Grants Program - Individual
Compositional learning research including a novel meta-model for eCommerce
  • Grant number:
    97180
  • Fiscal year:
    2021
  • Funding amount:
    $440,100
  • Project type:
    Collaborative R&D
Compositional Causal Model-based Reinforcement Learning
  • Grant number:
    RGPIN-2020-06904
  • Fiscal year:
    2020
  • Funding amount:
    $440,100
  • Project type:
    Discovery Grants Program - Individual
Compositional Causal Model-based Reinforcement Learning
  • Grant number:
    DGECR-2020-00309
  • Fiscal year:
    2020
  • Funding amount:
    $440,100
  • Project type:
    Discovery Launch Supplement
Towards a Compositional Generative Model of Human Vision
  • Grant number:
    10228003
  • Fiscal year:
    2019
  • Funding amount:
    $440,100
  • Project type:
Towards a Compositional Generative Model of Human Vision
  • Grant number:
    10458624
  • Fiscal year:
    2019
  • Funding amount:
    $440,100
  • Project type: