Automating Dictionary Construction for Better Natural Language Processing

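Basic information

  • Grant number:
    RGPIN-2015-05615
  • Principal investigator:
  • Amount:
    $16,800
  • Host institution:
  • Host institution country:
    Canada
  • Program type:
    Discovery Grants Program - Individual
  • Fiscal year:
    2018
  • Funding country:
    Canada
  • Period:
    2018-01-01 to 2019-12-31
  • Project status:
    Completed

Project abstract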

Language technology is an important component of many everyday software systems, such as predictive text entry on smartphones, spelling checkers in word processors, and automatic translation tools. Many language technology systems rely on dictionary-like resources that contain specialized information about words.

Language is always changing. For example, new words, such as "mansplain" and "staycation", are coined every day. Established words regularly take on new meanings, such as the new senses of "post" and "wall" related to social media. New expressions also emerge, such as "flash mob". Dictionaries must therefore be continually updated to include these new usages.

When information is missing from a language technology system's dictionary, the system won't work as well as it could otherwise. Keeping dictionaries up-to-date is therefore essential to building high quality language technology.

Dictionaries are also important tools for language learners and translators, and are valuable in terms of cultural heritage. To be most useful for these purposes, dictionaries must also be kept up-to-date.

Traditionally, lexicographers manually analyze large collections of texts to identify relevant facts about how words are used, and subsequently write dictionary entries. However, a staggering amount of text is written every day, particularly through online sources such as the web and social media. It is simply not possible for lexicographers to keep up with it all.

How then, can dictionaries be kept up-to-date?

Automatic methods are required. This research program will first consider how to best build massive document collections from the web. Drawing on recent linguistic theory and research in statistical machine learning, new natural language processing techniques will then be developed to analyze these documents to automatically "learn" the meanings of words and idiosyncratic expressions. New methods will then be developed to automatically discover how the meanings of words and expressions have changed. This research will have a particular focus on social media, because it contains an abundance of new words and usages.

This research will enable the construction of higher quality dictionaries. This will have broad benefits for language technology systems, because they often rely on dictionaries. It will also benefit traditional dictionary users, including language learners and translators.

Project outputs

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other grants by Cook, ChristopherPaul

Automating Dictionary Construction for Better Natural Language Processing
  • Grant number:
    RGPIN-2015-05615
  • Fiscal year:
    2022
  • Funding amount:
    $16,800
  • Program type:
    Discovery Grants Program - Individual
Automating Dictionary Construction for Better Natural Language Processing
  • Grant number:
    RGPIN-2015-05615
  • Fiscal year:
    2021
  • Funding amount:
    $16,800
  • Program type:
    Discovery Grants Program - Individual
Automating Dictionary Construction for Better Natural Language Processing
  • Grant number:
    RGPIN-2015-05615
  • Fiscal year:
    2019
  • Funding amount:
    $16,800
  • Program type:
    Discovery Grants Program - Individual
Automating Dictionary Construction for Better Natural Language Processing
  • Grant number:
    RGPIN-2015-05615
  • Fiscal year:
    2017
  • Funding amount:
    $16,800
  • Program type:
    Discovery Grants Program - Individual
Automating Dictionary Construction for Better Natural Language Processing
  • Grant number:
    RGPIN-2015-05615
  • Fiscal year:
    2016
  • Funding amount:
    $16,800
  • Program type:
    Discovery Grants Program - Individual
Automating Dictionary Construction for Better Natural Language Processing
  • Grant number:
    RGPIN-2015-05615
  • Fiscal year:
    2015
  • Funding amount:
    $16,800
  • Program type:
    Discovery Grants Program - Individual
Automatically building vocabularies from web forum text
  • Grant number:
    485322-2015
  • Fiscal year:
    2015
  • Funding amount:
    $16,800
  • Program type:
    Engage Grants Program
Computational models of subtractive word formation processes
  • Grant number:
    363418-2008
  • Fiscal year:
    2009
  • Funding amount:
    $16,800
  • Program type:
    Postgraduate Scholarships - Doctoral
Computational models of subtractive word formation processes
  • Grant number:
    363418-2008
  • Fiscal year:
    2008
  • Funding amount:
    $16,800
  • Program type:
    Postgraduate Scholarships - Doctoral
Research in Computational Linguistics
  • Grant number:
    318970-2005
  • Fiscal year:
    2005
  • Funding amount:
    $16,800
  • Program type:
    Postgraduate Scholarships - Master's

Similar international grants

Automating Dictionary Construction for Better Natural Language Processing
  • Grant number:
    RGPIN-2015-05615
  • Fiscal year:
    2022
  • Funding amount:
    $16,800
  • Program type:
    Discovery Grants Program - Individual
The Construction of Multi-language Considerate Expressions Database and the Compilation of Considerate Expressions Dictionary
  • Grant number:
    22H00670
  • Fiscal year:
    2022
  • Funding amount:
    $16,800
  • Program type:
    Grant-in-Aid for Scientific Research (B)
Automating Dictionary Construction for Better Natural Language Processing
  • Grant number:
    RGPIN-2015-05615
  • Fiscal year:
    2021
  • Funding amount:
    $16,800
  • Program type:
    Discovery Grants Program - Individual
Construction and applied research of the database of high-quality Japanese example sentences available for the development of dictionary websites and applications
  • Grant number:
    21H00535
  • Fiscal year:
    2021
  • Funding amount:
    $16,800
  • Program type:
    Grant-in-Aid for Scientific Research (B)
Automating Dictionary Construction for Better Natural Language Processing
  • Grant number:
    RGPIN-2015-05615
  • Fiscal year:
    2019
  • Funding amount:
    $16,800
  • Program type:
    Discovery Grants Program - Individual
Construction of Japanese Predicate-Argument Structure Dictionary for Natural Language Processing and Linguistic Analysis with Concordancer
  • Grant number:
    19K00552
  • Fiscal year:
    2019
  • Funding amount:
    $16,800
  • Program type:
    Grant-in-Aid for Scientific Research (C)
Construction of the clinical concept dictionary that can record application process for medical management analysis using artificial intelligence technology
  • Grant number:
    18K09948
  • Fiscal year:
    2018
  • Funding amount:
    $16,800
  • Program type:
    Grant-in-Aid for Scientific Research (C)
The Database Construction of Correct Use and Misuse in Considerate Expressions for the Dictionary of Japanese Considerate Expressions
  • Grant number:
    18H00680
  • Fiscal year:
    2018
  • Funding amount:
    $16,800
  • Program type:
    Grant-in-Aid for Scientific Research (B)
Construction of a medical terms dictionary with information on word formations and meanings
  • Grant number:
    18H03499
  • Fiscal year:
    2018
  • Funding amount:
    $16,800
  • Program type:
    Grant-in-Aid for Scientific Research (B)
Automating Dictionary Construction for Better Natural Language Processing
  • Grant number:
    RGPIN-2015-05615
  • Fiscal year:
    2017
  • Funding amount:
    $16,800
  • Program type:
    Discovery Grants Program - Individual