CAREER: From One Language to Another

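Basic Information

  • Award number:
    1943418
  • Principal investigator:
  • Amount:
    $550,000
  • Host institution:
  • Host institution country:
    United States
  • Award type:
    Continuing Grant
  • Fiscal year:
    2020
  • Funding country:
    United States
  • Project period:
    2020-06-01 to 2021-10-31
  • Project status:
    Completed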

Project Abstract

Language technology has become an integral part of how we interact with the world of information, but sophisticated natural language processing (NLP) tools are available only for a handful of the approximately 7,000 languages spoken across the world. Modern data-driven methods for developing NLP tools generally rely on the availability of enormous amounts of data for the language in question, an obstacle that may be insurmountable for many languages, especially languages lacking significant digital resources and languages with small or diminishing numbers of speakers. This project aims to remove barriers to developing NLP tools for languages with less data, developing new methods that incorporate knowledge about linguistic properties of languages into models learned from data. Learning how to build faster paths to NLP tools for new languages has the potential to rapidly advance the state of language technology for any language. In addition, the tools and knowledge developed here have the potential to speed up the description of endangered languages, helping to secure an informed record of the world's languages while there are still speakers to learn from.
The imbalance in access to language technologies arises in part because current NLP models and algorithms need to learn from large amounts of training data. This project addresses that imbalance by adapting methods from cross-lingual transfer learning, in which models learned on one language are adapted and exploited to make predictions for another language. One innovation of this project is to investigate the incorporation of expert linguistic knowledge for improving model transfer. Two types of linguistic knowledge will be injected into artificial neural network models for morphological analysis and part-of-speech tagging: a) knowledge about relationships between individual languages and language families; and b) knowledge about specific linguistic properties of individual languages and language families. The models will be evaluated both intrinsically and extrinsically, the latter by studying the usefulness of the models for human linguistic analysis and as part of the language documentation and description workflow.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project Outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other publications by Alexis Palmer

Orthographic vs. Semantic Representations for Unsupervised Morphological Paradigm Clustering
UNTLing at SemEval-2020 Task 11: Detection of Propaganda Techniques in English News Articles
Sigmorphon 2019 Task 2 system description paper: Morphological analysis in context for many languages, with supervision from only a few
Evaluating FrameNet-style semantic parsing: the role of coverage gaps in FrameNet
Findings of the AmericasNLP 2023 Shared Task on Machine Translation into Indigenous Languages
  • DOI:
  • Published:
    2023
  • Journal:
  • Impact factor:
    0
  • Authors:
    Abteen Ebrahimi;Manuel Mager;Shruti Rijhwani;Enora Rice;Arturo Oncevay;Claudia Baltazar;María Cortés;C. Montaño;J. Ortega;Rolando Coto;Hilaria Cruz;Alexis Palmer;Katharina Kann
  • Corresponding author:
    Katharina Kann

Other grants by Alexis Palmer

Collaborative Research: CCRI: New: Building a Broad Infrastructure for Uniform Meaning Representations
  • Award number:
    2213805
  • Fiscal year:
    2022
  • Award amount:
    $550,000
  • Award type:
    Standard Grant
CAREER: From One Language to Another
  • Award number:
    2149404
  • Fiscal year:
    2021
  • Award amount:
    $550,000
  • Award type:
    Continuing Grant

Similar overseas grants

Doctoral Dissertation Research: Aspect and Event Cognition in the Acquisition and Processing of a Second Language
  • Award number:
    2337763
  • Fiscal year:
    2024
  • Award amount:
    $550,000
  • Award type:
    Standard Grant
Collaborative Research: Conference: Large Language Models for Biological Discoveries (LLMs4Bio)
  • Award number:
    2411529
  • Fiscal year:
    2024
  • Award amount:
    $550,000
  • Award type:
    Standard Grant
Collaborative Research: Conference: Large Language Models for Biological Discoveries (LLMs4Bio)
  • Award number:
    2411530
  • Fiscal year:
    2024
  • Award amount:
    $550,000
  • Award type:
    Standard Grant
REU Site: Recent Advances in Natural Language Processing
  • Award number:
    2349452
  • Fiscal year:
    2024
  • Award amount:
    $550,000
  • Award type:
    Standard Grant
EAGER: Accelerating decarbonization by representing catalysts with natural language
  • Award number:
    2345734
  • Fiscal year:
    2024
  • Award amount:
    $550,000
  • Award type:
    Standard Grant
SBIR Phase II: Intelligent Language Learning Environment
  • Award number:
    2335265
  • Fiscal year:
    2024
  • Award amount:
    $550,000
  • Award type:
    Cooperative Agreement
Conference: Bridging Child Language Research to Practice for Language Revitalization
  • Award number:
    2331639
  • Fiscal year:
    2024
  • Award amount:
    $550,000
  • Award type:
    Standard Grant
Combining eye-tracking and comparative judgments to identify proficiency differences for more effective language learning
  • Award number:
    24K16140
  • Fiscal year:
    2024
  • Award amount:
    $550,000
  • Award type:
    Grant-in-Aid for Early-Career Scientists
Reel Voices: Empowering Language Learners Through Filmmaking
  • Award number:
    24K04057
  • Fiscal year:
    2024
  • Award amount:
    $550,000
  • Award type:
    Grant-in-Aid for Scientific Research (C)
Teacher and learner views on motivation in the language classroom
  • Award number:
    24K04067
  • Fiscal year:
    2024
  • Award amount:
    $550,000
  • Award type:
    Grant-in-Aid for Scientific Research (C)