Automatic grammar engineering for endangered languages based on cross-linguistic resources

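Basic information

  • Award number:
    1561833
  • Principal investigator:
  • Amount:
    $429,900
  • Host institution:
  • Host institution country:
    United States
  • Award type:
    Standard Grant
  • Fiscal year:
    2016
  • Funding country:
    United States
  • Project period:
    2016-08-15 to 2021-01-31
  • Project status:
    Completed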

Project abstract

Grammar engineering is the process of creating computer models of the grammar of languages, including how words are formed from smaller meaningful parts, how words are put together into sentences, and how the meaning of sentences is built from their structure and the meaning of their parts. This project is automatically creating computational grammars by combining computational techniques developed for well-studied languages, data collected and annotated by field linguists, and a cross-linguistic grammar resource (the LinGO Grammar Matrix). Computational grammars enrich the results of language documentation because they can be used to automatically create further annotations (of word structure, sentence structure, and meaning). Text annotated in this way can be searched both for word forms or structures of interest and for examples that fall outside current hypotheses, helping linguists zero in more rapidly on the data of interest. Broader impacts include the training of graduate students and the development of computational tools of potential use to groups ranging from linguists to endangered or low-resource language communities.

The AGGREGATION Project aims to bring the benefits of grammar engineering to the urgent task of documenting endangered languages. In Phase II, the AGGREGATION Project will pursue two related sets of goals: (1) expanding the coverage of the cross-linguistic resource and the resulting computational grammars, and (2) creating interfaces to realize the potential of the grammars, the annotations they produce, and the intermediate outputs of our grammar creation system as analytical tools for field linguists. The overall system and its interfaces are general tools, meant to bring the power of computational processing to field linguists. To ensure their broad applicability, the tools will be developed using three languages from different language families and different parts of the world as case studies: Chintang (a Kiranti language of Nepal), Matsigenka (an Arawak language of Peru), and Abui (an Alor-Pantar language of Indonesia). This project is supported by NSF's Robust Intelligence Program in CISE.

Project outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other publications by Emily Bender

The Syntax of Mandarin Bǎ: Reconsidering the Verbal Analysis

Other grants by Emily Bender

2015 Association for Computational Linguistics (ACL) Student Research Workshop
  • Award number:
    1545471
  • Fiscal year:
    2015
  • Funding amount:
    $429,900
  • Award type:
    Standard Grant
AGGREGATION: Automatic Generation of Grammars for Endangered Languages from Glosses and Typological Information [ctn, ing, inh]
  • Award number:
    1160274
  • Fiscal year:
    2012
  • Funding amount:
    $429,900
  • Award type:
    Standard Grant
Cyberling 2009 Workshop: Towards a Cyberinfrastructure for Linguistics
  • Award number:
    0936577
  • Fiscal year:
    2009
  • Funding amount:
    $429,900
  • Award type:
    Standard Grant
CAREER: The Grammar Matrix: Computational Linguistic Typology
  • Award number:
    0644097
  • Fiscal year:
    2007
  • Funding amount:
    $429,900
  • Award type:
    Continuing Grant

Similar overseas grants

PlantSynBio: Deciphering the grammar of crop regulatory DNA for precise engineering of gene expression
  • Award number:
    2240888
  • Fiscal year:
    2023
  • Funding amount:
    $429,900
  • Award type:
    Standard Grant
Decoding the Spatial Grammar of Developmental Signaling
  • Award number:
    10687505
  • Fiscal year:
    2023
  • Funding amount:
    $429,900
  • Award type:
A Molecular Grammar for Guide RNAs (gRNAs) with Engineered Secondary Structures
  • Award number:
    10683334
  • Fiscal year:
    2022
  • Funding amount:
    $429,900
  • Award type:
A Molecular Grammar for Guide RNAs (gRNAs) with Engineered Secondary Structures
  • Award number:
    10511156
  • Fiscal year:
    2022
  • Funding amount:
    $429,900
  • Award type:
Grammar-Driven Genomic Data Visualization
  • Award number:
    10452031
  • Fiscal year:
    2022
  • Funding amount:
    $429,900
  • Award type:
Grammar-Driven Genomic Data Visualization
  • Award number:
    10646478
  • Fiscal year:
    2022
  • Funding amount:
    $429,900
  • Award type:
Elucidating the cis-regulatory grammar of human photoreceptors
  • Award number:
    10372052
  • Fiscal year:
    2020
  • Funding amount:
    $429,900
  • Award type:
Elucidating the cis-regulatory grammar of human photoreceptors
  • Award number:
    10601005
  • Fiscal year:
    2020
  • Funding amount:
    $429,900
  • Award type:
Patterns in the academic language of engineering. A corpus-based construction grammar analysis
  • Award number:
    290777355
  • Fiscal year:
    2016
  • Funding amount:
    $429,900
  • Award type:
    Research Grants
Reverse Engineering State Machine Hierarchies by Grammar Inference (REGI)
  • Award number:
    EP/F065825/1
  • Fiscal year:
    2009
  • Funding amount:
    $429,900
  • Award type:
    Research Grant