NNA: Collaborative Research: Integrating Language Documentation and Computational Tools for Yupik, an Alaska Native Language

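Basic Information

  • Award number:
    1761680
  • Principal investigator:
  • Amount:
    $304,400
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Continuing Grant
  • Fiscal year:
    2018
  • Funding country:
    United States
  • Project period:
    2018-08-01 to 2022-10-31
  • Project status:
    Completed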

Project Abstract

One locus of crosslinguistic variation in how languages build words is whether meaning is encoded in free morphemes ('units of meaning') that stand alone as words, or whether those morphemes must combine with other morphemes to become words. While English has many free morphemes, the Alaska Native language St. Lawrence Island/Siberian Yupik uses the second strategy with very complex words, often sentence-sized. These properties are known as agglutination and polysynthesis. Researchers will document critical structures in the language, digitize existing Yupik materials, and build computational tools to help the community and other researchers. The data from Yupik are extremely important to language science, since many of the phenomena displayed in the language are rare and not well understood. Creating computational tools for languages with very complex words, like Yupik, is of additional benefit to computer scientists and language scientists in that it helps researchers improve computational tools for languages like English. The Native American Languages Act, passed by the U.S. Congress in 1990, enacted into policy the recognition of the unique status and importance of Native American languages. This project will build and improve tools like a morphological analyzer, a spellchecker, and a searchable dictionary, of value to the community in revitalizing their language. Graduate students will be trained in these methods, and researchers will hold outreach meetings with high school students in the language community to teach them important computer and coding skills that will enable them to build further tools. All data gathered will be permanently archived at the Alaska Native Language Archive.
The investigators, a collaboration of language and computer scientists from the University of Illinois at Urbana-Champaign and George Mason University, will undertake this project. It involves three interconnected parts: digitization of existing materials on and in Yupik for use by community members and researchers; recording and analyzing the speech of Yupik speakers; and working with the community to build computer tools for Yupik and teaching students how to do so. A successful computational model of Yupik linguistic phenomena has implications for unsupervised and semi-supervised methods in morphology induction and grammar induction because the types of morphophonological change are pervasive, much more so than models used in other approaches to unsupervised morphology induction. This work is likely to have important implications regarding appropriate computational modeling of polysynthetic agglutinative morphosyntax. Accessing materials at several archives, the team will scan them, and clean and process the scans so they are accessible digitally and searchable. This will create a digital corpus of Yupik materials for use by the community and for linguistic investigations into grammatical mood, tense, and aspect to better understand these complex morphosemantic constructions. The data will also improve the computational tools being developed in this project, providing the Yupik community with access to modern tools like spellcheckers, electronically searchable dictionaries, and electronic books. Finally, in its tight integration of field work and the development of computational tools for the analysis of the language, this project will serve as a model for future collaborations of this kind.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project Outcomes

Journal articles (15)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Liinnaqumalghiit: A Web-based Tool for Addressing Orthographic Transparency in St. Lawrence Island / Central Siberian Yupik.
Neural Polysynthetic Language Modelling
  • DOI:
  • Publication date:
    2020-05
  • Journal:
  • Impact factor:
    0
  • Authors:
    Lane Schwartz;Francis M. Tyers;Lori S. Levin;Christo Kirov;Patrick Littell;Chi-kiu (羅致翹) Lo;Emily Prudhommeaux;Hyunji Hayley Park;K. Steimel;Rebecca Knowles;J. Micher;Lonny Strunk;Han Liu;Coleman Haley;Katherine J. Zhang;Robbie Jimmerson;Vasilisa Andriyanets;Aldrian Obaja Muis;Naoki Otani;J. Park;Zhisong Zhang
  • Corresponding author:
    Lane Schwartz;Francis M. Tyers;Lori S. Levin;Christo Kirov;Patrick Littell;Chi-kiu (羅致翹) Lo;Emily Prudhommeaux;Hyunji Hayley Park;K. Steimel;Rebecca Knowles;J. Micher;Lonny Strunk;Han Liu;Coleman Haley;Katherine J. Zhang;Robbie Jimmerson;Vasilisa Andriyanets;Aldrian Obaja Muis;Naoki Otani;J. Park;Zhisong Zhang
A Morphological Analyzer for St. Lawrence Island / Central Siberian Yupik
  • DOI:
  • Publication date:
    2018
  • Journal:
  • Impact factor:
    0
  • Authors:
    Chen, Emily;Schwartz, Lane
  • Corresponding author:
    Schwartz, Lane
Expanding Universal Dependencies for Polysynthetic Languages: A Case of St. Lawrence Island Yupik
Morphology Matters: A Multilingual Language Modeling Analysis

Other publications by Lane Schwartz

Fast, Scalable Phrase-Based SMT Decoding
  • DOI:
  • Publication date:
    2016
  • Journal:
  • Impact factor:
    0
  • Authors:
    Hieu T. Hoang;Nikolay Bogoychev;Lane Schwartz;Marcin Junczys
  • Corresponding author:
    Marcin Junczys
The MIT-LL/AFRL IWSLT-2013 MT system
  • DOI:
  • Publication date:
    2013
  • Journal:
  • Impact factor:
    0
  • Authors:
    Michaeel Kazi;Michael Coury;Elizabeth Salesky;Jessica Ray;Wade Shen;Terry P. Gleason;Tim Anderson;Grant Erdmann;Lane Schwartz;Brian M. Ore;Raymond E. Slyh;Jeremy Gwinnup;Katherine Young;M. Hutt
  • Corresponding author:
    M. Hutt
Reproducible Results in Parsing-Based Machine Translation: The JHU Shared Task Submission
  • DOI:
  • Publication date:
    2010
  • Journal:
  • Impact factor:
    0
  • Authors:
    Lane Schwartz
  • Corresponding author:
    Lane Schwartz
The history and promise of machine translation
  • DOI:
    10.1075/ata.18.08sch
  • Publication date:
    2018
  • Journal:
  • Impact factor:
    0
  • Authors:
    Lane Schwartz
  • Corresponding author:
    Lane Schwartz
Robust Incremental Parsing using Human-Like Memory Constraints
  • DOI:
  • Publication date:
    2008
  • Journal:
  • Impact factor:
    0
  • Authors:
    William Schuler;S. Abdelrahman;Tim Miller;Lane Schwartz
  • Corresponding author:
    Lane Schwartz

Other grants by Lane Schwartz

NNA: Collaborative Research: Integrating Language Documentation and Computational Tools for Yupik, an Alaska Native Language
  • Award number:
    2243445
  • Fiscal year:
    2022
  • Funding amount:
    $304,400
  • Project type:
    Continuing Grant
ComputEL: A workshop to explore the use of computational methods in the study of endangered language
  • Award number:
    1550905
  • Fiscal year:
    2015
  • Funding amount:
    $304,400
  • Project type:
    Standard Grant

Similar overseas grants

Collaborative Research: NNA Research: Electric Vehicles in the Arctic (EVITA) - Interactions with Cold Weather, Microgrids, People, and Policy
  • Award number:
    2318385
  • Fiscal year:
    2024
  • Funding amount:
    $304,400
  • Project type:
    Standard Grant
Collaborative Research: NNA Research: Electric Vehicles in the Arctic (EVITA) - Interactions with Cold Weather, Microgrids, People, and Policy
  • Award number:
    2318384
  • Fiscal year:
    2024
  • Funding amount:
    $304,400
  • Project type:
    Standard Grant
NNA Incubator: Collaborative Research: Indigenous-led Strategies for Co-Productive and Convergent Arctic Research
  • Award number:
    2318276
  • Fiscal year:
    2023
  • Funding amount:
    $304,400
  • Project type:
    Standard Grant
NNA Collaboratory: Collaborative Research: ACTION - Alaska Coastal Cooperative for Co-producing Transformative Ideas and Opportunities in the North
  • Award number:
    2318377
  • Fiscal year:
    2023
  • Funding amount:
    $304,400
  • Project type:
    Cooperative Agreement
NNA Collaboratory: Collaborative Research: ACTION - Alaska Coastal Cooperative for Co-producing Transformative Ideas and Opportunities in the North
  • Award number:
    2318375
  • Fiscal year:
    2023
  • Funding amount:
    $304,400
  • Project type:
    Cooperative Agreement
NNA Research: Collaborative Research: Socio-Ecological Systems Transformation in River basins of the sub-Arctic under climate change (SESTRA)
  • Award number:
    2318383
  • Fiscal year:
    2023
  • Funding amount:
    $304,400
  • Project type:
    Standard Grant
NNA Research: Collaborative Research: Arctic, Climate, and Earthquakes (ACE): Seismic Resilience and Adaptation of Arctic Infrastructure and Social Systems amid Changing Climate
  • Award number:
    2220221
  • Fiscal year:
    2023
  • Funding amount:
    $304,400
  • Project type:
    Standard Grant
NNA Research: Collaborative Research: Towards resilient water infrastructure in Alaska Native communities through knowledge co-production
  • Award number:
    2220518
  • Fiscal year:
    2023
  • Funding amount:
    $304,400
  • Project type:
    Standard Grant
NNA Research: Collaborative Research: Towards resilient water infrastructure in Alaska Native communities through knowledge co-production
  • Award number:
    2220516
  • Fiscal year:
    2023
  • Funding amount:
    $304,400
  • Project type:
    Standard Grant
NNA Research: Collaborative Research: Towards resilient water infrastructure in Alaska Native communities through knowledge co-production
  • Award number:
    2220517
  • Fiscal year:
    2023
  • Funding amount:
    $304,400
  • Project type:
    Standard Grant