Collaborative Research: Contributions of Endangered Language Data for Advances in Technology-enhanced Speech Annotation

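Basic information

  • Award number:
    1500595
  • Principal investigator:
  • Amount:
    $227,800
  • Host institution:
  • Host institution country:
    United States
  • Award type:
    Standard Grant
  • Fiscal year:
    2015
  • Funding country:
    United States
  • Start and end dates:
    2015-07-01 to 2020-06-30
  • Project status:
    Completed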

Project abstract

Linguists have increased efforts to collect authentic speech materials from endangered and little-studied languages to discover linguistic diversity. However, the challenge of transcribing this speech into written form to facilitate analysis is daunting, because of both the sheer quantity of digitally collected speech that needs to be transcribed and the difficulty of unpacking the sounds of spoken language. Linguist Andreas Kathol and computer scientist Vikramjit Mitra of SRI International and linguist Jonathan D. Amith of Gettysburg College will team up to create software that can substantially reduce the language transcription bottleneck. Using as a test case Yoloxochitl Mixtec, an endangered language from the state of Guerrero, Mexico, the team will develop a software tool that will use previously transcribed Yoloxochitl Mixtec speech data both to train a new generation of native speakers in practical orthography and to develop automatic speech recognition software. The output of the recognition software will be used as a preliminary transcription that native speakers will correct, as necessary, to create additional high-quality training data. This recursive method will create a corpus of transcribed speech large enough for the software to complete automatic transcription of newly collected speech materials. The project will include the training of undergraduate and graduate students in software development and the analysis of the Yoloxochitl Mixtec sound system. The project will also train native speakers as documenters in an interactive fashion that systematically introduces them to the transcription conventions of their language. This software tool will help in establishing literacy in Yoloxochitl Mixtec among a broader base of speakers.
The results of this project will be available at the Archive of Indigenous Languages of Latin America (University of Texas, Austin), Kaipuleohone (University of Hawai'i Digital Language Archive), and at the Linguistic Data Consortium (University of Pennsylvania).

Project outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other grants by Jonathan Amith

Collaborative Research: RI: Medium: From Acoustic Signal to Morphosyntactic Analysis in One End-to-End Neural System
  • Award number:
    2211952
  • Fiscal year:
    2022
  • Amount:
    $227,800
  • Award type:
    Standard Grant
A comparative database for biologists, botanists, and linguists
  • Award number:
    2109821
  • Fiscal year:
    2021
  • Amount:
    $227,800
  • Award type:
    Standard Grant
Collaborative Research: Improving Techniques of Automatic Speech Recognition and Transfer Learning using Documentary Linguistic Corpora
  • Award number:
    2123578
  • Fiscal year:
    2021
  • Amount:
    $227,800
  • Award type:
    Standard Grant
Documentation of discourse and cultural activities to advance scientific knowledge of an endangered tonal language
  • Award number:
    1761421
  • Fiscal year:
    2018
  • Amount:
    $227,800
  • Award type:
    Continuing Grant
Documenting Traditional Ecological Knowledge in the Sierra Nororiental de Puebla, Mexico, in Synchronic and Diachronic Perspectives
  • Award number:
    1401178
  • Fiscal year:
    2014
  • Amount:
    $227,800
  • Award type:
    Standard Grant
Corpus and lexicon development: Endangered genres of discourse and domains of cultural knowledge in Tu'un isavi (Mixtec) of Yoloxochitl, Guerrero
  • Award number:
    0966462
  • Fiscal year:
    2010
  • Amount:
    $227,800
  • Award type:
    Standard Grant
Nahuatl Language Documentation Project: Sierra Norte de Puebla [ISO 639 azz]
  • Award number:
    0756536
  • Fiscal year:
    2008
  • Amount:
    $227,800
  • Award type:
    Standard Grant
Guerrero Nahuatl Language Documentation and Lexicon Enrichment Project
  • Award number:
    0504164
  • Fiscal year:
    2005
  • Amount:
    $227,800
  • Award type:
    Standard Grant

Similar NSFC (National Natural Science Foundation of China) grants

Research on Quantum Field Theory without a Lagrangian Description
  • Award number:
    24ZR1403900
  • Approval year:
    2024
  • Amount:
    CNY 0
  • Award type:
    Provincial or municipal project
Cell Research
  • Award number:
    31224802
  • Approval year:
    2012
  • Amount:
    CNY 240,000
  • Award type:
    Special fund project
Cell Research
  • Award number:
    31024804
  • Approval year:
    2010
  • Amount:
    CNY 240,000
  • Award type:
    Special fund project
Cell Research
  • Award number:
    30824808
  • Approval year:
    2008
  • Amount:
    CNY 240,000
  • Award type:
    Special fund project
Research on the Rapid Growth Mechanism of KDP Crystal
  • Award number:
    10774081
  • Approval year:
    2007
  • Amount:
    CNY 450,000
  • Award type:
    General Program

Similar overseas grants

Collaborative Research: Back to the Future: Assimilating Paleo Thinning Rates and Grounding Line Positions to Constrain Future Antarctic Sea Level Contributions
  • Award number:
    2303344
  • Fiscal year:
    2023
  • Amount:
    $227,800
  • Award type:
    Standard Grant
Collaborative Research: Back to the Future: Assimilating Paleo Thinning Rates and Grounding Line Positions to Constrain Future Antarctic Sea Level Contributions
  • Award number:
    2303345
  • Fiscal year:
    2023
  • Amount:
    $227,800
  • Award type:
    Standard Grant
Collaborative Research: Elucidating the contributions of nonlinearities in musculotendon properties to enabling locomotion in unpredictable environments.
  • Award number:
    2128545
  • Fiscal year:
    2022
  • Amount:
    $227,800
  • Award type:
    Standard Grant
Collaborative Research: Elucidating the contributions of nonlinearities in musculotendon properties to enabling locomotion in unpredictable environments.
  • Award number:
    2128546
  • Fiscal year:
    2022
  • Amount:
    $227,800
  • Award type:
    Standard Grant
Collaborative Research: A general approach to partitioning contributions from multiple drivers affecting individuals, populations, and communities
  • Award number:
    1933612
  • Fiscal year:
    2020
  • Amount:
    $227,800
  • Award type:
    Standard Grant
Collaborative Research: Deconstructing the contributions of muscle intrinsic mechanics to control of locomotion using a novel Muscle Avatar approach
  • Award number:
    2016054
  • Fiscal year:
    2020
  • Amount:
    $227,800
  • Award type:
    Standard Grant
Collaborative Research: RoL: Detecting and predicting the relative contributions of fecundity and survival to fitness in changing environments
  • Award number:
    1951356
  • Fiscal year:
    2020
  • Amount:
    $227,800
  • Award type:
    Standard Grant
Collaborative Research: RoL: Detecting and predicting the relative contributions of fecundity and survival to fitness in changing environments
  • Award number:
    1951364
  • Fiscal year:
    2020
  • Amount:
    $227,800
  • Award type:
    Standard Grant
Collaborative Research: A general approach to partitioning contributions from multiple drivers affecting individuals, populations, and communities
  • Award number:
    1933497
  • Fiscal year:
    2020
  • Amount:
    $227,800
  • Award type:
    Standard Grant
Collaborative Research: A general approach to partitioning contributions from multiple drivers affecting individuals, populations, and communities
  • Award number:
    1933561
  • Fiscal year:
    2020
  • Amount:
    $227,800
  • Award type:
    Standard Grant