Language Preservation 2.0: Crowdsourcing Oral Language Documentation using Mobile Devices
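Basic information
- Award number: 1160639
- Amount: $101,500
- Host institution country: United States
- Project type: Standard Grant
- Fiscal year: 2012
- Funding country: United States
- Project period: 2012-07-01 to 2014-12-31
- Project status: Completed
Project abstract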
Language Preservation 2.0
The purpose of this pilot project is to demonstrate the feasibility of a new approach to documenting endangered languages.
To allow wide-ranging investigation of a language even after it is no longer spoken, we need the equivalent of the million words of extant biblical Hebrew texts, or the five million words of extant classical Latin. But for endangered languages without a significant culture of literacy, diverse text collections on this scale seem out of reach. Given typical speaking rates of about 10,000 word-equivalents per hour, a hundred hours of recorded speech (conversations, narratives, or oral histories) would give us the equivalent of a million words of text. With community involvement, hundreds of hours of such recordings are easily within reach.
However, transcribing such large audio collections is a daunting task, given the small number of literate native speakers and the time-consuming nature of such transcription, which can take 200 hours of work for every hour of audio. We propose to solve this problem by substituting re-speaking and verbal translation: one or more native speakers repeat each phrase of a recording, speaking slowly and carefully, and then translate it into a better-documented language.
The utility of translated passages as a way to analyze otherwise-unknown languages has been demonstrated many times, starting with the Rosetta Stone. This aspect of our task is easier, since at least a grammatical sketch will in general be available. Our goal in this project is to demonstrate the utility of re-speaking. We believe that linguists, starting out with relatively little knowledge of a language, can produce phonetic transcriptions that will be good enough to support subsequent analysis resulting in coherent texts, in a process analogous to (but easier than) the process that allowed previous generations of scholars to learn to read ancient Egyptian or Sumerian.
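The scale argument in the abstract rests on simple arithmetic, which can be checked with a short sketch. The Python below is illustrative only: the function names are hypothetical, and the two constants are the figures quoted in the abstract (about 10,000 word-equivalents per hour of speech, and up to 200 hours of transcription work per hour of audio).

```python
# Figures quoted in the abstract; treated here as fixed assumptions.
WORDS_PER_AUDIO_HOUR = 10_000           # typical speaking rate, word-equivalents per hour
TRANSCRIPTION_HOURS_PER_AUDIO_HOUR = 200  # effort that conventional transcription can take


def word_equivalents(audio_hours: float) -> int:
    """Approximate corpus size, in word-equivalents, for a given amount of recorded speech."""
    return round(audio_hours * WORDS_PER_AUDIO_HOUR)


def transcription_effort_hours(audio_hours: float) -> float:
    """Hours of work needed to transcribe the recordings at the quoted rate."""
    return audio_hours * TRANSCRIPTION_HOURS_PER_AUDIO_HOUR


def audio_hours_needed(target_words: int) -> float:
    """Hours of recorded speech needed to reach a target corpus size."""
    return target_words / WORDS_PER_AUDIO_HOUR


if __name__ == "__main__":
    hours = 100
    print(word_equivalents(hours))            # 1000000, the biblical Hebrew scale
    print(transcription_effort_hours(hours))  # 20000 hours of transcription work
    print(audio_hours_needed(5_000_000))      # 500.0 hours for the classical Latin scale
```

Under these assumptions, a hundred hours of audio matches the million-word target but would cost on the order of 20,000 hours to transcribe conventionally, which is the bottleneck that re-speaking and verbal translation are meant to avoid.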
Project outputs
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Mark Liberman
Dimensions of Speech and Language Disturbance in Psychosis and Computational Linguistic Markers
- DOI: 10.1016/j.biopsych.2022.02.144
- Published: 2022-05-01
- Authors: Sunny Tang; Katrin Hänsel; Yan Cong; Sarah Berretta; Sunghye Cho; Amir Nikzad; Aarush Mehta; Sameer Pradhan; James Fiumara; Mark Liberman
- Corresponding author: Mark Liberman
Ruptured Appendicitis after Laparoscopic Roux-en-Y Gastric Bypass: Pitfalls in Diagnosing a Surgical Abdomen in the Morbidly Obese
- DOI: 10.1381/096089203322618812
- Published: 2003-12-01
- Impact factor: 3.100
- Authors: Amir Mehran; Mark Liberman; Raul Rosenthal; Samuel Szomstein
- Corresponding author: Samuel Szomstein
CLiFF Notes: Research in the Language, Information and Computation Laboratory of the University of Pennsylvania
- Published: 1995
- Impact factor: 0
- Authors: Norm Badler; F. B. Baldwin; Nicola J. Bessell; Eric Brill; Sharon Cote; Barbara Di Eugenio; Alexis Dimitriadis; Jon Freeman; Christopher W. Geib; A. Gertner; Daniel Hardt; Michael Hegarty; Shyam Kapur; Jonathan Kaye; Michael H. Kelly; Libby Levison; Mark Liberman; D. R. Mani; Mitch Marcus Michael; B. Moore; Michael Niv; Charles L. Ortiz; Jong Cheol Park; Sandeep Prasada Scott
- Corresponding author: Sandeep Prasada Scott
/l/ VARIATION IN AMERICAN ENGLISH: A CORPUS
- Published: 2012
- Impact factor: 0
- Authors: Jiahong Yuan; Mark Liberman
- Corresponding author: Mark Liberman
Why underlying representations?
- Impact factor: 0
- Authors: Larry M. Hyman; Jeffrey Heinz; Sharon Inkelas; Keith Johnson; Mark Liberman
- Corresponding author: Mark Liberman
Other grants held by Mark Liberman
CI-NEW: NIEUW: Novel Incentives and Workflows in Linguistic Data Collection and Annotation
- Award number: 1730377
- Fiscal year: 2017
- Funding amount: $101,500
- Project type: Standard Grant
Prosodic Systems in New Guinea: Integrating computational and typological approaches to linguistic analysis
- Award number: 0951651
- Fiscal year: 2010
- Funding amount: $101,500
- Project type: Standard Grant
Collaborative Research: OLAC: Accessing the World's Language Resources
- Award number: 0723357
- Fiscal year: 2007
- Funding amount: $101,500
- Project type: Continuing Grant
ITR-SCOTUS: A Resource for Collaborative Research in Speech Technology, Linguistics, Decision Processes and the Law
- Award number: 0325739
- Fiscal year: 2003
- Funding amount: $101,500
- Project type: Continuing Grant
Electronic Materials For Natural Language Research
- Award number: 9113530
- Fiscal year: 1991
- Funding amount: $101,500
- Project type: Standard Grant
Similar overseas grants
Real Versus Digital: Sustainability optimization for cultural heritage preservation in national libraries
- Award number: AH/Z000041/1
- Fiscal year: 2024
- Funding amount: $101,500
- Project type: Research Grant
Towards preservation of the natural knee: State-of-the-art approaches to understand the kinematics and tissue mechanics of human menisci in vivo.
- Award number: EP/Y002415/1
- Fiscal year: 2024
- Funding amount: $101,500
- Project type: Research Grant
I-Corps: Translation Potential of Multi-component Bioactives for Breastmilk Preservation
- Award number: 2409744
- Fiscal year: 2024
- Funding amount: $101,500
- Project type: Standard Grant
I-Corps: Using Peptides for Biomolecules Encapsulation, Storage, and Preservation
- Award number: 2414552
- Fiscal year: 2024
- Funding amount: $101,500
- Project type: Standard Grant
Protecting spermatogonial stem cells from chemotherapy-induced damage for fertility preservation in childhood cancer
- Award number: MR/Y011783/1
- Fiscal year: 2024
- Funding amount: $101,500
- Project type: Fellowship
Inclusive AI for Healthy Change, Retaining Identity Preservation
- Award number: 10059947
- Fiscal year: 2023
- Funding amount: $101,500
- Project type: Grant for R&D
Development of high-dimensional optical analysis technology for preservation and restoration of cultural properties
- Award number: 23H00499
- Fiscal year: 2023
- Funding amount: $101,500
- Project type: Grant-in-Aid for Scientific Research (A)
Development of collection and preservation criteria and search terms for 'folk material': from factory parts to diaries
- Award number: 23K00959
- Fiscal year: 2023
- Funding amount: $101,500
- Project type: Grant-in-Aid for Scientific Research (C)
I-Corps: Tri-Cure Hybrid Organo-Silicon Coatings for Surface Preservation
- Award number: 2327701
- Fiscal year: 2023
- Funding amount: $101,500
- Project type: Standard Grant
Investigation of adsorption of exosomes on porous materials and regulating the behavior to create separation, purification and preservation techniques
- Award number: 23KJ0192
- Fiscal year: 2023
- Funding amount: $101,500
- Project type: Grant-in-Aid for JSPS Fellows