EAGER: Matching Non-Native Transcribers to the Distinctive Features of the Language Transcribed
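Basic information
- Award number: 1550145
- Principal investigator:
- Amount: $150,000
- Host institution:
- Host institution country: United States
- Project type: Standard Grant
- Fiscal year: 2015
- Funding country: United States
- Start and end dates: 2015-08-01 to 2018-07-31
- Project status: Completed
- Source:
- Keywords: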
Project abstract
Automatic speech recognition (ASR) systems must be trained using hundreds of hours of speech, with synchronized text transcriptions. Transcribing that much speech is beyond the means of most language communities; therefore, ASR systems do not exist for most languages. To overcome this bottleneck, this exploratory EAGER project asks people who don't understand a particular language to transcribe it as if they were listening to nonsense syllables. Of course, when people try to transcribe speech in a language they don't understand, they make mistakes. However, there are patterns to those mistakes, which can be modeled using decoding strategies developed for telephone and wireless communication, and used to route each transcription task to people whose native language helps them to perform it. The resulting transcriptions are then fused in order to recover correct transcriptions. Five different languages are to be tested, including languages with lexical tone, and languages with a variety of consonant contrasts very different from English. The resulting transcriptions can then train ASR systems in all five languages, and the quality of the research is evaluated based on its ability to train those systems without using transcriptions produced by native speakers.
Mismatched crowdsourcing is formalized as a noisy channel: the talker encodes meaning in a string of symbols (phonemes), not all of which are reliably distinguishable by the perceiver. Models of second-language speech perception for each transcriber can be initialized using a perceptual assimilation model, then specialized. In particular, this proposal seeks increases in the scale and robustness of mismatched crowdsourcing by using error-correcting codes to divide the transcription task, and by then distributing each sub-task to transcribers whose native language contains the distinctive feature requested. It also seeks to develop new theory at the intersection of the current fields of crowdsourcing (the learnability of a function under conditions of label noise) and grammar induction (the learnability of a function from one language to another), and to perform grammar induction under conditions of label noise. Preliminary bounds exist for some aspects of this problem; the proposed research is designed to develop more detailed theoretic results, and test and apply them to determine the feasibility of creating serviceable ASR systems for under-resourced languages without having to use fluent speakers of those languages to transcribe speech in those languages.
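The routing and fusion idea described in the abstract can be illustrated with a minimal sketch. It is not the project's actual method: the feature set, the codebook, the transcriber pool, and the function names `route`, `majority`, and `decode` are all hypothetical, and the toy codebook has no guaranteed minimum distance the way a real error-correcting code would. Each phoneme is treated as a codeword of binary distinctive features, each feature question is routed to transcribers whose native language is assumed to use that contrast, and the answers are fused by majority vote followed by nearest-codeword decoding.

```python
# Hypothetical feature order and codebook: each phoneme is a codeword of
# binary distinctive features (voiced, aspirated, nasal).
FEATURES = ("voiced", "aspirated", "nasal")
CODEBOOK = {
    "p": (0, 0, 0),
    "ph": (0, 1, 0),
    "b": (1, 0, 0),
    "m": (1, 0, 1),
}

# Hypothetical transcriber pool: the features assumed to be contrastive
# in each transcriber's native language.
TRANSCRIBERS = {
    "t1": {"voiced", "nasal"},
    "t2": {"aspirated", "nasal"},
    "t3": {"voiced", "aspirated"},
}


def route(feature):
    """Return the transcribers whose native language uses this feature."""
    return sorted(t for t, feats in TRANSCRIBERS.items() if feature in feats)


def majority(bits):
    """Fuse binary answers by majority vote; ties resolve to 0."""
    return int(2 * sum(bits) > len(bits))


def decode(votes):
    """Map per-feature votes to the nearest codeword in Hamming distance.

    votes: dict mapping each feature name to a list of 0/1 answers.
    """
    received = tuple(majority(votes[f]) for f in FEATURES)

    def hamming(phoneme):
        return sum(a != b for a, b in zip(CODEBOOK[phoneme], received))

    return min(CODEBOOK, key=hamming)


if __name__ == "__main__":
    print(route("nasal"))
    # The fused word (1, 1, 1) is not a codeword; decoding picks the closest one.
    print(decode({"voiced": [1, 1], "aspirated": [1], "nasal": [1, 1]}))
```

In this sketch an invalid fused feature vector is still mapped to a phoneme, which is the sense in which redundancy across sub-tasks can absorb some transcriber errors. A fuller model would replace the majority vote with per-transcriber noisy-channel probabilities.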
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants awarded to Mark Hasegawa-Johnson
FAI: A New Paradigm for the Evaluation and Training of Inclusive Automatic Speech Recognition
- Award number: 2147350
- Fiscal year: 2022
- Funding amount: $150,000
- Project type: Standard Grant
RI: Small: Collaborative Research: Automatic Creation of New Speech Sound Inventories
- Award number: 1910319
- Fiscal year: 2019
- Funding amount: $150,000
- Project type: Standard Grant
FODAVA-Partner: Visualizing Audio for Anomaly Detection
- Award number: 0807329
- Fiscal year: 2008
- Funding amount: $150,000
- Project type: Continuing Grant
RI Medium: Audio Diarization - Towards Comprehensive Description of Audio Events
- Award number: 0803219
- Fiscal year: 2008
- Funding amount: $150,000
- Project type: Standard Grant
Audiovisual Distinctive-Feature-Based Recognition of Dysarthric Speech
- Award number: 0534106
- Fiscal year: 2005
- Funding amount: $150,000
- Project type: Continuing Grant
Prosodic, Intonational, and Voice Quality Correlates of Disfluency
- Award number: 0414117
- Fiscal year: 2004
- Funding amount: $150,000
- Project type: Continuing Grant
CAREER: Landmark-Based Speech Recognition in Music and Speech Backgrounds
- Award number: 0132900
- Fiscal year: 2002
- Funding amount: $150,000
- Project type: Continuing Grant
Similar overseas grants
Efficient Matching, Continuous Voting, and Non-Contractable Critical Information
- Award number: 2049810
- Fiscal year: 2021
- Funding amount: $150,000
- Project type: Standard Grant
Research on Structured Feature Extraction and Matching Method for Video Retrieval on Non-Rigid Objects
- Award number: 15K00251
- Fiscal year: 2015
- Funding amount: $150,000
- Project type: Grant-in-Aid for Scientific Research (C)
Flexible and Accurate Recognition for Non-Rigid Object using Graph Matching
- Award number: 15H06009
- Fiscal year: 2015
- Funding amount: $150,000
- Project type: Grant-in-Aid for Research Activity Start-up
Robust pattern recognition under shearing and perspective distortions based on Radon transform and non-linear matching
- Award number: 26330206
- Fiscal year: 2014
- Funding amount: $150,000
- Project type: Grant-in-Aid for Scientific Research (C)
Materials World Network: Quasi-Phase Matching in Non-Centrosymmetric Wide Band Gap Semiconductors.
- Award number: 1312582
- Fiscal year: 2013
- Funding amount: $150,000
- Project type: Continuing Grant
III: Small: Collaborative Research: Solving Matching Problems in Machine Learning with Non-commutative Harmonic Analysis
- Award number: 1320755
- Fiscal year: 2013
- Funding amount: $150,000
- Project type: Standard Grant
III: Small: Collaborative Research: Solving Matching Problems in Machine Learning with Non-commutative Harmonic Analysis
- Award number: 1320344
- Fiscal year: 2013
- Funding amount: $150,000
- Project type: Standard Grant
Design of broadband antennas using Non-foster matching circuit and wideband CP antennas
- Award number: 25420378
- Fiscal year: 2013
- Funding amount: $150,000
- Project type: Grant-in-Aid for Scientific Research (C)
Research on the mechanism for handling non-uniform GPGPU applications by data-driven matching
- Award number: 23650022
- Fiscal year: 2011
- Funding amount: $150,000
- Project type: Grant-in-Aid for Challenging Exploratory Research
Research and Development of Next Generation Cell Function Analysis System by Non-contact Motion Control and Rotational-Invariant Matching
- Award number: 21760177
- Fiscal year: 2009
- Funding amount: $150,000
- Project type: Grant-in-Aid for Young Scientists (B)