Computational approaches to human spoken word recognition

Basic information

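Award number:
1754284
Principal investigator:
James Magnuson
Amount:
$602,300
Awardee institution:
University of Connecticut
Institution country:
United States
Award type:
Continuing Grant
Fiscal year:
2018
Funding country:
United States
Period:
2018-03-15 to 2023-02-28
Project status:
Completed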

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=1754284&HistoricalAwards=false
Keywords:
Computational approaches human spoken word

Project abstract

This project addresses one of the grand challenges facing cognitive science -- how humans understand speech. People recognize words far more easily than even the best computer speech recognition systems, even though the actual sounds we hear as consonants and vowels vary greatly depending on context (what sounds come before or after), who is talking, and the setting (a quiet room versus a crowded airport). Most current models of speech recognition cannot handle the huge variability in real speech because they do not operate on the actual speech signal. Also, they do not learn, so they cannot model how people acquire language. This project addresses these challenges by comparing current models of speech recognition to each other and to human capabilities, with the goal of understanding how human speech processing is so robust and flexible. In addition, simplified "deep learning" networks will be developed and evaluated as models of human speech recognition. Deep learning networks are similar to cognitive models in that they learn abstract representations of the data, not task-specific rules or algorithms. These networks have been used to create accurate commercial speech recognition systems. By comparing them to human performance, the investigators may provide new insights into why human speech recognition is so robust. The results of this project will have technical implications (better understanding of human flexibility may aid in improving computer speech recognition) and health implications (better understanding of human speech recognition will aid in developing better interventions for language disorders). The project will also support the training of a postdoctoral researcher and a PhD student, both of whom will develop skills that can be used to contribute to research and development in academia or industry. 
This project focuses on the development of a "shallow deep network" model called "DeepListener" that will be compared with the behavior of human listeners. A close match in the millisecond-level behavior of the network (for example, in which words are temporarily confusable with each other) and human performance suggests that human speech processing may emerge from similar principles as those in the model. In preliminary work, DeepListener learned to recognize 93% of 2000 real words (200 words produced by 10 talkers). DeepListener will be evaluated by detailed comparison to standard neural network models of cognitive theories and to human performance. The ways in which DeepListener is similar and dissimilar to human performance and competing models will help to advance scientific theories of human speech recognition. This project will follow emerging standards for open science: experiments will be pre-registered and data and computer code will be made freely and publicly available. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project outputs

Journal articles (18)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Does predictive processing imply predictive coding in models of spoken word recognition?

DOI:
Year published:
2020
Journal:
Proceedings of the Cognitive Science Society
Impact factor:
0
Authors:
Magnuson, J. S.;Li, M.;Luthra, S.;You, H.;Steiner, R.
Corresponding author:
Steiner, R.

LexFindR: A fast, simple, and extensible R package for finding similar words in a lexicon

DOI:
10.3758/s13428-021-01667-6
Year published:
2021
Journal:
Behavior Research Methods
Impact factor:
5.4
Authors:
Li, ZhaoBin;Crinnion, Anne Marie;Magnuson, James S.
Corresponding author:
Magnuson, James S.

EARSHOT: A minimal network model of human speech recognition that operates on real speech

DOI:
Year published:
2019
Journal:
Impact factor:
0
Authors:
J. Magnuson;Heejo You;J. Rueckl;Paul D. Allopenna;Monica Li;Sahil Luthra;Rachael Steiner;Hosung Nam;M. Escabí;K. Brown;Rachel M. Theodore;Nicholas Monto
Corresponding author:
J. Magnuson;Heejo You;J. Rueckl;Paul D. Allopenna;Monica Li;Sahil Luthra;Rachael Steiner;Hosung Nam;M. Escabí;K. Brown;Rachel M. Theodore;Nicholas Monto

Word length, proportion of overlap, and phonological competition in spoken word recognition