
Enhancing research on speech and deep learning through holistic acoustic analysis

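Basic information

Award number:
2219843
Principal investigator:
Matthew Goldrick
Amount:
$1 million
Host institution:
Northwestern University
Host institution country:
United States
Award type:
Standard Grant
Fiscal year:
2022
Funding country:
United States
Start and end dates:
2022-08-15 to 2026-07-31
Project status:
Ongoing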

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=2219843&HistoricalAwards=false
Keywords:
Enhancing research speech deep learning

Project abstract

You can guess a lot about a person from the way they pronounce words. Remarkably, human listeners can tell if it is likely that talkers learned English as a first language or a second language, or if the talkers might have a brain injury that makes it difficult for them to speak. Such intuitions rely on human listeners’ holistic pattern recognition abilities; these allow us to perceive the important, meaningful, yet subtle differences between pronunciations. However, the methods scientists currently use to measure speech objectively – based on a small number of properties of speech sounds – fail to capture these differences, hampering our ability to use speech to learn about the mind and brain. This project brings together speech scientists, computer scientists, and neuroscientists to test a radically different approach to this problem. Machine learning will be used to discover a new method for quantifying differences between spoken utterances based on holistic pattern recognition. This will be tested against new and existing data from bilingual speakers. If successful, this will yield a fully general method that can be applied to speech from any language or any domain of language usage, allowing scientists to capitalize on the wealth of information in speech to develop powerful new insights into the mind and brain. Improved detection of subtle problems with pronunciation, such as occurs with Alzheimer’s disease, will advance our understanding of the brain mechanisms that humans use to produce speech. The results of this testing will also allow computer scientists to advance our understanding of how machine learning algorithms process sounds, driving improvements in the algorithms and supporting applications in any area of speech and language technology that relies on spoken language processing. 
Speech variability across talkers provides a treasure trove of information for cognitive neuroscientists, leading to important insights into the cognitive mechanisms underlying language processing and potentially providing early signs of brain dysfunction. Current studies of speech are hamstrung by analyses that require preselecting specific temporal scales and acoustic dimensions. We propose a radically different approach: using unsupervised deep learning to discover a representational space for analysis of acoustic variation. To test this highly general approach, this method will be compared to current state-of-the-art methods for analyzing individual variation in bilingual speech. This includes using the acoustic variation in second language speech to predict intelligibility and to detect difficulties in code-switching, particularly the challenges faced by individuals with Alzheimer’s disease. The results will inform development of deep learning and cognitive neuroscience. The machine learning algorithm is fully general; it can be applied to speech from any language or any domain of language usage, expanding the range of populations and contexts that can be served by speech technology or studied by cognitive neuroscientists. The project’s integrative approach will allow computer scientists to advance our understanding of the extent to which modern deep learning architectures do or do not approximate human speech processing and allow cognitive neuroscientists to further our understanding of how meaningful acoustic distinctions are represented in speech perception and production.
This project is funded by the Integrative Strategies for Understanding Neural and Cognitive Systems (NCS) program, which is jointly supported by the Directorates for Computer and Information Science and Engineering (CISE), Education and Human Resources (EHR), Engineering (ENG), and Social, Behavioral, and Economic Sciences (SBE). This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project outputs

Journal articles (2)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Inhibitory control of the dominant language: Reversed language dominance is the tip of the iceberg

DOI:
10.1016/j.jml.2023.104410
Publication date:
2023-01-23
Journal:
JOURNAL OF MEMORY AND LANGUAGE
Impact factor:
4.3
Authors:
Goldrick, Matthew;Gollan, Tamar H.
Corresponding author:
Gollan, Tamar H.

Advancement of phonetics in the 21st century: Exemplar models of speech production

DOI:
Publication date:
2023
Journal:
Journal of Phonetics
Impact factor:
1.9
Authors:
Goldrick, Matthew;Cole, Jennifer
Corresponding author:
Cole, Jennifer

Other publications by Matthew Goldrick

Language and the Brain: Developments in Neurology/Neuroscience, Linguistics, and Psycholinguistics

DOI:
Publication date:
2014
Journal:
Impact factor:
0
Authors:
Lise Menn;Matthew Goldrick
Corresponding author:
Matthew Goldrick

The perception of code-switched speech in noise.

DOI:
10.1121/10.0025375
Publication date:
2024
Journal:
JASA express letters
Impact factor:
1
Authors:
M. Gavino;Matthew Goldrick
Corresponding author:
Matthew Goldrick

Predicting relative intelligibility from inter-talker distances in a perceptual similarity space for speech

DOI:
10.3758/s13423-025-02652-2
Publication date:
2025-02-10
Journal:
PSYCHONOMIC BULLETIN & REVIEW
Impact factor:
3.000
Authors:
Seung-Eun Kim;Bronya R. Chernyak;Joseph Keshet;Matthew Goldrick;Ann R. Bradlow
Corresponding author:
Ann R. Bradlow