Collaborative Research: Improving speech technology for better learning outcomes: the case of AAE child speakers

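Basic Information

Award number:
2202585
Principal investigator:
Abeer Alwan
Amount:
$318,900 (approx.)
Host institution:
University of California-Los Angeles
Host institution country:
United States
Award type:
Standard Grant
Fiscal year:
2022
Funding country:
United States
Period:
2022-05-01 to 2025-04-30
Project status:
Not yet concluded

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=2202585&HistoricalAwards=false
Keywords:
Collaborative Research Improving speech technology

Project Abstract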

The lack of reading proficiency seen in children of underserved school districts has lasting impacts on students’ performances in various subjects. Low literacy is an especially pressing issue for African American students. Interactive spoken language systems offer the possibility of a powerful tool for assisting in early childhood education, freeing up teachers’ time, and engaging students in repeated opportunities for learning. These systems involve both Automatic Speech Recognition and Text-to-Speech Systems. The goal of this research is to improve the performance of such systems for young speakers of African American English (AAE) such that automated oral literacy assessment can be developed. The research has important societal and technological impacts. It will enhance the usability of speech technology in early education for AAE-speaking children, providing a model for better supporting students with diverse dialects. Many under-resourced children do not have access to adequate reading and language assessments, and the proposed work will address these issues by creating methods for adapting spoken language technology to AAE children, increasing fairness in speech technology on a broader scale. The work has strong outreach and dissemination programs and will train undergraduate and graduate students in interdisciplinary research in Electrical and Computer Engineering, Linguistics, Education, and Psychology.

Challenges facing children’s Automatic Speech Recognition (ASR) arise because (1) child speech data are scarce, so current recognition models are trained on data collected from adult speakers, and (2) children display a wider range of intra- and inter-speaker variability than adults. ASR performance is especially poor for children who are non-native English speakers or those who at times transition into dialects such as AAE that are different from what ASR systems are typically trained on. In addition, most dialog systems built on text-to-speech (TTS) technology are designed using General American English (GAE) voices, which minority children may not identify with. In the high-stakes area of education, these considerations impact the effectiveness of technology for different groups. The work will utilize a new and continuously developing database of AAE children's speech to research the impact of spoken language systems on children’s learning outcomes. On the learning side, the research will highlight the impact of dialect on literacy assessment. On the technology side, the work will yield novel machine learning algorithms for low-resource tasks. Specifically, this project will develop data augmentation techniques that can increase the amount of training data available for low-resource tasks, and data normalization techniques so that ASR performance is improved for AAE child speakers. The work on TTS will explore new methods of disentangling speaker and dialect impacts on spectral realization of phrases that model dialect density (rather than treating dialect as a categorical variable) and separately accounting for pronunciation and prosodic factors. Methods found to be effective for TTS will be leveraged in the data augmentation work for ASR and explored as a diagnostic in literacy assessment.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

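Project Outcomes

Number of journal papers (3)

Number of monographs (0)

Number of research awards (0)

Number of conference papers (0)

Number of patents (0)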

Towards Effective Speech-based AI in the Classroom: The Case of AAE-Speaking Children

DOI:
(not listed)
Published:
2022
Venue:
Black in AI NeurIPS Workshop
Impact factor:
0
Authors:
Alexander Johnson; Julie Washington; Robin Morris; Mari Ostendorf; Alison Bailey; Abeer Alwan
Corresponding author:
Abeer Alwan

Leveraging Multiple Sources in Automatic African American English Dialect Detection for Adults and Children

DOI:
(not listed)
Published:
2023
Venue:
Speech and Signal Processing (ICASSP
Impact factor:
0
Authors:
Johnson, Alexander; Shetty, Vishwas; Ostendorf, Mari; Alwan, Abeer
Corresponding author:
Alwan, Abeer

Other publications by Abeer Alwan

Modeling auditory perception to improve robust speech recognition

DOI:
(not listed)
Published:
1997
Venue:
Conference Record of the Thirty-First Asilomar Conference on Signals, Systems and Computers (Cat. No.97CB36136)
Impact factor:
0
Authors:
B. Strope; Abeer Alwan
Corresponding author:
Abeer Alwan

Unraveling the associations between voice pitch and major depressive disorder: a multisite genetic study

DOI:
10.1038/s41380-024-02877-y
Published:
2024-12-31
Venue:
MOLECULAR PSYCHIATRY
Impact factor:
10.100
Authors:
Yazheng Di; Elior Rahmani; Joel Mefford; Jinhan Wang; Vijay Ravi; Aditya Gorla; Abeer Alwan; Kenneth S. Kendler; Tingshao Zhu; Jonathan Flint
Corresponding author:
Jonathan Flint