Leveraging Natural Language Processing for Reverberant Speech Enhancement in Cochlear Implants
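Basic information
- Grant number: 10755798
- Principal investigator:
- Amount: $173,800
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2023
- Funding country: United States
- Project period: 2023-03-15 to 2025-02-28
- Project status: Ongoing
- Source: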
- Keywords: Acoustics; Address; Algorithms; Area; Artificial Intelligence; Audiology; Auditory; Auditory Perception; Benchmarking; Church; Clinical; Cochlear Implants; Comprehension; Computer software; Computers; Data; Effectiveness; Environment; Familiarity; Frequencies; Future; Goals; Hearing Aids; Home; Individual; Knowledge; Language; Literature; Machine Learning; Maps; Masks; Methods; Modernization; Morphologic artifacts; Natural Language Processing; Noise; Performance; Predictive text; Quality of life; Research; Resources; Signal Transduction; Speech; Speech Intelligibility; Speech Perception; Structure; System; Techniques; Testing; Text; Time; United States National Institutes of Health; Voice; Work; automated speech recognition; deaf; effectiveness evaluation; experience; experimental study; flexibility; hearing impairment; improved; innovation; intelligent personal assistant; machine learning algorithm; multidisciplinary; normal hearing; novel; open source; portability; prototype; signal processing; speech processing; speech recognition; speech synthesis; success; syntax; time use
Project abstract
The overarching goal of this project is to develop algorithms to address the difficulties that cochlear implant
(CI) users experience interpreting speech in reverberant listening environments like churches, auditoriums and
classrooms. Recent research has made progress in this area using time-frequency (T-F) masking techniques, but
these algorithms are often not robust in changing acoustic environments or are not amenable to real-time
processing. Machine learning (ML) and artificial intelligence (AI) techniques have recently burgeoned in many
application areas, but to date, AI/ML approaches for reverberation in CI users have shown limited
success. Our proposed approach is to investigate several AI/ML speech enhancement methods based on the
natural language processing (NLP) field to essentially recognize speech in reverberation and then clean it. We
will provide a final assessment of algorithm performance using the open-source, NIH-supported CCi-MOBILE
CI research platform, for the ease of use and flexibility it offers in developing and prototyping CI signal processing
algorithms. We propose to use phoneme-based recognition and automatic speech recognition (ASR)
approaches to develop and test our reverberation mitigation algorithms. Aim 1 will investigate the real-time
feasibility of exploiting phoneme recognition for ML-based T-F masking in CIs. We will develop a novel
phoneme-based T-F mask estimation algorithm and conduct speech recognition tests with the algorithm running
offline to compare conventional and phoneme-based T-F masking. This work will determine whether phoneme
knowledge is beneficial for speech enhancement in CIs. Aim 2 will investigate the utility of real-time T-F mask
estimation in CI users. We will implement various reverberation-mitigating T-F mask estimation algorithms
from the literature (including our novel phoneme-based T-F algorithm developed in Aim 1) in real time in
CCi-MOBILE. In addition to their impact on speech intelligibility, the algorithms will be benchmarked against CI
computational limits and tolerable delays for audiovisual asynchrony. This work will evaluate the
effectiveness of T-F mask estimation algorithms in real-time operational conditions. Aim 3 will investigate
advancing speech intelligibility for CI users via ASR and text-to-speech synthesis (ASR-TTS). We will
investigate various front-end speech enhancement strategies to improve ASR predictions and TTS engines
with generic and familiar synthetic voices. This work will use CCi-MOBILE to evaluate the utility of ASR-TTS
and the effect of speaker familiarity on reverberant speech intelligibility in CI users. Our team brings the AI/ML,
hardware, experimental testing and audiology experience needed for successful research. CCi-CLOUD,
a cloud feature of CCi-MOBILE, will be used to facilitate remote and collaborative CI user studies.
Our work is highly innovative and has the potential to instigate a paradigm shift towards AI/ML-driven auditory
prostheses that leverage NLP to adapt speech processing strategies to acoustic settings to maximize user
benefits. Demonstrated success will improve the quality of life of CI users.
Project outcomes
Journal articles: 0
Monographs: 0
Research awards: 0
Conference papers: 0
Patents: 0
Other grants by LESLIE M. COLLINS
Using Machine Learning to Mitigate Reverberation Effects in Cochlear Implants
- Grant number: 9100672
- Fiscal year: 2015
- Funding amount: $173,800
- Project category:
Using Machine Learning to Mitigate Reverberation Effects in Cochlear Implants
- Grant number: 9305035
- Fiscal year: 2015
- Funding amount: $173,800
- Project category:
Using Machine Learning to Mitigate Reverberation Effects in Cochlear Implants
- Grant number: 8963088
- Fiscal year: 2015
- Funding amount: $173,800
- Project category:
Towards Clinical Acceptability: Enhancing the P300-based Brain-Computer Interface
- Grant number: 8309132
- Fiscal year: 2009
- Funding amount: $173,800
- Project category:
Towards Clinical Acceptability: Enhancing the P300-based Brain-Computer Interface
- Grant number: 7779866
- Fiscal year: 2009
- Funding amount: $173,800
- Project category:
Towards Clinical Acceptability: Enhancing the P300-based Brain-Computer Interface
- Grant number: 8521238
- Fiscal year: 2009
- Funding amount: $173,800
- Project category:
Towards Clinical Acceptability: Enhancing the P300-based Brain-Computer Interface
- Grant number: 8307568
- Fiscal year: 2009
- Funding amount: $173,800
- Project category:
Implementation and Tuning of Multi-rate Speech Processors for Cochlear Implants
- Grant number: 7749928
- Fiscal year: 2006
- Funding amount: $173,800
- Project category:
Implementation and Tuning of Multi-rate Speech Processors for Cochlear Implants
- Grant number: 7335628
- Fiscal year: 2006
- Funding amount: $173,800
- Project category:
Implementation and Tuning of Multi-rate Speech Processors for Cochlear Implants
- Grant number: 7156175
- Fiscal year: 2006
- Funding amount: $173,800
- Project category:
Similar overseas grants
Rational design of rapidly translatable, highly antigenic and novel recombinant immunogens to address deficiencies of current snakebite treatments
- Grant number: MR/S03398X/2
- Fiscal year: 2024
- Funding amount: $173,800
- Project category: Fellowship
Re-thinking drug nanocrystals as highly loaded vectors to address key unmet therapeutic challenges
- Grant number: EP/Y001486/1
- Fiscal year: 2024
- Funding amount: $173,800
- Project category: Research Grant
CAREER: FEAST (Food Ecosystems And circularity for Sustainable Transformation) framework to address Hidden Hunger
- Grant number: 2338423
- Fiscal year: 2024
- Funding amount: $173,800
- Project category: Continuing Grant
Metrology to address ion suppression in multimodal mass spectrometry imaging with application in oncology
- Grant number: MR/X03657X/1
- Fiscal year: 2024
- Funding amount: $173,800
- Project category: Fellowship
CRII: SHF: A Novel Address Translation Architecture for Virtualized Clouds
- Grant number: 2348066
- Fiscal year: 2024
- Funding amount: $173,800
- Project category: Standard Grant
The Abundance Project: Enhancing Cultural & Green Inclusion in Social Prescribing in Southwest London to Address Ethnic Inequalities in Mental Health
- Grant number: AH/Z505481/1
- Fiscal year: 2024
- Funding amount: $173,800
- Project category: Research Grant
ERAMET - Ecosystem for rapid adoption of modelling and simulation METhods to address regulatory needs in the development of orphan and paediatric medicines
- Grant number: 10107647
- Fiscal year: 2024
- Funding amount: $173,800
- Project category: EU-Funded
BIORETS: Convergence Research Experiences for Teachers in Synthetic and Systems Biology to Address Challenges in Food, Health, Energy, and Environment
- Grant number: 2341402
- Fiscal year: 2024
- Funding amount: $173,800
- Project category: Standard Grant
Ecosystem for rapid adoption of modelling and simulation METhods to address regulatory needs in the development of orphan and paediatric medicines
- Grant number: 10106221
- Fiscal year: 2024
- Funding amount: $173,800
- Project category: EU-Funded
Recite: Building Research by Communities to Address Inequities through Expression
- Grant number: AH/Z505341/1
- Fiscal year: 2024
- Funding amount: $173,800
- Project category: Research Grant