CAREER: Personalized Speech Enhancement: Test-Time Adaptation Using No or Few Private Data
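Basic information
- Award number: 2046963
- Principal investigator:
- Amount: $478,000
- Host institution:
- Host institution country: United States
- Award type: Continuing Grant
- Fiscal year: 2021
- Funding country: United States
- Period: 2021-04-01 to 2026-03-31
- Status: Ongoing
- Source:
- Keywords:
Project abstract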
Current general-purpose speech enhancement systems employ large models, trained on big datasets of audio signals, that are too bulky to run on small personal devices. A personalized model can be a resource-efficient solution because it focuses on a particular user and a specific test environment, for which a smaller model architecture can be good enough. However, training a personalized model requires clean voice data from the test-time user in advance, which is not always available because of the user's privacy concerns or problems with recording. This CAREER project develops machine-learning methods to achieve the personalization goal while requiring no or few data samples from the test-time users. Because the project achieves the personalization goal in a privacy-preserving and resource-efficient way, it is a step towards a more available and affordable use of artificial intelligence for all members of society.
The project circumvents the lack of personal data in the context of personalized speech enhancement using no- and few-shot learning frameworks with help from adversarial and self-supervised learning. First, it verifies that a personalized system with reduced computational complexity can still compete with a generic model in speech enhancement performance. To this end, the training algorithm divides the potentially large model into multiple sub-modules, each of which handles a particular sub-problem (e.g., a particular user's utterance). If the sub-problems are defined to be mutually exclusive, the test-time inference can be made efficiently by using only the most suitable sub-module. Since the sub-module selection is done on noisy speech, it achieves personalization with no additional training on the test user's data. Second, the project explores a no-shot learning approach, in which the fundamental challenge lies in optimizing a machine learning model with no available target. To this end, an already-trained general-purpose model is fine-tuned for an unseen test environment using adversarial optimization. The third research topic handles the case when a small amount of the user's clean speech is available, which falls in the category of few-shot learning. The project overcomes the data shortage via a self-supervised learning method that learns effective features from noisy speech data, which are more readily available than clean ones. That way, the model can be prepared for a subsequent fine-tuning step, which can be done with only a few clean user-specific speech utterances.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
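The sub-module selection idea in the abstract (choose one specialist from the noisy input alone, then run only that specialist) can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the `mean_abs` feature, the threshold-based "specialists", and the centroid values are hypothetical stand-ins, not the project's models or its actual selection rule.

```python
from typing import Callable, List, Sequence, Tuple

Signal = List[float]


def mean_abs(x: Sequence[float]) -> float:
    """Toy feature computed from the noisy input only (no clean reference)."""
    return sum(abs(v) for v in x) / len(x)


def make_specialist(threshold: float) -> Callable[[Sequence[float]], Signal]:
    """Hypothetical sub-module: zeroes samples whose magnitude is below `threshold`."""
    def run(noisy: Sequence[float]) -> Signal:
        return [v if abs(v) >= threshold else 0.0 for v in noisy]
    return run


# Each sub-module is paired with the feature centroid of the (assumed,
# mutually exclusive) sub-problem it handles.
SPECIALISTS: List[Tuple[float, Callable[[Sequence[float]], Signal]]] = [
    (0.1, make_specialist(0.05)),  # quiet inputs
    (0.5, make_specialist(0.20)),  # medium inputs
    (1.0, make_specialist(0.40)),  # loud inputs
]


def select_specialist(noisy: Sequence[float]) -> int:
    """Pick the sub-module whose centroid is closest to the noisy-input feature."""
    feat = mean_abs(noisy)
    return min(range(len(SPECIALISTS)), key=lambda i: abs(SPECIALISTS[i][0] - feat))


def enhance(noisy: Sequence[float]) -> Signal:
    """Run only the selected sub-module; the other sub-modules are never evaluated."""
    idx = select_specialist(noisy)
    return SPECIALISTS[idx][1](noisy)
```

Because the selection uses only the noisy signal, no clean recording from the test user is needed, and the inference cost is that of a single small sub-module rather than the whole ensemble.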
Project outputs
Journal articles (9)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Efficient Personalized Speech Enhancement Through Self-Supervised Learning
- DOI: 10.1109/jstsp.2022.3181782
- Published: 2021-04
- Journal:
- Impact factor: 7.5
- Authors: Aswin Sivaraman; Minje Kim
- Corresponding authors: Aswin Sivaraman; Minje Kim
The Potential of Neural Speech Synthesis-Based Data Augmentation for Personalized Speech Enhancement
- DOI: 10.1109/icassp49357.2023.10096601
- Published: 2022-11
- Journal:
- Impact factor: 0
- Authors: Anastasia Kuznetsova; Aswin Sivaraman; Minje Kim
- Corresponding authors: Anastasia Kuznetsova; Aswin Sivaraman; Minje Kim
Bloom-Net: Blockwise Optimization for Masking Networks Toward Scalable and Efficient Speech Enhancement
- DOI: 10.1109/icassp43922.2022.9746767
- Published: 2021-11
- Journal:
- Impact factor: 0
- Authors: Sunwoo Kim; Minje Kim
- Corresponding authors: Sunwoo Kim; Minje Kim
Personalized Speech Enhancement through Self-Supervised Data Augmentation and Purification
- DOI: 10.21437/interspeech.2021-1868
- Published: 2021-04
- Journal:
- Impact factor: 0
- Authors: Aswin Sivaraman; Sunwoo Kim; Minje Kim
- Corresponding authors: Aswin Sivaraman; Sunwoo Kim; Minje Kim
Test-Time Adaptation Toward Personalized Speech Enhancement: Zero-Shot Learning with Knowledge Distillation
- DOI: 10.1109/waspaa52581.2021.9632771
- Published: 2021-05
- Journal:
- Impact factor: 0
- Authors: Sunwoo Kim; Minje Kim
- Corresponding authors: Sunwoo Kim; Minje Kim
Other publications by Minje Kim
Generative De-Quantization for Neural Speech Codec via Latent Diffusion
- DOI: 10.48550/arxiv.2311.08330
- Published: 2023
- Journal:
- Impact factor: 0
- Authors: Haici Yang; Inseon Jang; Minje Kim
- Corresponding author: Minje Kim
Does Restricting the Entry of Formula Businesses Help Mom-and-Pop Stores? The Case of Small American Towns With Unique Community Character
- DOI:
- Published: 2021
- Journal:
- Impact factor: 0
- Authors: Minje Kim; Tingyu Zhou
- Corresponding author: Tingyu Zhou
Collaborative Deep Learning for speech enhancement: A run-time model selection method using autoencoders
- DOI: 10.1109/icassp.2017.7952121
- Published: 2017-03
- Journal:
- Impact factor: 0
- Authors: Minje Kim
- Corresponding author: Minje Kim
FOR EFFICIENT SINGLE-CHANNEL SOURCE SEPARATION
- DOI:
- Published: 2018
- Journal:
- Impact factor: 0
- Authors: Minje Kim
- Corresponding author: Minje Kim
Infrastructure investments and land value capture: variations and uncertainties at the frontiers of urban expansion
- DOI: 10.3828/tpr.2022.23
- Published: 2022
- Journal:
- Impact factor: 0
- Authors: Minje Kim
- Corresponding author: Minje Kim
Similar overseas grants
Personalized Online Adaptive Learning System
- Award number: 23K20186
- Fiscal year: 2024
- Funding amount: $478,000
- Award type: Grant-in-Aid for Scientific Research (B)
CAREER: Personalized Maternal Care Decision Support System for Underserved Populations
- Award number: 2339992
- Fiscal year: 2024
- Funding amount: $478,000
- Award type: Continuing Grant
CAREER: Personalized, wearable robot mobility assistance considering human-robot co-adaptation that incorporates biofeedback, user coaching, and real-time optimization
- Award number: 2340519
- Fiscal year: 2024
- Funding amount: $478,000
- Award type: Continuing Grant
Realizing Human Brain Stimulation of Deep Regions Based on Novel Personalized Electrical Computational Modelling
- Award number: 23K25176
- Fiscal year: 2024
- Funding amount: $478,000
- Award type: Grant-in-Aid for Scientific Research (B)
Prediction, Monitoring and Personalized Recommendations for Prevention and Relief of Dementia and Frailty
- Award number: 10103541
- Fiscal year: 2024
- Funding amount: $478,000
- Award type: EU-Funded
Modular Laser Sources For Sustainable Production Of Short Personalized Production Series (WAVETAILOR)
- Award number: 10091981
- Fiscal year: 2024
- Funding amount: $478,000
- Award type: EU-Funded
5G4PHealth: Enhanced 5G-Powered Platform for Predictive Preventive Personalized and Participatory Healthcare
- Award number: 10093679
- Fiscal year: 2024
- Funding amount: $478,000
- Award type: Collaborative R&D
Biomarker-Based Platform for Early Diagnosis of Chronic Liver Disease to Enable Personalized Therapy (LIVERAIM)
- Award number: 10087822
- Fiscal year: 2024
- Funding amount: $478,000
- Award type: EU-Funded
HSI Implementation and Evaluation Project: Increasing Computer Science Undergraduate Retention through Predictive Modeling and Early, Personalized Academic Interventions
- Award number: 2345378
- Fiscal year: 2024
- Funding amount: $478,000
- Award type: Standard Grant
PFI-RP: Resilient and Energy-Efficient Memory Chips for Enhanced Mobile AI and Personalized Machine Learning
- Award number: 2345655
- Fiscal year: 2024
- Funding amount: $478,000
- Award type: Standard Grant