
Real-time deep learning to improve speech intelligibility in noise


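Basic information

Grant number:
10268203
Principal investigator:
Eric Martin Johnson
Amount:
$70.1 thousand
Host institution:
OHIO STATE UNIVERSITY
Host institution country:
United States
Project category:
Fiscal year:
2020
Funding country:
United States
Project period:
2020-09-30 to 2022-07-31
Project status:
Completed

Source:
https://reporter.nih.gov/project-details/10268203
Keywords: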
Address Algorithms American Area Auditory Cellular Phone Characteristics Cochlear Implants Communication Complex Data Devices Diagnosis Economic Burden Effectiveness Environment Equilibrium Etiology Faculty Fellowship Floor Foundations Future Goals Healthcare Hearing Hearing Aids Human Implant Measures Mentors Mission National Institute on Deafness and Other Communication Disorders Noise Performance Phase Prevention Process Quality of life Recommendation Research Research Personnel Research Training Scheme Seminal Signal Transduction Speech Speech Intelligibility Strategic Planning System Telephone Testing Time Training Translating Universities Videoconferencing Work artificial neural network base career communication device deep learning deep neural network design experimental study health economics hearing impairment hearing loss treatment improved microphone network architecture neural network normal hearing novel novel strategies operation skills speech in noise wearable device

Project abstract

One in eight Americans has hearing loss, and this constitutes a major health and economic burden (Blackwell et al., 2014). The primary complaint of hearing-impaired (HI) listeners is difficulty understanding speech when background noise is present (see Dillon, 2012). While hearing aids (HAs) have improved in recent years, they still provide little benefit in noisy environments. For decades, a means of improving the ability to understand speech in background noise appeared unattainable, despite substantial amounts of research by both universities and HA companies. This changed when deep learning provided the first demonstration of a single-microphone algorithm that improves intelligibility in noise for HI listeners (Healy et al., 2013, 2014, 2015). Although this algorithm provides massive intelligibility improvements (even allowing listeners to improve from floor to ceiling levels), it is not currently implemented to operate in real time and is therefore not suitable for implementation into HAs and cochlear implants (CIs). What is needed, therefore, is a highly effective noise-reduction algorithm that is capable of operating in real time. This project aims to address this critical need.

The long-term goal of the proposed project is to alleviate HI listeners’ predominant hearing handicap, which is difficulty understanding speech in background noise. The first aim introduces a new algorithm, based on a novel foundational scheme, that is designed to provide substantial benefit for any HI listener in real time. This algorithm will be well suited for implementation into HAs, CIs, and other face-to-face communication applications. The effectiveness of this new algorithm will be quantified using both HI and normal-hearing (NH) listeners.

The second aim expands upon this new algorithm by modifying it to accept a small amount of future time-frame information, which could improve its noise-reduction performance but will introduce a brief processing delay. The rationale is that different devices have different allowable latencies. Face-to-face communication devices (HAs, CIs, etc.) have strict low-latency requirements, but other important communication systems (e.g., telephones) have different requirements. It is possible that adding future time-frame information within these requirements (up to 150 ms) will result in even better speech intelligibility, but the magnitude of any potential benefit is unknown. The current project will establish this critical information. Using both HI and NH listeners, we will measure intelligibility for noisy sentences that have been processed using various amounts of future time information.

This comprehensive fellowship training plan will provide individualized, mentored research training from world-class faculty in a highly supportive and productive environment. The proposed work will endow the applicant with the skills needed to transition to the next stage of his research career, transform our treatment of hearing loss, and substantially impact quality of life for millions of Americans.
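
The trade-off in the second aim, between the amount of future time-frame information and the delay it introduces, can be illustrated with a small sketch. This is not the project's algorithm. The function names and the 10 ms frame hop are hypothetical, chosen only for illustration, and the delay estimate counts look-ahead alone, ignoring frame length and computation time.

```python
def max_future_frames(latency_budget_ms: float, hop_ms: float) -> int:
    """Whole future frames that fit in a latency budget.

    Simplification: delay is counted as (future frames x hop) only.
    """
    if hop_ms <= 0:
        raise ValueError("hop_ms must be positive")
    if latency_budget_ms < 0:
        raise ValueError("latency_budget_ms must be non-negative")
    return int(latency_budget_ms // hop_ms)


def context_indices(t: int, n_past: int, n_future: int, n_frames: int) -> list[int]:
    """Frame indices visible when processing frame t, clipped to the signal."""
    start = max(0, t - n_past)
    stop = min(n_frames - 1, t + n_future)
    return list(range(start, stop + 1))


if __name__ == "__main__":
    hop_ms = 10  # hypothetical frame hop, not taken from the abstract
    for budget_ms in (0, 20, 150):  # 150 ms is the upper bound named in the abstract
        n_future = max_future_frames(budget_ms, hop_ms)
        print(budget_ms, "ms budget ->", n_future, "future frames")
```

With zero look-ahead the processing is strictly causal, as face-to-face devices require. Larger budgets expose more future frames to the model at the cost of added delay.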
