Video-based Speech Enhancement for Vision and Hearing Impairment
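Basic information
- Grant number: 8659442
- Principal investigator:
- Amount: $230,900
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2013
- Funding country: United States
- Project period: 2013-06-01 to 2016-05-31
- Project status: Completed
- Source: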
- Keywords: Accounting; Acoustics; Activities of Daily Living; Address; Adult; Age; Age-Years; Aging-Related Process; Algorithms; Amplifiers; Area; Auditory; Blindness; Communication; Comprehension; Computer Vision Systems; Cues; Dependence; Detection; Development; Devices; Digital Signal Processing; Effectiveness; Elderly; Environment; Facial Expression; Feedback; Grant; Hearing Aids; Human; Laboratories; Lead; Learning; Life; Lip structure; Literature; Location; Machine Learning; Measures; Methods; Modality; Modeling; Noise; Output; Performance; Persons; Play; Population; Presbycusis; Process; Quality of life; Research; Role; Self-Help Devices; Sensory; Sensory Aids; Shapes; Signal Transduction; Societies; Source; Speech; Speech Intelligibility; Speech Perception; Staging; System; Techniques; Testing; Time; United States; Vision; Visual; Visual impairment; Voice; base; design; hearing impairment; improved; interest; novel strategies; performance tests; prototype; public health relevance; research study; signal processing; social; sound; speech recognition; tool development; visual information
Project abstract
DESCRIPTION (provided by applicant): Video-based Speech Enhancement for Persons with Hearing and Vision Loss
Project Summary
It is estimated that by 2030, people over the age of 65 will account for more than 20% of the total population of the United States. Hearing and vision loss naturally accompany the aging process. Persons with hearing loss can greatly improve their ability to comprehend speech by observing visual cues from a speaker, such as the shape of the lips and facial expression. Persons with vision loss, however, cannot make use of these visual cues and have a harder time understanding speech, especially in noisy environments. Furthermore, people with normal vision can use visual information to identify a speaker in a group, which allows them to focus on that person. This can greatly benefit a person with hearing loss who may be using a device such as a sound amplifier or a hearing aid. A user with vision loss, however, needs to be provided with this speaker information to make optimal use of such devices.
We propose developing a prototype device that will clean the speech signal from a target speaker and improve speech comprehension for persons with hearing and vision loss in everyday situations. To accomplish this, we need to harness the visual cues that have so far largely been ignored in the design of assistive technologies for persons with hearing loss. Our first aim is to learn speaker-independent visual cues that are associated with the target speech signal, and to use these audio-visual cues to design speech enhancement algorithms that perform much better in noisy everyday environments than current methods, which use only the audio signal. We will use a video camera and computer vision methods to design advanced digital signal processing techniques that enhance the target speech signals recorded through a microphone.
Our second aim is to use the video and audio signals to detect and efficiently localize the visible speaker. The location of the speaker of interest can then be used to perform speaker separation efficiently, and can also be provided to the user. Finally, we aim to implement the developed algorithms on a portable prototype system. We will test the performance of this system and improve the user interface through user experiments in real-world situations as well as laboratory conditions. The end product will show the feasibility and importance of incorporating multiple modalities into sensory assistive devices, and set the stage for future research and development efforts.
Project outcomes
Journal articles (1)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants by JAMES M COUGHLAN
Leveraging Maps and Computer Vision to Support Indoor Navigation for Blind Travelers
- Grant number: 9934891
- Fiscal year: 2019
- Funding amount: $230,900
- Project category:
Leveraging Maps and Computer Vision to Support Indoor Navigation for Blind Travelers
- Grant number: 10220178
- Fiscal year: 2018
- Funding amount: $230,900
- Project category:
Leveraging Maps and Computer Vision to Support Indoor Navigation for Blind Travelers
- Grant number: 9899994
- Fiscal year: 2018
- Funding amount: $230,900
- Project category:
Enabling Audio-Haptic Interaction with Physical Objects for the Visually Impaired
- Grant number: 9238777
- Fiscal year: 2016
- Funding amount: $230,900
- Project category:
Point and Listen: Augmented Reality Interfaces for the Visually Impaired
- Grant number: 10540115
- Fiscal year: 2016
- Funding amount: $230,900
- Project category:
Point and Listen: Augmented Reality Interfaces for the Visually Impaired
- Grant number: 10839155
- Fiscal year: 2016
- Funding amount: $230,900
- Project category:
A Cell Phone-based Sign Reader for Blind & Visually Impaired Persons
- Grant number: 7373002
- Fiscal year: 2009
- Funding amount: $230,900
- Project category:
A Cell Phone-based Sign Reader for Blind & Visually Impaired Persons
- Grant number: 7911722
- Fiscal year: 2009
- Funding amount: $230,900
- Project category:
Providing Access to Appliance Displays for Visually Impaired Users
- Grant number: 8916115
- Fiscal year: 2008
- Funding amount: $230,900
- Project category:
A Non-Document Text and Display Reader for Visually Impaired Persons
- Grant number: 7446299
- Fiscal year: 2008
- Funding amount: $230,900
- Project category:
Similar overseas grants
Nonlinear Acoustics for the conditioning monitoring of Aerospace structures (NACMAS)
- Grant number: 10078324
- Fiscal year: 2023
- Funding amount: $230,900
- Project category: BEIS-Funded Programmes
ORCC: Marine predator and prey response to climate change: Synthesis of Acoustics, Physiology, Prey, and Habitat In a Rapidly changing Environment (SAPPHIRE)
- Grant number: 2308300
- Fiscal year: 2023
- Funding amount: $230,900
- Project category: Continuing Grant
University of Salford (The) and KP Acoustics Group Limited KTP 22_23 R1
- Grant number: 10033989
- Fiscal year: 2023
- Funding amount: $230,900
- Project category: Knowledge Transfer Partnership
User-controllable and Physics-informed Neural Acoustics Fields for Multichannel Audio Rendering and Analysis in Mixed Reality Application
- Grant number: 23K16913
- Fiscal year: 2023
- Funding amount: $230,900
- Project category: Grant-in-Aid for Early-Career Scientists
Combined radiation acoustics and ultrasound imaging for real-time guidance in radiotherapy
- Grant number: 10582051
- Fiscal year: 2023
- Funding amount: $230,900
- Project category:
Comprehensive assessment of speech physiology and acoustics in Parkinson's disease progression
- Grant number: 10602958
- Fiscal year: 2023
- Funding amount: $230,900
- Project category:
The acoustics of climate change - long-term observations in the arctic oceans
- Grant number: 2889921
- Fiscal year: 2023
- Funding amount: $230,900
- Project category: Studentship
Collaborative Research: Estimating Articulatory Constriction Place and Timing from Speech Acoustics
- Grant number: 2343847
- Fiscal year: 2023
- Funding amount: $230,900
- Project category: Standard Grant
Collaborative Research: Estimating Articulatory Constriction Place and Timing from Speech Acoustics
- Grant number: 2141275
- Fiscal year: 2022
- Funding amount: $230,900
- Project category: Standard Grant
Flow Physics and Vortex-Induced Acoustics in Bio-Inspired Collective Locomotion
- Grant number: DGECR-2022-00019
- Fiscal year: 2022
- Funding amount: $230,900
- Project category: Discovery Launch Supplement
