Characterizing the recovery of spectral, temporal, and phonemic speech information from visual cues
Basic Information
- Grant number: 10563860
- Principal investigator:
- Amount: $550,400
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2023
- Funding country: United States
- Project period: 2023-02-14 to 2028-01-31
- Project status: Active
- Source:
- Keywords: Acoustics; Attention; Auditory; Auditory area; Auditory system; Biological; Brain; Brain Injuries; Brain Neoplasms; Classification; Cochlear Implants; Code; Compensation; Crowding; Cues; Data; Development; Devices; Dimensions; Distributional Activity; Electrodes; Electroencephalography; Emotional; Frequencies; Functional Magnetic Resonance Imaging; Health; Hearing; Human; Illusions; Impairment; Individual; Lipreading; Maps; Measures; Modality; Modeling; Movement; Neurons; Noise; Oral; Oral cavity; Participant; Patients; Pattern; Perception; Periodicity; Physiological Processes; Presbycusis; Process; Reaction Time; Recovery; Rehabilitation therapy; Research; Resolution; Resources; Route; Shapes; Signal Transduction; Social Interaction; Speech; Speech Perception; Speech Sound; Stimulus; Stroke; Superior temporal gyrus; System; Testing; Titrations; Training Programs; Trauma; Vision; Visual; Vocation; audiovisual speech; auditory stimulus; density; healthy aging; improved; neural prosthesis; neuromechanism; programs; response; restoration; sensory substitution; social; speech accuracy; visual information; visual speech
Project Summary
Auditory speech perception is essential for social, vocational, and emotional health in hearing individuals.
However, the reliability of auditory signals varies widely in everyday settings (e.g., at a crowded party), requiring
supplemental processes to enable accurate speech perception. The principal mechanisms that support the
perception of degraded auditory speech signals are auditory-visual (crossmodal) interactions, which can
perceptually restore speech content using visual cues provided by lipreading, rhythmic articulatory movements,
and the natural correlations present between oral resonance and mouth shape. Moreover, receptive speech
processes can be impaired by a variety of causes, including intrinsic brain tumors, stroke, cochlear implant
usage, and age-related hearing loss, making compensatory crossmodal mechanisms necessary for one to
continue working and maintaining healthy social interactions. However, the physiological processes that enable
vision to facilitate speech perception remain poorly understood and no integrative model exists for how these
multiple visual dimensions combine to enhance auditory speech perception. In the auditory domain, distributed
populations of neurons encode spectro-temporal information about acoustic cues that are then transcoded into
phonemes. We propose a dual-route perceptual model through which visual signals integrate with phoneme-
coded neurons. First, a direct path through which viseme-to-phoneme conversions generate semi-overlapping
distributions of activity in the superior temporal gyrus, leading to improved hearing through improved auditory
phoneme tuning functions. Second, an indirect path through which visual features restore spectral information
about speech frequencies and alter phoneme-response timing, resulting in improved auditory spectro-temporal
profiles (which in turn are transcoded into phonemes with greater precision). Finally, we will examine the
hypothesis that our perceptual system optimizes which of these visual dimensions is prioritized for recovery
based on what is missing from the auditory signal. These studies will provide a unified framework for how speech
perception benefits from different visual signals. By understanding biological approaches to crossmodally
restoring degraded auditory speech information, we can develop better targeted rehabilitation programs and
neural prostheses to maximize speech perception recovery after trauma or during healthy aging.
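The direct-route idea above can be illustrated with a toy Bayesian cue-combination sketch. This is not the project's actual model; the phoneme labels, probabilities, and independent-cue assumption are all illustrative, chosen only to show how viseme-derived evidence could sharpen a degraded auditory phoneme distribution.

```python
# Toy sketch (illustrative only): fusing auditory phoneme likelihoods
# with viseme-derived visual evidence under an independent-cue assumption,
# so that a distinctive viseme sharpens the phoneme posterior.

def normalize(d):
    """Scale a dict of scores so they sum to 1."""
    total = sum(d.values())
    return {k: v / total for k, v in d.items()}

def fuse(auditory, visual):
    """Multiply per-phoneme likelihoods and renormalize."""
    return normalize({p: auditory[p] * visual.get(p, 1e-6) for p in auditory})

# Degraded auditory input: /b/, /d/, /g/ are nearly confusable in noise.
auditory = normalize({"b": 0.36, "d": 0.34, "g": 0.30})

# Viseme evidence: a visible lip closure strongly favors bilabial /b/,
# while /d/ and /g/ share a less distinctive (semi-overlapping) viseme.
visual = {"b": 0.8, "d": 0.1, "g": 0.1}

posterior = fuse(auditory, visual)
best = max(posterior, key=posterior.get)
print(best, round(posterior[best], 3))  # visual cue resolves the ambiguity
```

The design choice to multiply likelihoods mirrors standard optimal cue-integration accounts, in which the less reliable cue is automatically down-weighted; the grant's proposed indirect route (spectro-temporal restoration) would instead modify the auditory likelihoods themselves before any such fusion.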
Project Outcomes
- Journal articles: 0
- Monographs: 0
- Research awards: 0
- Conference papers: 0
- Patents: 0
Other publications by David Brang
Other grants by David Brang
Networks Underlying Visual Modulation of Speech Perception
- Grant number: 9337601
- Fiscal year: 2016
- Amount: $550,400
- Project category:
Networks Underlying Visual Modulation of Speech Perception
- Grant number: 9353752
- Fiscal year: 2016
- Amount: $550,400
- Project category:
Networks underlying visual modulation of speech perception
- Grant number: 8959922
- Fiscal year: 2014
- Amount: $550,400
- Project category:
Similar NSFC (National Natural Science Foundation of China) grants
Multimodal ultrasound VisTran-Attention network for assessing the feasibility of fertility-preserving surgery in early-stage cervical cancer
- Grant number:
- Approval year: 2022
- Amount: ¥300,000
- Project category: Young Scientists Fund
Ultrasomics-Attention Siamese network for early, precise evaluation of immunotherapy response in intrahepatic cholangiocarcinoma
- Grant number:
- Approval year: 2022
- Amount: ¥520,000
- Project category: General Program
Similar international grants
22 UKRI-SBE: Contextually and probabilistically weighted auditory selective attention: from neurons to networks
- Grant number: BB/X013103/1
- Fiscal year: 2023
- Amount: $550,400
- Project category: Research Grant
Mechanisms of auditory selective attention for speech and non-speech stimuli
- Grant number: 10535232
- Fiscal year: 2023
- Amount: $550,400
- Project category:
SBE-UKRI: Contextually and probabilistically weighted auditory selective attention: from neurons to networks
- Grant number: 2414066
- Fiscal year: 2023
- Amount: $550,400
- Project category: Standard Grant
Development of test method and hearing aid technology focusing on attention function of patients with auditory processing disorder
- Grant number: 23K17600
- Fiscal year: 2023
- Amount: $550,400
- Project category: Grant-in-Aid for Challenging Research (Exploratory)
Brain Electrical Dynamics for Top-Down Auditory Attention
- Grant number: RGPIN-2019-05659
- Fiscal year: 2022
- Amount: $550,400
- Project category: Discovery Grants Program - Individual
Attention and Auditory Scene Analysis
- Grant number: RGPIN-2021-02721
- Fiscal year: 2022
- Amount: $550,400
- Project category: Discovery Grants Program - Individual
Parametrization and validation of the N-SEEV Attention Model for Visual and Auditory scenes
- Grant number: RGPIN-2022-04852
- Fiscal year: 2022
- Amount: $550,400
- Project category: Discovery Grants Program - Individual
Nanomaterials Based Dry Electroencephalography Electrodes for Auditory Attention Decoding in Hearing Assistance Devices
- Grant number: 570743-2021
- Fiscal year: 2022
- Amount: $550,400
- Project category: Alliance Grants
SBE-UKRI: Contextually and probabilistically weighted auditory selective attention: from neurons to networks
- Grant number: 2219521
- Fiscal year: 2022
- Amount: $550,400
- Project category: Standard Grant
Excellence in Research: Incorporating Attention into Computational Auditory Scene Analysis Using Spectral Clustering with Focal Templates
- Grant number: 2100874
- Fiscal year: 2021
- Amount: $550,400
- Project category: Standard Grant