
Quantitative Modeling of Segmental Timing in Dysarthria


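Basic Information

Grant number:
7739582
Principal investigator:
KRIS TJADEN
Amount:
$243,900
Host institution:
OREGON HEALTH & SCIENCE UNIVERSITY
Host institution country:
United States
Project category:
Fiscal year:
2009
Funding country:
United States
Project period:
2009-07-17 to 2011-06-30
Project status:
Completed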

Project Abstract

DESCRIPTION (provided by applicant): Quantitative, acoustic models of segmental timing in spoken English, such as have been developed for text-to-speech synthesis (TTS), acknowledge that segment durations in connected speech reflect the combined influence of systematic factors as well as nonsystematic or random factors. Systematic Variability in segment durations reflects factors such as context, stress, speaking style or register, and cognitive load. Segment durations also reflect within-speaker variability, termed Random Variability, that cannot be attributed to any of these systematic factors. An individual talker's speech duration patterns therefore can be mathematically characterized in terms of the magnitude of the effects of each systematic factor (e.g., amount of lengthening associated with word stress), as well as in terms of the relative and absolute amounts of systematic and random variability. Importantly, this powerful modeling framework can be applied to meaningful sentence productions, and is capable of isolating the effects of individual systematic factors without requiring the use of artificial speech materials. This approach to quantitatively modeling segmental timing in TTS has further proven crucial for successfully synthesizing intelligible, natural-sounding speech. Given the importance of this modeling framework for generating high-quality speech synthesis, it is surprising that similar modeling efforts have not been applied to dysarthria as a means of understanding the source of reduced intelligibility and naturalness in this speech disorder. Aberrancies in the temporal patterning of speech are ubiquitous in most persons with dysarthria, and the contribution of speech duration variables to intelligibility and naturalness is suggested in a variety of studies. The approach used in many existing studies is to document whether speech durations in dysarthria are, on average, atypically short, long, or variable as compared to normal speech.
The TTS modeling framework described above, however, goes beyond this type of simple description to identify the relative contribution of specific systematic factors influencing segment durations for an individual speaker as well as the combined relative and absolute contributions of systematic and random factors to segmental timing for that individual. The TTS modeling framework further allows model parameters for an individual speaker to be manipulated via speech synthesis to determine the impact on intelligibility and naturalness. The proposed exploratory project seeks to apply such a quantitative modeling framework to segment durations in sentences produced by speakers with a variety of neurological diagnoses and dysarthrias. The perceptual relevance of model parameters will be further studied via speech resynthesis to determine their impact on judgments of intelligibility and naturalness.

PUBLIC HEALTH RELEVANCE: Effective and efficacious treatment of reduced intelligibility and naturalness in dysarthria requires knowledge of factors explaining or underlying these functional limitations. The proposed exploratory project seeks to apply a quantitative model of segmental timing, developed for text-to-speech synthesis, to persons with dysarthria for whom anomalies in the temporal patterning of speech are common. Findings from this project will provide a new and comprehensive model of aberrancies in the temporal patterning of speech in dysarthria; the contribution of model parameters to perceptual judgments of intelligibility and naturalness also will be determined.
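
The split between systematic and random variability described in the abstract can be illustrated with a small sketch. This is not the applicant's model: it assumes a log-additive duration model with two hypothetical binary factors (word stress and phrase-final position), and every number in it (base duration, effect sizes, noise level) is invented for illustration. The data are simulated, and the helper names `simulate`, `fit_additive`, and `decompose` are likewise hypothetical.

```python
import math
import random
from statistics import mean, pvariance

# All values below are hypothetical, chosen only to make the example concrete.
BASE_MS, STRESS_EFFECT, FINAL_EFFECT, NOISE_SD = 80.0, 0.25, 0.35, 0.15


def simulate(n_per_cell=200, seed=0):
    """Simulate log segment durations for one talker in a balanced 2x2 design."""
    rng = random.Random(seed)
    rows = []
    for stressed in (0, 1):
        for final in (0, 1):
            for _ in range(n_per_cell):
                log_d = (math.log(BASE_MS)
                         + STRESS_EFFECT * stressed
                         + FINAL_EFFECT * final
                         + rng.gauss(0.0, NOISE_SD))  # random variability
                rows.append((stressed, final, log_d))
    return rows


def fit_additive(rows):
    """Estimate main effects from marginal means (valid for a balanced design)."""
    def marginal_gap(col):
        on = mean(r[2] for r in rows if r[col] == 1)
        off = mean(r[2] for r in rows if r[col] == 0)
        return on - off

    stress = marginal_gap(0)
    final = marginal_gap(1)
    # Each factor is "on" in half of the tokens, so shift the grand mean back.
    intercept = mean(r[2] for r in rows) - 0.5 * stress - 0.5 * final
    return intercept, stress, final


def decompose(rows, params):
    """Return systematic variance, random variance, and the systematic share."""
    intercept, stress, final = params
    fitted = [intercept + stress * s + final * f for s, f, _ in rows]
    residuals = [d - fit for (_, _, d), fit in zip(rows, fitted)]
    systematic = pvariance(fitted)   # variance explained by the factors
    rand = pvariance(residuals)      # variance left unexplained
    return systematic, rand, systematic / (systematic + rand)


if __name__ == "__main__":
    data = simulate()
    params = fit_additive(data)
    print("estimated effects (log units):", params[1:])
    print("systematic, random, systematic share:", decompose(data, params))
```

Working in the log domain turns multiplicative lengthening effects into additive ones, which is a common simplification rather than anything the abstract specifies. For an individual speaker, the fitted effects correspond to the "magnitude of the effects of each systematic factor", and the two variances correspond to the absolute amounts of systematic and random variability.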
