3D Articulatory Speech Synthesis using Biomechanical Models of the Oral, Pharyngeal and Laryngeal Complex
Basic information
- Grant number: RGPIN-2014-05949
- Amount: $18,900
- Host institution country: Canada
- Program: Discovery Grants Program - Individual
- Fiscal year: 2014
- Funding country: Canada
- Period: 2014-01-01 to 2015-12-31
- Status: Completed
Project abstract
For years, traditional formant-based and acoustic-based speech synthesis techniques have largely overshadowed articulatory synthesis research. While successful in some domains, they still cannot produce natural-sounding speech from text. Articulatory speech synthesis, in contrast, continues to progress steadily at the fringes of both industrial and academic interest. It is now poised to provide the platform needed to overcome basic problems in speech processing and, I believe, represents the next major advance in speech synthesis technology.
As part of my long-term research program, I have been establishing computer modeling and simulation of the oral, pharyngeal and laryngeal (OPAL) complex as a critical and central tool in many disciplines, including engineering, linguistics, education, entertainment, anatomy, physiology and medicine. From this effort, the infrastructure necessary to make large advances in articulatory speech synthesis is now available. Thus, I propose to extend the OPAL complex models with aero-acoustic simulation to create the next generation of validated articulatory speech synthesis. This research builds upon the detailed biomechanical models of the human upper airway that have been used to study the mechanics of speech behavior, swallowing and mastication, along with the tools needed for fast, accurate coupled FEM and rigid body simulations.
The main research activities are:
1. Integration of model geometries and biomechanics to create a complete biomechanical model of the OPAL complex capable of articulatory speech synthesis.
2. Refinement of the existing 1D Navier-Stokes approach to provide better coupling of the laryngeal excitation to the vocal tract filter, leveraging the work of [Birkholz et al, 2006].
3. Development of 3D aero-acoustic simulation of vocal tract noise sources based on fluid simulation as well as CFD-based approximations.
4. Creation of inverse methods for driving speech articulation by extending the work of [Stavness et al, 2013] to propagate both kinematic targets (from image data or electromagnetic articulography (EMA) of speech) and acoustic targets backward through the aeroacoustic model, deriving muscle activations with inverse dynamics techniques. Speech articulation constraints will be used to help make the system better determined.
5. Creation of advanced coupled vocal fold models that will oscillate under tension and airflow.
We are poised to make a substantial advance in articulatory speech synthesis. The last decade of effort in creating the medical imaging techniques, speech measurements, complex 3D biomechanical models of the OPAL complex and tools for fast, accurate coupled FEM/rigid body simulation provides the necessary infrastructure for this project. Canada stands to benefit from being at the forefront of speech synthesis technology that supports the next wave of applications. Incorporating articulatory speech synthesis into our open source toolkit will support international efforts in speech research, contributing profound new knowledge from a deeper understanding of human speech production.
Project outputs
- Journal papers: 0
- Monographs: 0
- Research awards: 0
- Conference papers: 0
- Patents: 0
Other publications by Fels, Sidney
Dysfunctional paraspinal muscles in adult spinal deformity patients lead to increased spinal loading.
- DOI: 10.1007/s00586-022-07292-x
- Published: 2022-09
- Impact factor: 2.8
- Authors: Malakoutian, Masoud; Noonan, Alex M.; Dehghan-Hamani, Iraj; Yamamoto, Shun; Fels, Sidney; Wilson, David; Doroudi, Majid; Schutz, Peter; Lewis, Stephen; Ailon, Tamir; Street, John; Brown, Stephen H. M.; Oxland, Thomas R.
- Corresponding author: Oxland, Thomas R.
Characterizing Motor Control of Mastication With Soft Actor-Critic
- DOI: 10.3389/fnhum.2020.00188
- Published: 2020-05-26
- Impact factor: 2.9
- Authors: Abdi, Amir H.; Sagl, Benedikt; Fels, Sidney
- Corresponding author: Fels, Sidney
A framework for evaluating usability of clinical monitoring technology.
- DOI: 10.1007/s10877-007-9091-y
- Published: 2007-10-01
- Impact factor: 2.2
- Authors: Daniels, Jeremy; Fels, Sidney; Ansermino, J Mark
- Corresponding author: Ansermino, J Mark
Optimizing Multiple Object Tracking and Best View Video Synthesis
- DOI: 10.1109/tmm.2008.2001379
- Published: 2008-10-01
- Impact factor: 7.3
- Authors: Jiang, Hao; Fels, Sidney; Little, James J.
- Corresponding author: Little, James J.
Extracting moving boundaries from dynamic, multislice CT images for fluid simulation
- DOI: 10.1080/21681163.2016.1188028
- Published: 2018-01-01
- Impact factor: 1.6
- Authors: Ho, Andrew Kenneth; Inamoto, Yoko; Fels, Sidney
- Corresponding author: Fels, Sidney
Other grants held by Fels, Sidney
Creating and Evaluating New Media Interfaces for Expression
- Grant number: RGPIN-2020-07054
- Fiscal year: 2022
- Amount: $18,900
- Program: Discovery Grants Program - Individual
Creating and Evaluating New Media Interfaces for Expression
- Grant number: RGPIN-2020-07054
- Fiscal year: 2021
- Amount: $18,900
- Program: Discovery Grants Program - Individual
Creating and Evaluating New Media Interfaces for Expression
- Grant number: RGPIN-2020-07054
- Fiscal year: 2020
- Amount: $18,900
- Program: Discovery Grants Program - Individual
Creation and Evaluation of Novel Media Experiences
- Grant number: RGPIN-2015-04971
- Fiscal year: 2019
- Amount: $18,900
- Program: Discovery Grants Program - Individual
An investigation of a ViDeX interface in onenote to provide video experience for learning
- Grant number: 508852-2017
- Fiscal year: 2019
- Amount: $18,900
- Program: Collaborative Research and Development Grants
Creation and Evaluation of Novel Media Experiences
- Grant number: RGPIN-2015-04971
- Fiscal year: 2018
- Amount: $18,900
- Program: Discovery Grants Program - Individual
An investigation of a ViDeX interface in onenote to provide video experience for learning
- Grant number: 508852-2017
- Fiscal year: 2018
- Amount: $18,900
- Program: Collaborative Research and Development Grants
Control strategies for articulatory speech synthesis for natural user interfaces
- Grant number: 506576-2017
- Fiscal year: 2018
- Amount: $18,900
- Program: Strategic Projects - Group
Creation and Evaluation of Novel Media Experiences
- Grant number: RGPIN-2015-04971
- Fiscal year: 2017
- Amount: $18,900
- Program: Discovery Grants Program - Individual
Control strategies for articulatory speech synthesis for natural user interfaces
- Grant number: 506576-2017
- Fiscal year: 2017
- Amount: $18,900
- Program: Strategic Projects - Group
Similar overseas grants
Collaborative Research: Estimating Articulatory Constriction Place and Timing from Speech Acoustics
- Grant number: 2343847
- Fiscal year: 2023
- Amount: $18,900
- Program: Standard Grant
Collaborative Research: Estimating Articulatory Constriction Place and Timing from Speech Acoustics
- Grant number: 2141275
- Fiscal year: 2022
- Amount: $18,900
- Program: Standard Grant
Expanding articulatory information from ultrasound imaging of speech using MRI-based image simulations and audio measurements
- Grant number: 10537976
- Fiscal year: 2022
- Amount: $18,900
Collaborative Research: Estimating Articulatory Constriction Place and Timing from Speech Acoustics
- Grant number: 2141413
- Fiscal year: 2022
- Amount: $18,900
- Program: Standard Grant
Collaborative Research: Estimating Articulatory Constriction Place and Timing from Speech Acoustics
- Grant number: 2141246
- Fiscal year: 2022
- Amount: $18,900
- Program: Standard Grant
Exploring the validity of articulatory impairment phenotypes in speech motor disorders
- Grant number: 10331841
- Fiscal year: 2021
- Amount: $18,900
The role of anatomical factors in speech production: towards the speaker-independent articulatory model.
- Grant number: RGPIN-2016-06587
- Fiscal year: 2021
- Amount: $18,900
- Program: Discovery Grants Program - Individual
The role of anatomical factors in speech production: towards the speaker-independent articulatory model.
- Grant number: RGPIN-2016-06587
- Fiscal year: 2020
- Amount: $18,900
- Program: Discovery Grants Program - Individual
Collaborative Research: RI: Small: From Ultrasound and MRI to articulatory and acoustic models of child speech development
- Grant number: 2006979
- Fiscal year: 2020
- Amount: $18,900
- Program: Standard Grant
Collaborative Research: RI: Small: From ultrasound and MRI to articulatory and acoustic models of child speech development
- Grant number: 2006818
- Fiscal year: 2020
- Amount: $18,900
- Program: Standard Grant