Research on Individualized Speech Production Mechanisms Based on MRI Data
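Grant number:
61573254
Project category:
General Program
Funding amount:
650,000 CNY
Principal investigator:
Kiyoshi Honda (本多清志)
Host institution:
Subject classification:
F0605. Pattern Recognition and Data Mining
Completion year:
2019
Approval year:
2015
Project status:
Completed
Project participants:
方强, 路文焕, 曹金鑫, 张句, 张聪聪, 张敬姝, 刘立成, 李静, 关雯丹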
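Chinese Abstract
This project aims to reveal the mechanisms of human speech production, in particular the physiological basis of individual speech characteristics in articulatory movement. Using MRI as the observation method, it combines the strengths of cine-MRI and tagged-MRI to observe the complete set of articulatory organs and to track marker points. Geometric observation of articulatory structures such as the nasal cavity and the hypopharyngeal cavity from MRI images, together with acoustic simulation, is used to reveal the physiological basis of a speaker's individual articulatory characteristics. Morphological analysis of the hypopharyngeal cavity of female speakers, with corresponding acoustic simulation, reveals how the female hypopharyngeal cavity affects the speech spectrum. Solid mechanical vocal tract models are built to study quantitatively the correspondence between the articulatory organs and the acoustic features of isolated vowels. A digital acoustic model is then used to simulate dynamic articulation. The team's existing physiological articulatory model is used to simulate individualized articulatory movements. Using MRI observations and acoustic computational models, the project examines individualized speech production in depth, from the physiological level through the articulatory movement level to the acoustic features.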
English Abstract
This project aims to expand our scientific knowledge of the human mechanisms for voice and speech production so as to facilitate technological development and medical application in the future. The principal means for the project is magnetic resonance imaging (MRI), used to visualize the anatomical structures and physico-physiological mechanisms involved in voice and speech production. High-quality static and motion images will be obtained for this purpose by solving the technical difficulties currently known for MRI. Newer analysis techniques will be established to visualize smaller structures, explore unknown mechanisms, and elucidate acoustic phenomena. The structures and functions for speech production by articulation are recorded with both static and motion imaging techniques. The static images are collected and analyzed so as to distinguish muscles and cartilages from the surrounding structures. The motion images are acquired for the entire set of articulatory organs and analyzed for tissue-contact detection and marker tracking, which will describe the spatiotemporal patterns of speech articulation numerically. Real-time MRI during sentence production is used for cross-linguistic studies in this project, which will contribute to studies of language learning and clinical examination. Exploring individualized mechanisms of voice and speech production is another topic in this project. Three-dimensional shapes of the vocal tract are visualized with MRI and analyzed to discover unknown features in speech signals, such as the causal process that gives rise to individual vocal characteristics, or the articulatory normalization of vowels across male and female speakers. New knowledge obtained by MRI-based analysis is applied to refine a physiological articulatory model. Anatomical findings are used to revise muscle geometry in the model, and kinematic data are used to evaluate the performance of the model in speech articulation and synthesis.
The project team consists of the principal investigator, who has 20 years of experience in the research field, and his colleagues at Tianjin University and the Chinese Academy of Social Sciences. His international colleagues in China, Japan, USA, and France will also support this project as volunteers or consultants. The MRI systems used for this project are modern, powerful scanners with a 3-Tesla static magnetic field. Together, the researchers' experience and the advanced research systems will promote rapid advancement of the related fields in China and further stimulate speech science worldwide.
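A speaker's vocal tract structure affects the acoustic properties of speech; this project mainly studied the physiological mechanisms by which speakers produce individualized speech. Using magnetic resonance imaging (MRI) and fast scanning techniques, we built a high-resolution three-dimensional database of Chinese articulation covering the oral, laryngeal, pharyngeal, and nasal cavities, including static and some dynamic articulatory data, which made the speech production process visible. For the study of static speaker characteristics, we built solid vocal tract models, mainly from the female vowel data in the database, and carried out acoustic experiments; we then built an acoustic computational model using the finite-difference time-domain (FDTD) method and compared the results with existing findings for male speakers. The results reveal the contribution of the hypopharyngeal cavity structure to individual speech characteristics during speech production, and show clear differences in spectral characteristics between male and female speakers. For the study of dynamic speaker characteristics, we used the static data in the MRI database to define relative tongue size (RTS), a physiological structure that is not under the speaker's control, as a representation of speaker individuality, and used dynamic MRI images to compute the tongue movement speed from vowel to vowel. The results indicate that relative tongue size changes the rate of formant change by affecting the speaker's tongue movement rate. The project's findings broaden the ways in which speaker individuality can be represented and further develop the basic theory of speech production.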
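To illustrate the kind of computation the FDTD method performs, the following is a minimal sketch only, not the project's model. It assumes a one-dimensional, lossless, uniform tube that is closed at the glottis and ideally open at the lips, whereas the project works with three-dimensional vocal tract shapes from MRI. The function name `tube_resonances`, the tube length of 0.17 m, the sound speed of 350 m/s, and the grid settings are all illustrative assumptions.

```python
import numpy as np

def tube_resonances(length_m=0.17, c=350.0, rho=1.2,
                    n_cells=34, n_steps=20000, courant=0.5, n_peaks=3):
    """Estimate the lowest resonances (Hz) of a uniform closed-open tube
    with a 1-D staggered-grid FDTD scheme. All defaults are assumptions."""
    dx = length_m / n_cells
    dt = courant * dx / c            # Courant number <= 1 keeps the scheme stable
    p = np.zeros(n_cells)            # pressure at cell centres
    u = np.zeros(n_cells + 1)        # particle velocity at cell faces
    lip_flow = np.zeros(n_steps)
    p[0] = 1.0                       # pressure impulse next to the closed (glottal) end

    for n in range(n_steps):
        # Momentum equation on interior faces.
        u[1:-1] -= dt / (rho * dx) * (p[1:] - p[:-1])
        u[0] = 0.0                   # rigid wall at the glottal end
        # Ideal open end: p = 0 at the lip face, enforced with a ghost cell p_ghost = -p[-1].
        u[-1] += 2.0 * dt / (rho * dx) * p[-1]
        # Continuity equation.
        p -= rho * c**2 * dt / dx * (u[1:] - u[:-1])
        lip_flow[n] = u[-1]

    # Resonances appear as peaks in the spectrum of the lip velocity.
    mag = np.abs(np.fft.rfft(lip_flow * np.hanning(n_steps)))
    freqs = np.fft.rfftfreq(n_steps, dt)
    inner = mag[1:-1]
    is_peak = (inner > mag[:-2]) & (inner >= mag[2:]) & (inner > 0.1 * mag.max())
    return freqs[1:-1][is_peak][:n_peaks]

if __name__ == "__main__":
    length, c = 0.17, 350.0
    simulated = tube_resonances(length_m=length, c=c)
    analytic = [(2 * k - 1) * c / (4 * length) for k in (1, 2, 3)]
    print("simulated:", simulated)
    print("analytic :", analytic)
```

For this idealized tube the simulated peaks can be checked against the quarter-wavelength formula F_k = (2k - 1)c / (4L), and a shorter tube gives higher resonances. Any realistic comparison between speakers would need the measured three-dimensional geometry, wall losses, and lip radiation, none of which are modelled here.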
Journal papers
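DOI: 10.1121/1.5089220
Published: 2019-02
Journal: The Journal of the Acoustical Society of America
Impact factor: --
Authors: Ju Zhang; K. Honda; Jianguo Wei; T. Kitamura
Corresponding authors: Ju Zhang; K. Honda; Jianguo Wei; T. Kitamura
DOI: --
Published: 2018
Journal: Journal of Tsinghua University (Science and Technology) (清华大学学报(自然科学版))
Impact factor: --
Authors: 路文焕; 冯晓艳; Kiyoshi Honda; 魏建国
Corresponding author: 魏建国
DOI: 10.1109/access.2019.2918988
Published: 2019
Journal: IEEE Access
Impact factor: 3.9
Authors: Yuxuan Li; Wenhuan Lu; Yuqing He; Jianwu Dang
Corresponding author: Jianwu Dang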
Tooth visualization in vowel production MR images for three-dimensional vocal tract modeling
DOI: 10.1016/j.specom.2017.11.005
Published: 2018-02
Journal: Speech Communication
Impact factor: 3.2
Authors: Ju Zhang; Kiyoshi Honda; Jianguo Wei
Corresponding author: Jianguo Wei