
Realization of HMM-based text-to-speech Synthesis Systems

Basic Information

Grant number:
10555125
Principal investigator:
KOBAYASHI Takao
Amount:
$28,800
Host institution:
Tokyo Institute of Technology
Host institution country:
Japan
Project category:
Grant-in-Aid for Scientific Research (B)
Fiscal year:
1998
Funding country:
Japan
Period:
1998 to 2000
Project status:
Completed

Source:
https://kaken.nii.ac.jp/en/grant/KAKENHI-PROJECT-10555125/
Keywords:
text-to-speech synthesis (TTS); hidden Markov model (HMM); multi-space probability distribution HMM (MSD-HMM); HMM-based speech synthesis; voice characteristics conversion; speaker interpolation; pitch pattern; speech parameter generation; morphological analysis

Project Summary

The main purpose of this research is to realize a text-to-speech synthesis system which can generate speech with various voice characteristics based on hidden Markov models (HMMs). We have obtained the following results.

1. Modeling of phonetic and prosodic information of speech based on HMM
We have proposed a new kind of HMM, called multi-space probability distribution HMM (MSD-HMM), which can model pitch pattern of speech without heuristic assumption. Then we have also proposed a technique in which spectrum, pitch, and state duration are modeled simultaneously in a unified framework of HMM.

2. Speech parameter generation from HMM
We have extended the parameter generation algorithm from HMM to a general case in which the state sequence or a part of it is latent and derived a new algorithm. We have also derived a pitch pattern generation algorithm based on MSD-HMM.

3. Realization of text-to-speech synthesis system based on HMMs
We have developed a Japanese text-to-speech synthesis system, which works on workstations and PCs, based on the simultaneous modeling of spectrum, pitch, and duration by HMM and the speech parameter generation from HMM.

4. Speech synthesis with various voice characteristics
We have proposed voice characteristics conversion techniques for the HMM-based speech synthesis system using speaker adaptation techniques for HMMs, such as MAP/VFS and MLLR. We have also proposed a speaker interpolation technique by interpolating HMM parameters among representative speakers' HMM sets. Using these techniques, we have shown that the HMM-based speech synthesis system can generate speech with various voice characteristics.
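
The speaker interpolation idea in result 4 can be illustrated with a minimal sketch. The summary does not give the project's actual interpolation rule, so this example assumes one simple choice: each HMM state has a single Gaussian output distribution with diagonal covariance, and the interpolated state takes a weighted average of the speakers' means and variances, with weights that sum to 1. The names `GaussianState` and `interpolate_states` are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class GaussianState:
    """Output distribution of one HMM state (hypothetical layout, diagonal covariance)."""
    mean: List[float]
    var: List[float]


def interpolate_states(states: Sequence[GaussianState],
                       weights: Sequence[float]) -> GaussianState:
    """Blend the same state taken from several speakers' HMM sets.

    Assumption: a plain weighted average of means and variances. Other
    interpolation rules are possible; the source does not specify one.
    """
    if not states or len(states) != len(weights):
        raise ValueError("need exactly one weight per speaker state")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("interpolation weights must sum to 1")
    dim = len(states[0].mean)
    if any(len(s.mean) != dim or len(s.var) != dim for s in states):
        raise ValueError("all states must share the same dimensionality")

    # Weighted average, computed independently for each feature dimension.
    mean = [sum(w * s.mean[d] for w, s in zip(weights, states)) for d in range(dim)]
    var = [sum(w * s.var[d] for w, s in zip(weights, states)) for d in range(dim)]
    return GaussianState(mean=mean, var=var)


# Two representative speakers, blended 25% / 75%.
speaker_a = GaussianState(mean=[1.0, 2.0], var=[0.5, 0.5])
speaker_b = GaussianState(mean=[3.0, 6.0], var=[1.5, 0.5])
blended = interpolate_states([speaker_a, speaker_b], [0.25, 0.75])
```

Sweeping the weights from `[1.0, 0.0]` to `[0.0, 1.0]` moves the blended state from the first speaker's parameters to the second's. In a full system this would be repeated for every state of the spectrum, pitch, and duration models before parameter generation.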
