A Study on Constructing Various Acoustic Models using Distributed Speech Corpora
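Basic information
- Grant number: 15200014
- Principal investigator:
- Amount: $293,700
- Host institution:
- Host institution country: Japan
- Project category: Grant-in-Aid for Scientific Research (A)
- Fiscal year: 2003
- Funding country: Japan
- Period: 2003 to 2005
- Project status: Completed
- Source:
- Keywords:
Project abstract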
To collect speech uttered under a variety of environmental conditions, field tests of spoken dialogue systems were conducted for public transportation guidance, in-car information retrieval, and guidance in a public space. Based on the three resulting corpora, a prototype of a data-sharing infrastructure for acoustic model training was developed. In this system, one can search for particular speech subsets by issuing queries on the age of the speakers, the SNR of the utterances, and the distribution of phoneme frequencies. The system can train a set of HMMs by sharing the sufficient statistics, i.e., the visiting count, the branching count, the sum, and the square sum, for the Gaussian mixture pdfs of each state of the HMM acoustic models. In addition, to characterize each utterance, a blind SNR estimation method, i.e., one that does not require explicit voice activity detection (VAD), was developed to cover a wide range of SNRs.
As for the training strategy, not only maximum likelihood (ML) training over the set of utterances but also a model adaptation method that uses only statistics was studied. The effectiveness of the adaptation approach, which uses pre-stored statistics for each utterance, was confirmed through recognition experiments in which the accuracy of the model trained by adaptation was almost equivalent to that of the pooled EM algorithm.
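The idea of training from shared statistics rather than shared audio can be illustrated with a minimal sketch. It is not the project's implementation: it assumes a single one-dimensional Gaussian per state, hard state alignments (every weight is 1), and it reads the abstract's "visiting count" as a state occupancy count. The branching (transition) counts are omitted. All names (`GaussStats`, `stats_from_frames`, `site_a`, `site_b`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class GaussStats:
    """Accumulated statistics for one 1-D Gaussian of one HMM state."""
    count: float = 0.0     # occupancy count (assumed meaning of "visiting count")
    total: float = 0.0     # sum of observations
    sq_total: float = 0.0  # sum of squared observations

    def merge(self, other: "GaussStats") -> "GaussStats":
        # Statistics from different sites combine by simple addition.
        return GaussStats(self.count + other.count,
                          self.total + other.total,
                          self.sq_total + other.sq_total)

    def estimate(self):
        # ML mean and (biased) variance recovered from the statistics alone.
        mean = self.total / self.count
        variance = self.sq_total / self.count - mean * mean
        return mean, variance


def stats_from_frames(frames, weights=None):
    """Accumulate statistics locally; weights stand in for state posteriors."""
    if weights is None:
        weights = [1.0] * len(frames)  # assumption: hard state alignment
    stats = GaussStats()
    for x, w in zip(frames, weights):
        stats.count += w
        stats.total += w * x
        stats.sq_total += w * x * x
    return stats


# Hypothetical feature values aligned to the same state at two corpus sites.
site_a = stats_from_frames([1.0, 2.0, 3.0])
site_b = stats_from_frames([4.0, 5.0])

# Estimate from merged statistics versus from the pooled raw frames.
pooled_mean, pooled_var = site_a.merge(site_b).estimate()
direct_mean, direct_var = stats_from_frames([1.0, 2.0, 3.0, 4.0, 5.0]).estimate()
print(pooled_mean, pooled_var)
print(direct_mean, direct_var)
```

Because the three quantities are additive, the merged estimate matches the one computed from the pooled frames, so in this simplified setting each site only needs to exchange a few numbers per Gaussian instead of the recordings themselves.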
Project outcomes
Journal articles (1163)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Discrimination between Singing and Speaking Voices
- DOI:
- Published: 2005
- Journal:
- Impact factor: 0
- Authors: Yasunori OHISHI; Masataka GOTO; Katunobu ITO; Kazuya TAKEDA
- Corresponding author: Kazuya TAKEDA
実走行車内単語音声データベースCENSREC-3と共通評価環境の構築
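Construction of CENSREC-3, a database of word speech recorded in a car under real driving conditions, and a common evaluation environment
- DOI:
- Published: 2005
- Journal:
- Impact factor: 0
- Authors: 藤本雅清; 中村哲; 武田一哉; 黒岩眞吾; 山田武志; 北岡教英; 山本一公; 水町光徳; 西浦敬信; 佐宗晃; 宮島千代美; 遠藤俊樹
- Corresponding author: 遠藤俊樹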
Sound field auralization system in free listening positions
- DOI:
- Published: 2005
- Journal:
- Impact factor: 0
- Authors: Toshiyuki KIMURA; Wataru MIZUNO; Takanori NISHINO; Kazuya TAKEDA
- Corresponding author: Kazuya TAKEDA
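Other publications by TAKEDA Kazuya
機械学習結果を利用した確率的情報処理法に関する一検討
A study on probabilistic information processing methods using machine learning results
- DOI:
- Published: 2019
- Journal:
- Impact factor: 0
- Authors: TERANISHI Masakiyo; FUJII Keisuke; TAKEDA Kazuya; 片岡 駿
- Corresponding author: 片岡 駿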
Other grants by TAKEDA Kazuya
A study of comparative history and 3D archive creating on the walled cities and Buddhist stupas in eastern Eurasia during the 5-13th century.
- Grant number: 18K00918
- Fiscal year: 2018
- Amount: $293,700
- Project category: Grant-in-Aid for Scientific Research (C)
Interdisciplinary research on the use of historical and archaeological materials to build a diversity of crop resources for the next generation.
- Grant number: 18KT0048
- Fiscal year: 2018
- Amount: $293,700
- Project category: Grant-in-Aid for Scientific Research (B)
Interdisciplinary research on the distribution, cultivation, and food cultural history on the cruciferous crops
- Grant number: 26300003
- Fiscal year: 2014
- Amount: $293,700
- Project category: Grant-in-Aid for Scientific Research (B)
Analysis of individuality in subjective similarity among music songs
- Grant number: 25540168
- Fiscal year: 2013
- Amount: $293,700
- Project category: Grant-in-Aid for Challenging Exploratory Research
Restoring study and compilation of historical, archaeological, linguistic material about the Qhitai (Liao) dynasty period by digital archive technology.
- Grant number: 25370842
- Fiscal year: 2013
- Amount: $293,700
- Project category: Grant-in-Aid for Scientific Research (C)
Dynamical analysis of the F0 entrainment in chorus singing through stochastic phase plane
- Grant number: 23650088
- Fiscal year: 2011
- Amount: $293,700
- Project category: Grant-in-Aid for Challenging Exploratory Research
Biochemical analysis of new secreted APP and mutant Aβ
- Grant number: 21680032
- Fiscal year: 2009
- Amount: $293,700
- Project category: Grant-in-Aid for Young Scientists (A)
Study on Speech Enhancement Based on Distorted Speech Corpora in the Real-world
- Grant number: 19300060
- Fiscal year: 2007
- Amount: $293,700
- Project category: Grant-in-Aid for Scientific Research (B)
The effects of N-terminal region of Abeta-protein on the cerebral amyloid deposition.
- Grant number: 19700333
- Fiscal year: 2007
- Amount: $293,700
- Project category: Grant-in-Aid for Young Scientists (B)
Integrated Modeling of Acoustic and Linguistic Knowledge for Stochastic
- Grant number: 11680386
- Fiscal year: 1999
- Amount: $293,700
- Project category: Grant-in-Aid for Scientific Research (C)
Similar overseas grants
Development and Applications of a Novel Acoustic Model for Respiratory System
- Grant number: RGPIN-2016-06549
- Fiscal year: 2021
- Amount: $293,700
- Project category: Discovery Grants Program - Individual
Development and Applications of a Novel Acoustic Model for Respiratory System
- Grant number: RGPIN-2016-06549
- Fiscal year: 2020
- Amount: $293,700
- Project category: Discovery Grants Program - Individual
Construction of acoustic model on small training data for dysarthric speech recognition
- Grant number: 20K19862
- Fiscal year: 2020
- Amount: $293,700
- Project category: Grant-in-Aid for Early-Career Scientists
Development and Applications of a Novel Acoustic Model for Respiratory System
- Grant number: RGPIN-2016-06549
- Fiscal year: 2019
- Amount: $293,700
- Project category: Discovery Grants Program - Individual
Toward the next generation in transcranial MR-guided focused ultrasound: Innovations in thermal and acoustic model-based planning and monitoring for improved safety, efficacy and efficiency
- Grant number: 9803678
- Fiscal year: 2019
- Amount: $293,700
- Project category:
Toward the next generation in transcranial MR-guided focused ultrasound: Innovations in thermal and acoustic model-based planning and monitoring for improved safety, efficacy and efficiency
- Grant number: 10159735
- Fiscal year: 2019
- Amount: $293,700
- Project category:
Automatic acquisition of optimized acoustic model unit for automatic speech recognition using deep learning
- Grant number: 19K12027
- Fiscal year: 2019
- Amount: $293,700
- Project category: Grant-in-Aid for Scientific Research (C)
Toward the next generation in transcranial MR-guided focused ultrasound: Innovations in thermal and acoustic model-based planning and monitoring for improved safety, efficacy and efficiency
- Grant number: 10401242
- Fiscal year: 2019
- Amount: $293,700
- Project category:
Development and Applications of a Novel Acoustic Model for Respiratory System
- Grant number: RGPIN-2016-06549
- Fiscal year: 2018
- Amount: $293,700
- Project category: Discovery Grants Program - Individual
Development and Applications of a Novel Acoustic Model for Respiratory System
- Grant number: RGPIN-2016-06549
- Fiscal year: 2017
- Amount: $293,700
- Project category: Discovery Grants Program - Individual