Large Vocabulary Continuous Speech Recognition System on Japanese Newspaper Reading Task
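Basic information
- Grant number: 10680368
- Principal investigator:
- Amount: $21,100
- Host institution:
- Host institution country: Japan
- Project category: Grant-in-Aid for Scientific Research (C)
- Fiscal year: 1998
- Funding country: Japan
- Period: 1998 to 2000
- Project status: Completed
- Source:
- Keywords:
Project abstract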
We investigated a large vocabulary continuous speech recognition (LVCSR) system for a Japanese newspaper reading task and obtained the following results.
(1) Acoustic models: A Hidden Markov Network (HM-Net) is a highly accurate and robust acoustic model that represents the tied-state structure of context-dependent Hidden Markov Models as a network. We propose a state-clustering-based rapid topology design method to generate high-accuracy HM-Nets for LVCSR. Furthermore, MLLR (Maximum Likelihood Linear Regression)-based speaker adaptation of acoustic models is investigated, and a regression class selection algorithm based on the BIC principle is proposed.
(2) Language models: An N-gram task adaptation method is investigated that uses a large corpus of the general task (TI text) and a small corpus of the specific task (AD text), and employs a simple weighting to mix the TI and AD texts. Furthermore, we propose a new SCFG (Stochastic Context Free Grammar) model that uses a phrase-based dependency grammar instead of a general CFG. The word error rate obtained with a mixture model based on the proposed SCFG model and a trigram is lower than that obtained with the trigram alone.
(3) Decoder: We investigate fast search strategies for LVCSR and propose a new method, phoneme-graph-based hypothesis restriction, which effectively prunes the search space. In the proposed method, a phoneme graph is generated at the pre-processing stage, and the best word sequence is then searched for at the main recognition stage while the expansion of hypotheses is restricted using the information in the phoneme graph. In a multiple-pass LVCSR system that uses a word graph as an intermediate data structure, the decoder parameters should be optimized in order to generate a good word graph. A new method to optimize these parameters is proposed. This method rescores the word graph using a bigram LM rather than generating many word graphs for each parameter setting.
(4) Software tool: We describe a statistical language model toolkit for word and class-based n-grams. The toolkit has command-level compatibility with the CMU-Cambridge SLM Toolkit and supports class n-grams and n-gram count mixture as well as combined language models using linear interpolation.
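The simple weighting used to mix TI and AD texts in item (2) can be illustrated as a weighted count mixture for a bigram model. This is only a minimal sketch under stated assumptions: whitespace-tokenized toy text, unsmoothed maximum-likelihood estimates, and a hypothetical weight `ad_weight`. The abstract does not give the actual weighting scheme, model order, or smoothing, so none of these choices should be read as the project's method.

```python
from collections import Counter

def bigram_counts(sentences):
    """Count bigrams and their history words, adding sentence boundary markers."""
    bigrams, histories = Counter(), Counter()
    for sent in sentences:
        words = ["<s>"] + sent.split() + ["</s>"]
        for prev, cur in zip(words, words[1:]):
            bigrams[(prev, cur)] += 1
            histories[prev] += 1
    return bigrams, histories

def mix_counts(ti, ad, ad_weight):
    """Weighted count mixture: c(x) = c_TI(x) + ad_weight * c_AD(x)."""
    mixed = Counter()
    for key, count in ti.items():
        mixed[key] += count
    for key, count in ad.items():
        mixed[key] += ad_weight * count
    return mixed

def bigram_prob(bigrams, histories, prev, cur):
    """Unsmoothed maximum-likelihood estimate of P(cur | prev)."""
    if histories[prev] == 0:
        return 0.0
    return bigrams[(prev, cur)] / histories[prev]

# Toy stand-ins for the large general-task corpus (TI) and the small
# task-specific corpus (AD); the sentences are invented for illustration.
ti_text = ["the market rose today", "the market fell today"]
ad_text = ["the yen rose"]

ti_bi, ti_hist = bigram_counts(ti_text)
ad_bi, ad_hist = bigram_counts(ad_text)

# ad_weight is a hypothetical value; in practice it would be tuned on held-out data.
mixed_bi = mix_counts(ti_bi, ad_bi, ad_weight=4.0)
mixed_hist = mix_counts(ti_hist, ad_hist, ad_weight=4.0)

p_yen = bigram_prob(mixed_bi, mixed_hist, "the", "yen")
p_market = bigram_prob(mixed_bi, mixed_hist, "the", "market")
```

Raising `ad_weight` shifts probability mass toward word sequences seen in the small task-specific corpus, which is the effect a count mixture is meant to achieve when the specific-task data is scarce.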
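The parameter optimization in item (3) relies on rescoring a fixed intermediate result instead of re-running the decoder for every setting. The sketch below illustrates that idea with a small N-best list standing in for the word graph, which is a simplification of this example and not the project's data structure. The parameter names `lm_weight` and `ins_penalty`, the grid values, and all scores are hypothetical.

```python
from itertools import product

def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two word lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (r != h)))  # substitution or match
        prev = cur
    return prev[-1]

def best_hypothesis(hyps, lm_weight, ins_penalty):
    """Pick the hypothesis with the highest combined log score."""
    def score(hyp):
        words, acoustic, lm = hyp
        return acoustic + lm_weight * lm + ins_penalty * len(words)
    return max(hyps, key=score)[0]

def tune(utterances, weights, penalties):
    """Grid search: rescore the stored hypotheses for each setting, no re-decoding."""
    best = None
    for w, p in product(weights, penalties):
        errors = sum(edit_distance(ref, best_hypothesis(hyps, w, p))
                     for ref, hyps in utterances)
        if best is None or errors < best[0]:
            best = (errors, w, p)
    return best

# One toy utterance: (reference words, [(words, acoustic log score, LM log prob), ...]).
utterances = [
    (["a", "b", "c"],
     [(["a", "b", "c"], -10.0, -3.0),
      (["a", "d", "c"], -9.0, -6.0),
      (["a", "b", "c", "c"], -9.5, -3.2)]),
]

result = tune(utterances, weights=[0.0, 1.0], penalties=[-1.0, 0.0])
```

Because the acoustic and language model scores are stored once, each grid point costs only a rescoring pass, which is why this kind of search is cheaper than regenerating the intermediate structure for every setting.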
Project outcomes
Journal articles (49)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
A. Ito, M. Kohda, M. Ostendorf: "A New Metric for Stochastic Language Model Evaluation," Proc. Euro. Conf. on Speech Commu. and Technology, Vol. 4, 1591-1594 (1999)
加藤正治: "単語グラフ生成におけるパラメータ最適化の検討" [A Study of Parameter Optimization in Word Graph Generation], IEICE Technical Report, SP2000-93, 107-112 (2000)
C. Hori, M. Katoh, A. Ito, M. Kohda: "Construction and Evaluation of Language Models Based on Stochastic Context Free Grammar for Speech Recognition," Trans. IEICE (D-II), Vol. J83-D-II, No. 11, 2407-2417 (2000)
伊藤彰則 (Akinori Ito): "単語およびクラスN-gram作成のためのツールキット" [A Toolkit for Building Word and Class N-grams], IEICE Technical Report, SP2000-106, 67-72 (2000)
斎院 俊典 (Toshinori Saiin): "単語グラフ生成の言語重み・挿入ペナルティ最適化の検討" [A Study of Language Weight and Insertion Penalty Optimization for Word Graph Generation], Proceedings of the Acoustical Society of Japan Meeting, 2-8-12, 47-48 (2000)
Other grants by KOHDA Masaki
Large-vocabulary continuous speech recognition on spontaneous speech task
- Grant number: 18500126
- Fiscal year: 2006
- Funding amount: $21,100
- Project category: Grant-in-Aid for Scientific Research (C)
Spontaneous speech recognition
- Grant number: 15500098
- Fiscal year: 2003
- Funding amount: $21,100
- Project category: Grant-in-Aid for Scientific Research (C)
Algorithm of Spontaneous Speech Recognition Based on A* Search
- Grant number: 07680379
- Fiscal year: 1995
- Funding amount: $21,100
- Project category: Grant-in-Aid for Scientific Research (C)
Speech Recognition Based on Intelligent Beam Search Algorithm
- Grant number: 01460254
- Fiscal year: 1989
- Funding amount: $21,100
- Project category: Grant-in-Aid for General Scientific Research (B)
Similar overseas grants
Development and Applications of a Novel Acoustic Model for Respiratory System
- Grant number: RGPIN-2016-06549
- Fiscal year: 2021
- Funding amount: $21,100
- Project category: Discovery Grants Program - Individual
Development and Applications of a Novel Acoustic Model for Respiratory System
- Grant number: RGPIN-2016-06549
- Fiscal year: 2020
- Funding amount: $21,100
- Project category: Discovery Grants Program - Individual
Construction of acoustic model on small training data for dysarthric speech recognition
- Grant number: 20K19862
- Fiscal year: 2020
- Funding amount: $21,100
- Project category: Grant-in-Aid for Early-Career Scientists
Development and Applications of a Novel Acoustic Model for Respiratory System
- Grant number: RGPIN-2016-06549
- Fiscal year: 2019
- Funding amount: $21,100
- Project category: Discovery Grants Program - Individual
Toward the next generation in transcranial MR-guided focused ultrasound: Innovations in thermal and acoustic model-based planning and monitoring for improved safety, efficacy and efficiency
- Grant number: 9803678
- Fiscal year: 2019
- Funding amount: $21,100
- Project category:
Toward the next generation in transcranial MR-guided focused ultrasound: Innovations in thermal and acoustic model-based planning and monitoring for improved safety, efficacy and efficiency
- Grant number: 10159735
- Fiscal year: 2019
- Funding amount: $21,100
- Project category:
Automatic acquisition of optimized acoustic model unit for automatic speech recognition using deep learning
- Grant number: 19K12027
- Fiscal year: 2019
- Funding amount: $21,100
- Project category: Grant-in-Aid for Scientific Research (C)
Toward the next generation in transcranial MR-guided focused ultrasound: Innovations in thermal and acoustic model-based planning and monitoring for improved safety, efficacy and efficiency
- Grant number: 10401242
- Fiscal year: 2019
- Funding amount: $21,100
- Project category:
Development and Applications of a Novel Acoustic Model for Respiratory System
- Grant number: RGPIN-2016-06549
- Fiscal year: 2018
- Funding amount: $21,100
- Project category: Discovery Grants Program - Individual
Development and Applications of a Novel Acoustic Model for Respiratory System
- Grant number: RGPIN-2016-06549
- Fiscal year: 2017
- Funding amount: $21,100
- Project category: Discovery Grants Program - Individual


















