A scheme for continuous speech recognition in a large context based on the human process of spoken language recognition

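Basic information

  • Grant number:
    03452164
  • Principal investigator:
  • Amount:
    $44,800
  • Host institution:
  • Host institution country:
    Japan
  • Project category:
    Grant-in-Aid for General Scientific Research (B)
  • Fiscal year:
    1991
  • Funding country:
    Japan
  • Duration:
    1991 to 1992
  • Project status:
    Completed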

Project abstract

Most current systems for automatic speech recognition fail to achieve recognition performance comparable to that of human listeners, because they are constructed without regard to the human processes of spoken language recognition. From this point of view, the present study investigates those processes and incorporates the findings into a scheme for automatic recognition of continuous speech in a large context. The main results are as follows.

1. Experimental investigation and modeling of the human processes of spoken language recognition

Using natural utterances with controlled acoustic, syntactic and semantic information as stimuli, the following findings were obtained on the human processes of spoken language recognition.
(1) The unit of speech recognition varies widely, from phones and syllables to words and phrases, depending on the experimental condition and context.
(2) Larger units generally require less accuracy of representation for correct recognition.
(3) The amount of acoustic information necessary for recognition of a given unit varies widely depending on the size of the context and the listener's prior knowledge.
(4) The accuracy and speed of access to the mental lexicon vary dynamically depending on the acoustic, syntactic, semantic and discourse information available to the listener.
Based on these findings, a model has been constructed for the human processes of spoken language recognition.

2. Proposal and implementation of a scheme for automatic recognition of spoken language

Based on the above findings and the model, a scheme for automatic recognition of continuous speech in a large context has been proposed, featuring (1) use of units of multiple sizes and multiple accuracies of acoustic feature representation, (2) use of prosodic features for word and phrase boundary detection, and (3) extraction of syntactic, semantic, and idiosyncratic information from a large context. The main components of the system have been implemented.

3. Demonstration of the validity of the proposed scheme

The proposed scheme has been tested in recognition experiments on phones, syllables and words in continuous speech with a large context, and the results have confirmed the essential validity and feasibility of the proposed scheme.

Project outputs

Journal papers (38)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Fujisaki Hiroya: "The influence of semantic and syntactic information on spoken sentence recognition" Proceedings of the 1992 International Conference on Spoken Language Processing. 1. 153-156 (1992)
Sumio Ohno: "Utilization of lexical information at multiple levels in template matching of words and phrases in continuous speech" Reports of 1992 Spring Meeting of the Acoustical Society of Japan. vol. 1. 95-96 (1992)
Minematsu, Nobuaki: "Influence of higher-order linguistic information on the perception of continuous speech" Technical Report of the Auditory Research Committee, Acoustical Society of Japan. H-92-56. 1-6 (1992)
Minematsu, Nobuaki: "Speech recognition using acoustic feature representations with multiple temporal units and accuracies" Proceedings of the 1992 (Heisei 4) Spring Meeting of the Acoustical Society of Japan. 1. 31-32 (1992)
Nobuaki Minematsu: "Automatic speech recognition using multiple temporal units and accuracy of representation for the acoustic features" Reports of 1992 Spring Meeting of the Acoustical Society of Japan. vol. 1. 31-32 (1992)

Other grants by FUJISAKI Hiroya

Automatic Estimation of Fundamental Frequency Contour Parameters and Automatic Acquisition of Generative Rules
  • Grant number:
    11480090
  • Fiscal year:
    1999
  • Funding amount:
    $44,800
  • Project category:
    Grant-in-Aid for Scientific Research (B)
Construction of an Intelligent System for Information Retrieval in an Environment of Information Network
  • Grant number:
    09558041
  • Fiscal year:
    1998
  • Funding amount:
    $44,800
  • Project category:
    Grant-in-Aid for Scientific Research (B)
A System for Rule Synthesis of Prosodic Features of Speech of Multiple Languages Based on a Generative Model of Fundamental Frequency Contours
  • Grant number:
    08458090
  • Fiscal year:
    1996
  • Funding amount:
    $44,800
  • Project category:
    Grant-in-Aid for Scientific Research (B)
International Coordination of Speech Databases, Prosodic Labeling, and Speech Input/Output Systems Assessment
  • Grant number:
    08044173
  • Fiscal year:
    1996
  • Funding amount:
    $44,800
  • Project category:
    Grant-in-Aid for international Scientific Research
Trial Construction of an Advanced Computer-readable Lexical Database Capable of Automatic Acquisition of Lexical Information
  • Grant number:
    07558274
  • Fiscal year:
    1995
  • Funding amount:
    $44,800
  • Project category:
    Grant-in-Aid for Scientific Research (A)
International Standardization of Spoken Language Databases
  • Grant number:
    05044112
  • Fiscal year:
    1993
  • Funding amount:
    $44,800
  • Project category:
    Grant-in-Aid for international Scientific Research
Production of a Prototype Lexical Database Featuring High-speed, High-accuracy Access and Lexical Knowledge Acquisition
  • Grant number:
    05558038
  • Fiscal year:
    1993
  • Funding amount:
    $44,800
  • Project category:
    Grant-in-Aid for Developmental Scientific Research (B)
Research on International Standardization of Spoken Language Database and Assessment Techniques for Speech Input/Output
  • Grant number:
    02044041
  • Fiscal year:
    1990
  • Funding amount:
    $44,800
  • Project category:
    Grant-in-Aid for international Scientific Research
Co-operative Research on Modeling of Language Acquisition and Concept Formation Process in Engineering
  • Grant number:
    01300004
  • Fiscal year:
    1989
  • Funding amount:
    $44,800
  • Project category:
    Grant-in-Aid for Co-operative Research (A)
Research on Synthesis Method for Spoken Sentences from Knowledge Representation
  • Grant number:
    63420051
  • Fiscal year:
    1988
  • Funding amount:
    $44,800
  • Project category:
    Grant-in-Aid for General Scientific Research (A)