
Development of a speech understanding system


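Basic information

Grant number:
04044108
Principal investigator:
MIZOGUCHI Riichiro
Amount:
$52,500
Host institution:
Osaka University
Host institution country:
Japan
Project category:
Grant-in-Aid for international Scientific Research
Fiscal year:
1992
Funding country:
Japan
Project period:
1992 to 1993
Project status:
Completed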

Source:
https://kaken.nii.ac.jp/grant/KAKENHI-PROJECT-04044108/
Keywords:
Speech Recognition; Speech Understanding; Korean Language; Fuzzy; Dialog Model; ATMS

Project abstract

The objective of this research is the development of fundamental techniques necessary for understanding spoken dialogue, including a knowledge-based speech recognition system, non-monotonic reasoning in natural language processing, and dialogue modeling. The research results are summarized below.

1) We verified the efficiency of the knowledge-based approach for Korean speech recognition. Furthermore, some new ideas were proposed to improve the speech recognition. To avoid the difficulties in segmentation, a non-uniform unit is introduced. Each unit has a stationary point at each end and a transient part in the middle. The parameter trajectory is described by a symbolic representation and fuzzy linguistic variables. The redundancy of speech data is exploited in the post-processor to improve the performance of the recognition system. The prototype system was tested with continuous Korean digit speech of unknown length, and a recognition rate of 97% was obtained.

2) Understanding continuous speech is generally a hard problem because acoustic information is unreliable. An efficient search mechanism is indispensable because the number of combinations of ambiguous information is very large. We therefore developed a framework for a speech understanding system based on ATMS (assumption-based truth maintenance system), a method of non-monotonic reasoning. Introducing ATMS reduced the elapsed time of natural language processing from 64 sec to 45 sec for understanding speech of 8 Japanese sentences.

3) Two kinds of dialogue models characterizing the structure of dialogue were proposed for understanding spoken dialogue. One is the SR-plan model, which describes utterance pairs composed of a stimulus and a response. The other is the Topic Packet Network (TPN), which corresponds to discourse segments. A mechanism for predicting the next utterance was also developed based on these dialogue models and evaluated on some sample dialogues.
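
As an illustration of the fuzzy linguistic variables mentioned in result 1), the sketch below grades the slope of a parameter trajectory against three linguistic terms using triangular membership functions. It is only a minimal example of the general technique: the term names ("falling", "flat", "rising"), the breakpoints, and the helper functions are hypothetical and are not taken from the project, whose actual variables and membership shapes are not given in the abstract.

```python
from typing import Dict, Tuple


def triangular(x: float, left: float, peak: float, right: float) -> float:
    """Triangular membership grade: 0 at or outside left/right, 1 at peak."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Hypothetical linguistic terms for a trajectory slope.
# The (left, peak, right) breakpoints are illustrative assumptions only.
SLOPE_TERMS: Dict[str, Tuple[float, float, float]] = {
    "falling": (-2.0, -1.0, 0.0),
    "flat": (-1.0, 0.0, 1.0),
    "rising": (0.0, 1.0, 2.0),
}


def describe_slope(slope: float) -> Dict[str, float]:
    """Return the membership grade of the slope in every linguistic term."""
    return {
        term: triangular(slope, left, peak, right)
        for term, (left, peak, right) in SLOPE_TERMS.items()
    }


def best_term(slope: float) -> str:
    """Return the term with the highest grade (first term wins on ties)."""
    grades = describe_slope(slope)
    return max(grades, key=grades.get)
```

With these assumed breakpoints, a slope of 0.75 belongs partly to "flat" and mostly to "rising", so a symbolic description would label that stretch of the trajectory as rising while retaining the grades for later processing.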
