A study on content summarization for large spoken documents and content retrieval through spoken dialogue


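Basic information

Grant number: 13480095
Principal investigator: NAKAGAWA Seiichi
Amount: about $94,700
Host institution: Toyohashi University of Technology
Host institution country: Japan
Project category: Grant-in-Aid for Scientific Research (B)
Fiscal year: 2001
Funding country: Japan
Project period: 2001 to 2004
Project status: Completed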

Project abstract

To develop an accurate large-vocabulary continuous speech recognition (LVCSR) system for open-domain spoken document retrieval, we proposed a search method that runs two search algorithms in parallel to achieve efficient and accurate decoding. We evaluated this search algorithm and obtained a significant improvement in recognition performance without a severe increase in computational cost.

We also proposed applying machine learning techniques to the task of combining the outputs of multiple LVCSR models. The proposed technique had advantages over voting schemes such as ROVER, especially when the majority of participating models are not reliable. Using this technique in a speech-driven Web retrieval task, we improved the recognition accuracy of spoken queries and, in turn, the retrieval accuracy.

We also worked on the summarization of spoken lectures. For this purpose, we investigated the relations between surface linguistic information and human summarization results, and identified useful surface linguistic information. We then summarized spoken lectures based on this information and compared the output with the human results. As a result, we obtained a better F-measure and a k-value comparable with the human results.

We have developed a portable speech recognition module and an interpreter module in a spoken dialogue system. Furthermore, we developed a dialogue strategy design tool, applied it to Mt. Fuji sightseeing guidance retrieval, literature retrieval, and hotel reservation retrieval, and confirmed its usefulness.
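
The abstract reports summarization quality as an F-measure and a k-value but does not say how either was computed. As an illustration only, the sketch below assumes that a summary is a set of sentence indices extracted from a lecture, and that "k-value" refers to Cohen's kappa between two select/skip labelings. Both readings are assumptions, and the function names `f_measure` and `cohen_kappa` are hypothetical, not taken from the project.

```python
from __future__ import annotations


def f_measure(system: set[int], reference: set[int]) -> float:
    """Harmonic mean of precision and recall over selected sentence indices."""
    overlap = len(system & reference)
    if overlap == 0:
        # Covers empty selections and selections with nothing in common.
        return 0.0
    precision = overlap / len(system)
    recall = overlap / len(reference)
    return 2 * precision * recall / (precision + recall)


def cohen_kappa(a: set[int], b: set[int], n_sentences: int) -> float:
    """Cohen's kappa for two binary select/skip labelings of the same sentences.

    Assumes every index in `a` and `b` lies in range(n_sentences).
    """
    both = len(a & b)                      # sentences selected by both
    neither = n_sentences - len(a | b)     # sentences skipped by both
    observed = (both + neither) / n_sentences
    p_a = len(a) / n_sentences
    p_b = len(b) / n_sentences
    # Agreement expected by chance if the two labelings were independent.
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    if expected == 1.0:
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical 10-sentence lecture: the system and one human pick 4 sentences each.
system_pick = {0, 2, 5, 7}
human_pick = {0, 2, 6, 7}
f_score = f_measure(system_pick, human_pick)
kappa = cohen_kappa(system_pick, human_pick, n_sentences=10)
```

Under these assumptions, the F-measure rewards overlap with the human selection, while kappa discounts the agreement that two selections of this size would reach by chance, which is why the two figures are usually reported together.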
