Analysis and Retrieval of Printed and Electronic Documents for Recycle of Information

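Basic information

  • Grant number:
    14580453
  • Principal investigator:
    KISE Koichi
  • Amount:
    $24,300
  • Host institution:
  • Host institution country:
    Japan
  • Project category:
    Grant-in-Aid for Scientific Research (C)
  • Fiscal year:
    2002
  • Funding country:
    Japan
  • Period:
    2002 to 2003
  • Project status:
    Completed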

Project abstract

Recycle of information is the process of reproducing useful information from materials decomposed from the compound information contained in Web pages and documents. In this research we investigated the recycling of information from both printed and electronic documents, as well as the "reuse" of information as a step that precedes recycling. The results of this research are summarized as follows.
1. Retrieval of parts of document images and its application to question answering: As a method of reusing printed documents, we proposed a method of retrieving parts of document images based on two-dimensional density distributions of keywords. Experimental results on various English and Japanese documents show the effectiveness of the proposed method. Some basic functions of question answering were also implemented on top of this retrieval method. Question answering is the process of locating possible answers in documents in response to questions written in natural language, and is thus a kind of information recycling. Experimental results on English documents show that the method locates correct answers at the first rank for about half of the questions.
2. Embedding and recovery of electronic data on printed documents: As another approach to reusing information on printed documents, we implemented a method of embedding text information in documents at the time they are printed. 4 KB of data were successfully embedded in and recovered from B5 pages while tolerating 20% reading errors.
3. Passage retrieval of electronic documents and its application to question answering: We also proposed a method of retrieving parts of electronic documents based on the density distributions of keywords. It was applied to question answering as well, and was shown to locate correct answers at the top rank for half of the questions.
4. Information extraction from electronic documents: As a way of recycling information from electronic documents, we proposed a method of tabulating the information contained in documents. We applied this method to 7000 Japanese news articles to extract personal profile information and showed its effectiveness. In addition, we evaluated basic methods for information extraction both from web pages and with images.
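The abstract does not spell out how the keyword density distribution is computed, so the following is only a minimal one-dimensional sketch of the general idea for electronic text. It assumes unweighted keyword hits, a Hann smoothing window, and a fixed window width; the function names (`hann_window`, `keyword_density`, `best_passage`) and all parameter values are hypothetical and are not taken from the project.

```python
import math
import re


def hann_window(width):
    """Hann (raised-cosine) window with `width` taps; assumes an odd width >= 3."""
    return [0.5 * (1 - math.cos(2 * math.pi * i / (width - 1))) for i in range(width)]


def keyword_density(tokens, keywords, width=11):
    """Smooth the 0/1 keyword-hit sequence so each token position gets a density."""
    # Assumption: every keyword hit has weight 1.0 (no term weighting).
    hits = [1.0 if t in keywords else 0.0 for t in tokens]
    window = hann_window(width)
    half = width // 2
    density = []
    for pos in range(len(tokens)):
        total = 0.0
        for k, w in enumerate(window):
            j = pos + k - half  # token index covered by window tap k
            if 0 <= j < len(hits):
                total += w * hits[j]
        density.append(total)
    return density


def best_passage(text, query, width=11):
    """Return (passage, peak position, peak density) around the densest keyword region."""
    tokens = re.findall(r"\w+", text.lower())
    keywords = set(re.findall(r"\w+", query.lower()))
    if not tokens:
        return "", -1, 0.0
    density = keyword_density(tokens, keywords, width)
    peak = max(range(len(density)), key=density.__getitem__)
    half = width // 2
    start, end = max(0, peak - half), min(len(tokens), peak + half + 1)
    return " ".join(tokens[start:end]), peak, density[peak]


if __name__ == "__main__":
    sample = (
        "alpha beta gamma delta epsilon zeta eta theta iota kappa "
        "embedded data recovered from printed pages lambda mu nu xi"
    )
    passage, peak, score = best_passage(sample, "printed pages")
    print(peak, round(score, 3), passage)
```

The passage returned is simply the window of tokens centred on the highest density value. For document images, the same idea would need a two-dimensional window over word positions on the page, which this sketch does not attempt.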

Project outcomes

Journal papers (113)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Koichi Kise, et al.: "Spotting where to Read on Pages - Retrieval of Relevant Parts from Page Images" Document Analysis Systems V, Lecture Notes in Computer Science. Vol. 2423. 388-399 (2002)
Wuotang Yin, et al.: "Document Image Retrieval Using Pseudo-Relevance Feedback on Two-Dimensional Distributions of Terms" Proc. of the 65th National Convention of IPSJ. Vol. 3. 149-150 (2003)
Shota Fukushima, et al.: "A Proposal of a Question-Answering System for Document Images" Proc. of the Joint Convention of Electronics Related Societies in Kansai. G247 (2003)
Koichi Kise (黄瀬 浩一): "Indexing and Retrieval of Document Images Based on Physical Structure" (in Japanese) Information Technology Letters (情報技術レターズ). 203-204 (2003)
Wuotang Yin (尹 沃棠): "Document Image Retrieval Using Pseudo-Relevance Feedback Based on Two-Dimensional Distributions of Words" (in Japanese) Proc. of the 65th National Convention of IPSJ (情報処理学会第65回全国大会講演論文集). 3. 149-150 (2003)

Other publications by KISE Koichi

Exploring Sensor Modalities to Capture User Behaviors for Reading Detection
  • DOI:
    10.1587/transinf.2020zdl0003
  • Published:
    2022
  • Impact factor:
    0.7
  • Authors:
    ISLAM Md. Rabiul; VARGO Andrew W.; IWATA Motoi; IWAMURA Masakazu; KISE Koichi
  • Corresponding author:
    KISE Koichi
Face Image Generation of Anime Characters Using an Advanced First Order Motion Model with Facial Landmarks

Other grants held by KISE Koichi

Writing-Life Log: a challenge of camera-pen interface based on image retrieval
  • Grant number:
    23650063
  • Fiscal year:
    2011
  • Funding amount:
    $24,300
  • Project category:
    Grant-in-Aid for Challenging Exploratory Research
Large-Scale Specific Object Recognition and Its Application to Real-World Oriented Web
  • Grant number:
    22300062
  • Fiscal year:
    2010
  • Funding amount:
    $24,300
  • Project category:
    Grant-in-Aid for Scientific Research (B)
Theoretical foundations and applications of a technology for large-scale, efficient recognition of images based on near neighbor search on a set of local features
  • Grant number:
    19300062
  • Fiscal year:
    2007
  • Funding amount:
    $24,300
  • Project category:
    Grant-in-Aid for Scientific Research (B)