Automatic recognition of topic transition for newspaper articles and application to document summary
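Basic information
- Grant number: 15500086
- Principal investigator:
- Amount: $25,000
- Host institution:
- Host institution country: Japan
- Project category: Grant-in-Aid for Scientific Research (C)
- Fiscal year: 2003
- Funding country: Japan
- Period: 2003 to 2004
- Project status: Completed
- Source:
- Keywords: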
Project abstract
This study addressed automatic summarization of newspaper articles. As a first step toward accurate multi-document summarization, we investigated the following.

(1) A subject template is built from a large-scale corpus and used to extract the follow-up articles of a target article. The extracted follow-up articles are then classified into subject clusters.
(2) Each subject cluster is summarized, and the follow-up articles as a whole are summarized with the connections between clusters taken into account.

Regarding (1), we proposed a topic tracking method that uses subject templates and machine learning (support vector machines). We also showed that the method extracts follow-up articles with high accuracy on large corpora (the Topic Detection and Tracking corpus and articles from the Mainichi Shimbun newspaper) (research papers 4, 5).

Regarding (2), we found that multi-document summarization requires extracting synonyms of each word. We proposed a method for identifying synonym pairs in Japanese newspaper text (3, 4). For synonym identification, we compared Lin's method with Hindle's method and found that Lin's method performs better on Japanese documents, and we proposed a method for Japanese documents based on Lin's method. We also ran sentence extraction experiments using automatically extracted synonym pairs and the titles of newspaper articles (research papers 1, 2). The procedure is as follows: first, synonyms of the words in article titles are extracted from newspaper articles with the proposed Lin-based method; then, sentence extraction is performed using those synonyms. The results show that identifying synonyms is useful for sentence extraction.
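The title-based sentence extraction step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the scoring rule (share of a sentence's distinct words that match a title word or one of its synonyms), the English regex tokenizer standing in for Japanese morphological analysis, and the hand-written synonym table standing in for the output of the Lin-based method are all assumptions made for illustration.

```python
import re


def tokenize(text):
    """Lowercase word tokenizer (assumed stand-in for Japanese morphological analysis)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def expand_title_terms(title, synonyms):
    """Return the title words plus any synonyms listed for them."""
    terms = set(tokenize(title))
    for word in list(terms):
        terms.update(synonyms.get(word, ()))
    return terms


def score_sentence(sentence, terms):
    """Share of the sentence's distinct words that are title words or synonyms (assumed rule)."""
    words = set(tokenize(sentence))
    if not words:
        return 0.0
    return len(words & terms) / len(words)


def extract_sentences(title, sentences, synonyms, k=1):
    """Pick the k highest-scoring sentences and return them in document order."""
    terms = expand_title_terms(title, synonyms)
    ranked = sorted(
        range(len(sentences)),
        key=lambda i: score_sentence(sentences[i], terms),
        reverse=True,
    )
    return [sentences[i] for i in sorted(ranked[:k])]


if __name__ == "__main__":
    # Hypothetical synonym pairs; in the study these would be extracted automatically.
    synonyms = {"tax": {"levy"}, "raises": {"increases"}}
    article = [
        "The weather was mild on Monday.",
        "The cabinet increases the levy on fuel.",
        "Officials declined to comment.",
    ]
    print(extract_sentences("Government raises tax", article, synonyms))
```

In this toy article no sentence repeats a title word literally, so the synonym table is what lets the second sentence be matched to the title. That is the kind of effect the abstract reports, though the actual experiments used a different corpus and, presumably, a different scoring scheme.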
Project outcomes
Journal papers (28)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Yoshimi Suzuki, Fumiyo Fukumoto, Yoshihiro Sekiguchi: "Complementing News Stories with Newswire Articles for Topic Tracking" Proceedings of PACLING'03 (Pacific Association for Computational Linguistics 2003). 1. 265-274 (2003)
- DOI:
- Year published:
- Journal:
- Impact factor: 0
- Authors:
- Corresponding author:
Extracting Similar Nouns for Sentence Extraction
- DOI:
- Year published: 2004
- Journal:
- Impact factor: 0
- Authors: Yoshimi Suzuki; Fumiyo Fukumoto
- Corresponding author: Fumiyo Fukumoto
続報記事抽出のための記事間類似度を利用したSVM学習データの自動生成
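Automatic generation of SVM training data using inter-article similarity for follow-up article extraction
- DOI:
- Year published: 2005
- Journal:
- Impact factor: 0
- Authors: 北條博之; 鈴木良弥
- Corresponding author: 鈴木良弥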
Complementing news stories with newswire articles for topic tracking
- DOI:
- Year published: 2003
- Journal:
- Impact factor: 0
- Authors: Yoshimi SUZUKI; Fumiyo FUKUMOTO; Yoshihiro SEKIGUCHI
- Corresponding author: Yoshihiro SEKIGUCHI
Sentence Extraction using Similar Words
- DOI:
- Year published: 2005
- Journal:
- Impact factor: 0
- Authors: Yoshimi Suzuki; Fumiyo Fukumoto
- Corresponding author: Fumiyo Fukumoto
Other publications by SUZUKI Yoshimi
Ehrlichia ruminantium多型解析のためのMulti-Locus Variable Number Tandem Repeat Analysis (MLUA)法開発
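Development of a Multi-Locus Variable Number Tandem Repeat Analysis (MLUA) method for polymorphism analysis of Ehrlichia ruminantium
- DOI:
- Year published: 2010
- Journal:
- Impact factor: 0
- Authors: CASARETO Beatriz E; NIRAULA Mohan; SUZUKI Toshiyuki; OHBA Hideo; AGOSTINI Sylvain; SUZUKI Yoshimi; 中尾亮
- Corresponding author: 中尾亮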
Nitrogen fixation in fringing coral reefs : a comparison among different sub-environments
- DOI:
- Year published: 2009
- Journal:
- Impact factor: 0
- Authors: CASARETO Beatriz E; NIRAULA Mohan; SUZUKI Toshiyuki; OHBA Hideo; AGOSTINI Sylvain; SUZUKI Yoshimi
- Corresponding author: SUZUKI Yoshimi
Other grants by SUZUKI Yoshimi
Construction of a computational model of word sense based on vocabulary, phonology, and pronunciation, and its application to multiple document summarization
- Grant number: 18K11429
- Fiscal year: 2018
- Funding amount: $25,000
- Project category: Grant-in-Aid for Scientific Research (C)
Developing a competency model of readiness for bioterrorism in public health nurses
- Grant number: 17K12598
- Fiscal year: 2017
- Funding amount: $25,000
- Project category: Grant-in-Aid for Scientific Research (C)
Tone Understanding of Article based on Word Connotation and Application to Text Summarization
- Grant number: 26330247
- Fiscal year: 2014
- Funding amount: $25,000
- Project category: Grant-in-Aid for Scientific Research (C)
Evaluation of public health nursing students achievement levels based on the collaboration with local governments and universities pioneering the introduction of public health nursing electives
- Grant number: 26463577
- Fiscal year: 2014
- Funding amount: $25,000
- Project category: Grant-in-Aid for Scientific Research (C)
Evaluation of the breast cancer early detection educational program for Filipino women in Japan based on partnership
- Grant number: 23593411
- Fiscal year: 2011
- Funding amount: $25,000
- Project category: Grant-in-Aid for Scientific Research (C)
Thesaurus construction using corpus of science and technology and its application to document retrieval
- Grant number: 20500127
- Fiscal year: 2008
- Funding amount: $25,000
- Project category: Grant-in-Aid for Scientific Research (C)
Developing partnership program to promote Filipino women's health
- Grant number: 20890234
- Fiscal year: 2008
- Funding amount: $25,000
- Project category: Grant-in-Aid for Young Scientists (Start-up)
Characteristics of coral bleaching in the Mauritius : micro ecosystem and biogeochemistry
- Grant number: 19255002
- Fiscal year: 2007
- Funding amount: $25,000
- Project category: Grant-in-Aid for Scientific Research (A)
Chemical and biological study of coral reef around the Mauritius : investigation on the bleaching
- Grant number: 15255004
- Fiscal year: 2003
- Funding amount: $25,000
- Project category: Grant-in-Aid for Scientific Research (A)
Quantitative relationship on the behavior between organic matters and radionuclides in seawater
- Grant number: 13480155
- Fiscal year: 2001
- Funding amount: $25,000
- Project category: Grant-in-Aid for Scientific Research (B)