Advanced Dependency Structure Analysis Using Minimum Total Penalty Method


Basic Information

  • Grant number:
    12680372
  • Principal investigator:
    OZEKI Kazuhiko
  • Amount:
    approx. $7,000
  • Host institution:
  • Host institution country:
    Japan
  • Project category:
    Grant-in-Aid for Scientific Research (C)
  • Fiscal year:
    2000
  • Funding country:
    Japan
  • Project period:
    2000 to 2002
  • Project status:
    Completed

Project Summary

1. Development of a Sentence Compression Algorithm
The sentence compression problem was formulated as the selection of an optimal subsequence of phrases from a given sentence. Based on our dependency analysis technique, an efficient algorithm was then developed to solve this problem (a simplified sketch is given after this summary).

2. Estimation of Inter-Phrase Dependency Strength and Phrase Significance
Using about 34,000 sentences from the Kyoto University Corpus, inter-phrase dependency strength was estimated from the statistical frequency of inter-phrase dependency distance, separately for each modifying-phrase class and modified-phrase class. In addition, a sentence compression experiment was conducted in which human subjects compressed 200 sentences; the results were analyzed statistically, the retention rate of each phrase class was calculated, and from this a phrase significance value was estimated for each phrase class.

3. Subjective Evaluation of Compressed Sentences
A subjective evaluation experiment was performed on sentences compressed automatically by the above algorithm together with the estimated inter-phrase dependency strength and phrase significance. The experiment used 200 test sentences, different from those in item 2, and five subjects evaluated the quality of the compressed sentences from three points of view: (a) overall impression, (b) retention of information, and (c) grammatical correctness. For comparison, the same evaluation was carried out on sentences compressed by humans and on sentences compressed by a random method. The quality of sentences compressed by the proposed method was found to lie between that of human compression and that of random compression.

4. Segmentation of Long Sentences
Because long sentences are difficult to analyze syntactically, it is desirable to segment them into shorter ones. A support vector machine (SVM) technique was applied to this problem: vectors of surface attribute values of the relevant phrases were input to the SVM, and segmentation points were estimated automatically (a second sketch is given after this summary). This yielded 77% precision and 84% recall, with a correct sentence segmentation rate of 72%.
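To make item 1 concrete, here is a minimal sketch of compression as optimal phrase-subsequence selection, under a simplified scoring assumption: each phrase carries a significance value and each candidate dependency pair a strength, and the best subsequence of a fixed length maximizes their sum. The report describes a minimum-total-penalty formulation with an efficient algorithm; the brute-force search, the `compress_sentence` function, and the toy numbers below are hypothetical illustrations, not the project's implementation.

```python
from itertools import combinations
from typing import Dict, List, Tuple

def compress_sentence(
    phrases: List[str],
    significance: List[float],
    dep_strength: Dict[Tuple[int, int], float],
    keep: int,
) -> List[str]:
    """Return the `keep`-phrase subsequence that maximizes the total
    significance of retained phrases plus the strength of dependency
    pairs whose modifier and head are both retained (a brute-force
    stand-in for the project's efficient algorithm)."""
    keep = min(keep, len(phrases))
    best_score, best_subset = float("-inf"), ()
    for subset in combinations(range(len(phrases)), keep):
        retained = set(subset)
        score = sum(significance[i] for i in subset)
        score += sum(s for (mod, head), s in dep_strength.items()
                     if mod in retained and head in retained)
        if score > best_score:
            best_score, best_subset = score, subset
    return [phrases[i] for i in best_subset]

# Toy example: keep 3 of 5 phrases; all numbers are made up.
phrases = ["kinou", "watashi-wa", "totemo", "omoshiroi", "hon-o-yonda"]
significance = [0.2, 0.8, 0.1, 0.5, 1.0]           # per-phrase significance
dep_strength = {(1, 4): 0.9, (3, 4): 0.6,          # (modifier, head): strength
                (0, 4): 0.3, (2, 3): 0.4}
print(compress_sentence(phrases, significance, dep_strength, keep=3))
# -> ['watashi-wa', 'omoshiroi', 'hon-o-yonda']
```

In practice the class-based dependency strengths and significance values estimated in item 2 would replace the toy numbers, and a dynamic-programming search would replace the exhaustive enumeration.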
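Item 4 can be sketched in the same spirit. Assuming each candidate boundary between phrases is encoded as a small vector of surface attributes (the features below, such as phrase lengths and punctuation/conjunction flags, are illustrative guesses rather than the attributes actually used), a standard SVM classifier can be trained to decide whether the boundary is a segmentation point. The example uses scikit-learn purely for illustration; it is not the toolkit used in the project.

```python
from sklearn import svm

# Toy training data: one feature vector per candidate boundary between
# phrases, e.g. [left-phrase length, right-phrase length, comma flag,
# conjunction flag]. The features and numbers are illustrative only.
X_train = [
    [3, 5, 1, 1],
    [2, 2, 0, 0],
    [6, 4, 1, 0],
    [1, 3, 0, 1],
    [5, 6, 1, 1],
    [2, 1, 0, 0],
]
y_train = [1, 0, 1, 0, 1, 0]   # 1 = segmentation point, 0 = no split

clf = svm.SVC(kernel="rbf", C=1.0)
clf.fit(X_train, y_train)

# Classify unseen boundaries; precision and recall would then be measured
# against human-annotated segmentation points.
X_test = [[4, 5, 1, 1], [2, 2, 0, 0]]
print(clf.predict(X_test))
```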

Project Outcomes

Journal articles (41)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
久保田新: "係り受け解析におけるポーズ・ピッチの利用法の検討"日本音響学会2001年秋季研究発表会講演論文集. I. 271-272 (2001)
Arata Kubota: "A study on the use of pauses and pitch in dependency analysis," Proceedings of the 2001 Autumn Meeting of the Acoustical Society of Japan, I, 271-272 (2001).
Meirong Lu, Kazuyuki Takagi, and Kazuhiko Ozeki: "The use of multiple pause information in dependency analysis of Japanese sentences"Proceedings of the 2003 Spring Meeting of the Acoustical Society of Japan. to be presented. (2003)
小黒 玲: "文節重要度と係り受け整合度に基づく日本語文簡約アルゴリズム"自然言語処理. Vol.8, No.3. 3-18 (2001)
Rei Oguro: "A Japanese sentence compaction algorithm based on phrase significance and dependency consistency," Journal of Natural Language Processing, Vol. 8, No. 3, 3-18 (2001).
沖本真美子: "韻律情報を用いた日本語読み上げ文の係り受け解析におけるニューラルネットワークの利用"日本音響学会2003年春季研究発表会講演論文集. I(発表予定). (2003)
Mamiko Okimoto: "Use of neural networks in dependency analysis of read-aloud Japanese sentences using prosodic information," Proceedings of the 2003 Spring Meeting of the Acoustical Society of Japan, I (to be presented) (2003).
Rei Oguro, Kazuhiko Ozeki, Yujie Zhang, and Kazuyuki Takagi: "A Japanese sentence compaction algorithm based on phrase significance and inter-phrase dependency"Journal of Natural Language Processing. 8, No.3. 3-18 (2001)


Other Grants by OZEKI Kazuhiko

High Compression-Rate Automatic Summarization of Newspaper Articles Based on Combined Use of Significant Sentence Extraction and Sentence Compression
  • Grant number:
    16500077
  • Fiscal year:
    2004
  • Funding amount:
    approx. $7,000
  • Project category:
    Grant-in-Aid for Scientific Research (C)
Spoken Language Processing by Minimum Total Penalty Dependency Analysis Method
  • Grant number:
    09680356
  • Fiscal year:
    1997
  • Funding amount:
    approx. $7,000
  • Project category:
    Grant-in-Aid for Scientific Research (C)