Research on Automatic Extraction of Relations Between Proper Nouns from Text Information (テキスト情報からの固有名詞間の関係の自動抽出の研究)

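Basic information

Grant number:
10680379
Principal investigator:
梅村恭司
Amount:
$18,600
Host institution:
Toyohashi University of Technology
Host institution country:
Japan
Project category:
Grant-in-Aid for Scientific Research (C)
Fiscal year:
1998
Funding country:
Japan
Project period:
1998 to 1999
Project status:
Completed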

Source:
https://kaken.nii.ac.jp/grant/KAKENHI-PROJECT-10680379/
Keywords:
n-gram; complementary similarity measure; data mining; SOL resolution; knowledge discovery; KDD

Project abstract

First, we clarified an algorithm for extracting the anomalous parts of text. Programs contain many repetitions of similar patterns. Consequently, extracting every string of at least a given length that occurs at least a given number of times recovers most of a program. Because the extracted parts occur more than once, they can be regarded as meaningful parts of the program. The parts that cannot be extracted, in contrast, are considered the anomalous parts of the program. Suspecting that programming errors and similar problems lie in the parts that cannot be extracted, we built a tool that marks those parts. This report describes the algorithm for detecting anomalous parts and shows that the marks help find errors that a compiler cannot detect.

Second, we presented a method for extracting information from newspaper articles using the complementary similarity measure, a similarity function used in the field of pattern recognition. Words of every part of speech appear in newspaper articles, but we judged it sufficient to focus on information about a limited range of words. We therefore focused on proper nouns and, narrowing further, on place names. We then attempted to obtain the hierarchical relations among these place names using the complementary similarity measure. The hierarchical relations obtained with the complementary similarity measure turned out to have higher precision than those obtained with mutual information.

Finally, to make use of the extracted rules, we presented a data mining method based on SOL resolution. SOL resolution is a resolution procedure that can completely solve the characteristic-clause finding problem. Data mining is the process of discovering knowledge from databases. We applied SOL resolution, a logical inference system, to data mining, and showed that the results obtained can be screened in terms of their usefulness.
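
The repeated-string idea in the first result of the abstract can be illustrated with a minimal sketch. This is not the project's actual tool: the function names `mark_anomalous` and `anomalous_spans`, the default thresholds, and the choice to count overlapping character n-grams are all assumptions made for illustration. The sketch relies on one observation: a character lies inside some repeated string of length at least `min_len` exactly when it lies inside a repeated string of length exactly `min_len`, so counting fixed-length n-grams is enough.

```python
from collections import Counter


def mark_anomalous(text: str, min_len: int = 4, min_count: int = 2) -> list[bool]:
    """Flag characters not covered by any frequent substring.

    A character is "covered" when it lies inside a substring of length
    min_len that occurs at least min_count times (overlaps counted).
    Returns one boolean per character; True means anomalous (hypothetical
    convention chosen for this sketch).
    """
    n = len(text)
    if n < min_len:
        # No substring is long enough, so nothing can be covered.
        return [True] * n

    # Count every character n-gram of length min_len.
    counts = Counter(text[i:i + min_len] for i in range(n - min_len + 1))

    covered = [False] * n
    for i in range(n - min_len + 1):
        if counts[text[i:i + min_len]] >= min_count:
            for j in range(i, i + min_len):
                covered[j] = True

    return [not c for c in covered]


def anomalous_spans(text: str, flags: list[bool]) -> list[tuple[int, str]]:
    """Group consecutive flagged characters into (start_index, substring)."""
    spans = []
    start = None
    for i, flagged in enumerate(flags):
        if flagged and start is None:
            start = i
        elif not flagged and start is not None:
            spans.append((start, text[start:i]))
            start = None
    if start is not None:
        spans.append((start, text[start:]))
    return spans


sample = "abcdabcdXYabcd"
print(anomalous_spans(sample, mark_anomalous(sample, min_len=4, min_count=2)))
# prints [(8, 'XY')]
```

In the sample, `abcd` occurs three times and covers every character except `XY`, so `XY` is the only span reported. On real source code the thresholds would need tuning, and a suffix array or suffix tree would likely scale better than the counter used here.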