Constructing Reading Comprehension Datasets to Evaluate Discourse-level Language Understanding

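Basic Information

Grant number:
22K17954
Principal investigator:
菅原朔 (Saku Sugawara)
Amount:
$29,100
Host institution:
National Institute of Informatics
Host institution country:
Japan
Project category:
Grant-in-Aid for Early-Career Scientists
Fiscal year:
2022
Funding country:
Japan
Project period:
2022-04-01 to 2025-03-31
Project status:
Ongoing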

Source:
https://kaken.nii.ac.jp/en/grant/KAKENHI-PROJECT-22K17954/
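Keywords:
natural language processing; computational linguistics; natural language understanding; reading comprehension; discourse understanding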

Project Summary

From the second half of fiscal year 2021 through fiscal year 2022, there has been a rapid increase in research based on systems known as large language models, which are built by training architectures with very large numbers of parameters on large corpora. Within this trend, this project focuses on the understanding of relations between sentences and aims to construct an evaluation dataset that tests discourse-level reading comprehension with high explainability. When evaluating the behavior of increasingly sophisticated systems, approaches that comprehensively test the understanding of multiple sentences, rather than single sentences alone, are highly important and need to be pursued intensively. In step with the development of large language models, datasets for evaluating language understanding are also becoming more diverse and larger in scale, so a broad and accurate survey is needed of what current datasets address and what current systems can do. In the first year, the project carried out a literature survey reflecting these developments, and worked to assess the state of the field through system analysis and the creation of simple datasets. Specifically, it investigated how superficial features of the passages in reading comprehension questions affect the behavior of reading comprehension systems. It also investigated whether systems behave appropriately on tasks that require understanding multiple sentences at once, by testing them on the understanding of date information and on commonsense reasoning. Through these preliminary investigations, the project gathered findings that will be important for designing tasks that accurately evaluate inter-sentence understanding.