CRII: III: Capturing Dynamism in Causal Relationships: A New Paradigm for Relationship Extraction from Text
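Basic information
- Grant number: 1948322
- Principal investigator:
- Amount: $174,300
- Host institution:
- Host institution country: United States
- Project type: Standard Grant
- Fiscal year: 2020
- Funding country: United States
- Period: 2020-05-15 to 2023-04-30
- Project status: Completed
- Source:
- Keywords:
Project abstract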
Text mining has made important advances in methods to convert vast and unstructured text data into knowledge. However, the current paradigm of relationship extraction has one major limitation: it models snapshots of information but fails to capture the fundamentally dialogic and dynamic nature of knowledge, in which conflicting findings, inconsistent discoveries, refutations, contradictions, reinforcements, and confirmations all change over time. This project aims to capture such fundamental dynamics of knowledge, focusing specifically on causal relationships. Numerous articles, including academic articles, present knowledge and relationships that express causality, but such relationships are not static and can change over time as conditions change. The objective of this project is to identify cues of causal knowledge in text data, quantify the strength of the causal relationship, and model its dynamics under changing conditions. Ultimately, the project aims to model a more holistic view of the knowledge extracted from text. Because text data is used extensively by researchers and practitioners in domains of national importance, including medicine and health, economics, public policy, and journalism, the results of this project are intended to give practitioners new ways to understand the evolving nature of the causal relationships present in large text datasets. Specifically, the novel approaches developed in the project will be applied to public health data to determine how changing climatic, political, and economic conditions may affect the mental and physical health of the population in different geographic areas.
In addition, the project includes several educational activities: emerging and related topics from this project will be included in the curricula of various courses in the applied data science master's program; undergraduate research will be promoted, specifically by recruiting students from underrepresented and economically disadvantaged communities to work on the project; and a research workshop will be organized to encourage high school students to participate in STEM research. The project activities include the development of a novel model of causal relationship extraction that leverages a unified deep learning framework combining both semantic and syntactic cues. This approach will utilize the key syntactic features of a sentence, represented by the grammatical relationships between nouns, verbs, and other parts of speech, through graphical or tree-like models. This work will determine whether the sentence has a structure that signals causality. Moreover, the sequential component of the model will utilize the semantics and identify the influence of certain words in the sentence to characterize the nature of the causal relationship expressed in the text. This task will capture the strength of the relationship (e.g., using cues like "extremely likely" or "definitely"), any supporting or opposing evidence (e.g., "will lead to" or "does not lead to"), and conditional cues (e.g., "in the presence of"). Quantifying such qualitative properties will lead to the second innovation of this project: causal distance. Causal distance is a time-variant metric that will denote the magnitude of causality between two entities and capture the dynamism of the relationship by changing over time as conditions change or new evidence appears.
Collectively, the advances pursued in this project will further enhance our understanding of the novel computational approaches needed to unearth and reason about cues of causal relationships embedded in large text datasets. The outcomes of this project, such as datasets, source code, final software, results, and publications, will be shared via publicly accessible URLs and online code repositories. Additionally, all the project resources and outcomes will be made available on the project website. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
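The cue types named in the abstract (strength, supporting or opposing evidence, and conditions) and the idea of a score that shifts as evidence accumulates can be illustrated with a small sketch. It is not the project's model, which the abstract describes as a deep learning framework; it is a plain lexicon lookup. The cue phrases are the ones quoted in the abstract, while the weights, the default strength, the half-life, and the names `tag_sentence` and `causal_score` are all assumptions made for illustration. In particular, `causal_score` is only a stand-in for a time-variant metric and is not the project's definition of causal distance.

```python
from dataclasses import dataclass

# Cue phrases are quoted from the abstract; every numeric weight is an assumption.
STRENGTH_CUES = {"extremely likely": 0.9, "definitely": 1.0}
OPPOSING_CUES = ("does not lead to",)
SUPPORTING_CUES = ("will lead to",)
CONDITION_CUES = ("in the presence of",)
DEFAULT_STRENGTH = 0.5  # assumed weight when no strength cue is present


@dataclass
class Evidence:
    year: int
    polarity: int      # +1 supporting, -1 opposing, 0 no causal cue found
    strength: float
    conditional: bool


def tag_sentence(sentence: str, year: int) -> Evidence:
    """Tag one sentence with polarity, strength, and a conditional flag."""
    text = sentence.lower()
    if any(cue in text for cue in OPPOSING_CUES):
        polarity = -1
    elif any(cue in text for cue in SUPPORTING_CUES):
        polarity = 1
    else:
        polarity = 0
    strength = max(
        (w for cue, w in STRENGTH_CUES.items() if cue in text),
        default=DEFAULT_STRENGTH,
    )
    conditional = any(cue in text for cue in CONDITION_CUES)
    return Evidence(year, polarity, strength, conditional)


def causal_score(evidence, now, half_life=5.0):
    """Recency-weighted mean of signed strengths, in [-1, 1].

    Evidence dated after `now` is ignored; older evidence is down-weighted
    by a factor of 0.5 per `half_life` years (an assumed decay rule).
    """
    num = den = 0.0
    for ev in evidence:
        if ev.polarity == 0 or ev.year > now:
            continue
        weight = 0.5 ** ((now - ev.year) / half_life)
        num += weight * ev.polarity * ev.strength
        den += weight
    return num / den if den else 0.0


# Placeholder entities A, B, C; the sentences are invented for the example.
corpus = [
    tag_sentence("A will lead to B.", 2010),
    tag_sentence("A definitely does not lead to B in the presence of C.", 2020),
]
score_2010 = causal_score(corpus, now=2010)
score_2020 = causal_score(corpus, now=2020)
```

With these invented inputs the score is positive in 2010, when only the supporting sentence exists, and turns negative in 2020 once the stronger and more recent opposing sentence is included. That sign change is the kind of dynamism the abstract refers to.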
Project outcomes
Journal articles (4)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
A Study of Extracting Causal Relationships from Text
- DOI:
- Publication date: 2022
- Journal:
- Impact factor: 0
- Authors: Gujarathi, Pranav; Reddy, Manohar; Tayade, Neha; Chakraborty, Sunandan
- Corresponding author: Chakraborty, Sunandan
Mining Latent Disease Factors from Medical Literature using Causality
- DOI: 10.1109/bigdata55660.2022.10020994
- Publication date: 2022-12
- Journal:
- Impact factor: 0
- Authors: P. Gujarathi; Jack VanSchaik; Venkatanaidu Karri; A. Rajapuri; Biju Cheriyan; T. Thyvalikakath; Sunandan Chakraborty
- Corresponding author: P. Gujarathi; Jack VanSchaik; Venkatanaidu Karri; A. Rajapuri; Biju Cheriyan; T. Thyvalikakath; Sunandan Chakraborty
Other publications by Sunandan Chakraborty
Managing microfinance with paper, pen and digital slate
- DOI:
- Publication date: 2010
- Journal:
- Impact factor: 0
- Authors: Aishwarya Ratan; K. Toyama; Sunandan Chakraborty; Keng Siang Ooi; Mike Koenig; P. Chitnis; Matthew Phiong
- Corresponding author: Matthew Phiong
Big Data Analytics for Development: Events, Knowledge Graphs and Predictive Models
- DOI:
- Publication date: 2015
- Journal:
- Impact factor: 0
- Authors: Sunandan Chakraborty
- Corresponding author: Sunandan Chakraborty
A Co-Training Model with Label Propagation on a Bipartite Graph to Identify Online Users with Disabilities
- DOI: 10.1609/icwsm.v13i01.3268
- Publication date: 2019
- Journal:
- Impact factor: 0
- Authors: Xing Yu; Sunandan Chakraborty; Erin L. Brady
- Corresponding author: Erin L. Brady
Prevalence of endangered shark trophies in automated detection of the online wildlife trade
- DOI: 10.1016/j.biocon.2025.110992
- Publication date: 2025-04-01
- Journal:
- Impact factor: 4.400
- Authors: Sunandan Chakraborty; Spencer N. Roberts; Gohar A. Petrossian; Monique Sosnowski; Juliana Freire; Jennifer Jacquet
- Corresponding author: Jennifer Jacquet
Extraction of (Key, Value) Pairs from Unstructured Ads
- DOI:
- Publication date: 2014
- Journal:
- Impact factor: 0
- Authors: Sunandan Chakraborty; L. Subramanian; Yaw Nyarko
- Corresponding author: Yaw Nyarko
Other grants by Sunandan Chakraborty
D-ISN/Collaborative Research: An Interdisciplinary Approach to the Discovery, Analysis, and Disruption of Wildlife Trafficking Networks
- Grant number: 2146351
- Fiscal year: 2022
- Funding amount: $174,300
- Project type: Standard Grant
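Similar National Natural Science Foundation of China grants
Catalytic mechanism of V(II)/V(III) electrochemical redox at the negative electrode of all-vanadium redox flow batteries
- Grant number: 2025JJ50094
- Approval year: 2025
- Funding amount: CNY 0
- Project type: Provincial/municipal project
A new mechanism of zone III liver injury in pyrrolidine alkaloid-induced hepatic sinusoidal obstruction syndrome: local disturbance of ammonia metabolism
- Grant number: JCZRYB202500652
- Approval year: 2025
- Funding amount: CNY 0
- Project type: Provincial/municipal project
Optical-field mode control and coupling mechanisms of silicon-based III-V submicron-wire lasers
- Grant number: JCZRQN202501004
- Approval year: 2025
- Funding amount: CNY 0
- Project type: Provincial/municipal project
Mechanism by which micro-domain layer interfaces of MXene/nZVI@FH materials regulate the oxidation and migration of arsenic(III) in water
- Grant number: 2025JJ50319
- Approval year: 2025
- Funding amount: CNY 0
- Project type: Provincial/municipal project
Oxaliplatin resistance mediated by the HOXC8/OPN/CD44/EGFR axis in the progression of drug resistance in stage III right-sided colon cancer
- Grant number: 2025JJ50694
- Approval year: 2025
- Funding amount: CNY 0
- Project type: Provincial/municipal project
AI combined with raw ultrasound radio-frequency signals for assessing capsular and vascular invasion in Bethesda category III/IV thyroid tumors
- Grant number:
- Approval year: 2025
- Funding amount: CNY 0
- Project type: Provincial/municipal project
Arsenic sulfide targeting VPS4B-ESCRT-III to regulate the autophagy-lysosome pathway and reverse cisplatin resistance in triple-negative breast cancer
- Grant number:
- Approval year: 2025
- Funding amount: CNY 0
- Project type: Provincial/municipal project
Anti-liver-tumor study of iridium(III) complex liposomes mediated by the dual receptors ASPGR and MRC2
- Grant number:
- Approval year: 2025
- Funding amount: CNY 100,000
- Project type: Provincial/municipal project
A new screening method for uric-acid-lowering drugs built by combining Ap-Exo III with pattern recognition
- Grant number:
- Approval year: 2025
- Funding amount: CNY 100,000
- Project type: Provincial/municipal project
Oxidation-immobilization mechanism of As(III) during the reductive formation of Mn(III) from manganese dioxide in paddy soil
- Grant number: 2025JJ60246
- Approval year: 2025
- Funding amount: CNY 0
- Project type: Provincial/municipal project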
Similar overseas grants
NEPhos_Phosphoregulation of ESCRT-III during nuclear envelope reformation
- Grant number: EP/Z00098X/1
- Fiscal year: 2025
- Funding amount: $174,300
- Project type: Fellowship
IUCRC Phase III University of Colorado Boulder: Center for Membrane Applications, Science and Technology (MAST)
- Grant number: 2310937
- Fiscal year: 2024
- Funding amount: $174,300
- Project type: Continuing Grant
Collaborative Research: Conference: DESC: Type III: Eco Edge - Advancing Sustainable Machine Learning at the Edge
- Grant number: 2342498
- Fiscal year: 2024
- Funding amount: $174,300
- Project type: Standard Grant
Elucidation of the activation mechanism of ion-implanted impurities in group-III nitride semiconductors and point-defect control
- Grant number: 23K21082
- Fiscal year: 2024
- Funding amount: $174,300
- Project type: Grant-in-Aid for Scientific Research (B)
Identification and functional analysis of factors involved in the pathogenicity of Burkholderia pseudomallei that are independent of the type III secretion system
- Grant number: 24K10200
- Fiscal year: 2024
- Funding amount: $174,300
- Project type: Grant-in-Aid for Scientific Research (C)
Carrier recombination dynamics in III-N photodetectors
- Grant number: 2341747
- Fiscal year: 2024
- Funding amount: $174,300
- Project type: Standard Grant
Collaborative Research: Conference: DESC: Type III: Eco Edge - Advancing Sustainable Machine Learning at the Edge
- Grant number: 2342497
- Fiscal year: 2024
- Funding amount: $174,300
- Project type: Standard Grant
IUCRC Phase III Virginia Institute of Marine Science for Science Center for Marine Fisheries (SCEMFIS)
- Grant number: 2332984
- Fiscal year: 2024
- Funding amount: $174,300
- Project type: Continuing Grant
III: Medium: Collaborative Research: From Open Data to Open Data Curation
- Grant number: 2420691
- Fiscal year: 2024
- Funding amount: $174,300
- Project type: Standard Grant
III: Small: Query-By-Sketch: Simplifying Video Clip Retrieval Through A Visual Query Paradigm
- Grant number: 2335881
- Fiscal year: 2024
- Funding amount: $174,300
- Project type: Standard Grant