Frameworks: Infrastructure For Political And Social Event Data using Machine Learning

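Basic Information

  • Award number:
    2311142
  • Principal investigator:
  • Amount:
    $1.589 million
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Standard Grant
  • Fiscal year:
    2023
  • Funding country:
    United States
  • Period:
    2023-08-01 to 2026-07-31
  • Project status:
    Ongoing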

Project Abstract

This project intends to revolutionize computerized data extraction for conflict scholars, security analysts, and practitioners who for decades have devoted significant resources to monitor, understand, and predict armed violence, social protests, and other politically relevant events worldwide. Currently, the vast majority of conflict event data are expensively coded by humans from increasingly large volumes of news reports. This project uses recent advances in artificial intelligence and large language models to address this fundamental issue for conflict research. It builds on earlier NSF efforts that created a publicly available large language model to study inter- and intra-state conflict and armed violence, called ConfliBERT. This project expands the ConfliBERT model to multilingual settings, including Arabic and Spanish. This will help researchers and policymakers better understand the context of local events and create a continuous data analysis process by feeding in current news stories to identify new political actors and events in real time. As the project's cyberinfrastructure develops, the research community will be empowered through training, education, and outreach with groups at local, national, and international levels, including academics and government.

In the last five years, state-of-the-art language models have revolutionized the field of natural language processing (NLP). In particular, there have been significant advances in the use of domain-specific models for understanding social processes. Our research and that of other experts in this field demonstrate how ConfliBERT outperforms prior models for coding and understanding conflict and violence from raw text (Hu et al. 2022; Haffner et al. 2023). This project supports new NLP developments for conflict research and expands their access to the academic and policy communities. Specifically, it builds on earlier NSF efforts that led to the development of ConfliBERT, a domain-specific language model, publicly available at Hugging Face, trained on an expert-curated corpus about conflict and political violence (Hu et al. 2022). This project will integrate, extend, and apply ConfliBERT and our related innovations (e.g., actor detection for network construction) into a sustainable ecosystem to engineer data from text. It will expand ConfliBERT to multilingual settings including Arabic and Spanish, update the corpora in sustainable ways and retrain ConfliBERT on a continuous basis, provide new political network data, and develop language models for users to create customized datasets and applications. All developed cyberinfrastructure is and will continue to be broadly accessible for the community of researchers, analysts, and others with interests in conflict dynamics, security studies, and international relations. This project funded by the NSF Office of Advanced Cyberinfrastructure is jointly supported by the Directorate for Social, Behavioral, and Economic Sciences, and the Directorate for STEM Education.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project Outputs

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other grants by Patrick Brandt

Elements: Data: Sustaining Modern Infrastructure For Political And Social Event Data
  • Award number:
    1931541
  • Fiscal year:
    2019
  • Project type:
    Standard Grant
RIDIR: Modernizing Political Event Data for Big Data Social Science Research
  • Award number:
    1539302
  • Fiscal year:
    2015
  • Project type:
    Standard Grant
Collaborative Research: Development of a Technology for Real Time, Ex Ante Forecasting of Intra and International Conflict and Cooperation
  • Award number:
    0921051
  • Fiscal year:
    2009
  • Project type:
    Standard Grant
Collaborative Research: Bayesian Time Series Models for the Analysis of International Conflict
  • Award number:
    0540816
  • Fiscal year:
    2005
  • Project type:
    Standard Grant
Collaborative Research: Bayesian Time Series Models for the Analysis of International Conflict
  • Award number:
    0351205
  • Fiscal year:
    2004
  • Project type:
    Standard Grant

Similar overseas grants

Doctoral Dissertation Research: A History of Muscle-Powered Transportation Infrastructure Planning and the Broader Political Economy
  • Award number:
    1947149
  • Fiscal year:
    2020
  • Project type:
    Standard Grant
Elements: Data: Sustaining Modern Infrastructure For Political And Social Event Data
  • Award number:
    1931541
  • Fiscal year:
    2019
  • Project type:
    Standard Grant
Developing an Archive Infrastructure for Political Data
  • Award number:
    17H00969
  • Fiscal year:
    2017
  • Project type:
    Grant-in-Aid for Scientific Research (A)
Theoretical Study on Political Economy for Public Infrastructure and Vertical Fiscal Transfer
  • Award number:
    16K17130
  • Fiscal year:
    2016
  • Project type:
    Grant-in-Aid for Young Scientists (B)
Doctoral Dissertation Research: The role of infrastructure in rural political engagement
  • Award number:
    1627463
  • Fiscal year:
    2016
  • Project type:
    Standard Grant
CIF21: DIBBs: Building a Unified Infrastructure for Data Integration on Political Violence and Conflict
  • Award number:
    1255793
  • Fiscal year:
    2013
  • Project type:
    Standard Grant
Political Networks: Conference and Infrastructure Development
  • Award number:
    0851084
  • Fiscal year:
    2009
  • Project type:
    Standard Grant
Collaborative Research: Studying Information Processing in Political Science: Improving Infrastructure, Testing Theory
  • Award number:
    0647657
  • Fiscal year:
    2007
  • Project type:
    Standard Grant
Collaborative Research: Studying Information Processing in Political Science: Improving Infrastructure, Testing Theory
  • Award number:
    0647738
  • Fiscal year:
    2007
  • Project type:
    Standard Grant
Political Science Research Infrastructure: Comparative Study of Electoral Systems
  • Award number:
    0112029
  • Fiscal year:
    2001
  • Project type:
    Continuing Grant