Collaborative Research: Updating the Militarized Dispute Data Through Crowdsourcing: MID5


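Basic Information

  • Award number:
    1528624
  • Principal investigator:
  • Amount:
    $367,400
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Continuing Grant
  • Fiscal year:
    2015
  • Funding country:
    United States
  • Period:
    2015-09-15 to 2019-12-31
  • Project status:
    Completed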

Project Abstract

General Summary

The Correlates of War Project's Militarized Interstate Dispute (MID) data is the most prominent and heavily used data collection in the study of international conflict. The most recent version (MID4) was released in 2014 and brings the period covered to 1816-2010. The MID4 project used automated text classification procedures to make the process of identifying relevant news stories more efficient. Over the course of that project, the PIs determined that the primary bottleneck in the workflow was the coding of those news documents. To address this inefficiency, the PIs completed a pilot project to determine whether crowdsourcing techniques could be used to code these documents. In the pilot, non-expert workers were paid small sums to read documents and answer sets of questions; the answers were used to identify features of possible militarized incidents (the events that comprise MIDs). A systematic comparison of the crowdsourced responses with those of the MID4 Project's trained coders revealed that the crowdsourced codings were completely accurate for 68 percent of the news reports coded. More importantly, high agreement among crowd responses on a specific report was strongly associated with correct coding, which enables the PIs to detect which documents require further expert involvement. As a result, the PIs can produce a majority of the MID data in near real time and at limited financial cost. These procedures are applied in the MID5 Project, which will update the MID data for the period 2011-2017.

Technical Summary

The MID5 project workflow begins with document retrieval from LexisNexis and document classification using the software and methods implemented in MID4. We discard the negatively classified documents and extract metadata from the positively classified documents, including the document title, the news agency that published the report, the date, and any actors mentioned in the text.

Crowd workers are recruited through Amazon's Mechanical Turk and paid a wage to read one of these documents and answer a series of simple, objective questions about it. The questionnaire is predefined, but some extracted metadata is automatically inserted into it to improve the quality of responses. Several workers complete a questionnaire for each document, leaving the PIs with a problem of aggregation: how to combine multiple worker responses, possibly regarding multiple related questions, into the usable data needed to code the militarized incident. In the pilot study, the PIs show that Bayesian networks are the most effective way to achieve this aggregation. Recently, the PIs have made advances in semi-supervised text classification with hybrid Deep Restricted Boltzmann Machines, which outperform previous methods in this task.
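The aggregation step and the agreement signal described in the abstract can be illustrated with a minimal sketch. It uses simple majority voting as a stand-in, not the Bayesian network aggregation the PIs report as most effective. The function name `aggregate_responses`, the 0.8 agreement threshold, and the sample answers are all hypothetical and are not taken from the project.

```python
from collections import Counter


def aggregate_responses(responses, min_agreement=0.8):
    """Majority-vote the answers several workers gave to one question.

    Returns (answer, agreement, needs_expert). `agreement` is the share of
    workers who gave the winning answer. `min_agreement` is an assumed
    threshold below which the document is routed to an expert coder.
    """
    if not responses:
        raise ValueError("at least one response is required")
    counts = Counter(responses)
    answer, votes = counts.most_common(1)[0]
    agreement = votes / len(responses)
    return answer, agreement, agreement < min_agreement


# Hypothetical worker answers to "Does the report describe a use of force?"
docs = {
    "doc_a": ["yes", "yes", "yes", "yes", "yes"],
    "doc_b": ["yes", "no", "no", "yes", "no"],
}

results = {doc_id: aggregate_responses(r) for doc_id, r in docs.items()}
for doc_id, (answer, agreement, needs_expert) in results.items():
    status = "expert review" if needs_expert else "accept"
    print(doc_id, answer, f"{agreement:.2f}", status)
```

In this sketch the unanimous document is accepted as coded, while the split document is flagged, mirroring the idea that low crowd agreement marks the reports that need expert involvement.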

Project Outputs

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)


Other publications by Vito D'Orazio

Advancing Measurement of Foreign Policy Similarity
  • DOI:
    10.31235/osf.io/fuet4
  • Year published:
    2012
  • Journal:
  • Impact factor:
    0
  • Authors:
    Vito D'Orazio
  • Corresponding author:
    Vito D'Orazio
The MID5 Dataset, 2011–2014: Procedures, coding rules, and description
  • DOI:
    10.1177/0738894221995743
  • Year published:
    2021
  • Journal:
  • Impact factor:
    2.1
  • Authors:
    Glenn Palmer; Roseanne W. McManus; Vito D'Orazio; Michael R. Kenwick; Mikaela Karstens; C. Bloch; Nick Dietrich; Kayla Kahn; Kellan H. Ritter; Michael J. Soules
  • Corresponding author:
    Michael J. Soules
An Online Structured Political Event Dataset based on CAMEO Ontology
  • DOI:
  • Year published:
    2020
  • Journal:
  • Impact factor:
    0
  • Authors:
    S. Salam; Patrick T. Brandt; Vito D'Orazio; J. Holmes; Javier Osorio; L. Khan
  • Corresponding author:
    L. Khan
Updating the Militarized Interstate Dispute Data: A Response to Gibler, Miller, and Little
  • DOI:
    10.1093/isq/sqz045
  • Year published:
    2020
  • Journal:
  • Impact factor:
    2.6
  • Authors:
    Glenn Palmer; Vito D'Orazio; Michael R. Kenwick; Roseanne W. McManus
  • Corresponding author:
    Roseanne W. McManus
Error-Correction and Aggregation in Crowd-Sourcing of Geopolitical Incident Information
  • DOI:
    10.1007/978-3-319-16268-3_47
  • Year published:
    2015
  • Journal:
  • Impact factor:
    3.2
  • Authors:
    Alexander Ororbia; Yang Xu; Vito D'Orazio; D. Reitter
  • Corresponding author:
    D. Reitter


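Similar National Natural Science Foundation of China (NSFC) grants

Research on Quantum Field Theory without a Lagrangian Description
  • Award number:
    24ZR1403900
  • Year approved:
    2024
  • Funding amount:
    CNY 0
  • Project type:
    Provincial/municipal project
Cell Research
  • Award number:
    31224802
  • Year approved:
    2012
  • Funding amount:
    CNY 240,000
  • Project type:
    Special fund project
Cell Research
  • Award number:
    31024804
  • Year approved:
    2010
  • Funding amount:
    CNY 240,000
  • Project type:
    Special fund project
Cell Research
  • Award number:
    30824808
  • Year approved:
    2008
  • Funding amount:
    CNY 240,000
  • Project type:
    Special fund project
Research on the Rapid Growth Mechanism of KDP Crystal
  • Award number:
    10774081
  • Year approved:
    2007
  • Funding amount:
    CNY 450,000
  • Project type:
    General Program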

Similar overseas grants

Collaborative Research: Updating iVirus - the CyVerse-powered analytical toolkit for viruses of microbes
  • Award number:
    2149505
  • Fiscal year:
    2022
  • Funding amount:
    $367,400
  • Project type:
    Continuing Grant
Collaborative Research: Updating iVirus - the CyVerse-powered analytical toolkit for viruses of microbes
  • Award number:
    2149506
  • Fiscal year:
    2022
  • Funding amount:
    $367,400
  • Project type:
    Continuing Grant
Collaborative Research: A New Nonlinear Modal Updating Framework for Soft, Hydrated Materials
  • Award number:
    1728186
  • Fiscal year:
    2017
  • Funding amount:
    $367,400
  • Project type:
    Standard Grant
Collaborative Research: A New Nonlinear Modal Updating Framework for Soft, Hydrated Materials
  • Award number:
    1727761
  • Fiscal year:
    2017
  • Funding amount:
    $367,400
  • Project type:
    Standard Grant
Collaborative Research: Updating the Militarized Dispute Data Through Crowdsourcing: MID5
  • Award number:
    1528409
  • Fiscal year:
    2015
  • Funding amount:
    $367,400
  • Project type:
    Continuing Grant
Collaborative Research: Updating the WeBWorK National Problem Library
  • Award number:
    1226081
  • Fiscal year:
    2012
  • Funding amount:
    $367,400
  • Project type:
    Standard Grant
Collaborative Research: Updating the WeBWorK National Problem Library
  • Award number:
    1226176
  • Fiscal year:
    2012
  • Funding amount:
    $367,400
  • Project type:
    Standard Grant
Collaborative Research: Contentious Issues in World Politics: Updating the ICOW Dataset
  • Award number:
    0960567
  • Fiscal year:
    2010
  • Funding amount:
    $367,400
  • Project type:
    Standard Grant
Collaborative Research: Contingent Reasoning and Bayesian Updating in Games of Incomplete Information: An Experimental Analysis
  • Award number:
    1031101
  • Fiscal year:
    2010
  • Funding amount:
    $367,400
  • Project type:
    Standard Grant
Collaborative Research: Contingent Reasoning and Bayesian Updating in Games of Incomplete Information: An Experimental Analysis
  • Award number:
    1030467
  • Fiscal year:
    2010
  • Funding amount:
    $367,400
  • Project type:
    Standard Grant