II-NEW: Collaborative Research: Spam Processing, Archiving, and Monitoring Community Facility (SPAM Commons)

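Basic Information

  • Award number:
    0855180
  • Principal investigator:
  • Amount:
    $400,000
  • Host institution:
  • Host institution country:
    United States
  • Award type:
    Standard Grant
  • Fiscal year:
    2009
  • Funding country:
    United States
  • Project period:
    2009-09-01 to 2012-08-31
  • Project status:
    Completed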

Project Abstract

In this project, the PIs propose to construct and develop a shared infrastructure to support the collection and maintenance of realistic, large-scale spam data sets, referred to as SPAM Commons.

Spam is a problem in many important communications media such as email and the web. A sub-problem of spam, phishing (a form of online pretexting), caused an estimated $3.2B in damages in 2007. The broad impact of effective spam filtering methods can be estimated in billions of dollars in several communications media such as email and the web.

Spam has also invaded other media, with concrete attack examples in social networks, the blogosphere, Internet telephony (VoIP), instant messaging, and click fraud. Unfortunately, spam research has been hampered by the lack of published real-world data sets due to concerns with privacy and company intellectual property. This project team develops a shared infrastructure to support the collection and maintenance of realistic, large-scale spam data sets, called the Spam Processing, Archiving, and Monitoring Community Facility (SPAM Commons). The main goals of SPAM Commons are: (1) to facilitate remedial research that will stem the wastes and losses caused by spam, and (2) to enable revolutionary research that aims to stop certain kinds of spam attacks altogether. SPAM Commons is divided into a Public Partition and a Protected Partition.

The Public Partition is a direct analog of standard corpora for speech and image recognition research, consisting of a systematic and regular collection of both spam and legitimate data in the various communications media, starting from email and web spam, and expanding into other communications media as spam becomes a serious threat in each area and data become available. The Protected Partition consists of a combined data and processing facility that makes private data or near-real-time spam data available for experimental evaluation of spam defense mechanisms in a protected testbed. Access to such protected data will enable new spam research on real-time evolving spam and real-world data sets that is infeasible today. The intellectual challenges of the SPAM Commons project extend beyond the new research on the spam areas mentioned above that is enabled by the availability of data sets. The construction of both partitions of SPAM Commons includes significant intellectual challenges of its own. First, the isolation of the Protected Partition partially addresses the concerns of privacy, which remains a general research problem. Second, useful spam and legitimate data sets require automated distinction of spam from legitimate documents with certainty, which remains an open research question in email, web, and other media. Third, the adversarial and mutual evolution of spam producers and defenders requires continuous collection of fresh data for further study. Finally, the collection and streaming of near-real-time spam data represent research resources currently unavailable to spam researchers. Advances in these areas will spur the growth and evolution of SPAM Commons that will enable new research on the evolving and growing spam problem.

The impact of SPAM Commons data sets on experimental spam research may be similar to the impact of large corpora in disciplines such as speech/image recognition and natural language processing, which achieved a level of scientific result reproducibility and comparability after the use of such corpora became a standard requirement. The proposed data repository will be supported and used by 9 university partners (Clayton State, Emory, Georgia Tech, NC A&T, Northwestern, Texas A&M, UC Davis, U. Georgia, UNC Charlotte), and several industry partners (IBM, PureWire, Secure Computing).

Project Outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other Publications by Calton Pu

Editorial for CollaborateCom 2011 Special Issue
  • DOI:
    10.1007/s11036-013-0436-0
  • Published:
    2013-02-28
  • Journal:
  • Impact factor:
    2.000
  • Authors:
    James Caverlee;Calton Pu;Dimitrios Georgakopoulos;James Joshi
  • Corresponding author:
    James Joshi
A rigorous approach to facilitate and guarantee the correctness of the genetic testing management in human genome information systems
  • DOI:
    10.1186/1471-2164-12-s4-s13
  • Published:
    2011-01-01
  • Journal:
  • Impact factor:
    3.700
  • Authors:
    Luciano V Araújo;Simon Malkowski;Kelly R Braghetto;Maria R Passos-Bueno;Mayana Zatz;Calton Pu;João E Ferreira
  • Corresponding author:
    João E Ferreira
Buffer overflows: attacks and defenses for the vulnerability of the decade
Editorial: Collaborative Computing: Networking, Applications and Worksharing (CollaborateCom 2012)
  • DOI:
    10.1007/s11036-014-0532-9
  • Published:
    2014-09-16
  • Journal:
  • Impact factor:
    2.000
  • Authors:
    Lakshmish Ramaswamy;Barbara Carminati;James Joshi;Calton Pu
  • Corresponding author:
    Calton Pu
JTangCSB: A Cloud Service Bus for Cloud and Enterprise Application Integration
  • DOI:
    10.1109/mic.2014.62
  • Published:
    2015
  • Journal:
  • Impact factor:
    0
  • Authors:
    Xingjian Lu;Calton Pu;Zhaohui Wu;Hanwei Chen
  • Corresponding author:
    Hanwei Chen

Other Grants Held by Calton Pu

RAPID: Tracking and Evaluation of the Coronavirus (COVID-19) Epidemic Propagation by Finding and Maintaining Live Knowledge in Social Media
  • Award number:
    2026945
  • Fiscal year:
    2020
  • Funding amount:
    $400,000
  • Award type:
    Standard Grant
HNDS-I: Collaborative Research: Developing a Data Platform for Analysis of Nonprofit Organizations
  • Award number:
    2024320
  • Fiscal year:
    2020
  • Funding amount:
    $400,000
  • Award type:
    Standard Grant
EAGER: Live Reality: Sustainable and Up-to-Date Information Quality in Live Social Media through Continuous Evidence-Based Knowledge Acquisition
  • Award number:
    2039653
  • Fiscal year:
    2020
  • Funding amount:
    $400,000
  • Award type:
    Standard Grant
1st US-Japan Workshop Enabling Global Collaborations in Big Data Research; June, 2017, Atlanta, GA
  • Award number:
    1741034
  • Fiscal year:
    2017
  • Funding amount:
    $400,000
  • Award type:
    Standard Grant
RCN: SAVI: Adaptive Management and Use of Resilient Infrastructures in Smart Cities: Support for Global Collaborative Research on Real-Time Analytics of Heterogeneous Big Data
  • Award number:
    1550379
  • Fiscal year:
    2015
  • Funding amount:
    $400,000
  • Award type:
    Standard Grant
EAGER: An Exploratory Study of Multi-Hazard Management through Multi-Source Integration of Physical and Social Sensors
  • Award number:
    1402266
  • Fiscal year:
    2014
  • Funding amount:
    $400,000
  • Award type:
    Standard Grant
CSR: Small: Lightning in Clouds: Detection and Characterization of Very Short Bottlenecks
  • Award number:
    1421561
  • Fiscal year:
    2014
  • Funding amount:
    $400,000
  • Award type:
    Standard Grant
SAVI: EAGER: for Global Research on Applying Information Technology to Support Effective Disaster Management (GRAIT-DM)
  • Award number:
    1250260
  • Fiscal year:
    2012
  • Funding amount:
    $400,000
  • Award type:
    Standard Grant
RAPID: Automating Emergency Data and Metadata Management to Support Effective Short Term and Long Term Disaster Recovery Efforts
  • Award number:
    1138666
  • Fiscal year:
    2011
  • Funding amount:
    $400,000
  • Award type:
    Standard Grant
CSR: Small: Multi-Bottlenecks: What They Are and How to Find Them
  • Award number:
    1116451
  • Fiscal year:
    2011
  • Funding amount:
    $400,000
  • Award type:
    Standard Grant

Similar Overseas Grants

II-New: Collaborative: A Mixed Reality Environment for Enabling Everywhere Data-Centric Work
  • Award number:
    1629890
  • Fiscal year:
    2016
  • Funding amount:
    $400,000
  • Award type:
    Standard Grant
II-New: Collaborative: A Mixed Reality Environment for Enabling Everywhere Data-Centric Work
  • Award number:
    1629913
  • Fiscal year:
    2016
  • Funding amount:
    $400,000
  • Award type:
    Standard Grant
Collaborative Research: II-NEW: Marcher - A Heterogeneous High Performance Computing Infrastructure for Research and Education in Green Computing
  • Award number:
    1551262
  • Fiscal year:
    2015
  • Funding amount:
    $400,000
  • Award type:
    Standard Grant
II-NEW: Collaborative Research: An Extensible Software Infrastructure for Unmanned Aerial Vehicles
  • Award number:
    1513006
  • Fiscal year:
    2015
  • Funding amount:
    $400,000
  • Award type:
    Standard Grant
II: New: Collaborative Research: An Extensible Software Infrastructure for Unmanned Aerial Vehicles
  • Award number:
    1512992
  • Fiscal year:
    2015
  • Funding amount:
    $400,000
  • Award type:
    Standard Grant
Collaborative Research: II-NEW: Marcher - A Heterogeneous High Performance Computing Infrastructure for Research and Education in Green Computing
  • Award number:
    1305382
  • Fiscal year:
    2013
  • Funding amount:
    $400,000
  • Award type:
    Standard Grant
Collaborative Research: II-NEW: Marcher - A Heterogeneous High Performance Computing Infrastructure for Research and Education in Green Computing
  • Award number:
    1305359
  • Fiscal year:
    2013
  • Funding amount:
    $400,000
  • Award type:
    Standard Grant
II-NEW: Collaborative Research: Image Processing Cloud (IPC): A Domain-Specific Cloud Computing Infrastructure for Research and Education
  • Award number:
    1205708
  • Fiscal year:
    2012
  • Funding amount:
    $400,000
  • Award type:
    Standard Grant
II-NEW: Collaborative Research: Image Processing Cloud (IPC): A Domain-Specific Cloud Computing Infrastructure for Research and Education
  • Award number:
    1205699
  • Fiscal year:
    2012
  • Funding amount:
    $400,000
  • Award type:
    Standard Grant
II-NEW: Collaborative Research: Robotic Catheterization Using Ionic Polymer-Metal Composite Actuator
  • Award number:
    1265123
  • Fiscal year:
    2012
  • Funding amount:
    $400,000
  • Award type:
    Continuing Grant