II-NEW: Collaborative Research: Spam Processing, Archiving, and Monitoring Community Facility (SPAM Commons)

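Basic Information

  • Award number:
    0855067
  • Principal investigator:
  • Amount:
    $120,000
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Standard Grant
  • Fiscal year:
    2009
  • Funding country:
    United States
  • Start and end dates:
    2009-09-01 to 2011-04-30
  • Project status:
    Completed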

Project Abstract

In this project, the PIs propose to construct and develop a shared infrastructure, referred to as SPAM Commons, to support the collection and maintenance of realistic, large-scale spam data sets. Spam is a problem in many important communications media such as email and the web. A sub-problem of spam, phishing (a form of online pretexting), caused an estimated $3.2B in damages in 2007. The broad impact of effective spam filtering methods can be estimated in billions of dollars in several communications media such as email and the web. Spam has also invaded other media, with concrete attack examples in social networks, the blogosphere, Internet telephony (VoIP), instant messaging, and click fraud. Unfortunately, spam research has been hampered by the lack of published real-world data sets due to concerns about privacy and company intellectual property.

The project team develops a shared infrastructure to support the collection and maintenance of realistic, large-scale spam data sets, called the Spam Processing, Archiving, and Monitoring Community Facility (SPAM Commons). The main goals of SPAM Commons are (1) to facilitate remedial research that will stem the waste and losses caused by spam, and (2) to enable revolutionary research that aims to stop certain kinds of spam attacks altogether.

SPAM Commons is divided into a Public Partition and a Protected Partition. The Public Partition is a direct analog of standard corpora for speech and image recognition research, consisting of a systematic and regular collection of both spam and legitimate data in the various communications media, starting from email and web spam, and expanding into other communications media as spam becomes a serious threat in each area and data become available. The Protected Partition consists of a combined data and processing facility that makes private data or near-real-time spam data available for experimental evaluation of spam defense mechanisms in a protected testbed. Access to such protected data will enable new spam research on real-time evolving spam and real-world data sets that is infeasible today.

The intellectual challenges of the SPAM Commons project extend beyond the new research in the spam areas mentioned above that the availability of data sets enables. The construction of both partitions of SPAM Commons includes significant intellectual challenges of its own. First, the isolation of the Protected Partition partially addresses privacy concerns, which remain a general research problem. Second, useful spam and legitimate data sets require automated distinction of spam from legitimate documents with certainty, which remains an open research question in email, web, and other media. Third, the adversarial and mutual evolution of spam producers and defenders requires continuous collection of fresh data for further study. Finally, the collection and streaming of near-real-time spam data represent research resources currently unavailable to spam researchers. Advances in these areas will spur the growth and evolution of SPAM Commons, which will enable new research on the evolving and growing spam problem.

The impact of SPAM Commons data sets on experimental spam research may be similar to the impact of large corpora in disciplines such as speech/image recognition and natural language processing, which achieved a level of reproducibility and comparability of scientific results after the use of such corpora became a standard requirement. The proposed data repository will be supported and used by 9 university partners (Clayton State, Emory, Georgia Tech, NC A&T, Northwestern, Texas A&M, UC Davis, U. Georgia, UNC Charlotte) and several industry partners (IBM, PureWire, Secure Computing).

Project Outputs

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other grants by Brent Kang

Collaborative Research: Hands-on exercises on DETER testbed for security education
  • Award number:
    1118424
  • Fiscal year:
    2010
  • Award amount:
    $120,000
  • Project type:
    Standard Grant
II-NEW: Collaborative Research: Spam Processing, Archiving, and Monitoring Community Facility (SPAM Commons)
  • Award number:
    1118355
  • Fiscal year:
    2010
  • Award amount:
    $120,000
  • Project type:
    Standard Grant
Collaborative Research: Hands-on exercises on DETER testbed for security education
  • Award number:
    0920179
  • Fiscal year:
    2009
  • Award amount:
    $120,000
  • Project type:
    Standard Grant
Collaborative Project: Focused Faculty Development Workshop on Cyber Games and Interactive Simulations
  • Award number:
    0723808
  • Fiscal year:
    2007
  • Award amount:
    $120,000
  • Project type:
    Standard Grant

Similar overseas grants

II-New: Collaborative: A Mixed Reality Environment for Enabling Everywhere Data-Centric Work
  • Award number:
    1629890
  • Fiscal year:
    2016
  • Award amount:
    $120,000
  • Project type:
    Standard Grant
II-New: Collaborative: A Mixed Reality Environment for Enabling Everywhere Data-Centric Work
  • Award number:
    1629913
  • Fiscal year:
    2016
  • Award amount:
    $120,000
  • Project type:
    Standard Grant
Collaborative Research: II-NEW: Marcher - A Heterogeneous High Performance Computing Infrastructure for Research and Education in Green Computing
  • Award number:
    1551262
  • Fiscal year:
    2015
  • Award amount:
    $120,000
  • Project type:
    Standard Grant
II-NEW: Collaborative Research: An Extensible Software Infrastructure for Unmanned Aerial Vehicles
  • Award number:
    1513006
  • Fiscal year:
    2015
  • Award amount:
    $120,000
  • Project type:
    Standard Grant
II: New: Collaborative Research: An Extensible Software Infrastructure for Unmanned Aerial Vehicles
  • Award number:
    1512992
  • Fiscal year:
    2015
  • Award amount:
    $120,000
  • Project type:
    Standard Grant
Collaborative Research: II-NEW: Marcher - A Heterogeneous High Performance Computing Infrastructure for Research and Education in Green Computing
  • Award number:
    1305382
  • Fiscal year:
    2013
  • Award amount:
    $120,000
  • Project type:
    Standard Grant
Collaborative Research: II-NEW: Marcher - A Heterogeneous High Performance Computing Infrastructure for Research and Education in Green Computing
  • Award number:
    1305359
  • Fiscal year:
    2013
  • Award amount:
    $120,000
  • Project type:
    Standard Grant
II-NEW: Collaborative Research: Image Processing Cloud (IPC): A Domain-Specific Cloud Computing Infrastructure for Research and Education
  • Award number:
    1205708
  • Fiscal year:
    2012
  • Award amount:
    $120,000
  • Project type:
    Standard Grant
II-NEW: Collaborative Research: Image Processing Cloud (IPC): A Domain-Specific Cloud Computing Infrastructure for Research and Education
  • Award number:
    1205699
  • Fiscal year:
    2012
  • Award amount:
    $120,000
  • Project type:
    Standard Grant
II-NEW: Collaborative Research: Robotic Catheterization Using Ionic Polymer-Metal Composite Actuator
  • Award number:
    1265123
  • Fiscal year:
    2012
  • Award amount:
    $120,000
  • Project type:
    Continuing Grant