Adaptive and Scalable Event Detection Techniques for Twitter Data Streams
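Basic information
- Grant number: 275968728
- Principal investigator:
- Amount: --
- Host institution:
- Host institution country: Germany
- Project type: Research Grants
- Fiscal year: 2015
- Funding country: Germany
- Duration: 2014-12-31 to 2017-12-31
- Project status: Completed
- Source:
- Keywords:
Project abstract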
With 271 million monthly active users who produce over 500 million tweets per day, Twitter is currently the most popular and fastest-growing microblogging service. Twitter is therefore increasingly used as a source of information on current events as they unfold.
For traditional media such as newspaper archives and news websites, the problem of event detection has been addressed by research in the area of Topic Detection and Tracking (TDT). However, topic detection in Twitter data streams raises a set of additional challenges. First, tweets are much shorter than traditional news articles because of their length limit and are therefore harder to classify. Second, tweets are not edited and can therefore contain a substantial amount of spam, typos, slang, etc. Finally, the rate at which tweets are produced is very bursty and will continue to increase as more users adopt Twitter in the future.
Several approaches for event detection in social media and, in particular, for Twitter have been proposed. However, most of these proposals focus exclusively on the information extraction aspect and often ignore the streaming nature of the input. For example, many techniques come with a complex but fixed set of parameters that control which events are detected. It is assumed that these parameters are determined empirically by running the algorithm on a sample data set until it produces the desired result. We argue that there are several reasons why this approach is neither realistic nor feasible. First, the data in the stream may undergo qualitative changes that require parameters to adapt in order to continue to detect events accurately. Second, these parameters control not only the task-based performance of a technique but also its run-time performance. Working with fixed parameters therefore prevents these approaches from scaling with quantitative changes in the stream.
In this project, we propose to address the need for adaptive and scalable event detection in Twitter in the tradition of Data Stream Management Systems (DSMS) research. To focus the project, we will concentrate on the specific task of first story detection, i.e., the detection of general (unknown) events, which is defined as one of the subtasks of TDT. We plan to address these issues in three separate work packages. In the first work package, we will study how event detection methods can adapt to the content of the stream by exploring better ways to segment the stream before it is processed and by adjusting method parameters during processing. The second work package will address scalability requirements, both in terms of scaling up and down with the volume of one stream and in terms of scaling up to several parallel streams. Finally, a third work package will be dedicated to the non-trivial task of evaluating event detection techniques.
Project outcomes
Journal articles (4)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
An evaluation of the run-time and task-based performance of event detection techniques for Twitter
- DOI: 10.1016/j.is.2016.01.003
- Published: 2015-06
- Journal:
- Impact factor: 0
- Authors: Andreas Weiler; Michael Grossniklaus; M. Scholl
- Corresponding authors: Andreas Weiler; Michael Grossniklaus; M. Scholl
Towards Reproducible Research of Event Detection Techniques for Twitter
- DOI: 10.1109/sds.2019.000-5
- Published: 2019-06
- Journal:
- Impact factor: 0
- Authors: Andreas Weiler; Harry Schilling; L. Kircher; Michael Grossniklaus
- Corresponding authors: Andreas Weiler; Harry Schilling; L. Kircher; Michael Grossniklaus
Stability Evaluation of Event Detection Techniques for Twitter
- DOI: 10.1007/978-3-319-46349-0_32
- Published: 2016-10
- Journal:
- Impact factor: 0
- Authors: Andreas Weiler; Jöran Beel; Bela Gipp; Michael Grossniklaus
- Corresponding authors: Andreas Weiler; Jöran Beel; Bela Gipp; Michael Grossniklaus
Situation monitoring of urban areas using social media data streams
- DOI: 10.1016/j.is.2015.09.004
- Published: 2016-04
- Journal:
- Impact factor: 0
- Authors: Andreas Weiler; Michael Grossniklaus; M. Scholl
- Corresponding authors: Andreas Weiler; Michael Grossniklaus; M. Scholl
Other grants by Professor Dr. Michael Grossniklaus
A Graph Query Processor for Queries of Class CRPQagg
- Grant number: 265596218
- Fiscal year: 2015
- Amount: --
- Project type: Research Grants
GraphQueryML: Using Machine Learning to Optimize Queries in Graph Databases
- Grant number: 441617860
- Fiscal year:
- Amount: --
- Project type: Research Grants
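Similar grants from the National Natural Science Foundation of China (NSFC)
Scalable Learning and Optimization: High-dimensional Models and Online Decision-Making Strategies for Big Data Analysis
- Grant number:
- Approval year: 2024
- Amount (10,000 CNY):
- Project type: Collaborative Innovation Research Team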
Similar overseas grants
The FUSÉE Platform: Fast, Unified, and Scalable Event processing and Event messaging
- Grant number: RGPIN-2018-06018
- Fiscal year: 2022
- Amount: --
- Project type: Discovery Grants Program - Individual
The FUSÉE Platform: Fast, Unified, and Scalable Event processing and Event messaging
- Grant number: RGPIN-2018-06018
- Fiscal year: 2021
- Amount: --
- Project type: Discovery Grants Program - Individual
The FUSÉE Platform: Fast, Unified, and Scalable Event processing and Event messaging
- Grant number: RGPIN-2018-06018
- Fiscal year: 2020
- Amount: --
- Project type: Discovery Grants Program - Individual
The FUSÉE Platform: Fast, Unified, and Scalable Event processing and Event messaging
- Grant number: RGPIN-2018-06018
- Fiscal year: 2019
- Amount: --
- Project type: Discovery Grants Program - Individual
III: Small: Scalable Event Trend Analytics For Data Stream Inquiry
- Grant number: 1815866
- Fiscal year: 2018
- Amount: --
- Project type: Standard Grant
The FUSÉE Platform: Fast, Unified, and Scalable Event processing and Event messaging
- Grant number: DGECR-2018-00119
- Fiscal year: 2018
- Amount: --
- Project type: Discovery Launch Supplement
The FUSÉE Platform: Fast, Unified, and Scalable Event processing and Event messaging
- Grant number: RGPIN-2018-06018
- Fiscal year: 2018
- Amount: --
- Project type: Discovery Grants Program - Individual
III: Small: Collaborative Research: Scalable Schema-Based Event Extraction
- Grant number: 1617969
- Fiscal year: 2016
- Amount: --
- Project type: Standard Grant
III: Small: Collaborative Research: RUI: Scalable Schema-Based Event Extraction
- Grant number: 1617952
- Fiscal year: 2016
- Amount: --
- Project type: Interagency Agreement
EAGER-DynamicData: A Scalable Framework for Data-Driven Real-Time Event Detection in Power Systems
- Grant number: 1462311
- Fiscal year: 2015
- Amount: --
- Project type: Standard Grant


















