Collaborative Research: OAC Core: Small: Anomaly Detection and Performance Optimization for End-to-End Data Transfers at Scale
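Basic Information
- Award number: 2007829
- Principal investigator:
- Amount: $225,000
- Host institution:
- Host institution country: United States
- Project type: Standard Grant
- Fiscal year: 2020
- Funding country: United States
- Duration: 2020-08-01 to 2024-07-31
- Project status: Completed
- Source:
- Keywords:
Project Abstract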
Despite continuous efforts and investments to upgrade the networking infrastructure of research and education institutions to meet the needs of large-scale science applications, data transfers on these networks often perform very poorly. Understanding the underlying reasons for poor transfer performance is important yet challenging due to the sophisticated design of today's cyberinfrastructures. This project offers a set of novel models and algorithms to detect and mitigate performance issues of data transfers in research networks. The proposed suite of tools helps researchers and system administrators pinpoint the root cause of performance problems in data transfers so that necessary actions can be taken swiftly to minimize their impact on ongoing transfers. The project will also integrate the research into all levels of education, including science projects with K-12 students, development of new curriculum modules for graduate- and undergraduate-level courses, and summer workshops specifically for minority groups.
Understanding the true underlying reasons for poor transfer performance is key to mitigating them and delivering the promised transfer speeds. However, the involvement of multiple end systems, dynamically changing background traffic, and the complexity of today's networking infrastructures turn it into a complicated and time-consuming process. This project develops a novel anomaly-detection and performance-optimization framework for end-to-end data transfers at scale. The framework helps to predict, understand, diagnose, and optimize wide-area file transfers in today's extreme-scale cyberinfrastructures. To achieve this goal, it derives deep-neural-network-based predictive models that can relate transfer settings to throughput. These models are then used to estimate the optimal configuration for new transfers. The framework also gathers performance metrics for end-system and network resources periodically to keep track of system utilization. When a transfer anomaly is detected, the collected metrics are fed into anomaly-classification models to identify the root causes. Once the underlying reasons for performance problems are identified, the framework launches a real-time optimization process to reconfigure the transfer settings such that the impact of anomalies can be alleviated.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Project Outcomes
Journal articles (9)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Energy-saving Cross-layer Optimization of Big Data Transfer Based on Historical Log Analysis
- DOI: 10.1109/icc42927.2021.9500693
- Published: 2021
- Journal:
- Impact factor: 0
- Authors: Rodolph, Lavone; Zulkar Nine, MD S; Di Tacchio, Luigi; Kosar, Tevfik
- Corresponding author: Kosar, Tevfik
Energy-Efficient Data Transfer Optimization via Decision-Tree Based Uncertainty Reduction
- DOI: 10.1109/icccn54977.2022.9868866
- Published: 2022
- Journal:
- Impact factor: 0
- Authors: Jamil, Hasibul; Rodolph, Lavone; Goldverg, Jacob; Kosar, Tevfik
- Corresponding author: Kosar, Tevfik
SMURF: Efficient and Scalable Metadata Access for Distributed Applications
- DOI: 10.1109/tpds.2022.3175596
- Published: 2021-05
- Journal:
- Impact factor: 5.3
- Authors: Bing Zhang; T. Kosar
- Corresponding authors: Bing Zhang; T. Kosar
GreenABR: energy-aware adaptive bitrate streaming with deep reinforcement learning
- DOI: 10.1145/3524273.3528188
- Published: 2022-06
- Journal:
- Impact factor: 0
- Authors: B. Turkkan; Ting Dai; Adithya Raman; T. Kosar; Changyou Chen; Muhammed Fatih Bulut; J. Zola; Daby M. Sow
- Corresponding authors: B. Turkkan; Ting Dai; Adithya Raman; T. Kosar; Changyou Chen; Muhammed Fatih Bulut; J. Zola; Daby M. Sow
Qualitative analysis of the relationship between design smells and software engineering challenges
- DOI:
- Published: 2022
- Journal:
- Impact factor: 0
- Authors: Asif Imran; Tevfik Kosar
- Corresponding author: Tevfik Kosar
Other publications by Tevfik Kosar
Towards Zero-Carbon Data Movement at the HPC and Cloud Data Centers
- DOI:
- Published: 2023
- Journal:
- Impact factor: 0
- Authors: Tevfik Kosar
- Corresponding author: Tevfik Kosar
Guest Editors’ Introduction: Special Issue on Data-Intensive Computing in the Clouds
- DOI: 10.1007/s10723-012-9216-5
- Published: 2012-03-24
- Journal:
- Impact factor: 2.900
- Authors: Tevfik Kosar; Ioan Raicu
- Corresponding author: Ioan Raicu
Other grants held by Tevfik Kosar
OAC Core: Towards Zero-Carbon Data Movement at the HPC and Cloud Data Centers with GreenDataFlow
- Award number: 2313061
- Fiscal year: 2023
- Funding amount: $225,000
- Project type: Standard Grant
IPA Agreement with University of New York at Buffalo 1st year (Kosar 2020)
- Award number: 2042696
- Fiscal year: 2020
- Funding amount: $225,000
- Project type: Intergovernmental Personnel Award
EAGER: GreenDataFlow: Minimizing the Energy Footprint of Global Data Movement
- Award number: 1842054
- Fiscal year: 2018
- Funding amount: $225,000
- Project type: Standard Grant
CIF21 DIBBs: PD: OneDataShare: A Universal Data Sharing Building Block for Data-Intensive Applications
- Award number: 1724898
- Fiscal year: 2017
- Funding amount: $225,000
- Project type: Standard Grant
CAREER: Data-aware Distributed Computing for Enabling Large-scale Collaborative Science
- Award number: 1131889
- Fiscal year: 2011
- Funding amount: $225,000
- Project type: Continuing Grant
EAGER: Stork Data Scheduler for Azure
- Award number: 1115805
- Fiscal year: 2011
- Funding amount: $225,000
- Project type: Standard Grant
CAREER: Data-aware Distributed Computing for Enabling Large-scale Collaborative Science
- Award number: 0846052
- Fiscal year: 2009
- Funding amount: $225,000
- Project type: Continuing Grant
MRI: Development of PetaShare: A Distributed Data Archival, Analysis and Visualization System for Data Intensive Collaborative Research
- Award number: 0619843
- Fiscal year: 2006
- Funding amount: $225,000
- Project type: Standard Grant
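Similar NSFC grants
Research on Quantum Field Theory without a Lagrangian Description
- Award number: 24ZR1403900
- Year approved: 2024
- Funding amount: CNY 0
- Project type: Provincial/municipal project
Cell Research
- Award number: 31224802
- Year approved: 2012
- Funding amount: CNY 240,000
- Project type: Special Fund Project
Cell Research
- Award number: 31024804
- Year approved: 2010
- Funding amount: CNY 240,000
- Project type: Special Fund Project
Cell Research
- Award number: 30824808
- Year approved: 2008
- Funding amount: CNY 240,000
- Project type: Special Fund Project
Research on the Rapid Growth Mechanism of KDP Crystal
- Award number: 10774081
- Year approved: 2007
- Funding amount: CNY 450,000
- Project type: General Program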
Similar overseas grants
Collaborative Research: OAC CORE: Federated-Learning-Driven Traffic Event Management for Intelligent Transportation Systems
- Award number: 2414474
- Fiscal year: 2024
- Funding amount: $225,000
- Project type: Standard Grant
Collaborative Research: OAC Core: Distributed Graph Learning Cyberinfrastructure for Large-scale Spatiotemporal Prediction
- Award number: 2403312
- Fiscal year: 2024
- Funding amount: $225,000
- Project type: Standard Grant
Collaborative Research: OAC Core: Large-Scale Spatial Machine Learning for 3D Surface Topology in Hydrological Applications
- Award number: 2414185
- Fiscal year: 2024
- Funding amount: $225,000
- Project type: Standard Grant
Collaborative Research: OAC Core: Learning AI Surrogate of Large-Scale Spatiotemporal Simulations for Coastal Circulation
- Award number: 2402947
- Fiscal year: 2024
- Funding amount: $225,000
- Project type: Standard Grant
Collaborative Research: OAC Core: Distributed Graph Learning Cyberinfrastructure for Large-scale Spatiotemporal Prediction
- Award number: 2403313
- Fiscal year: 2024
- Funding amount: $225,000
- Project type: Standard Grant
Collaborative Research: OAC Core: Learning AI Surrogate of Large-Scale Spatiotemporal Simulations for Coastal Circulation
- Award number: 2402946
- Fiscal year: 2024
- Funding amount: $225,000
- Project type: Standard Grant
Collaborative Research: OAC Core: CropDL - Scheduling and Checkpoint/Restart Support for Deep Learning Applications on HPC Clusters
- Award number: 2403088
- Fiscal year: 2024
- Funding amount: $225,000
- Project type: Standard Grant
Collaborative Research: OAC Core: CropDL - Scheduling and Checkpoint/Restart Support for Deep Learning Applications on HPC Clusters
- Award number: 2403090
- Fiscal year: 2024
- Funding amount: $225,000
- Project type: Standard Grant
Collaborative Research: OAC: Core: Harvesting Idle Resources Safely and Timely for Large-scale AI Applications in High-Performance Computing Systems
- Award number: 2403399
- Fiscal year: 2024
- Funding amount: $225,000
- Project type: Standard Grant
Collaborative Research: OAC Core: CropDL - Scheduling and Checkpoint/Restart Support for Deep Learning Applications on HPC Clusters
- Award number: 2403089
- Fiscal year: 2024
- Funding amount: $225,000
- Project type: Standard Grant