EAGER: Online Processing of Data in Large Facilities using National Advanced CyberInfrastructure
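Basic information
- Award number: 1745246
- Principal investigator:
- Award amount: $292,400
- Host institution:
- Host institution country: United States
- Project category: Standard Grant
- Fiscal year: 2017
- Funding country: United States
- Project period: 2017-09-01 to 2020-08-31
- Project status: Completed
- Source:
- Keywords:
Project abstract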
Open, large-scale scientific facilities are an essential part of the science and engineering enterprise. These facilities provide shared-use infrastructure, instrumentation, and data products that are openly accessible to a broad community of researchers and/or educators. Current facilities provide increasing volumes of data and data products that have the potential to deliver new insights in a wide range of science and engineering domains. However, while these facilities provide reliable and pervasive access to the data and data products, users typically must download the data of interest and process them using local resources. Consequently, transforming these data and data products into insights requires local access to powerful computing, storage, and networking resources. On the other hand, the NSF Advanced Cyberinfrastructure (ACI) is playing an increasingly important role as an open platform for computational and data-enabled science and engineering and can provide the necessary capabilities to allow a broad user community to effectively process the data in large facilities. However, despite clearly complementing each other, large scientific facilities and NSF ACI remain largely disconnected. As a result, users are forced to take an active part in the process that moves data from large facilities to local computational resources or NSF ACI. This data-delivery mode is therefore inefficient and limits the potential utility that the data would have if processed in an automatic manner. The outcome of this research can have a significant impact on the scientific and engineering community by improving the accessibility of data and the way scientists interact with both data sources and computational infrastructures. Bringing national ACI and large scientific facilities together will democratize access to science and improve the impact of the NSF-funded infrastructure. This is especially important for small public institutions that have limited resources and lack a high-bandwidth Internet connection to the Academic/Research network. The development of human resources, including the training of students, researchers and software professionals, as well as the outreach to minorities and underrepresented groups, will be an integral aspect of this effort. The project uses an open repository to disseminate research papers, prototype implementations, and associated data products to the community.
The goal of this project is to explore how NSF-funded ACI, such as the Extreme Science and Engineering Discovery Environment (XSEDE), can be integrated with large facilities generally, and the Ocean Observatories Initiative (OOI) specifically, in an automated manner to support end-to-end user workflows. Specifically, we propose to enable workflows that, when triggered, can seamlessly orchestrate the entire data-to-discovery pipeline. This involves executing queries on the OOI cyberinfrastructure (possibly based on the occurrence of events of interest), streaming data to appropriate ACI facilities using high-bandwidth interconnects (such as Internet2) in order to stage this data close to computing/analytics resources (e.g., XSEDE JetStream), and then launching the modeling and analysis processes to transform such data into insights. In this way, the project will leverage high-performance networks that typically connect these facilities to support data movement, and process this data using state-of-the-art high-performance systems.
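The three pipeline stages named in the abstract (event-driven query, streaming and staging, then analysis) can be illustrated with a minimal Python sketch. This is not the project's implementation: `Reading`, `query_events`, `stage`, `mean_value`, and `run_workflow` are hypothetical names, the in-memory list stands in for an observatory data service, the batching stands in for staging data near a compute resource, and the averaging step stands in for a real modeling or analysis job.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, List


@dataclass
class Reading:
    """One observation from a (hypothetical) observatory instrument."""
    sensor_id: str
    timestamp: float
    value: float


def query_events(readings: Iterable[Reading], threshold: float) -> Iterator[Reading]:
    """Stage 1: stand-in for a query against the facility's data service.

    Yields only readings whose value exceeds the threshold, i.e. the
    events of interest that trigger the rest of the workflow.
    """
    for r in readings:
        if r.value > threshold:
            yield r


def stage(stream: Iterable[Reading], batch_size: int) -> Iterator[List[Reading]]:
    """Stage 2: stand-in for streaming data to storage near the compute resource.

    Groups the incoming stream into fixed-size batches; a final partial
    batch is still emitted so no reading is dropped.
    """
    batch: List[Reading] = []
    for r in stream:
        batch.append(r)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch


def mean_value(batch: List[Reading]) -> float:
    """Stage 3: placeholder analysis; a real workflow would launch a model here."""
    return sum(r.value for r in batch) / len(batch)


def run_workflow(
    readings: Iterable[Reading],
    threshold: float,
    batch_size: int,
    analyze: Callable[[List[Reading]], float] = mean_value,
) -> List[float]:
    """Chain the three stages so that one call runs query -> stage -> analyze."""
    events = query_events(readings, threshold)
    return [analyze(batch) for batch in stage(events, batch_size)]


if __name__ == "__main__":
    values = [1.0, 5.0, 7.0, 2.0, 9.0]
    data = [Reading("sensor-1", float(t), v) for t, v in enumerate(values)]
    print(run_workflow(data, threshold=4.0, batch_size=2))  # prints [6.0, 9.0]
```

Because each stage is a generator, readings flow through the pipeline one at a time and no user has to download or move data by hand, which is the property the abstract argues for.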
Project outcomes
Journal articles (11)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
A Distributed Multi-Sensor Machine Learning Approach to Earthquake Early Warning
- DOI: 10.1609/aaai.v34i01.5376
- Published: 2020-02
- Journal:
- Impact factor: 0
- Authors: Kevin Fauvel; Daniel Balouek-Thomert; D. Melgar; Pedro Silva; Anthony Simonet; Gabriel Antoniu; Alexandru Costan; Véronique Masson; M. Parashar; I. Rodero; A. Termier
- Corresponding authors: Kevin Fauvel; Daniel Balouek-Thomert; D. Melgar; Pedro Silva; Anthony Simonet; Gabriel Antoniu; Alexandru Costan; Véronique Masson; M. Parashar; I. Rodero; A. Termier
Harnessing the Computing Continuum for Urgent Science
- DOI: 10.1145/3439602.3439618
- Published: 2020-11
- Journal:
- Impact factor: 0
- Authors: Daniel Balouek-Thomert; I. Rodero; M. Parashar
- Corresponding authors: Daniel Balouek-Thomert; I. Rodero; M. Parashar
Runtime Management of Data Quality for Scientific Observatories Using Edge and In-Transit Resources
- DOI: 10.1109/sbac-pad.2018.00053
- Published: 2018
- Journal:
- Impact factor: 0
- Authors: Zamani, Ali Reza; Balouek-Thomert, Daniel; Villalobos, J. J.; Rodero, Ivan; Parashar, Manish
- Corresponding author: Parashar, Manish
Exploring the Potential of Elastic Computing Clusters in Geo-Distributed Data Centers with Fast Fabric Interconnection
- DOI: 10.1109/hpcc/smartcity/dss.2019.00135
- Published: 2019
- Journal:
- Impact factor: 0
- Authors: Chen, Shouwei; Wang, Wensheng; Rodero, Ivan
- Corresponding author: Rodero, Ivan
Optimizing Performance and Computing Resource Management of In-memory Big Data Analytics with Disaggregated Persistent Memory
- DOI: 10.1109/ccgrid.2019.00012
- Published: 2019
- Journal:
- Impact factor: 0
- Authors: Chen, Shouwei; Wang, Wensheng; Wu, Xueyang; Fan, Zhen; Huang, Kunwu; Zhuang, Peiyu; Li, Yue; Rodero, Ivan; Parashar, Manish; Weng, Dennis
- Corresponding author: Weng, Dennis
Other publications by Ivan Rodero
Grid broker selection strategies using aggregated resource information
- DOI: 10.1016/j.future.2009.07.009
- Published: 2010-01-01
- Journal:
- Impact factor:
- Authors: Ivan Rodero; Francesc Guim; Julita Corbalan; Liana Fong; S. Masoud Sadjadi
- Corresponding author: S. Masoud Sadjadi
Other grants by Ivan Rodero
CIF21 DIBBs: EI: Virtual Data Collaboratory: A Regional Cyberinfrastructure for Collaborative Data Intensive Science
- Award number: 2220826
- Fiscal year: 2021
- Award amount: $292,400
- Project category: Standard Grant
Collaborative Research: Framework: Data: NSCI: HDR: GeoSCIFramework: Scalable Real-Time Streaming Analytics and Machine Learning for Geoscience and Hazards Research
- Award number: 2219975
- Fiscal year: 2021
- Award amount: $292,400
- Project category: Standard Grant
Collaborative Research: Framework: Data: NSCI: HDR: GeoSCIFramework: Scalable Real-Time Streaming Analytics and Machine Learning for Geoscience and Hazards Research
- Award number: 1835692
- Fiscal year: 2019
- Award amount: $292,400
- Project category: Standard Grant
NSF Large Facilities Cyberinfrastructure Workshop
- Award number: 1742969
- Fiscal year: 2017
- Award amount: $292,400
- Project category: Standard Grant
SPX: Collaborative Research: Cross-layer Application-Aware Resilience at Extreme Scale (CAARES)
- Award number: 1725649
- Fiscal year: 2017
- Award amount: $292,400
- Project category: Standard Grant
CIF21 DIBBs: EI: Virtual Data Collaboratory: A Regional Cyberinfrastructure for Collaborative Data Intensive Science
- Award number: 1640834
- Fiscal year: 2016
- Award amount: $292,400
- Project category: Standard Grant
BIGDATA: Collaborative Research: IA: F: Fractured Subsurface Characterization using High Performance Computing and Guided by Big Data
- Award number: 1546145
- Fiscal year: 2016
- Award amount: $292,400
- Project category: Standard Grant
CRII: CI: Exploring Advanced Cyber-Infrastructure Co-Design for Big Data Analytics
- Award number: 1464317
- Fiscal year: 2015
- Award amount: $292,400
- Project category: Standard Grant
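Similar NSFC (National Natural Science Foundation of China) grants
Pose error modeling and online compensation for mobile machining robots based on visual perception
- Award number: 52375507
- Approval year: 2023
- Award amount: CNY 500,000
- Project category: General Program
Feasibility study: online quality inspection for femtosecond laser surface texturing
- Award number: 52211530491
- Approval year: 2022
- Award amount: CNY 100,000
- Project category: International (Regional) Cooperation and Exchange Program
Online monitoring theory for the micro-milling process and intelligent tool-wear compensation methods based on deep fusion of data and mechanism
- Award number:
- Approval year: 2021
- Award amount: CNY 600,000
- Project category: General Program
Online monitoring theory for the micro-milling process and intelligent tool-wear compensation methods based on deep fusion of data and mechanism
- Award number: 52175528
- Approval year: 2021
- Award amount: CNY 600,000
- Project category: General Program
Research on online testing methods for thermal parameters of surface-micromachined multilayer thin films for industrialized MEMS sensors
- Award number: 62004061
- Approval year: 2020
- Award amount: CNY 240,000
- Project category: Young Scientists Fund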
Similar overseas grants
Creation of a knowledgebase of high quality assertions of the clinical actionability of somatic variants in cancer
- Award number: 10555024
- Fiscal year: 2023
- Award amount: $292,400
- Project category:
Advancing Medical Illustration in Patient Education Materials: from Art to Science
- Award number: 10660634
- Fiscal year: 2023
- Award amount: $292,400
- Project category:
Toward measures and behavioral trials for effective online AUD recovery support
- Award number: 10643056
- Fiscal year: 2023
- Award amount: $292,400
- Project category:
NIDDK Extramural Digital Pathology Repository System (HALO LINK)
- Award number: 10884865
- Fiscal year: 2023
- Award amount: $292,400
- Project category:
Cloud-Based Machine Learning and Biomarker Visual Analytics for Salivary Proteomics
- Award number: 10827649
- Fiscal year: 2023
- Award amount: $292,400
- Project category: