Stampede 2: Operations and Maintenance for the Next Generation of Petascale Computing
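Basic Information
- Award number: 1663578
- Principal investigator:
- Amount: $24 million
- Host institution:
- Host institution country: United States
- Project type: Cooperative Agreement
- Fiscal year: 2017
- Funding country: United States
- Project period: 2017-10-01 to 2024-09-30
- Project status: Completed
- Source:
- Keywords:
Project Abstract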
In 2016, the National Science Foundation funded the acquisition of a large new forward-looking high performance computing (HPC) system, Stampede 2, by the Texas Advanced Computing Center (TACC) at the University of Texas at Austin. In partnership with its lead system vendor (Dell), TACC will deploy the Intel-based system in 2017, doubling the capacity of its predecessor, Stampede, by introducing new memory, processor and interconnect technologies. As Stampede 2 nears its operational deployment, a proposal for operations and maintenance (O&M) of the system was submitted by the University of Texas at Austin. The system is expected to be used as a national resource by thousands of researchers, educators, and students annually. As a critical component of academic infrastructure, it will advance fundamental knowledge in a wide variety of science and engineering frontiers. In addition to continued partnership with Dell, subawards to Clemson University, The University of Colorado, Cornell University, Indiana University, and Ohio State University will ensure a broad national reach of innovative HPC to academia and industry. Stampede 2 will operate within the larger landscape of the nation's research cyberinfrastructure (CI). It joins the set of large-scale computing resources that rely on and benefit from the collaborative user services model of the NSF-funded Extreme Science and Engineering Discovery Environment (XSEDE) project. These accompanying shared services provide for systems allocations, user training, technical interoperability, research and CI community engagement, and access to expertise. Stampede 2 doubles the computing, storage and networking capacity of the current system, Stampede.
Delivering on the potential of this complex scientific instrument requires knowledgeable and ongoing operations, which include: robust system maintenance, reliability and availability; security; software configuration and management; efficient utilization; and research workflow optimization. Most significantly, the thousands of users who currently depend on Stampede rely on expert assistance to help in the development of new skills in order to maximize the value of the new technologies in Stampede 2. These technologies represent the future of large-scale computing. The architecture of Stampede 2 reflects community consensus about HPC's exascale future; while specific technologies are in rapid flux, all paths indicate a transition to more explicit parallelism within applications. Today's applications must adapt, and Stampede 2 offers a bridge to exascale systems of tomorrow, providing capabilities for exploring new approaches to multiscale (both temporal and spatial) simulations, many forms of data intensive science, visualization, and data analysis. Stampede 2's operations will also broaden the usage base of HPC, appealing to and supporting a much greater depth and breadth of large-scale computational science for research than any other national system. The Stampede 2 Operations and Maintenance project plan includes world-class operations, user support and training, application tuning and migration, education, outreach, documentation, data management, visualization, analytics-driven application support, and research collaboration. TACC and its team of partners are established CI providers. Collectively the Stampede 2 operations team will leverage a variety of other NSF-supported projects such as XSEDE, Advanced Cyberinfrastructure Research and Education Facilitators (ACI-REF), and a broad array of scientific software activities. With these complementary collaborations, the value of the O&M award is further increased.
Project Outcomes
Journal articles (1)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Stampede 2: The Evolution of an XSEDE Supercomputer
- DOI: 10.1145/3093338.3093385
- Published: 2017
- Journal:
- Impact factor: 0
- Authors: Stanzione, Dan; Barth, Bill; Gaffney, Niall; Gaither, Kelly; Hempel, Chris; Minyard, Tommy; Mehringer, S.; Wernert, Eric; Tufo, H.; Panda, D.
- Corresponding author: Panda, D.
Other grants by Daniel Stanzione
Final Design Planning for the Leadership-Class Computing Facility
- Award number: 2212090
- Fiscal year: 2022
- Funding amount: $24 million
- Project type: Cooperative Agreement
Characteristic Science Applications for the Leadership Class Computing Facility
- Award number: 2139536
- Fiscal year: 2021
- Funding amount: $24 million
- Project type: Cooperative Agreement
Preliminary Design Planning for the Leadership-Class Computing Facility
- Award number: 2033468
- Fiscal year: 2020
- Funding amount: $24 million
- Project type: Cooperative Agreement
Collaborative Research: Chameleon Phase III: A Large-Scale, Reconfigurable Experimental Environment for Cloud Research
- Award number: 2027176
- Fiscal year: 2020
- Funding amount: $24 million
- Project type: Cooperative Agreement
Planning for the Leadership-Class Computing Facility
- Award number: 1925096
- Fiscal year: 2019
- Funding amount: $24 million
- Project type: Cooperative Agreement
Planning for the Leadership-Class Computing Facility
- Award number: 1940979
- Fiscal year: 2019
- Funding amount: $24 million
- Project type: Cooperative Agreement
Operations & Maintenance for the Endless Frontier
- Award number: 1854828
- Fiscal year: 2019
- Funding amount: $24 million
- Project type: Cooperative Agreement
Computation for the Endless Frontier
- Award number: 1818253
- Fiscal year: 2018
- Funding amount: $24 million
- Project type: Cooperative Agreement
Collaborative Research: Chameleon: A Large-Scale, Reconfigurable Experimental Environment for Cloud Research
- Award number: 1743354
- Fiscal year: 2017
- Funding amount: $24 million
- Project type: Cooperative Agreement
Stampede 2: The Next Generation of Petascale Computing for Science and Engineering
- Award number: 1540931
- Fiscal year: 2016
- Funding amount: $24 million
- Project type: Cooperative Agreement
Similar Overseas Grants
Research Infrastructure: LIGO Laboratory Operations and Maintenance 2024-2028 -- Exploring the Gravitational-Wave Cosmos
- Award number: 2309200
- Fiscal year: 2024
- Funding amount: $24 million
- Project type: Cooperative Agreement
Facility Management, Maintenance and Operations Core
- Award number: 10792751
- Fiscal year: 2023
- Funding amount: $24 million
- Project type:
Facility Management, Maintenance and Operations Core
- Award number: 10793949
- Fiscal year: 2023
- Funding amount: $24 million
- Project type:
NEON Operations and Maintenance: Evolving from a Strong Foundation
- Award number: 2217817
- Fiscal year: 2023
- Funding amount: $24 million
- Project type: Cooperative Agreement
Facility Management, Maintenance and Operations (FMMO) Core
- Award number: 10793864
- Fiscal year: 2023
- Funding amount: $24 million
- Project type:
NERBL Core 1: Facility Management, Maintenance and Operations
- Award number: 10793932
- Fiscal year: 2023
- Funding amount: $24 million
- Project type:
UofL RBL BSL3 Facility Management, Maintenance and Operations Core
- Award number: 10793919
- Fiscal year: 2023
- Funding amount: $24 million
- Project type:
Core 1: Facility Management, Maintenance and Operations Core
- Award number: 10791948
- Fiscal year: 2023
- Funding amount: $24 million
- Project type:
NHLBI SERVICE DESK, CONFIGURATION MANAGEMENT, AND INFRASTRUCTURE OPERATIONS AND MAINTENANCE
- Award number: 10721132
- Fiscal year: 2022
- Funding amount: $24 million
- Project type: