Stampede 2: Operations and Maintenance for the Next Generation of Petascale Computing

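Basic information

  • Award number:
    1663578
  • Principal investigator:
  • Amount:
    $24 million
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Cooperative Agreement
  • Fiscal year:
    2017
  • Funding country:
    United States
  • Period:
    2017-10-01 to 2024-09-30
  • Project status:
    Completed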

Project abstract

In 2016, the National Science Foundation funded the acquisition of a large new forward-looking high performance computing (HPC) system, Stampede 2, by the Texas Advanced Computing Center (TACC) at the University of Texas at Austin. In partnership with its lead system vendor (Dell), TACC will deploy the Intel-based system in 2017, doubling the capacity of its predecessor, Stampede, by introducing new memory, processor and interconnect technologies. As Stampede 2 nears its operational deployment, a proposal for operations and maintenance (O&M) of the system was submitted by the University of Texas at Austin. The system is expected to be used as a national resource by thousands of researchers, educators, and students annually. As a critical component of academic infrastructure, it will advance fundamental knowledge in a wide variety of science and engineering frontiers. In addition to continued partnership with Dell, subawards to Clemson University, The University of Colorado, Cornell University, Indiana University, and Ohio State University will ensure a broad national reach of innovative HPC to academia and industry. Stampede 2 will operate within the larger landscape of the nation's research cyberinfrastructure (CI). It joins the set of large scale computing resources that rely on and benefit from the collaborative user services model of the NSF-funded Extreme Science and Engineering Discovery Environment (XSEDE) project. These accompanying shared services provide for systems allocations, user training, technical interoperability, research and CI community engagement, and access to expertise. Stampede 2 doubles the computing, storage and networking capacity of the current system, Stampede.
Delivering on the potential of this complex scientific instrument requires knowledgeable and ongoing operations, which include: robust system maintenance, reliability and availability; security; software configuration and management; efficient utilization; and research workflow optimization. Most significantly, the thousands of users who currently depend on Stampede rely on expert assistance to help in the development of new skills in order to maximize the value of the new technologies in Stampede 2. These technologies represent the future of large-scale computing. The architecture of Stampede 2 reflects community consensus about HPC's exascale future; while specific technologies are in rapid flux, all paths indicate a transition to more explicit parallelism within applications. Today's applications must adapt, and Stampede 2 offers a bridge to exascale systems of tomorrow, providing capabilities for exploring new approaches to multiscale (both temporal and spatial) simulations, many forms of data intensive science, visualization, and data analysis. Stampede 2's operations will also broaden the usage base of HPC, appealing to and supporting a much greater depth and breadth of large-scale computational science for research than any other national system. The Stampede 2 Operations and Maintenance project plan includes world-class operations, user support and training, application tuning and migration, education, outreach, documentation, data management, visualization, analytics-driven application support, and research collaboration. TACC and its team of partners are established CI providers. Collectively the Stampede 2 operations team will leverage a variety of other NSF-supported projects such as XSEDE, Advanced Cyberinfrastructure Research and Education Facilitators (ACI-REF), and a broad array of scientific software activities. With these complementary collaborations, the value of the O&M award is further increased.

Project outputs

Journal articles (1)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Stampede 2: The Evolution of an XSEDE Supercomputer
  • DOI:
    10.1145/3093338.3093385
  • Publication year:
    2017
  • Journal:
  • Impact factor:
    0
  • Authors:
    Stanzione, Dan; Barth, Bill; Gaffney, Niall; Gaither, Kelly; Hempel, Chris; Minyard, Tommy; Mehringer, S.; Wernert, Eric; Tufo, H.; Panda, D.
  • Corresponding author:
    Panda, D.

Other grants by Daniel Stanzione

Final Design Planning for the Leadership-Class Computing Facility
  • Award number:
    2212090
  • Fiscal year:
    2022
  • Funding amount:
    $24 million
  • Project type:
    Cooperative Agreement
Characteristic Science Applications for the Leadership Class Computing Facility
  • Award number:
    2139536
  • Fiscal year:
    2021
  • Funding amount:
    $24 million
  • Project type:
    Cooperative Agreement
Preliminary Design Planning for the Leadership-Class Computing Facility
  • Award number:
    2033468
  • Fiscal year:
    2020
  • Funding amount:
    $24 million
  • Project type:
    Cooperative Agreement
Collaborative Research: Chameleon Phase III: A Large-Scale, Reconfigurable Experimental Environment for Cloud Research
  • Award number:
    2027176
  • Fiscal year:
    2020
  • Funding amount:
    $24 million
  • Project type:
    Cooperative Agreement
Planning for the Leadership-Class Computing Facility
  • Award number:
    1925096
  • Fiscal year:
    2019
  • Funding amount:
    $24 million
  • Project type:
    Cooperative Agreement
Planning for the Leadership-Class Computing Facility
  • Award number:
    1940979
  • Fiscal year:
    2019
  • Funding amount:
    $24 million
  • Project type:
    Cooperative Agreement
Operations & Maintenance for the Endless Frontier
  • Award number:
    1854828
  • Fiscal year:
    2019
  • Funding amount:
    $24 million
  • Project type:
    Cooperative Agreement
Computation for the Endless Frontier
  • Award number:
    1818253
  • Fiscal year:
    2018
  • Funding amount:
    $24 million
  • Project type:
    Cooperative Agreement
Collaborative Research: Chameleon: A Large-Scale, Reconfigurable Experimental Environment for Cloud Research
  • Award number:
    1743354
  • Fiscal year:
    2017
  • Funding amount:
    $24 million
  • Project type:
    Cooperative Agreement
Stampede 2: The Next Generation of Petascale Computing for Science and Engineering
  • Award number:
    1540931
  • Fiscal year:
    2016
  • Funding amount:
    $24 million
  • Project type:
    Cooperative Agreement

Similar NSFC (National Natural Science Foundation of China) grants

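Integrated optimization of train operation plans for rail transit networks with flexible operational organization under interoperability conditions
  • Award number:
    72371015
  • Approval year:
    2023
  • Funding amount:
    CNY 390,000
  • Project type:
    General Program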
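Modeling and optimization of operational decisions for shared electric vehicles considering ride-sharing service modes
  • Award number:
    72301005
  • Approval year:
    2023
  • Funding amount:
    CNY 300,000
  • Project type:
    Young Scientists Fund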
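Consumer credit and operational decisions of retail platforms from a differentiation perspective
  • Award number:
    72301260
  • Approval year:
    2023
  • Funding amount:
    CNY 300,000
  • Project type:
    Young Scientists Fund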
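Operational decisions of non-profit home-based elderly care service platforms considering the capabilities of older adults
  • Award number:
    72371170
  • Approval year:
    2023
  • Funding amount:
    CNY 400,000
  • Project type:
    General Program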
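Queueing model construction and operational strategies considering driver-passenger meeting points under inefficient matching
  • Award number:
    72301174
  • Approval year:
    2023
  • Funding amount:
    CNY 300,000
  • Project type:
    Young Scientists Fund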

Similar overseas grants

Research Infrastructure: LIGO Laboratory Operations and Maintenance 2024-2028 -- Exploring the Gravitational-Wave Cosmos
  • Award number:
    2309200
  • Fiscal year:
    2024
  • Funding amount:
    $24 million
  • Project type:
    Cooperative Agreement
Facility Management, Maintenance and Operations Core
  • Award number:
    10792751
  • Fiscal year:
    2023
  • Funding amount:
    $24 million
  • Project type:
Facility Management, Maintenance and Operation Core
  • Award number:
    10793908
  • Fiscal year:
    2023
  • Funding amount:
    $24 million
  • Project type:
Operations Core
  • Award number:
    10793943
  • Fiscal year:
    2023
  • Funding amount:
    $24 million
  • Project type:
UofL RBL Operations, Workforce Development and Pandemic Preparedness Research
  • Award number:
    10793918
  • Fiscal year:
    2023
  • Funding amount:
    $24 million
  • Project type: