Collaborative Research: Topology-Aware MPI Communication and Scheduling for Petascale Systems

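Basic Information

  • Award number:
    0926574
  • Principal investigator:
  • Amount:
    $459,900
  • Host institution:
  • Host institution country:
    United States
  • Award type:
    Standard Grant
  • Fiscal year:
    2009
  • Funding country:
    United States
  • Period:
    2009-10-01 to 2013-09-30
  • Status:
    Completed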

Project Abstract

This award is funded under the American Recovery and Reinvestment Act of 2009 (Public Law 111-5). Modern networks (such as InfiniBand and 10GigE) can provide topology, routing, and network status information at run time. This leads to the following broad challenge: can next-generation petascale systems provide topology-aware MPI communication, mapping, and scheduling that improve performance and scalability for a range of applications? This challenge leads to the following research questions: 1) What are the topology-aware communication and scheduling requirements of petascale applications? 2) How to design a network topology and state management framework with static and dynamic network information? 3) How to design topology-aware point-to-point and collective communication schemes (such as broadcast, all-to-all, and all-reduce) in an MPI library? 4) How to design topology-aware task mapping and scheduling schemes? 5) How to define and design a flexible topology information interface? A synergistic and comprehensive research plan, involving computer scientists from The Ohio State University (OSU) and computational scientists from the Texas Advanced Computing Center (TACC) and the San Diego Supercomputer Center (SDSC) at the University of California, San Diego, is proposed to address these challenges. The research will be driven by a set of applications (PSDNS, UCSDH3D, AWM-Olsen, and MPCUGLES) from established NSF computational science researchers running large-scale simulations on the Ranger system and other NSF HEC systems. The transformative impact of the proposed research is to develop topology-aware MPI software and a framework for using derived topology information for scheduling integration in order to maximize petascale application performance.
The proposed research is a collaborative and synergistic activity between computer scientists and computational scientists and will therefore have significant impact in deriving guidelines for designing, deploying, and using next-generation petascale systems. The proposed research directions and their solutions will be used in the investigators' curricula to train graduate and undergraduate students. The established national-scale training and outreach programs at TACC and SDSC will be used to disseminate the results of this research to HEC users and developers. Research results will also be disseminated to the investigators' multiple collaborating organizations (national laboratories and industry) to enable impact on their software products and applications. The modified MVAPICH2 library (currently used by more than 840 organizations) and the SGE scheduler plug-in will be available to the HEC community as open source. Case studies from this research will be presented at the MPI Forum (OSU is a member of this forum) to influence the design of the upcoming MPI-3 standard and other MPI libraries.
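To illustrate what "topology-aware" can mean for a collective such as broadcast, the following is a minimal sketch, not the MVAPICH2 design. It assumes a hypothetical cluster of 8 ranks placed in blocks of 4 on two switches (the `switch_of` mapping is invented for illustration) and compares how many messages cross between switches in a flat binomial-tree broadcast versus a two-level, leader-based broadcast. All function names are assumptions introduced here.

```python
from collections import defaultdict


def binomial_bcast_edges(ranks):
    """Return (sender, receiver) pairs of a binomial-tree broadcast rooted at ranks[0]."""
    edges = []
    step = 1
    while step < len(ranks):
        # Every position that already holds the data forwards it `step` positions ahead.
        for i in range(step):
            if i + step < len(ranks):
                edges.append((ranks[i], ranks[i + step]))
        step *= 2
    return edges


def topology_aware_bcast_edges(ranks, switch_of):
    """Two-level broadcast: root to one leader per switch, then each leader to its local ranks."""
    root = ranks[0]
    groups = defaultdict(list)
    for r in ranks:
        groups[switch_of[r]].append(r)
    # The root leads its own switch; every other switch is led by its first rank.
    leader = {s: (root if s == switch_of[root] else g[0]) for s, g in groups.items()}
    others = [l for s, l in leader.items() if s != switch_of[root]]
    edges = binomial_bcast_edges([root] + others)  # inter-switch stage
    for s, g in groups.items():                    # intra-switch stage
        local = [leader[s]] + [r for r in g if r != leader[s]]
        edges += binomial_bcast_edges(local)
    return edges


def inter_switch_count(edges, switch_of):
    """Count messages whose sender and receiver sit on different switches."""
    return sum(1 for a, b in edges if switch_of[a] != switch_of[b])


# Hypothetical placement: ranks 0-3 on switch 0, ranks 4-7 on switch 1.
ranks = list(range(8))
switch_of = {r: r // 4 for r in ranks}

flat = binomial_bcast_edges(ranks)
aware = topology_aware_bcast_edges(ranks, switch_of)

print("flat inter-switch messages: ", inter_switch_count(flat, switch_of))
print("aware inter-switch messages:", inter_switch_count(aware, switch_of))
```

Both schedules send the same total number of messages; under this assumed placement the two-level schedule simply routes fewer of them over inter-switch links. A real MPI library would also need to weigh message size, link contention, and the run-time network state mentioned in the abstract.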

Project Outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other publications by Karl Schulz

Antifungal Prophylaxis and the Rate of Bacteremia among Neutropenic Patients
  • DOI:
  • Published:
  • Journal:
  • Impact factor:
    0
  • Authors:
    J. Fellay;L. Cone;R. Byrd;Karl Schulz;P. Schlievert
  • Corresponding author:
    P. Schlievert
DeepSpeed4Science Initiative: Enabling Large-Scale Scientific Discovery through Sophisticated AI System Technologies
  • DOI:
  • Published:
    2023
  • Journal:
  • Impact factor:
    0
  • Authors:
    S. Song;Bonnie Kruft;Minjia Zhang;Conglong Li;Shiyang Chen;Chengming Zhang;Masahiro Tanaka;Xiaoxia Wu;Jeff Rasley;A. A. Awan;Connor Holmes;Martin Cai;Adam Ghanem;Zhongzhu Zhou;Yuxiong He;Christopher Bishop;Max Welling;Tie;Christian Bodnar;Johannes Brandsetter;W. Bruinsma;Chan Cao;Yuan Chen;Peggy Dai;P. Garvan;Liang He;E. Heider;Pipi Hu;Peiran Jin;Fusong Ju;Yatao Li;Chang Liu;Renqian Luo;Qilong Meng;Frank Noé;Tao Qin;Janwei Zhu;Bin Shao;Yu Shi;Wen;Gregor Simm;Megan Stanley;Lixin Sun;Yue Wang;Tong Wang;Zun Wang;Lijun Wu;Yingce Xia;Leo Xia;Shufang Xie;Shuxin Zheng;Jianwei Zhu;Pete Luferenko;Divya Kumar;Jonathan Weyn;Ruixiong Zhang;Sylwester Klocek;V. Vragov;Mohammed Alquraishi;Gustaf Ahdritz;C. Floristean;Cristina Negri;R. Kotamarthi;V. Vishwanath;Arvind Ramanathan;Sam Foreman;Kyle Hippe;T. Arcomano;R. Maulik;Max Zvyagin;Alexander Brace;Bin Zhang;Cindy Orozco Bohorquez;Austin R. Clyde;B. Kale;Danilo Perez;Heng Ma;Carla M. Mann;Michael Irvin;J. G. Pauloski;Logan Ward;Valerie Hayot;M. Emani;Zhen Xie;Diangen Lin;Maulik Shukla;Thomas Gibbs;Ian Foster;James J. Davis;M. Papka;Thomas Brettin;Prasanna Balaprakash;Gina Tourassi;John P. Gounley;Heidi Hanson;T. Potok;Massimiliano Lupo Pasini;Kate Evans;Dan Lu;D. Lunga;Junqi Yin;Sajal Dash;Feiyi Wang;M. Shankar;Isaac Lyngaas;Xiao Wang;Guojing Cong;Peifeng Zhang;Ming Fan;Siyan Liu;A. Hoisie;Shinjae Yoo;Yihui Ren;William Tang;K. Felker;Alexey Svyatkovskiy;Hang Liu;Ashwin Aji;Angela Dalton;Michael Schulte;Karl Schulz;Yuntian Deng;Weili Nie;Josh Romero;Christian Dallago;Arash Vahdat;Chaowei Xiao;Anima Anandkumar;R. Stevens
  • Corresponding author:
    R. Stevens
Bioseparation Using Supercritical Fluid Extraction/Retrograde Condensation
  • DOI:
    10.1038/nbt0488-393
  • Published:
    1988-04-01
  • Journal:
  • Impact factor:
    41.700
  • Authors:
    G. Ali Mansoori;Karl Schulz;Eloy E. Martinelli
  • Corresponding author:
    Eloy E. Martinelli
Das elastische Gewebe des Periosts und der Knochen
  • DOI:
    10.1007/bf02243783
  • Published:
    1895-05-01
  • Journal:
  • Impact factor:
    2.900
  • Authors:
    Karl Schulz
  • Corresponding author:
    Karl Schulz
Limiting Liberalism (Multi)cultural Epistemologies, (Multi)cultural Subjects
  • DOI:
  • Published:
    2013-05
  • Journal:
  • Impact factor:
    0
  • Authors:
    Karl Schulz
  • Corresponding author:
    Karl Schulz

Other grants by Karl Schulz

Collaborative Research: Extending One-Sided Communication MPI Programming Model for Next-Generation Ultra-Scale HEC
  • Award number:
    0833139
  • Fiscal year:
    2008
  • Amount:
    $459,900
  • Award type:
    Standard Grant

Similar NSFC Grants

Research on Quantum Field Theory without a Lagrangian Description
  • Grant number:
    24ZR1403900
  • Year approved:
    2024
  • Amount:
    CNY 0
  • Project type:
    Provincial/municipal project
Cell Research
  • Grant number:
    31224802
  • Year approved:
    2012
  • Amount:
    CNY 240,000
  • Project type:
    Special fund project
Cell Research
  • Grant number:
    31024804
  • Year approved:
    2010
  • Amount:
    CNY 240,000
  • Project type:
    Special fund project
Cell Research
  • Grant number:
    30824808
  • Year approved:
    2008
  • Amount:
    CNY 240,000
  • Project type:
    Special fund project
Research on the Rapid Growth Mechanism of KDP Crystal
  • Grant number:
    10774081
  • Year approved:
    2007
  • Amount:
    CNY 450,000
  • Project type:
    General Program

Similar Overseas Grants

Collaborative Research: OAC Core: Large-Scale Spatial Machine Learning for 3D Surface Topology in Hydrological Applications
  • Award number:
    2414185
  • Fiscal year:
    2024
  • Amount:
    $459,900
  • Award type:
    Standard Grant
Collaborative Research: Conference: Workshops in Geometric Topology
  • Award number:
    2350374
  • Fiscal year:
    2024
  • Amount:
    $459,900
  • Award type:
    Standard Grant
Collaborative Research: Conference: Workshops in Geometric Topology
  • Award number:
    2350373
  • Fiscal year:
    2024
  • Amount:
    $459,900
  • Award type:
    Standard Grant
Collaborative Research: OAC Core: Topology-Aware Data Compression for Scientific Analysis and Visualization
  • Award number:
    2313124
  • Fiscal year:
    2023
  • Amount:
    $459,900
  • Award type:
    Standard Grant
Collaborative Research: FuSe: Interconnects with Co-Designed Materials, Topology, and Wire Architecture
  • Award number:
    2328906
  • Fiscal year:
    2023
  • Amount:
    $459,900
  • Award type:
    Continuing Grant
Collaborative Research: FuSe: Interconnects with Co-Designed Materials, Topology, and Wire Architecture
  • Award number:
    2328908
  • Fiscal year:
    2023
  • Amount:
    $459,900
  • Award type:
    Continuing Grant
Collaborative Research: ATD: a-DMIT: a novel Distributed, MultI-channel, Topology-aware online monitoring framework of massive spatiotemporal data
  • Award number:
    2220495
  • Fiscal year:
    2023
  • Amount:
    $459,900
  • Award type:
    Standard Grant
Collaborative Research: OAC Core: Topology-Aware Data Compression for Scientific Analysis and Visualization
  • Award number:
    2313122
  • Fiscal year:
    2023
  • Amount:
    $459,900
  • Award type:
    Standard Grant
Collaborative Research: Conference: New England Algebraic Topology and Mathematical Physics Seminar (NEAT MAPS)
  • Award number:
    2329854
  • Fiscal year:
    2023
  • Amount:
    $459,900
  • Award type:
    Standard Grant
Collaborative Research: FuSe: Interconnects with Co-Designed Materials, Topology, and Wire Architecture
  • Award number:
    2328907
  • Fiscal year:
    2023
  • Amount:
    $459,900
  • Award type:
    Standard Grant