Robust Parallel and Distributed Computing Systems
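Basic Information
- Award number: 0615170
- Principal investigator:
- Amount: --
- Host institution:
- Host institution country: United States
- Project type: Continuing Grant
- Fiscal year: 2006
- Funding country: United States
- Project period: 2006-06-15 to 2010-05-31
- Project status: Completed
- Source:
- Keywords:
Project Abstract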
Parallel and distributed computing systems, consisting of a heterogeneous set of machines, software, and networks, frequently operate in environments where their performance degrades because circumstances change unpredictably, for example through sudden machine failures or inaccurate estimates of system parameters. An important question then arises: how far can conditions depart from the assumed circumstances before performance degrades to the point where the system can no longer meet its specified requirements, i.e., how robust is the system? The focus of this work is the design of methodologies for generating robustness metrics and using them in resource management. A resource allocation is defined to be robust if degradation in system performance remains within specified limits when certain perturbations in specified system parameters occur. Furthermore, a resource allocation's degree of robustness must be mathematically quantified, e.g., how many machines can fail, or how inaccurate can estimates of system parameters be, before a performance requirement is violated? Specifically, this research addresses the design of:
- mathematically precise and widely applicable techniques for modeling and quantifying the robustness of a resource allocation against multiple perturbations in system components and environmental conditions;
- resource allocation algorithms that continually plan and develop strategies for responding to potential faults, resource degradation, and other changes in the system environment.
This work represents a partnership between university and industry/government laboratories that are committed to developing high-availability computing systems for industry and defense applications. Its results will be widely disseminated through presentations, publications, interdisciplinary workshops, and technology transfer.
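To make the idea of a quantified degree of robustness concrete, the sketch below shows one possible metric; it is an illustration under stated assumptions, not the method developed in this project. It assumes that tasks on a machine run one after another, that the only perturbed parameters are the estimated task execution times, and that the performance requirement is a limit `tau` on every machine's finishing time. The names `robustness_radius`, `allocation`, `est_times`, and `tau` are hypothetical. Under these assumptions, the metric is the smallest Euclidean-norm error in the estimates that pushes some machine past the limit.

```python
import math


def robustness_radius(allocation, est_times, tau):
    """Smallest Euclidean-norm error in the execution-time estimates that
    makes some machine's finishing time reach the limit tau.

    Illustrative assumptions: tasks on a machine run sequentially, and the
    only requirement is finishing_time <= tau on every machine.
    """
    radii = []
    for machine, tasks in allocation.items():
        if not tasks:
            continue  # an idle machine cannot violate the limit
        finish = sum(est_times[t] for t in tasks)
        # Distance from the estimate vector to the hyperplane
        # "sum of this machine's task times == tau".
        radii.append((tau - finish) / math.sqrt(len(tasks)))
    # With no loaded machine, no perturbation can cause a violation.
    return min(radii) if radii else math.inf


# Hypothetical example: m1 runs four short tasks, m2 runs one long task.
allocation = {"m1": ["a", "b", "c", "d"], "m2": ["e"]}
est_times = {"a": 2.0, "b": 2.0, "c": 2.0, "d": 2.0, "e": 9.0}
radius = robustness_radius(allocation, est_times, tau=12.0)
print(radius)  # 2.0
```

In this example m1 finishes earlier than m2 (8 versus 9 time units) yet is the less robust machine, because estimation errors on its four tasks can accumulate. A larger radius means the allocation tolerates larger estimation errors before the requirement is violated.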
Project Outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants by Howard Siegel
MRI Collaborative Consortium: Acquisition of a Shared Supercomputer by the Rocky Mountain Advanced Computing Consortium
- Award number: 1532235
- Fiscal year: 2015
- Funding amount: --
- Project type: Standard Grant
CSR:Medium:Collaborative Research: Stochastically Robust Resource Allocation for Computing
- Award number: 0905399
- Fiscal year: 2009
- Funding amount: --
- Project type: Standard Grant
MRI: Acquisition of the ISTeC High Performance Computing Infrastructure for Science and Engineering Research Projects
- Award number: 0923386
- Fiscal year: 2009
- Funding amount: --
- Project type: Standard Grant
NSF/Purdue Workshop on Grand Challenges in Computer Architecture for the Support of High Performance Computing; Purdue University; December 11-13, 1991
- Award number: 9200735
- Fiscal year: 1991
- Funding amount: --
- Project type: Standard Grant
Infrastructure for Parallel Processing Research
- Award number: 9015696
- Fiscal year: 1991
- Funding amount: --
- Project type: Continuing Grant
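Similar NSFC (National Natural Science Foundation of China) grants
Parallel PIC/MCC Algorithm and Implementation for Beam Loss Mechanisms in High-Intensity Low-Energy Accelerators
- Award number: 11805229
- Approval year: 2018
- Funding amount: 270,000 CNY
- Project type: Young Scientists Fund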
Similar overseas grants
Collaborative Research: CyberTraining: Implementation:Medium: Modern Course Exemplars infused with Parallel and Distributed Computing for the Introductory Computing Course Sequence
- Award number: 2321017
- Fiscal year: 2023
- Funding amount: --
- Project type: Standard Grant
Collaborative Research:CyberTraining:Implementation:Medium: Modern Course Exemplars infused with Parallel and Distributed Computing for the Introductory Computing Course Sequence
- Award number: 2321020
- Fiscal year: 2023
- Funding amount: --
- Project type: Standard Grant
Collaborative Research:CyberTraining:Implementation:Medium: Modern Course Exemplars infused with Parallel and Distributed Computing for the Introductory Computing Course Sequence
- Award number: 2321016
- Fiscal year: 2023
- Funding amount: --
- Project type: Standard Grant
Collaborative Research:CyberTraining:Implementation:Medium: Modern Course Exemplars infused with Parallel and Distributed Computing for the Introductory Computing Course Sequence
- Award number: 2321019
- Fiscal year: 2023
- Funding amount: --
- Project type: Standard Grant
CRII: SHF: A Parallel and Distributed Framework for Graph Mining on GPUs
- Award number: 2245792
- Fiscal year: 2023
- Funding amount: --
- Project type: Standard Grant
Shared and Distributed Memory Parallel Pre-Conditioning and Acceleration Algorithms for "Spline-Enhanced" Spatial Discretisations
- Award number: 2907459
- Fiscal year: 2023
- Funding amount: --
- Project type: Studentship
Collaborative Research: CyberTraining:Implementation:Medium: Modern Course Exemplars infused with Parallel and Distributed Computing for the Introductory Computing Course Sequence
- Award number: 2321015
- Fiscal year: 2023
- Funding amount: --
- Project type: Standard Grant
Collaborative Research:CyberTraining:Implementation:Medium: Modern Course Exemplars infused with Parallel and Distributed Computing for the Introductory Computing Course Sequence
- Award number: 2321018
- Fiscal year: 2023
- Funding amount: --
- Project type: Standard Grant
Travel: NSF Student Travel Support for 2023 ACM Symposium on High-Performance Parallel and Distributed Computing (ACM HPDC)
- Award number: 2326506
- Fiscal year: 2023
- Funding amount: --
- Project type: Standard Grant
Combinatorial Algorithms for Parallel and Distributed Computing
- Award number: RGPIN-2020-06789
- Fiscal year: 2022
- Funding amount: --
- Project type: Discovery Grants Program - Individual