A data-driven approach to improving data center efficiency
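Basic information
- Grant number: RGPIN-2020-05969
- Principal investigator:
- Amount: $35,000
- Host institution:
- Host institution country: Canada
- Project category: Discovery Grants Program - Individual
- Fiscal year: 2021
- Funding country: Canada
- Duration: 2021-01-01 to 2022-12-31
- Project status: Completed
- Source:
- Keywords:
Project abstract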
Large-scale processing and storage of data has become the underpinning of modern society. Nearly all industry sectors depend on efficient data management and storage as the engine that drives their business, and end users have grown dependent on services enabled by large-scale processing and storage of information. Consequently, it is perhaps not surprising that a staggering amount of resources, both monetary and environmental, is dedicated to the storage and management of data. For example, a recent study concludes that managing the "tsunami of data" could consume one fifth of global electricity by 2025. As a result, making optimal use of systems for processing and storing data at large scale is critical in terms of both future economic success and environmental impact.
Of course, the problem of efficiency in compute clusters and data centers is not new. Much research in recent years has focused on different facets of the problem, too many in fact to survey exhaustively in this proposal. Examples include more energy-efficient hardware and energy-proportional servers, optimizations to data center cooling systems to improve airflow, higher data center operating temperatures, new storage media, flexible and scalable schedulers, disaggregated storage that allows storage and compute resources to scale separately, more lightweight virtualization mechanisms, and continuous improvements in the efficiency of ML and AI algorithms.
The key observation motivating this proposal is that even today we are still making terribly inefficient use of data center and cluster resources. The advent of server virtualization and of hyper-scale data centers optimized for high density, together with the rich academic literature on the topic, creates the impression that today's compute infrastructure is generally efficiently utilized. In contrast, the PI's recent discussions with several large companies operating their own data centers revealed that data centers are still greatly under-utilized, with typical utilization levels in the 5-25% range, even for hyper-scale data centers. For smaller systems, utilization is believed to be even worse.
The long-term goal of the work in this proposal is to identify, characterize, and remove major sources of inefficiency in today's storage and compute infrastructures, and to increase their utilization. The methodology will follow the PI's signature approach to research: form collaborations with industry to gain access to real data from production machines, analyze that data to identify and characterize problems, and use rigorous statistical and algorithmic methods to tackle the core problems. The proposal outlines a set of specific near-term objectives that the PI identified in her recent work and plans to address over a 5-year horizon. She expects the results to open up new threads of research that will inform a longer-term research agenda.
Project outputs
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Schroeder, Bianca
Cosmic Rays Don't Strike Twice: Understanding the Nature of DRAM Errors and the Implications for System Design
- DOI: 10.1145/2248487.2150989
- Published: 2012-04-01
- Journal:
- Impact factor: 0
- Authors: Hwang, Andy A.; Stefanovici, Ioan; Schroeder, Bianca
- Corresponding author: Schroeder, Bianca
SSD-based Workload Characteristics and Their Performance Implications
- DOI: 10.1145/3423137
- Published: 2021-02-01
- Journal:
- Impact factor: 1.7
- Authors: Yadgar, Gala; Gabel, Moshe; Schroeder, Bianca
- Corresponding author: Schroeder, Bianca
Other grants held by Schroeder, Bianca
A data-driven approach to improving data center efficiency
- Grant number: RGPIN-2020-05969
- Fiscal year: 2022
- Funding amount: $35,000
- Project category: Discovery Grants Program - Individual
Reliable and efficient data centers
- Grant number: CRC-2018-00038
- Fiscal year: 2022
- Funding amount: $35,000
- Project category: Canada Research Chairs
Reliable And Efficient Data Centers
- Grant number: CRC-2018-00038
- Fiscal year: 2021
- Funding amount: $35,000
- Project category: Canada Research Chairs
Reliable and efficient data centers
- Grant number: CRC-2018-00038
- Fiscal year: 2020
- Funding amount: $35,000
- Project category: Canada Research Chairs
A data-driven approach to improving data center efficiency
- Grant number: RGPIN-2020-05969
- Fiscal year: 2020
- Funding amount: $35,000
- Project category: Discovery Grants Program - Individual
Reliable and efficient data centers
- Grant number: CRC-2018-00038
- Fiscal year: 2019
- Funding amount: $35,000
- Project category: Canada Research Chairs
Reliable and energy-efficient next-generation data centres
- Grant number: 356073-2013
- Fiscal year: 2018
- Funding amount: $35,000
- Project category: Discovery Grants Program - Individual
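Similar grants from the National Natural Science Foundation of China
Data-driven Recommendation System Construction of an Online Medical Platform Based on the Fusion of Information
- Grant number:
- Approval year: 2024
- Funding amount:
- Project category: Research Fund for International Young Scientists
Research on cache-based remote timing attacks
- Grant number: 60772082
- Approval year: 2007
- Funding amount: 280,000 CNY
- Project category: General Program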
Similar overseas grants
A data-driven modeling approach for augmenting climate model simulations and its application to Pacific-Atlantic interbasin interactions
- Grant number: 23K25946
- Fiscal year: 2024
- Funding amount: $35,000
- Project category: Grant-in-Aid for Scientific Research (B)
A new data-driven approach to bring humanity into virtual worlds with computer vision
- Grant number: 23K28129
- Fiscal year: 2024
- Funding amount: $35,000
- Project category: Grant-in-Aid for Scientific Research (B)
A PROGRESS-Driven Approach to Cognitive Outcomes after Traumatic Brain Injury: Advancing Equity, Diversity, and Inclusion through Knowledge Synthesis and Mobilization
- Grant number: 492338
- Fiscal year: 2023
- Funding amount: $35,000
- Project category: Operating Grants
A data-driven modeling approach for augmenting climate model simulations and its application to Pacific-Atlantic interbasin interactions
- Grant number: 23H01250
- Fiscal year: 2023
- Funding amount: $35,000
- Project category: Grant-in-Aid for Scientific Research (B)
Data-driven design of Next Generation Cross-Coupling catalysts by Ligand Parameterisation: A Combined Experimental and Computational Approach.
- Grant number: 2896325
- Fiscal year: 2023
- Funding amount: $35,000
- Project category: Studentship
EAGER: Development of a Hybrid Knowledge- and Data-Driven Approach to Guide the Design of Immunotherapeutic Cells
- Grant number: 2324742
- Fiscal year: 2023
- Funding amount: $35,000
- Project category: Continuing Grant
Study on Heavy Rainfall Mechanism by Mathematical and Data-Driven Approach Using Large Ensemble
- Grant number: 23KF0161
- Fiscal year: 2023
- Funding amount: $35,000
- Project category: Grant-in-Aid for JSPS Fellows
Information-Theoretic Surprise-Driven Approach to Enhance Decision Making in Healthcare
- Grant number: 10575550
- Fiscal year: 2023
- Funding amount: $35,000
Semiconductor Biomaterials to Speed Bone Healing: A Bioengineering-Driven Approach
- Grant number: 10587508
- Fiscal year: 2023
- Funding amount: $35,000
Enabling the mortgage industry to drive net zero retrofitting through a data-driven portfolio approach
- Grant number: 10092176
- Fiscal year: 2023
- Funding amount: $35,000
- Project category: Collaborative R&D