EAGER: Exploring Automatic Optimization of Multi-tiered HPC Storage Systems via Practical Reinforcement Learning
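Basic Information
- Award number: 2412345
- Principal investigator:
- Amount: $134,000
- Host institution:
- Host institution country: United States
- Project type: Standard Grant
- Fiscal year: 2024
- Funding country: United States
- Period: 2024-07-01 to 2025-06-30
- Project status: Ongoing
- Source:
- Keywords: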
Project Abstract
Scientific discovery increasingly involves generating and analyzing large amounts of data. These data-intensive scientific applications pose significant challenges to the storage systems of high-performance computing (HPC) clusters, which are heterogeneous and extremely complex. Scientists who need high-speed data access often struggle to use these heterogeneous storage options effectively. There is a need to build the long-missing automated HPC I/O (Input/Output) middleware that transparently helps scientists achieve optimal data access performance without manual effort. Designing automated HPC I/O middleware for large-scale, heterogeneous, and shared HPC storage systems is an extremely challenging task. The researchers supported by this grant plan to leverage machine learning techniques to understand I/O requests and the current system status, and to schedule and coordinate those requests intelligently and adaptively. The outcomes of this research are expected to work with existing storage components and to minimize the impact on both scientific applications and HPC systems.
This project plans to tackle this grand challenge by exploring practical reinforcement learning (RL)-based methods and building the relevant software infrastructure in an HPC environment. The project has two main focuses: 1) RL-based data placement for high storage utilization, and 2) RL-based I/O coordination for shared storage. Both tasks depend on identifying effective reinforcement learning methods and integrating them into HPC systems. To achieve this goal, a novel, system-centric reinforcement learning framework will be developed. Moreover, in each research focus, various RL algorithms, deep neural network designs, and reward-shaping strategies will be proposed, implemented, rigorously benchmarked, and compared with state-of-the-art solutions.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Project Outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Dong Dai
Pattern-Directed Replication Scheme for Heterogeneous Object-Based Storage
- DOI:
- Published: 2017
- Journal:
- Impact factor: 0
- Authors: Jiang Zhou; Wei Xie; Dong Dai; Yong Chen
- Corresponding author: Yong Chen
Horseshoes, homoclinic connections and global chaos in current-mode controlled DC/DC converters
- DOI: 10.1109/iscas.2005.1464876
- Published: 2005
- Journal:
- Impact factor: 0
- Authors: Dong Dai; Yue Ma; C. Tse
- Corresponding author: C. Tse
Real-World Patient Experience of Pexidartinib for Tenosynovial Giant-Cell Tumor
- DOI: 10.1093/oncolo/oyad282
- Published: 2023
- Journal:
- Impact factor: 0
- Authors: Feng Lin; W. Kwong; Irene Pan; Xin Ye; Dong Dai; William Tap
- Corresponding author: William Tap
Group Scheduling for Improving Both CPU and Memory Power Efficiency Simultaneously
- DOI: 10.1109/hpcc.and.euc.2013.260
- Published: 2013
- Journal:
- Impact factor: 0
- Authors: Gangyong Jia; Xi Li; Jian Wan; Chao Wang; Dong Dai
- Corresponding author: Dong Dai
Identification of Gingival Inflammation Surface Image Features Using Intraoral Scanning and Deep Learning
- DOI: 10.1016/j.identj.2025.01.002
- Published: 2025-06-01
- Journal:
- Impact factor: 3.700
- Authors: Wei Li; Linlin Li; Wenchong Xu; Yuting Guo; Min Xu; Shengyuan Huang; Dong Dai; Chang Lu; Shuai Li; Jiang Lin
- Corresponding author: Jiang Lin
Other grants by Dong Dai
CNS Core: Small: Moving Machine Learning into the Next-Generation Cloud Flexibly, Agilely and Efficiently
- Award number: 2008265
- Fiscal year: 2020
- Funding amount: $134,000
- Project type: Standard Grant
SHF: Small: A Hybrid NVM based Computing Architecture for Machine Learning Applications
- Award number: 1908843
- Fiscal year: 2019
- Funding amount: $134,000
- Project type: Standard Grant
SHF: Small: Collaborative Research: A Parallel Graph-Based Paradigm for HPC Parallel File System Checkers
- Award number: 1910727
- Fiscal year: 2019
- Funding amount: $134,000
- Project type: Standard Grant
CRII: CSR: Partitioning Large Graphs in Deep Storage Architecture
- Award number: 1852815
- Fiscal year: 2018
- Funding amount: $134,000
- Project type: Standard Grant
CRII: CSR: Partitioning Large Graphs in Deep Storage Architecture
- Award number: 1756012
- Fiscal year: 2018
- Funding amount: $134,000
- Project type: Standard Grant
NSF Student Travel Grant for 2017 IEEE/ACM International Conference on Utility and Cloud Computing (UCC) and Co-located BDCAT Conference
- Award number: 1743903
- Fiscal year: 2017
- Funding amount: $134,000
- Project type: Standard Grant
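Similar NSFC (National Natural Science Foundation of China) grants
Exploring Changing Fertility Intentions in China
- Award number:
- Approval year: 2024
- Funding amount (10,000 CNY):
- Project type: Research Fund for International Scientists
Exploring the Intrinsic Mechanisms of CEO Turnover and Market
- Award number:
- Approval year: 2024
- Funding amount (10,000 CNY):
- Project type: Research Fund for International Scientists
Exploring the Intrinsic Mechanisms of CEO Turnover and Market Reaction: An Explanation Based on Information Asymmetry
- Award number: W2433169
- Approval year: 2024
- Funding amount (10,000 CNY):
- Project type: Research Fund for International Scientists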
Similar overseas grants
EMPOWHPVR: Exploring the factors that impact HPV self-sampling uptake amongst Black women and people with a cervix in Peel region, Ontario
- Award number: 502585
- Fiscal year: 2024
- Funding amount: $134,000
- Project type:
Exploring volcanic arcs as factories of critical minerals
- Award number: FT230100230
- Fiscal year: 2024
- Funding amount: $134,000
- Project type: ARC Future Fellowships
Exploring the mental health and wellbeing of adolescent parent families affected by HIV in South Africa
- Award number: ES/Y00860X/1
- Fiscal year: 2024
- Funding amount: $134,000
- Project type: Fellowship
Exploring the Impact of Clinical Diagnosis on Health and Education Outcomes for Children Receiving Special Educational Needs support for Autism
- Award number: ES/Z502431/1
- Fiscal year: 2024
- Funding amount: $134,000
- Project type: Fellowship
Exploring factors affecting the disability pay gap
- Award number: ES/Z50242X/1
- Fiscal year: 2024
- Funding amount: $134,000
- Project type: Fellowship
Women's mental illness in pregnancy: Exploring contact with secondary mental health services and links with offspring health and education outcomes
- Award number: ES/Z502492/1
- Fiscal year: 2024
- Funding amount: $134,000
- Project type: Fellowship
A2M: Exploring in-silico predicted arms-races at the plant-pathogen interface
- Award number: BB/Y000560/1
- Fiscal year: 2024
- Funding amount: $134,000
- Project type: Research Grant
Winds of Change: Exploring the Meteorological Drivers of Global Dust
- Award number: 2333139
- Fiscal year: 2024
- Funding amount: $134,000
- Project type: Standard Grant
Planning: FIRE-PLAN: Exploring fire as medicine to revitalize cultural burning in the Upper Midwest
- Award number: 2349282
- Fiscal year: 2024
- Funding amount: $134,000
- Project type: Standard Grant
Postdoctoral Fellowship: CREST-PRP: Exploring the Impact of Heat-Waves and Nutrients on Bloom-Forming and Habitat-Building Seaweeds Along the South Florida Coast
- Award number: 2401066
- Fiscal year: 2024
- Funding amount: $134,000
- Project type: Standard Grant