CC* Data Storage: Institutional Storage for the University of Notre Dame (NDStore)
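Basic Information
- Award number: 2232803
- Principal investigator:
- Amount: $500,000
- Host institution:
- Host institution country: United States
- Project type: Standard Grant
- Fiscal year: 2022
- Funding country: United States
- Project period: 2022-09-01 to 2024-08-31
- Project status: Completed
- Source:
- Keywords:
Project Abstract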
This project equips the Center for Research Computing at the University of Notre Dame (ND CRC) and its scientific users across all Notre Dame colleges and departments to enable transformative research in social and physical sciences and engineering domains through the acquisition of institutional storage called NDStore. Major beneficiaries of NDStore are researchers utilizing various research cores at Notre Dame, such as the Genomics and Bioinformatics Core and the Notre Dame Integrated Imaging Facility, as well as other researchers from University Centers and Institutes, such as the Institute for Data and Society, in addition to the broader national community via the Open Science Grid. Together, these different facilities and researchers generate hundreds of terabytes of data per year, and they enable expert users to address the most complex research problems of today’s world. The major capabilities provided by NDStore accelerate existing research otherwise throttled by insufficient storage capability. They also enable full data lifecycle at previously inaccessible scales, enable new national data-intensive collaborations, and incubate new research projects. NDStore brings to Notre Dame an additional 2 petabytes of storage capacity for data manipulation, curation, and long-term preservation, as well as 250 terabytes of fast scratch storage for machine learning-related workloads. NDStore is a highly available solution based on an open-source Ceph-based storage clustering standard. It was designed with the flexibility to meet the needs of researchers at various stages of their research. NDStore provides a clear benefit to researchers who generate data with various instruments in core facilities and need to transfer the data to their home directories for analysis and curation before the data is shared with their communities. 
Before this project was funded, the amount of storage provided by ND CRC to each faculty lab was not satisfactory for most of the users dealing with large data coming from instruments at core facilities, such as microscopes, sequencing machines, or other benchtop devices. In addition, ND CRC’s high-performance scratch storage system was shared between high-performance computing and machine learning workloads; very often, mixing these workloads led to performance bottlenecks, negatively impacting all of the storage system users at Notre Dame. NDStore helps Notre Dame create an independent scratch system for machine learning workloads.
Another important aspect of the intellectual merit of this project is the opportunity for the CRC to deploy NDStore in such a way that the entire data lifecycle at Notre Dame’s research enterprise is supported. Research data use cases at ND are highly diverse, complex, and heterogeneous. They differ in the types of data captured, the scientific instruments used, the data processing and analyses conducted, the policies and methods for data sharing and use, and the cyberinfrastructure-related knowledge available within each lab. Data lifecycle stages include: 1) data capture; 2) initial processing near the instrument(s); 3) central processing at data centers or clouds; 4) data storage, curation, and archiving; and 5) data access, dissemination, and visualization. Until NDStore was deployed, Notre Dame infrastructure could adequately support only stages 1-3 and 5, with very minimal support for stage 4. NDStore fills this gap. NDStore will also be integrated into classroom and undergraduate internship programs hosted by numerous faculty in colleges and ND CRC.
Through user training, research experience for undergraduates, pre-college programs for high school students, workshops, internships, and experiential training programs, ND CRC will ensure that NDStore has the broadest possible impact on the local and national academic community.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Project Outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Jaroslaw Nabrzyski
Enhancing scalability and accuracy of quantum poisson solver
- DOI: 10.1007/s11128-024-04420-y
- Publication year: 2024
- Journal:
- Impact factor: 2.5
- Authors: Kamal K. Saha; Walter Robson; Connor Howington; In; Zhimin Wang; Jaroslaw Nabrzyski
- Corresponding author: Jaroslaw Nabrzyski
Other grants by Jaroslaw Nabrzyski
IUCRC Phase I University of Notre Dame: Center for Science, Management, Application/s, Regulation, and Training [SMART]
- Award number: 2113718
- Fiscal year: 2021
- Funding amount: $500,000
- Project type: Continuing Grant
EarthCube RCN: Collaborative Research: Research Coordination Network for High-Performance Distributed Computing in the Polar Sciences
- Award number: 1542052
- Fiscal year: 2015
- Funding amount: $500,000
- Project type: Standard Grant
CC-NIE Networking Infrastructure: Accelerating Research Data Transit Between the Scientist's Desktop, Campus, and National Cyberinfrastructure
- Award number: 1340990
- Fiscal year: 2014
- Funding amount: $500,000
- Project type: Standard Grant
Workshop: Grid Computing - The Next Decade
- Award number: 1205193
- Fiscal year: 2012
- Funding amount: $500,000
- Project type: Standard Grant
MRI: Acquisition of a Data Analytics Cluster for Computational Social Science
- Award number: 1229450
- Fiscal year: 2012
- Funding amount: $500,000
- Project type: Standard Grant
REU Site: Multidisciplinary Computational Science at the University of Notre Dame
- Award number: 1063084
- Fiscal year: 2011
- Funding amount: $500,000
- Project type: Continuing Grant
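Similar National Natural Science Foundation of China (NSFC) grants
Scalable Learning and Optimization: High-dimensional Models and Online Decision-Making Strategies for Big Data Analysis
- Award number:
- Approval year: 2024
- Funding amount:
- Project type: Collaborative Innovation Research Team
Data-driven Recommendation System Construction of an Online Medical Platform Based on the Fusion of Information
- Award number:
- Approval year: 2024
- Funding amount:
- Project type: Research Fund for International Young Scientists
Development of a Linear Stochastic Model for Wind Field Reconstruction from Limited Measurement Data
- Award number:
- Approval year: 2020
- Funding amount: CNY 400,000
- Project type:
Key Technologies for Semantic Interoperability of Web Services Based on Linked Open Data
- Award number: 61373035
- Approval year: 2013
- Funding amount: CNY 770,000
- Project type: General Program
Molecular Interaction Reconstruction of Rheumatoid Arthritis Therapies Using Clinical Data
- Award number: 31070748
- Approval year: 2010
- Funding amount: CNY 340,000
- Project type: General Program
Functional Data Analysis Methods for High-Dimensional Data
- Award number: 11001084
- Approval year: 2010
- Funding amount: CNY 160,000
- Project type: Young Scientists Fund
The Role of datA, a Negative Regulator of Chromosome Replication, in the Cell Cycle
- Award number: 31060015
- Approval year: 2010
- Funding amount: CNY 250,000
- Project type: Regional Science Fund Project
Computational Methods for Analyzing Toponome Data
- Award number: 60601030
- Approval year: 2006
- Funding amount: CNY 170,000
- Project type: Young Scientists Fund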
Similar overseas grants
Research Infrastructure: CC* Data Storage: Foundational Campus Research Storage for Digital Transformation
- Award number: 2346636
- Fiscal year: 2024
- Funding amount: $500,000
- Project type: Standard Grant
CC* Data Storage: High-Capacity Active Archive to Enable Economical Data Access and Distribution for Illinois Researchers and the National Community
- Award number: 2346737
- Fiscal year: 2024
- Funding amount: $500,000
- Project type: Standard Grant
CC* Data Storage: Cost-effective Attached Storage for High throughput computing using Homogeneous IT (CASH HIT) supporting Penn State Science, the Open Science Grid and LIGO
- Award number: 2346596
- Fiscal year: 2024
- Funding amount: $500,000
- Project type: Standard Grant
Research Infrastructure: CC* Data Storage: Broadening UMBCs Data Storage footprint to Advance Scientific Research and Discovery
- Award number: 2346667
- Fiscal year: 2024
- Funding amount: $500,000
- Project type: Standard Grant
CC* Data Storage: Shareable, Equitable, and Extensible Data Storage for Collaborative Data-intensive Research
- Award number: 2321980
- Fiscal year: 2023
- Funding amount: $500,000
- Project type: Standard Grant
CC* Data Storage: NRDStor: Nebraska Research Data Storage
- Award number: 2232851
- Fiscal year: 2023
- Funding amount: $500,000
- Project type: Standard Grant
CC* Data Storage: Closing Caltech's data storage gap: from ad-hoc to well-managed stewardship of large-scale datasets
- Award number: 2322420
- Fiscal year: 2023
- Funding amount: $500,000
- Project type: Standard Grant
CC* Data Storage: FASTER Data Infrastructure to Accelerate Computing
- Award number: 2322377
- Fiscal year: 2023
- Funding amount: $500,000
- Project type: Standard Grant
Equipment: CC* Data Storage: Improving Research Ability with Data Storage at the University of Montana
- Award number: 2321843
- Fiscal year: 2023
- Funding amount: $500,000
- Project type: Standard Grant
CC* Data Storage: Remote Instrumentation Science Environment for Intelligent Image Analytics
- Award number: 2322063
- Fiscal year: 2023
- Funding amount: $500,000
- Project type: Standard Grant