GOALI: Frameworks: At-Scale Heterogeneous Data based Adaptive Development Platform for Machine-Learning Models for Material and Chemical Discovery
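Basic information
- Award number: 2311632
- Principal investigator:
- Amount: $4.5 million
- Host institution:
- Host institution country: United States
- Project type: Standard Grant
- Fiscal year: 2023
- Funding country: United States
- Period: 2023-10-01 to 2028-09-30
- Project status: Ongoing
- Source:
- Keywords:
Project abstract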
This project seeks to establish a new technological paradigm and the software infrastructure necessary for the development of Machine Learning (ML) models capable of predicting the properties of unseen molecular and materials systems/structures, thus enabling modeling of atomic behavior and the computational discovery of new molecules and materials at significantly higher throughput than afforded by existing first principles (quantum) methods. ML-enabled materials discovery is poised to play a critical role in addressing modern societal challenges such as energy sustainability and, as such, the technology and infrastructure developed by this project are expected to have a transformative impact across many scientific and engineering domains. The platform facilitates access, sharing, and discovery of vast amounts of first principles and experimental data, removing inefficiencies and accelerating scientific discovery by enabling the development of ML models on a scale previously inaccessible. To achieve these goals, this project is carried out in partnership with Amazon Web Services (AWS), providing the necessary know-how for the development of specialized open-source tools for training ML models at scale. This project is committed to the advancement of diversity, equity and inclusiveness in higher education, and as such it incorporates a variety of mechanisms to include underrepresented and low-income students (high-school and undergraduate) in its research activities across the four participating universities (New York University, University of Minnesota, University of Florida, and Brigham Young University), in addition to the mentoring of graduate students, the development of teaching materials, and workshops aimed at industrial outreach and training. 
To assure alignment between the platform/software and community needs, this project is supported by an Advisory Board of experts in cyberinfrastructure development, machine learning, material and chemical sciences, and STEM outreach who evaluate and provide strategic advice to the PIs.
The key technological advance that serves as the basis of this work is "foundation models", an approach for building ML systems in which a model trained on extremely large amounts of diverse and easily available data can be adapted to diverse applications with a small amount of additional model fitting (fine-tuning). This project thus focuses on the development of a foundation model, called FERMat, for molecular and material property prediction, and ML interatomic potentials for modeling atomic behavior. FERMat is to be delivered via an integrated adaptive platform in the form of a software package and an online framework for developing and deploying specialized ML models for materials and chemistry applications, called "FERMat Apps". In collaboration with AWS, this project seeks to develop open-source software for training foundation models like FERMat at scale on large amounts of highly heterogeneous and multi-modal data. The high data needs will be met by leveraging and significantly expanding the ColabFit Exchange, an online repository of first-principles and experimental data optimized for training ML models, in cooperation with a large number of materials and molecular data repositories, standards organizations, and existing cyberinfrastructures. FERMat and any ML model derived from it are designed to support uncertainty quantification (based on information geometry, Bayesian, and frequentist approaches) to ensure the robustness of predictions. As guiding target applications, this project considers two problems of scientific interest: 2D-material-driven catalysis and the prediction of molecular crystal polymorphs.
This award by the Office of Advanced Cyberinfrastructure is jointly supported by the Division of Materials Research within the Directorate for Mathematical and Physical Sciences. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Stefano Martiniani
Transport and Energetics of Bacterial Rectification
- DOI:
- Year: 2023
- Journal:
- Impact factor: 0
- Authors: Satyam Anand; Xiaolei Ma; Shuo Guo; Stefano Martiniani; Xiang Cheng
- Corresponding author: Xiang Cheng
Monte Carlo sampling for stochastic weight functions
- DOI: 10.1073/pnas.1620497114
- Year: 2016
- Journal:
- Impact factor: 0
- Authors: D. Frenkel; K. J. Schrenk; Stefano Martiniani
- Corresponding author: Stefano Martiniani
On the complexity of energy landscapes: algorithms and a direct test of the Edwards conjecture
- DOI:
- Year: 2017
- Journal:
- Impact factor: 0
- Authors: Stefano Martiniani
- Corresponding author: Stefano Martiniani
Structural analysis of high-dimensional basins of attraction.
- DOI: 10.1103/physreve.94.031301
- Year: 2016
- Journal:
- Impact factor: 0
- Authors: Stefano Martiniani; K. J. Schrenk; J. Stevenson; D. Wales; D. Frenkel
- Corresponding author: D. Frenkel
Vicsek model by time-interlaced compression: A dynamical computable information density.
- DOI: 10.1103/physreve.103.062141
- Year: 2020
- Journal:
- Impact factor: 0
- Authors: A. Cavagna; P. Chaikin; D. Levine; Stefano Martiniani; A. Puglisi; M. Viale
- Corresponding author: M. Viale
Other grants by Stefano Martiniani
EAGER: Quantifying the error landscape of deep neural networks
- Award number: 2226387
- Fiscal year: 2022
- Amount: $4.5 million
- Project type: Standard Grant
EAGER: Quantifying the error landscape of deep neural networks
- Award number: 2132995
- Fiscal year: 2021
- Amount: $4.5 million
- Project type: Standard Grant
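Similar NSFC (National Natural Science Foundation of China) grants
Construction and empirical study of an assessment framework for equity in educational outcomes based on large-scale monitoring
- Award number: 72304037
- Year approved: 2023
- Amount: 300,000 CNY
- Project type: Young Scientists Fund
Cloud-edge-device collaborative parallel processing framework and architecture for large-scale deep neural networks
- Award number:
- Year approved: 2022
- Amount: 300,000 CNY
- Project type: Young Scientists Fund
Cloud-edge-device collaborative parallel processing framework and architecture for large-scale deep neural networks
- Award number: 62202154
- Year approved: 2022
- Amount: 300,000 CNY
- Project type: Young Scientists Fund
Compositional symbolic analysis of large-scale framework-based software
- Award number:
- Year approved: 2021
- Amount: 580,000 CNY
- Project type: General Program
Compositional symbolic analysis of large-scale framework-based software
- Award number: 62172429
- Year approved: 2021
- Amount: 580,000 CNY
- Project type: General Program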
Similar overseas grants
CAREER: Novel Parallelization Frameworks for Large-Scale Network Optimization with Combinatorial Requirements: Solution Methods and Applications
- Award number: 2338641
- Fiscal year: 2024
- Amount: $4.5 million
- Project type: Standard Grant
Frameworks: arXiv as an accessible large-scale open research platform
- Award number: 2311521
- Fiscal year: 2024
- Amount: $4.5 million
- Project type: Standard Grant
Collaborative Research: Frameworks: Scalable Performance and Accuracy analysis for Distributed and Extreme-scale systems (SPADE)
- Award number: 2311707
- Fiscal year: 2023
- Amount: $4.5 million
- Project type: Standard Grant
Collaborative Research: Frameworks: Scalable Performance and Accuracy analysis for Distributed and Extreme-scale systems (SPADE)
- Award number: 2311708
- Fiscal year: 2023
- Amount: $4.5 million
- Project type: Standard Grant
Multiscale computational frameworks for integrating large-scale cortical dynamics, connectivity, and behavior
- Award number: 10840682
- Fiscal year: 2023
- Amount: $4.5 million
- Project type: