D3SC: EAGER: Collaborative Research: A probabilistic framework for automated force field parameterization from experimental datasets
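Basic information
- Award number: 1738975
- Principal investigator:
- Amount: $118,900
- Host institution:
- Host institution country: United States
- Project type: Standard Grant
- Fiscal year: 2017
- Funding country: United States
- Duration: 2017-08-01 to 2019-07-31
- Project status: Completed
- Source:
- Keywords:
Project abstract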
Michael Shirts of the University of Colorado Boulder and John Chodera of the Sloan Kettering Institute are supported by a grant from the Chemical Theory, Models and Computational Methods program in the Division of Chemistry to develop statistical and probabilistic methods for parameterizing molecular force fields through the automated integration of information from large experimental datasets. This project is supported under the Data-Driven Discovery Science in Chemistry (D3SC) Dear Colleague Letter (DCL), and is co-funded by the Cyberinfrastructure for Emerging Science and Engineering Research (CESER) Program in the Office of Advanced Cyberinfrastructure. Force fields are classical approximations to quantum mechanical descriptions of interacting molecules. Because they are several orders of magnitude faster to use in simulations than even approximate quantum approaches, force fields are integral to computational modeling in chemistry, chemical engineering, biophysics, materials science, and soft-matter physics. New and more accurate force fields are needed in order to accelerate drug discovery, biomaterials design, and nanoscale device engineering. Currently, force fields are primarily tuned using quantum chemical calculations and small amounts of experimental data, and rely on optimization methods that often require considerable manual intervention, may not identify optimal solutions, and do not provide a way of characterizing and propagating parameter uncertainty. When experimental data is included in the tuning process, there is currently no systematic way to incorporate information on measurement error to weight the data accordingly. Professors Shirts and Chodera are developing a rigorous Bayesian probabilistic framework and statistical techniques to overcome these problems. 
Their approach is designed to take advantage of large, rich experimental datasets including measurement uncertainty, leverage available data more efficiently, and automate both parameter selection and the choice of functional forms in the mathematical formulation of the force field. Software from the project is being disseminated as open-source Python code that can be interfaced to simulation codes such as OpenMM and GROMACS. A new Open Force Field Group, with collaborators from academia, the National Institute of Standards and Technology, and industry, will advance community-driven force field development and applications during the project and beyond.

This project is addressing the challenges of force field parameterization by applying a rigorous Bayesian inference framework to determine force fields that are maximally compatible with experimental datasets. The formalism is being applied initially to organic and aqueous liquid mixtures using the NIST ThermoML Archive, which contains a wide range of thermophysical property measurements and associated measurement errors for thousands of molecules. Specific tasks include (1) developing and evaluating an automated Bayesian force field parameterization framework that scales to large numbers of parameters and large data sets, and (2) using this approach to explore the automated selection of force field functional forms. The Bayesian probabilistic framework developed in this work promises to greatly reduce human effort, maximize force field transferability and generalizability by avoiding over-fitting, and enable the systematic extraction of available information from a given set of experimental data. The probabilistic formulation will allow force fields to be easily extended to accommodate new experimental data in a consistent manner via conditional Bayesian updates, and will provide direct routes for estimating systematic error. 
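As an illustration only, the idea of weighting data by measurement error and extending a fit through conditional Bayesian updates can be sketched with a toy conjugate-Gaussian model. This is not the project's software: the single scalar parameter, the assumption that each measurement observes that parameter directly with known Gaussian error, and all numerical values are hypothetical. A real force field would map parameters to observables through molecular simulation.

```python
import math


def gaussian_update(prior_mean, prior_sd, y, sigma):
    """Condition a Gaussian prior on one measurement y with known std sigma.

    Toy assumption: the measurement observes the parameter directly.
    """
    prior_prec = 1.0 / prior_sd ** 2   # precision = 1 / variance
    data_prec = 1.0 / sigma ** 2       # precise measurements get more weight
    post_prec = prior_prec + data_prec
    post_mean = (prior_prec * prior_mean + data_prec * y) / post_prec
    return post_mean, math.sqrt(1.0 / post_prec)


def update_all(prior_mean, prior_sd, measurements):
    """Apply the update one measurement at a time (sequential conditioning)."""
    mean, sd = prior_mean, prior_sd
    for y, sigma in measurements:
        mean, sd = gaussian_update(mean, sd, y, sigma)
    return mean, sd


# Hypothetical (value, measurement uncertainty) pairs for one parameter.
measurements = [(1.02, 0.05), (0.95, 0.20), (1.10, 0.10)]

# Posterior after seeing the data in one order, then in the reverse order.
mean_a, sd_a = update_all(1.0, 0.5, measurements)
mean_b, sd_b = update_all(1.0, 0.5, list(reversed(measurements)))

print(f"posterior mean = {mean_a:.4f}, posterior sd = {sd_a:.4f}")
```

In this toy model the posterior does not depend on the order in which measurements arrive, and the measurement with the smallest stated uncertainty dominates the posterior mean. Those are the two properties the abstract attributes to the probabilistic formulation.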
Initial tests of the new approach will help resolve important questions on the parameterization of molecular force fields for liquid systems, such as optimal choices of functional forms and combining rules. The same techniques can be later used to determine whether pure fluid thermodynamic properties are sufficient to parameterize fluids to reproduce mixture properties, as measured experimentally by the project, and to assess the importance of including polarization in force fields. Open source software tools will be released as easily-installed interoperable Python modules and online instructive IPython/Jupyter notebooks, and all experimental datasets and parameter sets will be freely disseminated.
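For readers unfamiliar with combining rules, the sketch below compares two widely used choices for Lennard-Jones cross-interaction parameters: Lorentz-Berthelot (arithmetic mean of sigma, geometric mean of epsilon) and fully geometric. The function names and the numerical values are hypothetical and chosen only for illustration. The sketch does not represent the functional forms the project selected.

```python
import math


def lorentz_berthelot(sigma_i, eps_i, sigma_j, eps_j):
    """Arithmetic mean for sigma, geometric mean for epsilon."""
    return 0.5 * (sigma_i + sigma_j), math.sqrt(eps_i * eps_j)


def geometric(sigma_i, eps_i, sigma_j, eps_j):
    """Geometric mean for both sigma and epsilon."""
    return math.sqrt(sigma_i * sigma_j), math.sqrt(eps_i * eps_j)


def lennard_jones(r, sigma, eps):
    """12-6 Lennard-Jones pair energy at separation r."""
    sr6 = (sigma / r) ** 6
    return 4.0 * eps * (sr6 * sr6 - sr6)


# Hypothetical per-atom-type parameters (sigma in nm, epsilon in kJ/mol).
sigma_a, eps_a = 0.30, 0.5
sigma_b, eps_b = 0.40, 2.0

lb = lorentz_berthelot(sigma_a, eps_a, sigma_b, eps_b)
geo = geometric(sigma_a, eps_a, sigma_b, eps_b)

# Same pair separation, different cross-interaction energy under each rule.
r = 0.38
print("Lorentz-Berthelot:", lb, lennard_jones(r, *lb))
print("Geometric:        ", geo, lennard_jones(r, *geo))
```

The two rules agree on epsilon here but give different values of sigma, so the predicted mixture energetics differ even though the pure-component parameters are identical. This is why the choice of combining rule can be treated as a modeling decision to be tested against mixture data.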
Project outcomes
Journal articles (4)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Configuration-Sampling-Based Surrogate Models for Rapid Parameterization of Non-Bonded Interactions
- DOI: 10.1021/acs.jctc.8b00223
- Published: 2018-06-01
- Journal:
- Impact factor: 5.5
- Authors: Messerly, Richard A.; Razavi, S. Mostafa; Shirts, Michael R.
- Corresponding author: Shirts, Michael R.
Uncertainty quantification confirms unreliable extrapolation toward high pressures for united-atom Mie λ-6 force field
- DOI: 10.1063/1.5039504
- Published: 2018
- Journal:
- Impact factor: 0
- Authors: Messerly, Richard A.; Shirts, Michael R.; Kazakov, Andrei F.
- Corresponding author: Kazakov, Andrei F.
Toward Learned Chemical Perception of Force Field Typing Rules
- DOI: 10.1021/acs.jctc.8b00821
- Published: 2019-01-01
- Journal:
- Impact factor: 5.5
- Authors: Zanette, Camila; Bannan, Caitlin C.; Mobley, David L.
- Corresponding author: Mobley, David L.
Other publications by Michael Shirts
Frustration enables the liquid-liquid phase separation of nonspecifically interacting coiled-coil proteins
- DOI: 10.1016/j.bpj.2023.11.2717
- Published: 2024-02-08
- Journal:
- Impact factor:
- Authors: Dominique Ramirez; Loren Hough; Michael Shirts
- Corresponding author: Michael Shirts
Other grants by Michael Shirts
Collaborative Research: CyberTraining: Implementation: Medium: Establishing Sustainable Ecosystem for Computational Molecular Science Training and Education
- Award number: 2118174
- Fiscal year: 2021
- Funding amount: $118,900
- Project type: Standard Grant
Collaborative Research: NSCI Framework: Software: SCALE-MS - Scalable Adaptive Large Ensembles of Molecular Simulations
- Award number: 1835720
- Fiscal year: 2019
- Funding amount: $118,900
- Project type: Standard Grant
CAREER: Understanding the thermodynamics of crystalline materials using advanced molecular simulation sampling methods
- Award number: 1639105
- Fiscal year: 2016
- Funding amount: $118,900
- Project type: Standard Grant
CAREER: Understanding the thermodynamics of crystalline materials using advanced molecular simulation sampling methods
- Award number: 1351635
- Fiscal year: 2014
- Funding amount: $118,900
- Project type: Standard Grant
Highly multidimensional thermodynamic property prediction for chemical design using atomistic simulations
- Award number: 1152786
- Fiscal year: 2012
- Funding amount: $118,900
- Project type: Standard Grant
Similar overseas grants
Collaborative Research: EAGER: The next crisis for coral reefs is how to study vanishing coral species; AUVs equipped with AI may be the only tool for the job
- Award number: 2333604
- Fiscal year: 2024
- Funding amount: $118,900
- Project type: Standard Grant
EAGER/Collaborative Research: An LLM-Powered Framework for G-Code Comprehension and Retrieval
- Award number: 2347624
- Fiscal year: 2024
- Funding amount: $118,900
- Project type: Standard Grant
EAGER/Collaborative Research: Revealing the Physical Mechanisms Underlying the Extraordinary Stability of Flying Insects
- Award number: 2344215
- Fiscal year: 2024
- Funding amount: $118,900
- Project type: Standard Grant
Collaborative Research: EAGER: Designing Nanomaterials to Reveal the Mechanism of Single Nanoparticle Photoemission Intermittency
- Award number: 2345581
- Fiscal year: 2024
- Funding amount: $118,900
- Project type: Standard Grant
Collaborative Research: EAGER: Designing Nanomaterials to Reveal the Mechanism of Single Nanoparticle Photoemission Intermittency
- Award number: 2345582
- Fiscal year: 2024
- Funding amount: $118,900
- Project type: Standard Grant
Collaborative Research: EAGER: Designing Nanomaterials to Reveal the Mechanism of Single Nanoparticle Photoemission Intermittency
- Award number: 2345583
- Fiscal year: 2024
- Funding amount: $118,900
- Project type: Standard Grant
Collaborative Research: EAGER: Energy for persistent sensing of carbon dioxide under near shore waves.
- Award number: 2339062
- Fiscal year: 2024
- Funding amount: $118,900
- Project type: Standard Grant
Collaborative Research: EAGER: IMPRESS-U: Groundwater Resilience Assessment through iNtegrated Data Exploration for Ukraine (GRANDE-U)
- Award number: 2409395
- Fiscal year: 2024
- Funding amount: $118,900
- Project type: Standard Grant
Collaborative Research: EAGER: The next crisis for coral reefs is how to study vanishing coral species; AUVs equipped with AI may be the only tool for the job
- Award number: 2333603
- Fiscal year: 2024
- Funding amount: $118,900
- Project type: Standard Grant
EAGER/Collaborative Research: An LLM-Powered Framework for G-Code Comprehension and Retrieval
- Award number: 2347623
- Fiscal year: 2024
- Funding amount: $118,900
- Project type: Standard Grant