DMS/NIGMS 1: Design and Analysis of Machine Learning Approaches for Long Timescale Prediction from Short Trajectory Data

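Basic information

  • Award number:
    2054306
  • Principal investigator:
  • Amount:
    $600,000
  • Host institution:
  • Host institution country:
    United States
  • Award type:
    Continuing Grant
  • Fiscal year:
    2021
  • Funding country:
    United States
  • Period:
    2021-09-01 to 2024-08-31
  • Status:
    Completed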

Project abstract

Events like the changes in molecular interactions underlying functions in our bodies, or the extremely intense hurricanes that cause devastating damage to our coastal cities, are difficult to study computationally because they occur very infrequently on feasible simulation timescales. Remarkably, it is theoretically possible to bypass this issue by appealing to certain high-dimensional equations characterizing long-timescale statistics in terms of short-timescale properties. Leveraging these equations and new tools in statistical and machine learning to solve them, this project introduces algorithms to learn long-time statistics using a data set consisting of many short simulations. The development of these algorithms will be informed by mathematical analysis aimed at fully characterizing their potential utility. The research will complement and support significant and diverse applications of critical societal interest. Examples include studies of protein assemblies relevant to treating diabetes and wound healing and changes in atmospheric conditions that lead to polar vortices and tropical cyclones. Because the methods developed in this project avoid the need to make simplifying assumptions when formulating the models, they promise to reveal the underlying physical mechanisms of these and other processes in unprecedented detail. The project will provide opportunities for graduate students to be involved in the research.

This project concerns the development and analysis of a family of algorithms that assemble forecasts of events occurring over extremely long times using only a data set consisting of short trajectories. In this approach, forecasts (conditional expectations of future behavior) are cast as solutions to equations involving the operator determining the statistics of the underlying dynamical system. Building on significant and promising preliminary efforts employing a basis expansion approximation of the target predictive functions, this project will develop more expressive approximations that remain robust and reliable. A first aim will explore kernelized extensions of the basis expansion approach, which will allow careful control of the degree of approximation flexibility. A second aim will explore variational representations, allowing the introduction of neural network approximations and their extreme expressive power. Introducing this higher level of approximation flexibility while maintaining reliability and reproducibility will be a significant challenge. A third but parallel thrust will provide a careful and complete mathematical analysis of the basis expansion and kernel-based methods with an emphasis on building a theory that informs even the more complicated neural network-based approaches. All computational approaches will be extensively validated on a benchmark protein folding/unfolding data set. Their development will be accompanied by careful mathematical error analysis aimed at understanding the effect of various design choices such as the measure with respect to which the data is sampled and the length of the short simulations in the data set.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
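
The basis expansion idea described in the abstract can be illustrated with a small sketch. It is not the project's code: the toy system (a symmetric random walk on 11 states), the indicator basis, and all function names are assumptions chosen for illustration. The forecast computed here is the committor q(x), the probability of reaching the last state before the first. It satisfies q = E[q(X_1) | X_0 = x] on interior states, and the sketch solves a Galerkin form of that equation using only one-step trajectory pairs.

```python
import numpy as np

def sample_short_trajectories(n_states, n_pairs, rng):
    """Draw one-step pairs (x0, x1) of a symmetric random walk (toy assumption)."""
    x0 = rng.integers(0, n_states, size=n_pairs)          # uniform starting states
    step = rng.choice([-1, 1], size=n_pairs)               # move left or right
    x1 = np.clip(x0 + step, 0, n_states - 1)               # stay inside the state space
    return x0, x1

def indicator_basis(x, interior_states):
    """One basis function per interior state; every function vanishes on A and B."""
    return (x[:, None] == interior_states[None, :]).astype(float)

def estimate_committor(x0, x1, n_states):
    """Galerkin estimate of the committor from short-trajectory pairs.

    A = {0}, B = {n_states - 1}. Write q = guess + sum_j c_j * phi_j, where
    guess is the indicator of B, and solve the projected equation
    sum_n phi_i(x0_n) * (q(x1_n) - q(x0_n)) = 0 for the coefficients c.
    """
    interior = np.arange(1, n_states - 1)
    guess0 = (x0 == n_states - 1).astype(float)
    guess1 = (x1 == n_states - 1).astype(float)
    phi0 = indicator_basis(x0, interior)
    phi1 = indicator_basis(x1, interior)
    lhs = phi0.T @ (phi1 - phi0) / len(x0)
    rhs = -(phi0.T @ (guess1 - guess0)) / len(x0)
    coeffs = np.linalg.solve(lhs, rhs)
    q = np.zeros(n_states)
    q[interior] = coeffs
    q[-1] = 1.0                                             # boundary value on B
    return q

rng = np.random.default_rng(0)
n_states = 11
x0, x1 = sample_short_trajectories(n_states, 200_000, rng)
q_est = estimate_committor(x0, x1, n_states)
q_exact = np.arange(n_states) / (n_states - 1)              # known answer for this walk
max_err = np.max(np.abs(q_est - q_exact))
print("max abs error:", max_err)
```

No trajectory in the data set is longer than one step, yet the estimate describes an event that typically takes many steps to occur. With an indicator basis the scheme reduces to a Markov state model; the kernel and neural network approximations named in the abstract would replace `indicator_basis` with more expressive function classes.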

Project outcomes

Journal articles (8)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Data-Driven Transition Path Analysis Yields a Statistical Understanding of Sudden Stratospheric Warming Events in an Idealized Model
  • DOI:
    10.1175/jas-d-21-0213.1
  • Published:
    2023
  • Journal:
  • Impact factor:
    3.1
  • Authors:
    Finkel, Justin; Webber, Robert J.; Gerber, Edwin P.; Abbot, Dorian S.; Weare, Jonathan
  • Corresponding author:
    Weare, Jonathan
Predicting rare events using neural networks and short-trajectory data
  • DOI:
    10.1016/j.jcp.2023.112152
  • Published:
    2022-08
  • Journal:
  • Impact factor:
    4.1
  • Authors:
    J. Strahan; J. Finkel; A. Dinner; J. Weare
  • Corresponding author:
    J. Strahan; J. Finkel; A. Dinner; J. Weare
Computing transition path theory quantities with trajectory stratification
  • DOI:
    10.1063/5.0087058
  • Published:
    2022-07-21
  • Journal:
  • Impact factor:
    4.4
  • Authors:
    Vani, Bodhi P.; Weare, Jonathan; Dinner, Aaron R.
  • Corresponding author:
    Dinner, Aaron R.
Understanding and eliminating spurious modes in variational Monte Carlo using collective variables
  • DOI:
    10.1103/physrevresearch.5.023101
  • Published:
    2023
  • Journal:
  • Impact factor:
    4.2
  • Authors:
    Zhang, Huan; Webber, Robert J.; Lindsey, Michael; Berkelbach, Timothy C.; Weare, Jonathan
  • Corresponding author:
    Weare, Jonathan
Revealing the Statistics of Extreme Events Hidden in Short Weather Forecast Data
  • DOI:
    10.1029/2023av000881
  • Published:
    2022-06
  • Journal:
  • Impact factor:
    8.4
  • Authors:
    J. Finkel; E. Gerber; D. Abbot; J. Weare
  • Corresponding author:
    J. Finkel; E. Gerber; D. Abbot; J. Weare

Other publications by Jonathan Weare

The surprising efficiency of temporal difference learning for rare event prediction
  • DOI:
    10.48550/arxiv.2405.17638
  • Published:
    2024
  • Journal:
  • Impact factor:
    0
  • Authors:
    Xiaoou Cheng; Jonathan Weare
  • Corresponding author:
    Jonathan Weare

Other grants held by Jonathan Weare

Long Time Scales and Unlikely Events: Sampling and Coarse Graining Strategies
  • Award number:
    1109731
  • Fiscal year:
    2011
  • Award amount:
    $600,000
  • Award type:
    Standard Grant

Similar overseas grants

DMS/NIGMS 1: Multilevel stochastic orthogonal subspace transformations for robust machine learning with applications to biomedical data and Alzheimer's disease subtyping
  • Award number:
    2347698
  • Fiscal year:
    2024
  • Award amount:
    $600,000
  • Award type:
    Continuing Grant
Collaborative Research: DMS/NIGMS 1: Simulating cell migration with a multi-scale 3D model fed by intracellular tension sensing measurements
  • Award number:
    2347957
  • Fiscal year:
    2024
  • Award amount:
    $600,000
  • Award type:
    Standard Grant
Collaborative Research: DMS/NIGMS 1: Simulating cell migration with a multi-scale 3D model fed by intracellular tension sensing measurements
  • Award number:
    2347956
  • Fiscal year:
    2024
  • Award amount:
    $600,000
  • Award type:
    Standard Grant
DMS/NIGMS 2: Deep learning for repository-scale analysis of tandem mass spectrometry proteomics data
  • Award number:
    2245300
  • Fiscal year:
    2023
  • Award amount:
    $600,000
  • Award type:
    Continuing Grant
DMS/NIGMS 1: Viscoelasticity and Flow of Biological Condensates via Continuum Descriptions - How Droplets Coalesce and Wet Cellular Surfaces
  • Award number:
    2245850
  • Fiscal year:
    2023
  • Award amount:
    $600,000
  • Award type:
    Continuing Grant
DMS/NIGMS 2: Spatial, Multi-Host Petri Net Models for Zoonotic Disease Forecasting
  • Award number:
    10797423
  • Fiscal year:
    2023
  • Award amount:
    $600,000
  • Award type:
DMS/NIGMS 1: Multi-timescale stochastic modeling to investigate epigenetic memory in bacteria
  • Award number:
    2245816
  • Fiscal year:
    2023
  • Award amount:
    $600,000
  • Award type:
    Continuing Grant
DMS/NIGMS 1: Modeling Microbial Community Response to Invasion: A Multi-Omics and Multifacton
  • Award number:
    10794584
  • Fiscal year:
    2023
  • Award amount:
    $600,000
  • Award type:
DMS/NIGMS 2: Regulation of Cellular Stemness during the Epithelial-Mesenchymal Transition (EMT)
  • Award number:
    2245957
  • Fiscal year:
    2023
  • Award amount:
    $600,000
  • Award type:
    Continuing Grant
Collaborative Research: DMS/NIGMS 2: Novel machine-learning framework for AFMscanner in DNA-protein interaction detection
  • Award number:
    10797460
  • Fiscal year:
    2023
  • Award amount:
    $600,000
  • Award type: