DMS/NIGMS 1: Design and Analysis of Machine Learning Approaches for Long Timescale Prediction from Short Trajectory Data

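Basic information

  • Award number:
    2054306
  • Principal investigator:
  • Amount:
    $600,000
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Continuing Grant
  • Fiscal year:
    2021
  • Funding country:
    United States
  • Project period:
    2021-09-01 to 2024-08-31
  • Project status:
    Completed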

Project abstract

Events like the changes in molecular interactions underlying functions in our bodies, or the extremely intense hurricanes that cause devastating damage to our coastal cities, are difficult to study computationally because they occur very infrequently on feasible simulation timescales. Remarkably, it is theoretically possible to bypass this issue by appealing to certain high-dimensional equations characterizing long-timescale statistics in terms of short-timescale properties. Leveraging these equations and new tools in statistical and machine learning to solve them, this project introduces algorithms to learn long-time statistics using a data set consisting of many short simulations. The development of these algorithms will be informed by mathematical analysis aimed at fully characterizing their potential utility. The research will complement and support significant and diverse applications of critical societal interest. Examples include studies of protein assemblies relevant to treating diabetes and wound healing and changes in atmospheric conditions that lead to polar vortices and tropical cyclones. Because the methods developed in this project avoid the need to make simplifying assumptions when formulating the models, they promise to reveal the underlying physical mechanisms of these and other processes in unprecedented detail. The project will provide opportunities to graduate students to be involved in the research.

This project concerns the development and analysis of a family of algorithms that assemble forecasts of events occurring over extremely long times using only a data set consisting of short trajectories. In this approach, forecasts (conditional expectations of future behavior) are cast as solutions to equations involving the operator determining the statistics of the underlying dynamical system. Building on significant and promising preliminary efforts employing a basis expansion approximation of the target predictive functions, this project will develop more expressive approximations that remain robust and reliable. A first aim will explore kernelized extensions of the basis expansion approach, which will allow careful control of the degree of approximation flexibility. A second aim will explore variational representations, allowing the introduction of neural network approximations and their extreme expressive power. Introducing this higher level of approximation flexibility while maintaining reliability and reproducibility will be a significant challenge. A third but parallel thrust will provide a careful and complete mathematical analysis of the basis expansion and kernel-based methods with an emphasis on building a theory that informs even the more complicated neural network-based approaches. All computational approaches will be extensively validated on a benchmark protein folding/unfolding data set. Their development will be accompanied by careful mathematical error analysis aimed at understanding the effect of various design choices such as the measure with respect to which the data is sampled and the length of the short simulations in the data set.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
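
As a rough illustration of the idea described in the abstract (a forecast written as the solution of an operator equation and approximated by a basis expansion fitted to short trajectories), here is a minimal sketch. It is not the project's code. Everything in it is an assumption chosen for illustration: the toy system (a biased random walk on 11 states), the forecast (the committor, i.e. the probability of reaching the right end before the left end), the data (one-step trajectory pairs), the indicator basis functions, and all function names.

```python
import random


def simulate_pairs(n_states, p_up, n_pairs, rng):
    """Draw one-step 'short trajectories' (x0, x1), x0 uniform over interior states."""
    pairs = []
    for _ in range(n_pairs):
        x0 = rng.randrange(1, n_states - 1)
        x1 = x0 + 1 if rng.random() < p_up else x0 - 1
        pairs.append((x0, x1))
    return pairs


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= f * aug[col][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(aug[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x


def estimate_committor(pairs, n_states):
    """Galerkin (basis expansion) estimate of the committor q.

    q is written as guess + sum_j c_j * phi_j, where guess carries the
    boundary values (0 at state 0, 1 at the last state) and every phi_j
    vanishes on the boundary. The coefficients solve the sampled version of
    <phi_i, (T - I) q> = 0, with T the one-step transition operator.
    """
    interior = range(1, n_states - 1)
    basis = [lambda x, j=j: 1.0 if x == j else 0.0 for j in interior]
    guess = lambda x: 1.0 if x == n_states - 1 else 0.0
    m = len(basis)
    lhs = [[0.0] * m for _ in range(m)]
    rhs = [0.0] * m
    for x0, x1 in pairs:
        for i, phi_i in enumerate(basis):
            w = phi_i(x0)
            if w == 0.0:
                continue
            rhs[i] -= w * (guess(x1) - guess(x0))
            for j, phi_j in enumerate(basis):
                lhs[i][j] += w * (phi_j(x1) - phi_j(x0))
    coeffs = solve_linear(lhs, rhs)
    return [
        guess(x) + sum(c * phi(x) for c, phi in zip(coeffs, basis))
        for x in range(n_states)
    ]


N_STATES, P_UP = 11, 0.6
rng = random.Random(0)
pairs = simulate_pairs(N_STATES, P_UP, 100_000, rng)
q_est = estimate_committor(pairs, N_STATES)

# Closed-form committor of the biased walk, used only to check the estimate.
r = (1 - P_UP) / P_UP
q_exact = [(1 - r**i) / (1 - r ** (N_STATES - 1)) for i in range(N_STATES)]
max_err = max(abs(a - b) for a, b in zip(q_est, q_exact))
print(f"max abs error vs closed form: {max_err:.4f}")
```

No trajectory in this data set is longer than one step, yet the fitted function describes an event that can take many steps to resolve. With indicator basis functions the estimate reduces to a simple Markov-model calculation. The kernel and neural-network extensions named in the abstract would replace that basis with more flexible function classes and are not shown here.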

Project outcomes

Journal articles (8)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Data-Driven Transition Path Analysis Yields a Statistical Understanding of Sudden Stratospheric Warming Events in an Idealized Model
  • DOI:
    10.1175/jas-d-21-0213.1
  • Publication date:
    2023
  • Journal:
  • Impact factor:
    3.1
  • Authors:
    Finkel, Justin; Webber, Robert J.; Gerber, Edwin P.; Abbot, Dorian S.; Weare, Jonathan
  • Corresponding author:
    Weare, Jonathan
Predicting rare events using neural networks and short-trajectory data
  • DOI:
    10.1016/j.jcp.2023.112152
  • Publication date:
    2022-08
  • Journal:
  • Impact factor:
    4.1
  • Authors:
    J. Strahan; J. Finkel; A. Dinner; J. Weare
  • Corresponding author:
    J. Strahan; J. Finkel; A. Dinner; J. Weare
Computing transition path theory quantities with trajectory stratification
  • DOI:
    10.1063/5.0087058
  • Publication date:
    2022-07-21
  • Journal:
  • Impact factor:
    4.4
  • Authors:
    Vani, Bodhi P.; Weare, Jonathan; Dinner, Aaron R.
  • Corresponding author:
    Dinner, Aaron R.
Understanding and eliminating spurious modes in variational Monte Carlo using collective variables
  • DOI:
    10.1103/physrevresearch.5.023101
  • Publication date:
    2023
  • Journal:
  • Impact factor:
    4.2
  • Authors:
    Zhang, Huan; Webber, Robert J.; Lindsey, Michael; Berkelbach, Timothy C.; Weare, Jonathan
  • Corresponding author:
    Weare, Jonathan
Revealing the Statistics of Extreme Events Hidden in Short Weather Forecast Data
  • DOI:
    10.1029/2023av000881
  • Publication date:
    2022-06
  • Journal:
  • Impact factor:
    8.4
  • Authors:
    J. Finkel; E. Gerber; D. Abbot; J. Weare
  • Corresponding author:
    J. Finkel; E. Gerber; D. Abbot; J. Weare

Other publications by Jonathan Weare

The surprising efficiency of temporal difference learning for rare event prediction
  • DOI:
    10.48550/arxiv.2405.17638
  • Publication date:
    2024
  • Journal:
  • Impact factor:
    0
  • Authors:
    Xiaoou Cheng; Jonathan Weare
  • Corresponding author:
    Jonathan Weare

Other grants held by Jonathan Weare

Long Time Scales and Unlikely Events: Sampling and Coarse Graining Strategies
  • Award number:
    1109731
  • Fiscal year:
    2011
  • Funding amount:
    $600,000
  • Project type:
    Standard Grant

Similar overseas grants

DMS/NIGMS 1: Multilevel stochastic orthogonal subspace transformations for robust machine learning with applications to biomedical data and Alzheimer's disease subtyping
  • Award number:
    2347698
  • Fiscal year:
    2024
  • Funding amount:
    $600,000
  • Project type:
    Continuing Grant
Collaborative Research: DMS/NIGMS 1: Simulating cell migration with a multi-scale 3D model fed by intracellular tension sensing measurements
  • Award number:
    2347957
  • Fiscal year:
    2024
  • Funding amount:
    $600,000
  • Project type:
    Standard Grant
Collaborative Research: DMS/NIGMS 1: Simulating cell migration with a multi-scale 3D model fed by intracellular tension sensing measurements
  • Award number:
    2347956
  • Fiscal year:
    2024
  • Funding amount:
    $600,000
  • Project type:
    Standard Grant
DMS/NIGMS 2: Deep learning for repository-scale analysis of tandem mass spectrometry proteomics data
  • Award number:
    2245300
  • Fiscal year:
    2023
  • Funding amount:
    $600,000
  • Project type:
    Continuing Grant
DMS/NIGMS 1: Viscoelasticity and Flow of Biological Condensates via Continuum Descriptions - How Droplets Coalesce and Wet Cellular Surfaces
  • Award number:
    2245850
  • Fiscal year:
    2023
  • Funding amount:
    $600,000
  • Project type:
    Continuing Grant
DMS/NIGMS 2: Spatial, Multi-Host Petri Net Models for Zoonotic Disease Forecasting
  • Award number:
    10797423
  • Fiscal year:
    2023
  • Funding amount:
    $600,000
  • Project type:
DMS/NIGMS 1: Multi-timescale stochastic modeling to investigate epigenetic memory in bacteria
  • Award number:
    2245816
  • Fiscal year:
    2023
  • Funding amount:
    $600,000
  • Project type:
    Continuing Grant
DMS/NIGMS 1: Modeling Microbial Community Response to Invasion: A Multi-Omics and Multifacton
  • Award number:
    10794584
  • Fiscal year:
    2023
  • Funding amount:
    $600,000
  • Project type:
DMS/NIGMS 2: Regulation of Cellular Stemness during the Epithelial-Mesenchymal Transition (EMT)
  • Award number:
    2245957
  • Fiscal year:
    2023
  • Funding amount:
    $600,000
  • Project type:
    Continuing Grant
Collaborative Research: DMS/NIGMS 2: Novel machine-learning framework for AFMscanner in DNA-protein interaction detection
  • Award number:
    10797460
  • Fiscal year:
    2023
  • Funding amount:
    $600,000
  • Project type: