Data-driven statistical dynamical modeling: Shortage of training data and high-dimensionality
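Basic information
- Award number: 2207328
- Principal investigator:
- Amount: $300,000
- Host institution:
- Host institution country: United States
- Project type: Standard Grant
- Fiscal year: 2022
- Funding country: United States
- Period: 2022-08-01 to 2025-07-31
- Project status: Not yet concluded
- Source:
- Keywords:
Project abstract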
Today, machine learning is a prominent scientific computing tool with many practical applications. Notable successes include image classification and the Artificial Intelligence (AI) Go player that beat the best human player in the world. While these successes are important milestones, there is an emerging need to replicate them in the statistical modeling of time-evolving complex systems, with examples ranging from climate prediction to nanomaterials under external disturbances. The goal of this project is to develop next-generation mathematical and algorithmic tools to overcome two important issues in extending machine learning to such problems, namely a shortage of informative data for effective learning and high computational costs. This objective will be addressed through theoretical and algorithmic developments in computational mathematics, leveraging fundamental knowledge from the basic sciences, including geometry, dynamical systems, data science, and statistics. This project will contribute to the NSF mission of advancing STEM through the training of graduate students and through curricular development, namely the design of courses in the mathematical theory of machine learning. In particular, this project will support one graduate student.
The goals of this project are to overcome the shortage of training data and to exploit the manifold assumption to avoid the curse of dimensionality in the statistical modeling of dynamical systems. Beyond uncertainty quantification (UQ) applications, a statistical closure model will be developed to enhance the training of ML-based prediction models when the observed time series is too short for accurate estimation. Specifically, the proposed projects are:
1) To develop a systematic reduced-order statistical closure model. This project extends the recently developed ML-based non-Markovian closure framework for accurate predictions of statistical responses subjected to unseen external forcings, which is important for UQ.
2) To develop a dimensionality reduction technique that respects the geometry of the data under a manifold assumption on the dynamical variables. The approach includes an accurate Radial Basis Function approximation to the Bochner Laplacian from the embedded data. Subsequently, the estimated eigen-vector-fields will be used as a frame to represent the vector fields corresponding to the unresolved dynamics. This model reduction framework provides a computationally cheaper alternative to deep learning.
3) To study the theoretical convergence property of a recently developed algorithm, Bayesian Machine Learning (BML), which uses solutions of a statistically consistent model to enhance the training of the neural network (NN) model in learning non-Markovian dynamics with a short observational time series. This study is motivated by a recent empirical finding that the NN model obtained from the BML training algorithm improves the accuracy of the El Niño prediction by at least two months compared to the same NN architecture trained using the standard stochastic gradient descent algorithm. The ultimate goal of this study is to evaluate and develop a theoretical understanding of the effectiveness of the statistical closure model from Task 1) to enhance Bayesian Machine Learning.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Project outcomes
Journal articles (2)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
A data-driven statistical-stochastic surrogate modeling strategy for complex nonlinear non-stationary dynamics
- DOI: 10.1016/j.jcp.2023.112085
- Publication year: 2023
- Journal:
- Impact factor: 4.1
- Authors: Qi, Di; Harlim, John
- Corresponding author: Harlim, John
Machine learning-based statistical closure models for turbulent dynamical systems
- DOI: 10.1098/rsta.2021.0205
- Publication year: 2022
- Journal:
- Impact factor: 0
- Authors: Qi, Di; Harlim, John
- Corresponding author: Harlim, John
Other grants by John Harlim
FRG: Collaborative Research: Non-Smooth Geometry, Spectral Theory, and Data: Learning and Representing Projections of Complex Systems
- Award number: 1854299
- Fiscal year: 2019
- Funding amount: $300,000
- Project type: Standard Grant
Data-driven Modeling of Equilibrium and Non-equilibrium Statistics
- Award number: 1619661
- Fiscal year: 2016
- Funding amount: $300,000
- Project type: Standard Grant
Practical Filtering Methods with Model Errors
- Award number: 1317919
- Fiscal year: 2013
- Funding amount: $300,000
- Project type: Standard Grant
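Similar grants from the National Natural Science Foundation of China
Data-driven Recommendation System Construction of an Online Medical Platform Based on the Fusion of Information
- Award number:
- Approval year: 2024
- Funding amount:
- Project type: Research Fund for International Young Scientists
Research on cache-based remote timing attacks
- Award number: 60772082
- Approval year: 2007
- Funding amount: CNY 280,000
- Project type: General Program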
Similar overseas grants
Revealing mechanisms of specificity and adaptability in molecular information processing through data-driven models
- Award number: 10715575
- Fiscal year: 2023
- Funding amount: $300,000
- Project type:
Identifying genetically driven gene dysregulation in Alzheimer's disease and related dementias using statistical data integration
- Award number: 10659349
- Fiscal year: 2023
- Funding amount: $300,000
- Project type:
Unsupervised Statistical Methods for Data-driven Analyses in Spatially Resolved Transcriptomics Data
- Award number: 10556351
- Fiscal year: 2022
- Funding amount: $300,000
- Project type:
Resolving single-cell analysis challenges via data-driven decision frameworks and novel statistical methods
- Award number: 10707308
- Fiscal year: 2022
- Funding amount: $300,000
- Project type:
Data-driven optimization for DBS programming in temporal lobe epilepsy
- Award number: 10574839
- Fiscal year: 2022
- Funding amount: $300,000
- Project type:
Data-Driven Approaches to Identify Biomarkers for Guiding Coronary Artery Bifurcation Lesion Interventions from Patient-Specific Hemodynamic Models
- Award number: 10373696
- Fiscal year: 2022
- Funding amount: $300,000
- Project type:
A novel data-driven approach for personalizing smoking cessation pharmacotherapy
- Award number: 10437438
- Fiscal year: 2022
- Funding amount: $300,000
- Project type:
A novel data-driven approach for personalizing smoking cessation pharmacotherapy
- Award number: 10578721
- Fiscal year: 2022
- Funding amount: $300,000
- Project type:













