On Statistical Modeling and Parameter Estimation for High Dimensional Systems
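Basic information
- Award number: 1612924
- Principal investigator:
- Amount: $150,000
- Host institution:
- Host institution country: United States
- Award type: Standard Grant
- Fiscal year: 2016
- Funding country: United States
- Period: 2016-09-01 to 2018-01-31
- Status: Completed
- Source:
- Keywords:
Project abstract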
The dramatic improvements in data collection and acquisition technologies over the last decades have enabled scientists to collect massive amounts of high-dimensional data that allow for monitoring and studying of complex systems. Due to their intrinsic nature, many of the high-dimensional datasets, such as omics and genome-wide association study (GWAS) data, have a much smaller sample size compared to the dimension (referred to as the small-n-large-P problem). Current research on statistical modeling of small-n-large-P data focuses on linear and generalized linear models. However, these approaches are often not adequate for modeling complex systems, and estimation of the model parameters is challenging. This project addresses two fundamental problems, statistical modeling and parameter estimation, toward a valid statistical analysis of high-dimensional data. Successful completion of this project will generate hands-on tools for statistical inference of high-dimensional complex systems, which can benefit researchers in many areas of science and technology. In particular, the proposed applications to biomedical studies will lead to accurate tools for detecting biomarkers associated with disease processes and tailoring optimal therapy for individual patients with complex diseases. The research results will be disseminated to the statistical and biomedical communities, via collaboration, conference presentations, books, and articles to be published in academic journals. The project will also have significant impact on education through the involvement of graduate students in the project, and incorporation of results into undergraduate and graduate courses. 
In addition, the R package developed under this project will provide a valuable tool for statistical analysis of high-dimensional data.
The current approach to modeling small-n-large-P data focuses on linear and generalized linear models, and casts the problem as variable selection by imposing a sparsity constraint on parameter values. Although these models have many advantages, such as simplicity and computational efficiency, estimation of the parameters is still a challenging problem. While regularization is often used in these situations, it can perform poorly when the sample size is small and the variables are highly correlated. Two new methods are proposed to address these concerns, namely, Bayesian neural network (BNN) and blockwise coordinate consistency (BCC). The BNN method works by first fitting the data with a feed-forward neural network, conducting variable selection through network structure selection under a Bayesian framework, and resolving the associated computational difficulty via parallel computing. Compared to existing methods, BNN can lead to much more precise selection of relevant variables and outcome prediction for high-dimensional nonlinear systems. The BCC method works by maximizing a new objective function, the expectation of the log-likelihood function, using a cyclic algorithm and iteratively finding consistent estimates for each block of parameters conditional on the current estimates of the other parameters. The BCC method reduces the high-dimensional parameter estimation problem to a series of low-dimensional parameter estimation problems. The preliminary results indicate that BCC can provide a drastic improvement in both parameter estimation and variable selection over regularization methods. The validity of the proposed methods will be rigorously studied and applied to biomarker discovery, precision medicine, and joint estimation of the regression coefficients and precision matrix for high-dimensional multivariate regression.
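The cyclic, block-by-block idea described in the abstract can be illustrated with a toy sketch. This is not the project's BCC algorithm: the abstract does not give its objective or its consistent per-block estimators, so the sketch swaps in ordinary least squares on a small, noiseless regression with more samples than variables. The function names `solve` and `blockwise_fit`, the block layout, and the data are all hypothetical. It only shows how one higher-dimensional fit can be replaced by repeated low-dimensional fits, each conditional on the current values of the other blocks.

```python
import random

def solve(A, b):
    """Solve a small linear system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def blockwise_fit(X, y, blocks, sweeps=50):
    """Cycle over blocks, refitting one block of coefficients by least squares
    while all other blocks are held at their current estimates."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(sweeps):
        for block in blocks:
            # Partial residual: subtract the contribution of every other block.
            others = [j for j in range(p) if j not in block]
            r = [y[i] - sum(X[i][j] * beta[j] for j in others) for i in range(n)]
            # Normal equations restricted to this block (a low-dimensional problem).
            A = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in block] for a in block]
            c = [sum(X[i][a] * r[i] for i in range(n)) for a in block]
            for j, v in zip(block, solve(A, c)):
                beta[j] = v
    return beta

# Hypothetical toy data: 40 samples, 4 variables, no noise.
random.seed(0)
true_beta = [2.0, -1.0, 0.0, 0.5]
X = [[random.gauss(0, 1) for _ in range(4)] for _ in range(40)]
y = [sum(x * b for x, b in zip(row, true_beta)) for row in X]

beta_hat = blockwise_fit(X, y, blocks=[[0, 1], [2, 3]])
print([round(b, 3) for b in beta_hat])
```

Each sweep solves only 2-by-2 systems, yet on this well-conditioned toy problem the estimates should approach the coefficients used to generate the data. The small-n-large-P setting targeted by the project is harder, which is presumably why the abstract calls for a new objective and consistent per-block estimates rather than plain least squares.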
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Faming Liang
Bayesian phylogeny analysis via stochastic approximation Monte Carlo
- DOI: 10.1016/j.ympev.2009.06.019
- Published: 2009-11-01
- Journal:
- Impact factor:
- Authors: Sooyoung Cheon; Faming Liang
- Corresponding author: Faming Liang
Networks Involved in Coronary Collateral Formation
- DOI:
- Published:
- Journal:
- Impact factor: 0
- Authors: Jian Zhang; J. Regieli; M. Schipper; M. M. Entius; Faming Liang; J. Koerselman; H. J. Ruven; Yolanda van der Graaf; D. Grobbee; Pieter A. Doevendans
- Corresponding author: Pieter A. Doevendans
A New Paradigm for Generative Adversarial Networks Based on Randomized Decision Rules
- DOI:
- Published: 2023
- Journal:
- Impact factor: 1.4
- Authors: Sehwan Kim; Qifan Song; Faming Liang
- Corresponding author: Faming Liang
An extended Langevinized ensemble Kalman filter for non-Gaussian dynamic systems
- DOI:
- Published: 2023
- Journal:
- Impact factor: 0
- Authors: Peiyi Zhang; Tianning Dong; Faming Liang
- Corresponding author: Faming Liang
Fast Value Tracking for Deep Reinforcement Learning
- DOI: 10.48550/arxiv.2403.13178
- Published: 2024
- Journal:
- Impact factor: 0
- Authors: Frank Shih; Faming Liang
- Corresponding author: Faming Liang
Other grants by Faming Liang
A New Stochastic Neural Network: Statistical Perspectives and Applications
- Award number: 2210819
- Fiscal year: 2022
- Amount: $150,000
- Award type: Standard Grant
Scalable Algorithms for Bayesian On-Line Learning with Large-Scale Dynamic Data
- Award number: 2015498
- Fiscal year: 2020
- Amount: $150,000
- Award type: Standard Grant
Statistical Inference for Biomedical Big Data: Theory, Methods, and Tools
- Award number: 1703077
- Fiscal year: 2017
- Amount: $150,000
- Award type: Standard Grant
On Statistical Modeling and Parameter Estimation for High Dimensional Systems
- Award number: 1818674
- Fiscal year: 2017
- Amount: $150,000
- Award type: Standard Grant
Monte Carlo Methods for Analysis of Large Spatial Data
- Award number: 1545738
- Fiscal year: 2015
- Amount: $150,000
- Award type: Standard Grant
Collaborative Research: Efficient Parallel Iterative Monte Carlo Methods for Statistical Analysis of Big Data
- Award number: 1545202
- Fiscal year: 2015
- Amount: $150,000
- Award type: Standard Grant
Collaborative Research: Efficient Parallel Iterative Monte Carlo Methods for Statistical Analysis of Big Data
- Award number: 1317131
- Fiscal year: 2013
- Amount: $150,000
- Award type: Standard Grant
Monte Carlo Methods for Analysis of Large Spatial Data
- Award number: 1106494
- Fiscal year: 2011
- Amount: $150,000
- Award type: Standard Grant
Sampling from Distributions with Intractable Integrals
- Award number: 1007457
- Fiscal year: 2010
- Amount: $150,000
- Award type: Continuing Grant
Development of Stochastic Approximation Monte Carlo Methods
- Award number: 0706755
- Fiscal year: 2007
- Amount: $150,000
- Award type: Standard Grant
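Similar NSFC (National Natural Science Foundation of China) grants
Galaxy Analytical Modeling Evolution (GAME) and cosmological hydrodynamic simulations.
- Award number:
- Approval year: 2025
- Amount: CNY 100,000
- Award type: Provincial/municipal-level project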
Similar overseas grants
ERI: Intelligent Modeling and Parameter Selection in Distributed Optimization for Power Networks
- Award number: 2347120
- Fiscal year: 2024
- Amount: $150,000
- Award type: Standard Grant
Establishing parameter estimation theory of stochastic differential equations for advanced modeling of life systems
- Award number: 20K12059
- Fiscal year: 2020
- Amount: $150,000
- Award type: Grant-in-Aid for Scientific Research (C)
Collaborative Research: Explorations of Salt Finger Convection in the Extreme Oceanic Parameter Regime: An Asymptotic Modeling Approach.
- Award number: 2023499
- Fiscal year: 2020
- Amount: $150,000
- Award type: Standard Grant
Collaborative Research: Explorations of Salt Finger Convection in the Extreme Oceanic Parameter Regime: An Asymptotic Modeling Approach.
- Award number: 2023541
- Fiscal year: 2020
- Amount: $150,000
- Award type: Standard Grant
Mathematical Modeling and Advanced Parameter Estimation for Polymerization Processes
- Award number: RGPIN-2015-03668
- Fiscal year: 2019
- Amount: $150,000
- Award type: Discovery Grants Program - Individual
Modeling, Identification, and Estimation of Distributed Parameter Systems Using Mobile Sensor Networks
- Award number: 1917300
- Fiscal year: 2018
- Amount: $150,000
- Award type: Standard Grant
Mathematical modeling and parameter identification of human-bicycle balance fluctuations
- Award number: 18H01391
- Fiscal year: 2018
- Amount: $150,000
- Award type: Grant-in-Aid for Scientific Research (B)
Parameter Optimization for Modeling of Hydraulic Fracturing
- Award number: 529189-2018
- Fiscal year: 2018
- Amount: $150,000
- Award type: Alexander Graham Bell Canada Graduate Scholarships - Master's
Mathematical Modeling and Advanced Parameter Estimation for Polymerization Processes
- Award number: RGPIN-2015-03668
- Fiscal year: 2018
- Amount: $150,000
- Award type: Discovery Grants Program - Individual
Modeling, Identification, and Estimation of Distributed Parameter Systems Using Mobile Sensor Networks
- Award number: 1663073
- Fiscal year: 2017
- Amount: $150,000
- Award type: Standard Grant