Collaborative Research: Nonparametric Bayesian Aggregation for Massive Data
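Basic information
- Award number: 2005746
- Principal investigator:
- Amount: $30,600
- Host institution:
- Host institution country: United States
- Project type: Continuing Grant
- Fiscal year: 2019
- Funding country: United States
- Period: 2019-08-01 to 2020-08-31
- Project status: Completed
- Source:
- Keywords:
Project abstract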
Modern massive data appear in increasing volume and with high heterogeneity. Examples include internet searches, social networks, mobile devices, satellites, genomics, and medical scans. Bayesian approaches are particularly useful in such contexts because the complex structures in the data can be naturally incorporated into Bayesian hierarchical models. In addition, uncertainty quantification can be carried out easily through Bayesian computation. However, due to storage and computational bottlenecks, traditional Bayesian computation implemented on a single machine is no longer applicable to modern massive data. In this project, a set of nonparametric Bayesian aggregation procedures with theoretical justifications is developed based on a standard parallel computing strategy known as Divide-and-Conquer. This research will significantly enhance the availability of Bayesian tools and software for analyzing massive data. The educational plan of the project will take the form of graduate student advising and special topics courses.
This project consists of three major components. First, the PIs will establish a Gaussian approximation of general nonparametric posterior distributions, which serves as a theoretical foundation for general distributed Bayesian algorithms. Second, the PIs will develop a nonparametric Bayesian aggregation procedure with theoretical guarantees that is particularly useful for handling massive data in a parallel fashion. Third, the PIs will develop an efficient parallel Markov Chain Monte Carlo (MCMC) algorithm for nonparametric Bayesian models that will perform as well as traditional MCMC at substantially lower computational cost. This research will lead to the emergence of "Splitotics (Split+Asymptotics) Theory", providing theoretical guidelines for Bayesian practice. The smoothing spline inference results recently obtained by the PIs will be used as a promising tool for achieving the above goals.
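The Divide-and-Conquer idea in the abstract can be illustrated with a toy example. The sketch below is not the PIs' nonparametric procedure; it assumes a much simpler conjugate model (normal data with known noise level `sigma` and a normal prior on the mean), and all function names are hypothetical. Each shard is given the prior with its variance inflated by the number of shards, and the Gaussian subposteriors are then combined by precision weighting. In this simple model the aggregated posterior coincides with the single-machine posterior.

```python
import random


def gaussian_posterior(ys, sigma, prior_mean, prior_var):
    """Posterior (mean, variance) of theta for y_i ~ N(theta, sigma^2)
    with prior theta ~ N(prior_mean, prior_var). Assumed toy model."""
    precision = 1.0 / prior_var + len(ys) / sigma**2
    mean = (prior_mean / prior_var + sum(ys) / sigma**2) / precision
    return mean, 1.0 / precision


def split(ys, num_shards):
    """Divide step: deal the observations into num_shards shards."""
    return [ys[j::num_shards] for j in range(num_shards)]


def aggregate(subposteriors):
    """Conquer step: multiply Gaussian subposteriors, which amounts to a
    precision-weighted average of their means."""
    precisions = [1.0 / var for _, var in subposteriors]
    total = sum(precisions)
    mean = sum(p * m for p, (m, _) in zip(precisions, subposteriors)) / total
    return mean, 1.0 / total


def divide_and_conquer_posterior(ys, sigma, prior_mean, prior_var, num_shards):
    """Compute one subposterior per shard, then aggregate them.
    Inflating the prior variance by num_shards gives each shard a
    1/num_shards share of the prior, so the prior is not counted twice."""
    shards = split(ys, num_shards)
    subposteriors = [
        gaussian_posterior(shard, sigma, prior_mean, num_shards * prior_var)
        for shard in shards
    ]
    return aggregate(subposteriors)


if __name__ == "__main__":
    rng = random.Random(0)
    data = [rng.gauss(2.0, 1.0) for _ in range(10_000)]
    full = gaussian_posterior(data, 1.0, 0.0, 10.0)
    combined = divide_and_conquer_posterior(data, 1.0, 0.0, 10.0, num_shards=8)
    print("single machine:", full)
    print("aggregated    :", combined)
```

The exact agreement relies on conjugacy. For general nonparametric posteriors such agreement can hold only approximately, which is the role the abstract assigns to the Gaussian approximation of the posterior.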
Project outcomes
Journal papers (3)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Nonparametric Bayesian Aggregation for Massive Data
- DOI:
- Published: 2015-08
- Journal:
- Impact factor: 0
- Authors: Zuofeng Shang; Botao Hao; Guang Cheng
- Corresponding authors: Zuofeng Shang; Botao Hao; Guang Cheng
An MM algorithm for estimation of a two component semiparametric density mixture with a known component
- DOI: 10.1214/18-ejs1417
- Published: 2018
- Journal:
- Impact factor: 1.1
- Authors: Shen, Zhou; Levine, Michael; Shang, Zuofeng
- Corresponding author: Shang, Zuofeng
Non-asymptotic Analysis for Nonparametric Testing
- DOI:
- Published: 2020-07
- Journal:
- Impact factor: 0
- Authors: Yun Yang; Zuofeng Shang; Guang Cheng
- Corresponding authors: Yun Yang; Zuofeng Shang; Guang Cheng
Other publications by Zuofeng Shang
Statistica Sinica Preprint No: SS-2022-0057
- DOI:
- Published:
- Journal:
- Impact factor: 0
- Authors: Shuoyang Wang; Zuofeng Shang; Guanqun Cao; Jun Liu
- Corresponding author: Jun Liu
Empirical likelihood test for community structure in networks
- DOI:
- Published: 2023
- Journal:
- Impact factor: 0
- Authors: Mingao Yuan; Sharmin Hossain; Zuofeng Shang
- Corresponding author: Zuofeng Shang
Testing community structure for hypergraphs
- DOI:
- Published: 2022
- Journal:
- Impact factor: 4.5
- Authors: Mingao Yuan; Ruiqi Liu; Yang Feng; Zuofeng Shang
- Corresponding author: Zuofeng Shang
Sharp detection boundaries on testing dense subhypergraph
- DOI:
- Published: 2021
- Journal:
- Impact factor: 1.5
- Authors: Mingao Yuan; Zuofeng Shang
- Corresponding author: Zuofeng Shang
A Fast Non-Linear Coupled Tensor Completion Algorithm for Financial Data Integration and Imputation
- DOI: 10.1145/3604237.3626899
- Published: 2023
- Journal:
- Impact factor: 0
- Authors: D. Zhou; Ajim Uddin; Zuofeng Shang; C. Sylla; Xinyuan Tao; Dantong Yu
- Corresponding author: Dantong Yu
Other grants by Zuofeng Shang
CDS&E: Collaborative Research: Scalable Nonparametric Learning for Massive Data with Statistical Guarantees
- Award number: 2005779
- Fiscal year: 2019
- Amount: $30,600
- Project type: Standard Grant
CDS&E: Collaborative Research: Scalable Nonparametric Learning for Massive Data with Statistical Guarantees
- Award number: 1821157
- Fiscal year: 2018
- Amount: $30,600
- Project type: Standard Grant
Collaborative Research: Nonparametric Bayesian Aggregation for Massive Data
- Award number: 1764280
- Fiscal year: 2017
- Amount: $30,600
- Project type: Continuing Grant
Collaborative Research: Nonparametric Bayesian Aggregation for Massive Data
- Award number: 1712919
- Fiscal year: 2017
- Amount: $30,600
- Project type: Continuing Grant
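Similar grants from the National Natural Science Foundation of China
Research on Quantum Field Theory without a Lagrangian Description
- Award number: 24ZR1403900
- Approval year: 2024
- Amount: CNY 0
- Project type: Provincial or municipal project
Cell Research
- Award number: 31224802
- Approval year: 2012
- Amount: CNY 240,000
- Project type: Special fund project
Cell Research
- Award number: 31024804
- Approval year: 2010
- Amount: CNY 240,000
- Project type: Special fund project
Cell Research
- Award number: 30824808
- Approval year: 2008
- Amount: CNY 240,000
- Project type: Special fund project
Research on the Rapid Growth Mechanism of KDP Crystal
- Award number: 10774081
- Approval year: 2007
- Amount: CNY 450,000
- Project type: General Program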
Similar overseas grants
CDS&E: Collaborative Research: Scalable Nonparametric Learning for Massive Data with Statistical Guarantees
- Award number: 2005779
- Fiscal year: 2019
- Amount: $30,600
- Project type: Standard Grant
Collaborative Research: New Bayesian Nonparametric Paradigms of Personalized Medicine for Lung Cancer
- Award number: 1922567
- Fiscal year: 2018
- Amount: $30,600
- Project type: Continuing Grant
CDS&E: Collaborative Research: Scalable Nonparametric Learning for Massive Data with Statistical Guarantees
- Award number: 1821157
- Fiscal year: 2018
- Amount: $30,600
- Project type: Standard Grant
Collaborative Research: New Bayesian Nonparametric Paradigms of Personalized Medicine for Lung Cancer
- Award number: 1854003
- Fiscal year: 2018
- Amount: $30,600
- Project type: Continuing Grant
CDS&E: Collaborative Research: Scalable Nonparametric Learning for Massive Data with Statistical Guarantees
- Award number: 1821183
- Fiscal year: 2018
- Amount: $30,600
- Project type: Standard Grant
Collaborative Research: Nonparametric Bayesian Aggregation for Massive Data
- Award number: 1764280
- Fiscal year: 2017
- Amount: $30,600
- Project type: Continuing Grant
Collaborative Research: Nonparametric Bayesian Aggregation for Massive Data
- Award number: 1712907
- Fiscal year: 2017
- Amount: $30,600
- Project type: Continuing Grant
Collaborative Research: Nonparametric Bayesian Aggregation for Massive Data
- Award number: 1712919
- Fiscal year: 2017
- Amount: $30,600
- Project type: Continuing Grant
Collaborative Research: Honest Inference and Efficiency Bounds for Nonparametric Regression and Approximate Moment Condition Models
- Award number: 1628878
- Fiscal year: 2016
- Amount: $30,600
- Project type: Standard Grant
Collaborative Research: Honest Inference and Efficiency Bounds for Nonparametric Regression and Approximate Moment Condition Models
- Award number: 1628939
- Fiscal year: 2016
- Amount: $30,600
- Project type: Standard Grant