Iterative Algorithms for Statistics: From Convergence Rates to Statistical Accuracy
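Basic information
- Award number: 2301050
- Principal investigator:
- Amount: $300,000
- Host institution:
- Host institution country: United States
- Project type: Continuing Grant
- Fiscal year: 2022
- Funding country: United States
- Period: 2022-10-01 to 2023-06-30
- Project status: Completed
- Source:
- Keywords: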
Project abstract
Science, engineering, and industry are all being revolutionized by the modern era of data science, in which increasingly large and rich forms of data are now available. The applications are diverse and broadly significant, including data-driven discovery in astronomy, statistical machine learning approaches to drug design, and decision-making in robotics and automated driving, among many others. This grant supports research on techniques and models for learning from such massive datasets, leading to computationally efficient algorithms that can be scaled to the large problem instances encountered in practice. The PI plans to integrate research and education through the involvement of graduate students in the research, the inclusion of the research results in courses at UC Berkeley and in publicly available web-based course materials, as well as in mini courses at summer schools and workshops. This project will also provide mentoring and support for graduate students and postdocs who are female or belong to URM communities.
Many estimates in statistics are defined via an iterative algorithm applied to a data-dependent objective function (e.g., the EM algorithm for missing data and latent variable models; gradient-based methods and Newton's method for M-estimation; boosting algorithms used in non-parametric regression). This project pursues several research thrusts centered on exploiting the dynamics of these algorithms to answer statistical questions, with applications to statistical parameter estimation; selection of the number of components in a mixture model; and optimal bias-variance trade-offs in non-parametric regression.
In more detail, the aims of this project include (i) providing a general analysis of the EM algorithm for non-regular mixture models and related singular problems, in which very slow (sub-geometric) convergence is typically observed; (ii) developing a principled method for model selection based on the convergence rate of EM, and proving theoretical guarantees on its performance; developing a general theoretical framework for combining the convergence rate of an algorithm with bounds on its (in)stability so as to establish bounds on the statistical estimation error; and (iii) providing a complete analysis of the full boosting path for various types of boosting updates, including kernel boosting and gradient-boosted regression trees, and analyzing the "overfitting" regime, elucidating conditions under which overfitting does or does not occur.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Project outputs
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants by Martin Wainwright
Non-parametric estimation under covariate shift: From fundamental bounds to efficient algorithms
- Award number: 2311072
- Fiscal year: 2023
- Funding amount: $300,000
- Project type: Standard Grant
Iterative Algorithms for Statistics: From Convergence Rates to Statistical Accuracy
- Award number: 2015454
- Fiscal year: 2020
- Funding amount: $300,000
- Project type: Continuing Grant
Statistical Estimation in Resource-Constrained Environments: Computation, Communication and Privacy
- Award number: 1612948
- Fiscal year: 2016
- Funding amount: $300,000
- Project type: Continuing Grant
CIF: Medium: Collaborative Research: New Approaches to Robustness in High-Dimensions
- Award number: 1302687
- Fiscal year: 2013
- Funding amount: $300,000
- Project type: Continuing Grant
Sparse and structured networks: Statistical theory and algorithms
- Award number: 1107000
- Fiscal year: 2011
- Funding amount: $300,000
- Project type: Continuing Grant
CAREER: Novel Message-Passing Algorithms for Distributed Computation in Graphical Models: Theory and Applications in Signal Processing
- Award number: 0545862
- Fiscal year: 2006
- Funding amount: $300,000
- Project type: Continuing Grant
Similar overseas grants
Collaborative Research: AMPS: Rare Events in Power Systems: Novel Mathematics, Statistics and Algorithms.
- Award number: 2229011
- Fiscal year: 2023
- Funding amount: $300,000
- Project type: Standard Grant
Collaborative Research: AMPS: Rare Events in Power Systems: Novel Mathematics, Statistics and Algorithms.
- Award number: 2229012
- Fiscal year: 2023
- Funding amount: $300,000
- Project type: Standard Grant
Estimation and Inference via Computational Statistics Algorithms
- Award number: RGPIN-2019-04142
- Fiscal year: 2022
- Funding amount: $300,000
- Project type: Discovery Grants Program - Individual
AF: Small: Faster Algorithms for High-Dimensional Robust Statistics
- Award number: 2122628
- Fiscal year: 2022
- Funding amount: $300,000
- Project type: Standard Grant
AF: Small: Faster Algorithms for High-Dimensional Robust Statistics
- Award number: 2307106
- Fiscal year: 2022
- Funding amount: $300,000
- Project type: Standard Grant
Leveraging k-mer sketching statistics to enhance metagenomic methods and alignment algorithms
- Award number: 10675449
- Fiscal year: 2022
- Funding amount: $300,000
- Project type:
Advanced Algorithms, Statistics, and Computing for Astrophysics
- Award number: RGPIN-2020-04254
- Fiscal year: 2022
- Funding amount: $300,000
- Project type: Discovery Grants Program - Individual
Estimation and Inference via Computational Statistics Algorithms
- Award number: RGPIN-2019-04142
- Fiscal year: 2021
- Funding amount: $300,000
- Project type: Discovery Grants Program - Individual
Advanced Algorithms, Statistics, and Computing for Astrophysics
- Award number: RGPIN-2020-04254
- Fiscal year: 2021
- Funding amount: $300,000
- Project type: Discovery Grants Program - Individual
CAREER: Reducibility among high-dimensional statistics problems: information preserving mappings, algorithms, and complexity.
- Award number: 1940205
- Fiscal year: 2020
- Funding amount: $300,000
- Project type: Continuing Grant