CAREER: Statistical Learning from a Modern Perspective: Over-parameterization, Regularization, and Generalization

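Basic information

Award number:
2143215
Principal investigator:
Yuting Wei
Amount:
$400,000
Host institution:
University of Pennsylvania
Host institution country:
United States
Award type:
Continuing Grant
Fiscal year:
2022
Funding country:
United States
Period:
2022-09-01 to 2027-08-31
Status:
Ongoing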

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=2143215&HistoricalAwards=false
Keywords:
CAREER Statistical Learning Modern Perspective

Abstract

Statistical methods have been a major driving force towards interpretable, actionable, and trustworthy machine learning. However, existing statistical theory remains highly inadequate for explaining many new phenomena that have emerged and become pervasive in modern machine learning applications. For instance, the prevalence of over-parameterized models (i.e., those with more model parameters than samples) challenges classical statistical insights about the bias-variance tradeoff; the fact that many learning algorithms exhibit favorable algorithmic regularization that alleviates overfitting is largely beyond the reach of the previous statistical literature; and the unconventional shapes of the risk curves in modern applications puzzle many statisticians. Compared to the rich theory developed for classical settings, the statistical underpinnings of these curious phenomena remain far from sufficient. Motivated by this, the overarching goal of the project is to enrich the statistical foundation of machine learning by adapting it to contemporary settings, thereby bridging classical statistics and cutting-edge machine learning. In addition, the project will provide valuable opportunities for training students (particularly underrepresented groups) at all levels across multiple disciplines in the STEM field, and will exert scientific and societal impacts on several domains beyond the tasks described herein, including but not limited to neuroscience, online education, and equitable machine learning.

Striving for interpretability and actionable insights, this project plans to revisit multiple classical statistical problems, ranging from minimum-norm interpolation, risk estimation, cross validation, kernel boosting, and data-imbalanced classification to transfer learning, with an emphasis on unveiling new insights for modern yet under-explored regimes. Several recurring themes include: (i) characterizing precise risk behavior in the face of large model complexity; (ii) reconciling the seemingly conflicting goals of over-parameterization and regularization; (iii) developing algorithm-specific statistical reasoning tools; and (iv) exploring the interplay between regularization and generalization. The project comprises three distinct yet related thrusts: (1) statistical insights for over-parameterization, which explores the prolific interplay between model complexity and out-of-sample performance; (2) algorithmic regularization via early stopping, which aims to develop statistical principles that underlie early stopping; and (3) risk (non)-monotonicity with imbalanced data, which is motivated by the non-monotonicity of generalization errors in the sample size and pursues principled debiasing methods to rectify it. The project will develop a suite of statistical insights that can inform cutting-edge machine learning practice, as well as an array of statistical methodologies that will be practically appealing for modern data-driven applications.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project outputs

Journal articles (4)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

The Lasso with general Gaussian designs with applications to hypothesis testing

DOI:
10.1214/23-aos2327
Published:
2020-07
Journal:
ArXiv
Impact factor:
0
Authors:
Michael Celentano; A. Montanari; Yuting Wei
Corresponding authors:
Michael Celentano; A. Montanari; Yuting Wei

Softmax policy gradient methods can take exponential time to converge

DOI:
10.1007/s10107-022-01920-6
Published:
2021-02
Journal:
Mathematical Programming
Impact factor:
2.7
Authors:
Gen Li; Yuting Wei; Yuejie Chi; Yuantao Gu; Yuxin Chen
Corresponding authors:
Gen Li; Yuting Wei; Yuejie Chi; Yuantao Gu; Yuxin Chen

Derandomizing Knockoffs

DOI:
10.1080/01621459.2021.1962720
Published:
2020-12
Journal:
Journal of the American Statistical Association
Impact factor:
3.7
Authors:
Zhimei Ren; Yuting Wei; E. Candès
Corresponding authors:
Zhimei Ren; Yuting Wei; E. Candès

Other publications by Yuting Wei

Advances in chondroitinase delivery for spinal cord repair.

DOI:
10.31083/j.jin2104118
Published:
2022
Journal:
Journal of Integrative Neuroscience
Impact factor:
1.8
Authors:
Yuting Wei; Melissa R. Andrews
Corresponding author:
Melissa R. Andrews

Improved design method for line gear pair based on screw theory

DOI:
10.1007/s12206-022-0324-2
Published:
2022-03
Journal:
Journal of Mechanical Science and Technology
Impact factor:
1.6
Authors:
Jiang Ding; Liwei Liu; Yuting Wei; Aiping Deng
Corresponding author:
Aiping Deng

From Gauss to Kolmogorov: Localized Measures of Complexity for Ellipses

DOI:
Published:
2018
Journal:
Electronic Journal of Statistics
Impact factor:
1.1
Authors:
Yuting Wei; Billy Fang; M. Wainwright
Corresponding author:
M. Wainwright

The promoting effects of pyriproxyfen on autophagy and apoptosis in silk glands of non-target insect silkworm, Bombyx mori

DOI:
10.1016/j.pestbp.2023.105586
Published:
2023-11-01
Journal:
Pesticide Biochemistry and Physiology
Impact factor:
4.000
Authors:
Guoli Li; Yizhe Li; Chunhui He; Yuting Wei; Kunpei Cai; Qingyu Lu; Xuebin Liu; Yizhou Zhu; Kaizun Xu
Corresponding author:
Kaizun Xu

Measurement of the half-life of 95mTc and the 96Ru (n, x) 95mTc reaction cross-section induced by D–T neutron with covariance analysis

DOI:
10.1140/epja/s10050-022-00879-4
Published:
2022
Journal:
The European Physical Journal A
Impact factor:
Authors:
Yuting Wei; Changlin Lan; Yujie Ge; Xianlin Yang; Liyang Jiang; Yangbo Nie; Xiaojun Li; Jiahao Wang; Gong Jiang; Xichao Ruan; Xiaolong Huang; Xiaodong Pan
Corresponding author:
Xiaodong Pan