
Understanding Complexity and the Bias-Variance Tradeoff in High Dimensions: Theory and Data Evidence


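Basic information

Award number:
2015341
Principal investigator:
Bin Yu
Amount:
$300,000
Awardee institution:
University of California-Berkeley
Institution country:
United States
Award type:
Standard Grant
Fiscal year:
2020
Funding country:
United States
Project period:
2020-07-01 to 2024-06-30
Project status:
Completed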

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=2015341&HistoricalAwards=false
Keywords:
Understanding Complexity Bias Variance Tradeoff

Abstract

The past decade has witnessed a significant rise in the use of very large machine-learning models in modern data problems; these models have shown success in a variety of tasks, such as image classification, language translation, and speech recognition. More recently, machine learning has been entering new fields, such as robotics, autonomous driving, and medicine. However, these models are often not robust to perturbations and are vulnerable to attacks by adversaries. These shortcomings call for an urgent and insightful understanding of the "black-box" nature of these models. The principal investigator plans to understand these models by characterizing their "complexity" in a technical manner. A new complexity measure, based on the principle of minimum description length, sheds light on classical statistical foundations and informs how and when these new high-dimensional models will work. This novel complexity measure promises to enable applications in mission-critical fields such as precision medicine, where collecting a labeled dataset is expensive, by supporting sample-size calculations and improving model selection with limited data. This research has both theoretical and applied impacts in the fields of statistics and machine learning, including deep learning. Over the duration of the project, graduate students will be trained in theory, domain-driven data science, and open-source software development. The research will be further disseminated through courses, an upcoming book, and presentations at workshops and conferences.

Deep neural networks (DNNs) in many cases generalize well, in the sense that a DNN trained on one task often performs well on similar unseen data for the same task. They can do so despite being highly overparameterized, i.e., the number of parameters is much larger than the number of training samples.
Occam's razor and the bias-variance trade-off wisdom suggest preferring a simple model when choosing among models of varying complexity with similar performance. The good performance of DNNs, despite the overparameterization, has led many researchers to question the validity of the classical statistical principle of bias-variance trade-off (and of preferring a simple model) in the high-dimensional settings common in modern machine learning (ML) and statistical tasks. In this project, the principal investigator begins by reconsidering the definition of a valid complexity measure for high-dimensional models; such a measure forms the basis of Occam's razor and the bias-variance trade-off principle. Finding one such measure for high-dimensional models has remained a difficult task. Merely counting the number of parameters is not a valid complexity measure, especially when the number of training examples is small. The principle of minimum description length (MDL) will be used to provide a systematic approach to understanding the complexity of high-dimensional linear models, kernel methods, and finally DNNs. The complexity measure will serve as a basis for understanding key concepts such as the bias-variance trade-off and for further analysis of high-dimensional models. The theoretical results will be augmented with an extensive set of data-inspired experiments.
After establishing the bias-variance trade-off with the new complexity measures, these measures will then be investigated for (i) selecting a simple model from among a set of competitive models, where "simple" is defined via the MDL-based complexity and not the number of parameters, and (ii) regularizing or pruning a large (pre-trained) model, for example, in a transfer learning setting with a limited dataset, by trading off the training performance against the complexity of the model.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
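
For reference, the classical bias-variance trade-off that the abstract revisits is usually stated as a decomposition of expected squared prediction error. The block below is the standard textbook form, not the project's MDL-based measure; the symbols $f$, $\hat{f}$, $\varepsilon$, and $\sigma^2$ are notation assumed here and do not appear in the award record.

```latex
% Assumed setup (not from the award record): y = f(x) + \varepsilon,
% with E[\varepsilon] = 0 and Var(\varepsilon) = \sigma^2, and \hat{f}
% fitted on a random training set that is independent of \varepsilon.
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\operatorname{Var}\big(\hat{f}(x)\big)}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible noise}}
```

In the classical picture, increasing model complexity lowers the bias term and raises the variance term. The abstract's point is that the number of parameters is a poor stand-in for "complexity" in that statement when models are overparameterized.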


Project outputs

Journal articles (6)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

MDI+: A Flexible Random Forest-Based Feature Importance Framework

DOI:
Published:
2023
Journal:
arXiv.org
Impact factor:
0
Authors:
Agarwal, Abhineet; Kenney, Ana M.; Tan, Yan Shuo; Tang, Tiffany M.; Yu, Bin
Corresponding author:
Yu, Bin

Fast Interpretable Greedy-Tree Sums (FIGS)

DOI:
Published:
2023
Journal:
arXiv.org
Impact factor:
0
Authors:
Tan, Yan Shuo; Singh, Chandan; Nasseri, Keyan; Agarwal, Abhineet; Duncan, James; Ronen, Omer; Epland, Matthew; Kornblith, Aaron; Yu, Bin
Corresponding author:
Yu, Bin

The Three Stages of Learning Dynamics in High-dimensional Kernel Methods

DOI:
Published:
2021
Journal:
arXiv.org
Impact factor:
0
Authors:
Nikhil Ghosh, Song Mei
Corresponding author:
Nikhil Ghosh, Song Mei

An investigation into the effects of pre-training data distributions for pathology report classification

DOI:
Published:
2023
Journal:
arXiv.org
Impact factor:
0
Authors:
Hsu, Aliyah R.; Cherapanamjeri, Yeshwanth; Park, Briton; Naumann, Tristan; Odisho, Anobel Y.; Yu, Bin
Corresponding author:
Yu, Bin


Other publications by Bin Yu

Does ceruloplasmin differential express in the brain of Ts65Dn: a mouse mode of Down syndrome?

DOI:
Published:
2014
Journal:
Neurological Sciences
Impact factor:
3.3
Authors:
Bin Yu; Jing Kong; Bao; Ziqi Zhu; Bin Zhang; Qiu; S. Shao
Corresponding author:
S. Shao

A PILOT STUDY IN AN APPLICATION OF TEXT MINING TO LEARNING SYSTEM EVALUATION by NITSAWAN KATERATTANAKUL

DOI:
Published:
2010
Journal:
Impact factor:
0
Authors:
Bin Yu
Corresponding author:
Bin Yu

Lamellar gel containing emulsions as an effective carrier for stabilization and transdermal delivery of retinyl propionate

DOI:
Published:
2023
Journal:
Colloids and Surfaces A: Physicochemical and Engineering Aspects
Impact factor:
0
Authors:
Yuyan Yang; Shaowei Yan; Bin Yu; Chang Gao; Kuan Chang; Jing Wang
Corresponding author:
Jing Wang

Verifiable Visual Cryptography Based on Iterative Algorithm

DOI:
10.3724/sp.j.1146.2010.00270
Published:
2011
Journal:
Impact factor:
0
Authors:
Bin Yu; Jin; Liguo Fang
Corresponding author:
Liguo Fang

Loc680254 regulates Schwann cell proliferation through Psrc1 and Ska1 as a microRNA sponge following sciatic nerve injury

DOI:
10.1002/glia.24045
Published:
2021-06
Journal:
Glia
Impact factor:
6.2
Authors:
Chun Yao; Qihui Wang; Yaxian Wang; Jiancheng Wu; Xuemin Cao; Yan Lu; Yanping Chen; Wei Feng; Xiaosong Gu; Xin-Peng Dun; Bin Yu
Corresponding author:
Bin Yu