Deep Learning and Random Forests for High-Dimensional Regression

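Basic Information

  • Grant number:
    1915932
  • Principal investigator:
  • Amount:
    $180,000
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Continuing Grant
  • Fiscal year:
    2019
  • Funding country:
    United States
  • Start and end dates:
    2019-08-15 to 2020-11-30
  • Project status:
    Completed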

Project Abstract

This project aims to investigate two of the most widely used and state-of-the-art methods for high-dimensional regression: deep neural networks and random forests. Despite their widespread implementation, pinning down their theoretical properties has eluded researchers until recently. The proposed research aims to add to the growing body of literature on their analysis, by both developing tools of theoretical value and providing guarantees and guidance for practitioners and applied scientists who use these popular methods frequently in their work.

The success of multi-layer networks has largely been buoyed by their ability to generalize well despite being able to fit most datasets, given enough parameters. This phenomenon is particularly striking when the input dimension is far greater than the available sample size, as is the case with many modern applications in molecular biology, medical imaging, and astrophysics, to name a few. A major component of the proposed work will be to obtain complexity bounds for classes of deep neural networks with controls on the size of their weights, which can then be used to bound generalization error and statistical risk. These complexity bounds reveal the role of complexity penalization, which is based on certain norms of the weights of the network. Motivated by these observations, another stream of the proposed research seeks to provide statistical guarantees of certain complexity penalized estimators and their adaptive properties.

Current theoretical results for random forests are either for stylized versions of those that are used in practice or are asymptotic in nature and it is therefore difficult to determine the quality of convergence as a function of the parameters of the random forest. Furthermore, the setting for the analysis of more practical implementations of random forests is limited to structured, fixed-dimensional regression function classes. Given these restrictions, the first component of the proposal aims to investigate how random forests behave in the high-dimensional regime when the number of predictors grows with the sample size. Another research objective is to isolate and study families of flexible high-dimensional regression functions for which finite sample convergence rates can be established. The final endeavor of this project is to connect popular measures of variable importance to the bias of random forests. Since variable importance measures are used for assessing the role each predictor variable plays in influencing the output, this connection will partially explain why random forests are adaptive to sparsity. The relationship will also help to theoretically motivate variable importance measures as useful tools for model interpretability.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project Outcomes

Journal articles (1)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Algorithmic Analysis and Statistical Estimation of SLOPE via Approximate Message Passing
  • DOI:
    10.1109/tit.2020.3025272
  • Publication date:
    2021-01-01
  • Journal:
  • Impact factor:
    2.5
  • Authors:
    Bu, Zhiqi; Klusowski, Jason M.; Su, Weijie J.
  • Corresponding author:
    Su, Weijie J.

Other Grants by Jason Klusowski

CAREER: Statistical Learning with Recursive Partitioning: Algorithms, Accuracy, and Applications
  • Grant number:
    2239448
  • Fiscal year:
    2023
  • Funding amount:
    $180,000
  • Project type:
    Continuing Grant
Deep Learning and Random Forests for High-Dimensional Regression
  • Grant number:
    2054808
  • Fiscal year:
    2020
  • Funding amount:
    $180,000
  • Project type:
    Continuing Grant

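Similar NSFC Grants

Scalable Learning and Optimization: High-dimensional Models and Online Decision-Making Strategies for Big Data Analysis
  • Grant number:
  • Approval year:
    2024
  • Funding amount:
  • Project type:
    Collaborative Innovation Research Team
Understanding structural evolution of galaxies with machine learning
  • Grant number:
    n/a
  • Approval year:
    2022
  • Funding amount:
    CNY 100,000
  • Project type:
    Provincial/Municipal Project
Constrained Dynamic Multi-Objective Q-learning Evolutionary Allocation of Human-Machine Hybrid Crowdsensing Tasks for Coal Mine Safety
  • Grant number:
  • Approval year:
    2022
  • Funding amount:
    CNY 300,000
  • Project type:
    Young Scientists Fund
Short-Time Online Q-learning Cooperative Control Mechanism for Intelligent Munition Formations Considering Leader Failure
  • Grant number:
    62003314
  • Approval year:
    2020
  • Funding amount:
    CNY 240,000
  • Project type:
    Young Scientists Fund
Research on E-learning Resource Recommendation Methods Integrating Contextual Tensor Decomposition
  • Grant number:
    61902016
  • Approval year:
    2019
  • Funding amount:
    CNY 240,000
  • Project type:
    Young Scientists Fund
Research on Spiking-Transfer Learning Methods with Temporal Transfer Capability
  • Grant number:
    61806040
  • Approval year:
    2018
  • Funding amount:
    CNY 200,000
  • Project type:
    Young Scientists Fund
Research on Deep-Learning-Based Dynamic Identification Technology for Glacier Monitoring in the Sanjiangyuan (Three-River Headwaters) Region
  • Grant number:
    51769027
  • Approval year:
    2017
  • Funding amount:
    CNY 380,000
  • Project type:
    Fund for Less Developed Regions
Research on Spiking-Deep Learning Methods with Temporal Processing Capability
  • Grant number:
    61573081
  • Approval year:
    2015
  • Funding amount:
    CNY 640,000
  • Project type:
    General Program
Automatic Generation and Optimization of Large-Scale Personalized E-learning Process Models Based on Directed Hypergraphs
  • Grant number:
    61572533
  • Approval year:
    2015
  • Funding amount:
    CNY 660,000
  • Project type:
    General Program
Research on Learner Emotion Compensation Methods in E-Learning
  • Grant number:
    61402392
  • Approval year:
    2014
  • Funding amount:
    CNY 260,000
  • Project type:
    Young Scientists Fund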

Similar Overseas Grants

EAGER: IMPRESS-U: Random Matrix Theory and its Applications to Deep Learning
  • Grant number:
    2401227
  • Fiscal year:
    2024
  • Funding amount:
    $180,000
  • Project type:
    Standard Grant
DeepMARA - Deep Reinforcement Learning based Massive Random Access Toward Massive Machine-to-Machine Communications
  • Grant number:
    EP/Y028252/1
  • Fiscal year:
    2024
  • Funding amount:
    $180,000
  • Project type:
    Fellowship
High-Dimensional Random Forests Learning, Inference, and Beyond
  • Grant number:
    2310981
  • Fiscal year:
    2023
  • Funding amount:
    $180,000
  • Project type:
    Standard Grant
Collaborative Research: Bayesian Residual Learning and Random Recursive Partitioning Methods for Gaussian Process Modeling
  • Grant number:
    2348163
  • Fiscal year:
    2023
  • Funding amount:
    $180,000
  • Project type:
    Standard Grant
FuSe: Co-designing Continual-Learning Edge Architectures with Hetero-Integrated Silicon-CMOS and Electrochemical Random-Access Memory
  • Grant number:
    2329096
  • Fiscal year:
    2023
  • Funding amount:
    $180,000
  • Project type:
    Continuing Grant
Machine learning methods for generation of random images and equilibrated configurations of gluon fields in Quantum Chromodynamics
  • Grant number:
    2889923
  • Fiscal year:
    2023
  • Funding amount:
    $180,000
  • Project type:
    Studentship
Collaborative Research: Bayesian Residual Learning and Random Recursive Partitioning Methods for Gaussian Process Modeling
  • Grant number:
    2152999
  • Fiscal year:
    2022
  • Funding amount:
    $180,000
  • Project type:
    Standard Grant
Collaborative Research: Bayesian Residual Learning and Random Recursive Partitioning Methods for Gaussian Process Modeling
  • Grant number:
    2152998
  • Fiscal year:
    2022
  • Funding amount:
    $180,000
  • Project type:
    Standard Grant
RINGS: Walk For Resiliency & Privacy: A Random Walk Framework for Learning at the Edge
  • Grant number:
    2148182
  • Fiscal year:
    2022
  • Funding amount:
    $180,000
  • Project type:
    Continuing Grant
Discrete Random Variables in Machine Learning
  • Grant number:
    518720-2018
  • Fiscal year:
    2021
  • Funding amount:
    $180,000
  • Project type:
    Postgraduate Scholarships - Doctoral