Non-Parametric Estimation under Shape/Norm Constraints

Basic Information

Project Abstract

This project advances the methodology and theoretical understanding of algorithms that are widely used in machine learning and statistics. Decision trees are an important class of predictive modelling methods. They have a long history, and modern variations such as random forests are among the most powerful techniques available. One part of the project will develop new, theoretically sound decision trees accompanied by a software implementation. The other part of the project applies to signal processing. Total Variation Denoising is a popular noise-removal method in image processing. The current algorithm is not fully automated. This project will develop a fully automated version of the algorithm that is theoretically valid. In the bigger picture, the research will produce improved versions of these time-tested algorithms and refine our understanding of how and why they work.
A main focus of the project is the non-parametric estimation of non-smooth functions, such as piecewise constant, piecewise linear, and piecewise polynomial functions, in high dimensions. One major goal is to give theoretical guarantees for CART-like estimators. These guarantees would provably demonstrate adaptivity to the number and arrangement of the rectangular level sets of the regression function. Theoretical understanding of such adaptivity for CART-like estimators is largely absent from the literature, and this research should be a first step towards filling that gap. The project proposes an extension of the Dyadic CART estimator, called Model Selection Cart (MS Cart), as a computationally and theoretically tractable method for achieving the desired adaptivity. We have already developed an algorithm, based on dynamic programming, that provably computes the MS Cart estimator efficiently; it will be implemented and made publicly available. We are currently working on establishing theoretical guarantees for MS Cart.
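As a rough illustration of the kind of recursion that a dynamic programming approach to Dyadic CART relies on, the sketch below computes the classical penalized Dyadic CART fit in one dimension. It is not the MS Cart algorithm, whose details are not given in this abstract. The function name `dyadic_cart_1d`, the penalty `lam` (a cost charged per constant piece), and the restriction to signals whose length is a power of two are assumptions made only for this example.

```python
import numpy as np

def dyadic_cart_1d(y, lam):
    """Classical 1D Dyadic CART sketch (illustrative, not MS Cart).

    Minimizes (residual sum of squares) + lam * (number of pieces)
    over recursive dyadic partitions of the index range.
    """
    y = np.asarray(y, dtype=float)
    n = len(y)
    if n == 0 or n & (n - 1):
        raise ValueError("length of y must be a power of two")

    # Prefix sums give the residual sum of squares of any block in O(1).
    s1 = np.concatenate(([0.0], np.cumsum(y)))
    s2 = np.concatenate(([0.0], np.cumsum(y * y)))

    def sse(a, b):
        # Sum of squared deviations of y[a:b] from its own mean.
        s = s1[b] - s1[a]
        return (s2[b] - s2[a]) - s * s / (b - a)

    def solve(a, b):
        # Best penalized cost and partition for the dyadic block y[a:b].
        leaf_cost = sse(a, b) + lam
        if b - a == 1:
            return leaf_cost, [(a, b)]
        mid = (a + b) // 2
        left_cost, left_pieces = solve(a, mid)
        right_cost, right_pieces = solve(mid, b)
        if left_cost + right_cost < leaf_cost:
            return left_cost + right_cost, left_pieces + right_pieces
        return leaf_cost, [(a, b)]

    _, pieces = solve(0, n)
    fit = np.empty(n)
    for a, b in pieces:
        fit[a:b] = y[a:b].mean()  # constant fit on each selected block
    return fit, pieces
```

Each dyadic block is visited once, so the cost is linear in the signal length. In higher dimensions the same idea applies to dyadic rectangles, where sub-rectangles are shared between splits and their optimal costs would need to be stored and reused.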
Another main focus of the proposal is the methodology of Total Variation Denoising (TVD), a nonlinear image denoising technique heavily used in the image processing community. The first problem under this topic is a step towards a rigorous understanding of the statistical risk of the TVD estimator. Worst-case analysis of this risk is now well understood in the literature. The research proposed here will go beyond worst-case analysis and reveal the adaptivity of the TVD estimator. The second problem deals with the very practical issue of choosing the tuning parameter for TVD in a fully data-driven way. A new tuning-parameter-free estimator is proposed, whose practical performance we have checked thoroughly in simulations. We will show that the proposed estimator is minimax rate optimal while being fully data-driven. No such estimator is currently available in the literature. Because choosing a tuning parameter can be a delicate issue, this estimator has the potential to make the TVD methodology considerably more user-friendly. The PI will also address several non-parametric estimation problems in settings such as quantile regression and general exponential families. In particular, one goal is to study shape-constrained estimation in quantile regression, which would be a first step beyond the restrictive linear assumptions often made in the existing literature. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
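To make the tuning-parameter issue for TVD concrete, here is a minimal one-dimensional sketch of the standard TVD estimator, which minimizes 0.5 * ||y - theta||^2 + lam * sum |theta[i+1] - theta[i]|. The function name `tv_denoise_1d`, the parameters `lam` and `n_iter`, the choice of solver (projected gradient ascent on the dual problem), and the restriction to 1D signals rather than images are all assumptions made for illustration. This is the usual estimator with a hand-picked `lam`, not the tuning-parameter-free estimator proposed in the project.

```python
import numpy as np

def tv_denoise_1d(y, lam, n_iter=5000):
    """Approximate 1D total variation denoising (illustrative sketch).

    Targets: 0.5 * ||y - theta||^2 + lam * sum(|theta[i+1] - theta[i]|).
    """
    y = np.asarray(y, dtype=float)
    u = np.zeros(len(y) - 1)  # dual variable, one entry per adjacent pair

    def primal(u):
        # theta = y - D^T u, where D takes first differences of theta.
        return y + np.diff(np.concatenate(([0.0], u, [0.0])))

    for _ in range(n_iter):
        # Gradient step on the dual, then projection onto [-lam, lam].
        # Step size 0.25 is safe because the largest eigenvalue of D D^T is below 4.
        u = np.clip(u + 0.25 * np.diff(primal(u)), -lam, lam)
    return primal(u)
```

With `lam = 0` the input is returned unchanged, and as `lam` grows the fit flattens towards the mean of `y`. The user has to pick `lam` somewhere between these extremes, which is the choice a fully data-driven estimator would remove.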

Project Outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other publications by Sabyasachi Chatterjee

Synergism of VAM and Rhizobium on Production and Metabolism of IAA in Roots and Root Nodules of Vigna Mungo
  • DOI:
    10.1007/s00284-010-9597-2
  • Published:
    2010-03-21
  • Journal:
  • Impact factor:
    2.600
  • Authors:
    Jayanta Chakrabarti; Sabyasachi Chatterjee; Sisir Ghosh; Narayan Chandra Chatterjee; Sikha Dutta
  • Corresponding author:
    Sikha Dutta
Computing singularly perturbed differential equations
  • DOI:
    10.1016/j.jcp.2017.10.025
  • Published:
    2018-02-01
  • Journal:
  • Impact factor:
  • Authors:
    Sabyasachi Chatterjee; Amit Acharya; Zvi Artstein
  • Corresponding author:
    Zvi Artstein
In silico approach on structural and functional characterization of heat shock protein from Sulfobacillus acidophilus
  • DOI:
    10.1007/s13353-025-00964-6
  • Published:
    2025-04-15
  • Journal:
  • Impact factor:
    1.900
  • Authors:
    Pritish Mitra; Sabyasachi Chatterjee
  • Corresponding author:
    Sabyasachi Chatterjee
Enhancing agriculture production through smart assessment of soil nutrients
  • DOI:
    10.1080/15427528.2024.2355249
  • Published:
    2024
  • Journal:
  • Impact factor:
    1.3
  • Authors:
    Kedar Purohit; Ashish Kumar Singh; Sabyasachi Chatterjee
  • Corresponding author:
    Sabyasachi Chatterjee
A molecular docking study between heavy metals and hydrophilic Hsp70 protein to explore binding pockets
  • DOI:
  • Published:
    2024
  • Journal:
  • Impact factor:
    0
  • Authors:
    Pritish Mitra; Sourav Singha; Payel Roy; Deblina Saha; Sabyasachi Chatterjee
  • Corresponding author:
    Sabyasachi Chatterjee

Similar overseas grants

A shape-constrained approach for non-parametric variance estimation for Markov Chains
  • Award number:
    2311141
  • Fiscal year:
    2023
  • Funding amount:
    $160,000
  • Program type:
    Continuing Grant
Non-parametric estimation under covariate shift: From fundamental bounds to efficient algorithms
  • Award number:
    2311072
  • Fiscal year:
    2023
  • Funding amount:
    $160,000
  • Program type:
    Standard Grant
Non-parametric identification, estimation and inference: generalized functions approach
  • Award number:
    RGPIN-2020-05444
  • Fiscal year:
    2022
  • Funding amount:
    $160,000
  • Program type:
    Discovery Grants Program - Individual
Non-parametric identification, estimation and inference: generalized functions approach
  • Award number:
    RGPIN-2020-05444
  • Fiscal year:
    2021
  • Funding amount:
    $160,000
  • Program type:
    Discovery Grants Program - Individual
Non-parametric identification, estimation and inference: generalized functions approach
  • Award number:
    RGPIN-2020-05444
  • Fiscal year:
    2020
  • Funding amount:
    $160,000
  • Program type:
    Discovery Grants Program - Individual
Non-Parametric Estimation of Scoring Rates
  • Award number:
    542732-2019
  • Fiscal year:
    2019
  • Funding amount:
    $160,000
  • Program type:
    Alexander Graham Bell Canada Graduate Scholarships - Master's
Estimation, model selection and inference in two classes of non- and semi-parametric models for repeated measurements
  • Award number:
    1306972
  • Fiscal year:
    2013
  • Funding amount:
    $160,000
  • Program type:
    Standard Grant
Non-parametric estimation of forecast distributions in non-Gaussian state space models
  • Award number:
    DP0985234
  • Fiscal year:
    2009
  • Funding amount:
    $160,000
  • Program type:
    Discovery Projects
On non parametric estimation of hazard rate function
  • Award number:
    383089-2009
  • Fiscal year:
    2009
  • Funding amount:
    $160,000
  • Program type:
    University Undergraduate Student Research Awards
Contributions to analysis of longitudinal survey data and non-parametric curve estimation
  • Award number:
    304715-2004
  • Fiscal year:
    2006
  • Funding amount:
    $160,000
  • Program type:
    Postgraduate Scholarships - Doctoral