RUI: Classification, regression, and density estimation with missing variables

Project abstract

This project develops statistical theory and methods for nonparametric classification and curve estimation in the presence of missing or incomplete data. Many data sets have missing values, including data from biomedical studies, remote sensing, and the social sciences. There are a number of classical approaches for handling missing data. Many existing methods first impute the missing values and then apply a standard statistical technique to carry out inference. However, studying the theoretical validity of such techniques can become intractable because the independence assumption no longer holds for the imputed data; this is particularly true for distribution-free statistical methods. The new results of the Principal Investigator (PI) will answer a number of fundamental questions in statistical classification and pattern recognition, with applications to the biomedical, remote sensing, and social sciences. The new results will also solve many important theoretical problems at the intersection of machine learning and statistical classification.

A long-standing problem in classification with missing covariates involves the situation where missing covariates can appear both in the data and in the new unclassified observation. This is fundamentally different from the simpler problem where missing covariates appear in the data only. In the latter case, standard methods based on Horvitz-Thompson inverse weighting can be used to construct asymptotically optimal classifiers. One part of the PI's research project focuses on the former, more challenging case. The PI will develop new asymptotically optimal local-averaging-type classifiers, such as kernel and partitioning rules.

Another part of this project continues and refines the PI's previous work on combined classification and estimation, building on recently obtained results in the literature. The PI is currently developing new methods to combine several individual classifiers in such a way that the asymptotic error of the resulting classifier is no larger than that of the best individual classifier. The PI will also develop methods to combine several regression function estimators in an optimal way. Tools from empirical process theory will be used to establish the large-sample optimality of the resulting classifiers and estimators.

The third part of the project focuses on the weak convergence of various norms of kernel density estimates in the presence of missing data. The PI will study weighted bootstrap approximations of these statistics. Such results will allow one to construct valid confidence bands for the unknown density in the presence of missing values. The main tools here are strong approximation theorems that allow one to replace the weighted bootstrapped empirical processes by a sequence of Brownian bridges.
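
The inverse-weighting idea mentioned above for the simpler case (missing covariates in the data only) can be illustrated with a minimal Python sketch. This is not the PI's estimator. It assumes binary labels, a Gaussian kernel, a fully observed query point, and observation probabilities `pi` that are known rather than estimated. The function name `ht_kernel_classify` and all parameter names are hypothetical.

```python
import numpy as np


def ht_kernel_classify(x_new, X, y, delta, pi, h):
    """Inverse-weighted kernel vote for a fully observed query point.

    x_new : (d,) query point with no missing entries
    X     : (n, d) covariates; rows with delta == 0 may contain NaN
    y     : (n,) labels in {0, 1}
    delta : (n,) 1 if the row of X is fully observed, else 0
    pi    : (n,) probability that row i is observed (assumed known here)
    h     : kernel bandwidth
    """
    obs = delta == 1                      # use complete rows only
    diff = X[obs] - x_new
    kern = np.exp(-0.5 * np.sum(diff ** 2, axis=1) / h ** 2)
    weights = kern / pi[obs]              # Horvitz-Thompson style reweighting
    score = np.sum(weights * (2 * y[obs] - 1))
    return int(score > 0)


# Illustrative data: class 1 is observed far less often than class 0.
rng = np.random.default_rng(0)
n = 400
y = np.repeat([0, 1], n // 2)
X = rng.normal(size=(n, 2)) + np.where(y[:, None] == 1, 2.0, -2.0)
pi = np.where(y == 1, 0.3, 0.9)
delta = (rng.random(n) < pi).astype(int)
X[delta == 0] = np.nan                    # incomplete rows carry no covariates

print(ht_kernel_classify(np.array([1.5, 2.5]), X, y, delta, pi, h=1.0))
```

Dividing each kernel weight by `pi` compensates for rows that are observed less often. An unweighted complete-case vote would favor the class that is more likely to be observed. The sketch does not address the harder case in which the query point itself has missing covariates.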

Project outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other publications by Majid Mojirsheibani

On the correct regression function (in $L_2$) and its applications when the dimension of the covariate vector is random
  • DOI:
    10.1016/j.jspi.2012.03.017
  • Publication date:
    2012-09-01
  • Authors:
    Majid Mojirsheibani
  • Corresponding author:
    Majid Mojirsheibani
On the $L_p$ norms of kernel regression estimators for incomplete data with applications to classification
  • DOI:
    10.1007/s10260-016-0359-6
  • Publication date:
    2016-04-05
  • Impact factor:
    0.800
  • Authors:
    Timothy Reese; Majid Mojirsheibani
  • Corresponding author:
    Majid Mojirsheibani
A Note on the Strong Approximation of the Smoothed Empirical Process of α-mixing Sequences

Other grants awarded to Majid Mojirsheibani

RUI: Predictive models with Incomplete and Fragmented Observations, and New Advances in Virtual Re-sampling for Big Data
  • Grant number:
    2310504
  • Fiscal year:
    2023
  • Award amount:
    $120,000
  • Project category:
    Standard Grant
RUI: Partially Observed Curves, and Big-Data Virtual Bootstrap
  • Grant number:
    1916161
  • Fiscal year:
    2019
  • Award amount:
    $120,000
  • Project category:
    Standard Grant

Similar overseas grants

Functional Regression and Classification for Data Supported on Complex Geometries
  • Grant number:
    2210064
  • Fiscal year:
    2022
  • Award amount:
    $120,000
  • Project category:
    Standard Grant
Visuospatial deficits after stroke: Towards better classification, diagnostics, and rehabilitation.
  • Grant number:
    10440965
  • Fiscal year:
    2022
  • Award amount:
    $120,000
A System for Xerostomia Risk Classification after Head & Neck Cancer Radiotherapy
  • Grant number:
    10410192
  • Fiscal year:
    2021
  • Award amount:
    $120,000
A System for Xerostomia Risk Classification after Head & Neck Cancer Radiotherapy
  • Grant number:
    10255864
  • Fiscal year:
    2021
  • Award amount:
    $120,000
Robust sparse partial least squares regression and classification
  • Grant number:
    487299-2016
  • Fiscal year:
    2019
  • Award amount:
    $120,000
  • Project category:
    Postgraduate Scholarships - Doctoral
Robust sparse partial least squares regression and classification
  • Grant number:
    487299-2016
  • Fiscal year:
    2018
  • Award amount:
    $120,000
  • Project category:
    Postgraduate Scholarships - Doctoral
Robust sparse partial least squares regression and classification
  • Grant number:
    487299-2016
  • Fiscal year:
    2017
  • Award amount:
    $120,000
  • Project category:
    Postgraduate Scholarships - Doctoral
Robust sparse partial least squares regression and classification
  • Grant number:
    487299-2016
  • Fiscal year:
    2016
  • Award amount:
    $120,000
  • Project category:
    Postgraduate Scholarships - Doctoral
Pattern Classification Using Magnetic Resonance Imaging in Traumatic Brain Injury
  • Grant number:
    9295067
  • Fiscal year:
    2016
  • Award amount:
    $120,000
Establishment of classification and regression tree model for assessing of a risk for future occurence of life-style related disease using a community-based cohort data.
  • Grant number:
    25460768
  • Fiscal year:
    2013
  • Award amount:
    $120,000
  • Project category:
    Grant-in-Aid for Scientific Research (C)