Variable Selection via Measurement Error Modeling

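Basic Information

  • Award number:
    1406456
  • Principal investigator:
  • Amount:
    $300,000
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Continuing Grant
  • Fiscal year:
    2014
  • Funding country:
    United States
  • Project period:
    2014-07-01 to 2019-06-30
  • Project status:
    Completed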

Project Abstract

Technological advances make it possible to collect and store enormous amounts of data. The implications for how businesses run (online retailing, precision manufacturing), how science is conducted (environmental science, climate monitoring and modeling, astrophysics), and how governments operate (health care delivery, public safety, homeland security) are comparably enormous. However, for many particular uses of massive data sets, not all of the available information is relevant, and a key first step in many big-data explorations is identifying the most relevant subset of information required to address the question at hand. For example, when studying certain diseases, it is essential to first identify the most relevant risk factors and precursors. The more information that is available, the more difficult it is to identify the most relevant subset for a particular purpose, akin to finding a needle in a haystack. Just as a threshing machine separates the wheat from the chaff, the research in this project will develop statistical methods that separate the relevant information (the wheat) from the irrelevant information (the chaff), thereby enabling more focused and productive analyses of large data sets.
More specifically, the research in this project will develop methods for identifying the subset of information that is most relevant when the data are used to derive a regression/prediction model or algorithm. In this case the problem of separating the wheat from the chaff is the often-studied problem of variable selection. This project will develop a new approach to variable selection that differs conceptually from existing approaches and promises to offer new insights as well as new methodologies.
The new approach is based on the intuitive and universally relevant idea that a non-informative variable can be contaminated with noise without a subsequent loss of predictive power, whereas any amount of contamination of an informative predictor necessarily entails a loss of predictive power. Starting from the noise-contamination idea of variable informativeness, the project shows how the theory, methods, and algorithms from the field of measurement error modeling can be used to develop new methods of variable selection applicable across the full spectrum of model-based and algorithm-based prediction methods. Instances of the general strategy will be studied and refined for several particular prediction methods, such as: nonparametric regression (based on splines, kernels, etc.); classification/regression trees; dimension reduction methods (principal components, partial least squares, SIR, etc.); bagged or model-averaged predictors of any type; and ridge regression.
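The noise-contamination idea can be illustrated with a small simulation. The sketch below is not the project's method: it assumes an ordinary least-squares predictor, Gaussian contamination noise, and a simple half/half train/test split. The function name `contamination_loss` and all of its parameters are hypothetical, chosen only for this illustration.

```python
import numpy as np


def contamination_loss(X, y, noise_sd=1.0, n_rep=20, seed=0):
    """Average increase in test MSE when one predictor at a time is
    contaminated with additive Gaussian noise (illustrative sketch only)."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    half = n // 2
    X_tr, X_te = X[:half], X[half:]
    y_tr, y_te = y[:half], y[half:]

    def test_mse(A_tr, A_te):
        # Least-squares fit with an intercept, evaluated on the held-out half.
        D_tr = np.column_stack([np.ones(len(A_tr)), A_tr])
        D_te = np.column_stack([np.ones(len(A_te)), A_te])
        beta, *_ = np.linalg.lstsq(D_tr, y_tr, rcond=None)
        return float(np.mean((y_te - D_te @ beta) ** 2))

    baseline = test_mse(X_tr, X_te)
    loss = np.zeros(p)
    for j in range(p):
        for _ in range(n_rep):
            # Contaminate only column j, in both the training and test halves.
            C_tr, C_te = X_tr.copy(), X_te.copy()
            C_tr[:, j] += rng.normal(0.0, noise_sd, size=half)
            C_te[:, j] += rng.normal(0.0, noise_sd, size=n - half)
            loss[j] += test_mse(C_tr, C_te) - baseline
    return loss / n_rep


# Simulated data (an assumption for the demo): only the first two of five
# predictors carry signal; the remaining three are pure noise.
rng = np.random.default_rng(1)
X = rng.normal(size=(400, 5))
y = 2.0 * X[:, 0] - 1.5 * X[:, 1] + rng.normal(scale=0.5, size=400)

loss = contamination_loss(X, y)
print(np.round(loss, 3))
```

In this setup, contaminating either informative predictor should raise the test error noticeably, while contaminating a non-informative one should leave it essentially unchanged. A selection rule could then keep the variables whose loss exceeds some threshold, though how to choose that threshold is outside the scope of this sketch.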

Project Outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other grants by Leonard Stefanski

Fractional Ridge Regression
  • Award number:
    2310208
  • Fiscal year:
    2023
  • Amount:
    $300,000
  • Project type:
    Standard Grant
EMSW21-VIGRE Project: VIGRE-II - "Integrated and Mentored Program of Research and Education in Statistical Sciences" (IMPRESS)
  • Award number:
    0354189
  • Fiscal year:
    2004
  • Amount:
    $300,000
  • Project type:
    Continuing Grant
Regression and Deconvolution with Heteroscedastic Measurement Error
  • Award number:
    0304900
  • Fiscal year:
    2003
  • Amount:
    $300,000
  • Project type:
    Standard Grant
Robust Statistics for Correlated Data
  • Award number:
    0204297
  • Fiscal year:
    2002
  • Amount:
    $300,000
  • Project type:
    Continuing Grant
Mathematical Sciences: Measurement Error and Statistical Inference
  • Award number:
    9423706
  • Fiscal year:
    1995
  • Amount:
    $300,000
  • Project type:
    Standard Grant
Mathematical Sciences: Statistics Inference in the Presence of Measurement Error: II
  • Award number:
    9200915
  • Fiscal year:
    1992
  • Amount:
    $300,000
  • Project type:
    Continuing Grant
Mathematical Sciences: Statistical Inference in the Presence of Measurement Error
  • Award number:
    8613681
  • Fiscal year:
    1986
  • Amount:
    $300,000
  • Project type:
    Standard Grant

Similar NSFC Grants (National Natural Science Foundation of China)

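MAOA aggravates osteoarthritis progression by inhibiting selective autophagic degradation of ASC: a mechanistic study
  • Award number:
    82302731
  • Year approved:
    2023
  • Amount:
    300,000 CNY
  • Project type:
    Young Scientists Fund
Mechanism by which alternatively spliced isoforms of PRMT5 regulate the radiosensitivity of liver cancer through methylation of PDCD4
  • Award number:
    82304081
  • Year approved:
    2023
  • Amount:
    300,000 CNY
  • Project type:
    Young Scientists Fund
TRIM25-mediated ubiquitination and ISGylation regulate myeloid cell differentiation through alternative splicing and glucose metabolism
  • Award number:
    82370111
  • Year approved:
    2023
  • Amount:
    490,000 CNY
  • Project type:
    General Program
Mechanism by which Arabidopsis SR splicing-factor proteins regulate acquired thermotolerance through alternative splicing
  • Award number:
    32300247
  • Year approved:
    2023
  • Amount:
    300,000 CNY
  • Project type:
    Young Scientists Fund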

Similar Overseas Grants

III: Small: A New Perspective on Grouped Variable Selection via Modern Optimization
  • Award number:
    1718258
  • Fiscal year:
    2017
  • Amount:
    $300,000
  • Project type:
    Standard Grant
Variable Selection via Inverse Modeling for Detecting Nonlinear Relationships
  • Award number:
    1613035
  • Fiscal year:
    2016
  • Amount:
    $300,000
  • Project type:
    Continuing Grant
Exact methods for variable selection via mathematical programming
  • Award number:
    26560165
  • Fiscal year:
    2014
  • Amount:
    $300,000
  • Project type:
    Grant-in-Aid for Challenging Exploratory Research
Collaborative Research: Penalization Methods for Screening, Variable Selection and Dimension Reduction in High-Dimensional Regression via Multiple Index Models
  • Award number:
    1107047
  • Fiscal year:
    2011
  • Amount:
    $300,000
  • Project type:
    Standard Grant
Collaborative Research: Penalization Methods for Screening, Variable Selection and Dimension Reduction in High-dimensional Regression via Multiple Index Models
  • Award number:
    1107029
  • Fiscal year:
    2011
  • Amount:
    $300,000
  • Project type:
    Standard Grant