Variable Selection via Measurement Error Modeling


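Basic Information

  • Award number:
    1406456
  • Principal investigator:
  • Amount:
    $300,000
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Continuing Grant
  • Fiscal year:
    2014
  • Funding country:
    United States
  • Period:
    2014-07-01 to 2019-06-30
  • Project status:
    Completed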

Project Abstract

Technological advances make it possible to collect and store enormous amounts of data. The implications for how businesses run (online retailing, precision manufacturing), how science is conducted (environmental science, climate monitoring and modeling, astrophysics), and how governments operate (health care delivery, public safety, homeland security) are comparably enormous. However, for many particular uses of massive data sets, not all of the available information is relevant; and a key first step in many big-data explorations is the identification of the most relevant subset of information required to address the particular question at hand. For example, when studying certain diseases, it is essential to first identify the most relevant risk factors and precursors. The more information that is available, the more difficult it is to identify the most relevant subset for a particular purpose, akin to the problem of finding a needle in a haystack. Just as a threshing machine separates the wheat from the chaff, the research in this project will develop statistical methods that separate the relevant information (the wheat) from that information that is not relevant (the chaff), thereby enabling more focused and productive analyses of large data sets.
More specifically, the research in this project will develop methods for identifying the subset of information that is most relevant when the data are used to derive a regression/prediction model or algorithm. In this case the problem of separating the wheat from the chaff is the often-studied problem of variable selection. This project will develop a new approach to variable selection that differs conceptually from existing approaches and promises to offer new insights as well as new methodologies.
The new approach is based on the intuitive and universally relevant idea that a non-informative variable can be contaminated with noise without a subsequent loss of predictive power, whereas any amount of contamination to an informative predictor necessarily entails a loss of predictive power. Starting from the noise-contamination idea of variable informativeness, the project shows how the theory, methods, and algorithms from the field of measurement error modeling can be used to develop new methods of variable selection applicable across the full spectrum of model-based and algorithm-based prediction methods. Instances of the general strategy will be studied and refined for several particular prediction methods, such as nonparametric regression (based on splines, kernels, etc.); classification/regression trees; dimension reduction methods (principal components, partial least squares, SIR, etc.); bagged or model-averaged predictors of any type; and ridge regression.
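The noise-contamination idea can be illustrated with a small simulation. The following is only a minimal sketch under stated assumptions, not the project's actual methodology: it uses ordinary least squares as the prediction method, simulated data in which only the first two of five predictors are informative, and a hypothetical helper `contamination_loss` that adds Gaussian noise to one predictor at a time and records the resulting increase in in-sample mean squared error.

```python
import numpy as np


def contamination_loss(X, y, noise_sd=1.0, n_reps=20, seed=0):
    """Mean increase in in-sample MSE of an OLS fit when each column of X
    is contaminated, one at a time, with independent Gaussian noise.

    Hypothetical helper for illustration only.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape

    def ols_mse(design):
        # Fit OLS with an intercept and return the in-sample mean squared error.
        A = np.column_stack([np.ones(n), design])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        return float(np.mean(resid ** 2))

    baseline = ols_mse(X)
    increase = np.zeros(p)
    for j in range(p):
        for _ in range(n_reps):
            Xc = X.copy()
            # Contaminate only predictor j; all other columns stay clean.
            Xc[:, j] = Xc[:, j] + rng.normal(0.0, noise_sd, size=n)
            increase[j] += ols_mse(Xc) - baseline
    return increase / n_reps


# Simulated data (an assumption of this sketch): x0 and x1 are informative,
# x2..x4 are pure noise with respect to y.
rng = np.random.default_rng(42)
X = rng.normal(size=(500, 5))
y = 2.0 * X[:, 0] - 1.5 * X[:, 1] + rng.normal(scale=0.5, size=500)

loss = contamination_loss(X, y)
for j, value in enumerate(loss):
    print(f"x{j}: mean MSE increase = {value:.3f}")
```

In this setup, contaminating an informative predictor should noticeably raise the prediction error, while contaminating a non-informative one should leave it essentially unchanged, which is the contrast the approach exploits to rank variables.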

Project Outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other grants by Leonard Stefanski

Fractional Ridge Regression
  • Award number:
    2310208
  • Fiscal year:
    2023
  • Funding amount:
    $300,000
  • Project type:
    Standard Grant
EMSW21-VIGRE Project: VIGRE-II - "Integrated and Mentored Program of Research and Education in Statistical Sciences" (IMPRESS)
  • Award number:
    0354189
  • Fiscal year:
    2004
  • Funding amount:
    $300,000
  • Project type:
    Continuing Grant
Regression and Deconvolution with Heteroscedastic Measurement Error
  • Award number:
    0304900
  • Fiscal year:
    2003
  • Funding amount:
    $300,000
  • Project type:
    Standard Grant
Robust Statistics for Correlated Data
  • Award number:
    0204297
  • Fiscal year:
    2002
  • Funding amount:
    $300,000
  • Project type:
    Continuing Grant
Mathematical Sciences: Measurement Error and Statistical Inference
  • Award number:
    9423706
  • Fiscal year:
    1995
  • Funding amount:
    $300,000
  • Project type:
    Standard Grant
Mathematical Sciences: Statistics Inference in the Presence of Measurement Error: II
  • Award number:
    9200915
  • Fiscal year:
    1992
  • Funding amount:
    $300,000
  • Project type:
    Continuing Grant
Mathematical Sciences: Statistical Inference in the Presence of Measurement Error
  • Award number:
    8613681
  • Fiscal year:
    1986
  • Funding amount:
    $300,000
  • Project type:
    Standard Grant

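Similar NSFC grants

Intelligent Patent Analysis for Optimized Technology Stack Selection: Blockchain Business Registry Case Demonstration
  • Award number:
  • Approval year:
    2024
  • Funding amount:
  • Project type:
    Research Fund for International Scientists
Application of Linkage Group Selection to the study of phenotype-related genes in Eimeria tenella
  • Award number:
    30700601
  • Approval year:
    2007
  • Funding amount:
    CNY 170,000
  • Project type:
    Young Scientists Fund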

Similar overseas grants

Improving Science Communication via Results-Blind Selection: Lessons from Registered Reports
  • Award number:
    2244878
  • Fiscal year:
    2023
  • Funding amount:
    $300,000
  • Project type:
    Standard Grant
Data-driven selection of a convex loss function via shape-constrained estimation
  • Award number:
    2311299
  • Fiscal year:
    2023
  • Funding amount:
    $300,000
  • Project type:
    Standard Grant
The study for the augmentation of happiness via food selection behavior and mental by subjective and objective evaluations
  • Award number:
    22K05844
  • Fiscal year:
    2022
  • Funding amount:
    $300,000
  • Project type:
    Grant-in-Aid for Scientific Research (C)
Better bioassays via designs for robots analyses with improved model selection and similarity bounds that limit potency bias
  • Award number:
    10155988
  • Fiscal year:
    2021
  • Funding amount:
    $300,000
  • Project type:
RUI: Evaluating selection via ocean acidification and evolutionary responses of two coastal fishes
  • Award number:
    1948975
  • Fiscal year:
    2020
  • Funding amount:
    $300,000
  • Project type:
    Standard Grant
Characterizing modes of natural selection via diverse ancient and modern samples
  • Award number:
    9788501
  • Fiscal year:
    2018
  • Funding amount:
    $300,000
  • Project type:
Characterizing modes of natural selection via diverse ancient and modern samples
  • Award number:
    10463845
  • Fiscal year:
    2018
  • Funding amount:
    $300,000
  • Project type:
Characterizing modes of natural selection via diverse ancient and modern samples
  • Award number:
    10241443
  • Fiscal year:
    2018
  • Funding amount:
    $300,000
  • Project type:
Characterization of regulatory mechanisms mediating microRNA selection, packaging and transfer via microvesicles
  • Award number:
    349825282
  • Fiscal year:
    2017
  • Funding amount:
    $300,000
  • Project type:
    Research Grants
III: Small: A New Perspective on Grouped Variable Selection via Modern Optimization
  • Award number:
    1718258
  • Fiscal year:
    2017
  • Funding amount:
    $300,000
  • Project type:
    Standard Grant