Variable Selection via Measurement Error Modeling
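Basic information
- Award number: 1406456
- Principal investigator:
- Amount: $300,000
- Host institution:
- Host institution country: United States
- Project category: Continuing Grant
- Fiscal year: 2014
- Funding country: United States
- Period: 2014-07-01 to 2019-06-30
- Project status: Completed
- Source:
- Keywords: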
Project abstract
Technological advances make it possible to collect and store enormous amounts of data. The implications for how businesses run (online retailing, precision manufacturing), how science is conducted (environmental science, climate monitoring and modeling, astrophysics), and how governments operate (health care delivery, public safety, homeland security) are comparably enormous. However, for many particular uses of massive data sets, not all of the available information is relevant, and a key first step in many big-data explorations is the identification of the most relevant subset of information required to address the particular question at hand. For example, when studying certain diseases, it is essential to first identify the most relevant risk factors and precursors. The more information that is available, the more difficult it is to identify the most relevant subset for a particular purpose, akin to the problem of finding a needle in a haystack. Just as a threshing machine separates the wheat from the chaff, the research in this project will develop statistical methods that separate the relevant information (the wheat) from the information that is not relevant (the chaff), thereby enabling more focused and productive analyses of large data sets.
More specifically, the research in this project will develop methods for identifying the subset of information that is most relevant when the data are used to derive a regression/prediction model or algorithm. In this case the problem of separating the wheat from the chaff is the often-studied problem of variable selection. This project will develop a new approach to variable selection that differs conceptually from existing approaches and promises to offer new insights as well as new methodologies.
The new approach is based on the intuitive and universally relevant idea that a non-informative variable can be contaminated with noise without a subsequent loss of predictive power, whereas any amount of contamination to an informative predictor necessarily entails a loss of predictive power. Starting from the noise-contamination idea of variable informativeness, the project shows how the theory, methods, and algorithms from the field of measurement error modeling can be used to develop new methods of variable selection applicable across the full spectrum of model-based and algorithm-based prediction methods. Instances of the general strategy will be studied and refined for several particular prediction methods, such as: nonparametric regression (based on splines, kernels, etc.); classification/regression trees; dimension reduction methods (principal components, partial least squares, sliced inverse regression (SIR), etc.); bagged or model-averaged predictors of any type; and ridge regression.
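The noise-contamination idea can be illustrated with a small simulation. The sketch below is not the project's method; it is a minimal illustration under stated assumptions: a linear model with simulated data, ridge regression fitted in closed form, and hypothetical names (`fit_ridge`, `contamination_loss`, `noise_sd`, `lam`, `n_rep`) chosen only for this example. For each predictor in turn, it adds independent Gaussian noise to that predictor, refits, and records the average rise in mean squared error.

```python
import numpy as np

def fit_ridge(X, y, lam=1.0):
    """Closed-form ridge coefficients (no intercept; data assumed centered)."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

def contamination_loss(X, y, noise_sd=1.0, lam=1.0, n_rep=20, seed=0):
    """Average rise in in-sample MSE when one predictor at a time is
    contaminated with Gaussian noise and the model is refitted."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    base_mse = np.mean((y - X @ fit_ridge(X, y, lam)) ** 2)
    loss = np.zeros(p)
    for j in range(p):
        for _ in range(n_rep):
            Xc = X.copy()
            Xc[:, j] += rng.normal(0.0, noise_sd, size=n)  # contaminate column j only
            beta = fit_ridge(Xc, y, lam)
            loss[j] += np.mean((y - Xc @ beta) ** 2) - base_mse
    return loss / n_rep

if __name__ == "__main__":
    # Simulated data (an assumption of this sketch): two informative
    # predictors followed by four non-informative ones.
    rng = np.random.default_rng(1)
    X = rng.normal(size=(500, 6))
    y = X @ np.array([2.0, -1.5, 0.0, 0.0, 0.0, 0.0]) + rng.normal(size=500)
    print(np.round(contamination_loss(X, y), 3))
```

In this setup the loss should be clearly positive for the two informative predictors and close to zero for the rest, which is the contrast the abstract describes. A selection rule would still need a threshold or a formal criterion, which this sketch does not attempt to supply.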
Project outputs
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants by Leonard Stefanski
EMSW21-VIGRE Project: VIGRE-II - "Integrated and Mentored Program of Research and Education in Statistical Sciences" (IMPRESS)
- Award number: 0354189
- Fiscal year: 2004
- Amount: $300,000
- Project category: Continuing Grant
Regression and Deconvolution with Heteroscedastic Measurement Error
- Award number: 0304900
- Fiscal year: 2003
- Amount: $300,000
- Project category: Standard Grant
Mathematical Sciences: Measurement Error and Statistical Inference
- Award number: 9423706
- Fiscal year: 1995
- Amount: $300,000
- Project category: Standard Grant
Mathematical Sciences: Statistics Inference in the Presence of Measurement Error: II
- Award number: 9200915
- Fiscal year: 1992
- Amount: $300,000
- Project category: Continuing Grant
Mathematical Sciences: Statistical Inference in the Presence of Measurement Error
- Award number: 8613681
- Fiscal year: 1986
- Amount: $300,000
- Project category: Standard Grant
Similar NSFC (National Natural Science Foundation of China) grants
Intelligent Patent Analysis for Optimized Technology Stack Selection: Blockchain Business Registry Case Demonstration
- Award number:
- Approval year: 2024
- Amount (10,000 CNY):
- Project category: Research Fund for International Scientists
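Application of Linkage Group Selection to the Study of Phenotype-Associated Genes in Eimeria tenella
- Award number: 30700601
- Approval year: 2007
- Amount: 170,000 CNY
- Project category: Young Scientists Fund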
Similar overseas grants
Improving Science Communication via Results-Blind Selection: Lessons from Registered Reports
- Award number: 2244878
- Fiscal year: 2023
- Amount: $300,000
- Project category: Standard Grant
Data-driven selection of a convex loss function via shape-constrained estimation
- Award number: 2311299
- Fiscal year: 2023
- Amount: $300,000
- Project category: Standard Grant
The study for the augmentation of happiness via food selection behavior and mental by subjective and objective evaluations
- Award number: 22K05844
- Fiscal year: 2022
- Amount: $300,000
- Project category: Grant-in-Aid for Scientific Research (C)
Better bioassays via designs for robots analyses with improved model selection and similarity bounds that limit potency bias
- Award number: 10155988
- Fiscal year: 2021
- Amount: $300,000
- Project category:
RUI: Evaluating selection via ocean acidification and evolutionary responses of two coastal fishes
- Award number: 1948975
- Fiscal year: 2020
- Amount: $300,000
- Project category: Standard Grant
Characterizing modes of natural selection via diverse ancient and modern samples
- Award number: 9788501
- Fiscal year: 2018
- Amount: $300,000
- Project category:
Characterizing modes of natural selection via diverse ancient and modern samples
- Award number: 10463845
- Fiscal year: 2018
- Amount: $300,000
- Project category:
Characterizing modes of natural selection via diverse ancient and modern samples
- Award number: 10241443
- Fiscal year: 2018
- Amount: $300,000
- Project category:
Characterization of regulatory mechanisms mediating microRNA selection, packaging and transfer via microvesicles
- Award number: 349825282
- Fiscal year: 2017
- Amount: $300,000
- Project category: Research Grants
III: Small: A New Perspective on Grouped Variable Selection via Modern Optimization
- Award number: 1718258
- Fiscal year: 2017
- Amount: $300,000
- Project category: Standard Grant