Post Model Selection Inference and Empirical Bayes Methods

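Basic Information

  • Award number:
    1007657
  • Principal investigator:
  • Amount:
    $400,000
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Standard Grant
  • Fiscal year:
    2010
  • Funding country:
    United States
  • Period:
    2010-07-01 to 2014-06-30
  • Project status:
    Completed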

Project Abstract

Consider a standard Gaussian multiple regression model involving p independent covariates. In many applications a first step of the analysis is to reduce the data via model selection to one containing only a subset of these possible predictors. If the covariates are correlated, conventional inference based on the selected model may be invalid; for example, probabilities that confidence intervals cover the true parameter values for the selected model may be grossly overstated. The investigators propose a version of classical inference criteria and a corresponding method for guaranteeing that post-selection inferences will be valid within these criteria. The inference is conservative in that it is valid independent of the model selection method that was used, and correct (though possibly conservative) marginal coverage is guaranteed for all parameter configurations. The procedure is algorithmically easy to describe. However, its optimal implementation requires numerical estimation of certain probabilities related to high-dimensional Gaussian distributions, and feasible computation of these probabilities for larger values of p is an issue still under investigation. Nevertheless, certain useful asymptotic bounds can be derived, and some important special cases can be analyzed with greater precision.
Conventional statistical inference requires that a model of how the data were generated be known before the data are analyzed. Yet in applications involving such common procedures as the Analysis of Variance and multiple regression, it is often the case that one or more model selection procedures are first undertaken in order to help determine a model for the analysis. This model selection is then followed by statistical tests and confidence intervals computed as if the final model had been chosen in advance of examining the data. Examples abound in the social sciences, in the econometric literature, in epidemiology and in genomics. This proposal begins by examining consequences of such a practice in order to categorize the degree to which it may be misleading and misguided. Without additional care the parameters being estimated are no longer well defined, and post-model-selection sampling distributions have properties that are very different from what would be the case without model selection. Statistical inferences such as confidence intervals and statistical tests do not perform as is customarily assumed. Many authors have noted some or all of these problems, but have not proposed valid general statistical inference procedures to cope with the situation. The investigators propose and study a method that produces valid statistical inference within the models selected based on the observed data. The proposed approach is universally valid, independent of the procedure that was used to select the variables to be retained in the model. Thus, from this perspective it is not necessary to investigate the details of the various model selection proposals in current use. Nevertheless, certain models and model selection procedures do yield improved performance of our confidence interval proposal, and some aspects of this will naturally be included in our research. In particular, some new model selection methods based on nonparametric Bayesian ideas will be investigated both for their ability to flexibly produce satisfactory models and from the perspective of post-model-selection inference. Extension of these post-model-selection ideas will also be explored in a variety of statistical settings beyond the most common Gaussian linear models that are the initial target of this proposal.
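The coverage problem described in the abstract can be illustrated with a small simulation. The sketch below is not the investigators' procedure; it is a simplified setting chosen for illustration, and every name in it (`simulate_coverage`, `k_simul`, and so on) is hypothetical. It assumes an orthonormal design, known error variance 1, and all true coefficients equal to 0, so that the least-squares estimates are independent N(0, 1) draws. The "model selection" step keeps only the predictor with the largest absolute estimate. Under these assumptions the naive 95% interval covers the truth with probability 0.95^p (about 0.60 for p = 10), while an interval widened to the 95% quantile of the maximum of p independent |N(0, 1)| variables restores nominal coverage regardless of how the predictor was chosen.

```python
import random
from statistics import NormalDist


def simulate_coverage(p=10, n_rep=20000, level=0.95, seed=0):
    """Estimate coverage of naive and widened intervals after selection.

    Assumed toy setting (not from the source): orthonormal design,
    known sigma = 1, all true coefficients equal to 0.
    Returns (naive_coverage, widened_coverage, k_simul).
    """
    rng = random.Random(seed)
    std_normal = NormalDist()

    # Usual two-sided critical value, about 1.96 for level = 0.95.
    z_naive = std_normal.inv_cdf((1 + level) / 2)

    # Widened constant K chosen so that P(max_j |Z_j| <= K) = level
    # for p independent standard normal estimates.
    k_simul = std_normal.inv_cdf((1 + level ** (1 / p)) / 2)

    naive_hits = 0
    widened_hits = 0
    for _ in range(n_rep):
        # Coefficient estimates under the assumed setting.
        z = [rng.gauss(0.0, 1.0) for _ in range(p)]

        # Data-driven selection: keep the most significant-looking predictor.
        selected = max(z, key=abs)

        # The interval selected +/- c contains the true value 0
        # exactly when |selected| <= c.
        naive_hits += abs(selected) <= z_naive
        widened_hits += abs(selected) <= k_simul

    return naive_hits / n_rep, widened_hits / n_rep, k_simul
```

Calling `simulate_coverage(p=10)` returns the two estimated coverage rates and the widened constant. The widened constant is available in closed form only because the estimates are independent here; for correlated covariates the corresponding probabilities would have to be estimated numerically, which is the computational issue the abstract mentions.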

Project Outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other publications by Lawrence Brown

Surveillance results and bone effects in the Gulf War depleted uranium-exposed cohort
  • DOI:
  • Published:
    2018
  • Journal:
  • Impact factor:
    0
  • Authors:
    M. McDiarmid;Marianne Cloeren;J. Gaitens;S. Hines;E. Streeten;Richard J. Breyer;Clayton H. Brown;M. Condon;T. Roth;M. Oliver;Lawrence Brown;M. Dux;M. Lewin;Frederick G. Strathmann;Maria A. Velez;P. Gucer
  • Corresponding author:
    P. Gucer
Correction to: Working with Misspecified Regression Models
  • DOI:
    10.1007/s10940-020-09464-8
  • Published:
    2020-06-01
  • Journal:
  • Impact factor:
    3.300
  • Authors:
    Richard Berk;Lawrence Brown;Andreas Buja;Edward George;Linda Zhao
  • Corresponding author:
    Linda Zhao
Biologic monitoring and surveillance results for the department of veterans affairs' depleted uranium cohort: Lessons learned from sustained exposure over two decades.
  • DOI:
    10.1002/ajim.22435
  • Published:
    2015
  • Journal:
  • Impact factor:
    3.5
  • Authors:
    M. McDiarmid;J. Gaitens;S. Hines;M. Condon;T. Roth;M. Oliver;P. Gucer;Lawrence Brown;J. Centeno;E. Streeten;K. Squibb
  • Corresponding author:
    K. Squibb
Health effects of depleted uranium on exposed Gulf War veterans.
  • DOI:
  • Published:
    2000
  • Journal:
  • Impact factor:
    8.3
  • Authors:
    M. McDiarmid;James P. Keogh;Frank J. Hooper;Kathleen McPhaul;K. Squibb;Robert L. Kane;R. DiPino;M. Kabat;Bruce Kaup;Larry D. Anderson;D. Hoover;Lawrence Brown;Matthew M. Hamilton;David Jacobson;Belton A. Burrows;Mark Walsh
  • Corresponding author:
    Mark Walsh
The Gulf War Depleted Uranium Cohort at 20 years: Bioassay Results and Novel Approaches to Fragment Surveillance
  • DOI:
    10.1097/hp.0b013e31827b1740
  • Published:
    2013
  • Journal:
  • Impact factor:
    2.2
  • Authors:
    M. McDiarmid;J. Gaitens;S. Hines;Richard J. Breyer;J. Wong;Susan M. Engelhardt;M. Oliver;P. Gucer;Robert L. Kane;A. Cernich;Bruce Kaup;D. Hoover;A. Gaspari;Juan Liu;Erin M. Harberts;Lawrence Brown;J. Centeno;Patrick J. Gray;Hanna Xu;K. Squibb
  • Corresponding author:
    K. Squibb

Other grants of Lawrence Brown

Collaborative Research: Inference for Linear Model Parameters in Model-free Populations
  • Award number:
    1310795
  • Fiscal year:
    2013
  • Funding amount:
    $400,000
  • Project type:
    Standard Grant
Seventh International Workshop on Objective Bayesian Methodology; Philadelphia, PA
  • Award number:
    0924257
  • Fiscal year:
    2009
  • Funding amount:
    $400,000
  • Project type:
    Standard Grant
Shrinkage Estimation in Modern Statistics
  • Award number:
    0707033
  • Fiscal year:
    2007
  • Funding amount:
    $400,000
  • Project type:
    Continuing Grant
Prediction for Multi-factor Point Process Models
  • Award number:
    0405716
  • Fiscal year:
    2004
  • Funding amount:
    $400,000
  • Project type:
    Standard Grant
Service Engineering of Human Tele-Queues: Empirically Based Stochastic Analysis of Telephone Call Centers
  • Award number:
    0223304
  • Fiscal year:
    2002
  • Funding amount:
    $400,000
  • Project type:
    Standard Grant
Asymptotic Equivalence in Nonparametric Function Problems-Theory and Applications
  • Award number:
    9971751
  • Fiscal year:
    1999
  • Funding amount:
    $400,000
  • Project type:
    Continuing Grant
Dissertation Research: Making Ends Meet: Differences Among Yoruba Women in Benin in the use of a Multiple Enterprise Economic Strategy
  • Award number:
    9711900
  • Fiscal year:
    1997
  • Funding amount:
    $400,000
  • Project type:
    Standard Grant
Mathematical Sciences: Three Topics in Mathematical Statistics
  • Award number:
    9626118
  • Fiscal year:
    1996
  • Funding amount:
    $400,000
  • Project type:
    Continuing Grant
Mathematical Sciences: Investigations in Mathematical Statistics
  • Award number:
    9596094
  • Fiscal year:
    1994
  • Funding amount:
    $400,000
  • Project type:
    Continuing Grant
Mathematical Sciences: Investigations in Mathematical Statistics
  • Award number:
    9310228
  • Fiscal year:
    1993
  • Funding amount:
    $400,000
  • Project type:
    Continuing Grant

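Similar grants from the National Natural Science Foundation of China (NSFC)

Developing an augmented reality model for AI-guided decisions on atrial septal puncture site, based on SAM (Segment Anything Model) applied to real-time intraoperative imaging
  • Award number:
  • Year approved:
    2024
  • Funding amount:
    CNY 0
  • Project type:
    Provincial/municipal project
Development of a Linear Stochastic Model for Wind Field Reconstruction from Limited Measurement Data
  • Award number:
  • Year approved:
    2020
  • Funding amount:
    CNY 400,000
  • Project type:
Using an agent-based model to study the effect and mechanism of a single perioperative dose of dexamethasone on surgical incision healing
  • Award number:
    81771933
  • Year approved:
    2017
  • Funding amount:
    CNY 500,000
  • Project type:
    General Program
A multilevel-model-based prediction model for amenorrhea induced by Tripterygium wilfordii glycosides in women of childbearing age
  • Award number:
    81503449
  • Year approved:
    2015
  • Funding amount:
    CNY 180,000
  • Project type:
    Young Scientists Fund
Building an early risk assessment model for postmenopausal osteoporosis that combines disease and syndrome differentiation, based on a non-homogeneous Markov model
  • Award number:
    30873339
  • Year approved:
    2008
  • Funding amount:
    CNY 320,000
  • Project type:
    General Program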

Similar overseas grants

Universal Model Selection Criteria for Scientific Machine Learning
  • Award number:
    DE240100144
  • Fiscal year:
    2024
  • Funding amount:
    $400,000
  • Project type:
    Discovery Early Career Researcher Award
A model for supply chain integration strategy selection
  • Award number:
    23K12537
  • Fiscal year:
    2023
  • Funding amount:
    $400,000
  • Project type:
    Grant-in-Aid for Early-Career Scientists
Extremal graphical model selection
  • Award number:
    568313-2022
  • Fiscal year:
    2022
  • Funding amount:
    $400,000
  • Project type:
    Postdoctoral Fellowships
Advanced Monte Carlo methods for inference and model selection of dynamic systems
  • Award number:
    559741-2021
  • Fiscal year:
    2022
  • Funding amount:
    $400,000
  • Project type:
    Alexander Graham Bell Canada Graduate Scholarships - Doctoral
CAREER: Sparse Model Selection for Nonlinear Evolution Equations
  • Award number:
    2331100
  • Fiscal year:
    2022
  • Funding amount:
    $400,000
  • Project type:
    Continuing Grant
Mathematical model of internal standard selection criteria for accurate quantitative analysis
  • Award number:
    22K10604
  • Fiscal year:
    2022
  • Funding amount:
    $400,000
  • Project type:
    Grant-in-Aid for Scientific Research (C)
Regularization and approximation: statistical inference, model selection, and large data
  • Award number:
    RGPIN-2021-02618
  • Fiscal year:
    2022
  • Funding amount:
    $400,000
  • Project type:
    Discovery Grants Program - Individual
Bayesian Methods, Computation, Model Selection and Goodness of Fit with Complex Data
  • Award number:
    RGPIN-2018-05008
  • Fiscal year:
    2022
  • Funding amount:
    $400,000
  • Project type:
    Discovery Grants Program - Individual
Distance-based robust inferences and model selection for semiparametric models
  • Award number:
    RGPIN-2018-04328
  • Fiscal year:
    2022
  • Funding amount:
    $400,000
  • Project type:
    Discovery Grants Program - Individual
Developing methods for model selection in causal health analyses.
  • Award number:
    2741534
  • Fiscal year:
    2022
  • Funding amount:
    $400,000
  • Project type:
    Studentship