Novel p-Value Based Multiple Testing Methods for Variable Selection with False Discovery Rate Control
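Basic Information
- Award number: 2210687
- Principal investigator:
- Amount: $269,700
- Host institution:
- Host institution country: United States
- Project type: Standard Grant
- Fiscal year: 2022
- Funding country: United States
- Period: 2022-07-01 to 2025-06-30
- Project status: Ongoing
- Source:
- Keywords: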
Project Abstract
Multiple testing is one of the most common statistical challenges in modern scientific investigations. This project aims to resolve some longstanding issues in the application of multiple testing methods. One such issue arises when discovering, among a large collection of variables, those that have an important influence on an outcome of interest: standard multiple testing methods are often inapplicable because the interdependency among the variables is unknown. The methods under development aim to provide new approaches to discovering important variables no matter how the variables depend on each other, with the guarantee that, on average, only a small, controlled fraction of the variables declared important are false discoveries. An example application is the identification of genetic variants that, among many thousands, can influence a certain disease. The new methods can aid in identifying genes that are relatively more relevant for therapeutic intervention. The fundamental theoretical and methodological ideas behind these methods will be extended to resolve similar issues with multiple testing methods in other experimental settings. The research carried out in the project will be incorporated into courses, benefiting the training of undergraduate and graduate students.

This research project focuses on important theoretical and methodological issues related to multiple testing. For instance, feature/variable selection in multiple linear regression with Gaussian noise, a ubiquitous statistical framework in scientific investigations that plays an important role in data science, is often framed as a multiple testing problem. An ideal p-value based multiple testing method would, irrespective of which error rate is chosen to control the falsely discovered explanatory variables, capture the correlation matrix of the explanatory variables in full without losing control over that error rate. Unfortunately, such methods are yet to be developed in a non-asymptotic setting. Similarly, for the related problem of simultaneously testing multivariate Gaussian means with a non-diagonal correlation matrix, subject to control of an error rate, a p-value based multiple testing method that fully captures the correlation information without losing control over that rate is largely absent from the literature. The challenges will be met by research that cross-fertilizes two seminal ideas in multiple inference: 1) the use of p-value based multiple testing methods to control false discoveries; and 2) the use of a knockoff of the design matrix for variable selection in linear regression settings. Concretely, the project aims to develop novel p-value based multiple testing methods that control the false discovery rate and other powerful error rates for 1) variable selection in multiple linear regression with Gaussian noise, in both low- and high-dimensional settings; and 2) simultaneous testing of multivariate Gaussian means with a general non-diagonal covariance matrix.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
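To make the idea of a p-value based method with false discovery rate (FDR) control concrete, here is a minimal sketch of the classical Benjamini-Hochberg step-up procedure. It is a standard baseline, not the new methods this project proposes: its FDR guarantee is established for independent or positively dependent p-values, which is exactly the kind of restriction the abstract describes as a limitation. The function name `benjamini_hochberg` and the sample p-values are illustrative assumptions, not taken from the project.

```python
import numpy as np


def benjamini_hochberg(p_values, alpha=0.05):
    """Return a boolean mask marking hypotheses rejected by the BH step-up rule.

    Illustrative baseline only; assumes the p-values are valid and that the
    dependence structure is one under which BH is known to control the FDR.
    """
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)                          # indices that sort p ascending
    thresholds = alpha * np.arange(1, m + 1) / m   # alpha * k / m for k = 1..m
    below = p[order] <= thresholds
    reject = np.zeros(m, dtype=bool)
    if below.any():
        # Step-up: find the largest k with p_(k) <= alpha * k / m,
        # then reject the k smallest p-values.
        k = np.nonzero(below)[0].max()
        reject[order[: k + 1]] = True
    return reject


# Hypothetical p-values, e.g. one per candidate explanatory variable.
p_vals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
rejected = benjamini_hochberg(p_vals, alpha=0.05)
print(rejected.sum(), "variables flagged as discoveries")
```

In a regression setting the p-values would typically come from per-coefficient tests, and the correlation among explanatory variables is what makes those p-values dependent.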
Project Outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants awarded to Sanat Sarkar
Collaborative Research: New Directions for Research on Some Large-Scale Multiple Testing Problems
- Award number: 1309273
- Fiscal year: 2013
- Funding amount: $269,700
- Project type: Continuing Grant
Collaborative Research: Constructing New Multiple Testing Methods
- Award number: 1006344
- Fiscal year: 2010
- Funding amount: $269,700
- Project type: Standard Grant
Multiple Testing: Further Development Of Theory And Methodology
- Award number: 0603868
- Fiscal year: 2006
- Funding amount: $269,700
- Project type: Standard Grant
New Problems in Multiple Hypotheses Testing
- Award number: 0306366
- Fiscal year: 2003
- Funding amount: $269,700
- Project type: Standard Grant
NSF-CBMS Regional Conference in Mathematical Sciences: New Horizons in Multiple Comparison Procedures, August 13-17, 2001
- Award number: 0086140
- Fiscal year: 2000
- Funding amount: $269,700
- Project type: Standard Grant
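Similar grants from the National Natural Science Foundation of China
Research on Value-at-Risk forecasting models based on quantile dependence between time series
- Award number: 71903144
- Approval year: 2019
- Funding amount: 170,000 yuan
- Project type: Young Scientists Fund project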
Similar overseas grants
What is the role of striatal dopamine in value-based decision-making?
- Award number: DP240103246
- Fiscal year: 2024
- Funding amount: $269,700
- Project type: Discovery Projects
Development of Integrated Quantum Inspired Algorithms for Shapley Value based Fast and Interpretable Feature Subset Selection
- Award number: 24K15089
- Fiscal year: 2024
- Funding amount: $269,700
- Project type: Grant-in-Aid for Scientific Research (C)
Interaction Design for Circular Economy Based on the Dynamics of Subjective Value for Objects
- Award number: 23H03685
- Fiscal year: 2023
- Funding amount: $269,700
- Project type: Grant-in-Aid for Scientific Research (B)
Design, Analysis, and Optimization of Equitable and Value-based Baseline Testing Policies for Sports-Related Concussion
- Award number: 10649169
- Fiscal year: 2023
- Funding amount: $269,700
- Project type:
Exploring affect-motivated alcohol use as a value-based decision-making process
- Award number: 10738470
- Fiscal year: 2023
- Funding amount: $269,700
- Project type:
Prognostic Value of CMR-Based Myocardial Entropy in Heart Failure Patients with non-reduced Ejection Fraction (HF non-rEF)
- Award number: 23K15106
- Fiscal year: 2023
- Funding amount: $269,700
- Project type: Grant-in-Aid for Early-Career Scientists
New bio-based and sustainable raw materials enabling circular value chains of high performance lightweight biocomposites
- Award number: 10070588
- Fiscal year: 2023
- Funding amount: $269,700
- Project type: EU-Funded
Improving Healthcare Quality and Equity For Older Adults with HIV Under Value-Based Care Models
- Award number: 10762522
- Fiscal year: 2023
- Funding amount: $269,700
- Project type:
Orbitofrontal modulation of dopamine during value-based decision-making
- Award number: 10607543
- Fiscal year: 2023
- Funding amount: $269,700
- Project type:
Estimation of optimal dosage of opioid based on vascular stiffness value
- Award number: 23K15598
- Fiscal year: 2023
- Funding amount: $269,700
- Project type: Grant-in-Aid for Early-Career Scientists