CAREER: Beyond Conditional Independence: New Model-Free Targets for High-Dimensional Inference
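Basic information
- Award number: 2045981
- Principal investigator:
- Amount: $400,000
- Host institution:
- Host institution country: United States
- Award type: Continuing Grant
- Fiscal year: 2021
- Funding country: United States
- Project period: 2021-07-01 to 2026-06-30
- Project status: Ongoing
- Source:
- Keywords:
Project abstract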
The field of statistics has seen great success over many decades drawing scientific insights from simple, easy-to-understand models. But progress in computing is now allowing researchers to measure and store huge amounts of data at once, opening the door to understanding and manipulating much more complex systems than ever before. Indeed, the field of machine learning has been very successful at fitting predictive models to such data sets, but the black-box nature of these methods makes it hard to draw scientific insights from them using the usual statistical approach. In fact, not only do classical statistical methods fail in this modern big data setting, but the questions they answer no longer even make sense because the models they are based on do not hold even approximately. In this research, the PI will first come up with new statistical ways of posing scientific questions that make sense in complex data but still have interpretable answers. Second, the PI will find new statistical methods to answer those questions in a rigorous way, and study the mathematical and computational properties of these methods so that they can be used as effectively as possible. And finally, the PI will work with experts in the areas of genomics, the microbiome, and political science to use these methods to gain new scientific insights in these fields. Throughout the project, the PI will also run a free statistical consulting service to help the broader research community, develop new curricula with high school teachers, and provide enriching educational and research experiences for undergraduate and graduate students.
To understand the importance of a covariate in a high-dimensional regression, it has become increasingly popular to perform a hypothesis test for conditional independence with the response. The appeal of such a test is that it provides statistically rigorous insight that is well-defined and scientifically interpretable no matter how the response depends on the covariates, including the case when their relationship is highly nonlinear and includes interactions, possibly of high order. However, conditional independence as a model-free inferential target only provides a type of scientific insight that can be of little use in some applications. The PI will extend conditional independence to new model-free targets for two types of data on which it does not provide a useful inferential target, namely, data with highly-locally-dependent covariates and data with compositional covariates. Then, the PI will move past conditional independence entirely to propose novel model-free targets that instead of just identifying (as in variable selection) actually numerically quantify relationships in the data, such as the relationship between a covariate and a response or the interaction between two covariates in the response's conditional distribution. Along with each new target, the PI will develop entirely new methods for powerful and provably valid inference. The novelty of the proposed targets and associated methods will provide for new connections with other fields of statistics including Bayesian computation, measure theory, statistical physics, experimental design, causal inference, and graphical model estimation. Ultimately, this research aims to provide a suite of new tools for researchers to move beyond the constraints of parametric targets and instead leverage state-of-the-art machine learning tools to answer novel and important questions about their data in a statistically principled way.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Project outputs
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Lucas Janson
Optimal sampling-based motion planning under differential constraints: The drift case with linear affine dynamics
- DOI:
- Year published: 2014
- Journal:
- Impact factor: 0
- Authors: E. Schmerling; Lucas Janson; M. Pavone
- Corresponding author: M. Pavone
Supplementary material to “Panning for gold: Model-X knock-offs for high-dimensional controlled variable selection”
- DOI:
- Year published: 2017
- Journal:
- Impact factor: 0
- Authors: E. Candès; Yingying Fan; Lucas Janson; Jinchi Lv
- Corresponding author: Jinchi Lv
The $\ell$-test: leveraging sparsity in the Gaussian linear model for improved inference
- DOI:
- Year published: 2024
- Journal:
- Impact factor: 0
- Authors: Souhardya Sengupta; Lucas Janson
- Corresponding author: Lucas Janson
Controlled Discovery and Localization of Signals via Bayesian Linear Programming
- DOI: 10.1080/01621459.2024.2347667
- Year published: 2022
- Journal:
- Impact factor: 3.7
- Authors: Asher Spector; Lucas Janson
- Corresponding author: Lucas Janson
Cross-validation Confidence Intervals for Test Error
- DOI:
- Year published: 2020
- Journal:
- Impact factor: 0
- Authors: Pierre Bayle; Alexandre Bayle; Lucas Janson; Lester W. Mackey
- Corresponding author: Lester W. Mackey
Similar overseas grants
Collaborative Research: Beyond the Single-Atom Paradigm: A Priori Design of Dual-Atom Alloy Active Sites for Efficient and Selective Chemical Conversions
- Award number: 2334970
- Fiscal year: 2024
- Amount: $400,000
- Award type: Standard Grant
Collaborative Research: Research Infrastructure: MorphoCloud: A Cloud Powered, Open-Source Platform For Research, Teaching And Collaboration In 3d Digital Morphology And Beyond
- Award number: 2301410
- Fiscal year: 2024
- Amount: $400,000
- Award type: Standard Grant
Democratizing HIV science beyond community-based research
- Award number: 502555
- Fiscal year: 2024
- Amount: $400,000
- Award type:
Droughts Beyond Hydro-climatological Extremes
- Award number: 24K17352
- Fiscal year: 2024
- Amount: $400,000
- Award type: Grant-in-Aid for Early-Career Scientists
Beyond thiols, beyond gold: Novel NHC-stabilized nanoclusters in catalysis
- Award number: 23K21120
- Fiscal year: 2024
- Amount: $400,000
- Award type: Grant-in-Aid for Scientific Research (B)
Amalgamating Evidence About Causes: Medicine, the Medical Sciences, and Beyond
- Award number: AH/Y007654/1
- Fiscal year: 2024
- Amount: $400,000
- Award type: Research Grant
LSS_BeyondAverage: Probing cosmic large-scale structure beyond the average
- Award number: EP/Y027906/1
- Fiscal year: 2024
- Amount: $400,000
- Award type: Research Grant
BeyondSNO: Signalling beyond protein S-nitrosylation - determining the roles of nitroxyl and hydroxylamine
- Award number: EP/Y027698/1
- Fiscal year: 2024
- Amount: $400,000
- Award type: Research Grant
Twistors and Quantum Field Theory: Strong fields, holography and beyond
- Award number: EP/Z000157/1
- Fiscal year: 2024
- Amount: $400,000
- Award type: Research Grant
Exploitation of High Voltage CMOS sensors for tracking applications in physics experiments and beyond
- Award number: MR/X023834/1
- Fiscal year: 2024
- Amount: $400,000
- Award type: Fellowship