CAREER: Valid and Scalable Inference for High-dimensional Statistical Models
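Basic information
- Award number: 1844481
- Principal investigator:
- Amount: $402,200
- Host institution:
- Host institution country: United States
- Award type: Continuing Grant
- Fiscal year: 2019
- Funding country: United States
- Period: 2019-03-15 to 2025-02-28
- Project status: Not concluded
- Source:
- Keywords:
Project abstract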
Due to the advent of "big data" technologies, fine-grained data sets can be collected at unprecedented scales, bringing transformative changes to modern life, ranging from healthcare and social networks to recommendation systems and commerce. As a result, data-driven methods are becoming de rigueur, driving the need for increasingly sophisticated algorithms that find subtle statistical patterns in massive amounts of data. This trend, however, is a double-edged sword: on the one hand, modern statistical learning methods help researchers in various fields to discover unexpected patterns from data and to make better decisions impacting everyday life. On the other hand, the rapid growth in the size and scope of data sets, as well as the complexity of the methods used, has made statistical models less transparent. Employing the derived models without a proper understanding of their validity can lead to many false discoveries, incorrect predictions and massive costs. For example, suppose the medical records of patients are used to develop a model for providing a personalized risk score for a chronic disease. A high risk score can trigger an intervention, such as incentives for healthy behavior, additional tests, and medical follow-ups, which are all costly. This raises concerns about the validity of the outcomes returned by this model. Should one interpret them at an average level or an individual level? Are the model predictions biased and, if so, by how much? The overarching goal of this project is to develop novel foundational perspectives on the emerging inferential and computational challenges in statistics and data science. In addition, the PI plans to develop software packages to implement the proposed methods and make them publicly available. The proposed work will benefit a broad range of researchers from various areas ranging from bioinformatics and machine learning to finance and engineering. The PI will also integrate components from this project into an advanced graduate class and use selected results to motivate undergraduate students to pursue careers in STEM (Science, Technology, Engineering and Math).
This project aims at developing statistical methods that are 1) scalable and 2) valid in the sense that, in addition to point estimation, they also quantify the statistical uncertainty that is intrinsic in the estimation and predictions. These issues are among the central topics in modern statistics and it is imperative to develop solid theory and powerful methodology to address them. This project focuses on three interrelated prongs that develop fundamental insights for these ubiquitous challenges: (1) Uncertainty assessment and high-dimensional inference: the PI will develop a flexible framework for general hypothesis testing problems in high-dimensional settings using the so-called debiasing approach, and further study inference for high-dimensional models with adaptively collected samples, such as time series; (2) Online hypothesis testing: the PI will formulate the decentralized false discovery rate (FDR) control where the number of hypotheses to be tested is unknown (possibly infinite) and the decision maker should take an action on each before the next p-value is received; and (3) Optimal iterative estimation for non-linear decision regions. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
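Prong (2) of the abstract describes a setting in which p-values arrive one at a time and a decision on each must be made before the next one is received. As a minimal illustration of that constraint only, and not the PI's proposed procedure, the Python sketch below uses a basic alpha-spending rule: the i-th hypothesis is tested at level alpha * 6 / (pi^2 * i^2), so the levels sum to alpha over an infinite stream. By the union bound this keeps the family-wise error rate, and therefore also the FDR, at or below alpha, although conservatively. The function name `online_alpha_spending` and the choice of spending sequence are assumptions made for this example.

```python
import math


def online_alpha_spending(p_values, alpha=0.05):
    """Decide on each p-value as it arrives, without using later p-values.

    Hypothetical illustration: the i-th test level is
    alpha * 6 / (pi^2 * i^2), and these levels sum to alpha over i = 1, 2, ...
    """
    decisions = []
    for i, p in enumerate(p_values, start=1):
        # Level depends only on the position i in the stream, never on the future.
        level = alpha * 6.0 / (math.pi ** 2 * i ** 2)
        decisions.append(p <= level)  # True means "reject the i-th null"
    return decisions


if __name__ == "__main__":
    stream = [0.001, 0.5, 0.0001]
    print(online_alpha_spending(stream))
```

Because the test levels shrink like 1/i^2, this rule loses power quickly on long streams; adaptive online FDR procedures are designed to avoid that loss by earning back error budget after each rejection.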
Project outputs
Journal articles (14)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
onlineFDR: an R package to control the false discovery rate for growing data repositories
- DOI: 10.1093/bioinformatics/btz191
- Published: 2019-10-15
- Journal:
- Impact factor: 5.8
- Authors: Robertson, David S.; Wildenhain, Jan; Karp, Natasha A.
- Corresponding author: Karp, Natasha A.
Analysis of a two-layer neural network via displacement convexity
- DOI: 10.1214/20-aos1945
- Published: 2020
- Journal:
- Impact factor: 0
- Authors: Javanmard, Adel; Mondelli, Marco; Montanari, Andrea
- Corresponding author: Montanari, Andrea
False discovery rate control via debiased lasso
- DOI: 10.1214/19-ejs1554
- Published: 2019-01-01
- Journal:
- Impact factor: 1.1
- Authors: Javanmard, Adel; Javadi, Hamid
- Corresponding author: Javadi, Hamid
Dynamic Incentive-Aware Learning: Robust Pricing in Contextual Auctions
- DOI: 10.1287/opre.2020.1991
- Published: 2021
- Journal:
- Impact factor: 2.7
- Authors: Negin Golrezaei; Adel Javanmard
- Corresponding author: Negin Golrezaei; Adel Javanmard
Precise Tradeoffs in Adversarial Training for Linear Regression
- DOI:
- Published: 2020-02
- Journal:
- Impact factor: 0
- Authors: Adel Javanmard; M. Soltanolkotabi; Hamed Hassani
- Corresponding author: Adel Javanmard; M. Soltanolkotabi; Hamed Hassani
Other publications by Adel Javanmard
PriorBoost: An Adaptive Algorithm for Learning from Aggregate Responses
- DOI:
- Published: 2024
- Journal:
- Impact factor: 0
- Authors: Adel Javanmard; Matthew Fahrbach; V. Mirrokni
- Corresponding author: V. Mirrokni
Anonymous Learning via Look-Alike Clustering: A Precise Analysis of Model Generalization
- DOI:
- Published: 2023
- Journal:
- Impact factor: 0
- Authors: Adel Javanmard; V. Mirrokni
- Corresponding author: V. Mirrokni
Robust max-product belief propagation
- DOI: 10.1109/acssc.2011.6189951
- Published: 2011
- Journal:
- Impact factor: 0
- Authors: M. Ibrahimi; Adel Javanmard; Yashodhan Kanoria; A. Montanari
- Corresponding author: A. Montanari
Near-Optimal Model Discrimination with Non-Disclosure
- DOI:
- Published: 2020
- Journal:
- Impact factor: 0
- Authors: Dmitrii Ostrovskii; M. Ndaoud; Adel Javanmard; Meisam Razaviyayn
- Corresponding author: Meisam Razaviyayn
Pearson Chi-squared Conditional Randomization Test
- DOI:
- Published: 2021
- Journal:
- Impact factor: 0
- Authors: Adel Javanmard; M. Mehrabi
- Corresponding author: M. Mehrabi
Other grants held by Adel Javanmard
Robust and scalable algorithms for learning hidden structures in sparse network data with the aid of side information
- Award number: 2311024
- Fiscal year: 2023
- Funding amount: $402,200
- Award type: Standard Grant
Similar overseas grants
Constructing Valid, Equitable, and Flexible Kinematics and Dynamics Assessment Scales with Evidence-Centered Design
- Award number: 2235595
- Fiscal year: 2023
- Funding amount: $402,200
- Award type: Standard Grant
The Common Fund Knowledge Center (CFKC): providing scientifically valid knowledge from the Common Fund Data Ecosystem to a diverse biomedical research community.
- Award number: 10851461
- Fiscal year: 2023
- Funding amount: $402,200
- Award type:
Theoretical and empirical study on effective and valid reputation assignment rules for cooperation
- Award number: 22KJ0105
- Fiscal year: 2023
- Funding amount: $402,200
- Award type: Grant-in-Aid for JSPS Fellows
Constructing Valid, Equitable, and Flexible Kinematics and Dynamics Assessment Scales with Evidence-Centered Design
- Award number: 2235518
- Fiscal year: 2023
- Funding amount: $402,200
- Award type: Standard Grant
Towards a reliable and valid assessment of preteen suicidal thoughts and behavior
- Award number: 10583418
- Fiscal year: 2023
- Funding amount: $402,200
- Award type:
Constructing Valid, Equitable, and Flexible Kinematics and Dynamics Assessment Scales with Evidence-Centered Design
- Award number: 2235681
- Fiscal year: 2023
- Funding amount: $402,200
- Award type: Standard Grant
Game-theoretic statistics and safe anytime-valid inference
- Award number: 2310718
- Fiscal year: 2023
- Funding amount: $402,200
- Award type: Standard Grant
Applying ecologically valid approaches to social cognitive ageing
- Award number: DE220100561
- Fiscal year: 2022
- Funding amount: $402,200
- Award type: Discovery Early Career Researcher Award
Phase III Development of a Valid, Reliable, Clinically Feasible Measure of Transactional Success in Aphasic Conversation: Modernizing Methods of Acquisition and Analysis of Discourse Data
- Award number: 10617305
- Fiscal year: 2022
- Funding amount: $402,200
- Award type:
Mobile Ecological Momentary Diet Assessment: A Low Burden, Ecologically-Valid Approach to Measuring Dietary Intake in Near-Real Time
- Award number: 10593785
- Fiscal year: 2022
- Funding amount: $402,200
- Award type: