
CAREER: Valid and Scalable Inference for High-dimensional Statistical Models

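Basic information

Award number:
1844481
Principal investigator:
Adel Javanmard
Amount:
approx. $402,200
Host institution:
University of Southern California
Host institution country:
United States
Award type:
Continuing Grant
Fiscal year:
2019
Funding country:
United States
Project period:
2019-03-15 to 2025-02-28
Project status:
Not yet concluded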

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=1844481&HistoricalAwards=false
Keywords:
CAREER Valid Scalable Inference dimensional

Project abstract

With the advent of "big data" technologies, fine-grained data sets can be collected at unprecedented scales, bringing transformative changes to modern life in areas ranging from healthcare and social networks to recommendation systems and commerce. As a result, data-driven methods are becoming de rigueur, driving the need for increasingly sophisticated algorithms that find subtle statistical patterns in massive amounts of data. This trend, however, is a double-edged sword. On the one hand, modern statistical learning methods help researchers in various fields discover unexpected patterns in data and make better decisions that affect everyday life. On the other hand, the rapid growth in the size and scope of data sets, as well as in the complexity of the methods used, has made statistical models less transparent. Employing the derived models without a proper understanding of their validity can lead to many false discoveries, incorrect predictions, and massive costs. For example, suppose the medical records of patients are used to develop a model that provides a personalized risk score for a chronic disease. A high risk score can trigger an intervention, such as incentives for healthy behavior, additional tests, and medical follow-ups, all of which are costly. This raises concerns about the validity of the outcomes returned by this model. Should one interpret them at an average level or an individual level? Are the model predictions biased and, if so, by how much? The overarching goal of this project is to develop novel foundational perspectives on the emerging inferential and computational challenges in statistics and data science. In addition, the PI plans to develop software packages that implement the proposed methods and to make them publicly available. The proposed work will benefit researchers in a broad range of areas, from bioinformatics and machine learning to finance and engineering. The PI will also integrate components from this project into an advanced graduate class and use selected results to motivate undergraduate students to pursue careers in STEM (Science, Technology, Engineering and Math).

This project aims to develop statistical methods that are (1) scalable and (2) valid, in the sense that, in addition to point estimation, they also quantify the statistical uncertainty that is intrinsic to the estimation and predictions. These issues are among the central topics in modern statistics, and it is imperative to develop solid theory and powerful methodology to address them. This project focuses on three interrelated prongs that develop fundamental insights for these ubiquitous challenges: (1) Uncertainty assessment and high-dimensional inference: the PI will develop a flexible framework for general hypothesis testing problems in high-dimensional settings using the so-called debiasing approach, and will further study inference for high-dimensional models with adaptively collected samples, such as time series; (2) Online hypothesis testing: the PI will formulate decentralized false discovery rate (FDR) control, where the number of hypotheses to be tested is unknown (possibly infinite) and the decision maker should take an action on each hypothesis before the next p-value is received; and (3) Optimal iterative estimation for non-linear decision regions.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
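
To make the online testing setting in prong (2) concrete, the sketch below implements one well-known online FDR rule, LOND (from earlier work by Javanmard and Montanari), in which a decision on each hypothesis is made before the next p-value arrives. It is an illustration of the setting only, not the decentralized procedure this project proposes; the function name `lond`, the default level `alpha=0.05`, and the choice of weights proportional to 1/i^2 are assumptions made for the example.

```python
import math


def lond(p_values, alpha=0.05):
    """Illustrative LOND-style online testing (a sketch, not the project's method).

    p_values may be any iterable, including a stream of unknown length.
    Returns a list of booleans: True where the hypothesis is rejected.
    """
    # Assumed weight sequence: beta_i = alpha * (6 / pi^2) / i^2,
    # which is nonnegative and sums to alpha over i = 1, 2, ...
    norm = 6.0 / math.pi ** 2
    decisions = []
    discoveries = 0  # number of rejections made so far
    for i, p in enumerate(p_values, start=1):
        beta_i = alpha * norm / i ** 2
        # The test level grows with the number of earlier discoveries.
        alpha_i = beta_i * (discoveries + 1)
        reject = p <= alpha_i
        decisions.append(reject)
        if reject:
            discoveries += 1
    return decisions


print(lond([0.001, 0.5, 0.005, 0.2]))  # prints [True, False, True, False]
```

Each decision uses only the p-values seen so far, so the total number of hypotheses never needs to be known in advance. The third hypothesis is rejected here only because the earlier discovery raised its test level.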


Project outputs

Journal articles (14)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

onlineFDR: an R package to control the false discovery rate for growing data repositories

DOI:
10.1093/bioinformatics/btz191
Published:
2019-10-15
Journal:
Bioinformatics
Impact factor:
5.8
Authors:
Robertson, David S.; Wildenhain, Jan; Karp, Natasha A.
Corresponding author:
Karp, Natasha A.

Analysis of a two-layer neural network via displacement convexity

DOI:
10.1214/20-aos1945
Published:
2020
Journal:
The Annals of Statistics
Impact factor:
0
Authors:
Javanmard, Adel; Mondelli, Marco; Montanari, Andrea
Corresponding author:
Montanari, Andrea

False discovery rate control via debiased lasso

DOI:
10.1214/19-ejs1554
Published:
2019-01-01
Journal:
Electronic Journal of Statistics
Impact factor:
1.1
Authors:
Javanmard, Adel; Javadi, Hamid
Corresponding author:
Javadi, Hamid

Dynamic Incentive-Aware Learning: Robust Pricing in Contextual Auctions

DOI:
10.1287/opre.2020.1991
Published:
2021
Journal:
Operations Research
Impact factor:
2.7
Authors:
Negin Golrezaei; Adel Javanmard
Corresponding authors:
Negin Golrezaei; Adel Javanmard

Precise Tradeoffs in Adversarial Training for Linear Regression

DOI:
Published:
2020-02
Journal:
Impact factor:
0
Authors:
Adel Javanmard; M. Soltanolkotabi; Hamed Hassani
Corresponding authors:
Adel Javanmard; M. Soltanolkotabi; Hamed Hassani

Other publications by Adel Javanmard

PriorBoost: An Adaptive Algorithm for Learning from Aggregate Responses

DOI:
Published:
2024
Journal:
arXiv.org
Impact factor:
0
Authors:
Adel Javanmard; Matthew Fahrbach; V. Mirrokni
Corresponding author:
V. Mirrokni

Anonymous Learning via Look-Alike Clustering: A Precise Analysis of Model Generalization

DOI:
Published:
2023
Journal:
Neural Information Processing Systems
Impact factor:
0
Authors:
Adel Javanmard; V. Mirrokni
Corresponding author:
V. Mirrokni

Robust max-product belief propagation

DOI:
10.1109/acssc.2011.6189951
Published:
2011
Journal:
2011 Conference Record of the Forty Fifth Asilomar Conference on Signals, Systems and Computers (ASILOMAR)
Impact factor:
0
Authors:
M. Ibrahimi; Adel Javanmard; Yashodhan Kanoria; A. Montanari
Corresponding author:
A. Montanari