Dependable Predictive Inference with Uncertainty-Aware Machine Learning
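Basic information
- Award number: 2210637
- Principal investigator:
- Amount: $160,000
- Host institution:
- Host institution country: United States
- Project type: Continuing Grant
- Fiscal year: 2022
- Funding country: United States
- Period: 2022-08-15 to 2025-07-31
- Project status: Not concluded
- Source:
- Keywords:
Project abstract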
Complex statistical and machine learning models, including deep neural networks, are widely applied in many fields, and they are becoming increasingly central to data-driven science despite serious concerns about their reliability. These models cannot always be trusted, especially in sensitive and high-noise applications such as those found in genomics, or in any context in which machine learning predictions will affect people’s health or welfare. A crucial current limitation of machine learning models is that they may not adequately capture uncertainty, and their predictions often tend to be overconfident. Further, machine learning models are known to sometimes reinforce latent biases hidden in the data, and thus they may lead to predictions that are systematically biased against certain groups of individuals. Finally, many statistical and machine learning models may perform well within the specific data set on which they are trained, but their predictions are not robust to changing data environments, such as those that arise in the genetic analysis of individuals from populations with different ancestries. To address these limitations, this research project will develop general methods for accurate, fair, and robust uncertainty estimation in machine learning. In the specific context of genomics, this work will lead to improved genetic risk prediction across human populations, facilitating further developments in personalized medicine, bridging health disparities across populations, and helping deepen our scientific knowledge of heritable diseases. This project will support education in statistical and machine learning research by providing training opportunities for graduate students. This project will also help promote diversity in statistical and machine learning research by supporting the investigator’s involvement with the Diversity, Inclusion, Access JumpStart initiative of the University of Southern California. In particular, the investigator will offer summer research opportunities for undergraduate students, focusing on the topics of this project.
This research consists of three distinct but closely connected parts. The first part will develop novel conformal inference methods to train and calibrate uncertainty-aware machine learning models that are both accurate and reliable. This research will involve the development of novel loss functions and innovative stochastic optimization algorithms. The second part of this project will develop methods for training and calibrating uncertainty-aware machine learning models that treat individuals belonging to different groups fairly, carefully using hold-out observations to correct for possible algorithmic or data biases. The third part of this project will develop methods based on data hold-out and conformal inference to construct predictive models that are more robust to possible shifts in the covariate distribution. These models will be able to leverage possible interactions among the available predictive variables and ultimately lead to powerful multivariate models of genetic risk for heritable diseases that may be relied on across different populations.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
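The abstract refers to calibrating models with hold-out observations and conformal inference. As a rough illustration of that general idea only, the sketch below implements textbook split conformal prediction for regression with absolute-residual scores. It is not the project's method; the function name `split_conformal_interval`, the toy linear "model", and the simulated data are all assumptions made for this example.

```python
import numpy as np


def split_conformal_interval(cal_pred, cal_y, test_pred, alpha=0.1):
    """Split conformal intervals from hold-out (calibration) predictions.

    Hypothetical helper for illustration; assumes calibration and test
    points are exchangeable.
    """
    cal_pred = np.asarray(cal_pred, dtype=float)
    cal_y = np.asarray(cal_y, dtype=float)
    test_pred = np.asarray(test_pred, dtype=float)
    n = cal_y.shape[0]
    # Conformity scores: absolute residuals on the hold-out set, sorted.
    scores = np.sort(np.abs(cal_y - cal_pred))
    # Finite-sample corrected rank ceil((n + 1) * (1 - alpha)).
    k = int(np.ceil((n + 1) * (1 - alpha)))
    # With too few calibration points the interval must be unbounded.
    q = np.inf if k > n else scores[k - 1]
    return test_pred - q, test_pred + q


# Toy data (assumed): y = 2x + Gaussian noise.
rng = np.random.default_rng(0)
x_cal = rng.uniform(-1.0, 1.0, 500)
y_cal = 2.0 * x_cal + rng.normal(0.0, 0.5, 500)
x_test = rng.uniform(-1.0, 1.0, 2000)
y_test = 2.0 * x_test + rng.normal(0.0, 0.5, 2000)


def model(x):
    # Stand-in for a previously fitted predictor.
    return 2.0 * x


lo, hi = split_conformal_interval(model(x_cal), y_cal, model(x_test), alpha=0.1)
coverage = np.mean((y_test >= lo) & (y_test <= hi))
print(f"empirical coverage at alpha=0.1: {coverage:.3f}")
```

Under exchangeability, intervals built this way cover the true response with probability at least 1 - alpha on average over the calibration and test draws, whatever the quality of the underlying model. The guarantee is marginal, so it does not by itself ensure equal coverage across groups or robustness to covariate shift, which are the gaps the second and third parts of the project describe.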
Project outputs
Journal articles (2)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Training Uncertainty-Aware Classifiers with Conformalized Deep Learning
- DOI: 10.48550/arxiv.2205.05878
- Published: 2022-05
- Journal:
- Impact factor: 0
- Authors: Bat-Sheva Einbinder; Yaniv Romano; Matteo Sesia; Yanfei Zhou
- Corresponding authors: Bat-Sheva Einbinder; Yaniv Romano; Matteo Sesia; Yanfei Zhou
Conformal Frequency Estimation with Sketched Data
- DOI:
- Published: 2022
- Journal:
- Impact factor: 0
- Authors: Sesia, Matteo; Favaro, Stefano
- Corresponding author: Favaro, Stefano
Other publications by Matteo Sesia
Derandomized Novelty Detection with FDR Control via Conformal E-values
- DOI: 10.48550/arxiv.2302.07294
- Published: 2023
- Journal:
- Impact factor: 0
- Authors: Meshi Bashari; Amir Epstein; Yaniv Romano; Matteo Sesia
- Corresponding author: Matteo Sesia
ASO Visual Abstract: Robotic-Assisted Esophagectomy Leads to Significant Reduction in Postoperative Acute Pain—A Retrospective Clinical Trial
- DOI: 10.1245/s10434-022-12280-y
- Published: 2022-07-30
- Journal:
- Impact factor: 3.500
- Authors: Jens P. Hoelzen; Karl J. Sander; Matteo Sesia; Dhruvajyoti Roy; Emile Rijcken; Alexander Schnabel; Benjamin Struecker; Mazen A. Juratli; Andreas Pascher
- Corresponding author: Andreas Pascher
Conformalized Frequency Estimation from Sketched Data (preprint)
- DOI:
- Published: 2022
- Journal:
- Impact factor: 0
- Authors: Matteo Sesia; S. Favaro
- Corresponding author: S. Favaro
Integrative conformal p-values for out-of-distribution testing with labelled outliers
- DOI: 10.1093/jrsssb/qkad138
- Published: 2024
- Journal:
- Impact factor: 0
- Authors: Zi; Matteo Sesia; Wenguang Sun
- Corresponding author: Wenguang Sun
Similar overseas grants
Theory and Methods for Modern Predictive Inference
- Award number: 2310764
- Fiscal year: 2023
- Funding amount: $160,000
- Project type: Standard Grant
Mean Field Asymptotics in Statistical Inference: Variational Approach, Multiple Testing, and Predictive Inference
- Award number: 2210827
- Fiscal year: 2022
- Funding amount: $160,000
- Project type: Continuing Grant
Performance Evaluation of Nonparametric Predictive Inference
- Award number: 2600057
- Fiscal year: 2021
- Funding amount: $160,000
- Project type: Studentship
Predictive inference for clinical trials with the parametric bootstrap
- Award number: 2565020
- Fiscal year: 2021
- Funding amount: $160,000
- Project type: Studentship
Conference on Predictive Inference and Its Applications
- Award number: 1810945
- Fiscal year: 2018
- Funding amount: $160,000
- Project type: Standard Grant
ASPIRE: Automated Sensing & Predictive Inference for Respiratory Exacerbation
- Award number: EP/P009824/1
- Fiscal year: 2017
- Funding amount: $160,000
- Project type: Research Grant
Integrating Multi-Scale Imaging, Reaction-Diffusion Simulation, and Markov Model Inference to Enhance Predictive Design and Interpretation of Single-Molecule Gene Regulation Experiments
- Award number: 10406604
- Fiscal year: 2017
- Funding amount: $160,000
- Project type:
Integrating Multi-Scale Imaging, Reaction-Diffusion Simulation, and Markov Model Inference to Enhance Predictive Design and Interpretation of Single-Molecule Gene Regulation Experiments
- Award number: 10704524
- Fiscal year: 2017
- Funding amount: $160,000
- Project type:
Efficient statistical inference strategies and Bayesian analysis for constrained parameters and predictive densities
- Award number: 105806-2012
- Fiscal year: 2016
- Funding amount: $160,000
- Project type: Discovery Grants Program - Individual
Efficient statistical inference strategies and Bayesian analysis for constrained parameters and predictive densities
- Award number: 105806-2012
- Fiscal year: 2015
- Funding amount: $160,000
- Project type: Discovery Grants Program - Individual