Career: Building Models that Avoid Spurious Correlations through Interpretability and Representation Learning
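Basic information
- Award number: 2145542
- Principal investigator:
- Amount: $546,700
- Host institution:
- Host institution country: United States
- Award type: Standard Grant
- Fiscal year: 2022
- Funding country: United States
- Period: 2022-07-01 to 2027-06-30
- Status: Ongoing
- Source:
- Keywords:
Project abstract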
Advances in artificial intelligence (AI) modeling have allowed AI to uncover and use all kinds of information to make accurate predictions. These predictions touch our day-to-day life, for example through fitness wearables that monitor health. Sometimes the predictions made by AI models rely on information that is unstable or spurious. For example, using sand to classify whether an image contains a camel or a cow gives the wrong answer when the model is presented with a camel in a grassy field. AI models also fail because of spurious information in other domains such as healthcare, where models can make predictions on the basis of how the data was collected rather than the physiological information in the data. This project aims to develop tools that help identify when AI models use spurious information, along with tools for building better AI models that avoid it. The results of this project will be algorithms that are applicable across several types of data and domains. The project will foster the development of undergraduate and PhD students through new lectures on AI for the real world and will promote data literacy through visualizations of AI models made by the tools the project will develop.
There are two technical thrusts in this project. The first thrust focuses on interpretability of AI models. It seeks to develop methods that can help identify the use of spurious information and, given knowledge of spurious information in an input, can help identify the semantic information in an input useful in predicting a label. This thrust will adapt the concept of learning to explain, which seeks to train a function to highlight the important part of an input for predicting a label, to the task of identifying and downweighting spurious information.
The second thrust constructs new representation learning algorithms for building models that avoid the use of spurious information. Spurious information consists of relationships between variables that change across a family of data generating distributions. This thrust will study the limits of reweighting-based estimators and flexible models and the utility of stronger assumptions, such as the existence of a residual. It will also study assumptions needed to address violations of positivity and how to do representation learning that avoids spurious information in multimodal data.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
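The reweighting idea named in the second thrust can be illustrated with a toy sketch of the camel/cow example. This is not the project's algorithm: it assumes one common form of reweighting, in which each sample gets the weight p(label) p(background) / p(label, background) so that the label and a known nuisance variable become independent in the weighted data. The function names, the 0.9 agreement rate, and the sample size are all hypothetical choices made for illustration.

```python
import random
from collections import Counter


def make_data(n, p_agree, seed=0):
    """Sample (label, background) pairs.

    label: 1 = camel, 0 = cow. background: 1 = sand, 0 = grass.
    The background matches the label with probability p_agree,
    which is what makes it a spurious predictor.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        y = rng.randint(0, 1)
        z = y if rng.random() < p_agree else 1 - y
        data.append((y, z))
    return data


def balancing_weights(data):
    """Estimate w(y, z) = p(y) * p(z) / p(y, z) from empirical counts."""
    n = len(data)
    joint = Counter(data)
    p_y = Counter(y for y, _ in data)
    p_z = Counter(z for _, z in data)
    return {
        (y, z): (p_y[y] / n) * (p_z[z] / n) / (joint[(y, z)] / n)
        for (y, z) in joint
    }


def p_sand_given_label(data, weights, label):
    """Weighted estimate of P(background = sand | label)."""
    num = sum(weights[(y, z)] for y, z in data if y == label and z == 1)
    den = sum(weights[(y, z)] for y, z in data if y == label)
    return num / den


# Hypothetical training set where sand co-occurs with camels 90% of the time.
data = make_data(n=10_000, p_agree=0.9)
weights = balancing_weights(data)
uniform = {key: 1.0 for key in weights}  # unweighted baseline

gap_before = p_sand_given_label(data, uniform, 1) - p_sand_given_label(data, uniform, 0)
gap_after = p_sand_given_label(data, weights, 1) - p_sand_given_label(data, weights, 0)
print(f"P(sand | camel) - P(sand | cow), unweighted: {gap_before:.3f}")
print(f"P(sand | camel) - P(sand | cow), reweighted: {gap_after:.3f}")
```

In the weighted data the background no longer carries information about the label, so a model trained with these sample weights has less incentive to rely on it. The sketch needs every (label, background) combination to appear at least once. When a combination is missing, the weights are undefined, which is the kind of positivity violation the abstract mentions.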
Project outputs
Journal articles (5)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
DIET: Conditional independence testing with marginal dependence measures of residual information
- DOI: 10.48550/arxiv.2208.08579
- Published: 2022-08
- Journal:
- Impact factor: 0
- Authors: Mukund Sudarshan; A. Puli; Wesley Tansey; R. Ranganath
- Corresponding authors: Mukund Sudarshan; A. Puli; Wesley Tansey; R. Ranganath
Where to Diffuse, How to Diffuse and How to Get Back: Automated Learning in Multivariate Diffusions
- DOI:
- Published: 2023
- Journal:
- Impact factor: 0
- Authors: Singhal, Raghav; Goldstein, Mark; Ranganath, Rajesh
- Corresponding author: Ranganath, Rajesh
Don't be fooled: label leakage in explanation methods and the importance of their quantitative evaluation
- DOI: 10.48550/arxiv.2302.12893
- Published: 2023-02
- Journal:
- Impact factor: 0
- Authors: N. Jethani; A. Saporta; R. Ranganath
- Corresponding authors: N. Jethani; A. Saporta; R. Ranganath
Robustness to Spurious Correlations Improves Semantic Out-of-Distribution Detection
- DOI: 10.48550/arxiv.2302.04132
- Published: 2023-02
- Journal:
- Impact factor: 0
- Authors: Lily H. Zhang; R. Ranganath
- Corresponding authors: Lily H. Zhang; R. Ranganath
Survival Mixture Density Networks
- DOI: 10.48550/arxiv.2208.10759
- Published: 2022-08
- Journal:
- Impact factor: 0
- Authors: Xintian Han; Mark Goldstein; R. Ranganath
- Corresponding authors: Xintian Han; Mark Goldstein; R. Ranganath
Other publications by Rajesh Ranganath
4299 External reproduction of a causal estimation of the individual and cohort benefit of sequential vs concurrent chemoradiotherapy in SIII NSCLC patients
- DOI: 10.1016/s0167-8140(25)03287-6
- Published: 2025-05-01
- Journal:
- Impact factor: 5.300
- Authors: Charlie Cunniffe; Wouter van Amsterdam; Rajesh Ranganath; Fiona Blackhall; Matthew Sperrin; Gareth Price
- Corresponding author: Gareth Price
From algorithms to action: improving patient care requires causality
- DOI: 10.1186/s12911-024-02513-3
- Published: 2024-04-26
- Journal:
- Impact factor: 3.800
- Authors: Wouter A. C. van Amsterdam; Pim A. de Jong; Joost J. C. Verhoeff; Tim Leiner; Rajesh Ranganath
- Corresponding author: Rajesh Ranganath
PO-04-210 NEW-ONSET DIABETES SCREENING USING ARTIFICIAL INTELLIGENCE-ENHANCED ELECTROCARDIOGRAM
- DOI: 10.1016/j.hrthm.2023.03.1301
- Published: 2023-05-01
- Journal:
- Impact factor: 5.700
- Authors: Lior Jankelson; Neil Jethani; Aahlad Puli; Hao Zhang; Leonid Garber; Yindalon Aphinyanaphongs; Rajesh Ranganath
- Corresponding author: Rajesh Ranganath
Correction to: The role of machine learning in clinical research: transforming the future of evidence generation
- DOI: 10.1186/s13063-021-05571-4
- Published: 2021-09-06
- Journal:
- Impact factor: 2.000
- Authors: E. Hope Weissler; Tristan Naumann; Tomas Andersson; Rajesh Ranganath; Olivier Elemento; Yuan Luo; Daniel F. Freitag; James Benoit; Michael C. Hughes; Faisal Khan; Paul Slater; Khader Shameer; Matthew Roe; Emmette Hutchison; Scott H. Kollins; Uli Broedl; Zhaoling Meng; Jennifer L. Wong; Lesley Curtis; Erich Huang; Marzyeh Ghassemi
- Corresponding author: Marzyeh Ghassemi
Towards Minimal Targeted Updates of Language Models with Targeted Negative Training
- DOI:
- Published: 2024
- Journal:
- Impact factor: 0
- Authors: Lily H. Zhang; Rajesh Ranganath; Arya Tafvizi
- Corresponding author: Arya Tafvizi
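Similar NSFC (National Natural Science Foundation of China) grants
Study of the regulatory mechanism for targeted starch modification by high-quality BE mutant enzymes constructed on the basis of amylopectin building blocks
- Award number: 31771933
- Approval year: 2017
- Amount: 600,000 CNY
- Award type: General Program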
Similar overseas grants
Building AI-Powered Responsible Workforce by Integrating Large Language Models into Computer Science Curriculum
- Award number: 2336061
- Fiscal year: 2024
- Amount: $546,700
- Award type: Standard Grant
Building better molecular models through artificial intelligence
- Award number: 2888940
- Fiscal year: 2023
- Amount: $546,700
- Award type: Studentship
Advanced large-scale damage estimation method using deep learning and 3D building models
- Award number: 23K04108
- Fiscal year: 2023
- Amount: $546,700
- Award type: Grant-in-Aid for Scientific Research (C)
Research on the development and solution of supply chain models for domestic production and building a sustainable society
- Award number: 23H01638
- Fiscal year: 2023
- Amount: $546,700
- Award type: Grant-in-Aid for Scientific Research (B)
CAREER: Building Next-Generation Language Models Based on Retrieval
- Award number: 2239290
- Fiscal year: 2023
- Amount: $546,700
- Award type: Continuing Grant
Building the next generation of computational psycholinguistic models of speech perception
- Award number: DGECR-2022-00296
- Fiscal year: 2022
- Amount: $546,700
- Award type: Discovery Launch Supplement
Building joint models of language and the 3D world
- Award number: RGPIN-2020-07196
- Fiscal year: 2022
- Amount: $546,700
- Award type: Discovery Grants Program - Individual
Bayesian processes for calibration of building performance simulation models and optimisation of building energy systems design
- Award number: RGPIN-2019-06188
- Fiscal year: 2022
- Amount: $546,700
- Award type: Discovery Grants Program - Individual
Towards building advanced machine learning image translation models to estimate Amyloid-Beta and Tau PET images from structural MRI
- Award number: 580342-2022
- Fiscal year: 2022
- Amount: $546,700
- Award type: Alliance Grants
Strengthening YWHO's Integrated Service Delivery and Measurement Based Care models: Building standards for a youth-focused learning health system
- Award number: 466080
- Fiscal year: 2022
- Amount: $546,700
- Award type: Directed Grant