CRII: RI: Explaining Decisions of Black-box Models via Input Perturbations

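Basic Information

Award number:
1756023
Principal investigator:
Sameer Singh
Amount:
About $174,900
Host institution:
University of California-Irvine
Host institution country:
United States
Award type:
Standard Grant
Fiscal year:
2018
Funding country:
United States
Period:
2018-07-01 to 2021-06-30
Status:
Completed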

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=1756023&HistoricalAwards=false
Keywords:
CRII RI Explaining Decisions Black

Project Abstract

Machine learning is at the forefront of many recent advances in science and technology, enabled in part by complex models and algorithms. However, as a consequence of this complexity, machine learning systems essentially act as "black-boxes" as far as users are concerned. Thus, it is incredibly difficult to predict what they will do when deployed, understand why they are making the decisions, guarantee their robustness, or broadly speaking, trust their behavior. As these algorithms become an increasing part of our society, our financial systems, our healthcare providers, our scientific advances, and our defense systems, it is crucial to address this challenge. In this work, the PI and his team will develop algorithms that explain why any classifier is making its decisions, without any access to its underlying implementation, in order to make the inner workings understandable to the users. Such explanations make machine learning more transparent, leading to a more robust evaluation pipeline, reduced debugging efforts, and increased ease of use (and of trust) of these complex, black-box systems.

For a decision made by a machine learning classifier, the team will develop methods that accurately characterize the relationship between the input instance and the algorithm's prediction, and present it in an intuitive manner. The primary intuition is to estimate the instance-specific behavior of the predictor by observing the output of the classifier as the input instance is perturbed. The first proposed thrust of this work extends this basic framework by considering rules that define counter-examples, and summarize the behavior over multiple instances, providing detailed and accurate insights into the behavior with minimal effort on the users' part. The second thrust identifies automated ways to learn domain-specific perturbation functions that generate realistic instances to compute the explanations. The team proposes a comprehensive evaluation of these explainers consisting of user experiments in comparing, trusting, and modifying machine learning algorithms, with applications to diverse tasks such as sentiment analysis, machine translation, time series, visual question answering, and object detection.

Due to the many potential applications of this work, both for machine learning practitioners and end-users, dissemination of the results is a key focus, and the team will augment standard channels (such as publications) with novel ones that include open-source software, jargon-free documentation, and interactive tutorials/demonstrations to encourage application of machine learning to novel domains.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
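
The abstract's central intuition, estimating how a predictor behaves on one instance by watching its output as that instance is perturbed, can be illustrated with a small sketch. Everything in it is an assumption made for illustration: `toy_black_box` is a hypothetical stand-in classifier, `explain_by_perturbation` is a hypothetical helper, and the difference-of-means estimator is a deliberately simple choice, not the method developed under this award.

```python
import random
from typing import Callable, List, Sequence, Tuple


def toy_black_box(tokens: Sequence[str]) -> float:
    """Hypothetical stand-in classifier: returns a 'positive' score in [0, 1]."""
    positive = {"great", "good", "love"}
    negative = {"terrible", "bad", "boring"}
    raw = sum(t in positive for t in tokens) - sum(t in negative for t in tokens)
    return min(1.0, max(0.0, 0.5 + 0.25 * raw))


def explain_by_perturbation(
    predict: Callable[[Sequence[str]], float],
    tokens: Sequence[str],
    num_samples: int = 500,
    seed: int = 0,
) -> List[Tuple[str, float]]:
    """Estimate each token's influence by querying `predict` on perturbed inputs.

    Each perturbation keeps every token independently with probability 0.5.
    A token's score is the mean prediction when it is kept minus the mean
    prediction when it is dropped; only the model's outputs are used.
    """
    rng = random.Random(seed)
    kept_sum = [0.0] * len(tokens)
    kept_n = [0] * len(tokens)
    dropped_sum = [0.0] * len(tokens)
    dropped_n = [0] * len(tokens)

    for _ in range(num_samples):
        # Draw one random perturbation and query the black box on it.
        mask = [rng.random() < 0.5 for _ in tokens]
        output = predict([t for t, keep in zip(tokens, mask) if keep])
        for i, keep in enumerate(mask):
            if keep:
                kept_sum[i] += output
                kept_n[i] += 1
            else:
                dropped_sum[i] += output
                dropped_n[i] += 1

    scores = []
    for i, token in enumerate(tokens):
        kept_mean = kept_sum[i] / kept_n[i] if kept_n[i] else 0.0
        dropped_mean = dropped_sum[i] / dropped_n[i] if dropped_n[i] else 0.0
        scores.append((token, kept_mean - dropped_mean))
    return scores


if __name__ == "__main__":
    sentence = "the plot was boring but the acting was great".split()
    for token, score in explain_by_perturbation(toy_black_box, sentence):
        print(f"{token:>8s} {score:+.3f}")
```

In this toy setting the explainer never inspects the classifier's word lists; it only calls `predict`, which mirrors the "without any access to its underlying implementation" constraint in the abstract. Random token dropping is the crudest possible perturbation function; the abstract's second thrust concerns learning more realistic, domain-specific ones.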

Project Outputs

Journal papers (11)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Evaluating Models’ Local Decision Boundaries via Contrast Sets

DOI:
10.18653/v1/2020.findings-emnlp.117
Published:
2020-04
Journal:
Impact factor:
0
Authors:
Matt Gardner;Yoav Artzi;Jonathan Berant;Ben Bogin;Sihao Chen;Dheeru Dua;Yanai Elazar;Ananth Gottumukkala;Nitish Gupta;Hannaneh Hajishirzi;Gabriel Ilharco;Daniel Khashabi;Kevin Lin;Jiangming Liu;Nelson F. Liu;Phoebe Mulcaire;Qiang Ning;Sameer Singh;Noah A. Smith;Sanjay Subramanian;Eric Wallace;Ally Zhang;Ben Zhou
Corresponding authors:
Matt Gardner;Yoav Artzi;Jonathan Berant;Ben Bogin;Sihao Chen;Dheeru Dua;Yanai Elazar;Ananth Gottumukkala;Nitish Gupta;Hannaneh Hajishirzi;Gabriel Ilharco;Daniel Khashabi;Kevin Lin;Jiangming Liu;Nelson F. Liu;Phoebe Mulcaire;Qiang Ning;Sameer Singh;Noah A. Smith;Sanjay Subramanian;Eric Wallace;Ally Zhang;Ben Zhou

Are Red Roses Red? Evaluating Consistency of Question-Answering Models

DOI:
10.18653/v1/p19-1621
Published:
2019-07
Journal:
Impact factor:
0
Authors:
Marco Tulio Ribeiro;Carlos Guestrin;Sameer Singh
Corresponding authors:
Marco Tulio Ribeiro;Carlos Guestrin;Sameer Singh

Gradient-based Analysis of NLP Models is Manipulable

DOI:
10.18653/v1/2020.findings-emnlp.24
Published:
2020-10
Journal:
ArXiv
Impact factor:
0
Authors:
Junlin Wang;Jens Tuyls;Eric Wallace;Sameer Singh
Corresponding authors:
Junlin Wang;Jens Tuyls;Eric Wallace;Sameer Singh

Semantically Equivalent Adversarial Rules for Debugging NLP models

DOI:
10.18653/v1/p18-1079
Published:
2018-07
Journal:
Impact factor:
0
Authors:
Marco Tulio Ribeiro;Sameer Singh;Carlos Guestrin
Corresponding authors:
Marco Tulio Ribeiro;Sameer Singh;Carlos Guestrin

Universal Adversarial Triggers for Attacking and Analyzing NLP

DOI:
10.18653/v1/d19-1221
Published:
2019-08
Journal:
Impact factor:
0
Authors:
Eric Wallace;Shi Feng;Nikhil Kandpal;Matt Gardner;Sameer Singh
Corresponding authors:
Eric Wallace;Shi Feng;Nikhil Kandpal;Matt Gardner;Sameer Singh

Other publications by Sameer Singh

Multi-stage Classification for Audio Based Activity Recognition

DOI:
10.1007/11875581_100
Published:
2006
Journal:
Impact factor:
0
Authors:
José Lopes;Charles Lin;Sameer Singh
Corresponding author:
Sameer Singh

A survey of object recognition methods for automatic asset detection in high-definition video

DOI:
10.1109/ukricis.2010.5898117
Published:
2010
Journal:
2010 IEEE 9th International Conference on Cybernetic Intelligent Systems
Impact factor:
0
Authors:
Thomas Warsop;Sameer Singh
Corresponding author:
Sameer Singh

Skill Set Optimization: Reinforcing Language Model Behavior via Transferable Skills

DOI:
10.48550/arxiv.2402.03244
Published:
2024
Journal:
ArXiv
Impact factor:
0
Authors:
Kolby Nottingham;Bodhisattwa Prasad Majumder;Bhavana Dalvi;Sameer Singh;Peter Clark;Roy Fox
Corresponding author:
Roy Fox

ezCoref: A Scalable Approach for Collecting Crowdsourced Annotations for Coreference Resolution

DOI:
Published:
2022
Journal:
Impact factor:
0
Authors:
A. Crowdsourced;David Bamman;Olivia Lewke;Rachel Bawden;Rico Sennrich;Alexandra Birch;Ari Bornstein;Arie Cattan;Ido Dagan;Hong Chen;Zhenhua Fan;Hao Lu;Alan Yuille;Eduard Hovy;Mitch Marcus;M. Palmer;Lance;Rodney Huddleston. 2002;Frédéric Landragin;T. Poibeau;Bernard Vic;Belinda Z. Li;Gabriel Stanovsky;Robert L Logan;Andrew McCallum;Sameer Singh
Corresponding author:
Sameer Singh