RI: Small: Human Validation in Batch Reinforcement Learning

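Basic information

Award number:
2007076
Principal investigator:
Finale Doshi-Velez
Amount:
$450,000
Host institution:
Harvard University
Host institution country:
United States
Award type:
Continuing Grant
Fiscal year:
2020
Funding country:
United States
Start and end dates:
2020-10-01 to 2024-09-30
Project status:
Completed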

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=2007076&HistoricalAwards=false
Keywords:
RI Small Human Validation Batch

Project abstract

There exist many settings in which trying out a new decision might be costly, but logs of past decisions and outcomes might help to inform this new decision. For example, health records track clinical decisions and outcomes; online courses may track different teaching and engagement strategies and final performance; factories may track different process choices and output quality. Information from past logs may prevent us from making the same mistakes and improve outcomes. However, learning from such logged data is not easy: not all possible decisions may have been tried, and not all relevant information will have been recorded. For example, a health record may accurately contain what lab tests a patient received but lack potentially relevant information about their home and work environment. These challenges make it hard for systems to reason about the effect of following a different decision-making strategy than current practice. Current approaches fall into two main types: statistical methods, which have strong theoretical foundations but require many assumptions; and those based on human expertise, which can be strong but also fallible. This work brings together the strengths of statistical and human-based approaches to validation to help identify promising decision-making strategies from logged data.

Specifically, the project focuses on integrating human and statistical inputs for two major tasks. The first is the task of converting the raw inputs (histories of measurements) into human-understandable representations, where statistical methods are used to propose representations that will be useful for defining or summarizing a policy and human input is used to ensure that the representation is intuitive, or at least understandable. The second is the task of estimating differences in outcomes if a different decision is made. Here, statistical methods are used to form the initial estimate as well as identify what data are most influential to that estimate, and human input is used to determine whether the estimate is reliable given that it relies particularly on those data. These two building blocks, which allow us to summarize the data and a policy, as well as estimate outcomes, are then used for both evaluating a given policy and proposing new policies.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
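
The second task described above (form a statistical estimate, then flag the logged data it depends on most for human review) can be illustrated with a small sketch. This is not the project's method: it assumes ordinary importance sampling as the estimator and a leave-one-out difference as the influence measure, and every name in it (`is_estimate`, `leave_one_out_influence`, the toy `behavior` and `target` policies, the toy `logs`) is hypothetical.

```python
from typing import Callable, List, Tuple

# One logged step: (state, action, reward). A trajectory is a list of steps.
Step = Tuple[int, int, float]
Trajectory = List[Step]
# A policy maps (state, action) to the probability of taking that action.
Policy = Callable[[int, int], float]


def weighted_return(traj: Trajectory, behavior: Policy, target: Policy) -> float:
    """Importance weight of one trajectory times its total reward."""
    weight = 1.0
    total_reward = 0.0
    for state, action, reward in traj:
        weight *= target(state, action) / behavior(state, action)
        total_reward += reward
    return weight * total_reward


def is_estimate(trajs: List[Trajectory], behavior: Policy, target: Policy) -> float:
    """Ordinary importance-sampling estimate of the target policy's value."""
    terms = [weighted_return(t, behavior, target) for t in trajs]
    return sum(terms) / len(terms)


def leave_one_out_influence(
    trajs: List[Trajectory], behavior: Policy, target: Policy
) -> List[float]:
    """How much the estimate changes when each trajectory is removed."""
    full = is_estimate(trajs, behavior, target)
    influences = []
    for i in range(len(trajs)):
        rest = trajs[:i] + trajs[i + 1:]
        influences.append(full - is_estimate(rest, behavior, target))
    return influences


# Toy policies (assumptions for illustration): one state, two actions.
def behavior(state: int, action: int) -> float:
    return 0.2 if action == 1 else 0.8  # logged practice rarely picks action 1


def target(state: int, action: int) -> float:
    return 0.9 if action == 1 else 0.1  # proposed policy usually picks action 1


# Toy log: four single-step trajectories with action 0, one with action 1.
logs: List[Trajectory] = [[(0, 0, 0.0)] for _ in range(4)] + [[(0, 1, 1.0)]]

estimate = is_estimate(logs, behavior, target)
influences = leave_one_out_influence(logs, behavior, target)
most_influential = max(range(len(logs)), key=lambda i: abs(influences[i]))
print(f"estimate={estimate:.3f}, most influential trajectory index={most_influential}")
```

In this toy log the whole estimate rests on the single trajectory in which the rare action was taken, which is the kind of record a human expert would be asked to inspect before trusting the number.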

Project outputs

Journal articles (0)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Other publications by Finale Doshi-Velez

How machine-learning recommendations influence clinician treatment selections: the example of antidepressant selection

DOI:
10.1038/s41398-021-01224-x
Publication date:
2021-02-04
Journal:
Translational Psychiatry
Impact factor:
6.200
Authors:
Maia Jacobs; Melanie F. Pradier; Thomas H. McCoy; Roy H. Perlis; Finale Doshi-Velez; Krzysztof Z. Gajos
Corresponding author:
Krzysztof Z. Gajos

Ethical and regulatory challenges of large language models in medicine

DOI:
10.1016/s2589-7500(24)00061-x
Publication date:
2024-06-01
Journal:
Lancet Digital Health
Impact factor:
24.100
Authors:
Jasmine Chiat Ling Ong; Shelley Yin-Hsi Chang; Wasswa William; Atul J Butte; Nigam H Shah; Lita Sui Tjien Chew; Nan Liu; Finale Doshi-Velez; Wei Lu; Julian Savulescu; Daniel Shu Wei Ting
Corresponding author:
Daniel Shu Wei Ting

Association between prescriber practices and major depression treatment outcomes

DOI:
10.1016/j.xjmad.2024.100080
Publication date:
2024-12-01
Journal:
Research article
Impact factor:
Authors:
Sarah Rathnam; Abhishek Sharma; Kamber L. Hart; Pilar F. Verhaak; Thomas H. McCoy; Roy H. Perlis; Finale Doshi-Velez
Corresponding author:
Finale Doshi-Velez