III: Medium: Collaborative Research: Counterfactual Learning and Evaluation for Interactive Information Systems

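Basic information

Award number:
1901168
Principal investigator:
Thorsten Joachims
Amount:
$0.98 million
Host institution:
Cornell University
Host institution country:
United States
Award type:
Continuing Grant
Fiscal year:
2019
Funding country:
United States
Period:
2019-08-15 to 2024-07-31
Status:
Completed

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=1901168&HistoricalAwards=false
Keywords:
III Medium Collaborative Research Counterfactual

Project abstract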

Many information systems engage with their users through the following loop of interactions: the system receives a context as input (e.g. query, user profile), responds with a context-dependent action (e.g. ranking, recommendation, ad), and then receives some explicit or implicit feedback on the quality of the action (e.g. star rating, following a search result, clicking on an ad). While ubiquitous and plentiful, log data from this interaction loop does not fit the standard mold of supervised learning, since the feedback is both biased and partial -- the system determines through its actions where it gets feedback, and even for the chosen actions it typically doesn't observe all feedback (e.g. missing clicks on relevant results in ranking). This project will address the question of how this logged data can nevertheless be used for evaluating and learning new systems. The potential upsides of reusing the existing log data are evident. For evaluation, the use of historic log data enables engineers to rapidly evaluate many new systems offline (e.g. new ranking functions, recommendation policies), without the weeks of delay and the potential negative impact on user experience implied by online A/B testing. For learning, it similarly enables offline reuse of existing data instead of slowly collecting new data through an online learning algorithm. This can greatly speed up the machine-learning development cycle, since model selection, feature selection, and eventual quality control can happen offline before any learned policy gets deployed to the users. Reusing existing log data is particularly important for small-scale information systems (e.g. scholarly search), where it is often the only type of potential training data that is readily available in sufficient quantity.

The intellectual merit of the project will lie in the development of principled machine learning methods that enable information systems to reliably learn from logs of the partial and biased feedback they produce. The theoretical basis for the research lies in deep connections to counterfactual and causal inference, exploiting the analogy between logs and controlled experiments with actions as treatments and the current system as the assignment mechanism. The research builds upon recent advances in counterfactual estimators, answering the question of how a new system would have performed, if it had been used instead of the system that logged the data. The project will develop new counterfactual estimators specifically designed for the action spaces typically encountered in information systems (e.g. rankings), new propensity models, and new counterfactual policy learning algorithms that incorporate both. Finally, to validate the real-world effectiveness of the research, the project will build the Localify system, which provides local music-event recommendations and personalized playlists.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
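
The counterfactual question raised in the abstract (how would a new system have performed if it had been used instead of the system that logged the data?) can be illustrated with inverse propensity scoring (IPS), a standard off-policy estimator. The sketch below is not taken from the project itself. The names `ips_estimate`, `LogEntry`, `toy_logs`, and `always_a` are hypothetical, and the code assumes that the logging system recorded the probability (propensity) with which it chose each action and that this probability is positive for every action the new policy might take.

```python
from typing import Callable, Sequence, Tuple

# One logged interaction (hypothetical layout):
# (context, action chosen by the logging system, observed reward, logging propensity)
LogEntry = Tuple[str, str, float, float]


def ips_estimate(
    logs: Sequence[LogEntry],
    target_prob: Callable[[str, str], float],
) -> float:
    """Estimate the average reward a new (target) policy would have obtained,
    using only data logged by a different policy.

    target_prob(context, action) returns the probability that the new policy
    picks `action` in `context`.
    """
    if not logs:
        raise ValueError("need at least one logged interaction")
    total = 0.0
    for context, action, reward, propensity in logs:
        if propensity <= 0.0:
            raise ValueError("logged propensity must be positive")
        # Reweight each observed reward by how much more (or less) likely
        # the new policy is to take the logged action.
        weight = target_prob(context, action) / propensity
        total += weight * reward
    return total / len(logs)


# Toy log: the logging system picked action "a" or "b" uniformly at random.
toy_logs = [
    ("q1", "a", 1.0, 0.5),
    ("q1", "b", 0.0, 0.5),
    ("q2", "a", 0.0, 0.5),
    ("q2", "b", 1.0, 0.5),
]


def always_a(context: str, action: str) -> float:
    """Hypothetical new policy that always chooses action "a"."""
    return 1.0 if action == "a" else 0.0


estimate = ips_estimate(toy_logs, always_a)
print(estimate)
```

On this toy log the estimate is 0.5: a policy that always picks "a" would earn reward 1 on `q1` and 0 on `q2`. The reweighting corrects for the bias the abstract describes, where the logging system's own choices determine which feedback is observed. Plain IPS can have high variance when propensities are small, which is one motivation for the specialized estimators the project proposes.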

Project outcomes

Journal articles (17)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Policy Learning for Fairness in Ranking

DOI:
Published:
2019-02
Journal:
Impact factor:
0
Authors:
Ashudeep Singh;T. Joachims
Corresponding author:
Ashudeep Singh;T. Joachims

Bandits with Costly Reward Observations

DOI:
Published:
2023
Journal:
Conference on Uncertainty in Artificial Intelligence (UAI)
Impact factor:
0
Authors:
Tucker, Aaron;Biddulph, Caleb;Wang, Claire;Joachims, Thorsten
Corresponding author:
Joachims, Thorsten

Variance-Minimizing Augmentation Logging for Counterfactual Evaluation in Contextual Bandits

DOI:
10.1145/3539597.3570452
Published:
2023-02
Journal:
Proceedings of the Sixteenth ACM International Conference on Web Search and Data Mining
Impact factor:
0
Authors:
Aaron David Tucker;T. Joachims
Corresponding author:
Aaron David Tucker;T. Joachims

Off-Policy Evaluation for Large Action Spaces via Conjunct Effect Modeling

DOI:
10.48550/arxiv.2305.08062
Published:
2023-05
Journal:
ArXiv
Impact factor:
0
Authors:
Yuta Saito;Qingyang Ren;T. Joachims
Corresponding author:
Yuta Saito;Qingyang Ren;T. Joachims

Controlling Fairness and Bias in Dynamic Learning-to-Rank

DOI:
10.1145/3397271.3401100
Published:
2020-05
Journal:
Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval
Impact factor:
0
Authors:
Marco Morik;Ashudeep Singh;Jessica Hong;T. Joachims
Corresponding author:
Marco Morik;Ashudeep Singh;Jessica Hong;T. Joachims

Other publications by Thorsten Joachims

Prompt Optimization with Logged Bandit Data

DOI:
Published:
Journal:
Impact factor:
0
Authors:
Haruka Kiyohara;Yuta Saito;Daniel Yiming Cao;Thorsten Joachims
Corresponding author:
Thorsten Joachims

Fair Ranking under Disparate Uncertainty

DOI:
Published:
2023
Journal:
arXiv.org
Impact factor:
0
Authors:
Richa Rastogi;Thorsten Joachims
Corresponding author:
Thorsten Joachims

Localify.org: Locally-focus Music Artist and Event Recommendation

DOI:
Published:
2023
Journal:
ACM Conference on Recommender Systems
Impact factor:
0
Authors:
Douglas Turnbull;April Trainor;Elizabeth Richards;Kieran Bentley;Victoria Conrad;Paul Gagliano;Cassandra Raineault;Thorsten Joachims
Corresponding author:
Thorsten Joachims

Rankings for Two-Sided Market Platforms

DOI:
Published:
2020
Journal:
Impact factor:
0
Authors:
Yi;Thorsten Joachims
Corresponding author:
Thorsten Joachims

Analysis of nutrition data by means of a matrix factorization method

DOI:
10.1007/s13748-015-0062-0
Published:
2015-11-06
Journal:
Progress in Artificial Intelligence
Impact factor:
2.400
Authors:
Jorge Díez;Edna Gamboa;Teresita González de Cossío;Oscar Luaces;Thorsten Joachims;Antonio Bahamonde
Corresponding author:
Antonio Bahamonde