CRII:RI: Adaptive and Practical Algorithms for Personalization

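Basic information

Award number:
1755781
Principal investigator:
Haipeng Luo
Amount:
$175K
Host institution:
University of Southern California
Host institution country:
United States
Award type:
Standard Grant
Fiscal year:
2018
Funding country:
United States
Project period:
2018-05-01 to 2021-04-30
Project status:
Completed

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=1755781&HistoricalAwards=false
Keywords:
CRII RI Adaptive Practical Algorithms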

Project abstract

Intelligent personalization systems, such as those in news, advertising, search, online shopping, and clinical trials, are playing an increasingly important role in daily lives, bringing to us tremendous convenience as well as increasing the productivity of society. The main challenge in developing algorithmic solutions for these systems lies in the fact that only feedback for the recommended options, but not the other ones, is provided by the users. Many simple heuristics have been used in practice, and there are also some recent advances on more rigorous approaches based on the "contextual bandit" model, referring to an analogy with the objective to maximize the sum of rewards earned through a sequence of lever pulls where an encoding of past performance provides context. However, there is still great room for improvement in terms of both practicality and performance guarantees. This project seeks to develop more practical and adaptive contextual bandit algorithms for such systems. The success of this project requires developing new algorithmic techniques as well as mathematical tools from statistics, optimization, machine learning, and their combinations in an innovative way, which advances the theory and practice of the field of online decision making. Education is integrated into the project through curriculum development and student mentoring. Outreach activities include collaborations with other universities as well as with industry, and also organizing related workshops at top conferences.

Specifically, the project aims at designing a family of practical contextual bandit algorithms which not only enjoy some information-theoretic worst-case guarantees but can also achieve much better performance when the problem exhibits some kind of "easiness". First, the project systematically studies different kinds of "easiness" measurements and develops and analyzes specific algorithms for each of these measurements. Second, the project further considers the question of whether it is possible to have a single algorithm that is optimal against all problem instances, where optimality is in terms of the best performance among a reasonable class of algorithms. Finally, the project implements all the developed algorithms and conducts empirical evaluation on benchmark datasets, with the goal of releasing easy-to-use and publicly available software.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
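
The feedback structure described in the abstract, where the learner sees a reward only for the option it recommended, can be illustrated with a small simulation. The sketch below is a textbook epsilon-greedy baseline, not one of the algorithms developed under this award. The class name, the two-context setup, and the click probabilities are all hypothetical and chosen only for illustration.

```python
import random


class EpsilonGreedyContextualBandit:
    """Tabular epsilon-greedy learner: one running mean per (context, action)."""

    def __init__(self, n_contexts, n_actions, epsilon=0.1, seed=0):
        self.n_actions = n_actions
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = [[0] * n_actions for _ in range(n_contexts)]
        self.means = [[0.0] * n_actions for _ in range(n_contexts)]

    def choose(self, context):
        # Explore uniformly with probability epsilon, otherwise exploit.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_actions)
        row = self.means[context]
        return max(range(self.n_actions), key=row.__getitem__)

    def update(self, context, action, reward):
        # Bandit feedback: only the chosen action's estimate is updated.
        self.counts[context][action] += 1
        n = self.counts[context][action]
        self.means[context][action] += (reward - self.means[context][action]) / n


def simulate(rounds=5000, seed=0):
    # Hypothetical click probabilities: rows are contexts, columns are actions.
    click_prob = [[0.8, 0.2, 0.1],
                  [0.1, 0.3, 0.9]]
    env_rng = random.Random(seed + 1)
    learner = EpsilonGreedyContextualBandit(2, 3, epsilon=0.1, seed=seed)
    total = 0
    for _ in range(rounds):
        context = env_rng.randrange(2)      # e.g. which kind of user arrived
        action = learner.choose(context)    # the single recommended option
        reward = 1 if env_rng.random() < click_prob[context][action] else 0
        learner.update(context, action, reward)
        total += reward
    return learner, total / rounds


if __name__ == "__main__":
    learner, average_reward = simulate()
    print("average reward:", average_reward)
    print("estimated means:", learner.means)
```

Because rewards for the options that were not recommended are never observed, the learner has to spend some rounds exploring. A fixed exploration rate is the simplest way to do that, and it does not adapt to how easy the problem instance is.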

Project outcomes

Journal articles (17)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Improved Path-length Regret Bounds for Bandits

DOI:
Publication date:
2019-01
Journal:
ArXiv
Impact factor:
0
Authors:
Sébastien Bubeck; Yuanzhi Li; Haipeng Luo; Chen-Yu Wei
Corresponding authors:
Sébastien Bubeck; Yuanzhi Li; Haipeng Luo; Chen-Yu Wei

Model Selection for Contextual Bandits

DOI:
Publication date:
2019
Journal:
Advances in Neural Information Processing Systems
Impact factor:
0
Authors:
Foster, Dylan; Krishnamurthy, Akshay; Luo, Haipeng
Corresponding author:
Luo, Haipeng

Beating Stochastic and Adversarial Semi-bandits Optimally and Simultaneously

DOI:
Publication date:
2019-01
Journal:
ArXiv
Impact factor:
0
Authors:
Julian Zimmert; Haipeng Luo; Chen-Yu Wei
Corresponding authors:
Julian Zimmert; Haipeng Luo; Chen-Yu Wei

Adversarial Online Learning with Changing Action Sets: Efficient Algorithms with Approximate Regret Bounds

DOI:
Publication date:
2020
Journal:
ArXiv
Impact factor:
0
Authors:
E. Emamjomeh;Chen;Haipeng Luo;D. Kempe
Corresponding author:
D. Kempe

Open Problem: Model Selection for Contextual Bandits

DOI:
Publication date:
2020
Journal:
Annual Conference Computational Learning Theory
Impact factor:
0
Authors:
Dylan J. Foster; A. Krishnamurthy; Haipeng Luo
Corresponding author:
Haipeng Luo

Other publications by Haipeng Luo

Towards Minimax Online Learning with Unknown Time Horizon

DOI:
Publication date:
2013
Journal:
Impact factor:
0
Authors:
Haipeng Luo; R. Schapire
Corresponding author:
R. Schapire

Clairvoyant Regret Minimization: Equivalence with Nemirovski's Conceptual Prox Method and Extension to General Convex Games

DOI:
10.48550/arxiv.2208.14891
Publication date:
2022
Journal:
ArXiv
Impact factor:
0
Authors:
Gabriele Farina;Christian Kroer;Chung;Haipeng Luo
Corresponding author:
Haipeng Luo

Efficient electro-optical tuning of an optical frequency microcomb on a monolithically integrated high-Q lithium niobate microdisk

DOI:
https://doi.org/10.1364/OL.44.005953
Publication date:
2019
Journal:
Optics Letters
Impact factor:
Authors:
Zhiwei Fang; Haipeng Luo; Jintian Lin; Min Wang; Jianhao Zhang; Rongbo Wu; Junxia Zhou; Wei Chu; Tao Lu; Ya Cheng
Corresponding author:
Ya Cheng