CIF: Small: Online Learning and Optimal Experiment Design with a Budget

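Basic information

Award number:
2007036
Principal investigator:
Kevin Jamieson
Amount:
$500.2K
Host institution:
University of Washington
Host institution country:
United States
Award type:
Standard Grant
Fiscal year:
2020
Funding country:
United States
Period:
2020-10-01 to 2024-09-30
Status:
Completed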

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=2007036&HistoricalAwards=false
Keywords:
CIF Small Online Learning Optimal

Project abstract

Machine learning is routinely used in science and industry to make inferences about a phenomenon that cannot be observed directly, but can be probed through a series of experiments. For instance, the chief metric when optimizing a chemical reaction may be the yield of the desired output, but many experimental conditions such as pH and ambient temperature may affect the yield. Adaptive experimental design provides a framework to exploit observed measurements of the past to plan measurements in the future in a closed loop. It has been shown to require far fewer overall measurements to achieve the same inference goals compared to any fixed plan chosen in advance. However, a limitation is the implicit assumption that every possible measurement is available at all times. In practice this is rarely true: for example, chemical reagents can run out and restrict the possible experiments. This forces a tradeoff on practitioners: if only a subset of measurements is possible at the current time and you have a fixed budget of experiments, is it worth it to take one of the available experiments, or abstain in the hope of better opportunities in the future? The focus of this research is to formalize such questions and develop a framework for addressing online adaptive experimental design in the sequential setting of unpredictable measurement availability. The project also includes a plan to vertically integrate robust data collection techniques across the university touching all levels and disciplines, as well as outreach that starts with K-12 students and extends to the community at large.

This project amalgamates insights from adaptive experimental design, multi-armed bandits, and online algorithms. Current adaptive experimental design methods, for instance in stochastic optimization and best-arm identification, assume access to a fixed batch of experiments to choose from at each time, and explicitly plan to evolve the allocation of measurements over this batch using optimal design techniques such as G-optimal design. However, if the measurement set is changing at each time, potentially adversarially, such planning is extremely difficult. Motivated by progress in specific cases that leverage advances in convex optimization, the project seeks to provide a general framework for experimental design including optimization and multiple testing in online settings.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project outcomes

Journal articles (9)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Experimental Design for Regret Minimization in Linear Bandits

DOI:
Published:
2020-11
Journal:
Impact factor:
0
Authors:
Andrew Wagenmaker;Julian Katz-Samuels;Kevin G. Jamieson
Corresponding authors:
Andrew Wagenmaker;Julian Katz-Samuels;Kevin G. Jamieson

Best Arm Identification with Safety Constraints

DOI:
Published:
2021
Journal:
Proceedings of Machine Learning Research
Impact factor:
0
Authors:
Wang, Zhenlin;Wagenmaker, Andrew;Jamieson, Kevin
Corresponding author:
Jamieson, Kevin

Stochastic Contextual Bandits with Long Horizon Rewards

DOI:
Published:
2023
Journal:
Proceedings of the AAAI Conference on Artificial Intelligence
Impact factor:
0
Authors:
Qin, Yuzhen;Li, Yingcong;Pasqualetti, Fabio;Fazel, Maryam;Oymak, Samet
Corresponding author:
Oymak, Samet

Task-Optimal Exploration in Linear Dynamical Systems

DOI:
Published:
2021-02
Journal:
ArXiv
Impact factor:
0
Authors:
Andrew J. Wagenmaker;Max Simchowitz;Kevin G. Jamieson
Corresponding authors:
Andrew J. Wagenmaker;Max Simchowitz;Kevin G. Jamieson

Near-Optimal Randomized Exploration for Tabular Markov Decision Processes

DOI:
Published:
2021-02
Journal:
Impact factor:
0
Authors:
Zhihan Xiong;Ruoqi Shen;Qiwen Cui;Maryam Fazel;S. Du
Corresponding authors:
Zhihan Xiong;Ruoqi Shen;Qiwen Cui;Maryam Fazel;S. Du

Other publications by Kevin Jamieson

Fair Active Learning in Low-Data Regimes

DOI:
10.48550/arxiv.2312.08559
Published:
2023
Journal:
ArXiv
Impact factor:
0
Authors:
Romain Camilleri;Andrew J. Wagenmaker;Jamie Morgenstern;Lalit Jain;Kevin Jamieson
Corresponding author:
Kevin Jamieson

Query-Efficient Algorithms to Find the Unique Nash Equilibrium in a Two-Player Zero-Sum Matrix Game

DOI:
10.48550/arxiv.2310.16236
Published:
2023
Journal:
ArXiv
Impact factor:
0
Authors:
Arnab Maiti;Ross Boczar;Kevin Jamieson;Lillian J. Ratliff
Corresponding author:
Lillian J. Ratliff

Unbiased Identification of Broadly Appealing Content Using a Pure Exploration Infinitely-Armed Bandit Strategy

DOI:
Published:
2023
Journal:
ACM Transactions on Recommender Systems
Impact factor:
0
Authors:
Maryam Aziz;J. Anderton;Kevin Jamieson;Alice Wang;Hugues Bouchard;J. Aslam
Corresponding author:
J. Aslam

Optimal Exploration is no harder than Thompson Sampling

DOI:
Published:
2023
Journal:
International Conference on Artificial Intelligence and Statistics
Impact factor:
0
Authors:
Zhaoqi Li;Kevin Jamieson;Lalit Jain
Corresponding author:
Lalit Jain

Cost-Effective Proxy Reward Model Construction with On-Policy and Active Learning

DOI:
Published:
2024
Journal:
Impact factor:
0
Authors:
Yifang Chen;Shuohang Wang;Ziyi Yang;Hiteshi Sharma;Nikos Karampatziakis;Donghan Yu;Kevin Jamieson;Simon Shaolei Du;Yelong Shen
Corresponding author:
Yelong Shen