CAREER: Non-asymptotic, Instance-optimal Closed-loop Learning
Basic Information
- Award number: 2141511
- Principal investigator:
- Amount: $506,900
- Host institution:
- Host institution country: United States
- Category: Continuing Grant
- Fiscal year: 2022
- Funding country: United States
- Project period: 2022-06-15 to 2027-05-31
- Project status: Ongoing
- Source:
- Keywords:
Project Abstract
Machine Learning and Artificial Intelligence can recognize and exploit hidden patterns in data in order to predict future outcomes in applications ranging from content recommendation to personalized medicine. However, there are many problem areas where collecting the data is time-consuming (e.g., cells need to grow in lab environments) or expensive (e.g., special materials or expert opinions are required). Ideally, in order to reduce the amount of data needed to reach conclusions, already-collected data can be leveraged to guide the selection of future measurements in a closed-loop manner. While the behavior and benefits of some closed-loop data collection strategies are well understood in simple settings, this family of strategies is not commonly employed in real-world scientific laboratories or in medical trials due to a lack of predictability and accuracy of the outcomes. This project seeks to make foundational contributions to the understanding of closed-loop learning strategies with a view towards designing new data-collection strategies that are both effective and reliable. In practice, this may lead to requiring fewer patients in a clinical trial or to halving the time to identify a disease-curing drug. The investigator also plans to engage with high-school students and machine-learning enthusiasts alike to increase their level of awareness around data collection -- for instance, how even a simple survey, if not carefully designed, can result in privacy violations, demographic under-representation, and bias of many forms, all of which may lead to inaccurate conclusions.
For many problems of interest in closed-loop learning, prior art has focused only on minimax optimality, where the sample complexity of the worst-case problem instance is minimized. This approach leads to algorithms that are significantly inferior on "easy" or benign instances that may occur in nature but which are far from adversarial. In contrast, this project will study the fundamental limits of instance-optimal sample complexity for problems of interactive learning and reinforcement learning in the Probably Approximately Correct (PAC) setting. The insights to be gained will be applied towards the design of algorithms that automatically adapt to the intrinsic difficulty of the particular problem instance being faced, be it benign or not. The proposed approach is motivated by the observation that the instance-optimal sample complexity decomposes into an asymptotic term, which is by now well characterized, and a moderate-confidence term, which is known to dominate the asymptotic term for all practical purposes. As the properties of the latter term are still poorly understood, lower bounds for it will be constructed together with algorithms that nearly achieve them. Such results will lead to algorithms that greatly reduce the overall instance-optimal sample complexity and vastly improve upon state-of-the-art algorithms that tend to cater to worst-case scenarios. The efforts will initially focus on structured linear bandits and reinforcement learning in the tabular and linear-function-approximation settings. While these paradigms are of wide applicability in practice, they also have enough complexity to allow insights to be extrapolated to more generic closed-loop learning paradigms.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
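As a rough illustration of the decomposition referenced in the abstract -- a schematic sketch following the standard delta-PAC pure-exploration literature, not a formula quoted from the award text -- the expected number of samples a $\delta$-correct algorithm needs on a problem instance $\theta$ is often written as
\[
  % Assumed schematic form; symbols below are illustrative, not taken from the award.
  \mathbb{E}_{\theta}[\tau_{\delta}]
  \;\approx\;
  \underbrace{c^{*}(\theta)\,\log(1/\delta)}_{\text{asymptotic term}}
  \;+\;
  \underbrace{C(\theta)}_{\text{moderate-confidence term}},
\]
where $\tau_{\delta}$ is the algorithm's stopping time, $c^{*}(\theta)$ is the instance-dependent constant characterized by asymptotic lower bounds (it dominates as $\delta \to 0$), and $C(\theta)$ is a $\delta$-independent overhead that can dominate at the moderate confidence levels used in practice (e.g., $\delta = 0.05$). The project's stated aim is to construct lower bounds for this second term together with algorithms that nearly achieve them.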
Project Outcomes
Journal articles (3)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Beyond No Regret: Instance-Dependent PAC Reinforcement Learning
- DOI:
- Publication date: 2021-08
- Journal:
- Impact factor: 0
- Authors: Andrew J. Wagenmaker; Max Simchowitz; Kevin G. Jamieson
- Corresponding authors: Andrew J. Wagenmaker; Max Simchowitz; Kevin G. Jamieson
Instance-Dependent Near-Optimal Policy Identification in Linear MDPs via Online Experiment Design
- DOI: 10.48550/arxiv.2207.02575
- Publication date: 2022-07
- Journal:
- Impact factor: 0
- Authors: Andrew J. Wagenmaker; Kevin G. Jamieson
- Corresponding authors: Andrew J. Wagenmaker; Kevin G. Jamieson
Instance-optimal PAC Algorithms for Contextual Bandits
- DOI:
- Publication date: 2022
- Journal:
- Impact factor: 0
- Authors: Li, Zhaoqi; Ratliff, Lillian; Nassif, Houssam; Jamieson, Kevin; Jain, Lalit
- Corresponding author: Jain, Lalit
Other Publications by Kevin Jamieson
Fair Active Learning in Low-Data Regimes
- DOI: 10.48550/arxiv.2312.08559
- Publication date: 2023
- Journal:
- Impact factor: 0
- Authors: Romain Camilleri; Andrew J. Wagenmaker; Jamie Morgenstern; Lalit Jain; Kevin Jamieson
- Corresponding author: Kevin Jamieson
Query-Efficient Algorithms to Find the Unique Nash Equilibrium in a Two-Player Zero-Sum Matrix Game
- DOI: 10.48550/arxiv.2310.16236
- Publication date: 2023
- Journal:
- Impact factor: 0
- Authors: Arnab Maiti; Ross Boczar; Kevin Jamieson; Lillian J. Ratliff
- Corresponding author: Lillian J. Ratliff
Unbiased Identification of Broadly Appealing Content Using a Pure Exploration Infinitely-Armed Bandit Strategy
- DOI:
- Publication date: 2023
- Journal:
- Impact factor: 0
- Authors: Maryam Aziz; J. Anderton; Kevin Jamieson; Alice Wang; Hugues Bouchard; J. Aslam
- Corresponding author: J. Aslam
Optimal Exploration is no harder than Thompson Sampling
- DOI:
- Publication date: 2023
- Journal:
- Impact factor: 0
- Authors: Zhaoqi Li; Kevin Jamieson; Lalit Jain
- Corresponding author: Lalit Jain
Cost-Effective Proxy Reward Model Construction with On-Policy and Active Learning
- DOI:
- Publication date: 2024
- Journal:
- Impact factor: 0
- Authors: Yifang Chen; Shuohang Wang; Ziyi Yang; Hiteshi Sharma; Nikos Karampatziakis; Donghan Yu; Kevin Jamieson; Simon Shaolei Du; Yelong Shen
- Corresponding author: Yelong Shen
Other Grants by Kevin Jamieson
CIF: Small: Online Learning and Optimal Experiment Design with a Budget
- Award number: 2007036
- Fiscal year: 2020
- Amount: $506,900
- Category: Standard Grant
Similar NSFC Grants (National Natural Science Foundation of China)
Theory and Methods of Non-Along-Track Scene-Matching Curvilinear Imaging for Spaceborne SAR
- Award number: 62331007
- Award year: 2023
- Amount: CNY 2.37 million
- Category: Key Program
Nonreciprocal Heat Transfer in Linear Thermal Metamaterials
- Award number: 12302171
- Award year: 2023
- Amount: CNY 300,000
- Category: Young Scientists Fund
Function and Mechanism of the Long Non-coding RNA lnRPT in Regulating Myocardial Fibrosis via YB1/eEF1
- Award number: 82370274
- Award year: 2023
- Amount: CNY 490,000
- Category: General Program
Mechanism by Which the Circadian Clock Gene Nr1d1 Suppresses Progression of Non-alcoholic Steatohepatitis by Regulating the NLRP3 Pyroptosis Pathway
- Award number: 82300652
- Award year: 2023
- Amount: CNY 300,000
- Category: Young Scientists Fund
Flow-Boiling Heat Transfer Mechanisms and Model Prediction for Zeotropic Mixtures in Microchannels
- Award number: 52376149
- Award year: 2023
- Amount: CNY 500,000
- Category: General Program
Similar Overseas Grants
CAREER: Non-Asymptotic Random Matrix Theory and Connections
- Award number: 2237646
- Fiscal year: 2023
- Amount: $506,900
- Category: Continuing Grant
Asymptotic analysis and behavior of free boundary for nonlinear parabolic problems
- Award number: 22K03387
- Fiscal year: 2022
- Amount: $506,900
- Category: Grant-in-Aid for Scientific Research (C)
Non-asymptotic inference for high and infinite dimensional data
- Award number: RGPIN-2018-05678
- Fiscal year: 2022
- Amount: $506,900
- Category: Discovery Grants Program - Individual
Asymptotic analysis on PDEs appearing in mean field games, crystal growth and anomalous diffusion
- Award number: 22K03382
- Fiscal year: 2022
- Amount: $506,900
- Category: Grant-in-Aid for Scientific Research (C)
Asymptotic analysis for partial differential equations of nonlinear waves with dissipation and dispersion
- Award number: 22K13939
- Fiscal year: 2022
- Amount: $506,900
- Category: Grant-in-Aid for Early-Career Scientists