CAREER: Non-asymptotic, Instance-optimal Closed-loop Learning
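Basic information
- Award number: 2141511
- Principal investigator:
- Amount: $506,900
- Host institution:
- Host institution country: United States
- Project category: Continuing Grant
- Fiscal year: 2022
- Funding country: United States
- Duration: 2022-06-15 to 2027-05-31
- Project status: Active (not yet concluded)
- Source:
- Keywords:
Project abstract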
Machine Learning and Artificial Intelligence can recognize and exploit hidden patterns in data in order to predict future outcomes in applications ranging from content recommendation to personalized medicine. However, there are many problem areas where collecting the data is time-consuming (e.g., cells need to grow in lab environments) or expensive (e.g., special materials or expert opinions are required). Ideally, in order to reduce the amount of data needed to reach conclusions, already-collected data can be leveraged to guide the selection of future measurements in a closed-loop manner. While the behavior and benefits of some closed-loop data collection strategies are well understood in simple settings, this family of strategies is not commonly employed in real-world scientific laboratories or in medical trials due to a lack of predictability and accuracy of the outcomes. This project seeks to make foundational contributions to the understanding of closed-loop learning strategies with a view towards designing new data-collection strategies that are both effective and reliable. In practice, this may lead to requiring fewer patients in a clinical trial or to halving the time to identify a disease-curing drug. The investigator also plans to engage with high-school students and machine-learning enthusiasts alike to increase their level of awareness around data collection -- for instance, how even a simple survey, if not carefully designed, can result in privacy violations, demographic under-representation and bias of many forms, all of which may lead to inaccurate conclusions.
For many problems of interest in closed-loop learning, prior art has focused only on minimax optimality, where the sample complexity of the worst-case problem instance is minimized. This approach leads to algorithms that are significantly inferior on "easy" or benign instances that may occur in nature but which are far from adversarial. In contrast, this project will study the fundamental limits of instance-optimal sample complexity for problems of interactive learning and reinforcement learning in the Probably Approximately Correct (PAC) setting. The insights to be gained will be applied towards the design of algorithms that automatically adapt to the intrinsic difficulty of the particular problem instance being faced, be it benign or not. The proposed approach is motivated by the observation that the instance-optimal sample complexity decomposes into an asymptotic term, which is by now well characterized, and a moderate-confidence term, which is known to dominate the asymptotic term for all practical purposes. As the properties of the latter term are still poorly understood, lower bounds for it will be constructed together with algorithms that nearly achieve them. Such results will lead to algorithms that greatly reduce the overall instance-optimal sample complexity and vastly improve upon state-of-the-art algorithms that tend to cater to worst-case scenarios. The efforts will initially focus on structured linear bandits and reinforcement learning in the tabular and linear-function approximation settings. While these paradigms are of wide applicability to practice, they also have enough complexity to allow insights to be extrapolated to more generic closed-loop learning paradigms.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
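The decomposition mentioned in the abstract can be written schematically as below. This is only an illustrative sketch: the symbols \tau_\delta(\theta), C_{\mathrm{asym}}(\theta) and C_{\mathrm{mod}}(\theta) are assumptions introduced here and do not appear in the award text, and the exact form of each term depends on the problem being studied.

```latex
% Schematic only; all symbols are illustrative assumptions, not taken from the award text.
% \tau_\delta(\theta): number of samples used on instance \theta to be correct with probability 1-\delta
% C_{\mathrm{asym}}(\theta): instance-dependent constant multiplying \log(1/\delta)
% C_{\mathrm{mod}}(\theta): moderate-confidence term, assumed here not to grow with \log(1/\delta)
\[
  \tau_\delta(\theta) \;\approx\;
  \underbrace{C_{\mathrm{asym}}(\theta)\,\log(1/\delta)}_{\text{asymptotic term}}
  \;+\;
  \underbrace{C_{\mathrm{mod}}(\theta)}_{\text{moderate-confidence term}}
\]
```

Under this reading, the first term takes over only as \delta tends to 0. At a typical confidence level such as \delta = 0.05, \log(1/\delta) is about 3, so a large C_{\mathrm{mod}}(\theta) can account for most of the samples, which is consistent with the abstract's remark that the moderate-confidence term dominates in practice.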
Project outcomes
Journal articles (3)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Beyond No Regret: Instance-Dependent PAC Reinforcement Learning
- DOI:
- Publication date: 2021-08
- Journal:
- Impact factor: 0
- Authors: Andrew J. Wagenmaker; Max Simchowitz; Kevin G. Jamieson
- Corresponding authors: Andrew J. Wagenmaker; Max Simchowitz; Kevin G. Jamieson
Instance-optimal PAC Algorithms for Contextual Bandits
- DOI:
- Publication date: 2022
- Journal:
- Impact factor: 0
- Authors: Li, Zhaoqi; Ratliff, Lillian; Nassif, Houssam; Jamieson, Kevin; Jain, Lalit
- Corresponding author: Jain, Lalit
Instance-Dependent Near-Optimal Policy Identification in Linear MDPs via Online Experiment Design
- DOI: 10.48550/arxiv.2207.02575
- Publication date: 2022-07
- Journal:
- Impact factor: 0
- Authors: Andrew J. Wagenmaker; Kevin G. Jamieson
- Corresponding authors: Andrew J. Wagenmaker; Kevin G. Jamieson
Other publications by Kevin Jamieson
Fair Active Learning in Low-Data Regimes
- DOI: 10.48550/arxiv.2312.08559
- Publication date: 2023
- Journal:
- Impact factor: 0
- Authors: Romain Camilleri; Andrew J. Wagenmaker; Jamie Morgenstern; Lalit Jain; Kevin Jamieson
- Corresponding author: Kevin Jamieson
Query-Efficient Algorithms to Find the Unique Nash Equilibrium in a Two-Player Zero-Sum Matrix Game
- DOI: 10.48550/arxiv.2310.16236
- Publication date: 2023
- Journal:
- Impact factor: 0
- Authors: Arnab Maiti; Ross Boczar; Kevin Jamieson; Lillian J. Ratliff
- Corresponding author: Lillian J. Ratliff
Unbiased Identification of Broadly Appealing Content Using a Pure Exploration Infinitely-Armed Bandit Strategy
- DOI:
- Publication date: 2023
- Journal:
- Impact factor: 0
- Authors: Maryam Aziz; J. Anderton; Kevin Jamieson; Alice Wang; Hugues Bouchard; J. Aslam
- Corresponding author: J. Aslam
Optimal Exploration is no harder than Thompson Sampling
- DOI:
- Publication date: 2023
- Journal:
- Impact factor: 0
- Authors: Zhaoqi Li; Kevin Jamieson; Lalit Jain
- Corresponding author: Lalit Jain
Cost-Effective Proxy Reward Model Construction with On-Policy and Active Learning
- DOI:
- Publication date: 2024
- Journal:
- Impact factor: 0
- Authors: Yifang Chen; Shuohang Wang; Ziyi Yang; Hiteshi Sharma; Nikos Karampatziakis; Donghan Yu; Kevin Jamieson; Simon Shaolei Du; Yelong Shen
- Corresponding author: Yelong Shen
Other grants by Kevin Jamieson
CIF: Small: Online Learning and Optimal Experiment Design with a Budget
- Award number: 2007036
- Fiscal year: 2020
- Funding amount: $506,900
- Project category: Standard Grant
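Similar grants from the National Natural Science Foundation of China
Molecular mechanism by which non-CG DNA methylation balances soybean yield and SMV resistance
- Award number: 32301796
- Approval year: 2023
- Funding amount: CNY 300,000
- Project category: Young Scientists Fund
Mechanism by which the long non-coding RNA activated by TGF-β (lncRNA-ATB) affects diabetic wound healing through fibroblasts
- Award number: LQ23H150003
- Approval year: 2023
- Funding amount: CNY 0 (as listed)
- Project category: Provincial or municipal project
Exploratory study of how chromosomal instability regulates the non-shedding state of lung cancer and its biological significance
- Award number: 82303936
- Approval year: 2023
- Funding amount: CNY 300,000
- Project category: Young Scientists Fund
Applications of variational methods to doubly critical Hénon equations and obstacle systems
- Award number: 12301258
- Approval year: 2023
- Funding amount: CNY 300,000
- Project category: Young Scientists Fund
BTK inhibitors down-regulate IL-17 secretion to enhance the sensitivity of non-GCB diffuse large B-cell lymphoma to CD20mb
- Award number: n/a
- Approval year: 2022
- Funding amount: CNY 100,000
- Project category: Provincial or municipal project
Molecular mechanism by which the non-TAL effector NUDX4 regulates the pathogenicity of the rice bacterial blight pathogen through Nudix hydrolase activity
- Award number:
- Approval year: 2022
- Funding amount: CNY 300,000
- Project category: Young Scientists Fund
Identification of a novel non-Gal antigen, CYP3A29, and its role in humoral rejection in pig-to-rhesus-macaque kidney xenotransplantation
- Award number:
- Approval year: 2022
- Funding amount: CNY 330,000
- Project category: Regional Science Fund
Function and molecular mechanism of the non-canonical BAF (ncBAF) complex in mouse embryonic stem cells
- Award number: 32170797
- Approval year: 2021
- Funding amount: CNY 580,000
- Project category: General Program
Modeling and efficient algorithms for two-phase natural convection under non-Oberbeck-Boussinesq effects
- Award number:
- Approval year: 2021
- Funding amount: CNY 300,000
- Project category: Young Scientists Fund
Molecular mechanisms of non-CG methylation regulation during plant endosperm development
- Award number: LQ21C060001
- Approval year: 2020
- Funding amount: CNY 0 (as listed)
- Project category: Provincial or municipal project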
Similar overseas grants
CAREER: Non-Asymptotic Random Matrix Theory and Connections
- Award number: 2237646
- Fiscal year: 2023
- Funding amount: $506,900
- Project category: Continuing Grant
Non-asymptotic inference for high and infinite dimensional data
- Award number: RGPIN-2018-05678
- Fiscal year: 2022
- Funding amount: $506,900
- Project category: Discovery Grants Program - Individual
Non-Asymptotic Random Matrix Theory and Random Graphs
- Award number: 2054408
- Fiscal year: 2021
- Funding amount: $506,900
- Project category: Standard Grant
Non-asymptotic inference for high and infinite dimensional data
- Award number: RGPIN-2018-05678
- Fiscal year: 2021
- Funding amount: $506,900
- Project category: Discovery Grants Program - Individual
A Non-Asymptotic Analysis of Stochastic Mirror Descent for Non-Convex Learning
- Award number: 2444063
- Fiscal year: 2020
- Funding amount: $506,900
- Project category: Studentship
Non-asymptotic inference for high and infinite dimensional data
- Award number: RGPIN-2018-05678
- Fiscal year: 2020
- Funding amount: $506,900
- Project category: Discovery Grants Program - Individual
Non-asymptotic inference for high and infinite dimensional data
- Award number: RGPIN-2018-05678
- Fiscal year: 2019
- Funding amount: $506,900
- Project category: Discovery Grants Program - Individual
Non-Asymptotic Statistical Similarity Measures
- Award number: 424522268
- Fiscal year: 2019
- Funding amount: $506,900
- Project category: Research Fellowships
Non-asymptotic inference for high and infinite dimensional data
- Award number: RGPIN-2018-05678
- Fiscal year: 2018
- Funding amount: $506,900
- Project category: Discovery Grants Program - Individual
A Non-Asymptotic Theory of Robustness
- Award number: 1811376
- Fiscal year: 2018
- Funding amount: $506,900
- Project category: Continuing Grant