Collaborative Research: CIF: Small: Sequential Decision Making Under Uncertainty With Submodular Rewards
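Basic information
- Award number: 2149588
- Principal investigator:
- Amount: $250,000
- Host institution:
- Host institution country: United States
- Award type: Standard Grant
- Fiscal year: 2022
- Funding country: United States
- Period: 2022-03-01 to 2025-02-28
- Project status: Ongoing
- Source:
- Keywords:
Project abstract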
Many companies, government agencies, and individuals make sequences of challenging decisions over time, for which they must choose from among many possible options, may have limited knowledge about the outcomes of their decisions, and will receive limited feedback. For example, search engines and content providers make decisions for what sets of websites, products, or media to recommend each time a user logs on to their system or submits a query, in some cases having limited knowledge of the users' underlying preferences. If users' privacy is protected, then only users' past actions, such as which links or media were selected by earlier users, will be available as feedback to inform the search engine or content provider on what to recommend next. This project aims to develop provably good strategies that decision makers can use in such settings, aiding their decision making under uncertainty and with limited feedback. This project will also develop strategies for the more challenging setting where multiple decision makers must coordinate with each other on such problems, but have limited communication available to do so. Furthermore, this project will support undergraduate and graduate research training, as well as graduate-level course development, in machine learning and artificial intelligence, preparing students for careers in advanced technical fields.
The goal of this project is to develop novel, provably good strategies for solving sequential decision problems (multi-armed bandit problems) when the actions available have a combinatorial structure (such as choosing subsets of products to recommend), the rewards have a diminishing returns property (submodularity), and there is no side-information available -- the only feedback comes from the reward itself. The proposed work builds on the rich literature of multi-armed bandits and of submodular optimization. The technical aims of the project are divided into two thrusts. The first thrust focuses on developing algorithms and identifying their regret bounds for combinatorial multi-armed bandit problems with submodular rewards and no additional feedback. The second thrust extends those strategies and regret analyses to a decentralized setting, where multiple agents coordinate to solve combinatorial multi-armed bandit problems, despite limited resources for communication.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Project outputs
Journal papers (10)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Combinatorial Stochastic-Greedy Bandit
- DOI:
- Published: 2024
- Journal:
- Impact factor: 0
- Authors: Fourati, Fares and
- Corresponding author: Fourati, Fares and
Unified Projection-Free Algorithms for Adversarial DR-Submodular Optimization
- DOI:
- Published: 2024
- Journal:
- Impact factor: 0
- Authors: Pedramfar, Mohammad and
- Corresponding author: Pedramfar, Mohammad and
Multi-Agent Multi-Armed Bandits with Limited Communication
- DOI:
- Published: 2022
- Journal:
- Impact factor: 6
- Authors: Mridul Agarwal, Vaneet Aggarwal
- Corresponding author: Mridul Agarwal, Vaneet Aggarwal
Randomized Greedy Learning for Non-monotone Stochastic Submodular Maximization Under Full-bandit Feedback
- DOI:
- Published: 2023
- Journal:
- Impact factor: 0
- Authors: Fourati, Fares;Aggarwal, Vaneet;Quinn, Christopher John;Alouini, Mohamed-Slim
- Corresponding author: Alouini, Mohamed-Slim
A Framework for Adapting Offline Algorithms to Solve Combinatorial Multi-Armed Bandit Problems with Bandit Feedback
- DOI: 10.48550/arxiv.2301.13326
- Published: 2023-01
- Journal:
- Impact factor: 0
- Authors: G. Nie;Yididiya Y. Nadew;Yanhui Zhu;V. Aggarwal;Christopher J. Quinn
- Corresponding author: G. Nie;Yididiya Y. Nadew;Yanhui Zhu;V. Aggarwal;Christopher J. Quinn
Other publications by Vaneet Aggarwal
An Intelligent Learning Approach to Achieve Near-Second Low-Latency Live Video Streaming under Highly Fluctuating Networks
- DOI: 10.1145/3581783.3612154
- Published: 2023
- Journal:
- Impact factor: 0
- Authors: Guanghui Zhang;Ke Liu;Mengbai Xiao;Bingshu Wang;Vaneet Aggarwal
- Corresponding author: Vaneet Aggarwal
Learning General Parameterized Policies for Infinite Horizon Average Reward Constrained MDPs via Primal-Dual Policy Gradient Algorithm
- DOI:
- Published: 2024
- Journal:
- Impact factor: 0
- Authors: Qinbo Bai;Washim Uddin Mondal;Vaneet Aggarwal
- Corresponding author: Vaneet Aggarwal
Boundary representation compatible feature recognition for manufacturing CAD models
- DOI:
- Published: 2023
- Journal:
- Impact factor: 3.9
- Authors: Xingyu Fu;Dheeraj Peddireddy;Fengfeng Zhou;Yuting Xi;Vaneet Aggarwal;Xingyu Li;Martin Byung
- Corresponding author: Martin Byung
Preemptive scheduling on unrelated machines with fractional precedence constraints
- DOI: 10.1016/j.jpdc.2021.07.010
- Published: 2021-11-01
- Journal:
- Impact factor:
- Authors: Vaneet Aggarwal;Tian Lan;Dheeraj Peddireddy
- Corresponding author: Dheeraj Peddireddy
Integrating reinforcement-learning-based vehicle dispatch algorithm into agent-based modeling of autonomous taxis
- DOI: 10.1007/s11116-023-10433-w
- Published: 2023
- Journal:
- Impact factor: 4.3
- Authors: Zequn Li;M. Lokhandwala;Abubakr O. Al;Vaneet Aggarwal;Hua Cai
- Corresponding author: Hua Cai
Other grants held by Vaneet Aggarwal
Conference: NSF WORKSHOP ON POST-QUANTUM AI
- Award number: 2326996
- Fiscal year: 2023
- Funding amount: $250,000
- Award type: Standard Grant
NeTS: Small: Collaborative Research: Rethinking Erasure Codes for Cloud Storage: A Quantitative Framework for Latency, Reliability, and Cost Optimization
- Award number: 1618335
- Fiscal year: 2016
- Funding amount: $250,000
- Award type: Continuing Grant
CIF: Small: Collaborative Research: Communications with Energy Harvesting Nodes
- Award number: 1527486
- Fiscal year: 2015
- Funding amount: $250,000
- Award type: Standard Grant
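Similar grants from the National Natural Science Foundation of China (NSFC)
Research on Quantum Field Theory without a Lagrangian Description
- Award number: 24ZR1403900
- Approval year: 2024
- Funding amount: CNY 0
- Award type: Provincial or municipal project
Cell Research
- Award number: 31224802
- Approval year: 2012
- Funding amount: CNY 240,000
- Award type: Special fund project
Cell Research
- Award number: 31024804
- Approval year: 2010
- Funding amount: CNY 240,000
- Award type: Special fund project
Cell Research (细胞研究)
- Award number: 30824808
- Approval year: 2008
- Funding amount: CNY 240,000
- Award type: Special fund project
Research on the Rapid Growth Mechanism of KDP Crystal
- Award number: 10774081
- Approval year: 2007
- Funding amount: CNY 450,000
- Award type: General Program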
Similar overseas grants
Collaborative Research: CIF: Medium: Snapshot Computational Imaging with Metaoptics
- Award number: 2403122
- Fiscal year: 2024
- Funding amount: $250,000
- Award type: Standard Grant
Collaborative Research: CIF-Medium: Privacy-preserving Machine Learning on Graphs
- Award number: 2402815
- Fiscal year: 2024
- Funding amount: $250,000
- Award type: Standard Grant
Collaborative Research: CIF: Small: Mathematical and Algorithmic Foundations of Multi-Task Learning
- Award number: 2343599
- Fiscal year: 2024
- Funding amount: $250,000
- Award type: Standard Grant
Collaborative Research: CIF: Small: Mathematical and Algorithmic Foundations of Multi-Task Learning
- Award number: 2343600
- Fiscal year: 2024
- Funding amount: $250,000
- Award type: Standard Grant
Collaborative Research: CIF-Medium: Privacy-preserving Machine Learning on Graphs
- Award number: 2402817
- Fiscal year: 2024
- Funding amount: $250,000
- Award type: Standard Grant
Collaborative Research: CIF-Medium: Privacy-preserving Machine Learning on Graphs
- Award number: 2402816
- Fiscal year: 2024
- Funding amount: $250,000
- Award type: Standard Grant
Collaborative Research: NSF-AoF: CIF: Small: AI-assisted Waveform and Beamforming Design for Integrated Sensing and Communication
- Award number: 2326622
- Fiscal year: 2024
- Funding amount: $250,000
- Award type: Standard Grant
Collaborative Research: CIF: Medium: Snapshot Computational Imaging with Metaoptics
- Award number: 2403123
- Fiscal year: 2024
- Funding amount: $250,000
- Award type: Standard Grant
Collaborative Research: NSF-AoF: CIF: Small: AI-assisted Waveform and Beamforming Design for Integrated Sensing and Communication
- Award number: 2326621
- Fiscal year: 2024
- Funding amount: $250,000
- Award type: Standard Grant
Collaborative Research: CIF: Small: Versatile Data Synchronization: Novel Codes and Algorithms for Practical Applications
- Award number: 2312872
- Fiscal year: 2023
- Funding amount: $250,000
- Award type: Standard Grant