CIF: Small: Compression Schemes for Communication Constrained Bandit and Reinforcement Learning

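Basic Information

  • Award number:
    2221871
  • Principal investigator:
  • Amount:
    $600,000
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Standard Grant
  • Fiscal year:
    2022
  • Funding country:
    United States
  • Project period:
    2022-10-01 to 2025-09-30
  • Project status:
    Ongoing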

Project Abstract

Active learning and online learning are machine-learning paradigms in which computers learn to make complex decisions while receiving feedback from an environment. For instance, a drone may learn to fly by itself, or a car may learn to drive by trial and error. Recently, these learning paradigms have been widely applied and have achieved human-level performance in tasks such as gameplay and robot control. As computing devices become smaller and consume less power, new distributed learning frameworks are starting to emerge. These frameworks contain low-capability learning agents (such as cell phones, unmanned vehicles, or drones) that are far apart but learn collectively by communicating with each other through (wireless) networks. However, existing communication approaches would become bottlenecks for learning, since they were designed for high-power computers and consume too much power and network bandwidth. This project aims to address this issue by providing novel techniques that efficiently compress the data to be communicated while preserving the learning ability. The techniques developed in this project will advance the state of the art in distributed online/active learning by improving communication efficiency.

The overarching goal of this project is to establish efficient compression schemes that support effective active/online learning, such as bandit and reinforcement learning, over communication-constrained networks. In these learning environments, a learner aims to make a good decision for the next steps based on experience; this project will explore fundamental bounds and efficient algorithms that support this goal while minimizing the number of bits communicated, by compressing in a way that retains only the information necessary for decision making. In other words, this project aims to explore the fundamental trade-off between compression and learnability in active/online environments. Building on promising preliminary work, the investigators will study problems ranging from the most basic multi-armed bandit setting to more complex reinforcement learning settings, and will consider both centralized and decentralized network topologies. More specifically, the investigators propose compression schemes and fundamental theoretical bounds for (1) rewards in multi-armed bandit problems, (2) context vectors for contextual bandit problems, and (3) state-action features and models for Markov decision problems.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project Outcomes

Journal articles (10)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Near-Optimal Sample Complexity Bounds for Constrained MDPs
Provably Feedback-Efficient Reinforcement Learning via Active Reward Learning
  • DOI:
    10.48550/arxiv.2304.08944
  • Publication date:
    2023-04
  • Journal:
  • Impact factor:
    0
  • Authors:
    Dingwen Kong; Lin F. Yang
  • Corresponding authors:
    Dingwen Kong; Lin F. Yang
Provably Efficient Lifelong Reinforcement Learning with Linear Representation
  • DOI:
  • Publication date:
    2023
  • Journal:
  • Impact factor:
    0
  • Authors:
    Amani, Sanae; Yang, Lin; Cheng, Ching-An
  • Corresponding author:
    Cheng, Ching-An
Low-Switching Policy Gradient with Exploration via Online Sensitivity Sampling
  • DOI:
    10.48550/arxiv.2306.09554
  • Publication date:
    2023-06
  • Journal:
  • Impact factor:
    0
  • Authors:
    Yunfan Li; Yiran Wang; Y. Cheng; Lin F. Yang
  • Corresponding authors:
    Yunfan Li; Yiran Wang; Y. Cheng; Lin F. Yang
Horizon-Free Learning for Markov Decision Processes and Games: Stochastically Bounded Rewards and Improved Bounds

Other publications by Lin Yang

Discharge Behavior and Morphological Characteristics of Suspended Water-Drop on Shed Edge during Rain Flashover of Polluted Large-Diameter Post Insulator
  • DOI:
    10.3390/en14061652
  • Publication date:
    2021-03
  • Journal:
  • Impact factor:
    3.2
  • Authors:
    Yifan Liao; Qiao Wang; Lin Yang; Zhiqiang Kuang; Yanpeng Hao; Chuyan Zhang
  • Corresponding author:
    Chuyan Zhang
DAWE: A Double Attention-Based Word Embedding Model with Sememe Structure Information
  • DOI:
    10.3390/app10175804
  • Publication date:
    2020
  • Journal:
  • Impact factor:
    0
  • Authors:
    Shengwen Li; Renyao Chen; Bo Wan; Junfang Gong; Lin Yang; Hong Yao
  • Corresponding author:
    Hong Yao
Microglial AIM2 alleviates antiviral‐related neuro‐inflammation in mouse models of Parkinson's disease
  • DOI:
    10.1002/glia.24260
  • Publication date:
    2022-08
  • Journal:
  • Impact factor:
    0
  • Authors:
    Wen‐Juan Rui; Sheng Li; Lin Yang; Ying Liu; Yi Fan; Ying‐Chao Hu; Chun‐Mei Ma; Bing‐Wei Wang; Jing‐Ping Shi
  • Corresponding author:
    Jing‐Ping Shi
Synthesis, structures and anticancer potentials of five platinum(II) complexes with benzothiazole-benzopyran targeting mitochondria
  • DOI:
    10.1016/j.poly.2020.115004
  • Publication date:
    2021-03
  • Journal:
  • Impact factor:
    2.6
  • Authors:
    Qing-Min Wei; Zu-Zhuang Wei; Jia-Jing Zeng; Lin Yang; Qi-Pin Qin; Ming-Xiong Tan; Hong Liang
  • Corresponding author:
    Hong Liang
Erosion effects on soil properties of the unique red soil hilly region of the economic development zone in southern China
  • DOI:
    10.1007/s12665-012-1616-0
  • Publication date:
    2012-03
  • Journal:
  • Impact factor:
    2.8
  • Authors:
    Xue Zhang; Zhongwu Li; Guangming Zeng; Xiaolan Xia; Lin Yang; Jiajie Wu
  • Corresponding author:
    Jiajie Wu

Similar NSFC (National Natural Science Foundation of China) grants

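Squeezed correlations of strange and heavy-quark mesons in small collision systems at LHC energies
  • Award number:
    11905085
  • Year approved:
    2019
  • Funding amount:
    CNY 240,000
  • Project type:
    Young Scientists Fund
Multifractal symbolic phase transfer entropy theory and its application to detecting small floating targets on the sea surface
  • Award number:
    61901195
  • Year approved:
    2019
  • Funding amount:
    CNY 235,000
  • Project type:
    Young Scientists Fund
Numerical study of the interaction between shock waves and small-scale turbulence structures
  • Award number:
    91852109
  • Year approved:
    2018
  • Funding amount:
    CNY 1,000,000
  • Project type:
    Major Research Plan
Construction of wavelet frames and their application to compressed sensing
  • Award number:
    11531013
  • Year approved:
    2015
  • Funding amount:
    CNY 2,300,000
  • Project type:
    Key Program
Compressed-sensing digital holographic tomography based on wavelet sparse representation
  • Award number:
    61307011
  • Year approved:
    2013
  • Funding amount:
    CNY 270,000
  • Project type:
    Young Scientists Fund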

Similar overseas grants

CIF: Small: Toward a Modern Theory of Compression: Manifold Sources and Learned Compressors
  • Award number:
    2306278
  • Fiscal year:
    2023
  • Funding amount:
    $600,000
  • Project type:
    Standard Grant
CIF: Small: Compression for Learning over networks
  • Award number:
    2007714
  • Fiscal year:
    2020
  • Funding amount:
    $600,000
  • Project type:
    Standard Grant
CIF: Small: Reconstructing Multiple Sources by Spatial Sampling and Compression
  • Award number:
    1910497
  • Fiscal year:
    2019
  • Funding amount:
    $600,000
  • Project type:
    Standard Grant
CIF: Small: Harnessing Network Compression Gains: Fundamental Limits and Practical Implementations
  • Award number:
    1617673
  • Fiscal year:
    2016
  • Funding amount:
    $600,000
  • Project type:
    Standard Grant
CIF: Small: Collaborative Research: Ordinal Data Compression
  • Award number:
    1642550
  • Fiscal year:
    2016
  • Funding amount:
    $600,000
  • Project type:
    Standard Grant