CAREER: Lyapunov Drift Methods for Stochastic Recursions: Applications in Cloud Computing and Reinforcement Learning

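Basic Information

  • Award number:
    2144316
  • Principal investigator:
  • Amount:
    $500,000
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Continuing Grant
  • Fiscal year:
    2022
  • Funding country:
    United States
  • Start and end dates:
    2022-05-01 to 2027-04-30
  • Project status:
    Ongoing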

Project Abstract

Part 1: The ongoing Artificial Intelligence revolution has been made possible by progress in two distinct areas. The first is the development of novel algorithms in machine learning paradigms such as Reinforcement Learning (RL) that overcome long-standing challenges; the second is the breakthroughs in cloud computing infrastructure based on large data centers, which enable one to collect, store, and process large amounts of data easily and at short notice. Despite tremendous success stories in both areas, the fundamental trade-offs and optimal performance are not understood, and theory lags far behind practice. Although they seem to be very distinct problems, both Reinforcement Learning and cloud computing can be studied using stochastic recursions. The goal of this CAREER project is to take a unified theoretical view of these two seemingly distinct areas by first developing a general theory of stochastic recursions and then using it to study both Reinforcement Learning and cloud computing. In particular, we will use the theory to develop novel learning algorithms with provably optimal sample complexity across various paradigms, such as off-policy learning and the actor-critic framework. The theory of stochastic recursions and the novel learning algorithms will also be used to develop optimal scheduling algorithms for cloud computing data centers that minimize the tail of the delay experienced by users. The novel algorithms developed during the course of this project will be implemented through collaborations with industry partners as well as on Georgia Tech’s internal cloud. A Jupyter-based open-source RL simulation platform will be developed, and the novel algorithms developed during the course of this project will be included in this platform.
The platform will be used not only to disseminate the outcomes of this project, but also for undergraduate research projects, course projects for a new course on Reinforcement Learning, and STEM outreach activities in K-12 education. In addition to disseminating research results through conference and journal publications, we will develop a novel special topics course and publish a monograph on the unified Lyapunov framework for stochastic recursions. Training of graduate and undergraduate students also forms a core part of the project, with special emphasis on mentoring future faculty.
Part 2: Intellectual Merit: The proposed work is organized into three interdependent thrusts. Thrust I builds a Lyapunov theory of stochastic recursions, in which we obtain finite-time mean-square error and exponential tail bounds and characterize the steady-state limiting distribution for a broad class of stochastic recursions. This thrust forms the foundation for the next two thrusts. Thrust II studies the finite-time mean-square bounds, tail probability bounds (also known as PAC bounds), sample complexity, and steady-state behavior of RL algorithms under three paradigms, namely off-policy RL, two-time-scale policy-space algorithms (such as actor-critic), and average-reward RL, and develops novel, fast RL algorithms with near-optimal sample efficiency. Thrust III studies scheduling problems in data center networks, with the goal of minimizing mean delay and delay tails. Using the Lyapunov theory from Thrust I, we develop novel low-complexity algorithms with provable guarantees on steady-state delay in the heavy-traffic asymptotic regime. With these as initial policies, we will deploy RL algorithms from Thrust II to learn new scheduling policies that are optimal even in the pre-asymptotic regime, which is of practical interest. All the proposed algorithms will be evaluated on real-world traffic traces through our collaborations with industry partners.
Broader Impacts: The proposed work and the PI’s ongoing industry collaborations have the potential for significant societal impact by making RL and cloud computing more efficient. The proposed Lyapunov theory for stochastic recursions is applicable in many other disciplines, so the PI will disseminate it widely through a special topics course, a monograph, and tutorials, in addition to conference and journal publications. The project integrates research with educational activities at every level. A Jupyter-based RL simulation platform and a library of notebooks that we will build will serve as an extensive pedagogical resource for these activities. The PI will continue his ongoing involvement in undergraduate research through the REU program and the VIP program at Georgia Tech. To meet growing demand, the PI will develop a new interdisciplinary undergraduate-level RL course that makes extensive use of the RL simulation platform. To promote STEM activities, the PI will take part in outreach to local high schools, working with an academic professional in ISyE, and will mentor high school teachers through the GIFT program. To support Ph.D. students interested in academic careers, the PI runs a future faculty mentorship program. The PI is committed to broadening participation; he currently advises a female Hispanic student and has advised several URM undergraduate students.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project Outcomes

Journal articles (5)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Logarithmic heavy traffic error bounds in generalized switch and load balancing systems
  • DOI:
    10.1017/jpr.2021.82
  • Publication date:
    2022
  • Journal:
  • Impact factor:
    1
  • Authors:
    Hurtado-Lange, Daniela; Varma, Sushil Mahavir; Maguluri, Siva Theja
  • Corresponding author:
    Maguluri, Siva Theja
Federated Reinforcement Learning: Linear Speedup Under Markovian Sampling
Finite-sample analysis of nonlinear stochastic approximation with applications in reinforcement learning
  • DOI:
    10.1016/j.automatica.2022.110623
  • Publication date:
    2019-05
  • Journal:
  • Impact factor:
    0
  • Authors:
    Zaiwei Chen; Sheng Zhang; Thinh T. Doan; J. Clarke; S. T. Maguluri
  • Corresponding author:
    Zaiwei Chen; Sheng Zhang; Thinh T. Doan; J. Clarke; S. T. Maguluri
Power-of-d Choices Load Balancing in the Sub-Halfin Whitt Regime
  • DOI:
  • Publication date:
    2023
  • Journal:
  • Impact factor:
    0
  • Authors:
    Varma, Sushil M.; Castro, Francisco; Maguluri, Siva Theja
  • Corresponding author:
    Maguluri, Siva Theja

Other Grants by Siva Theja Maguluri

Two-sided Queues and Networked Matching Platforms
  • Award number:
    2140534
  • Fiscal year:
    2022
  • Funding amount:
    $500,000
  • Project type:
    Standard Grant
CRII: CIF: Resource Allocation in Data Center Networks: Algorithms, Fundamental Limits and Performance Bounds
  • Award number:
    1850439
  • Fiscal year:
    2019
  • Funding amount:
    $500,000
  • Project type:
    Standard Grant
CIF: Small: Collaborative Research: Analytics on Edge-labeled Hypergraphs: Limits to De-anonymization
  • Award number:
    1944993
  • Fiscal year:
    2019
  • Funding amount:
    $500,000
  • Project type:
    Standard Grant

Similar NSFC Grants

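Research on stability strategies for fractional-order stochastic neural networks under a vector Lyapunov function framework
  • Award number:
  • Approval year:
    2024
  • Funding amount:
    CNY 0
  • Project type:
    Provincial/municipal project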
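Fast algorithms for large-scale generalized Lyapunov equations and related preconditioning techniques
  • Award number:
    12361080
  • Approval year:
    2023
  • Funding amount:
    CNY 270,000
  • Project type:
    Regional Science Fund project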
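Quantitative study of small defects using pipeline guided waves in complex environments, based on combining stochastic resonance and Lyapunov exponents
  • Award number:
    n/a
  • Approval year:
    2022
  • Funding amount:
    CNY 100,000
  • Project type:
    Provincial/municipal project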
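Typhoon observation layout in the Taiwan Strait based on the nonlinear local Lyapunov exponent method
  • Award number:
  • Approval year:
    2022
  • Funding amount:
    CNY 300,000
  • Project type:
    Young Scientists Fund project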
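Analysis and synthesis of hybrid-driven intermittent control systems based on triggering-mechanism-dependent Lyapunov functions
  • Award number:
  • Approval year:
    2022
  • Funding amount:
    CNY 300,000
  • Project type:
    Young Scientists Fund project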
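Computation of robust regions of attraction for uncertain switched systems based on Lyapunov-like functions
  • Award number:
  • Approval year:
    2022
  • Funding amount:
    CNY 300,000
  • Project type:
    Young Scientists Fund project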
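Analysis and synthesis of hybrid-driven intermittent control systems based on triggering-mechanism-dependent Lyapunov functions
  • Award number:
    62203412
  • Approval year:
    2022
  • Funding amount:
    CNY 300,000
  • Project type:
    Young Scientists Fund project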
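Quantum state manipulation based on Lyapunov control in cavity QED systems
  • Award number:
  • Approval year:
    2021
  • Funding amount:
    CNY 300,000
  • Project type:
    Young Scientists Fund project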
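Stability of time-delay systems on time scales based on non-strict Lyapunov functional techniques
  • Award number:
    62003195
  • Approval year:
    2020
  • Funding amount:
    CNY 240,000
  • Project type:
    Young Scientists Fund project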
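Indefinite Lyapunov function methods for input-output stability analysis of switched systems
  • Award number:
    2020JJ5990
  • Approval year:
    2020
  • Funding amount:
    CNY 0
  • Project type:
    Provincial/municipal project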

Similar Overseas Grants

Safe Lyapunov-Based Deep Neural Network Adaptive Control of a Rehabilitative Upper Extremity Hybrid Exoskeleton
  • Award number:
    2230971
  • Fiscal year:
    2023
  • Funding amount:
    $500,000
  • Project type:
    Standard Grant
Parametric deformation of control Lyapunov function for nonlinear systems and its applications
  • Award number:
    23H01430
  • Fiscal year:
    2023
  • Funding amount:
    $500,000
  • Project type:
    Grant-in-Aid for Scientific Research (B)
CAREER: Efficient Learning of Equilibria in Dynamic Bayesian Games with Nash, Bellman and Lyapunov
  • Award number:
    2238838
  • Fiscal year:
    2023
  • Funding amount:
    $500,000
  • Project type:
    Continuing Grant
Collaborative Research: Data-driven Power Systems Control with Stability Guarantee: A Lyapunov Approach
  • Award number:
    2200692
  • Fiscal year:
    2022
  • Funding amount:
    $500,000
  • Project type:
    Standard Grant
The Complex Dynamics of Large Systems with Long-Range Interactions: New Insights from Covariant Lyapunov Vectors
  • Award number:
    2138055
  • Fiscal year:
    2022
  • Funding amount:
    $500,000
  • Project type:
    Standard Grant
Dynamical systems with observable Lyapunov irregular sets
  • Award number:
    22K03342
  • Fiscal year:
    2022
  • Funding amount:
    $500,000
  • Project type:
    Grant-in-Aid for Scientific Research (C)
Lyapunov exponent
  • Award number:
    2602125
  • Fiscal year:
    2021
  • Funding amount:
    $500,000
  • Project type:
    Studentship
Lyapunov Exponents and Spectral Properties of Aperiodic Structures
  • Award number:
    EP/S010335/1
  • Fiscal year:
    2019
  • Funding amount:
    $500,000
  • Project type:
    Research Grant
Lyapunov Vector Field-Based Guidance law for Spacecraft Motion Synchronization
  • Award number:
    539314-2019
  • Fiscal year:
    2019
  • Funding amount:
    $500,000
  • Project type:
    University Undergraduate Student Research Awards
Lyapunov exponents and invariant measures of deterministic maps and random maps
  • Award number:
    526905-2018
  • Fiscal year:
    2018
  • Funding amount:
    $500,000
  • Project type:
    University Undergraduate Student Research Awards