
CAREER: Towards a Principled Framework for Resilient, Data Efficient and Scalable Reinforcement Learning for Control

Basic Information

Award number:
2045783
Principal investigator:
Dileep Kalathil
Amount:
$500,000
Host institution:
Texas A&M Engineering Experiment Station
Host institution country:
United States
Award type:
Continuing Grant
Fiscal year:
2021
Funding country:
United States
Project period:
2021-02-01 to 2026-01-31
Project status:
Active (not yet concluded)

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=2045783&HistoricalAwards=false
Keywords:
CAREER Towards Principled Framework Resilient

Project Abstract

The success of the traditional control system design depends crucially on the availability of tractable system models with known parameters, well-understood sources of uncertainty and clearly specified objectives. These assumptions are no longer true in the emerging paradigm of intelligent and autonomous large-scale engineering systems, such as next generation electricity systems. A data-driven machine learning approach that takes advantage of large amounts of data coming from such systems can provide a promising path forward. However, unlike the remarkable successes of machine learning in classification problems such as image recognition, reinforcement learning (RL) that addresses the problem of “learning to control” has seen achievements limited to more structured or simulated environments, and its successes in real-world engineering systems are not as prominent. There are three critical issues that significantly impede the success of RL in real-world engineering systems: lack of resiliency, data efficiency, and scalability. This CAREER proposal develops a principled approach for the RL-based design of control algorithms for large-scale real-world engineering systems, by overcoming the fundamental challenges of resiliency, data efficiency, and scalability. The main application domain of interest is electricity systems, which guides the problem formulation and solution approaches, and lends credence to the algorithms using real-world examples. The project has an innovative education plan that includes the ‘Aggie DeepRacer Project’ that follows an ‘experiential learning’ approach for integrating the research in RL into the educational curriculum. Mentoring students from underrepresented minorities through collaboration with the Louis Stokes Alliances for Minority Participation (LSAMP) program, and hosting teachers from low socioeconomic status schools with a high percentage of minority students strengthens the project. Project outcomes include development of activity-based learning modules for high school students and presenting them at the Aggie STEM summer camp and Physics and Engineering Festival. By developing a data-driven and learning-based approach for efficient control of power systems, this project also contributes to reducing the cost of fuel and operations, and hence significantly increasing the reliability of the overall energy system.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project Outcomes

Journal articles (16)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Distributionally Robust Behavioral Cloning for Robust Imitation Learning

DOI:
10.1109/cdc49753.2023.10383976
Publication date:
2023-12
Journal:
2023 62nd IEEE Conference on Decision and Control (CDC)
Impact factor:
0
Authors:
Kishan Panaganti;Zaiyan Xu;D. Kalathil;Mohammad Ghavamzadeh
Corresponding authors:
Kishan Panaganti;Zaiyan Xu;D. Kalathil;Mohammad Ghavamzadeh

Meta-Learning Online Control for Linear Dynamical Systems

DOI:
10.1109/cdc51059.2022.9993222
Publication date:
2022
Journal:
IEEE 61st Conference on Decision and Control (CDC)
Impact factor:
0
Authors:
Muthirayan, Deepan;Kalathil, Dileep;Khargonekar, Pramod P.
Corresponding author:
Khargonekar, Pramod P.

DOPE: Doubly Optimistic and Pessimistic Exploration for Safe Reinforcement Learning

DOI:
Publication date:
2021-12
Journal:
Impact factor:
0
Authors:
Archana Bura;Aria HasanzadeZonuzy;D. Kalathil;S. Shakkottai;J. Chamberland
Corresponding authors:
Archana Bura;Aria HasanzadeZonuzy;D. Kalathil;S. Shakkottai;J. Chamberland

Robust Reinforcement Learning using Least Squares Policy Iteration with Provable Performance Guarantees

DOI:
Publication date:
2020-06
Journal:
Impact factor:
0
Authors:
K. Badrinath;D. Kalathil
Corresponding authors:
K. Badrinath;D. Kalathil

Enhanced Meta Reinforcement Learning via Demonstrations in Sparse Reward Environments

DOI:
Publication date:
2022
Journal:
Advances in Neural Information Processing Systems
Impact factor:
0
Authors:
Rengarajan, Desik;Chaudhary, Sapana;Kim, Jaewon;Kalathil, Dileep;Shakkottai, Srinivas
Corresponding author:
Shakkottai, Srinivas