ERI: Improving the Learning Efficiency of Adaptive Optimal Control Systems in Information-Limited Environments

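Basic Information

Award number:
2138206
Principal investigator:
Kim-Doang Nguyen
Amount:
$200,000
Institution:
Florida Institute of Technology
Institution country:
United States
Award type:
Standard Grant
Fiscal year:
2022
Funding country:
United States
Project period:
2022-01-01 to 2024-12-31
Project status:
Completed

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=2138206&HistoricalAwards=false
Keywords: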
ERI Improving Learning Efficiency Adaptive

Project Abstract

This Engineering Research Initiation (ERI) grant will fund research that enables efficient, on-the-fly learning of optimal control strategies for complex engineering systems operating in uncertain environments, with application to connected and autonomous vehicles, thereby promoting the progress of science and advancing the national prosperity and welfare. Many emerging control systems in the artificial intelligence, automotive, robotic, and energy fields require that optimal actions be identified and executed across a network of components in the absence of detailed system knowledge and based on limited input data. Learning-based approaches have been developed to meet this requirement, but are challenged by very slow rates of learning, restrictive requirements on the control policy used to initiate the learning process, and complications due to sensor limitations and sparse data sharing between individual components. This project will overcome these challenges by building a new learning-efficient control framework that integrates advantages of existing methods and demonstrates new solutions for handling missing data streams and optimizing the communication structure between system components. When applied to networks of autonomous vehicles in complex traffic scenarios, the framework may enable improvements in roadway safety and reduction in road fatalities. The broader impacts of this project include outreach efforts to the public intended to show how artificial intelligence and automatic control can be safely leveraged, as well as training and preparation of undergraduate and graduate students to pursue further education and advanced STEM careers.

This research aims to make fundamental contributions to the development of a learning-efficient, adaptive optimal control framework for nonlinear dynamical systems with completely unknown system models and under conditions of partial observability, and to enable the application of this framework to networked control systems with nontrivial communication topologies. It will achieve this outcome by developing a new hybrid iterative form of reinforcement learning that achieves a quadratic rate of convergence even if a system model and an initial admissible control policy are unavailable. Inspired by ideas from hierarchical reinforcement learning, a two-layer learning-efficient method will be created to enable simultaneous learning of robust distributed control strategies for individual network agents and an optimal network communication topology, including in the presence of communication delays between agents. Micro-traffic simulations and physical experiments will be used to test the theoretical framework in the context of collision avoidance in several formation control scenarios.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
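
As a point of reference for the abstract's mention of a "quadratic rate of convergence" and an "initial admissible control policy", the following is a minimal sketch of classical policy iteration (Kleinman's algorithm) on a scalar linear-quadratic problem. It is not the project's method: it assumes a fully known model and a stabilizing initial gain, which are exactly the restrictions the abstract says the proposed framework is meant to remove. The function names and the numeric values of `a`, `b`, `q`, `r`, and `k0` are illustrative assumptions.

```python
import math


def policy_iteration_scalar_lqr(a, b, q, r, k0, iterations=6):
    """Classical policy iteration for dx/dt = a*x + b*u with
    cost = integral of (q*x^2 + r*u^2) and feedback u = -k*x.

    Returns the list of value coefficients p_i, where V_i(x) = p_i * x^2.
    Hypothetical helper for illustration only; requires a known model.
    """
    if a - b * k0 >= 0:
        # The classical method needs a stabilizing (admissible) starting gain.
        raise ValueError("initial gain is not admissible (closed loop is unstable)")
    k = k0
    history = []
    for _ in range(iterations):
        # Policy evaluation: solve 2*(a - b*k)*p + q + r*k^2 = 0 for p.
        p = (q + r * k * k) / (-2.0 * (a - b * k))
        history.append(p)
        # Policy improvement: gain that minimizes the Hamiltonian for this p.
        k = b * p / r
    return history


def riccati_solution(a, b, q, r):
    """Positive root of the scalar Riccati equation 2*a*p - (b^2/r)*p^2 + q = 0."""
    return r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)


if __name__ == "__main__":
    # Illustrative values: an unstable open-loop plant (a > 0).
    a, b, q, r = 1.0, 1.0, 1.0, 1.0
    p_star = riccati_solution(a, b, q, r)
    for i, p in enumerate(policy_iteration_scalar_lqr(a, b, q, r, k0=5.0)):
        print(f"iteration {i}: p = {p:.12f}, error = {abs(p - p_star):.3e}")
```

In this scalar case each iteration is a Newton step on the Riccati equation, so the error at one step is proportional to the square of the error at the previous step. Starting from a gain that does not stabilize the system (for example `k0=0.5` with the values above) raises an error, which illustrates why the admissibility requirement is restrictive in practice.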
