
Biologically-Inspired Robust Adaptive Dynamic Programming for Continuous-Time Stochastic Systems


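Basic Information

Award number:
1501044
Principal investigator:
Zhong-Ping Jiang
Amount:
approximately $284,600
Host institution:
New York University
Host institution country:
United States
Award type:
Standard Grant
Fiscal year:
2015
Funding country:
United States
Project period:
2015-08-01 to 2019-07-31
Project status:
Completed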

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=1501044&HistoricalAwards=false
Keywords:
Biologically Inspired Robust Adaptive Dynamic

Project Abstract

The project aims to develop tools and methods, inspired by neurobiology, for addressing the need in building brain-like reinforcement learning systems and, ultimately, contributing to the understanding of brain functions. The project seeks to address fundamentally challenging issues arising from the robust optimal management of large complex systems subject to stochastic effects, nonlinearity, and dynamic uncertainties. Research findings from this project will contribute new solutions to emerging engineering applications such as the smart electricity grid, robotics, and intelligent transportation systems. The proposed research will have a substantial direct impact upon education at the PI's institution. The interdisciplinary nature of the project should appeal to students from several departments.
The project team will work on stochastic variants of adaptive dynamic programming (ADP) for continuous-time systems subject to stochastic and dynamic disturbances. ADP is a practically sound data-driven, non-model based approach for optimal control design in complex systems. ADP has been extensively studied for Markov decision processes, focusing mostly on discrete and finite state-space, and for deterministic (discrete- and continuous-time) dynamic systems. Stability and robustness issues in the presence of dynamic uncertainties are seldom addressed systematically. For problems involving complex modern engineering systems or biological systems, for which stability is an important concern, straightforward application of the existing ADP results does not seem productive or even likely to be successful. Hence, it is necessary to develop novel tools and methods for ADP design of general stochastic systems in continuous-time and continuous state-space, with rigorous stability and convergence analysis. The novelty of the proposed research consists of application and extension of techniques from reinforcement learning, stochastic systems theory, and nonlinear control theory.
The specific goals of the proposal are the development of tools and methods for stochastic adaptive dynamic programming for linear and nonlinear stochastic systems, stochastic adaptive optimal control with robustness to dynamic uncertainties, and application to human motor systems. Rigorous stability proofs, convergence analysis of learning algorithms, and robustness analysis will be pursued. Important classes of continuous-time linear and nonlinear models with multiplicative and additive noise will be studied, along with non-model based, stochastic optimal controller designs. Beyond engineering applications, it is believed that bringing together ADP and research in computational neuroscience may yield new methodologies for the diagnosis and treatment of neurodegenerative genetic disorders that affect muscle coordination. One such medical condition is Parkinson's disease, which affects approximately seven million people globally, and one million in the United States. Generalizing the PI's recent work in linear stochastic variants of robust adaptive dynamic programming can lead to a potentially new computational mechanism for human motor control.
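For readers unfamiliar with the model class named above, the following is an illustrative formulation only. The abstract does not state a model, so every symbol here (A, B, C, D, G, Q, R and the Brownian motions w_0, w_1, w_2) is an assumption introduced for this sketch. A linear continuous-time system with both additive and multiplicative noise is commonly written as a stochastic differential equation, paired with a quadratic cost:

```latex
% Illustrative only: all symbols are assumptions, not taken from the award abstract.
% x: state, u: control input, w_0, w_1, w_2: independent standard Brownian motions.
\begin{align}
  dx &= (A x + B u)\,dt
        + G\,dw_0          % additive noise: intensity independent of x and u
        + C x\,dw_1        % multiplicative (state-dependent) noise
        + D u\,dw_2        % multiplicative (control-dependent) noise
  \\
  J  &= \limsup_{T \to \infty} \frac{1}{T}\,
        \mathbb{E}\!\left[ \int_0^T \left( x^\top Q x + u^\top R u \right) dt \right],
        \qquad Q \succeq 0,\; R \succ 0 .
\end{align}
```

The long-run average form of the cost is used in this sketch because additive noise keeps the state from settling at the origin, so an undiscounted infinite-horizon integral would generally diverge. In a non-model-based design of the kind the abstract describes, the controller would be learned from measured state and input data without knowledge of the system matrices.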
