Theory and Applications of Approximate Dynamic Programming

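Basic information

Award number:
9625489
Principal investigator:
John Tsitsiklis
Amount:
about $324,800
Awardee institution:
Massachusetts Institute of Technology
Institution country:
United States
Award type:
Continuing Grant
Fiscal year:
1996
Funding country:
United States
Project period:
1996-07-01 to 1999-12-31
Project status:
Completed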

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=9625489&HistoricalAwards=false
Keywords:
Theory Applications Approximate Dynamic Programming

Project abstract

This project deals with problems of sequential decision making under uncertainty that are too large for the classical methods of dynamic programming. The focus is on approximate dynamic programming, a rich field that combines classical dynamic programming, reinforcement learning and other methods from artificial intelligence, approximation theory and neural networks, and simulation. The objectives of the research are as follows: (1) Resolve several outstanding questions related to simulation-based methods involving lookup table representations. (2) Provide a theoretical understanding of approximate dynamic programming methods when a compact representation of the cost-to-go function is employed, in particular by identifying classes of algorithm/architecture combinations for which convergence is guaranteed, together with error bounds. (3) Develop the theory and algorithms for average cost approximate dynamic programming. (4) Develop the theory and algorithms for approximate dynamic programming for the case of Markov games. (5) Carry out a number of case studies that will provide a better understanding of the comparative performance of different methods and demonstrate the usefulness of the methods on realistic problems of real-world importance. Three case studies involve: (a) dynamic scheduling of a fleet of trucks; (b) pricing of complex derivative financial instruments; (c) the game of football. The overall objectives are to provide a comprehensive mathematical foundation for the field of approximate dynamic programming and at the same time succeed in solving some difficult and important problems.
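
To make the contrast between a lookup table and a compact representation of the cost-to-go function concrete, the following is a minimal sketch and not the project's own method. It runs value iteration on a toy 12-state chain in which blocks of neighboring states share one parameter (state aggregation, one simple architecture for which the iteration is a contraction). The chain, the stage cost, the discount factor, and every name in the code are assumptions made up for illustration.

```python
# Approximate value iteration with state aggregation on a toy problem.
# All constants and function names here are hypothetical illustrations.

N_STATES = 12          # states 0..11 on a line
ACTIONS = (-1, +1)     # try to move left or right
ALPHA = 0.9            # discount factor
P_MOVE = 0.8           # probability the intended move succeeds; otherwise stay


def stage_cost(s):
    """Assumed per-stage cost: distance from state 3."""
    return abs(s - 3)


def bellman_backup(s, J):
    """Minimum over actions of stage cost plus discounted expected cost-to-go."""
    best = float("inf")
    for a in ACTIONS:
        nxt = min(max(s + a, 0), N_STATES - 1)   # clip at the ends of the chain
        expected = P_MOVE * J[nxt] + (1 - P_MOVE) * J[s]
        best = min(best, stage_cost(s) + ALPHA * expected)
    return best


def aggregated_value_iteration(group_size, sweeps=500):
    """Fit one parameter per block of `group_size` consecutive states."""
    n_groups = (N_STATES + group_size - 1) // group_size
    theta = [0.0] * n_groups
    for _ in range(sweeps):
        # Expand the compact parameters into a cost-to-go estimate per state.
        J = [theta[s // group_size] for s in range(N_STATES)]
        new_theta = []
        for g in range(n_groups):
            members = range(g * group_size, min((g + 1) * group_size, N_STATES))
            # Average the Bellman backups over the states in the block.
            new_theta.append(sum(bellman_backup(s, J) for s in members) / len(members))
        theta = new_theta
    return theta, [theta[s // group_size] for s in range(N_STATES)]


# group_size=1 is the lookup-table case; group_size=3 uses 4 parameters.
theta_exact, J_exact = aggregated_value_iteration(group_size=1)
theta_compact, J_compact = aggregated_value_iteration(group_size=3)
max_err = max(abs(a - b) for a, b in zip(J_exact, J_compact))
print(len(theta_exact), len(theta_compact), max_err)
```

With one state per block the sketch reduces to ordinary value iteration, so the printed gap shows what the compact architecture gives up in accuracy in exchange for fewer parameters on this toy problem.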
