Research on Adaptive Estimation and Control of Dynamical Systems
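Basic information
- Award number: 9703812
- Principal investigator:
- Amount: $100,000
- Host institution:
- Host institution country: United States
- Grant type: Continuing Grant
- Fiscal year: 1997
- Funding country: United States
- Project period: 1997-08-01 to 2000-07-31
- Project status: Completed
- Source:
- Keywords: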
Project abstract
DMS 9703812. Research on Adaptive Estimation and Control of Dynamical Systems. Michael N. Katehakis and Herbert Robbins, Rutgers University.
Abstract: This research involves work on adaptive control of dynamic systems. The basic dynamic model is known as the "Markov decision process with incomplete information" (MDP) problem, where the transition law and/or the expected one-period rewards may depend on unknown parameters. The most notable results in this area are based on ideas utilizing either a separation principle and the related certainty-equivalence rule, or uniformly efficient rules for the model of sequential allocation known as the multi-armed bandit (MAB) problem. The certainty-equivalence rule has two limitations: i) there is no claim on the rate of convergence, and ii) there are cases in which, with positive probability, the rule converges prematurely to a wrong parameter value, so that it eventually uses only a non-optimal policy. The typical approach in the MAB-based studies has been to fit the larger MDP model into the smaller MAB one by treating each deterministic policy as a reward-generating population (bandit). As a consequence, the resulting statistically efficient procedures involve sampling from all deterministic policies and do not otherwise utilize the optimization aspect of the problem. They are therefore limited in scope by data collection complexity: in practice the state spaces of MDP models tend to be very large, and the set of deterministic policies is immense. In recent work the investigators have obtained adaptive procedures with data collection requirements that are proportional to the number of state-action pairs of the MDP, under a minimal irreducibility condition. A major direction of the proposed research involves the development of solutions for important, more general problems such as i) multi-chain MDPs, ii) the case in which there are side constraints, and iii) discounted streams of rewards.
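The gap between sampling every deterministic policy and sampling every state-action pair can be made concrete with a small count. The following is a minimal sketch, assuming a finite MDP in which every action is available in every state (a simplification not stated in the abstract); the function names are hypothetical and chosen only for illustration.

```python
def num_deterministic_policies(num_states: int, num_actions: int) -> int:
    """Count stationary deterministic policies: one action chosen per state.

    Assumes all `num_actions` actions are available in every state.
    """
    return num_actions ** num_states


def num_state_action_pairs(num_states: int, num_actions: int) -> int:
    """Count state-action pairs under the same assumption."""
    return num_states * num_actions


# Compare the two counts for a few illustrative (hypothetical) problem sizes.
for states, actions in [(5, 3), (20, 3), (100, 3)]:
    policies = num_deterministic_policies(states, actions)
    pairs = num_state_action_pairs(states, actions)
    print(f"states={states:3d} actions={actions}: "
          f"policies={policies:.3e} state-action pairs={pairs}")
```

The policy count grows exponentially in the number of states while the pair count grows linearly, which is why a procedure whose sampling effort scales with state-action pairs remains feasible for much larger models.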
A second important goal is the development of new adaptive statistical methods that possess practically useful implementation and optimality properties for the related problems of detection of total error and change points. The main idea of adaptive control is to compute strategies (policies, or control rules) for the operation of a system that estimate the unknown parameters of the system and, in doing so, converge to a strategy that is optimal for the true values of those parameters. Applications arise in many areas of modern engineering, finance, and operations research, such as reliability, maintenance, quality control, scheduling, inventory, and production planning. Consequently, this type of problem has been widely studied in the literature. However, effective procedures that take into account and optimize the speed of convergence have been obtained only recently, for specific models, and often with prohibitive data collection complexity. A primary objective of the proposed research is the development of relatively simple adaptive control procedures with reasonable computational and memory requirements for on-line implementation, for a wide class of problems, utilizing ideas from recent work of the investigators. Another important goal is the development of new methods for specific models useful in such areas as software reliability (error detection) and quality control (change points). This research relates to the following strategic areas of national concern: high performance computing, communications, and manufacturing.
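The premature-convergence limitation of the certainty-equivalence rule noted in the abstract can be illustrated on the simplest case, a two-armed Bernoulli bandit. The sketch below is an illustration under stated assumptions, not the investigators' procedure: the success rates in `TRUE_MEANS`, the horizon, and the tie-breaking rule are hypothetical, "greedy" stands in for a certainty-equivalence rule that always plays the arm with the best current estimate, and "ucb" is the later, standard UCB1-style index rule, swapped in here only as a familiar example of a rule that keeps exploring.

```python
import math
import random

# Hypothetical Bernoulli success rates; arm 1 is the better arm.
TRUE_MEANS = (0.4, 0.6)


def run(rule: str, horizon: int, rng: random.Random) -> list:
    """Play `horizon` rounds with the given rule; return pull counts per arm."""
    pulls = [0, 0]
    wins = [0, 0]
    for t in range(horizon):
        if t < 2:
            arm = t  # sample each arm once to initialise the estimates
        else:
            means = [wins[i] / pulls[i] for i in range(2)]
            if rule == "greedy":
                # Certainty equivalence: act as if the estimates were exact.
                scores = means
            else:
                # UCB1-style index: estimate plus an exploration bonus.
                scores = [means[i] + math.sqrt(2 * math.log(t) / pulls[i])
                          for i in range(2)]
            arm = 0 if scores[0] >= scores[1] else 1  # ties go to arm 0
        reward = 1 if rng.random() < TRUE_MEANS[arm] else 0
        pulls[arm] += 1
        wins[arm] += reward
    return pulls


def stuck_fraction(rule: str, runs: int = 2000, horizon: int = 200,
                   seed: int = 0) -> float:
    """Fraction of runs in which the better arm is sampled only once."""
    rng = random.Random(seed)
    stuck = sum(run(rule, horizon, rng)[1] == 1 for _ in range(runs))
    return stuck / runs


print("greedy stuck fraction:", stuck_fraction("greedy"))
print("ucb stuck fraction:   ", stuck_fraction("ucb"))
```

In this setup the greedy rule abandons the better arm whenever that arm's first sample is a failure, because its estimate stays at zero and is never revisited; that event has positive probability (0.4 under the assumed rates). The index rule's exploration bonus grows with time for a neglected arm, so the better arm is always sampled again.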
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants by Michael Katehakis
Collaborative Research: Theoretical and Algorithmic Advances in Sequential Adaptive Decisions
- Award number: 1662629
- Fiscal year: 2017
- Amount: $100,000
- Grant type: Standard Grant
EAGER: Event-Driven, Goal-Oriented Dynamic Resource Deployment
- Award number: 1450743
- Fiscal year: 2014
- Amount: $100,000
- Grant type: Standard Grant
Research on Adaptive Sampling and Stochastic Scheduling
- Award number: 8507671
- Fiscal year: 1985
- Amount: $100,000
- Grant type: Standard Grant
Similar overseas grants
Efficient and unbiased estimation in adaptive platform trials
- Award number: MR/X030261/1
- Fiscal year: 2024
- Amount: $100,000
- Grant type: Research Grant
Structured adaptive estimation: reliable "grey box" adaptation
- Award number: EP/W014734/1
- Fiscal year: 2023
- Amount: $100,000
- Grant type: Research Grant
Structured adaptive estimation: reliable "grey box" adaptation
- Award number: EP/W014661/1
- Fiscal year: 2022
- Amount: $100,000
- Grant type: Research Grant
Adaptive controllers based on patient state estimation for acute robotic rehabilitation
- Award number: 570170-2022
- Fiscal year: 2022
- Amount: $100,000
- Grant type: Postgraduate Scholarships - Doctoral
Nationally scalable and adaptive methods for traffic estimation with open data
- Award number: 2747587
- Fiscal year: 2022
- Amount: $100,000
- Grant type: Studentship
Communication Environment Estimation by Deep Learning for Improving Frequency Utilization Efficiency and its Application to Adaptive Modulation Coding
- Award number: 22K14253
- Fiscal year: 2022
- Amount: $100,000
- Grant type: Grant-in-Aid for Early-Career Scientists
Nationally scalable and adaptive methods for traffic estimation with open data.
- Award number: 2752741
- Fiscal year: 2022
- Amount: $100,000
- Grant type: Studentship
Inducing and Exploiting Grid Structures for Fast, Adaptive, and Accurate Estimation
- Award number: 1953111
- Fiscal year: 2020
- Amount: $100,000
- Grant type: Standard Grant