Stochastic Optimal Control based on Gaussian Processes Regression

Basic Information

Grant number:
349395379
Principal investigator:
Professor Dr.-Ing. Uwe D. Hanebeck
Funding amount:
--
Host institution:
Lehrstuhl für Intelligente Sensor-Aktor-Systeme (ISAS)
Host institution country:
Germany
Project type:
Research Grants
Fiscal year:
2017
Funding country:
Germany
Project period:
2016-12-31 to 2020-12-31
Project status:
Completed

Source:
https://gepris.dfg.de/gepris/projekt/349395379?language=en
Keywords:
Stochastic Optimal Control based Gaussian

Project Abstract

In stochastic control, optimal decision making in continuous domains under statistically modeled uncertainty is usually addressed via Dynamic Programming (DP). The goal is to find policies that map the information available to the controller to a control input in such a way that a performance criterion, often defined in terms of costs, is optimized. Usually, nonlinear filtering methods condense this information into a probability distribution that represents the state estimate of the system to be controlled, and the policies map these distributions to control inputs.
Unfortunately, DP is intractable except in a few very special cases. Therefore, approximate but tractable approaches are of interest. One such approach is the point-based value iteration algorithm, where each point is a probability distribution. In this approach, the controller maintains the optimal costs for a set of representative state estimates instead of attempting the impossible task of maintaining the costs for all state estimates, as would be required in classical DP. It then uses this information to approximate the optimal costs at the state estimate needed for decision making. Point-based value iteration therefore requires approximation methods for functions defined over general probability distributions. However, state-of-the-art approaches either restrict the class of possible state estimates or assume finite sets of control inputs and measurements. Although workarounds for continuous control inputs and measurements exist, they usually require additional approximations.
For this reason, we propose a novel approach to stochastic control of nonlinear dynamical systems with continuous states, control inputs, and measurements that is based on Gaussian Process (GP) regression. Classical GP regression only allows for deterministic vector-valued inputs. We therefore propose a novel extension of the GP framework to inputs given in the form of probability distributions. By doing so, we extend the GP framework to infinite-dimensional inputs. Our approach is based on the idea of defining the covariance functions that determine the GP in terms of the distance between the probability distributions provided as inputs to the GP.
In the course of the project, we plan to develop a solid framework for GPs defined over general probability distributions and to derive stochastic control algorithms that use such GPs to compute the policy. We believe that the proposed project will substantially contribute to research on stochastic control. Furthermore, the presented idea for defining GPs with inputs given in terms of probability distributions can also be used in machine learning research in order to derive other non-parametric Bayesian regression and classification methods over probability distributions.
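The idea of a covariance function defined through a distance between input distributions can be illustrated with a minimal sketch. The abstract does not say which distance or covariance function the project uses, so everything specific here is an assumption made for illustration: state estimates are restricted to 1-D Gaussians given as (mean, standard deviation) pairs, the distance is the 2-Wasserstein distance (which has a closed form in this case), the covariance is squared-exponential, and the cost values are a toy stand-in for the optimal costs a point-based method would store. The names `w2_squared`, `kernel`, and `gp_predict` are hypothetical.

```python
import numpy as np

def w2_squared(m1, s1, m2, s2):
    """Squared 2-Wasserstein distance between 1-D Gaussians N(m1, s1^2), N(m2, s2^2)."""
    return (m1 - m2) ** 2 + (s1 - s2) ** 2

def kernel(A, B, length_scale=1.0, signal_var=1.0):
    """Squared-exponential covariance on distributions (assumed form).

    Rows of A and B are (mean, std) pairs describing Gaussian state estimates.
    """
    d2 = w2_squared(A[:, None, 0], A[:, None, 1], B[None, :, 0], B[None, :, 1])
    return signal_var * np.exp(-0.5 * d2 / length_scale**2)

def gp_predict(train_X, train_y, test_X, noise_var=1e-6):
    """Posterior mean and variance of a zero-mean GP at the test distributions."""
    K = kernel(train_X, train_X) + noise_var * np.eye(len(train_X))
    K_s = kernel(test_X, train_X)
    L = np.linalg.cholesky(K)
    # Solve K alpha = y via the Cholesky factor instead of inverting K.
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, train_y))
    mean = K_s @ alpha
    v = np.linalg.solve(L, K_s.T)
    var = np.diag(kernel(test_X, test_X)) - np.sum(v**2, axis=0)
    return mean, var

# Representative state estimates (mean, std); values are illustrative only.
beliefs = np.array([[0.0, 0.5], [1.0, 0.5], [2.0, 1.0], [-1.0, 1.5]])
# Toy cost: E[x^2] = mean^2 + std^2, standing in for stored optimal costs.
costs = beliefs[:, 0] ** 2 + beliefs[:, 1] ** 2

# Approximate the cost at a state estimate that is not in the representative set.
query = np.array([[0.5, 0.7]])
mean, var = gp_predict(beliefs, costs, query)
print(mean, var)
```

In this 1-D Gaussian case the 2-Wasserstein distance equals the Euclidean distance between (mean, std) pairs, so the resulting covariance matrix is positive definite for distinct inputs. For general distributions or other distances, positive definiteness is not automatic and would need to be checked.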

Project Outcomes

Journal articles (3)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Stochastic Optimal Control Using Gaussian Process Regression over Probability Distributions

DOI:
10.23919/acc.2019.8814658
Publication date:
2019-07
Journal:
2019 American Control Conference (ACC)
Impact factor:
0
Authors:
Jana Mayer; Maxim Dolgov; Tobias Stickling; Selim Özgen; Florian Rosenthal; U. Hanebeck
Corresponding authors:
Jana Mayer; Maxim Dolgov; Tobias Stickling; Selim Özgen; Florian Rosenthal; U. Hanebeck

Position and Speed Estimation of PMSMs Using Gaussian Processes
