A probabilistic toolkit to study regularity of free boundaries in stochastic optimal control

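Basic information

Grant number:
EP/R021201/1
Principal investigator:
Tiziano De Angelis
Amount:
$127,900
Host institution:
University of Leeds
Host institution country:
United Kingdom
Project category:
Research Grant
Fiscal year:
2018
Funding country:
United Kingdom
Start and end dates:
2018 to (no data)
Project status:
Completed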

Source:
https://gtr.ukri.org/projects?ref=EP%2FR021201%2F1
Keywords:
probabilistic toolkit study regularity free

Project abstract

Imagine a spaceship that travels towards a planet and must reach it by a given date. After the launch, and in the absence of further intervention, the change in relative position of the spaceship and the planet cannot be predicted with 100% accuracy. The trajectory must be constantly monitored and unforeseen variations must be accounted for in order to reach the target. However, several constraints must be considered: for example, the availability of fuel and the effectiveness of intervention (if the spaceship is too far off the target, late interventions will not bring it back to the desired trajectory). The question is therefore how to strike the right balance between costs and benefits in order to control the trajectory of the spaceship in the optimal way.

This problem was formulated as one of "stochastic control" in the 1960s by J. Bather and H. Chernoff. A quick search in the NASA Technical Reports Server shows that "stochastic control theory" is at the core of aerospace engineering (6,843 matching records). Interestingly, this exciting branch of mathematics finds applications in many other real-world problems, including in physics, biology, energy and economics.

To give an oversimplified idea of what a solution may look like in the problem above, we could say that it is optimal to make a "small" adjustment to the spaceship's trajectory each time the offset between spaceship and planet exceeds a value that depends on the available fuel and on the time elapsed since launch. This critical value is called the "free boundary" in mathematics, and it is the key unknown quantity in most stochastic control problems. In practice, the shape and smoothness of the free boundary (as a function of time and fuel, in the example) are needed to enable efficient (numerical) evaluation of the spaceship's optimal trajectory (e.g. leading to minimal use of fuel). Associated with each control problem there is indeed a "cost function" which measures the performance of the control strategy. In the example this may be taken as proportional to the distance of the spaceship from the target, plus the cost of using fuel. The aim of the controller is to minimise the expected value of such cost. When the optimal control is implemented, the resulting expected cost is called the "value function". The value function is the other main unknown object in this context and, together with the free boundary, its study goes under the name of "free boundary problem" (FBP).

FBPs are addressed in Analysis and in Probability theory. However, it is often impossible to find a full analogy between results in the two fields. On the one hand, Probability can only explain very limited smoothness of the free boundary and of the value function, but is flexible in modelling randomness in the system. On the other hand, Analysis obtains fine regularity results, but mostly under rather inflexible assumptions on the model. In our example above, PDE theory gives very accurate optimal controls if the spaceship's trajectory is described by a simple model. However, in practice engineers must deal with a wide class of random perturbations and need a versatile probabilistic approach. The latter must be supported by a refined probabilistic understanding of FBPs, which is the objective of this proposal.

In this project I will develop a new framework for the study of free boundary problems that will hinge on properties of random noises drawn from the class of diffusion processes. My work will provide advanced tools that will not only unify the probabilistic and analytic approaches to stochastic control but, more importantly, will remove some of the long-standing assumptions in both areas and allow for tractable solutions to a whole new class of applied problems. Moreover, I will obtain ODEs to accurately compute optimal strategies in multi-dimensional settings (see 2.5 in Case for Support). Non-linear integral equations, currently used, cannot be computed efficiently in dimension higher than two.
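
The spaceship example in the abstract can be illustrated with a minimal Python sketch. It is not the project's method: the offset is modelled as a simple Gaussian random walk, and the `boundary` function below is a made-up placeholder for the free boundary, whose true shape is precisely the unknown that the project studies. All names and parameters (`boundary`, `simulate`, `expected_cost`, `sigma`, `fuel_price`) are assumptions introduced here for illustration. The policy pushes the offset back to the boundary whenever it is exceeded, and the cost is the final distance from the target plus the fuel spent.

```python
import random


def boundary(t, fuel_left, horizon):
    """Hypothetical free boundary (an assumption, not derived from any model):
    the tolerated offset shrinks as the deadline nears and grows when fuel is scarce."""
    time_left = horizon - t
    return 0.5 + time_left / horizon + 1.0 / (1.0 + fuel_left)


def simulate(horizon=100, fuel=20.0, sigma=0.3, fuel_price=1.0, seed=0):
    """Simulate one trajectory under the threshold policy.

    Returns (cost, fuel_used, final_offset)."""
    rng = random.Random(seed)
    offset = 0.0      # signed distance between spaceship and target
    fuel_used = 0.0
    for t in range(horizon):
        # Uncontrolled random perturbation of the trajectory.
        offset += rng.gauss(0.0, sigma)
        fuel_left = fuel - fuel_used
        b = boundary(t, fuel_left, horizon)
        # Intervene only when the offset exceeds the boundary and fuel remains.
        if abs(offset) > b and fuel_left > 0:
            # Push back to the boundary, limited by the remaining fuel.
            correction = min(abs(offset) - b, fuel_left)
            offset -= correction if offset > 0 else -correction
            fuel_used += correction
    # Cost: final distance from the target plus the cost of the fuel spent.
    cost = abs(offset) + fuel_price * fuel_used
    return cost, fuel_used, offset


def expected_cost(n_paths=2000, **kwargs):
    """Monte Carlo estimate of the expected cost of the threshold policy."""
    total = 0.0
    for seed in range(n_paths):
        cost, _, _ = simulate(seed=seed, **kwargs)
        total += cost
    return total / n_paths
```

Calling `expected_cost()` with different placeholder boundaries gives a rough way to compare threshold policies by simulation. The value function described in the abstract would correspond to the smallest expected cost over all admissible controls, which this sketch does not attempt to compute.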
