
Nonlinear dynamic optimization theory on stochastic model and its application to mathematical finance


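Basic information

Grant number:
17540121
Principal investigator:
OHTSUBO Yoshio
Amount:
$23,700
Host institution:
Kochi University
Host institution country:
Japan
Project category:
Grant-in-Aid for Scientific Research (C)
Fiscal year:
2005
Funding country:
Japan
Project period:
2005 to 2007
Project status:
Completed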

Project summary

The research results are summarized as follows.

1. We consider multistage decision processes whose criterion function is the expectation of a minimum function, and formulate them as Markov decision processes with imbedded parameters. The policy depends on a history that includes past imbedded parameters, and the reward at each stage is random and depends on the current state, the current action and the next state. We give an optimality equation by using operators and show that there exists a right-continuous deterministic Markov policy which depends on the current state and an imbedded parameter.

2. We consider Markov decision processes with a target set, where the criterion function is the expectation of a minimum function. We formulate the problem as an infinite-horizon case with a recurrent class. We show under some conditions that the optimal value function is the unique solution to an optimality equation and that there exists a stationary optimal policy. We also give a policy improvement method.

3. We consider a stochastic shortest path problem with associative criteria, in which for each node of a graph we choose a probability distribution over the set of successor nodes so as to reach a given target node optimally. We formulate such a problem as an associative Markov decision process. We show that the optimal value function is the unique solution to an optimality equation and find an optimal stationary policy. We also give a value iteration method and a policy improvement method.

4. We consider utility-constrained Markov decision processes. The expected utility of the total discounted reward is maximized subject to multiple expected utility constraints. By introducing a corresponding Lagrange function, a saddle-point theorem for the utility-constrained optimization is derived. The existence of a constrained optimal policy is characterized by optimal action sets specified with a parametric utility.

5. We consider an inequality condition in which one side is greater than or equal to a multiple of the other side, and equality holds if and only if one variable is a multiple of the other. We show a cross-duality between four pairs of Golden inequalities for one-variable functions.
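
As a rough illustration of the first result, the sketch below maximizes the expected minimum of the stage rewards by carrying the running minimum along as an extra parameter of the state. It is not the project's method: it assumes a finite horizon, finite state and action sets, and rewards that are fixed once the current state, action and next state are known, whereas the summary allows random rewards. The toy model `P`, the horizon, and the names `value`, `q_value` and `best_action` are all hypothetical.

```python
from functools import lru_cache

# Hypothetical toy model (an assumption, not data from the project).
# P[s][a] is a list of (next_state, probability, reward) triples.
P = {
    0: {0: [(0, 0.5, 4.0), (1, 0.5, 1.0)],
        1: [(1, 1.0, 2.0)]},
    1: {0: [(0, 1.0, 3.0)],
        1: [(1, 1.0, 2.0)]},
}
HORIZON = 3          # number of decision stages (assumed)
INF = float("inf")   # running minimum before any reward has been received


def q_value(t, s, m, a):
    """Expected final minimum when action a is taken at stage t in state s
    with running minimum m, and the best actions are used afterwards."""
    return sum(p * value(t + 1, s2, min(m, r)) for s2, p, r in P[s][a])


@lru_cache(maxsize=None)
def value(t, s, m):
    """Optimal expected final minimum from stage t, state s, running minimum m."""
    if t == HORIZON:
        return m  # no stages left: the criterion is the minimum seen so far
    return max(q_value(t, s, m, a) for a in P[s])


def best_action(t, s, m):
    """A maximizing action; it depends on the state and on the parameter m."""
    return max(P[s], key=lambda a: q_value(t, s, m, a))


if __name__ == "__main__":
    print(value(0, 0, INF), best_action(0, 0, INF))
```

In this toy model the preferred action in state 0 at the last stage changes with the running minimum: `best_action(2, 0, 4.0)` picks action 0, while `best_action(2, 0, 2.0)` picks action 1. That is the sense in which a policy can depend on both the current state and the carried parameter.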
