Novel Algorithms to Approximate the Future Consequence of Sequential Decisions
近似连续决策的未来后果的新算法
基本信息
- 批准号:RGPIN-2017-04877
- 负责人:
- 金额:$ 1.46万
- 依托单位:
- 依托单位国家:加拿大
- 项目类别:Discovery Grants Program - Individual
- 财政年份:2020
- 资助国家:加拿大
- 起止时间:2020-01-01 至 2021-12-31
- 项目状态:已结题
- 来源:
- 关键词:
项目摘要
Many complex problems arising in business, health care, and transportation can be modelled as sequential decision making problems under uncertainty, meaning that a decision maker has to make decisions periodically while some random events unfold over time. For instance, an airline dynamically changes the fare for different flights over a network of cities without knowing the actual future demand, trying to maximize its revenue while managing the risk of unsold seats. These problems can be conveniently modelled in the form of dynamic programs, a method that finds the best decision by maximizing the sum of immediate reward and the expected future reward. Unfortunately, for many practical problems, the number of future scenarios that one should consider in order to calculate the expected future reward function is exponentially large, making exact calculation of this function intractable. In order to overcome this issue, approximate dynamic programming (ADP) methods have been developed to find an approximate optimal solution.
A cornerstone of many ADP algorithms is defining a set of basis functions (or an approximation architecture) for approximating the future consequence of present decisions (the expected future reward function). Currently, the choice of basis functions requires prior expert knowledge about the problem, and is usually considered as more of an art than a science. My research program aims to develop, study, and apply novel algorithms that automate generation of basis functions by efficiently selecting a subset of functions from a large pool of potential basis functions, and updating this set as more information becomes known about the problem. The benefit of such algorithms is twofold: first, it reduces the burden to come up with a well-informed set of basis functions that requires significant prior knowledge about the problem; second, since many potential candidates are considered for basis functions, it is expected that the quality of the approximation is improved. My short-term objective includes evaluating the performance of the proposed algorithms in a variety of application areas, such as perishable inventory management, patient scheduling, and revenue management.
ADP is a general method that is commonly used for solving many different problems in a variety of applications. As the quality of the policies generated by these algorithms is dependent on the quality of the basis functions chosen, it would be of great interest, both theoretically and practically, if the process of generating and selecting basis functions can be automated. Therefore, even a small improvement achieved by the findings of my research would have significant practical implications in multiple application areas.
Many complex problems arising in business, health care, and transportation can be modelled as sequential decision making problems under uncertainty, meaning that a decision maker has to make decisions periodically while some random events unfold over time. For instance, an airline dynamically changes the fare for different flights over a network of cities without knowing the actual future demand, trying to maximize its revenue while managing the risk of unsold seats. These problems can be conveniently modelled in the form of dynamic programs, a method that finds the best decision by maximizing the sum of immediate reward and the expected future reward. Unfortunately, for many practical problems, the number of future scenarios that one should consider in order to calculate the expected future reward function is exponentially large, making exact calculation of this function intractable. In order to overcome this issue, approximate dynamic programming (ADP) methods have been developed to find an approximate optimal solution.
A cornerstone of many ADP algorithms is defining a set of basis functions (or an approximation architecture) for approximating the future consequence of present decisions (the expected future reward function). Currently, the choice of basis functions requires prior expert knowledge about the problem, and is usually considered as more of an art than a science. My research program aims to develop, study, and apply novel algorithms that automate generation of basis functions by efficiently selecting a subset of functions from a large pool of potential basis functions, and updating this set as more information becomes known about the problem. The benefit of such algorithms is twofold: first, it reduces the burden to come up with a well-informed set of basis functions that requires significant prior knowledge about the problem; second, since many potential candidates are considered for basis functions, it is expected that the quality of the approximation is improved. My short-term objective includes evaluating the performance of the proposed algorithms in a variety of application areas, such as perishable inventory management, patient scheduling, and revenue management.
ADP is a general method that is commonly used for solving many different problems in a variety of applications. As the quality of the policies generated by these algorithms is dependent on the quality of the basis functions chosen, it would be of great interest, both theoretically and practically, if the process of generating and selecting basis functions can be automated. Therefore, even a small improvement achieved by the findings of my research would have significant practical implications in multiple application areas.
项目成果
期刊论文数量(0)
专著数量(0)
科研奖励数量(0)
会议论文数量(0)
专利数量(0)
数据更新时间:{{ journalArticles.updateTime }}
{{
item.title }}
{{ item.translation_title }}
- DOI:
{{ item.doi }} - 发表时间:
{{ item.publish_year }} - 期刊:
- 影响因子:{{ item.factor }}
- 作者:
{{ item.authors }} - 通讯作者:
{{ item.author }}
数据更新时间:{{ journalArticles.updateTime }}
{{ item.title }}
- 作者:
{{ item.author }}
数据更新时间:{{ monograph.updateTime }}
{{ item.title }}
- 作者:
{{ item.author }}
数据更新时间:{{ sciAawards.updateTime }}
{{ item.title }}
- 作者:
{{ item.author }}
数据更新时间:{{ conferencePapers.updateTime }}
{{ item.title }}
- 作者:
{{ item.author }}
数据更新时间:{{ patent.updateTime }}
SabouriBaghAbbas, Alireza其他文献
SabouriBaghAbbas, Alireza的其他文献
{{
item.title }}
{{ item.translation_title }}
- DOI:
{{ item.doi }} - 发表时间:
{{ item.publish_year }} - 期刊:
- 影响因子:{{ item.factor }}
- 作者:
{{ item.authors }} - 通讯作者:
{{ item.author }}
{{ truncateString('SabouriBaghAbbas, Alireza', 18)}}的其他基金
Novel Algorithms to Approximate the Future Consequence of Sequential Decisions
近似连续决策的未来后果的新算法
- 批准号:
RGPIN-2017-04877 - 财政年份:2022
- 资助金额:
$ 1.46万 - 项目类别:
Discovery Grants Program - Individual
Novel Algorithms to Approximate the Future Consequence of Sequential Decisions
近似连续决策的未来后果的新算法
- 批准号:
RGPIN-2017-04877 - 财政年份:2021
- 资助金额:
$ 1.46万 - 项目类别:
Discovery Grants Program - Individual
Novel Algorithms to Approximate the Future Consequence of Sequential Decisions
近似连续决策的未来后果的新算法
- 批准号:
RGPIN-2017-04877 - 财政年份:2019
- 资助金额:
$ 1.46万 - 项目类别:
Discovery Grants Program - Individual
Novel Algorithms to Approximate the Future Consequence of Sequential Decisions
近似连续决策的未来后果的新算法
- 批准号:
RGPIN-2017-04877 - 财政年份:2018
- 资助金额:
$ 1.46万 - 项目类别:
Discovery Grants Program - Individual
Novel Algorithms to Approximate the Future Consequence of Sequential Decisions
近似连续决策的未来后果的新算法
- 批准号:
RGPIN-2017-04877 - 财政年份:2017
- 资助金额:
$ 1.46万 - 项目类别:
Discovery Grants Program - Individual
相似海外基金
Novel Algorithms to Approximate the Future Consequence of Sequential Decisions
近似连续决策的未来后果的新算法
- 批准号:
RGPIN-2017-04877 - 财政年份:2022
- 资助金额:
$ 1.46万 - 项目类别:
Discovery Grants Program - Individual
A study on practical algorithms for combinatorial optimization based on approximate submodularity
基于近似子模性的组合优化实用算法研究
- 批准号:
22K17857 - 财政年份:2022
- 资助金额:
$ 1.46万 - 项目类别:
Grant-in-Aid for Early-Career Scientists
Novel Algorithms to Approximate the Future Consequence of Sequential Decisions
近似连续决策的未来后果的新算法
- 批准号:
RGPIN-2017-04877 - 财政年份:2021
- 资助金额:
$ 1.46万 - 项目类别:
Discovery Grants Program - Individual
CAREER: Approximate Scheduling Algorithms via Mathematical Relaxations
职业:通过数学松弛的近似调度算法
- 批准号:
1844890 - 财政年份:2019
- 资助金额:
$ 1.46万 - 项目类别:
Continuing Grant
Novel Algorithms to Approximate the Future Consequence of Sequential Decisions
近似连续决策的未来后果的新算法
- 批准号:
RGPIN-2017-04877 - 财政年份:2019
- 资助金额:
$ 1.46万 - 项目类别:
Discovery Grants Program - Individual
CRII: CIF: Approximate Message Passing Algorithms for High-Dimensional Estimation
CRII:CIF:高维估计的近似消息传递算法
- 批准号:
1849883 - 财政年份:2019
- 资助金额:
$ 1.46万 - 项目类别:
Standard Grant
Approximate Optimization Algorithms with Theoretical Rationales in Markov Decision Processes
马尔可夫决策过程中具有理论依据的近似优化算法
- 批准号:
19K04904 - 财政年份:2019
- 资助金额:
$ 1.46万 - 项目类别:
Grant-in-Aid for Scientific Research (C)
Fast approximate inference methods: new algorithms, applications and theory
快速近似推理方法:新算法、应用和理论
- 批准号:
DP180100597 - 财政年份:2018
- 资助金额:
$ 1.46万 - 项目类别:
Discovery Projects
Novel Algorithms to Approximate the Future Consequence of Sequential Decisions
近似连续决策的未来后果的新算法
- 批准号:
RGPIN-2017-04877 - 财政年份:2018
- 资助金额:
$ 1.46万 - 项目类别:
Discovery Grants Program - Individual
III: Small: Algorithms and Theoretical Foundations for Approximate Bayesian Inference in Machine Learning
III:小:机器学习中近似贝叶斯推理的算法和理论基础
- 批准号:
1906694 - 财政年份:2018
- 资助金额:
$ 1.46万 - 项目类别:
Continuing Grant