ITR: Risk, Reward, and Reinforcement
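Basic information
- Award number: 0342634
- Principal investigator:
- Amount: --
- Host institution:
- Host institution country: United States
- Project category: Standard Grant
- Fiscal year: 2003
- Funding country: United States
- Project period: 2003-08-01 to 2008-07-31
- Project status: Completed
- Source:
- Keywords:
Project abstract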
The objectives of this project are to develop efficient and reliable algorithms for direct reinforcement, to learn risk-averse behaviors for problems with high degrees of uncertainty, and to apply the methods developed to an economically important problem: global asset allocation. Reinforcement learning (RL) enables a goal-directed agent to discover strategies through trial-and-error exploration with only limited feedback. Direct Reinforcement (DR, or "policy gradient") methods enable an agent to discover a strategy without the need to learn a value function.
Dynamic programming and related value function RL methods are often found to be inefficient, to produce unstable solutions, and to have difficulty scaling up to large problems. Hence, there have been relatively few real-world applications of value function RL. This project seeks to make several advancements in Direct Reinforcement that will enable the development of efficient and effective practical applications.
By controlling the "exploration vs. exploitation" trade-off during on-line learning, DR agents will be able to discover better policies and do so more efficiently. Stochastic optimization methods, such as stochastic "search then converge" or annealing of a Boltzmann temperature, are candidate approaches. By developing risk-averse reinforcement methods, DR agents will be able to learn robust policies for uncertain or risky environments. Using risk-sensitive intertemporal utilities, DR agents will learn to avoid risky states or actions while they pursue long-term reward. Dynamic programming is widely used in economics and finance, but few attempts have been made to solve important financial problems with reinforcement learning. As a demonstration of risk-averse DR, this project will build a prototype global asset allocation system.
Risk-averse direct reinforcement may find application in a variety of engineering domains, from robotics to industrial control to autonomous agents. Many industries, such as energy and the airlines, need to manage operational and financial risks together, in order to avoid supply shortfalls or bankruptcy. Individual investors must manage risk while building their investment portfolios to meet future needs, such as children's college expenses or retirement. Risk-averse Direct Reinforcement may find application in many such contexts.
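Illustrative sketch (not part of the award record, and not the project's actual method): the abstract's ingredients can be combined in a toy example, assuming a two-armed bandit with made-up payoffs, a Boltzmann (softmax) policy whose temperature is annealed with a hypothetical "search then converge"-shaped schedule, and an exponential utility standing in for a risk-sensitive objective. All function names, payoffs, and constants below are assumptions chosen for illustration.

```python
import math
import random


def boltzmann_probs(prefs, temperature):
    """Softmax over action preferences at a given Boltzmann temperature."""
    scaled = [p / temperature for p in prefs]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def annealed_temperature(step, t0=1.0, tau=500.0, t_min=0.2):
    """Assumed schedule: close to t0 while step << tau, then decaying
    roughly like 1/step, floored at t_min."""
    return max(t_min, t0 / (1.0 + step / tau))


def utility(reward, risk_aversion):
    """Concave exponential utility; the raw reward when risk_aversion == 0."""
    if risk_aversion == 0:
        return reward
    return (1.0 - math.exp(-risk_aversion * reward)) / risk_aversion


def sample_reward(action, rng):
    """Toy bandit with made-up payoffs: arm 0 is safe, arm 1 is risky."""
    if action == 0:
        return 1.0  # safe arm: always pays 1.0
    # risky arm: mean payoff 1.2, but with a large spread
    return 3.0 if rng.random() < 0.5 else -0.6


def train(risk_aversion=1.0, steps=3000, lr=0.05, seed=0):
    """Policy-gradient (REINFORCE) updates with a running-average baseline.
    Returns the final probability of choosing the safe arm."""
    rng = random.Random(seed)
    prefs = [0.0, 0.0]
    baseline = 0.0
    for step in range(steps):
        temp = annealed_temperature(step)
        probs = boltzmann_probs(prefs, temp)
        action = 0 if rng.random() < probs[0] else 1
        u = utility(sample_reward(action, rng), risk_aversion)
        # d log pi(action) / d prefs[a] = (1[a == action] - probs[a]) / temp
        for a in range(2):
            indicator = 1.0 if a == action else 0.0
            prefs[a] += lr * (u - baseline) * (indicator - probs[a]) / temp
        baseline += 0.05 * (u - baseline)  # slow-moving average of utility
    return boltzmann_probs(prefs, annealed_temperature(steps))[0]


if __name__ == "__main__":
    print("P(safe arm), risk-averse agent:", round(train(risk_aversion=1.0), 3))
```

In this toy setting the risky arm has the higher mean payoff, but its expected exponential utility at risk_aversion = 1.0 is lower than that of the safe arm, so the utility-maximizing agent is expected to settle on the safe arm. No value function is learned; the policy parameters are adjusted directly from sampled utilities.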
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by John Moody
Illuminating wildfire erosion and deposition patterns with repeat terrestrial lidar
- DOI:
- Published: 2016
- Journal:
- Impact factor: 0
- Authors: F. Rengers; Gregory E. Tucker; John Moody; B. Ebel
- Corresponding author: B. Ebel
Gamer 2.0: Software Toolkit for Adaptive Mesh Generation from Structural Biological Datasets
- DOI: 10.1016/j.bpj.2017.11.1921
- Published: 2018-02-02
- Journal:
- Impact factor:
- Authors: Christopher T. Lee; John Moody; Michael J. Holst; J. Andrew McCammon; Rommie E. Amaro
- Corresponding author: Rommie E. Amaro
The mitochondrial permeability transition pore components
- DOI:
- Published: 2010
- Journal:
- Impact factor: 0
- Authors: John Moody; Drake Circus
- Corresponding author: Drake Circus
Effects of temperature on rate constants of inhibition of organophosphorus and carbamate compounds
- DOI: 10.1016/j.tox.2011.09.009
- Published: 2011-12-18
- Journal:
- Impact factor:
- Authors: Kasim Abass Askar; Caleb Kudi; John Moody
- Corresponding author: John Moody
Histochemical localization of acetylcholinesterase
- DOI: 10.1016/j.tox.2011.09.010
- Published: 2011-12-18
- Journal:
- Impact factor:
- Authors: Kasim Abass Askar; Caleb Kudi; John Moody
- Corresponding author: John Moody
Other grants by John Moody
Variance Reduction Techniques for the Identification of Noisy Systems
- Award number: 9626406
- Fiscal year: 1997
- Funding amount: --
- Project category: Continuing grant
CISE Postdoctoral Program: Robust Forecasting with Neural Networks
- Award number: 9503968
- Fiscal year: 1995
- Funding amount: --
- Project category: Standard Grant
Neural Networks for Time Series Prediction
- Award number: 9309728
- Fiscal year: 1993
- Funding amount: --
- Project category: Standard Grant
Strategies for Better System Identification
- Award number: 9396074
- Fiscal year: 1992
- Funding amount: --
- Project category: Standard Grant
Strategies for Better System Identification
- Award number: 9114333
- Fiscal year: 1991
- Funding amount: --
- Project category: Standard Grant
Mathematical Sciences: The Induction Exponent e of an Infinite Discrete Group
- Award number: 8704085
- Fiscal year: 1987
- Funding amount: --
- Project category: Standard Grant
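Similar grants from the National Natural Science Foundation of China (NSFC)
The Heterogenous Impact of Monetary Policy on Firms' Risk and Fundamentals
- Award number:
- Year approved: 2024
- Funding amount:
- Project category: Research Fund for International Scientists
A randomized controlled field trial of a mobile-health intervention in populations at high risk of atherosclerotic cardiovascular disease: The ASCVD Risk Intervention Trial
- Award number: 81973152
- Year approved: 2019
- Funding amount: CNY 540,000
- Project category: General Program
Value-at-Risk forecasting models based on quantile dependence between time series
- Award number: 71903144
- Year approved: 2019
- Funding amount: CNY 170,000
- Project category: Young Scientists Fund
Role of the RISK pathway in gastrin-mediated protection against cardiac ischemia-reperfusion injury
- Award number: 81800239
- Year approved: 2018
- Funding amount: CNY 210,000
- Project category: Young Scientists Fund
Mechanism by which isoflurane regulates NLRP3 inflammasome formation after diabetic ischemic stroke via TLR4/RISK/NF-κB
- Award number: 81771232
- Year approved: 2017
- Funding amount: CNY 540,000
- Project category: General Program
Role and mechanism of the integration of Notch1 with the RISK/SAFE/HIF-1α signaling pathways in I-postC protection
- Award number: 81260024
- Year approved: 2012
- Funding amount: CNY 500,000
- Project category: Regional Science Fund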
Similar overseas grants
Elucidating the contribution of risk-perception and reward-value during optimal decision-making
- Award number: 24K18605
- Fiscal year: 2024
- Funding amount: --
- Project category: Grant-in-Aid for Early-Career Scientists
Reward Valuation and Suicidal Behavior in High-Risk Adolescents
- Award number: 10655103
- Fiscal year: 2023
- Funding amount: --
- Project category:
Reward Responsiveness as a Prevention Target in Youth At Risk for Anhedonia
- Award number: 10722481
- Fiscal year: 2023
- Funding amount: --
- Project category:
Age-Related Patterns in Intuition about Risk and Reward
- Award number: RGPIN-2017-03906
- Fiscal year: 2022
- Funding amount: --
- Project category: Discovery Grants Program - Individual
Prenatal Environmental Mixtures, Cognitive Control and Reward Processes, And Risk for Psychiatric Problems in Adolescence.
- Award number: 10303872
- Fiscal year: 2021
- Funding amount: --
- Project category:
Locomotor Activation and Mania Spectrum Risk: Circadian and Reward Mechanisms
- Award number: 10642785
- Fiscal year: 2021
- Funding amount: --
- Project category:
Behavioral and neural mechanisms of reward responsivity across normative and at-risk adolescent development
- Award number: 10705724
- Fiscal year: 2021
- Funding amount: --
- Project category:
Social Reward Processing Across the Lifespan: Identifying Risk Factors for Financial Exploitation
- Award number: 10213369
- Fiscal year: 2021
- Funding amount: --
- Project category:
Age-Related Patterns in Intuition about Risk and Reward
- Award number: RGPIN-2017-03906
- Fiscal year: 2021
- Funding amount: --
- Project category: Discovery Grants Program - Individual
Behavioral and neural mechanisms of reward responsivity across normative and at-risk adolescent development
- Award number: 10387432
- Fiscal year: 2021
- Funding amount: --
- Project category: