Collaborative Research: SLES: Safe Distributional-Reinforcement Learning-Enabled Systems: Theories, Algorithms, and Experiments

Project Abstract

Reinforcement learning (RL), with its success in automation and robotics, has been widely viewed as one of the most important technologies for next-generation, learning-enabled systems. For example, 6G networking systems, autonomous driving, digital healthcare, and smart cities are all enabled by RL. However, despite the significant advances over the last few decades, a major obstacle to applying RL in practice is the lack of "safety" guarantees such as robustness, resilience to tail risks, and satisfaction of operational constraints. This is because traditional RL only aims at maximizing cumulative reward. While it is possible to add penalties to rewards in a traditional RL algorithm to discourage unsafe actions, many safety constraints, such as chance constraints, cannot be simply treated as penalties. This project develops foundational technologies for safe RL-enabled systems based on Distributional Reinforcement Learning (DRL), which learns the optimal policy. While developing the foundation of DRL for safe learning-enabled systems, the team integrates research and education by incorporating the new theories and algorithms developed in this project into its graduate-level courses. All team members have been regularly supervising undergraduate students and students from underrepresented groups. The team continues to leverage Women's Place at Ohio State University and the Women in Science and Engineering Program at Arizona State University to enhance the broader participation of women students and researchers.
This project focuses on a comprehensive approach for the end-to-end safety of DRL-enabled systems. The end-to-end safety includes (i) policy safety: learn a safe policy to avoid the occurrence of catastrophic outcomes (corresponds to risk-sensitive RL); (ii) exploration safety: learn a safe policy safely by avoiding dangerous actions during exploration/learning (corresponds to online RL); and (iii) environmental safety: learn a policy that is robust to parametric uncertainty (environment change). This project includes four thrusts. Thrust 1 (Foundation of constrained DRL) aims to establish theoretical foundations of risk-sensitive constrained DRL and focuses on policy and environmental safety. Thrust 2 (Online constrained DRL) considers safe online learning and decision-making and focuses on exploration safety and environmental safety when learning a safe DRL policy. Thrust 3 (Physics-Enhanced constrained DRL) exploits physics to enhance end-to-end safety. These three thrusts on foundational research are interdependent, but each focuses on a unique aspect of safe RL-enabled systems and addresses multiple safety notions. The fourth thrust will provide comprehensive validation with both high-fidelity simulations and real-world experiments using unmanned aerial vehicles. This research is supported by a partnership between the National Science Foundation and Open Philanthropy. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
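The abstract's point that expected cumulative reward alone cannot express tail-risk or chance constraints can be illustrated with a small sketch. This is not the project's method: the two toy policies, their return models, the 5% tail level, and the zero safety threshold are all hypothetical values chosen for illustration. Both policies have the same expected return, yet they differ sharply in conditional value-at-risk (CVaR) and in the probability of falling below the threshold, which is the kind of information a return distribution carries and a mean does not.

```python
import random


def sample_returns(policy, n, rng):
    """Draw n episode returns from a hypothetical toy model of each policy."""
    if policy == "steady":
        # Assumed model: returns concentrated around 10 with small noise.
        return [rng.gauss(10.0, 1.0) for _ in range(n)]
    if policy == "risky":
        # Assumed model: return 11.5 most of the time, but a rare crash
        # (probability 0.05) returns -18.5, so the expected return is also 10.
        return [-18.5 if rng.random() < 0.05 else 11.5 for _ in range(n)]
    raise ValueError(f"unknown policy: {policy}")


def mean(xs):
    """Empirical expected return, the quantity traditional RL maximizes."""
    return sum(xs) / len(xs)


def empirical_cvar(returns, alpha):
    """Average of the worst alpha-fraction of returns (a tail-risk measure)."""
    ordered = sorted(returns)
    k = max(1, int(alpha * len(ordered)))
    return sum(ordered[:k]) / k


def violation_probability(returns, threshold):
    """Empirical probability that the return falls below a safety threshold."""
    return sum(r < threshold for r in returns) / len(returns)


if __name__ == "__main__":
    rng = random.Random(0)
    for policy in ("steady", "risky"):
        g = sample_returns(policy, 20_000, rng)
        print(
            policy,
            "mean:", round(mean(g), 2),
            "CVaR(5%):", round(empirical_cvar(g, 0.05), 2),
            "P(return < 0):", violation_probability(g, 0.0),
        )
```

Under these assumed models, a chance constraint of the form "the return falls below 0 with probability at most 1%" would accept the steady policy and reject the risky one, even though ranking them by expected return alone cannot separate them.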

Project Outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other publications by Lei Ying

Data fusion based EKF-UI for real-time simultaneous identification of structural systems and unknown external inputs
  • DOI:
    10.1016/j.measurement.2016.02.002
  • Published:
    2016-06
  • Impact factor:
    5.6
  • Authors:
    Liu Lijun;Su Ying;Zhu Jiajia;Lei Ying
  • Corresponding author:
    Lei Ying
Sodium arsenite augments sensitivity of Echinococcus granulosus protoscoleces to albendazole.
  • DOI:
    10.1016/j.exppara.2019.02.008
  • Published:
    2019-05
  • Impact factor:
    2.1
  • Authors:
    Xing Guoqiang;Zhang Hui;Liu Chunli;Guo Zhengyi;Yang Xiaoli;Wang Zhuo;Wang Bo;Lei Ying;Yang Rentan;Jian Yufeng;Lv Hailong
  • Corresponding author:
    Lv Hailong
Erythromycin relaxes BALB/c mouse airway smooth muscle
  • DOI:
    10.1016/j.lfs.2019.02.009
  • Published:
    2019-03
  • Impact factor:
    6.1
  • Authors:
    Cai Yan;Lei Ying;Chen Jingguo;Cao Lei;Yang Xudong;Zhang Kanghuai;Cao Yongxiao
  • Corresponding author:
    Cao Yongxiao
Hybrid density functional studies of C-anion-doped anatase TiO2
  • DOI:
    10.1016/j.cplett.2016.02.047
  • Published:
    2016
  • Impact factor:
    2.8
  • Authors:
    Shi Jianhao;Li Xuechao;Wan Rundong;Leng Chongyan;Lei Ying
  • Corresponding author:
    Lei Ying
Approaching Throughput Optimality With Limited Feedback in Multichannel Wireless Downlink Networks

Other grants by Lei Ying

Collaborative Research: III: Small: Reconstruction of Diffusion History in Cyber and Human Networks with Applications in Epidemiology and Cybersecurity
  • Award number:
    2324769
  • Fiscal year:
    2023
  • Award amount:
    $375,000
  • Award type:
    Standard Grant
Collaborative Research: CIF: Small: Nonasymptotic Analysis for Stochastic Networks and Systems: Foundations and Applications
  • Award number:
    2207548
  • Fiscal year:
    2022
  • Award amount:
    $375,000
  • Award type:
    Standard Grant
Collaborative Research: Towards a Theoretic Foundation for Optimal Deep Graph Learning
  • Award number:
    2134081
  • Fiscal year:
    2022
  • Award amount:
    $375,000
  • Award type:
    Continuing Grant
NeTS: Small: Collaborative Research: Towards Adaptive and Efficient Wireless Computing Networks
  • Award number:
    2002608
  • Fiscal year:
    2019
  • Award amount:
    $375,000
  • Award type:
    Standard Grant
III: Small: Towards a Theoretical Foundation for Diffusion Source Localization
  • Award number:
    2003924
  • Fiscal year:
    2019
  • Award amount:
    $375,000
  • Award type:
    Standard Grant
SpecEES: Collaborative Research: Leveraging Randomization and Human Behavior for Efficient Large-Scale Distributed Spectrum Access
  • Award number:
    2001687
  • Fiscal year:
    2019
  • Award amount:
    $375,000
  • Award type:
    Standard Grant
SpecEES: Collaborative Research: Leveraging Randomization and Human Behavior for Efficient Large-Scale Distributed Spectrum Access
  • Award number:
    1824393
  • Fiscal year:
    2018
  • Award amount:
    $375,000
  • Award type:
    Standard Grant
NeTS: Small: Collaborative Research: Towards Adaptive and Efficient Wireless Computing Networks
  • Award number:
    1813392
  • Fiscal year:
    2018
  • Award amount:
    $375,000
  • Award type:
    Standard Grant
III: Small: Towards a Theoretical Foundation for Diffusion Source Localization
  • Award number:
    1715385
  • Fiscal year:
    2017
  • Award amount:
    $375,000
  • Award type:
    Standard Grant
Collaborative Research: Resource Allocation for Time-Critical Communications in Wireless Networks
  • Award number:
    1609202
  • Fiscal year:
    2016
  • Award amount:
    $375,000
  • Award type:
    Standard Grant

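Similar NSFC (National Natural Science Foundation of China) grants

Research on Quantum Field Theory without a Lagrangian Description
  • Award number:
    24ZR1403900
  • Approval year:
    2024
  • Award amount:
    CNY 0
  • Award type:
    Provincial/municipal project
Cell Research
  • Award number:
    31224802
  • Approval year:
    2012
  • Award amount:
    CNY 240,000
  • Award type:
    Special fund project
Cell Research
  • Award number:
    31024804
  • Approval year:
    2010
  • Award amount:
    CNY 240,000
  • Award type:
    Special fund project
Cell Research
  • Award number:
    30824808
  • Approval year:
    2008
  • Award amount:
    CNY 240,000
  • Award type:
    Special fund project
Research on the Rapid Growth Mechanism of KDP Crystal
  • Award number:
    10774081
  • Approval year:
    2007
  • Award amount:
    CNY 450,000
  • Award type:
    General Program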

Similar overseas grants

Collaborative Research: SLES: Guaranteed Tubes for Safe Learning across Autonomy Architectures
  • Award number:
    2331878
  • Fiscal year:
    2024
  • Award amount:
    $375,000
  • Award type:
    Standard Grant
Collaborative Research: SLES: Guaranteed Tubes for Safe Learning across Autonomy Architectures
  • Award number:
    2331879
  • Fiscal year:
    2024
  • Award amount:
    $375,000
  • Award type:
    Standard Grant
Collaborative Research: SLES: Safe Distributional-Reinforcement Learning-Enabled Systems: Theories, Algorithms, and Experiments
  • Award number:
    2331781
  • Fiscal year:
    2023
  • Award amount:
    $375,000
  • Award type:
    Standard Grant
Collaborative Research: SLES: Foundations of Qualitative and Quantitative Safety Assessment of Learning-enabled Systems
  • Award number:
    2331938
  • Fiscal year:
    2023
  • Award amount:
    $375,000
  • Award type:
    Standard Grant
Collaborative Research: SLES: Bridging offline design and online adaptation in safe learning-enabled systems
  • Award number:
    2331880
  • Fiscal year:
    2023
  • Award amount:
    $375,000
  • Award type:
    Standard Grant
Collaborative Research: SLES: Foundations of Qualitative and Quantitative Safety Assessment of Learning-enabled Systems
  • Award number:
    2331937
  • Fiscal year:
    2023
  • Award amount:
    $375,000
  • Award type:
    Standard Grant
Collaborative Research: SLES: Safety under Distributional Shift in Learning-Enabled Power Systems
  • Award number:
    2331776
  • Fiscal year:
    2023
  • Award amount:
    $375,000
  • Award type:
    Standard Grant
Collaborative Research: SLES: Verifying and Enforcing Safety Constraints in AI-based Sequential Generation
  • Award number:
    2331967
  • Fiscal year:
    2023
  • Award amount:
    $375,000
  • Award type:
    Standard Grant
Collaborative Research: SLES: Bridging offline design and online adaptation in safe learning-enabled systems
  • Award number:
    2331881
  • Fiscal year:
    2023
  • Award amount:
    $375,000
  • Award type:
    Standard Grant
Collaborative Research: SLES: Verifying and Enforcing Safety Constraints in AI-based Sequential Generation
  • Award number:
    2331966
  • Fiscal year:
    2023
  • Award amount:
    $375,000
  • Award type:
    Standard Grant