Application of Reinforcement Learning to the Flight Control of Unmanned Aerial Vehicles
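Basic information
- Grant number: 2104294
- Principal investigator:
- Amount: --
- Host institution:
- Host institution country: United Kingdom
- Project type: Studentship
- Fiscal year: 2018
- Funding country: United Kingdom
- Duration: 2018 to (no data)
- Project status: Completed
- Source:
- Keywords:
Project abstract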
Project description: Complex urban environments pose a significant challenge for the operation of Unmanned Autonomous Systems (UAS). To operate in such areas, vehicles must be able to change direction rapidly, avoid obstacles, and land in confined areas. This is especially challenging for a fixed-wing platform because of the minimum airspeed needed to prevent aircraft stall. Fixed-wing platforms nevertheless offer a number of advantages over rotary-wing vehicles, such as increased flight endurance and range, and greater payload capacity. As such, there is significant research into improving the agility of fixed-wing platforms so that they can operate in complex environments.
This research proposal aims to build upon previous research projects conducted by the University of Bristol Flight Lab [1] [2]. These projects used a variable-sweep, fixed-wing platform to perform a bio-inspired perched landing manoeuvre. This agile landing manoeuvre, which takes advantage of dynamic stall, enabled the UAS to land safely on a small landing site with minimal aircraft velocity, without the need for a long landing strip or arresting equipment. Such a manoeuvre is particularly applicable to challenging operational environments, such as a complex urban setting or the deck of a ship. Non-linear control strategies were evaluated to generate the necessary perching manoeuvres. The reinforcement learning process, using a Deep Q-Network (DQN), generated the trajectories with the lowest cost and was able to generate trajectories from a range of starting conditions.
Research in the first year aimed to modernise the perching UAV learning process by integrating and evaluating state-of-the-art reinforcement learning algorithms and frameworks. Compared to the DQN algorithm used previously, modern algorithms such as Proximal Policy Optimisation (PPO) demonstrate the ability to attain higher rewards, as well as improved stability and convergence during the learning process [3]. This research has also explored the use of continuous control outputs to increase the granularity of actuator control available to the learning agent. The next stage of this research is transitioning to real-world flight testing of the perching manoeuvre using these improvements to the process. This project has also transitioned to using state-of-the-art frameworks, such as OpenAI's Gym toolkit, to modernise and modularise the learning architecture. This lays the foundation for simpler, faster implementation of alternative algorithms and scenarios moving forward.
This research project will aim to build on previous projects of the research group and to incorporate state-of-the-art algorithms and techniques, in order to develop reinforcement learning-based flight controllers that can perform a number of agile flight manoeuvres. The current flight dynamics model of a model UAV will be improved and expanded, both to improve accuracy when performing agile manoeuvres and to incorporate the lateral degrees of freedom into the current longitudinal-only model. Methods to improve the accuracy of the trained model will be evaluated and implemented, such as incorporating flight data into the offline, simulated learning process and conducting online learning on the real-world vehicle. A number of agile flight manoeuvres applicable to operating in complex environments will be selected, tested and evaluated. Examples of candidate manoeuvres include rapid changes of direction and minimum-distance 180° turns, such that the vehicle can avoid obstacles and navigate cluttered environments. A key focus of this research will be generating trained controllers and the necessary software frameworks such that they can be tested and used on real-world platforms.
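To make the abstract's description of a modular, Gym-style learning architecture with continuous control outputs more concrete, the following is a minimal sketch only. It is not the project's simulator. The class name, the point-mass longitudinal dynamics, the flat-plate-like lift and drag curves, every numeric constant, and the terminal reward are illustrative assumptions. It mirrors the `reset`/`step` pattern used by Gym-style toolkits without importing Gym, and it treats the single continuous action as a direct angle-of-attack command, whereas a real vehicle would be driven through actuators such as the elevator and wing sweep.

```python
import math


class LongitudinalPerchEnv:
    """Toy longitudinal-only point-mass model (hypothetical, for illustration)."""

    G = 9.81                         # gravity, m/s^2
    DT = 0.02                        # integration step, s
    K = 0.1                          # lumped 0.5*rho*S/m, 1/m (assumed value)
    CD0 = 0.05                       # parasitic drag coefficient (assumed value)
    ALPHA_MAX = math.radians(60.0)   # largest commanded angle of attack (assumed)
    TARGET = (20.0, 5.0)             # perch point (x, z) in metres (assumed)
    MAX_STEPS = 1000                 # hard cap on episode length

    def reset(self):
        # State: horizontal position, altitude, airspeed, flight-path angle.
        self.x, self.z, self.v, self.gamma = 0.0, 5.0, 12.0, 0.0
        self.steps = 0
        return (self.x, self.z, self.v, self.gamma)

    def step(self, action):
        # Continuous action in [-1, 1], clipped, then mapped to [0, ALPHA_MAX].
        a = max(-1.0, min(1.0, float(action)))
        alpha = 0.5 * (a + 1.0) * self.ALPHA_MAX
        cl = math.sin(2.0 * alpha)                    # flat-plate-like lift curve
        cd = self.CD0 + 2.0 * math.sin(alpha) ** 2    # drag grows steeply post-stall
        aero = self.K * self.v ** 2                   # aerodynamic scale per unit mass
        v_dot = -aero * cd - self.G * math.sin(self.gamma)
        gamma_dot = (aero * cl - self.G * math.cos(self.gamma)) / max(self.v, 0.1)
        # Explicit Euler integration of the point-mass equations.
        self.x += self.v * math.cos(self.gamma) * self.DT
        self.z += self.v * math.sin(self.gamma) * self.DT
        self.v = max(self.v + v_dot * self.DT, 0.0)
        self.gamma += gamma_dot * self.DT
        self.steps += 1
        done = (self.x >= self.TARGET[0] or self.z <= 0.0
                or self.v < 0.5 or self.steps >= self.MAX_STEPS)
        reward = 0.0
        if done:
            # Terminal cost: penalise residual speed and distance from the perch.
            miss = math.hypot(self.x - self.TARGET[0], self.z - self.TARGET[1])
            reward = -(self.v ** 2 + miss ** 2)
        return (self.x, self.z, self.v, self.gamma), reward, done, {}


def rollout(env, policy):
    """Run one episode with policy(obs) -> action; return final obs and return."""
    obs = env.reset()
    done, total = False, 0.0
    while not done:
        obs, reward, done, _ = env.step(policy(obs))
        total += reward
    return obs, total


if __name__ == "__main__":
    final_obs, episode_return = rollout(LongitudinalPerchEnv(), lambda obs: 0.0)
    print(final_obs, episode_return)
```

Because the environment exposes only `reset` and `step`, a learning algorithm (a discrete-action method such as DQN applied to a discretised action set, or a continuous-action method such as PPO) could in principle be swapped in without touching the dynamics. That separation is the kind of modularity the abstract attributes to the move to a Gym-based architecture.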
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
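Similar National Natural Science Foundation of China (NSFC) grants
Testing for reinforcement in hybrid zones of the genus Sonneratia and its genetic basis
- Grant number: 30800060
- Approval year: 2008
- Funding amount: 230,000 yuan
- Project type: Young Scientists Fund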
Similar overseas grants
Application of Deep Reinforcement Learning to Predict Ablation Therapy for Atrial Fibrillation from Imaging Data
- Grant number: 2740519
- Fiscal year: 2022
- Funding amount: --
- Project type: Studentship
Application of scalable safe reinforcement learning to high-risk robotics
- Grant number: 21J15633
- Fiscal year: 2021
- Funding amount: --
- Project type: Grant-in-Aid for JSPS Fellows
Discrete and Continuous Reinforcement Learning with a Library of Skills and its Application to Robotic Food Manipulation
- Grant number: 21K12070
- Fiscal year: 2021
- Funding amount: --
- Project type: Grant-in-Aid for Scientific Research (C)
Application of reinforcement learning to hexapod robots with free-swinging joint failures trained on rough/uneven terrain
- Grant number: 566969-2021
- Fiscal year: 2021
- Funding amount: --
- Project type: Alexander Graham Bell Canada Graduate Scholarships - Master's
Research on the innovative evolution of deep reinforcement learning based on the profit sharing principle and its application to real problems
- Grant number: 21K12024
- Fiscal year: 2021
- Funding amount: --
- Project type: Grant-in-Aid for Scientific Research (C)
The Application of Reinforcement Learning Algorithms to the Generation of Novel Chemical Entities
- Grant number: 542755-2019
- Fiscal year: 2019
- Funding amount: --
- Project type: Alexander Graham Bell Canada Graduate Scholarships - Master's
Developing Smart Sample-efficient exploration strategies for Reinforcement learning agents with a focus on its application for autonomous robot contro
- Grant number: 2109484
- Fiscal year: 2018
- Funding amount: --
- Project type: Studentship
Theory and Application of Statistical Reinforcement Learning
- Grant number: 17H00757
- Fiscal year: 2017
- Funding amount: --
- Project type: Grant-in-Aid for Scientific Research (A)
Deep Reinforcement Learning and its Application to Dialogue Systems
- Grant number: 495424-2016
- Fiscal year: 2016
- Funding amount: --
- Project type: Alexander Graham Bell Canada Graduate Scholarships - Master's
Theoretical research of the policy gradient reinforcement learning without Markov properties and its application to games
- Grant number: 26330419
- Fiscal year: 2014
- Funding amount: --
- Project type: Grant-in-Aid for Scientific Research (C)