AI and robotics - applying the maximum entropy framework to real-world robotics tasks
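Basic information
- Grant number: 2745856
- Principal investigator:
- Amount: --
- Host institution:
- Host institution country: United Kingdom
- Project category: Studentship
- Fiscal year: 2022
- Funding country: United Kingdom
- Period: 2022 to (no data)
- Project status: Not concluded
- Source:
- Keywords:
Project abstract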
Context and impact
In recent years, robotics has shown increasing promise in moving outside of laboratories and into real-world tasks. Areas such as car manufacturing that require simple and repetitive motions have felt the impact of robotics for years, but the current challenge is to extend this reach into large, dynamic environments that involve interaction between humans and robots. The economic impact of intelligent autonomous systems, once deployed at scale, will be vast, allowing the work of one person to be leveraged many times over and creating efficiency gains of orders of magnitude. One of the key components of these systems is control, which is where my research sits.

Aims and objectives
The control tasks relevant to real-world robotics broadly fall into two categories: locomotion, my focus, and manipulation. While locomotion over flat ground is fairly straightforward, it becomes much more difficult over rough terrain, requiring extra sensory modes such as vision to anticipate obstacles and act accordingly. One of the principal aims of my research is to increase the effective range of operation of state-of-the-art locomotion controllers. More complex forms of movement such as climbing and jumping, performed robustly, are currently out of the reach of modern robotics systems. Extending these capabilities would greatly enhance the domain of autonomy of these systems, furthering real-world deployment.

Novelty of the research methodology
One of the key technologies powering state-of-the-art robotics is deep reinforcement learning (RL). This is the technique I will primarily focus on in my research, and I aim to extend its capabilities by addressing the following questions.
The first of these is how we can make training reinforcement learning systems more stable and effective. RL algorithms commonly fall into locally optimal solutions that either partially solve a problem or 'game' the reward function in an unhelpful way, for example an agent that stays still to avoid penalties for collapsing, thereby sidestepping the task we want it to accomplish. Additionally, the search for useful actions often involves highly unstable behaviour which, when deployed on real-world systems, can easily result in damage to the hardware or, more importantly, harm to nearby people. If we wish to one day have intelligent autonomous systems that can adapt to real-time changes in their environment, these problems must be solved. From my research, a promising solution to both of these issues is the maximum entropy framework (MaxEnt).
Maximum entropy algorithms jointly optimise for the agent's reward function and the 'entropy', which can be thought of as a measure of randomness, of the distribution of actions it takes. The benefits of this approach are as follows. Firstly, it is a simple and robust solution to the exploration-exploitation trade-off. This is one of the key dilemmas in designing an RL system and can be thought of as balancing the use of solutions we know are currently effective against exploring other options that might be even more effective. Secondly, MaxEnt has a unique advantage over other methods in that it allows for multimodal solutions, meaning that if there are multiple equally valid ways of solving a problem, the agent can retain all of them, giving us greater flexibility.

Alignment to EPSRC's strategies and research areas
I believe this project falls squarely under the UKRI's AI and robotics theme, as robust control policies are one of the core challenges that need to be solved for our systems, and thus the UK, to be resilient and effective. Additionally, my research coincides with many of the strategic priorities, such as artificial intelligence and frontiers in engineering and technology, and possibly even more distant areas such as transforming health and healthcare through the addition of robotic manipulators in surgical procedures.
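
The joint optimisation of reward and entropy described in the abstract can be illustrated with a minimal sketch. It assumes a single decision step with three discrete actions, which is a simplification and not the sequential locomotion setting of the project. The action values, the temperature `alpha`, and all function names are hypothetical and chosen only for illustration. The sketch compares a greedy policy with a Boltzmann (softmax) policy under the objective "expected reward plus alpha times entropy".

```python
import math

def softmax_policy(q_values, alpha):
    """Boltzmann policy: pi(a) proportional to exp(Q(a) / alpha)."""
    m = max(q_values)  # subtract the max for numerical stability
    weights = [math.exp((q - m) / alpha) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]

def entropy(probs):
    """Shannon entropy in nats; zero-probability actions contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def maxent_objective(probs, q_values, alpha):
    """Expected reward plus alpha times the entropy of the action distribution."""
    expected_reward = sum(p * q for p, q in zip(probs, q_values))
    return expected_reward + alpha * entropy(probs)

# Hypothetical one-step problem: two equally good actions and one worse action.
q_values = [1.0, 1.0, 0.0]
alpha = 0.5  # assumed temperature weighting entropy against reward

greedy = [1.0, 0.0, 0.0]                 # commits to a single action
soft = softmax_policy(q_values, alpha)   # keeps both good actions

greedy_score = maxent_objective(greedy, q_values, alpha)
soft_score = maxent_objective(soft, q_values, alpha)
print("soft policy:", soft)
print("greedy objective:", greedy_score, "soft objective:", soft_score)
```

In this simplified setting the softmax policy assigns equal probability to the two equally good actions, which is the multimodal behaviour the abstract refers to, and it scores higher on the entropy-augmented objective than the greedy policy. A smaller `alpha` moves the softmax policy closer to greedy behaviour, which is one way to view the exploration-exploitation trade-off.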
Project outputs
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants
An implantable biosensor microsystem for real-time measurement of circulating biomarkers
- Grant number: 2901954
- Fiscal year: 2028
- Amount: --
- Project category: Studentship
Exploiting the polysaccharide breakdown capacity of the human gut microbiome to develop environmentally sustainable dishwashing solutions
- Grant number: 2896097
- Fiscal year: 2027
- Amount: --
- Project category: Studentship
A Robot that Swims Through Granular Materials
- Grant number: 2780268
- Fiscal year: 2027
- Amount: --
- Project category: Studentship
Likelihood and impact of severe space weather events on the resilience of nuclear power and safeguards monitoring.
- Grant number: 2908918
- Fiscal year: 2027
- Amount: --
- Project category: Studentship
Proton, alpha and gamma irradiation assisted stress corrosion cracking: understanding the fuel-stainless steel interface
- Grant number: 2908693
- Fiscal year: 2027
- Amount: --
- Project category: Studentship
Field Assisted Sintering of Nuclear Fuel Simulants
- Grant number: 2908917
- Fiscal year: 2027
- Amount: --
- Project category: Studentship
Assessment of new fatigue capable titanium alloys for aerospace applications
- Grant number: 2879438
- Fiscal year: 2027
- Amount: --
- Project category: Studentship
Developing a 3D printed skin model using a Dextran - Collagen hydrogel to analyse the cellular and epigenetic effects of interleukin-17 inhibitors in
- Grant number: 2890513
- Fiscal year: 2027
- Amount: --
- Project category: Studentship
Understanding the interplay between the gut microbiome, behavior and urbanisation in wild birds
- Grant number: 2876993
- Fiscal year: 2027
- Amount: --
- Project category: Studentship
Similar overseas grants
CAREER: Democratizing Robot Learning for Assistive Robotics in MCI
- Grant number: 2340177
- Fiscal year: 2024
- Amount: --
- Project category: Continuing Grant
Reverse Design of Tuneable 4D Printed Materials for Soft Robotics
- Grant number: DE240100960
- Fiscal year: 2024
- Amount: --
- Project category: Discovery Early Career Researcher Award
Advanced AI and RobotIcS for autonomous task pErformance
- Grant number: 10110390
- Fiscal year: 2024
- Amount: --
- Project category: EU-Funded
Human-centric Digital Twin Approaches to Trustworthy AI and Robotics for Improved Working Conditions
- Grant number: 10109582
- Fiscal year: 2024
- Amount: --
- Project category: EU-Funded
CAREER: Manufacturing of Solid Particle-Liquid Metal Mixtures for Soft Robotics and Stretchable Electronics
- Grant number: 2339780
- Fiscal year: 2024
- Amount: --
- Project category: Standard Grant
REU Site: Undergraduate Robotics Research for Rural Appalachia
- Grant number: 2348288
- Fiscal year: 2024
- Amount: --
- Project category: Standard Grant
CAREER: Integrating Robotics and Socio-emotional Learning for Incarcerated Middle School Students
- Grant number: 2404954
- Fiscal year: 2024
- Amount: --
- Project category: Continuing Grant
MyTurn: An Afterschool Social Robotics Program to Promote Interest in Computing Among Middle School Students
- Grant number: 2342099
- Fiscal year: 2024
- Amount: --
- Project category: Standard Grant
Agricultural Robotics and Automation Technologies
- Grant number: 2348815
- Fiscal year: 2024
- Amount: --
- Project category: Standard Grant
Hybrid Robotics for Future Reconfigurable Manufacturing
- Grant number: 2905321
- Fiscal year: 2024
- Amount: --
- Project category: Studentship