SHAIC1: Towards Scalable Human-AI Coordination from First Principles

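Basic information

Grant number:
EP/Y028481/1
Principal investigator:
Jakob Foerster
Amount:
$2.4277 million
Host institution:
University of Oxford
Host institution country:
United Kingdom
Project category:
Research Grant
Fiscal year:
2024
Funding country:
United Kingdom
Start and end dates:
2024 to (no data)
Project status:
Not concluded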

Source:
https://gtr.ukri.org/projects?ref=EP%2FY028481%2F1
Keywords:
SHAIC1 Towards Scalable Human AI

Project abstract

The goal of this proposal is to develop artificial intelligence (AI) agents that can support and collaborate with humans in complex, real-world settings. These include, for example, industrial or service robots that can work in teams with humans and self-driving cars that interact smoothly with other traffic participants in mixed-autonomy settings. A fundamental issue is that, unlike the scalable solutions for competitive settings, current approaches for cooperative ones rely on human data and are thus limited in their scalability. Unfortunately, scaling compute to remove the need for human data is challenging in these settings. Without the well-defined objective present in zero-sum settings, it requires finding one of the few solutions that is human-compatible in a pool that also contains combinatorially many human-incompatible ones.

My hypothesis is that humans have a well-defined concept of a 'good coordination solution' and to a great extent rely on this concept to solve coordination problems, i.e. when they have to work with others but cannot pre-agree on a strategy. Generally speaking, a good solution in such scenarios is one that is simple, symmetric, and therefore easy to adapt to. To move towards a formalisation and implementation of this intuitive idea, I will show how general-purpose coordination policies can be efficiently discovered in complex settings using iteratively learned state-abstractions which implement simplicity and symmetry constraints.

I will then robustify these policies to human sub-optimality using novel algorithms that gradually relax the constraints via online adaptation or small amounts of real-world human data.

This project will result in new methods that can scale to complex human-AI coordination problems beyond the reach of the current state of the art. It will also develop a new theory that sets the scene for fundamental progress on human-AI coordination and unlocks crucial application areas, such as autonomous industrial robots.

Project outcomes

Journal articles (0)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Other publications by Jakob Foerster

Computing Low-Entropy Couplings for Large-Support Distributions

DOI:
Publication year:
2024
Journal:
arXiv.org
Impact factor:
0
Authors:
Samuel Sokota; Dylan Sam; C. S. D. Witt; Spencer Compton; Jakob Foerster; J. Z. Kolter
Corresponding author:
J. Z. Kolter

Reinforcement Learning Controllers for Soft Robots Using Learned Environments

DOI:
Publication year:
2024
Journal:
International Conference on Soft Robotics
Impact factor:
0
Authors:
Uljad Berdica; Matthew Jackson; Niccolò Enrico Veronese; Jakob Foerster; P. Maiolino
Corresponding author:
P. Maiolino