Effective and Semantic Communication in Multi-Agent Reinforcement Learning

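Basic information

Grant number:
2619796
Principal investigator:
Amount:
--
Host institution:
Imperial College London
Host institution country:
United Kingdom
Project category:
Studentship
Fiscal year:
2021
Funding country:
United Kingdom
Duration:
2021 to (no data)
Project status:
Not yet concluded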

Source:
https://gtr.ukri.org/projects?ref=studentship-2619796
Keywords:
Effective Semantic Communication Multi Agent

Project abstract

My project focuses on designing goal-oriented communication frameworks and systems using machine learning optimisation techniques. The communication problem can be divided into three levels:

1. Technical problem: How accurately can the symbols of communication be transmitted?
2. Semantic problem: How precisely do the transmitted symbols convey the desired meaning?
3. Effectiveness problem: How effectively does the received meaning affect conduct in the desired way?

The prevailing paradigm is the technical problem perspective. Consider streaming visual media or guiding a remote-controlled rover. Current approaches to communications follow a layered strategy. First, the message (for instance, a video frame or a rover command) is mapped to a bit pattern to remove redundancy. Next, the compressed bit pattern is protected against channel distortion with an error-correcting code. Finally, the protected bits are mapped to channel symbols for transmission. After the transmission, the process is reversed at the receiver. Each step of this process has been extensively researched in the last 80 years. This method has a significant advantage: it allows for much simpler analysis at each stage. The problem is broken into subproblems that are more manageable to tackle.

This approach is optimal for specific sources and sufficiently long block codes; it cannot be beaten asymptotically. In general, however, it is not flawless. Communication is rarely the goal in itself. Instead, it is used to achieve some other end. Thus, the success of communication should be measured in the context of the overall objective. That is the main focus of my project: the semantic and effectiveness problems of communication. The modular system fails to exploit the interactions, dependencies, and correlations between the steps. However, each of these stages is complex by itself. Thus, doing away with the modular approach is not feasible in a straightforward analytic manner.
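The layered strategy described above (source coding, then channel coding, then mapping to channel symbols, with the reverse at the receiver) can be illustrated with a toy pipeline. This is only a sketch under loud assumptions: `zlib` stands in for the source coder, a 5-fold repetition code for the error-correcting code, BPSK over an additive Gaussian noise channel for the modulation and channel, and all function names are hypothetical. None of these choices come from the project itself.

```python
import random
import zlib


def to_bits(data: bytes) -> list:
    """Unpack bytes into a list of bits, most significant bit first."""
    return [(byte >> i) & 1 for byte in data for i in range(7, -1, -1)]


def from_bits(bits: list) -> bytes:
    """Pack a list of bits (MSB first, length a multiple of 8) into bytes."""
    out = bytearray()
    for i in range(0, len(bits), 8):
        byte = 0
        for b in bits[i:i + 8]:
            byte = (byte << 1) | b
        out.append(byte)
    return bytes(out)


def repetition_encode(bits: list, n: int = 5) -> list:
    """Channel coding (assumed): repeat every bit n times."""
    return [b for b in bits for _ in range(n)]


def repetition_decode(bits: list, n: int = 5) -> list:
    """Majority vote over each group of n received bits."""
    return [1 if sum(bits[i:i + n]) > n // 2 else 0
            for i in range(0, len(bits), n)]


def bpsk_modulate(bits: list) -> list:
    """Map bits to channel symbols: 0 -> -1.0, 1 -> +1.0."""
    return [1.0 if b else -1.0 for b in bits]


def awgn_channel(symbols: list, noise_std: float, rng: random.Random) -> list:
    """Add independent Gaussian noise to every symbol."""
    return [s + rng.gauss(0.0, noise_std) for s in symbols]


def bpsk_demodulate(received: list) -> list:
    """Hard decision: positive values decode to 1, the rest to 0."""
    return [1 if r > 0 else 0 for r in received]


def transmit(message: bytes, noise_std: float, seed: int = 0) -> bytes:
    """Run the message through the three layers and back again."""
    rng = random.Random(seed)
    compressed = zlib.compress(message)              # 1. remove redundancy
    coded = repetition_encode(to_bits(compressed))   # 2. protect the bits
    symbols = bpsk_modulate(coded)                   # 3. map to symbols
    received = awgn_channel(symbols, noise_std, rng)
    decoded = repetition_decode(bpsk_demodulate(received))
    return zlib.decompress(from_bits(decoded))       # reverse of step 1


if __name__ == "__main__":
    command = b"rover: turn left 15 degrees"
    print(transmit(command, noise_std=0.3))
```

Each stage in this sketch is designed and analysed without knowledge of the others, which is the modularity the abstract refers to. It also hints at the cost: if the noise is strong enough that a single bit survives the repetition decoder incorrectly, `zlib.decompress` may raise an error and the whole message is lost, regardless of how much of it mattered for the task.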
That is why the current advancements in statistical methods such as machine learning are especially promising. The ability to learn inductively from examples allows for joining the modules of communication into a single system. However, what constitutes the desired behaviour is not always apparent. Thus, another framework is introduced. Reinforcement learning is a set of algorithms that allow for learning complex behaviours through interactions with an environment. By introducing realistic communication channels, we can extend those methods to learn the communication schemes themselves jointly with the desired conduct. This framework can be extended to include multiple agents interacting and learning. This study is relevant to remote control problems, drone swarm navigation, coordinated autonomous vehicle driving, distributed learning, and the industrial internet of things, where a number of independent actors need to coordinate to achieve a common goal.

Relevant EPSRC research areas: Artificial intelligence technologies, ICT networks and distributed systems, Digital signal processing, Statistics and applied probability.
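As a rough illustration of learning a communication scheme jointly with the desired conduct, the sketch below trains two independent tabular learners in a tiny signalling game. A sender observes a binary state and emits one binary symbol, the symbol passes through a binary symmetric channel, and a receiver picks an action; both are rewarded only when the action matches the state. The game, the epsilon-greedy update, the channel model, and every name and parameter here are assumptions chosen for brevity, not the methods used in the project.

```python
import random


def train(episodes=5000, flip_prob=0.1, epsilon=0.1, lr=0.1, seed=0):
    """Train a sender and a receiver from a shared task reward only."""
    rng = random.Random(seed)
    n = 2  # number of states, symbols, and actions (assumed equal)
    q_sender = [[0.0] * n for _ in range(n)]    # q_sender[state][symbol]
    q_receiver = [[0.0] * n for _ in range(n)]  # q_receiver[symbol][action]

    def choose(q_row):
        # Epsilon-greedy choice with random tie-breaking.
        if rng.random() < epsilon:
            return rng.randrange(n)
        best = max(q_row)
        return rng.choice([i for i, v in enumerate(q_row) if v == best])

    for _ in range(episodes):
        state = rng.randrange(n)
        symbol = choose(q_sender[state])
        # Binary symmetric channel: flip the symbol with probability flip_prob.
        received = symbol ^ 1 if rng.random() < flip_prob else symbol
        action = choose(q_receiver[received])
        reward = 1.0 if action == state else 0.0
        # Both agents update towards the same task-level reward.
        q_sender[state][symbol] += lr * (reward - q_sender[state][symbol])
        q_receiver[received][action] += lr * (
            reward - q_receiver[received][action])
    return q_sender, q_receiver


def greedy(q_row):
    """Index of the largest value in a row (first one on ties)."""
    return max(range(len(q_row)), key=q_row.__getitem__)


def success_rate(q_sender, q_receiver):
    """Fraction of states recovered by the greedy policies, without noise."""
    hits = 0
    for state in range(len(q_sender)):
        symbol = greedy(q_sender[state])
        action = greedy(q_receiver[symbol])
        hits += int(action == state)
    return hits / len(q_sender)


if __name__ == "__main__":
    q_s, q_r = train()
    print("greedy success rate:", success_rate(q_s, q_r))
```

Neither agent is told what a symbol means; any mapping from states to symbols that emerges is shaped only by how well the receiver's resulting action serves the goal, with the channel noise experienced during training. Independent learners of this kind are not guaranteed to converge to a useful code, so the printed rate should be read as an experiment outcome rather than a fixed result.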
