
Robust Multimodal Fusion For Low-Level Tasks


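Basic Information

Grant number:
EP/T026111/1
Principal investigator:
Joao De Castro Mota
Funding amount:
$324,400
Host institution:
Heriot-Watt University
Host institution country:
United Kingdom
Project category:
Research Grant
Fiscal year:
2021
Funding country:
United Kingdom
Start and end dates:
2021 to (no data)
Project status:
Completed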

Source:
https://gtr.ukri.org/projects?ref=EP%2FT026111%2F1
Keywords:
Robust Multimodal Fusion Low Level

Project Abstract

There is a silent but steady revolution happening in all sectors of the economy, from agriculture through manufacturing to services. In virtually all activities in these sectors, processes are being constantly monitored and improved via data collection and analysis. While there has been tremendous progress in data collection through a panoply of new sensor technologies, data analysis has proved to be a much more challenging task. Indeed, in many situations, the data generated by sensors comes in quantities so large that most of it ends up being discarded. Sensors also frequently collect different types of data about the same phenomenon, the so-called multimodal data. However, it is hard to determine how the different types of data relate to each other or, in particular, what one sensing modality tells us about another. In this project, we address the challenge of making sense of multimodal data, that is, data that refers to the same phenomenon, but reveals different aspects of it and is usually presented in different formats. For example, several modalities can be used to diagnose cancer, including blood tests, imaging technologies like magnetic resonance (MR) and computed tomography (CT), genetic data, and family history information. Each of these modalities is typically insufficient to perform an accurate diagnosis but, when considered together, they usually lead to an undeniable conclusion.

Our starting point is the realization that different sensing modalities have different costs, where "cost" can be financial, refer to safety or societal issues, or both. For instance, in the above example of cancer diagnosis, CT imaging involves exposing patients to X-ray radiation which, ironically, can cause cancer. MR imaging, on the other hand, exposes patients to strong magnetic fields, a procedure that is generally safe.
A pertinent question is then whether we can perform both MR and CT imaging, but use a lower dose of radiation in CT (obtaining a poor-resolution CT) and, afterward, improve the resolution of CT by leveraging information from MR. This, of course, requires learning what type of information can be transferred between different modalities. Another example scenario is autonomous driving, in which sensors like radar, LiDAR, or infrared cameras, although much more expensive than conventional cameras, collect information that is critical to driving in safe conditions. In this case, is it possible to use cheaper, lower-resolution sensors and enhance them with information from conventional cameras? These examples also demonstrate that many of the scenarios in which we collect multimodal data also have robustness requirements, namely, precision of diagnosis in cancer detection and safety in autonomous driving. Our goal is then to develop data processing algorithms that effectively capture common information across multimodal data, leverage these structures to improve reconstruction, prediction, or classification of the costlier (or all) modalities, and are verifiable and robust. We do this by combining learning-based approaches with model-based approaches. Over the last few years, learning-based approaches, namely deep learning methods, have reached unprecedented performance, and work by extracting information from large datasets. Unfortunately, they are vulnerable to so-called generalization errors, which occur when the data to which they are applied differs significantly from the data used in the learning process. On the other hand, model-based methods tend to be more robust, but have poorer performance in general. The approaches we propose to explore use learning-based techniques to determine correspondences across modalities, extracting relevant common information, and integrate that common information into model-based schemes. Their ultimate goal is to compensate for cost and quality imbalances across the modalities while, at the same time, providing robustness and verifiability.
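
The idea of integrating information from one modality into a model-based reconstruction of another can be illustrated with a toy sketch. This is not the project's method: it assumes a 1-D signal, a low-resolution sensor modelled as block averaging, and side information `w` that stands in for whatever a learned cross-modal predictor might supply. The function names, the quadratic penalty, and the weight `lam` are all hypothetical choices made for illustration.

```python
def block_average(x, k):
    """Toy forward model A: average each run of k consecutive samples."""
    return [sum(x[i:i + k]) / k for i in range(0, len(x), k)]


def fuse(y, w, k, lam):
    """Minimise ||A x - y||^2 + lam * ||x - w||^2 for block-averaging A.

    y   : low-resolution measurements of the costly modality
    w   : side information (assumed prediction from the cheaper modality)
    lam : trust in the side information (hypothetical tuning parameter)

    The blocks are disjoint, so A A^T = (1/k) I and the minimiser is
    x_i = w_i + (y_b - mean(w_b)) / (1 + k * lam) for sample i in block b.
    """
    if len(w) != k * len(y):
        raise ValueError("w must have k samples per measurement in y")
    residual = [yb - wb for yb, wb in zip(y, block_average(w, k))]
    return [w[i] + residual[i // k] / (1 + k * lam) for i in range(len(w))]


# Side information has the right fine detail but a constant offset.
truth = [1.0, 3.0, 2.0, 6.0]
y = block_average(truth, 2)       # what the low-resolution sensor sees
w = [0.0, 2.0, 1.0, 5.0]          # assumed cross-modal prediction
x_hat = fuse(y, w, k=2, lam=0.0)  # lam = 0 enforces the measurements exactly
```

With `lam = 0` the estimate matches the measurements exactly and takes its fine detail from `w`; larger values of `lam` pull the estimate toward the side information, which is where the robustness trade-off mentioned in the abstract would enter.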


Project Outputs

Journal papers (10)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Classification-Driven Discrete Neural Representation Learning for Semantic Communications

DOI:
10.1109/jiot.2024.3354312
Publication date:
2024-05
Journal:
IEEE Internet of Things Journal
Impact factor:
10.6
Authors:
Wenhui Hua;Longhui Xiong;Sicong Liu;Lingyu Chen;Xuemin Hong;João F. C. Mota;Xiang Cheng
Corresponding authors:
Wenhui Hua;Longhui Xiong;Sicong Liu;Lingyu Chen;Xuemin Hong;João F. C. Mota;Xiang Cheng

Sharper Bounds for Proximal Gradient Algorithms with Errors
