Efficient and reliable coded distributed computing

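Basic information

  • Grant number:
    570977-2021
  • Principal investigator:
  • Amount:
    $36,400
  • Host institution:
  • Host institution country:
    Canada
  • Project category:
    Alliance Grants
  • Fiscal year:
    2022
  • Funding country:
    Canada
  • Project period:
    2022-01-01 to 2023-12-31
  • Project status:
    Completed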

Project abstract

Many modern ICT applications work with data at scale and demand massive computations that cannot be performed on a single computer. This has led to the wide use of distributed computing, where a massive computational task is distributed among a large number of computing nodes in a communication network. In practice, some of these computing nodes fail to deliver their results because of software or hardware failures, because they are handling other tasks for the network, because they leave the network, and so on. These straggling nodes (typically around 5% of the processing nodes) result in unpredictable network performance and can significantly prolong job completion. Currently, redundancy in the form of repeating the tasks is used to combat stragglers.

Error-correcting codes offer an opportunity to combat stragglers at a much lower cost, with reduced communication load, a higher success rate, and added security/privacy benefits. They also make it possible for the network to use a large number of very low-cost hardware devices to reliably finish a massive job in a short time. In this project, (i) we will design various low-complexity error-correction coding algorithms that are feasible for large-scale distributed computing, hence enabling the network to handle data at scale reliably; (ii) we will design task scheduling algorithms that optimally distribute and schedule the tasks in the network in order to minimize the completion time/cost, with guaranteed success.

We anticipate that this project will significantly improve cloud services by developing coded distributed computation and task allocation/scheduling algorithms that (i) reduce the completion time and communication costs, (ii) use the available resources very efficiently, (iii) have low implementation complexity, and (iv) provide added privacy/security. In addition, our algorithms can be used in real-life applications such as telepresence, telehealth, augmented reality, distributed database management systems, real-time process control, and more.
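
To make the idea of coding against stragglers concrete, the following is a minimal sketch of a textbook Vandermonde (MDS-style) code applied to a matrix-vector product. It is not the project's algorithm: the function names (`encode`, `decode`, `coded_matvec`), the choice of evaluation points 1, 2, ..., n, and the use of exact rational arithmetic are all assumptions made for illustration. The matrix is split into k row blocks, n workers each receive one coded block, and the results of any k workers are enough to recover the full product, so up to n - k stragglers can be ignored.

```python
from fractions import Fraction


def matvec(rows, x):
    """Plain matrix-vector product on lists."""
    return [sum(a * b for a, b in zip(row, x)) for row in rows]


def encode(blocks, n):
    """Hypothetical encoder: worker i stores sum_j (i + 1)**j * blocks[j]."""
    k = len(blocks)
    n_rows, n_cols = len(blocks[0]), len(blocks[0][0])
    coded = []
    for i in range(n):
        point = Fraction(i + 1)  # distinct evaluation point per worker
        coded.append([
            [sum(point ** j * blocks[j][r][c] for j in range(k))
             for c in range(n_cols)]
            for r in range(n_rows)
        ])
    return coded


def solve(matrix, rhs):
    """Gauss-Jordan elimination with exact fractions; each rhs row is a vector."""
    k = len(matrix)
    aug = [[Fraction(v) for v in matrix[i]] + list(rhs[i]) for i in range(k)]
    for col in range(k):
        pivot = next(r for r in range(col, k) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        inv = 1 / aug[col][col]
        aug[col] = [v * inv for v in aug[col]]
        for r in range(k):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]


def decode(results, k):
    """Recover the uncoded block products from any k worker results."""
    if len(results) < k:
        raise ValueError("too many stragglers: need at least k results")
    workers = sorted(results)[:k]
    vandermonde = [[Fraction(w + 1) ** j for j in range(k)] for w in workers]
    pieces = solve(vandermonde, [results[w] for w in workers])
    return [v for piece in pieces for v in piece]


def coded_matvec(A, x, k, n, stragglers=()):
    """Simulate n workers, drop the stragglers, and decode A @ x.

    Assumes len(A) is divisible by k (an assumption of this sketch).
    """
    size = len(A) // k
    blocks = [A[j * size:(j + 1) * size] for j in range(k)]
    coded = encode(blocks, n)
    results = {i: matvec(coded[i], x) for i in range(n) if i not in stragglers}
    return decode(results, k)
```

With k = 2 and n = 3, this sketch tolerates one straggler while each worker handles half of the rows, whereas tolerating one straggler by repeating every task would need four workers at the same per-worker load. Practical schemes typically work over finite fields or use numerically stable real-valued codes; exact fractions are used here only to keep the example free of rounding issues.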

Project outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other publications by Ardakani, Masoud

Similar overseas grants

CRII: RI: Deep neural network pruning for fast and reliable visual detection in self-driving vehicles
  • Grant number:
    2412285
  • Fiscal year:
    2024
  • Funding amount:
    $36,400
  • Project category:
    Standard Grant
CRII: SaTC: Reliable Hardware Architectures Against Side-Channel Attacks for Post-Quantum Cryptographic Algorithms
  • Grant number:
    2348261
  • Fiscal year:
    2024
  • Funding amount:
    $36,400
  • Project category:
    Standard Grant
Enabling Reliable Testing Of SMLM Datasets
  • Grant number:
    BB/X01858X/1
  • Fiscal year:
    2024
  • Funding amount:
    $36,400
  • Project category:
    Research Grant
RITA: Reliable and Efficient Task Management in Edge Computing for AIoT Systems
  • Grant number:
    EP/Y015886/1
  • Fiscal year:
    2024
  • Funding amount:
    $36,400
  • Project category:
    Fellowship
A Novel Contour-based Machine Learning Tool for Reliable Brain Tumour Resection (ContourBrain)
  • Grant number:
    EP/Y021614/1
  • Fiscal year:
    2024
  • Funding amount:
    $36,400
  • Project category:
    Research Grant
Economic & Reliable DC Microgrids
  • Grant number:
    EP/Y034619/1
  • Fiscal year:
    2024
  • Funding amount:
    $36,400
  • Project category:
    Fellowship
CAREER: Graded and Reliable Aerosol Deposition for Electronics (GRADE): Understanding Multi-Material Aerosol Jet Printing with In-Line Mixing
  • Grant number:
    2336356
  • Fiscal year:
    2024
  • Funding amount:
    $36,400
  • Project category:
    Standard Grant
STTR Phase I: A Reliable and Efficient New Method for Satellite Attitude Control
  • Grant number:
    2310323
  • Fiscal year:
    2024
  • Funding amount:
    $36,400
  • Project category:
    Standard Grant
Towards an Explainable, Efficient, and Reliable Federated Learning Framework: A Solution for Data Heterogeneity
  • Grant number:
    24K20848
  • Fiscal year:
    2024
  • Funding amount:
    $36,400
  • Project category:
    Grant-in-Aid for Early-Career Scientists
SPARQ(s) - Scalable, Precise, And Reliable positioning of color centers for Quantum computing and simulation
  • Grant number:
    10078083
  • Fiscal year:
    2024
  • Funding amount:
    $36,400
  • Project category:
    Collaborative R&D