Training Unstructured Sparse Neural Networks


Basic Information

  • Grant Number:
    RGPIN-2022-03120
  • Principal Investigator:
  • Amount:
    $18,200
  • Host Institution:
  • Host Institution Country:
    Canada
  • Program Type:
    Discovery Grants Program - Individual
  • Fiscal Year:
    2022
  • Funding Country:
    Canada
  • Duration:
    2022-01-01 to 2023-12-31
  • Status:
    Completed

Project Abstract

The Increasing Cost of Training Deep Neural Networks

Deep Neural Networks (DNNs) are behind the intelligence in contemporary technology, enabling us to search our photos, ask questions of smart assistants, or use machine translation to understand a foreign language. DNNs have a fundamental problem, however: they are very expensive in cost, energy usage, and time, both during training (learning a task) and inference (application). The state-of-the-art model for Natural Language Processing (NLP), GPT-3, has 175 billion parameters and is estimated to have cost more than $4.6M USD to train. The trend is for training DNNs to become ever more expensive: the growth in the cost of training state-of-the-art DNN models has outpaced even the exponential growth in transistor technology that governs our computational capacity, i.e. Moore's Law.

Proposed Program of Research: Enabling Sparse DNN Training

DNNs are expensive primarily because of the empirically established, but poorly understood, requirement to over-parameterize DNN models during training to achieve good generalization (performance on unseen data). We know these models are over-parameterized because, after training, 80-95% of the learned weights (parameters) of DNN models can be removed (pruned) without significant loss of generalization [16]. Unstructured pruning, that is, removing unnecessary individual weights from a DNN, can be highly effective at reducing the size and inference cost of pre-trained DNNs used in applications. However, attempting to train an unstructured sparse DNN from random initialization, just as dense (standard) DNNs are trained, rarely matches the generalization of dense training in practice. For this reason, unstructured sparsity has not yet played a significant role in decreasing the cost of training DNNs. Developing effective approaches for efficient sparse training would make DNN training cheaper, faster, more repeatable, and, importantly, more accessible to both researchers and new applications.

Efficient Deep Learning and Society: Adversarial Robustness and Bias of Efficient DNNs

DNNs used in industrial applications already differ drastically from those proposed in most academic papers, in their use of efficient-DNN methods (e.g. quantization, pruning, and distillation) and efficient DNN architectures. Yet there is little work exploring the differences between well-studied academic models and industry-deployed efficient DNNs beyond generalization performance and efficiency. Understanding these differences is becoming imperative given our society's increasing reliance on this technology. With an ever larger potential to affect our society, identifying and resolving issues of robustness and bias specific to efficient DNNs is an increasingly important area of research, and one relevant to a variety of potential industrial partners using DNNs in real-world applications.
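The pruning result cited above (80-95% of weights removable after training) is typically obtained by magnitude pruning. The following is a minimal NumPy sketch of unstructured magnitude pruning, not the proposal's method: the function name, sparsity level, and threshold rule are illustrative assumptions.

```python
import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Return a boolean mask keeping the largest-magnitude weights.

    Unstructured pruning removes individual weights regardless of their
    position, so the surviving pattern is irregular (no row/column or
    block structure).
    """
    k = int(weights.size * sparsity)  # number of weights to remove
    if k == 0:
        return np.ones(weights.shape, dtype=bool)
    # k-th smallest absolute value serves as the pruning threshold
    threshold = np.partition(np.abs(weights).ravel(), k - 1)[k - 1]
    return np.abs(weights) > threshold

rng = np.random.default_rng(0)
w = rng.normal(size=(256, 256))          # stand-in for a trained weight matrix
mask = magnitude_prune(w, sparsity=0.9)  # remove 90% of weights
w_sparse = w * mask                      # pruned weights; ~10% nonzero
```

Training a sparse network from random initialization, as the proposal targets, amounts to fixing such a mask (or updating it during training) and applying it to the weights at every step, rather than pruning once after dense training.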

Project Outputs

Journal Articles (0)
Monographs (0)
Research Awards (0)
Conference Papers (0)
Patents (0)


Other Publications by Ioannou, Yani

Scientific Domain Knowledge Improves Exoplanet Transit Classification with Deep Learning
  • DOI:
    10.3847/2041-8213/aaf23b
  • Publication Date:
    2018-12-10
  • Journal:
  • Impact Factor:
    7.9
  • Authors:
    Ansdell, Megan; Ioannou, Yani; Angerhausen, Daniel
  • Corresponding Author:
    Angerhausen, Daniel

Other Grants by Ioannou, Yani

Training Unstructured Sparse Neural Networks
  • Grant Number:
    DGECR-2022-00358
  • Fiscal Year:
    2022
  • Funding Amount:
    $18,200
  • Program Type:
    Discovery Launch Supplement
Training Unstructured Sparse Neural Networks
  • Grant Number:
    DGDND-2022-03120
  • Fiscal Year:
    2022
  • Funding Amount:
    $18,200
  • Program Type:
    DND/NSERC Discovery Grant Supplement

Similar International Grants

CRII: OAC: Dynamically Adaptive Unstructured Mesh Technologies for High-Order Multiscale Fluid Dynamics Simulations
  • Grant Number:
    2348394
  • Fiscal Year:
    2024
  • Funding Amount:
    $18,200
  • Program Type:
    Standard Grant
Archer: Next-generation unstructured data access for hospitals and clinical trial sponsors, delivering efficiency, reducing costs and improving care
  • Grant Number:
    10096804
  • Fiscal Year:
    2024
  • Funding Amount:
    $18,200
  • Program Type:
    Collaborative R&D
SHF: Small: Domain-Specific FPGAs to Accelerate Unrolled DNNs with Fine-Grained Unstructured Sparsity and Mixed Precision
  • Grant Number:
    2303626
  • Fiscal Year:
    2023
  • Funding Amount:
    $18,200
  • Program Type:
    Standard Grant
Generating Reproducible Real-World Evidence with Multi-Source Data to Capture Unstructured Clinical Endpoints for Chronic Diseases
  • Grant Number:
    10797849
  • Fiscal Year:
    2023
  • Funding Amount:
    $18,200
  • Program Type:
Harnessing Business Insights from Unstructured Customer Data
  • Grant Number:
    DP230101490
  • Fiscal Year:
    2023
  • Funding Amount:
    $18,200
  • Program Type:
    Discovery Projects
Experiments on unstructured bargaining
  • Grant Number:
    23K01318
  • Fiscal Year:
    2023
  • Funding Amount:
    $18,200
  • Program Type:
    Grant-in-Aid for Scientific Research (C)
Developing a "state of the art" reasoning engine to extract forward looking and actionable insights for financial institutions from unstructured data sets
  • Grant Number:
    10054206
  • Fiscal Year:
    2023
  • Funding Amount:
    $18,200
  • Program Type:
    Collaborative R&D
Frames as dictionaries in inverse problems: Recovery guarantees for structured sparsity, unstructured environments, and symmetry-group identification
  • Grant Number:
    2308152
  • Fiscal Year:
    2023
  • Funding Amount:
    $18,200
  • Program Type:
    Standard Grant
SHINE: Faster Boundary-Conforming Simulations of Solar Convection on Unstructured Grids
  • Grant Number:
    2310372
  • Fiscal Year:
    2023
  • Funding Amount:
    $18,200
  • Program Type:
    Standard Grant
Regional Oncology Research Center (LLMs for Unstructured Data Extraction)
  • Grant Number:
    10891024
  • Fiscal Year:
    2023
  • Funding Amount:
    $18,200
  • Program Type: