CAREER: Efficient Large Language Model Inference Through Codesign: Adaptable Software Partitioning and FPGA-based Distributed Hardware

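Basic Information

Award number:
2339084
Principal investigator:
Mohamed Abdelfattah
Amount:
Approximately $883,100
Host institution:
Cornell University
Host institution country:
United States
Award type:
Continuing Grant
Fiscal year:
2024
Funding country:
United States
Project period:
2024-05-01 to 2029-04-30
Project status:
Ongoing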

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=2339084&HistoricalAwards=false
Keywords:
CAREER Efficient Large Language Model

Project Abstract

Artificial intelligence (AI) has entered the "age of scale". Huge amounts of training data are being used to train enormous deep neural networks (DNNs) on large-scale computers, as epitomized by the rise of large language models (LLMs). The extremely high demand for this technology is evident, as recently exemplified by ChatGPT: an LLM chatbot that garnered 100 million active users merely two months post-release, setting a new world record. However, deploying LLMs can be quite costly, given that their memory footprint can extend to terabytes of data while also demanding high computational resources. Consequently, large-scale distributed computers have become essential, particularly to meet the performance required for interactive applications. To improve efficiency, this project tackles new challenges that are specific to LLMs, including their large memory footprint, varying computational demands, and distributed computing. This is critical to make LLMs more accessible and sustainable for widespread use. Concurrently, this award seeks to develop a diverse AI workforce proficient in algorithms, hardware, and software, achieved through a large-scale AI course for a diverse student population at public universities, comprehensive curriculum integration, and student mentorship at both graduate and undergraduate levels.

This project will enable the codesign of LLMs and distributed computing platforms, divided into three major thrusts that correspond to three levels of the computing stack: software, hardware, and algorithms. Initially, the project will focus on automated partitioning and mapping algorithms, as these form the foundations by which LLMs can be deployed and optimized on both existing and new distributed computing platforms. Key to this research thrust is the development of an extensible hardware performance estimator that can model current GPU-based systems alongside new distributed computing approaches. In particular, the second thrust investigates the use of in-network and near-storage FPGAs within distributed systems to speed up LLM inference. The final thrust investigates platform-aware compression for LLMs, including mixed-precision quantization and low-rank approximation. In addition to improving LLM efficiency across the computing stack, this project will develop a research framework to synergistically co-optimize LLMs and distributed hardware platforms, resulting in new optimized LLM computing systems and implementation methodologies.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project Outputs

Journal articles (0)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Other publications by Mohamed Abdelfattah

Computed tomography vs. cinefluoroscopy for the assessment of mechanical prosthetic valve leaflet motion

DOI:
10.1007/s00380-022-02193-x
Published:
2022-10-27
Journal:
Heart and Vessels
Impact factor:
1.500
Authors:
Mohammad Abdelghani; Mohamed Abdelfattah; Ahmed Mohamed Diab; Hamada Elsheikh; Mohy E. Mansour Elabbady
Corresponding author:
Mohy E. Mansour Elabbady

Investigation and monitoring of rotational landslides in El Mokkattam plateau Egypt, using integrated geological and geophysical techniques

DOI:
10.1016/j.heliyon.2024.e36545
Published:
2024-09-15
Journal:
Heliyon
Impact factor:
Authors:
Mohamed A. Gamal; Mohamed Abdelfattah; George Maher
Corresponding author:
George Maher

Exploring the Limits of Semantic Image Compression at Micro-bits per Pixel

DOI:
10.48550/arxiv.2402.13536
Published:
2024
Journal:
ArXiv
Impact factor:
0
Authors:
Jordan Dotzel; Bahaa Kotb; James Dotzel; Mohamed Abdelfattah; Zhiru Zhang
Corresponding author:
Zhiru Zhang

Monitoring coastal changes in Port Said, Egypt using multi-temporal satellite imagery and GIS-DSAS

DOI:
10.1007/s40808-024-02266-y
Published:
2025-01-04
Journal:
Modeling Earth Systems and Environment
Impact factor:
2.900
Authors:
Hany F. Abd-Elhamid; Mohamed Abdelfattah; Martina Zeleňáková; Abd Elnaby Kabeel; Jacek Barańczuk; Salem S. Gharbia; Mohamed Mahdy
Corresponding author:
Mohamed Mahdy

Comparative Study between Erector Spinae Plane Block versus Intravenous Morphine as Postoperative Analgesia after Spine Surgeries

DOI:
10.21608/ejhm.2024.348925
Published:
2024
Journal:
The Egyptian Journal of Hospital Medicine
Impact factor:
0
Authors:
Khaled Mohamed; Hamza Hassan; Abo Alam; Mahmoud Mohamed Abo; Elhamd Abd; Elrahman; Khaled Abdelfattah; Mohamed Abdelfattah; Mohamed Abo Elhamd; Abd Elrahman
Corresponding author:
Abd Elrahman