MRI: Acquisition of the LanguageLens for Large-Scale Language Modeling
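Basic information
- Award number: 2214708
- Principal investigator:
- Amount: $1.0148 million
- Institution:
- Institution country: United States
- Award type: Standard Grant
- Fiscal year: 2022
- Funding country: United States
- Period: 2022-08-01 to 2025-07-31
- Project status: Not concluded
- Source:
- Keywords: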
Project abstract
Machine learning is revolutionizing many parts of society, but training the very best models requires tremendous computing resources that are often out of reach for academic groups. This project therefore acquires a special-purpose instrument, named the LanguageLens, that is designed to process vast amounts of natural language text. The LanguageLens will support research in natural language processing, deep learning, computational linguistics, crisis informatics, conversational AI, neural machine translation, and legal corpus linguistics, and will enable academic research to advance both the machine learning needed to train large models and societally relevant applications of those models.
The LanguageLens is a high-performance GPU cluster that balances compute, storage, and internode communication to support a variety of demanding NLP-based workloads. The LanguageLens will focus on research projects that have the potential for transformational, interdisciplinary impact across a wide variety of fields. A key area of focus for the instrument is the ability to train new large-scale language models and to examine their inner workings in real time. Language models will be trained with specific downstream applications in mind, on novel corpora as well as with novel neuro-symbolic architectures, to help derive insight from the resulting weights. The LanguageLens will prioritize support for research that addresses pressing societal problems. It will also provide authentic workforce training and educational experiences for students: as the resource gap between industry and academia grows, it is increasingly difficult to give them opportunities to pursue high-impact research that involves huge models and datasets.
Finally, as many companies refuse to release the pretrained weights of their models, a central goal is to make trained weights freely available to everyone, subject to ethical considerations, to drive national impact for both industry and academia. Project resources such as code, publications, datasets, and pretrained models will be available through the LanguageLens website at https://ll.cs.byu.edu/.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by David Wingate
Quantitative effect of oral feeding on gastrointestinal myoelectric activity in the conscious dog
- DOI: 10.1007/bf01299823
- Published: 1979-06-01
- Journal:
- Impact factor: 2.500
- Authors: David Wingate; Elizabeth Pearce; Anne Ling; Barbara Boucher; Hilary Thompson; Michael Hutton
- Corresponding author: Michael Hutton
Automated high-speed analysis of gastrointestinal myoelectric activity
- DOI: 10.1007/bf01072284
- Published: 1977-03-01
- Journal:
- Impact factor: 2.500
- Authors: David Wingate; Thomas Barnett; Roger Green; Michael Armstrong-James
- Corresponding author: Michael Armstrong-James
When Benchmarks are Targets: Revealing the Sensitivity of Large Language Model Leaderboards
- DOI: 10.48550/arxiv.2402.01781
- Published: 2024
- Journal:
- Impact factor: 0
- Authors: Norah Alzahrani; Hisham Abdullah Alyahya; Sultan Yazeed Alnumay; Shaykhah Alsubaie; Yusef Almushaykeh; Faisal Mirza; Nouf Alotaibi; Nora Altwairesh; Areeb Alowisheq; Saiful Bari; Haidar Khan
- Corresponding author:
Human-robot planar co-manipulation of extended objects: data-driven models and control from human-human dyads
- DOI: 10.3389/fnbot.2024.1291694
- Published: 2024
- Journal:
- Impact factor: 3.1
- Authors: Erich A. Mielke; Eric C. Townsend; David Wingate; John L. Salmon; Marc D. Killpack
- Corresponding author: Marc D. Killpack
Abstracts of the Second International Symposium on Brain-Gut Interactions
- DOI: 10.1007/bf01300401
- Published: 1992-06-01
- Journal:
- Impact factor: 2.500
- Authors: Thomas F. Bucks; Yvette Tache; David Wingate
- Corresponding author: David Wingate
Other grants by David Wingate
EAGER: Harnessing Accurate Bias in Large-Scale Language Models
- Award number: 2141680
- Fiscal year: 2021
- Funding amount: $1.0148 million
- Award type: Standard Grant
CAREER: Blending Deep Reinforcement Learning and Probabilistic Programming
- Award number: 1652950
- Fiscal year: 2017
- Funding amount: $1.0148 million
- Award type: Continuing Grant
Similar overseas grants
Doctoral Dissertation Research: Aspect and Event Cognition in the Acquisition and Processing of a Second Language
- Award number: 2337763
- Fiscal year: 2024
- Funding amount: $1.0148 million
- Award type: Standard Grant
Collaborative Research: LTREB: The importance of resource availability, acquisition, and mobilization to the evolution of life history trade-offs in a variable environment.
- Award number: 2338394
- Fiscal year: 2024
- Funding amount: $1.0148 million
- Award type: Continuing Grant
EA: Acquisition of analytical equipment for environmental biogeochemistry and mineralogy
- Award number: 2323242
- Fiscal year: 2024
- Funding amount: $1.0148 million
- Award type: Standard Grant
Doctoral Dissertation Research: Effects of age of acquisition in emerging sign languages
- Award number: 2335955
- Fiscal year: 2024
- Funding amount: $1.0148 million
- Award type: Standard Grant
EA/Ed: Acquisition of a carbon dioxide and methane Cavity Ringdown Spectrometer for education and research
- Award number: 2329285
- Fiscal year: 2024
- Funding amount: $1.0148 million
- Award type: Standard Grant
The effect of AI-assisted summary writing on second language acquisition
- Award number: 24K04154
- Fiscal year: 2024
- Funding amount: $1.0148 million
- Award type: Grant-in-Aid for Scientific Research (C)
Conference: Child Language Acquisition Symposium for Indigenous Communities
- Award number: 2410232
- Fiscal year: 2024
- Funding amount: $1.0148 million
- Award type: Standard Grant
Doctoral Dissertation Research: Effects of non-verbal working memory and spoken first language proficiency on sign language acquisition by deaf second language learners
- Award number: 2336589
- Fiscal year: 2024
- Funding amount: $1.0148 million
- Award type: Standard Grant
Collaborative Research: LTREB: The importance of resource availability, acquisition, and mobilization to the evolution of life history trade-offs in a variable environment.
- Award number: 2338395
- Fiscal year: 2024
- Funding amount: $1.0148 million
- Award type: Continuing Grant
EA: Acquisition of an X-ray Fluorescence Spectrometer for Research, Undergraduate Education, and STEM Outreach
- Award number: 2327202
- Fiscal year: 2024
- Funding amount: $1.0148 million
- Award type: Standard Grant