EAGER: Harnessing Accurate Bias in Large-Scale Language Models

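Basic Information

Award number:
2141680
Principal investigator:
David Wingate
Amount:
approx. $278,900
Host institution:
Brigham Young University
Host institution country:
United States
Award type:
Standard Grant
Fiscal year:
2021
Funding country:
United States
Start and end dates:
2021-09-01 to 2024-02-29
Project status:
Completed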

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=2141680&HistoricalAwards=false
Keywords:
EAGER Harnessing Accurate Bias Large

Project Abstract

Machine learning models reflect patterns in the data they are trained on, and can, unfortunately, exhibit negative social biases such as prejudice, sexism, or racism. Most research seeks to mitigate this bias, but this work flips the paradigm and explores an alternative by asking: can the bias in machine learning models be harnessed for good? There is strong evidence that some language models exhibit a property called "accurate bias": the patterns captured by the models correlate strongly with human values, judgements, and opinions in ways that are accurately intertwined with time, geography, personal identity, and cultural milieu. In fact, the correlations are so strong and fine-grained that models exhibiting accurate bias can be studied as a surrogate for human subjects, implying researchers can derive actionable insight by experimenting on models in ways that are not possible with humans. By developing a robust methodology and best practices for extracting and analyzing the accurate bias in language models, it is possible to develop new tools for the social sciences, and could revolutionize any field that studies humans, such as psychology, cognitive science, or political science.

To accomplish these goals, this EArly Grant for Exploratory Research (EAGER) will systematically study language models to determine the possibilities and limitations of accurate bias. As an EAGER, these research activities will be highly exploratory, designed to amass preliminary results and develop technical proofs of concept to support future research. The work will blend methods from machine learning and social sciences to develop a preliminary theory of accurate bias, and a suite of accompanying methodological and technical best practices. By studying the feasibility of leveraging accurate bias in large-scale language models, this work could deliver fundamental insights into the values, opinions and thought processes of humans. This work could also deliver insights into how to improve language models, including improving their ability to reason symbolically, and a deeper understanding of the relationship between prompt engineering, data curation, fine-tuning, and the informativity of the final model. Technical elements of our proposal, such as work on prompt engineering and controllable text generation, could have significant applicability outside the context of social science research, and stand on their own right as advances of interest to the machine learning community.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project Outcomes

Journal articles (3)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

An Information-theoretic Approach to Prompt Engineering Without Ground Truth Labels

DOI:
10.18653/v1/2022.acl-long.60
Published:
2022
Journal:
ACL 2022
Impact factor:
0
Authors:
Sorensen, Taylor; Robinson, Joshua; Rytting, Christopher; Shaw, Alexander; Rogers, Kyle; Delorey, Alexia; Khalil, Mahmoud; Fulda, Nancy; Wingate, David
Corresponding author:
Wingate, David

Leveraging Large Language Models for Multiple Choice Question Answering

DOI:
10.48550/arxiv.2210.12353
Published:
2022-10
Journal:
ArXiv
Impact factor:
0
Authors:
Joshua Robinson; Christopher Rytting; D. Wingate
Corresponding authors:
Joshua Robinson; Christopher Rytting; D. Wingate

Out of One, Many: Using Language Models to Simulate Human Samples

DOI:
10.1017/pan.2023.2
Published:
2023-02-21
Journal:
POLITICAL ANALYSIS
Impact factor:
5.4
Authors:
Argyle, Lisa P.; Busby, Ethan C.; Wingate, David
Corresponding author:
Wingate, David

Other publications by David Wingate

Quantitative effect of oral feeding on gastrointestinal myoelectric activity in the conscious dog

DOI:
10.1007/bf01299823
Published:
1979-06-01
Journal:
DIGESTIVE DISEASES AND SCIENCES
Impact factor:
2.500
Authors:
David Wingate; Elizabeth Pearce; Anne Ling; Barbara Boucher; Hilary Thompson; Michael Hutton
Corresponding author:
Michael Hutton

Automated high-speed analysis of gastrointestinal myoelectric activity

DOI:
10.1007/bf01072284
Published:
1977-03-01
Journal:
DIGESTIVE DISEASES AND SCIENCES
Impact factor:
2.500
Authors:
David Wingate; Thomas Barnett; Roger Green; Michael Armstrong-James
Corresponding author:
Michael Armstrong-James

When Benchmarks are Targets: Revealing the Sensitivity of Large Language Model Leaderboards

DOI:
10.48550/arxiv.2402.01781
Published:
2024
Journal:
ArXiv
Impact factor:
0
Authors:
Norah Alzahrani; Hisham Abdullah Alyahya; Sultan Yazeed Alnumay; Shaykhah Alsubaie; Yusef Almushaykeh; Faisal Mirza; Nouf Alotaibi; Nora Altwairesh; Areeb Alowisheq; Saiful Bari; Haidar Khan

Human-robot planar co-manipulation of extended objects: data-driven models and control from human-human dyads

DOI:
10.3389/fnbot.2024.1291694
Published:
2024
Journal:
Frontiers in Neurorobotics
Impact factor:
3.1
Authors:
Erich A. Mielke; Eric C. Townsend; David Wingate; John L. Salmon; Marc D. Killpack
Corresponding author:
Marc D. Killpack