CRII: RI: Can Low-Bias Machine Learners Acquire English Grammar? Deep Learning and Linguistic Acceptability
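Basic information
- Award number: 1850208
- Principal investigator:
- Amount: $174,900
- Host institution:
- Host institution country: United States
- Project type: Standard Grant
- Fiscal year: 2019
- Funding country: United States
- Period: 2019-03-15 to 2021-02-28
- Project status: Completed
- Source:
- Keywords:
Project abstract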
Widely deployed applications of language technology, such as translation systems and smart assistants, rely heavily on machine learning models for sentence understanding. These models learn to understand language from data, which can often be as simple as a collection of published books or a download of Wikipedia, rather than through any kind of manual engineering or hands-on guidance by linguistic experts. While modern machine learning methods are quite effective, they are not perfect. When they fail to understand some text, it can be difficult to discover why, and even more difficult to craft interventions to address those failures. This CISE Research Initiation Initiative (CRII) project develops tools to help use methods and insights from research in linguistic science to analyze and refine machine learning systems for sentence understanding. The project should have a practical impact in making it easier to develop effective language technologies, a scientific impact in helping linguists use machine learning as a proxy to study human language learning, and a training impact in supporting several PhD students, through both research seminars and direct research collaborations, as they develop into experts in the interaction between linguistic science and language technology.
The methods used in this project rely on the human ability to judge the grammatical acceptability of a sentence, i.e., to decide whether someone could ever use a given sequence of words to say something. The project has three parts: (1) to build a large acceptability-based dataset for English which evaluates machine learning systems on their linguistic knowledge; (2) to use this data to evaluate widely used standard approaches to machine learning for language, with a focus on promising recent approaches that use artificial neural networks to learn from plain text; and (3) to develop methods for using small custom datasets to directly repair any gaps in the knowledge that these machine learning models acquire. Analyzing and improving artificial neural networks is difficult, since their internal representations of language are continuous and, at least superficially, do not at all resemble the kinds of representations that linguists use to analyze language. The investigators' methods, which rely on converging evidence from multiple ways of using the same data in their experiments, are designed to minimize this difficulty.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Project outcomes
Journal articles (9)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Can Neural Networks Acquire a Structural Bias from Raw Linguistic Data?
- DOI:
- Published: 2020
- Journal:
- Impact factor: 0
- Authors: Warstadt, Alex; Bowman, Samuel R.
- Corresponding author: Bowman, Samuel R.
Learning Which Features Matter: RoBERTa Acquires a Preference for Linguistic Generalizations (Eventually)
- DOI: 10.18653/v1/2020.emnlp-main.16
- Published: 2020
- Journal:
- Impact factor: 0
- Authors: Warstadt, Alex; Zhang, Yian; Li, Xiaocheng; Liu, Haokun; Bowman, Samuel R.
- Corresponding author: Bowman, Samuel R.
Neural Network Acceptability Judgments
- DOI: 10.1162/tacl_a_00290
- Published: 2019-01-01
- Journal:
- Impact factor: 10.9
- Authors: Warstadt, Alex; Singh, Amanpreet; Bowman, Samuel R.
- Corresponding author: Bowman, Samuel R.
Investigating BERT’s Knowledge of Language: Five Analysis Methods with NPIs
- DOI: 10.18653/v1/d19-1286
- Published: 2019
- Journal:
- Impact factor: 0
- Authors: Warstadt, Alex; Cao, Yu; Grosu, Ioana; Peng, Wei; Blix, Hagen; Nie, Yining; Alsop, Anna; Bordia, Shikha; Liu, Haokun; Parrish, Alicia
- Corresponding author: Parrish, Alicia
What Ingredients Make for an Effective Crowdsourcing Protocol for Difficult NLU Data Collection Tasks?
- DOI: 10.18653/v1/2021.acl-long.98
- Published: 2021-06
- Journal:
- Impact factor: 0
- Authors: Nikita Nangia; Saku Sugawara; H. Trivedi; Alex Warstadt; Clara Vania; Sam Bowman
- Corresponding author: Nikita Nangia; Saku Sugawara; H. Trivedi; Alex Warstadt; Clara Vania; Sam Bowman
Other publications by Samuel Bowman
FarFetched: An Entity-centric Approach for Reasoning on Textually Represented Environments
- DOI:
- Published: 2021
- Journal:
- Impact factor: 0
- Authors: Colin Raffel;Noam M. Shazeer;A. Roberts;K. Lee;Sharan Narang;Michael Matena;Yanqi;Wei Zhou;J. LiPeter;Liu. 2020;Exploring;Jim Webber;A. programmatic;Adina Williams;Nikita Nangia;Samuel Bowman
- Corresponding author: Samuel Bowman
Reactive Transport and Péclet Number Analysis of Hydrogen Flux Pathways in Uniform Clay Matrix: Implications for Underground Storage
- DOI: 10.1007/s11242-025-02200-5
- Published: 2025-07-03
- Journal:
- Impact factor: 2.600
- Authors: Samuel Bowman; Arkajyoti Pathak; Shikha Sharma
- Corresponding author: Shikha Sharma
Effect of Ionic Strength on H2O and Si-Species Stability Field Geometry in pH-Eh Space
- DOI: 10.1007/s10498-023-09417-0
- Published: 2023
- Journal:
- Impact factor: 1.6
- Authors: Samuel Bowman; Arkajyoti Pathak; V. Agrawal; Shikha Sharma
- Corresponding author: Shikha Sharma
What Makes Machine Reading Comprehension Questions Difficult? Investigating Variation in Passage Sources and Question Types
- DOI:
- Published: 2021
- Journal:
- Impact factor: 0
- Authors: Susan Bartlett;Grzegorz Kondrak;Max Bartolo;Alastair Roberts;Johannes Welbl;Steven Bird;Ewan Klein;E. Loper;Samuel Bowman;George Dahl. 2021;What;Chao Pang;Junyuan Shang;Jiaxiang Liu;Xuyi Chen;Yanbin Zhao;Yuxiang Lu;Weixin Liu;Z. Wu;Weibao Gong;Jianzhong Liang;Zhizhou Shang;Peng Sun;Ouyang Xuan;Dianhai;Hao Tian;Hua Wu;Haifeng Wang;Adam Trischler;Tong Wang;Xingdi Yuan;Justin Har;Alessandro Sordoni;Philip Bachman;Adina Williams;Nikita Nangia;Zhilin Yang;Peng Qi;Saizheng Zhang;Y. Bengio;ing. In
- Corresponding author: ing. In
Other grants held by Samuel Bowman
CAREER: High-Agreement Crowdsourcing for Difficult Language-Understanding Tasks
- Award number: 2046556
- Fiscal year: 2021
- Amount: $174,900
- Project type: Standard Grant
The 2018 NAACL Student Research Workshop
- Award number: 1803423
- Fiscal year: 2018
- Amount: $174,900
- Project type: Standard Grant
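Similar NSFC (National Natural Science Foundation of China) grants
Multi-target regulation of the PI3K/Akt pathway by Xingnaojing to inhibit oxidative stress in CI/RI: a study based on network pharmacology and in vivo and in vitro experiments
- Award number: 2025JJ90117
- Approval year: 2025
- Funding: CNY 0
- Project type: Provincial/municipal project
Mechanism of the IgA-FcαRI-mediated Syk/NLRP3/caspase-1 pathway in linear IgA bullous dermatosis
- Award number:
- Approval year: 2024
- Funding: CNY 0
- Project type: Provincial/municipal project
Establishing vascularization homeostasis in oral mucosa equivalents via a dual-modified ANG-RNH1 system that suppresses RI complex formation
- Award number: 82401112
- Approval year: 2024
- Funding: CNY 300,000
- Project type: Young Scientists Fund
Regulation of the membrane receptor TβRI by the extracellular domain of the transmembrane protein LRP5 to promote homing and differentiation of BMSCs on titanium surfaces
- Award number: 82301120
- Approval year: 2023
- Funding: CNY 300,000
- Project type: Young Scientists Fund
Anti-inflammatory mechanism by which eye acupuncture activates MCs and targets H3R to regulate "immune surveillance" in CI/RI rats, explored through the "immune-neural" network
- Award number: 82374375
- Approval year: 2023
- Funding: CNY 510,000
- Project type: General Program
Mechanism by which Dectin-2 exacerbates asthma attacks by promoting FcεRI aggregation and mast cell activation
- Award number: 82300022
- Approval year: 2023
- Funding: CNY 300,000
- Project type: Young Scientists Fund
Discovery of β-carboline alkaloid TβRI inhibitors from the Tibetan medicine Arenaria kansuensis and their mechanism of action against pulmonary fibrosis
- Award number:
- Approval year: 2022
- Funding: CNY 300,000
- Project type: Young Scientists Fund
Role and mechanism of nCs in promoting the jawbone osteogenic response through TβRI binding to and phosphorylating Axin
- Award number: 2022J011347
- Approval year: 2022
- Funding: CNY 100,000
- Project type: Provincial/municipal project
Role and mechanism of TβRI UFMylation in regulating the TGF-β signaling pathway and breast cancer metastasis
- Award number:
- Approval year: 2022
- Funding: CNY 300,000
- Project type: Young Scientists Fund
Mechanism of bronchial asthma and traditional Chinese medicine intervention, studied through FcεRI signaling pathway-mediated mast cell degranulation
- Award number: 2022JJ70115
- Approval year: 2022
- Funding: CNY 0
- Project type: Provincial/municipal project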
Similar overseas grants
Evaluation of the effect of low-phosphorus stress response by plant on "rhizosphere" using RI imaging technology
- Award number: 22K05373
- Fiscal year: 2022
- Amount: $174,900
- Project type: Grant-in-Aid for Scientific Research (C)
RI: Small: Uncertainty Quantification for Nonconvex Low-Complexity Models
- Award number: 2218773
- Fiscal year: 2022
- Amount: $174,900
- Project type: Standard Grant
RI: Small: Uncertainty Quantification for Nonconvex Low-Complexity Models
- Award number: 2100158
- Fiscal year: 2021
- Amount: $174,900
- Project type: Standard Grant
CRII: RI: Learning with Low-Quality Visual Data: Handling Both Passive and Active Degradations
- Award number: 2053269
- Fiscal year: 2020
- Amount: $174,900
- Project type: Standard Grant
RI: Small: Low-Latency and High-Quality Simultaneous Translation
- Award number: 2009071
- Fiscal year: 2020
- Amount: $174,900
- Project type: Standard Grant
CRII: RI: Learning with Low-Quality Visual Data: Handling Both Passive and Active Degradations
- Award number: 1755701
- Fiscal year: 2018
- Amount: $174,900
- Project type: Standard Grant
RI: Small: Creating Text-to-Speech Synthesis for Low Resource Languages
- Award number: 1717680
- Fiscal year: 2017
- Amount: $174,900
- Project type: Standard Grant
RI: Small: Low Cost Technologies to Improve the Quality of 3D Scanning
- Award number: 1717355
- Fiscal year: 2017
- Amount: $174,900
- Project type: Standard Grant
RI: Small: Collaborative Research: Structured Inference for Low-Level Vision
- Award number: 1820693
- Fiscal year: 2017
- Amount: $174,900
- Project type: Standard Grant
RI: Small: Collaborative Research: Structured Inference for Low-Level Vision
- Award number: 1618227
- Fiscal year: 2016
- Amount: $174,900
- Project type: Standard Grant