RI: Medium: Learning Disentangled Representations for Text to Aid Interpretability and Transfer

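Basic Information

Award number:
1901117
Principal investigator:
Byron Wallace
Amount:
$1 million
Host institution:
Northeastern University
Host institution country:
United States
Award type:
Standard Grant
Fiscal year:
2019
Funding country:
United States
Project period:
2019-07-01 to 2024-06-30
Project status:
Completed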

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=1901117&HistoricalAwards=false
Keywords:
RI Medium Learning Disentangled Representations

Project Abstract

Machine learning methods for natural language processing power many technologies that we use on a day-to-day basis, such as spam filters and translation software. The models underlying these techniques have become increasingly sophisticated, yielding improved performance but also increasing complexity. In particular, "neural network" based approaches have re-emerged as the dominant class of machine learning models for language processing. These approaches often perform better than their non-neural counterparts, but also have key downsides. First, training these models requires human effort and time to generate a sufficiently large set of training data in the form of manually annotated text. Second, it is often not obvious whether a model trained on one dataset will generalize to another. Finally, it is hard to discern why such models make the specific predictions that they do, largely because predictions are made on the basis of learned representations of texts which do not naturally afford transparency. This project proposes technical innovations to address these interrelated issues using "disentanglement". The idea is to design models such that the learned representations used to make predictions have known meaning. This approach has the potential to enable re-use of models (increasing efficiency and reducing human costs), and aid interpretability, so that one can have a better idea of why a model made a given prediction.

To realize the above goals of improved interpretability and transferability of models, this work will develop and evaluate new models that learn representations in which certain dimensions are imbued with explicit semantics. This is a departure from current approaches, which indiscriminately code all attributes into a single (entangled) representation. To achieve disentanglement, this project will explore deep generative models and sparse, gated neural encoders. These will use inductive biases and light supervision strategies that guide models toward disentangled representations. For example, models will be penalized if distances in learned embedding spaces do not reflect human judgments concerning the relative similarities of instances with respect to specific aspects of interest. In other cases, "weak" supervision (e.g., rules) may provide adequate guidance for disentanglement. Finally, "probing" tasks constitute a third supervision strategy to be explored: This will involve the use of auxiliary tasks to provide "supervision" that guides individual aspect-wise embeddings of input. The project will develop and evaluate such models for representative problems in natural language processing, specifically: classification, sequence tagging, and summarization. Models will be evaluated both for predictive performance (including their generalizability to new domains and the efficiency with which they do so), and the degree to which learned representations are disentangled and capture the intended aspects.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
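The abstract's example of penalizing a model when embedding distances disagree with human similarity judgments can be illustrated with a hinge-style triplet penalty computed only over the dimensions reserved for one aspect. The following is a minimal sketch under stated assumptions, not the project's actual method: the function names, the fixed `LAYOUT` that assigns dimensions to the aspects "topic" and "sentiment", the margin of 1.0, and the toy vectors are all hypothetical.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def aspect_slice(embedding, aspect, layout):
    """Return the dimensions of `embedding` reserved for `aspect`."""
    start, stop = layout[aspect]
    return embedding[start:stop]

def triplet_penalty(anchor, closer, farther, aspect, layout, margin=1.0):
    """Hinge penalty for one human judgment about one aspect.

    The judgment says: with respect to `aspect`, `anchor` is more similar
    to `closer` than to `farther`. The penalty is zero only when the
    aspect-specific distance to `closer` is smaller by at least `margin`.
    """
    a = aspect_slice(anchor, aspect, layout)
    d_close = euclidean(a, aspect_slice(closer, aspect, layout))
    d_far = euclidean(a, aspect_slice(farther, aspect, layout))
    return max(0.0, margin + d_close - d_far)

# Hypothetical layout: dims 0-1 carry "topic", dims 2-3 carry "sentiment".
LAYOUT = {"topic": (0, 2), "sentiment": (2, 4)}

# Toy embeddings (assumed values, for illustration only).
anchor = [0.0, 0.0, 1.0, 1.0]
same_topic = [0.0, 0.0, -1.0, -1.0]   # same topic dims, opposite sentiment dims
other_topic = [3.0, 4.0, 1.0, 1.0]    # different topic dims, same sentiment dims

# Judgment consistent with the topic dimensions: no penalty.
satisfied = triplet_penalty(anchor, same_topic, other_topic, "topic", LAYOUT)

# The reverse judgment contradicts the topic dimensions: positive penalty.
violated = triplet_penalty(anchor, other_topic, same_topic, "topic", LAYOUT)

print(satisfied, violated)
```

Because the distance is taken over an aspect's slice only, the same pair of texts can be close with respect to one aspect and far with respect to another, which is the property a disentangled representation is meant to have. In a trained model this penalty would be one term in the loss and the embeddings would come from an encoder; here they are fixed lists so the arithmetic can be checked by hand.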

Project Outcomes

Journal articles (9)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Interpretability Analysis for Named Entity Recognition to Understand System Predictions and How They Can Improve

DOI:
10.1162/coli_a_00397
Publication date:
2021-03-01
Journal:
COMPUTATIONAL LINGUISTICS
Impact factor:
9.3
Authors:
Agarwal, Oshin;Yang, Yinfei;Nenkova, Ani
Corresponding author:
Nenkova, Ani

Rate-Regularization and Generalization in Variational Autoencoders

DOI:
Publication date:
2021
Journal:
Proceedings of The 24th International Conference on Artificial Intelligence and Statistics
Impact factor:
0
Authors:
Bozkurt, A;Esmaeili, B.;Tristan, J.-B.;Brooks, D.;Dy, J.;van de Meent, J.-W.
Corresponding author:
van de Meent, J.-W.

That’s the Wrong Lung! Evaluating and Improving the Interpretability of Unsupervised Multimodal Encoders for Medical Data

DOI:
10.48550/arxiv.2210.06565
Publication date:
2022-10
Journal:
Proceedings of the Conference on Empirical Methods in Natural Language Processing. Conference on Empirical Methods in Natural Language Processing
Impact factor:
0
Authors:
Denis Jered McInerney;Geoffrey S. Young;Jan-Willem van de Meent;Byron Wallace
Corresponding author:
Denis Jered McInerney;Geoffrey S. Young;Jan-Willem van de Meent;Byron Wallace

Biomedical Interpretable Entity Representations

DOI:
Publication date:
2021
Journal:
Proceedings of the Association for Computational Linguistics (ACL
Impact factor:
0
Authors:
Garcia-Olano, Diego;Onoe, Yasumasa;Baldini, Ioana;Ghosh, Joydeep;Wallace, Byron C.;Varshney, Kush
Corresponding author:
Varshney, Kush

Automatically Summarizing Evidence from Clinical Trials: A Prototype Highlighting Current Challenges

DOI:
10.48550/arxiv.2303.05392
Publication date:
2023-03
Journal:
Proceedings of the conference. Association for Computational Linguistics. Meeting
Impact factor:
0
Authors:
S. Ramprasad;Denis Jered McInerney;Iain J. Marshall;Byron Wallace
Corresponding author:
S. Ramprasad;Denis Jered McInerney;Iain J. Marshall;Byron Wallace

Other publications by Byron Wallace

Edinburgh Research Explorer Living systematic reviews

DOI:
Publication date:
Journal:
Impact factor:
0
Authors:
James Thomas;Anna Noel;Iain J Marshall;Byron Wallace;Steven McDonald;Chris Mavergames;Paul Glasziou;I. Shemilt;Anneliese J Synnot;Tari Turner;Julian H. Elliott
Corresponding author:
Julian H. Elliott