Collaborative Research: FMitF: Track I: Differentiable Probabilistic Programming with Recursive Structured Models

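Basic Information

  • Award number:
    2019291
  • Principal investigator:
  • Amount:
    $375,300
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Standard Grant
  • Fiscal year:
    2020
  • Funding country:
    United States
  • Project period:
    2020-07-01 to 2025-06-30
  • Project status:
    Not yet concluded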

Project Abstract

Symbols (like the letters of the alphabet) and structures (like words formed out of letters) are natural for humans to work with: they are ubiquitous in daily life, they are easy for us to understand, and it is easy to write programs that work with them. But current artificial intelligence (AI) systems learn by making many small changes to see which ones improve the performance of the system; they are therefore good at working with representations that allow small changes, like numbers, and not so good with symbols and structures, like letters and words. This can be an obstacle both to building AI systems and to understanding why they work. A typical way for an AI system to learn to work with symbols and structures is to consider all choices and make small changes to their probabilities. But what if there are not 26 choices, but 26 trillion? For example, the grammatical structure of a sentence can be represented by a tree, one out of a large or even infinite number of possible trees. In such cases -- which are the rule rather than the exception -- one can resort to approximations, like randomly selecting a few thousand possibilities, or one can use carefully constructed algorithms to consider all of them. But it is not easy to do the latter or even to know when it is possible. This project's novelty is to develop a new programming framework to make it easy to code such algorithms, so that writing a program that learns to use trees can be as easy as writing a program that uses trees. If successful, the project's impact is to help make machine learning an everyday part of computer programming, not only for researchers but even for beginners.

This project draws on and contributes to the fields of machine learning, programming languages, and formal language theory. In machine learning, there is growing interest in neural networks that make probabilistic decisions about discrete structures such as trees that represent the possible grammatical structures of a sentence. In programming language research, there has been much work on probabilistic programs and operations on them that preserve meaning exactly. However, in existing frameworks for both neural networks and probabilistic programs, it is still difficult to represent distributions over recursive structures exactly and to efficiently perform operations on them like differentiation. This project uses ideas from formal language theory to bridge this gap, making it easy to work on these distributions exactly and efficiently. The project has three stages: First, it is extending and vectorizing exact transformations on probabilistic programs so that they work on programs parameterized by differentiable tensors. Second, the project is using hyperedge replacement graph grammars (HRGs) to represent distributions over recursive structures. HRGs generalize both graphical models and string/tree automata, providing a single highly expressive formalism for structured models. Methods for efficient inference on HRGs are also being developed. Third, the team is automating the translation of probabilistic code that uses recursive data structures into HRGs. The techniques developed are being implemented in an open-source deep-learning framework.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
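
As a rough illustration of what "carefully constructed algorithms to consider all of them" can mean for trees, the sketch below uses the classic inside algorithm to sum the probabilities of every binary parse tree of a sentence without listing the trees one by one. It is not the project's framework: the toy grammar, its probabilities, and the function names `inside` and `catalan` are assumptions made only for this example.

```python
from collections import defaultdict
from math import comb

# Hypothetical toy grammar in Chomsky normal form (not from the project).
BINARY = {("S", "S", "S"): 0.3}   # rule S -> S S with probability 0.3
LEXICAL = {("S", "a"): 0.7}       # rule S -> a   with probability 0.7


def inside(words, start="S"):
    """Return the total probability of all parse trees of `words`.

    Dynamic programming over spans: chart[i, k, X] holds the summed
    probability of every tree rooted in X that covers words[i:k].
    """
    n = len(words)
    chart = defaultdict(float)

    # Spans of width 1 are filled from the lexical rules.
    for i, word in enumerate(words):
        for (lhs, terminal), prob in LEXICAL.items():
            if terminal == word:
                chart[i, i + 1, lhs] += prob

    # Wider spans combine two adjacent, already-computed sub-spans.
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            k = i + width
            for j in range(i + 1, k):
                for (lhs, left, right), prob in BINARY.items():
                    chart[i, k, lhs] += (
                        prob * chart[i, j, left] * chart[j, k, right]
                    )

    return chart[0, n, start]


def catalan(m):
    """Number of binary trees with m + 1 leaves."""
    return comb(2 * m, m) // (m + 1)


if __name__ == "__main__":
    sentence = ["a"] * 10
    # Each tree over n leaves uses n - 1 binary rules and n lexical rules,
    # so the exact total can be cross-checked against a closed form.
    n = len(sentence)
    closed_form = catalan(n - 1) * 0.3 ** (n - 1) * 0.7 ** n
    print(inside(sentence), closed_form, catalan(n - 1))
```

The chart has on the order of n² spans and each span tries at most n split points, so the work grows polynomially with sentence length even though the number of trees grows exponentially. Because the result is built only from sums and products of the rule probabilities, it is also the kind of quantity that can be differentiated with respect to those probabilities.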

Project Outcomes

Journal articles (1)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Exact Recursive Probabilistic Programming

Other publications by David Chiang

Learning Context-free Languages with Nondeterministic Stack RNNs
Efficiency through Auto-Sizing: Notre Dame NLP’s Submission to the WNGT 2019 Efficiency Task
  • DOI:
    10.18653/v1/d19-5634
  • Publication year:
    2019
  • Journal:
  • Impact factor:
    0
  • Authors:
    Kenton Murray;Brian DuSell;David Chiang
  • Corresponding author:
    David Chiang
Mildly Context-Sensitive Grammars for Estimating Maximum Entropy Parsing Models
  • DOI:
  • Publication year:
    2008
  • Journal:
  • Impact factor:
    0
  • Authors:
    David Chiang
  • Corresponding author:
    David Chiang
Syntax-Based Attention Masking for Neural Machine Translation
  • DOI:
    10.18653/v1/2021.naacl-srw.7
  • Publication year:
    2021
  • Journal:
  • Impact factor:
    0
  • Authors:
    Colin McDonald;David Chiang
  • Corresponding author:
    David Chiang
We're Calling an Intervention: Taking a Closer Look at Language Model Adaptation to Different Types of Linguistic Variation
  • DOI:
    10.48550/arxiv.2404.07304
  • Publication year:
    2024
  • Journal:
  • Impact factor:
    0
  • Authors:
    Aarohi Srivastava;David Chiang
  • Corresponding author:
    David Chiang

Other grants by David Chiang

RI: Small: Learning to Retrieve Structured Information for Summarization and Translation of Unstructured Text
  • Award number:
    2137396
  • Fiscal year:
    2022
  • Award amount:
    $375,300
  • Project type:
    Standard Grant
Collaborative Research: RI: Small: NL(V)P:Natural Language (Variety) Processing
  • Award number:
    2125948
  • Fiscal year:
    2021
  • Award amount:
    $375,300
  • Project type:
    Standard Grant
Collaborative Research: Language Documentation with an Artificial Intelligence (AI) Helper
  • Award number:
    2109709
  • Fiscal year:
    2021
  • Award amount:
    $375,300
  • Project type:
    Standard Grant
RI: Small: Language Induction meets Language Documentation: Leveraging bilingual aligned audio for learning and preserving languages
  • Award number:
    1423406
  • Fiscal year:
    2014
  • Award amount:
    $375,300
  • Project type:
    Continuing Grant
RI: Small: Language Induction meets Language Documentation: Leveraging bilingual aligned audio for learning and preserving languages
  • Award number:
    1464553
  • Fiscal year:
    2014
  • Award amount:
    $375,300
  • Project type:
    Continuing Grant
EAGER: Machine Translation for Language Preservation
  • Award number:
    1144167
  • Fiscal year:
    2011
  • Award amount:
    $375,300
  • Project type:
    Standard Grant
EAGER: Phylo: Phylogenetic Reconstruction of Textual Histories
  • Award number:
    1011778
  • Fiscal year:
    2010
  • Award amount:
    $375,300
  • Project type:
    Standard Grant

Similar NSFC (National Natural Science Foundation of China) grants

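Research on Quantum Field Theory without a Lagrangian Description
  • Award number:
    24ZR1403900
  • Approval year:
    2024
  • Award amount:
    CNY 0
  • Project type:
    Provincial/municipal project
Cell Research
  • Award number:
    31224802
  • Approval year:
    2012
  • Award amount:
    CNY 240,000
  • Project type:
    Special Fund Project
Cell Research
  • Award number:
    31024804
  • Approval year:
    2010
  • Award amount:
    CNY 240,000
  • Project type:
    Special Fund Project
Cell Research
  • Award number:
    30824808
  • Approval year:
    2008
  • Award amount:
    CNY 240,000
  • Project type:
    Special Fund Project
Research on the Rapid Growth Mechanism of KDP Crystal
  • Award number:
    10774081
  • Approval year:
    2007
  • Award amount:
    CNY 450,000
  • Project type:
    General Program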

Similar overseas grants

FMitF: Collaborative Research: RedLeaf: Verified Operating Systems in Rust
  • Award number:
    2313411
  • Fiscal year:
    2023
  • Award amount:
    $375,300
  • Project type:
    Standard Grant
Collaborative Research: FMitF: Track I: Game Theoretic Updates for Network and Cloud Functions
  • Award number:
    2318970
  • Fiscal year:
    2023
  • Award amount:
    $375,300
  • Project type:
    Standard Grant
Collaborative Research: FMitF: Track I: Knitting Semantics
  • Award number:
    2319182
  • Fiscal year:
    2023
  • Award amount:
    $375,300
  • Project type:
    Standard Grant
Collaborative Research: FMitF: Track I: Towards Verified Robustness and Safety in Power System-Informed Neural Networks
  • Award number:
    2319242
  • Fiscal year:
    2023
  • Award amount:
    $375,300
  • Project type:
    Standard Grant
Collaborative Research: FMitF: Track I: DeepSmith: Scheduling with Quality Guarantees for Efficient DNN Model Execution
  • Award number:
    2349461
  • Fiscal year:
    2023
  • Award amount:
    $375,300
  • Project type:
    Standard Grant
Collaborative Research: FMitF: Track I: Towards Verified Robustness and Safety in Power System-Informed Neural Networks
  • Award number:
    2319243
  • Fiscal year:
    2023
  • Award amount:
    $375,300
  • Project type:
    Standard Grant
Collaborative Research: FMitF: Track I: Synthesis and Verification of In-Memory Computing Systems using Formal Methods
  • Award number:
    2319400
  • Fiscal year:
    2023
  • Award amount:
    $375,300
  • Project type:
    Standard Grant
Collaborative Research: FMitF: Track I: Synthesis and Verification of In-Memory Computing Systems using Formal Methods
  • Award number:
    2319399
  • Fiscal year:
    2023
  • Award amount:
    $375,300
  • Project type:
    Standard Grant
Collaborative Research: FMitF: Track I: Simplifying End-to-End Verification of High-Performance Distributed Systems
  • Award number:
    2318954
  • Fiscal year:
    2023
  • Award amount:
    $375,300
  • Project type:
    Standard Grant
Collaborative Research: FMitF: Track I: The Phlox framework for verifying a high-performance distributed database
  • Award number:
    2319167
  • Fiscal year:
    2023
  • Award amount:
    $375,300
  • Project type:
    Standard Grant