
Collaborative Research: FMitF: Track I: Differentiable Probabilistic Programming with Recursive Structured Models

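Basic information

Award number:
2019266
Principal investigator:
Chung-chieh Shan
Amount:
approx. $374,000
Host institution:
Indiana University
Host institution country:
United States
Project type:
Standard Grant
Fiscal year:
2020
Funding country:
United States
Start and end dates:
2020-07-01 to 2025-06-30
Project status:
Not yet concluded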

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=2019266&HistoricalAwards=false
Keywords:
Collaborative Research FMitF Track Differentiable

Project abstract

Symbols (like the letters of the alphabet) and structures (like words formed out of letters) are natural for humans to work with: they are ubiquitous in daily life, they are easy for us to understand, and it is easy to write programs that work with them. But current artificial intelligence (AI) systems learn by making many small changes to see which ones improve the performance of the system; they are therefore good at working with representations that allow small changes, like numbers, and not so good with symbols and structures, like letters and words. This can be an obstacle both to building AI systems and to understanding why they work. A typical way for an AI system to learn to work with symbols and structures is to consider all choices and make small changes to their probabilities. But what if there are not 26 choices, but 26 trillion? For example, the grammatical structure of a sentence can be represented by a tree, one out of a large or even infinite number of possible trees. In such cases -- which are the rule rather than the exception -- one can resort to approximations, like randomly selecting a few thousand possibilities, or one can use carefully constructed algorithms to consider all of them. But it is not easy to do the latter or even to know when it is possible. This project's novelty is to develop a new programming framework to make it easy to code such algorithms, so that writing a program that learns to use trees can be as easy as writing a program that uses trees. If successful, the project's impact is to help make machine learning an everyday part of computer programming, not only for researchers but even for beginners.

This project draws on and contributes to the fields of machine learning, programming languages, and formal language theory. In machine learning, there is growing interest in neural networks that make probabilistic decisions about discrete structures such as trees that represent the possible grammatical structures of a sentence. In programming language research, there has been much work on probabilistic programs and operations on them that preserve meaning exactly. However, in existing frameworks for both neural networks and probabilistic programs, it is still difficult to represent distributions over recursive structures exactly and to efficiently perform operations on them like differentiation. This project uses ideas from formal language theory to bridge this gap, making it easy to work on these distributions exactly and efficiently. The project has three stages: First, it is extending and vectorizing exact transformations on probabilistic programs so that they work on programs parameterized by differentiable tensors. Second, the project is using hyperedge replacement graph grammars (HRGs) to represent distributions over recursive structures. HRGs generalize both graphical models and string/tree automata, providing a single highly expressive formalism for structured models. Methods for efficient inference on HRGs are also being developed. Third, the team is automating the translation of probabilistic code that uses recursive data structures into HRGs. The techniques developed are being implemented in an open-source deep-learning framework.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
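
The abstract's contrast between sampling a few possibilities and using a carefully constructed algorithm to consider all of them can be made concrete with the classic inside algorithm for probabilistic context-free grammars. The sketch below is not the project's framework or code: the toy grammar, the rule probabilities, and the function names `inside` and `catalan` are assumptions chosen only for illustration. It sums the probabilities of every parse tree of a sentence by dynamic programming, without enumerating the trees one by one.

```python
import math
from collections import defaultdict

# Hypothetical toy grammar in Chomsky normal form (illustrative only).
BINARY = {("S", "S", "S"): 0.4}   # S -> S S   with probability 0.4
LEXICAL = {("S", "a"): 0.6}       # S -> "a"   with probability 0.6


def inside(words, binary=BINARY, lexical=LEXICAL, root="S"):
    """Total probability of all parse trees of `words` rooted at `root`."""
    n = len(words)
    # chart[i, j, A] = total probability that symbol A derives words[i:j]
    chart = defaultdict(float)
    for i, word in enumerate(words):
        for (parent, w), p in lexical.items():
            if w == word:
                chart[i, i + 1, parent] += p
    # Build wider spans from narrower ones, sharing work across trees.
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):
                for (parent, left, right), p in binary.items():
                    chart[i, j, parent] += (
                        p * chart[i, k, left] * chart[k, j, right]
                    )
    return chart[0, n, root]


def catalan(n):
    """Number of binary trees with n internal nodes."""
    return math.comb(2 * n, n) // (n + 1)


if __name__ == "__main__":
    n = 10
    total = inside(["a"] * n)
    # Each tree over n words uses n-1 binary rules and n lexical rules,
    # so the exact sum must equal (number of trees) * 0.4^(n-1) * 0.6^n.
    expected = catalan(n - 1) * 0.4 ** (n - 1) * 0.6 ** n
    print(total, expected)
```

The number of trees grows with the Catalan numbers, yet the chart is filled in time cubic in the sentence length. Because the result is a polynomial in the rule probabilities, it could in principle be differentiated with respect to them; the HRG-based methods the abstract describes are more general than this sketch.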


Project outputs

Journal articles (1)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Exact Recursive Probabilistic Programming

DOI:
10.1145/3586050
Publication date:
2022-10
Journal:
Proceedings of the ACM on Programming Languages
Impact factor:
0
Authors:
David Chiang; Colin McDonald; Chung-chieh Shan
Corresponding authors:
David Chiang; Colin McDonald; Chung-chieh Shan

Other publications by Chung-chieh Shan

マルチトラック文字列に対するパターン発見について

On pattern discovery for multi-track strings

DOI:
Publication date:
2011
Journal:
Impact factor:
0
Authors:
Sebastian Fischer; Michael Hanus; Yukiyoshi Kameyama; Chung-chieh Shan; Naoki Takashima; 赤松和土; 桂敬史, 成澤和志, 篠原歩
Corresponding authors:
桂敬史, 成澤和志, 篠原歩

Closing the Stage: From Staged Code to Typed Closures

DOI:
Publication date:
2008
Journal:
Proc. ACM SIGPLAN Workshop on Partial Evaluation and Program Manipulation (PEPM08)
Impact factor:
0
Authors:
Yukiyoshi Kameyama; Oleg Kiselyov; Chung-chieh Shan
Corresponding author:
Chung-chieh Shan