CRII: SHF: Towards the Construction of a Model for Natural Language and Source Code
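Basic information
- Award number: 1850412
- Principal investigator:
- Amount: $174,500
- Host institution:
- Host institution country: United States
- Project type: Standard Grant
- Fiscal year: 2019
- Funding country: United States
- Period: 2019-05-01 to 2022-04-30
- Project status: Completed
- Source:
- Keywords:
Project abstract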
Source code is written using a combination of human languages, such as English, and programming languages. Developers use a combination of the rules for human languages and programming languages to understand code. The act of trying to understand code is referred to as program comprehension; it is an activity that precedes all other programming-related activities a developer might undertake when coding. For example, before fixing a bug, a developer needs to understand the code where the bug is present; to add a new software feature, a developer must understand the code that will support the new feature. If a piece of code is highly comprehensible, then developers will have an easier time maintaining, debugging, and adding to it. To support comprehension, research must attempt to formally model how human language describes program behavior. With such a model, source code could be optimized to be maximally understandable by automatically improving, or generating, human language to best describe it. This project aims to build such a model by combining natural-language part-of-speech information with a model of program behavior to assist, improve, and measure comprehension.
This project aims to formally model how human language describes source code behavior. This will be achieved by combining a static-analysis-based taxonomy of identifier type categorizations with natural language techniques and identifier definition-use chains. The combination of these three activities allows the model to measure 1) how the type constrains the behavior of an identifier, 2) what role, in English, the words in an identifier correlate to, and 3) what function calls the identifier is used in. These will allow the model to understand how the English of an identifier relates to the usage (function calls) and behavior constraints (type constraints). The goal of this model is to formally measure the way human languages are used to describe source code behavior such that it could be used to train a machine to do the same. The completed model will increase the current understanding of how developers express program behavior through human languages and allow for this expression to be measurably optimized for increased comprehensibility. Additionally, the model will improve modern program comprehension techniques by allowing them to be more aware of how the underlying source code structure and rules influence the way human languages are used to describe program behavior.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
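The three signals named in the abstract (the identifier's type, the part-of-speech role of the words in its name, and the function calls it is used in) can be pictured with a small sketch. This is only an illustration under stated assumptions, not the project's actual model: the `TOY_POS` lexicon, the tag names, and the `IdentifierRecord` class are all hypothetical, and a real system would use a trained tagger and static analysis rather than a hand-written lookup.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical, context-free toy lexicon (assumption for illustration only).
# A real approach would use a part-of-speech tagger tuned for identifiers.
TOY_POS: Dict[str, str] = {
    "get": "V", "set": "V", "is": "V",
    "max": "ADJ", "valid": "ADJ",
    "user": "N", "name": "N", "count": "N",
}


def split_identifier(name: str) -> List[str]:
    """Split a camelCase or snake_case identifier into lowercase words."""
    spaced = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", name).replace("_", " ")
    return [word.lower() for word in spaced.split()]


def grammar_pattern(name: str, lexicon: Dict[str, str] = TOY_POS) -> str:
    """Map each word of the identifier to a tag; unknown words become '?'."""
    return " ".join(lexicon.get(word, "?") for word in split_identifier(name))


@dataclass
class IdentifierRecord:
    """One identifier with its declared type and the calls that use it."""
    name: str
    declared_type: str
    called_in: List[str] = field(default_factory=list)

    def features(self) -> dict:
        # Combine the three signals into a single feature record.
        return {
            "words": split_identifier(self.name),
            "grammar_pattern": grammar_pattern(self.name),
            "type": self.declared_type,
            "calls": sorted(self.called_in),
        }


if __name__ == "__main__":
    record = IdentifierRecord("maxUserCount", "int", ["resize", "allocate"])
    print(record.features())
```

A record like this places the name's words, the type, and the usage sites side by side, which is the kind of joint view the abstract describes; how those signals are actually weighted or learned is left open here.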
Project outcomes
Journal articles (9)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
An Open Dataset of Abbreviations and Expansions
- DOI: 10.1109/icsme.2019.00041
- Published: 2019
- Journal:
- Impact factor: 0
- Authors: Newman, Christian; Decker, Michael John; AlSuhaibani, Reem S; Peruma, Anthony; Kaushik, Dishant; Hill, Emily
- Corresponding author: Hill, Emily
IDEAL: An Open-Source Identifier Name Appraisal Tool
- DOI: 10.1109/icsme52107.2021.00064
- Published: 2021
- Journal:
- Impact factor: 0
- Authors: Peruma, Anthony; Arnaoudova, Venera; Newman, Christian D.
- Corresponding author: Newman, Christian D.
On the Generation, Structure, and Semantics of Grammar Patterns in Source Code Identifiers
- DOI: 10.1016/j.jss.2020.110740
- Published: 2020-07
- Journal:
- Impact factor: 0
- Authors: Christian D. Newman; Reem S. Alsuhaibani; M. J. Decker; Anthony S Peruma; D. Kaushik; Mohamed Wiem Mkaouer
- Corresponding author: Christian D. Newman; Reem S. Alsuhaibani; M. J. Decker; Anthony S Peruma; D. Kaushik; Mohamed Wiem Mkaouer
Modeling the Relationship Between Identifier Name and Behavior
- DOI: 10.1109/icsme.2019.00062
- Published: 2019
- Journal:
- Impact factor: 0
- Authors: Newman, Christian D.; Peruma, Anthony; AlSuhaibani, Reem
- Corresponding author: AlSuhaibani, Reem
Contextualizing rename decisions using refactorings, commit messages, and data types
- DOI: 10.1016/j.jss.2020.110704
- Published: 2020-11
- Journal:
- Impact factor: 0
- Authors: Anthony S Peruma; Mohamed Wiem Mkaouer; M. J. Decker; Christian D. Newman
- Corresponding author: Anthony S Peruma; Mohamed Wiem Mkaouer; M. J. Decker; Christian D. Newman
Other publications by Christian Newman
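Similar NSFC grants
Structure-activity analysis and antibacterial mechanism of peptides derived from the natural ultrashort antimicrobial peptide Temporin-SHf
- Award number:
- Year approved: 2024
- Amount: CNY 0
- Project type: Provincial/municipal project
Function and mechanism of the adaptor protein SHF in negatively regulating EGFR/EGFRvIII recycling and stability in glioblastoma
- Award number: 82302939
- Year approved: 2023
- Amount: CNY 300,000
- Project type: Young Scientists Fund
Mechanism of action of the EGFR/GRβ/Shf regulatory loop in glioma
- Award number: 81572468
- Year approved: 2015
- Amount: CNY 600,000
- Project type: General Program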
Similar overseas grants
Collaborative Research: SHF: Medium: Towards Harmonious Federated Intelligence in Heterogeneous Edge Computing via Data Migration
- Award number: 2312617
- Fiscal year: 2023
- Amount: $174,500
- Project type: Continuing Grant
Collaborative Research: SHF: Medium: Towards Harmonious Federated Intelligence in Heterogeneous Edge Computing via Data Migration
- Award number: 2312616
- Fiscal year: 2023
- Amount: $174,500
- Project type: Continuing Grant
SHF: Medium: Cross-Stack Algorithm-Hardware-Systems Optimization Towards Ubiquitous On-Device 3D Intelligence
- Award number: 2312758
- Fiscal year: 2023
- Amount: $174,500
- Project type: Continuing Grant
CCF: SHF: CORE: Small: Towards Systematic Quality Control of Physically Unclonable Functions (PUFs)
- Award number: 2244479
- Fiscal year: 2023
- Amount: $174,500
- Project type: Standard Grant
Collaborative Research: SHF: Small: Towards Robust Deep Learning Computing on GPUs
- Award number: 2301940
- Fiscal year: 2022
- Amount: $174,500
- Project type: Standard Grant
Collaborative Research: SHF: Small: Towards Variability-Aware Software Analysis and Testing
- Award number: 2211589
- Fiscal year: 2022
- Amount: $174,500
- Project type: Standard Grant
Collaborative Research: SHF: Medium: Towards More Human-like AI Models of Source Code
- Award number: 2211429
- Fiscal year: 2022
- Amount: $174,500
- Project type: Continuing Grant
Collaborative Research: SHF: Medium: Towards More Human-like AI Models of Source Code
- Award number: 2211428
- Fiscal year: 2022
- Amount: $174,500
- Project type: Continuing Grant
Collaborative Research: SHF: Small: Towards Variability-Aware Software Analysis and Testing
- Award number: 2211588
- Fiscal year: 2022
- Amount: $174,500
- Project type: Standard Grant
SHF: Small: Towards High Performance Serverless Edge Computing for Data-intensive Applications
- Award number: 2230620
- Fiscal year: 2022
- Amount: $174,500
- Project type: Standard Grant













