EAGER: Statistical Modeling of Linguistic Change in Open Source Software
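Basic information
- Award number: 1821525
- Principal investigator:
- Amount: $63,100
- Host institution:
- Host institution country: United States
- Project type: Standard Grant
- Fiscal year: 2018
- Funding country: United States
- Period: 2018-05-01 to 2021-04-30
- Project status: Completed
- Source:
- Keywords: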
Project abstract
The project explores a theory of open source software (OSS) evolution based on statistical natural language processing techniques. Building on the emerging recognition that software code is, in many ways, as "natural" as natural language (e.g., English), there is a trend to apply statistical models to software development tasks such as code analysis, comprehension, and programmer support. This grant extends the "naturalness of code" theory by studying how the code lexicon evolves in open source software as different developers work on a project and features are added, modified, and deleted. The goal is to learn the extent to which the evolution of a developer's lexicon follows the laws of natural language evolution.
To create the needed demonstration, large datasets of code lexicons are being collected from a large number of OSS projects and their revisions (on GitHub and SourceForge). The main constructs of the frequency model of natural language evolution will be applied to track and identify the main patterns of language change (e.g., birth, propagation, and death of terms in the lexicon) throughout an OSS project's life cycle. Part of the challenge is to better understand how events that instigate code evolution, such as maintenance activities and team formation, differ fundamentally from the events that instigate change in natural language, such as war and migration. The research should lead to new ways to predict software project outcomes and to improve software productivity and quality. The project will make available the data, tools, and algorithms it produces, which will support future work to understand the dynamics of code evolution in open source software ecosystems.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
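The birth and death of lexicon terms mentioned in the abstract could be tracked roughly as follows. This is only an illustrative sketch, not the project's actual tooling: the function names (`split_identifier`, `lexicon`, `term_lifespans`), the identifier-splitting rules, and the definition of "death" as the last revision in which a term appears are all assumptions made for this example. It also counts language keywords as terms, which a real analysis would likely filter out.

```python
import re
from collections import Counter


def split_identifier(identifier):
    """Split a snake_case or camelCase identifier into lowercase terms."""
    terms = []
    for part in identifier.split("_"):
        # Acronym runs (e.g. "HTTP") or capitalized/lowercase words; digits are ignored.
        terms.extend(re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+", part))
    return [term.lower() for term in terms]


def lexicon(source):
    """Count term frequencies over all identifier-like tokens in a source string."""
    counts = Counter()
    for identifier in re.findall(r"[A-Za-z_][A-Za-z0-9_]*", source):
        counts.update(split_identifier(identifier))
    return counts


def term_lifespans(revisions):
    """Map each term to (birth, death) revision indices.

    birth is the first revision containing the term; death is the last
    revision containing it, or None if the term is still present in the
    final revision (hypothetical definition for this sketch).
    """
    first_seen, last_seen = {}, {}
    for index, source in enumerate(revisions):
        for term in lexicon(source):
            first_seen.setdefault(term, index)
            last_seen[term] = index
    final = len(revisions) - 1
    return {
        term: (first_seen[term], None if last_seen[term] == final else last_seen[term])
        for term in first_seen
    }
```

Calling `term_lifespans` on a list of source snapshots ordered from oldest to newest gives one `(birth, death)` pair per term; the per-revision `Counter` returned by `lexicon` could likewise feed a frequency-over-time analysis of term propagation.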
Project outputs
Journal articles (4)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Linguistic Documentation of Software History
- DOI: 10.1145/3387904.3389288
- Published: 2020
- Journal:
- Impact factor: 0
- Authors: Tushev, M; Mahmoud, A.
- Corresponding author: Mahmoud, A.
On Combining IR Methods to Improve Bug Localization
- DOI: 10.1145/3387904.3389280
- Published: 2020-07
- Journal:
- Impact factor: 0
- Authors: Saket Khatiwada; Miroslav Tushev; Anas Mahmoud
- Corresponding authors: Saket Khatiwada; Miroslav Tushev; Anas Mahmoud
Using GitHub in large software engineering classes. An exploratory case study
- DOI: 10.1080/08993408.2019.1696168
- Published: 2019
- Journal:
- Impact factor: 2.7
- Authors: Tushev, Miroslav; Williams, Grant; Mahmoud, Anas
- Corresponding author: Mahmoud, Anas
Linguistic Change in Open Source Software
- DOI: 10.1109/icsme.2019.00045
- Published: 2019
- Journal:
- Impact factor: 0
- Authors: Tushev, Miroslav; Khatiwada, Saket; Mahmoud, Anas
- Corresponding author: Mahmoud, Anas
Other publications by Anas Mahmoud
VANETs Positioning in Urban Environments: A Novel Cooperative Approach
- DOI: 10.1109/vtcfall.2015.7391188
- Published: 2015
- Journal:
- Impact factor: 0
- Authors: Anas Mahmoud; A. Noureldin; H. Hassanein
- Corresponding author: H. Hassanein
An information theoretic approach for extracting and tracing non-functional requirements
- DOI: 10.1109/re.2015.7320406
- Published: 2015-11
- Journal:
- Impact factor: 0
- Authors: Anas Mahmoud
- Corresponding author: Anas Mahmoud
Rhabdomyosarcoma in Adults: De Novo or Conversion From Non-seminomas?
- DOI: 10.7759/cureus.55449
- Published: 2024
- Journal:
- Impact factor: 0
- Authors: M. Ghrewati; Anas Mahmoud; Tala Beliani; Mehandar Kumar
- Corresponding author: Mehandar Kumar
Green thin-layer chromatography‒densitometric method for the determination of elexacaftor, tezacaftor, and ivacaftor simultaneously in pharmaceutical preparation and spiked human plasma
- DOI: 10.1007/s00764-025-00343-1
- Published: 2025-05-19
- Journal:
- Impact factor: 1.100
- Authors: Hesham Salem; Dina Z. Mazen; Belal M. Abdelghany; Abdelrahman Medhat; Anas Mahmoud; Ragaa Laban; Shahd Hissham; Saad Rabiea; Amany Abdelaziz
- Corresponding author: Amany Abdelaziz
Video Game Development in a Rush: A Survey of the Global Game Jam Participants
- DOI:
- Published: 2019
- Journal:
- Impact factor: 2.3
- Authors: Markus Borg; V. Garousi; Anas Mahmoud; Thomas Olsson; O. Stålberg
- Corresponding author: O. Stålberg
Other grants held by Anas Mahmoud
SCC-PG: Utilizing Sharing Economy to Foster Social Capital and Economic Growth in Baton Rouge
- Award number: 1951411
- Fiscal year: 2020
- Funding amount: $63,100
- Project type: Standard Grant
Similar overseas grants
Comparison of Machine Learning and Conventional Statistical Modeling for Predicting Readmission Following Acute Heart Failure Hospitalization
- Award number: 495410
- Fiscal year: 2023
- Funding amount: $63,100
- Project type:
Collaborative Research: Enabling Hybrid Methods in the NIMBLE Hierarchical Statistical Modeling Platform
- Award number: 2332442
- Fiscal year: 2023
- Funding amount: $63,100
- Project type: Standard Grant
Study of Human Statistical Biases on Unsupervised Parsing and Language Modeling
- Award number: 23KJ0565
- Fiscal year: 2023
- Funding amount: $63,100
- Project type: Grant-in-Aid for JSPS Fellows
Statistical Modeling and Inference for Network Data in Modern Applications
- Award number: 2326893
- Fiscal year: 2023
- Funding amount: $63,100
- Project type: Continuing Grant
ATD: Statistical Modeling of Spatial Temporal Human Mobility Flows from Aggregated Mobile Phone Data
- Award number: 2220231
- Fiscal year: 2023
- Funding amount: $63,100
- Project type: Standard Grant
Statistical modeling via functional data analysis and its application to various fields
- Award number: 23K11005
- Fiscal year: 2023
- Funding amount: $63,100
- Project type: Grant-in-Aid for Scientific Research (C)
Statistical Methods and Theory for Predictive Biomarker Study in Clinical Trials via Modeling and Analysis of Covariate Interactions
- Award number: RGPIN-2018-04462
- Fiscal year: 2022
- Funding amount: $63,100
- Project type: Discovery Grants Program - Individual
SCH: Statistical Foundation and Predictive Modeling for Personalized Diabetes Management: Continuous Glucose Monitoring (CGM), Electronic Health Records (EHR), and Biobanks
- Award number: 2205441
- Fiscal year: 2022
- Funding amount: $63,100
- Project type: Standard Grant
Statistical methods for longitudinal integrated mechanistic modeling of multiview data
- Award number: 10445698
- Fiscal year: 2022
- Funding amount: $63,100
- Project type:
Statistical methods for longitudinal integrated mechanistic modeling of multiview data
- Award number: 10685565
- Fiscal year: 2022
- Funding amount: $63,100
- Project type:


















