Elements: Development of Assumption-Free Parallel Data Curing Service for Robust Machine Learning and Statistical Predictions
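Basic information
- Award number: 1931380
- Principal investigator:
- Amount: $592,400
- Host institution:
- Host institution country: United States
- Award type: Standard Grant
- Fiscal year: 2019
- Funding country: United States
- Period: 2019-09-01 to 2023-11-30
- Status: Completed
- Source:
- Keywords:
Project abstract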
Large, incomplete datasets create major challenges for statistical prediction in research. This project will develop a data curing service that can manage large, incomplete, and diverse datasets and provide uncertainty measures for the cured data. The project identifies and collaborates with several communities where this data service is central to scientific research, including civil engineering, building science, urban energy, and social science. The effort creates a parallel data curing service, provides uncertainty measures for the cured data, and develops supplementary imputation algorithms. The team develops a data curing platform with imputation for incomplete, heterogeneous data; robust machine learning (ML) and statistical predictions are to be established by developing an easy-to-use, general-purpose imputation program suited to large data. The focus is on a novel combination of three established imputation methods: two-level finite mixture model-based imputation (FMMI), fractional hot deck imputation (FHDI), and Gaussian mixture model-based imputation (GMMI), for which parallel implementations in R will also be provided. This award by the NSF Office of Advanced Cyberinfrastructure is jointly funded by the Established Program to Stimulate Competitive Research (EPSCoR). This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Project outcomes
Journal articles (17)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Statistical inference using regularized M-estimation in the reproducing kernel Hilbert space for handling missing data
- DOI: 10.1007/s10463-023-00872-8
- Published: 2021-07
- Journal:
- Impact factor: 1
- Authors: Hengfang Wang;Jae Kwang Kim
- Corresponding authors: Hengfang Wang;Jae Kwang Kim
Statistical inference with semiparametric nonignorable nonresponse models
- DOI: 10.1111/sjos.12652
- Published: 2023-04
- Journal:
- Impact factor: 1
- Authors: Masatoshi Uehara;Danhyang Lee;Jae Kwang Kim
- Corresponding authors: Masatoshi Uehara;Danhyang Lee;Jae Kwang Kim
A framework for glass-box physics rule learner and its application to nano-scale phenomena
- DOI: 10.1038/s42005-020-0339-x
- Published: 2020-05-08
- Journal:
- Impact factor: 5.5
- Authors: Cho, In Ho;Li, Qiang;Kim, Jaeyoun
- Corresponding author: Kim, Jaeyoun
Survey data integration for regression analysis using model calibration.
- DOI:
- Published: 2021-07
- Journal:
- Impact factor: 0
- Authors: Zhonglei Wang;Hang J Kim;Jae Kwang Kim
- Corresponding authors: Zhonglei Wang;Hang J Kim;Jae Kwang Kim
Flexible and interpretable generalization of self-evolving computational materials framework
- DOI: 10.1016/j.compstruc.2021.106706
- Published: 2021-11-17
- Journal:
- Impact factor: 4.7
- Authors: Bazroun, Mohammed;Yang, Yicheng;Cho, In Ho
- Corresponding author: Cho, In Ho
Other publications by In Ho Cho
An artificial-intelligence based approach for predicting structural damages of paved-road systems under superloads
- DOI: 10.1016/j.conbuildmat.2023.134257
- Published: 2024-01-12
- Journal:
- Impact factor:
- Authors: Yongsung Koh;Halil Ceylan;Sunghwan Kim;In Ho Cho
- Corresponding author: In Ho Cho
Evaluation of Coronary Artery Disease with Tc-99m Tetrofosmin SPECT in Conjuction with Intravenous Adenosine
- DOI:
- Published: 1997
- Journal:
- Impact factor: 0
- Authors: J. Hwang;Jaetae Lee;J. Choi;B. Ahn;Yongkeun Cho;S. Chae;J. Jun;W. Park;K. Lee;Y. Kim;In Ho Cho
- Corresponding author: In Ho Cho
Safety Profile of Adenosine Myocardial Perfusion Imaging
- DOI:
- Published: 1997
- Journal:
- Impact factor: 0
- Authors: Jeongai Kim;B. Ahn;K. Chun;D. Hyun;Young;S. Bae;Dongmin Kwak;J. Hwang;Y. Cho;S. Chae;J. Jun;W. Park;In Ho Cho;Jaetae Lee;K. Lee
- Corresponding author: K. Lee
Design for controlling thermal and mechanical properties of graphene oxide/silk fibroin nanocomposites: Numerical analysis and experimental study
- DOI: 10.1016/j.icheatmasstransfer.2025.108945
- Published: 2025-05-01
- Journal:
- Impact factor: 6.400
- Authors: Taehee Kim;Truong Nhut Huynh;Hyeonho Cho;In Ho Cho;Sangmin Lee;Jin-Gyun Kim;Sunghan Kim
- Corresponding author: Sunghan Kim
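Similar grants from the National Natural Science Foundation of China
Gene cloning and functional analysis of the rice boundary-development-defective mutant abnormal boundary development (abd)
- Award number: 32070202
- Approval year: 2020
- Amount: 580,000 CNY
- Award type: General Program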
Development of a Linear Stochastic Model for Wind Field Reconstruction from Limited Measurement Data
- Award number:
- Approval year: 2020
- Amount: 400,000 CNY
- Award type:
Similar overseas grants
Development of a new solid tritium breeder blanket
- Award number: 2908923
- Fiscal year: 2027
- Amount: $592,400
- Award type: Studentship
Optimal utility-based design of oncology clinical development programmes
- Award number: 2734768
- Fiscal year: 2026
- Amount: $592,400
- Award type: Studentship
REU Site: Microbial Biofilm Development, Resistance, & Community Structure
- Award number: 2349311
- Fiscal year: 2025
- Amount: $592,400
- Award type: Continuing Grant
SoundDecisions - Musical Listening, Decision Making, And Equitable Development In The Mekong Delta
- Award number: EP/Z000424/1
- Fiscal year: 2025
- Amount: $592,400
- Award type: Research Grant
Bio-MATSUPER: Development of high-performance supercapacitors based on bio-based carbon materials
- Award number: EP/Z001013/1
- Fiscal year: 2025
- Amount: $592,400
- Award type: Fellowship
Development of a Cell-Based Assay for Tetanus Vaccine Quality Control
- Award number: 10101986
- Fiscal year: 2024
- Amount: $592,400
- Award type: Collaborative R&D
HURR — Platform Development
- Award number: 10103254
- Fiscal year: 2024
- Amount: $592,400
- Award type: Investment Accelerator
Automatic battery swapping cabinet development for scalability of e-mobility in Uganda
- Award number: 10080435
- Fiscal year: 2024
- Amount: $592,400
- Award type: Collaborative R&D
Development of digital diagnostics services for Parkinson’s disease
- Award number: 10086932
- Fiscal year: 2024
- Amount: $592,400
- Award type: Collaborative R&D
RestoreDNA: Development of scalable eDNA-based solutions for biodiversity regulators and nature-related disclosure
- Award number: 10086990
- Fiscal year: 2024
- Amount: $592,400
- Award type: Collaborative R&D