Dependable Data Driven Discovery
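Basic information
- Award number: 2152117
- Principal investigator:
- Amount: $2.999 million
- Host institution:
- Host institution country: United States
- Award type: Standard Grant
- Fiscal year: 2022
- Funding country: United States
- Project period: 2022-07-01 to 2027-06-30
- Project status: Ongoing
- Source:
- Keywords:
Project abstract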
Data-driven decisions are becoming increasingly critical for the well-being of individuals and society, and the nation's and the world's reliance on such decisions is likely to increase tremendously in the next decade. However, data science lifecycles are assembled and operated by a wide variety of individuals and institutions with varying levels of expertise. While to err is human, the consequences of errors in critical data science lifecycles can be catastrophic. Unreliable decisions can have enormous negative impacts such as loss of life, widespread contagions, and economic depression. To respond to this critical challenge, this NSF Research Traineeship project is establishing a graduate program of study, "Dependable Data Driven Discovery (D4)," that engages faculty across multiple disciplines to train students with diverse scientific backgrounds to recognize risks to dependable data driven discovery and to develop corresponding mitigation strategies. One hundred students are expected to participate from disciplines such as computer science, mathematics, statistics, bioengineering, and computational biology, including 32 funded graduate (MS & PhD) trainees and 16 undergraduate students from minority groups who are underrepresented in their participation in these fields. Trainees will gain a holistic perspective of the entire data science lifecycle through coursework as well as collaborative, transdisciplinary research.
Three focal areas comprise the D4 research and training agenda. First is a focus on formal foundations, methodology, and tools for a dependable data driven discovery framework. Second is an examination of risk mitigation methods to handle noise in data, limited training data, and uncertainty prediction and interpretability of machine learning models. The third focal area addresses quality assurance issues for protein function prediction and cellular engineering processes to direct undifferentiated cells into mature, functional cells with dependable data science lifecycles.
The project will develop a new graduate certificate in dependable data science to train students in dependability issues within data science lifecycles. Through coursework, trainees will experience the entire data science lifecycle several times, each with an increasingly deeper understanding of the risks, measures, and risk mitigation mechanisms. Trainees will also engage with industry partners to examine their data science lifecycles and discuss risk mitigation methods. The capstone project course will reinforce trainees’ technical skills to address the above research problems in data science, biological science, and engineering as well as written and oral communication skills. Trainees will interact with outside collaborators through the D4 seminars, gain experiential learning with industry partners, conduct original research through capstone projects, and work in internships, resulting in awareness of dependable data science lifecycles and risk mitigation mechanisms across Iowa State University, local industry, NGOs, and government. The project will engage in outreach activities through two existing Iowa State University infrastructures: ISU Science Bound and ISU Extension and Outreach with Iowa 4-H.
The NSF Research Traineeship (NRT) Program is designed to encourage the development and implementation of bold, new, potentially transformative models for STEM graduate education training. The program is dedicated to effective training of STEM graduate students in high priority interdisciplinary or convergent research areas through comprehensive traineeship models that are innovative, evidence-based, and aligned with changing workforce and research needs. This project is jointly funded by NRT and the Established Program to Stimulate Competitive Research (EPSCoR). This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Project outcomes
Journal articles (8)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Mutation-based Fault Localization of Deep Neural Networks
- DOI: 10.1109/ase56229.2023.00171
- Published: 2023-09
- Journal:
- Impact factor: 0
- Authors: Ali Ghanbari; Deepak-George Thomas; Muhammad Arbab Arshad; Hridesh Rajan
- Corresponding authors: Ali Ghanbari; Deepak-George Thomas; Muhammad Arbab Arshad; Hridesh Rajan
What kinds of contracts do ML APIs need?
- DOI: 10.1007/s10664-023-10320-z
- Published: 2023-07
- Journal:
- Impact factor: 4.1
- Authors: S. K. Samantha; Shibbir Ahmed; S. Imtiaz; Hridesh Rajan; G. Leavens
- Corresponding authors: S. K. Samantha; Shibbir Ahmed; S. Imtiaz; Hridesh Rajan; G. Leavens
Fix Fairness, Don’t Ruin Accuracy: Performance Aware Fairness Repair using AutoML
- DOI: 10.1145/3611643.3616257
- Published: 2023-06
- Journal:
- Impact factor: 0
- Authors: Giang Nguyen-; Sumon Biswas; Hridesh Rajan
- Corresponding authors: Giang Nguyen-; Sumon Biswas; Hridesh Rajan
Towards Understanding Fairness and its Composition in Ensemble Machine Learning
- DOI: 10.1109/icse48619.2023.00133
- Published: 2022-12
- Journal:
- Impact factor: 0
- Authors: Usman Gohar; Sumon Biswas; Hridesh Rajan
- Corresponding authors: Usman Gohar; Sumon Biswas; Hridesh Rajan
Design by Contract for Deep Learning APIs
- DOI: 10.1145/3611643.3616247
- Published: 2023-11
- Journal:
- Impact factor: 0
- Authors: Shibbir Ahmed; S. Imtiaz; S. K. Samantha; Breno Dantas Cruz; Hridesh Rajan
- Corresponding authors: Shibbir Ahmed; S. Imtiaz; S. K. Samantha; Breno Dantas Cruz; Hridesh Rajan
Other publications by Wallapak Tavanapong
Automatic polyp region segmentation for colonoscopy images using watershed algorithm and ellipse segmentation
- DOI:
- Published: 2007
- Journal:
- Impact factor: 0
- Authors: Sae Hwang; Jung; Wallapak Tavanapong; J. Wong; P. C. Groen
- Corresponding author: P. C. Groen
Real-Time Feedback During Colonoscopy to Improve Quality: How Often to Improve Inspection?
- DOI:
- Published: 2015
- Journal:
- Impact factor: 0
- Authors: P. C. Groen; Michael J. Szewczynski; F. Enders; Wallapak Tavanapong; Jung; J. Wong
- Corresponding author: J. Wong
Fast Object Detection Using Color Features for Colonoscopy Quality Measurements
- DOI:
- Published: 2014
- Journal:
- Impact factor: 0
- Authors: Jayantha Muthukudage; Jung; Ruwan Dharshana Nawarathna; Wallapak Tavanapong; J. Wong; P. C. Groen
- Corresponding author: P. C. Groen
Automatic real-time capture and segmentation of endoscopy video
- DOI: 10.1117/12.770930
- Published: 2008
- Journal:
- Impact factor: 4.1
- Authors: Sean Stanek; Wallapak Tavanapong; J. Wong; Jung; P. de Groen
- Corresponding author: P. de Groen
Real-time phase boundary detection in colonoscopy videos
- DOI:
- Published: 2009
- Journal:
- Impact factor: 0
- Authors: Jung; Malik Avnish Rajbal; Jayantha Muthukudage; Wallapak Tavanapong; J. Wong; P. C. Groen
- Corresponding author: P. C. Groen
Other grants by Wallapak Tavanapong
STTR Phase II: Real-time Analysis and Feedback during Colonoscopy to improve Quality
- Award number: 0956847
- Fiscal year: 2010
- Amount: $2.999 million
- Award type: Standard Grant
STTR Phase I: Video Analysis Techniques for Computer-Aided Quality Control for Colonoscopy
- Award number: 0740596
- Fiscal year: 2008
- Amount: $2.999 million
- Award type: Standard Grant
SEI: Collaborative Research: Endoscopic Multimedia Information System (EMIS)
- Award number: 0513809
- Fiscal year: 2005
- Amount: $2.999 million
- Award type: Continuing Grant
Strategies for Caching Information on Distributed Systems
- Award number: 0092914
- Fiscal year: 2001
- Amount: $2.999 million
- Award type: Continuing Grant
Similar National Natural Science Foundation of China (NSFC) grants
Scalable Learning and Optimization: High-dimensional Models and Online Decision-Making Strategies for Big Data Analysis
- Award number:
- Approval year: 2024
- Amount:
- Award type: Collaborative Innovation Research Team
Data-driven Recommendation System Construction of an Online Medical Platform Based on the Fusion of Information
- Award number:
- Approval year: 2024
- Amount:
- Award type: Research Fund for International Young Scientists
Development of a Linear Stochastic Model for Wind Field Reconstruction from Limited Measurement Data
- Award number:
- Approval year: 2020
- Amount: CNY 400,000
- Award type:
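Key Technologies for Semantic Interoperability of Web Services Based on Linked Open Data
- Award number: 61373035
- Approval year: 2013
- Amount: CNY 770,000
- Award type: General Program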
Molecular Interaction Reconstruction of Rheumatoid Arthritis Therapies Using Clinical Data
- Award number: 31070748
- Approval year: 2010
- Amount: CNY 340,000
- Award type: General Program
Functional Data Analysis Methods for High-Dimensional Data
- Award number: 11001084
- Approval year: 2010
- Amount: CNY 160,000
- Award type: Young Scientists Fund
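The Role of datA, a Negative Regulator of Chromosome Replication, in the Cell Cycle
- Award number: 31060015
- Approval year: 2010
- Amount: CNY 250,000
- Award type: Regional Science Fund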
Computational Methods for Analyzing Toponome Data
- Award number: 60601030
- Approval year: 2006
- Amount: CNY 170,000
- Award type: Young Scientists Fund
Similar overseas grants
CC* Networking Infrastructure: YinzerNet: A Multi-Site Data and AI Driven Research Network
- Award number: 2346707
- Fiscal year: 2024
- Amount: $2.999 million
- Award type: Standard Grant
Collaborative Research: Data-Driven Elastic Shape Analysis with Topological Inconsistencies and Partial Matching Constraints
- Award number: 2402555
- Fiscal year: 2024
- Amount: $2.999 million
- Award type: Standard Grant
CAREER: Data-Driven Hardware and Software Techniques to Enable Sustainable Data Center Services
- Award number: 2340042
- Fiscal year: 2024
- Amount: $2.999 million
- Award type: Continuing Grant
CAREER: A Universal Framework for Safety-Aware Data-Driven Control and Estimation
- Award number: 2340089
- Fiscal year: 2024
- Amount: $2.999 million
- Award type: Standard Grant
Data Driven Discovery of New Catalysts for Asymmetric Synthesis
- Award number: DP240100102
- Fiscal year: 2024
- Amount: $2.999 million
- Award type: Discovery Projects
PIDD-MSK: Physics-Informed Data-Driven Musculoskeletal Modelling
- Award number: EP/Y027930/1
- Fiscal year: 2024
- Amount: $2.999 million
- Award type: Fellowship
N2Vision+: A robot-enabled, data-driven machine vision tool for nitrogen diagnosis of arable soils
- Award number: 10091423
- Fiscal year: 2024
- Amount: $2.999 million
- Award type: Collaborative R&D
Facilitating circular construction practices in the UK: A data driven online marketplace for waste building materials
- Award number: 10113920
- Fiscal year: 2024
- Amount: $2.999 million
- Award type: SME Support
Collaborative Research: Data-driven engineering of the yeast Kluyveromyces marxianus for enhanced protein secretion
- Award number: 2323984
- Fiscal year: 2024
- Amount: $2.999 million
- Award type: Standard Grant
Data-driven prediction of fatigue crack nucleation in directionally-solidified Ni-based superalloys
- Award number: 24K07230
- Fiscal year: 2024
- Amount: $2.999 million
- Award type: Grant-in-Aid for Scientific Research (C)