EAGER: Towards a Computational Infrastructure for Analysis of Sensitive Data

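Basic information

Award number:
1551843
Principal investigator:
Vasant Honavar
Amount:
$231,600
Institution:
Pennsylvania State Univ University Park
Institution country:
United States
Award type:
Standard Grant
Fiscal year:
2015
Funding country:
United States
Period:
2015-09-01 to 2019-08-31
Status:
Completed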

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=1551843&HistoricalAwards=false
Keywords:
EAGER Towards Computational Infrastructure Analysis

Project abstract

In many applications (e.g., health and education), our ability to realize the full potential of big data to improve decisions and outcomes is currently limited by the lack of practical frameworks for analyzing sensitive data without violating applicable data access and use policies. This is a significant barrier to engaging researchers with analytics expertise in developing and evaluating advanced methods for analysis of such data, assessing the performance of alternative approaches, or ensuring the reproducibility of results. Against this background, this high-risk and potentially high-impact research project aims to explore a novel framework and a software infrastructure for data access and use policy (DAUP) compliant analysis of sensitive data. The framework will support (i) querying and retrieval of information from the data store that is permitted by the user- or project-specific DAUP, such as the schema of the data store and metadata that specify the variables and their domains and ranges; (ii) execution of system- or user-supplied implementations of algorithms for constructing predictive or causal models or visualizations from the data in the data store; (iii) evaluation of the predictive performance of the resulting models on benchmark data or user-provided data; (iv) deployment of the validated models in the form of web servers that provide predictions or visualizations over user-submitted data or over results of user-defined queries against the data store; and (v) publication of reusable analytics workflows. This exploratory project seeks to test the feasibility of the framework using predictive and causal modeling of data from an online health community as a test case.
A major outcome of this project is an open source software infrastructure for facilitating analysis and visualization of sensitive data. This research will: (i) fill a major gap in infrastructure for predictive modeling from sensitive data; (ii) significantly lower the barrier to entry for researchers with deep expertise in analytics into domains (e.g., health, education) that involve sensitive data; (iii) improve the accuracy of assessment of the state of the art in predictive and causal modeling in such domains by facilitating rigorous comparison of algorithms; and (iv) facilitate reproducible analysis of sensitive data. It will also (i) yield a prototype open source software infrastructure to support analysis and visualization of sensitive data; (ii) accelerate data-driven advances in domains that involve sensitive data (e.g., health, education) through broad engagement of talent in developing better algorithms; and (iii) support incorporation of hands-on experience with such applications into Data Sciences education through hackathons and competitions organized around specific sensitive data sets.
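
As an illustration of item (i) in the abstract, policy-gated querying of a data store, the following is a minimal sketch only. The abstract does not describe the project's actual software, so every name here (`Daup`, `GatedStore`, `PolicyViolation`, the sample variables) is hypothetical, and the policy model (a set of permitted variables plus a flag for row-level access) is an assumption made for the example.

```python
from dataclasses import dataclass


class PolicyViolation(Exception):
    """Raised when a request falls outside what the policy permits."""


@dataclass(frozen=True)
class Daup:
    """Hypothetical data access and use policy for one user or project."""
    allowed_variables: frozenset      # variables the holder may see at all
    allow_row_access: bool = False    # False: metadata only, no row values


class GatedStore:
    """Hypothetical data store that checks a Daup before answering."""

    def __init__(self, rows):
        self._rows = [dict(r) for r in rows]

    def schema(self, policy):
        # Expose only the names of variables the policy permits.
        names = set()
        for row in self._rows:
            names.update(row)
        return sorted(names & policy.allowed_variables)

    def query(self, policy, variables):
        # Reject the whole request if any requested variable is not permitted.
        denied = set(variables) - policy.allowed_variables
        if denied:
            raise PolicyViolation(f"variables not permitted: {sorted(denied)}")
        if not policy.allow_row_access:
            raise PolicyViolation("policy permits metadata only, not row values")
        return [{v: row[v] for v in variables} for row in self._rows]


# Made-up sample data for illustration only.
store = GatedStore([
    {"age": 34, "diagnosis": "A", "post_count": 12},
    {"age": 51, "diagnosis": "B", "post_count": 3},
])
analyst = Daup(frozenset({"age", "post_count"}), allow_row_access=True)

print(store.schema(analyst))          # permitted variable names only
print(store.query(analyst, ["age"]))  # projection onto a permitted variable
```

In this sketch a request that names any non-permitted variable fails as a whole rather than being silently filtered, which keeps a denial visible to the caller; a real system could reasonably choose otherwise.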

Project outcomes

Journal articles (0)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Other publications by Vasant Honavar

Neural network design and the complexity of learning, by J. Stephen Judd. Cambridge, MA: MIT Press, 1990

DOI:
10.1007/bf00993255
Published:
1992-06-01
Journal:
MACHINE LEARNING
Impact factor:
2.900
Authors:
Vasant Honavar
Corresponding author:
Vasant Honavar

Machine-learning guided biophysical model development: application to ribosome catalysis

DOI:
10.1016/j.bpj.2021.11.2053
Published:
2022-02-11
Journal:
Conference abstract
Impact factor:
Authors:
Yang Jiang;Justin Petucci;Nishant Soni;Vasant Honavar;Edward O'Brien
Corresponding author:
Edward O'Brien

Book Review: Neural Network Design and the Complexity of Learning, by J. Stephen Judd. Cambridge, MA: MIT Press, 1990

DOI:
10.1023/a:1022680813848
Published:
1992-06-01
Journal:
MACHINE LEARNING
Impact factor:
2.900
Authors:
Vasant Honavar
Corresponding author:
Vasant Honavar

Exploring inconsistencies in genome-wide protein function annotations: a machine learning approach

DOI:
10.1186/1471-2105-8-284
Published:
2007-08-03
Journal:
BMC BIOINFORMATICS
Impact factor:
3.300
Authors:
Carson Andorf;Drena Dobbs;Vasant Honavar
Corresponding author:
Vasant Honavar

A practical guide to machine learning interatomic potentials – Status and future

DOI:
10.1016/j.cossms.2025.101214
Published:
2025-03-01
Journal:
CURRENT OPINION IN SOLID STATE & MATERIALS SCIENCE
Impact factor:
13.400
Authors:
Ryan Jacobs;Dane Morgan;Siamak Attarian;Jun Meng;Chen Shen;Zhenghao Wu;Clare Yijia Xie;Julia H. Yang;Nongnuch Artrith;Ben Blaiszik;Gerbrand Ceder;Kamal Choudhary;Gabor Csanyi;Ekin Dogus Cubuk;Bowen Deng;Ralf Drautz;Xiang Fu;Jonathan Godwin;Vasant Honavar;Olexandr Isayev;Brandon M. Wood
Corresponding author:
Brandon M. Wood