NSF Convergence Accelerator Track D: The Data Hypervisor: Orchestrating Data and Models

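Basic information

  • Award number:
    2040718
  • Principal investigator:
  • Amount:
    $954.6K
  • Host institution:
  • Institution country:
    United States
  • Award type:
    Standard Grant
  • Fiscal year:
    2020
  • Funding country:
    United States
  • Period:
    2020-09-15 to 2023-05-31
  • Status:
    Completed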

Project abstract

The NSF Convergence Accelerator supports use-inspired, team-based, multidisciplinary efforts that address challenges of national importance and will produce deliverables of value to society in the near future. This project, NSF Convergence Accelerator Track D: The Data Hypervisor: Orchestrating Data and Models, will design and implement the Data Station, a new architecture in which both data and derived data products are sealed and cannot be directly seen or downloaded by anyone. In the Data Station architecture, computation is brought to the data, rather than data being brought to users, as is common in traditional data lakes and warehouses. Sharing data and models has had a transformative impact on scientific problems from medical imaging to natural language understanding. Despite the potential upside, many researchers in both academia and industry are reluctant to centralize data and share it with internal and external researchers. Organizations today have to navigate complex regulatory considerations and protect intellectual property while incurring a significant technical investment in documenting and maintaining data. The Data Station will ease access to sensitive data, assist with data discovery and integration, and facilitate enforcement of arbitrary data access and governance policies. The project will work with partners in biomedicine, materials science, and enterprise data management to establish the capabilities and prove the concepts of the Data Station architecture.

While building upon prior research in data systems, Data Station will introduce novel data-unaware task capsules that enable users to specify data-driven tasks, such as traditional data queries and machine learning model training, without requiring direct access to the data itself. The programming interfaces convey sufficient information for the Data Station to trigger the discovery of potentially relevant datasets; integrate and prune those datasets for computation; and compute the results by executing the task. In effect, Data Station inverts the traditional data querying model by bringing computations to the data. Task capsules also include a user-defined metric for determining which results are useful from the user's perspective, as well as the trust constraints that must be met to validate the provenance of input datasets. The Data Station captures metadata every time a derived data product is created and provides a set of primitives to implement the various data governance and data access policies necessary to address data contributor use cases. Only authorized users are able to access the data, based on a novel access-token model implemented by Data Station that permits fine-grained yet scalable access control. Users must be explicitly authorized to access results via tokens obtained from data contributors. The Data Station project will engage a diverse set of partners in materials science, biomedicine, and enterprise scenarios to help design and apply the Data Station to various use cases. An education program will engage high school, undergraduate, and graduate students in researching, developing, and evaluating the Data Station.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
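
The flow described in the abstract (sealed data, a task capsule that names the kind of data it needs, contributor-issued tokens, and metadata capture for each derived result) can be illustrated with a toy sketch. This is not the project's API: every name here (TaskCapsule, DataStation, contribute, grant_token, run) is hypothetical, discovery is reduced to simple tag matching, and the task runs in-process, whereas a real system would need to isolate it.

```python
import secrets
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Set


@dataclass
class TaskCapsule:
    """Hypothetical data-unaware task: describes needed data, never names it."""
    required_tags: Set[str]                         # drives dataset discovery
    compute: Callable[[List[List[float]]], float]   # executed inside the station
    is_useful: Callable[[float], bool]              # user-defined usefulness metric


class DataStation:
    """Toy station: datasets are held privately and only results leave."""

    def __init__(self) -> None:
        self._sealed: Dict[str, dict] = {}      # dataset name -> owner, tags, rows
        self._tokens: Dict[str, Set[str]] = {}  # token -> dataset names it covers
        self.audit_log: List[dict] = []         # metadata for each derived product

    def contribute(self, name: str, owner: str, tags: Set[str],
                   rows: List[float]) -> None:
        self._sealed[name] = {"owner": owner, "tags": set(tags), "rows": list(rows)}

    def grant_token(self, owner: str, dataset_names: List[str]) -> str:
        # Only the contributor of a dataset may issue a token covering it.
        for name in dataset_names:
            if self._sealed[name]["owner"] != owner:
                raise PermissionError(f"{owner} does not own {name}")
        token = secrets.token_hex(8)
        self._tokens[token] = set(dataset_names)
        return token

    def run(self, capsule: TaskCapsule, token: str) -> Optional[float]:
        allowed = self._tokens.get(token, set())
        # Discovery: datasets whose tags cover what the capsule asks for.
        matches = [n for n, d in self._sealed.items()
                   if capsule.required_tags <= d["tags"]]
        # Pruning: keep only datasets the token authorizes.
        usable = sorted(n for n in matches if n in allowed)
        if not usable:
            raise PermissionError("no authorized dataset matches the capsule")
        result = capsule.compute([self._sealed[n]["rows"] for n in usable])
        # Record which inputs produced the derived result.
        self.audit_log.append({"inputs": usable, "result": result})
        return result if capsule.is_useful(result) else None
```

In this sketch the caller receives only the computed value (or None when the usefulness metric rejects it); the rows themselves are never returned, and the audit log records which datasets fed each result.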

Project outcomes

Journal articles (5)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Enabling AI innovation via data and model sharing: An overview of the NSF Convergence Accelerator Track D
  • DOI:
    10.1002/aaai.12042
  • Published:
    2022-03-01
  • Journal:
  • Impact factor:
    0.9
  • Authors:
    Baru, Chaitanya;Pozmantier, Michael;Zhang, Peng
  • Corresponding author:
    Zhang, Peng
Ver: View Discovery in the Wild
Data-Sharing Markets: Model, Protocol, and Algorithms to Incentivize the Formation of Data-Sharing Consortia
Data Station: Delegated, Trustworthy, and Auditable Computation to Enable Data-Sharing Consortia with a Data Escrow
  • DOI:
    10.14778/3551793.3551861
  • Published:
    2022-07
  • Journal:
  • Impact factor:
    0
  • Authors:
    Siyuan Xia;Zhiru Zhu;Chris Zhu;Jinjin Zhao;K. Chard;Aaron J. Elmore;Ian D. Foster;Michael
  • Corresponding author:
    Siyuan Xia;Zhiru Zhu;Chris Zhu;Jinjin Zhao;K. Chard;Aaron J. Elmore;Ian D. Foster;Michael
Protecting Data Markets from Strategic Buyers

Other publications by Ian Foster

GreenFaaS: Maximizing Energy Efficiency of HPC Workloads with FaaS
  • DOI:
  • Published:
    2024
  • Journal:
  • Impact factor:
    0
  • Authors:
    Alok V. Kamatar;Valerie Hayot;Y. Babuji;André Bauer;Gourav Rattihalli;Ninad Hogade;D. Milojicic;Kyle Chard;Ian Foster
  • Corresponding author:
    Ian Foster
DeepSpeed4Science Initiative: Enabling Large-Scale Scientific Discovery through Sophisticated AI System Technologies
  • DOI:
  • Published:
    2023
  • Journal:
  • Impact factor:
    0
  • Authors:
    S. Song;Bonnie Kruft;Minjia Zhang;Conglong Li;Shiyang Chen;Chengming Zhang;Masahiro Tanaka;Xiaoxia Wu;Jeff Rasley;A. A. Awan;Connor Holmes;Martin Cai;Adam Ghanem;Zhongzhu Zhou;Yuxiong He;Christopher Bishop;Max Welling;Tie;Christian Bodnar;Johannes Brandsetter;W. Bruinsma;Chan Cao;Yuan Chen;Peggy Dai;P. Garvan;Liang He;E. Heider;Pipi Hu;Peiran Jin;Fusong Ju;Yatao Li;Chang Liu;Renqian Luo;Qilong Meng;Frank Noé;Tao Qin;Janwei Zhu;Bin Shao;Yu Shi;Wen;Gregor Simm;Megan Stanley;Lixin Sun;Yue Wang;Tong Wang;Zun Wang;Lijun Wu;Yingce Xia;Leo Xia;Shufang Xie;Shuxin Zheng;Jianwei Zhu;Pete Luferenko;Divya Kumar;Jonathan Weyn;Ruixiong Zhang;Sylwester Klocek;V. Vragov;Mohammed Alquraishi;Gustaf Ahdritz;C. Floristean;Cristina Negri;R. Kotamarthi;V. Vishwanath;Arvind Ramanathan;Sam Foreman;Kyle Hippe;T. Arcomano;R. Maulik;Max Zvyagin;Alexander Brace;Bin Zhang;Cindy Orozco Bohorquez;Austin R. Clyde;B. Kale;Danilo Perez;Heng Ma;Carla M. Mann;Michael Irvin;J. G. Pauloski;Logan Ward;Valerie Hayot;M. Emani;Zhen Xie;Diangen Lin;Maulik Shukla;Thomas Gibbs;Ian Foster;James J. Davis;M. Papka;Thomas Brettin;Prasanna Balaprakash;Gina Tourassi;John P. Gounley;Heidi Hanson;T. Potok;Massimiliano Lupo Pasini;Kate Evans;Dan Lu;D. Lunga;Junqi Yin;Sajal Dash;Feiyi Wang;M. Shankar;Isaac Lyngaas;Xiao Wang;Guojing Cong;Peifeng Zhang;Ming Fan;Siyan Liu;A. Hoisie;Shinjae Yoo;Yihui Ren;William Tang;K. Felker;Alexey Svyatkovskiy;Hang Liu;Ashwin Aji;Angela Dalton;Michael Schulte;Karl Schulz;Yuntian Deng;Weili Nie;Josh Romero;Christian Dallago;Arash Vahdat;Chaowei Xiao;Anima Anandkumar;R. Stevens
  • Corresponding author:
    R. Stevens
An optical microscopy system for 3D dynamic imaging
  • DOI:
  • Published:
    2007
  • Journal:
  • Impact factor:
    0
  • Authors:
    R. Hudson;John N. Aarsvold;Chin;Jie Chen;Peter Davies;T. Disz;Ian Foster;Melvin Griem;Man K Kwong;B. Lin
  • Corresponding author:
    B. Lin
Review of low-cost self-driving laboratories in chemistry and materials science: the “frugal twin” concept
  • DOI:
    10.1039/d3dd00223c
  • Published:
    2024-05-15
  • Journal:
  • Impact factor:
    5.600
  • Authors:
    Stanley Lo;Sterling G. Baird;Joshua Schrier;Ben Blaiszik;Nessa Carson;Ian Foster;Andrés Aguilar-Granda;Sergei V. Kalinin;Benji Maruyama;Maria Politi;Helen Tran;Taylor D. Sparks;Alán Aspuru-Guzik
  • Corresponding author:
    Alán Aspuru-Guzik
Exploring Benchmarks for Self-Driving Labs using Color Matching
  • DOI:
  • Published:
    2023
  • Journal:
  • Impact factor:
    0
  • Authors:
    Tobias Ginsburg;Kyle Hippe;Ryan Lewis;Aileen Cleary;D. Ozgulbas;Rory Butler;Casey Stone;Abraham Stroka;Rafael Vescovi;Ian Foster
  • Corresponding author:
    Ian Foster

Other grants by Ian Foster

Collaborative Research: NSF Workshop on Automated, Programmable and Self Driving Labs
  • Award number:
    2335910
  • Fiscal year:
    2023
  • Award amount:
    $954.6K
  • Award type:
    Standard Grant
Frameworks: Garden: A FAIR Framework for Publishing and Applying AI Models for Translational Research in Science, Engineering, Education, and Industry
  • Award number:
    2209892
  • Fiscal year:
    2022
  • Award amount:
    $954.6K
  • Award type:
    Standard Grant
Collaborative Research: OAC Core: ScaDL: New Approaches to Scaling Deep Learning for Science Applications on Supercomputers
  • Award number:
    2107511
  • Fiscal year:
    2021
  • Award amount:
    $954.6K
  • Award type:
    Standard Grant
Collaborative Research: Frameworks: funcX: A Function Execution Service for Portability and Performance
  • Award number:
    2004894
  • Fiscal year:
    2020
  • Award amount:
    $954.6K
  • Award type:
    Standard Grant
Virtual Data Set Services Enabling New Science at NSF Facilities
  • Award number:
    1841531
  • Fiscal year:
    2018
  • Award amount:
    $954.6K
  • Award type:
    Standard Grant
Framework: Software: HDR Globus Automate: A Distributed Research Automation Platform
  • Award number:
    1835890
  • Fiscal year:
    2018
  • Award amount:
    $954.6K
  • Award type:
    Standard Grant
EAGER: Designing the OSN Software Platform
  • Award number:
    1836357
  • Fiscal year:
    2018
  • Award amount:
    $954.6K
  • Award type:
    Standard Grant
BD Spokes: SPOKE: MIDWEST: Collaborative: Integrative Materials Design (IMaD): Leverage, Innovate, and Disseminate
  • Award number:
    1636950
  • Fiscal year:
    2017
  • Award amount:
    $954.6K
  • Award type:
    Standard Grant
Collaborative Research: CyberSEES: Type 2: Framework to Advance Climate, Economics, and Impact Investigations with Information Technology (FACE-IT)
  • Award number:
    1331922
  • Fiscal year:
    2013
  • Award amount:
    $954.6K
  • Award type:
    Standard Grant
Collaborative Research: Managing Cloud Usage Allocation and Accounting for the NSF Community
  • Award number:
    1250555
  • Fiscal year:
    2012
  • Award amount:
    $954.6K
  • Award type:
    Standard Grant

Similar overseas grants

NSF Convergence Accelerator Track L: HEADLINE - HEAlth Diagnostic eLectronIc NosE
  • Award number:
    2343806
  • Fiscal year:
    2024
  • Award amount:
    $954.6K
  • Award type:
    Standard Grant
NSF Convergence Accelerator track L: Translating insect olfaction principles into practical and robust chemical sensing platforms
  • Award number:
    2344284
  • Fiscal year:
    2024
  • Award amount:
    $954.6K
  • Award type:
    Standard Grant
NSF Convergence Accelerator Track K: Unraveling the Benefits, Costs, and Equity of Tree Coverage in Desert Cities
  • Award number:
    2344472
  • Fiscal year:
    2024
  • Award amount:
    $954.6K
  • Award type:
    Standard Grant
NSF Convergence Accelerator Track L: Smartphone Time-Resolved Luminescence Imaging and Detection (STRIDE) for Point-of-Care Diagnostics
  • Award number:
    2344476
  • Fiscal year:
    2024
  • Award amount:
    $954.6K
  • Award type:
    Standard Grant
NSF Convergence Accelerator Track L: Intelligent Nature-inspired Olfactory Sensors Engineered to Sniff (iNOSES)
  • Award number:
    2344256
  • Fiscal year:
    2024
  • Award amount:
    $954.6K
  • Award type:
    Standard Grant
NSF Convergence Accelerator Track K: COMPASS: Comprehensive Prediction, Assessment, and Equitable Solutions for Storm-Induced Contamination of Freshwater Systems
  • Award number:
    2344357
  • Fiscal year:
    2024
  • Award amount:
    $954.6K
  • Award type:
    Standard Grant
NSF Convergence Accelerator Track M: Water-responsive Materials for Evaporation Energy Harvesting
  • Award number:
    2344305
  • Fiscal year:
    2024
  • Award amount:
    $954.6K
  • Award type:
    Standard Grant
NSF Convergence Accelerator (L): Innovative approach to monitor methane emissions from livestock using an advanced gravimetric microsensor.
  • Award number:
    2344426
  • Fiscal year:
    2024
  • Award amount:
    $954.6K
  • Award type:
    Standard Grant
NSF Convergence Accelerator, Track K: Mapping the nation's wetlands for equitable water quality, monitoring, conservation, and policy development
  • Award number:
    2344174
  • Fiscal year:
    2024
  • Award amount:
    $954.6K
  • Award type:
    Standard Grant
NSF Convergence Accelerator Track M: A new biomanufacturing process for making precipitated calcium carbonate and plant-based compounds that support human health
  • Award number:
    2344228
  • Fiscal year:
    2024
  • Award amount:
    $954.6K
  • Award type:
    Standard Grant