BD Spokes: SPOKE: NORTHEAST: Collaborative: A Licensing Model and Ecosystem for Data Sharing

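Basic Information

Award number:
1947440
Principal investigator:
Tim Kraska
Amount:
$270,900
Host institution:
Massachusetts Institute of Technology
Host institution country:
United States
Award type:
Standard Grant
Fiscal year:
2019
Funding country:
United States
Period:
2019-09-01 to 2021-08-31
Status:
Completed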

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=1947440&HistoricalAwards=false
Keywords:
BD Spokes SPOKE NORTHEAST Collaborative

Project Abstract

Sharing data sets can provide tremendous mutual benefits for industry, researchers, and nonprofit organizations. For example, companies can profit when university researchers explore their data sets and make discoveries that help them improve their business. At the same time, researchers are always searching for real-world data sets to show that their newly developed techniques work in practice. Unfortunately, many attempts to share relevant data sets between stakeholders in industry and academia fail or require a large investment to make data sharing possible. A major obstacle is that data often comes with prohibitive restrictions on how it can be used (requiring, e.g., the enforcement of legal terms or other policies, the handling of data privacy issues, etc.). To enforce these requirements today, lawyers are usually involved in negotiating the terms of each contract. It is not atypical for this process of creating an individual data sharing contract to end in protracted negotiations, which are both disconnected from what the actual stakeholders aim to do and fraught, as both sides struggle with the implications and possibilities of modern security, privacy, and data sharing techniques. Worse, fear of missing a loophole in how the data might be (mis)used often prevents data sharing efforts from even getting off the ground. To address these challenges, our new data sharing spoke will enable data providers to easily share data while enforcing constraints on the use of the data. This effort has two key components: (1) creating a licensing model for data that facilitates sharing data that is not necessarily open or free between different organizations, and (2) developing a prototype data sharing software platform, ShareDB, which enforces the terms and restrictions of the developed licenses. We believe these efforts will have a transformative impact on how data sharing takes place.
By moving data out of the silos of individuals and single organizations and into the hands of broader society, we can tackle many societally significant problems. This new data sharing spoke will enable data providers to easily share data while enforcing constraints on the use of the data. Many services and platforms that provide access to data sets already exist today. However, these platforms generally promote completely open access and do not address the aforementioned issues that arise when dealing with proprietary data. Thus, the effort has three key components: (1) creating a licensing model for data that facilitates sharing data that is not necessarily open or free between different organizations, (2) developing a prototype data sharing software platform, ShareDB, which enforces the terms and restrictions of the developed licenses, and (3) developing and integrating relevant metadata that will accompany the datasets shared under the different licenses, making them easily searchable and interpretable. To ensure that the developed tools and licenses are useful, the project will form the Northeast Data Sharing Group, comprising many different stakeholders, to make the licensing model widely accepted and usable in many application domains (e.g., health and finance). The intellectual merit of this proposal is to design a licensing model and a data sharing platform that are widely accepted and usable as a template in many different domains. While other efforts to enable data sharing exist (e.g., Creative Commons), they focus on the case where the data owner is willing to openly share the data on the Internet. This licensing model and ecosystem are different because they allow data owners to enforce certain requirements stated in a data sharing agreement (e.g., on who is allowed to access the data) and also provide tools to make the sharing of sensitive information safe.
The licenses and software we propose to investigate will make it easier for organizations to open up their data to the appropriate organizations, while maintaining the ability to ensure it is protected, that access is revocable, and that access controls and audit logs are maintained.

Project Outcomes

Journal articles (3)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Towards instance-optimized data systems

DOI:
10.14778/3476311.3476392
Published:
2021-07
Journal:
Proc. VLDB Endow.
Impact factor:
0
Authors:
Tim Kraska
Corresponding author:
Tim Kraska

Poly'19 Workshop Summary: GDPR

DOI:
10.1145/3444831.3444842
Published:
2020
Journal:
ACM SIGMOD Record
Impact factor:
0
Authors:
Stonebraker, Michael;Mattson, Timothy;Kraska, Tim;Gadepally, Vijay
Corresponding author:
Gadepally, Vijay

Other publications by Tim Kraska

Building Database Applications in the Cloud

DOI:
10.3929/ethz-a-006007449
Published:
2010
Journal:
Impact factor:
0
Authors:
Tim Kraska
Corresponding author:
Tim Kraska

Towards a Benchmark for the Cloud

DOI:
Published:
2018
Journal:
Impact factor:
0
Authors:
Carsten Binnig;Donald Kossmann;Tim Kraska;Simon Losing
Corresponding author:
Simon Losing

Self-Organizing Data Containers

DOI:
Published:
2022
Journal:
Conference on Innovative Data Systems Research
Impact factor:
0
Authors:
S. Madden;Jialin Ding;Tim Kraska;Sivaprasad Sudhir;David Cohen;T. Mattson;Nesime Tatbul
Corresponding author:
Nesime Tatbul