Fine-grained Data Provenance for Very Expressive Queries
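Basic Information
- Grant number: 398800066
- Principal investigator:
- Amount: --
- Host institution:
- Host institution country: Germany
- Project category: Research Grants
- Fiscal year: 2018
- Funding country: Germany
- Duration: 2017-12-31 to 2021-12-31
- Project status: Completed
- Source:
- Keywords: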
Project Abstract
Data provenance uncovers how database queries transform, filter, merge, and aggregate input data to arrive at the final output. With today's steep growth in data volume and query complexity, the inner workings of a query quickly become hard to assess and validate: Where in the input did this piece of output originate? Why did the query emit this item but omit another? How did the query produce this result value, and exactly which query constructs participated in the evaluation? Data provenance answers these and further questions. The answers explain query internals (and bugs), aid in data quality assessments, and help to build trust in query results, a critical service to data-dependent science and society.
With provenance, we shift a query's focus from values and their transformation to the dependencies between output and input data. This research proposal is built on the central hypothesis that abstract interpretation provides an ideal framework in which to reason about, and to implement, this shift of focus. In abstract interpretation, a program analysis discipline first established in the 1970s, all but one (or a few) selected aspects of a program's evaluation are ignored. This project will adapt these ideas to develop a view of queries and programs in which input/output dependencies, not values, assume the primary role.
The benefits of data provenance grow with the complexity of the query logic it is able to explain. We set out to derive provenance for advanced query language constructs and idioms such as deep nesting, sliding windows, user-defined and built-in functions, and recursion. It is a core goal to embrace practically relevant and complex languages, like modern variants of SQL, where prior work exhibited significant restrictions.
We will capitalize on the flexibility of abstract interpretation and design abstract domains that explain provenance at various levels of data granularity, down to individual atomic values (table cells, say). Further adaptations of the abstract domain and query interpretation rules will allow the exploration of new and notoriously difficult types of data provenance (e.g., the provenance of values absent from the output). Abstract interpretation is both a powerful theoretical tool and a practical one. Building on the latter, we will study parallel provenance derivation for queries over large data volumes and the seamless integration of data provenance into the query compilers of existing modern database systems.
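The shift from values to dependencies described in the abstract can be illustrated with a toy sketch. This is not the project's actual technique or implementation. It only assumes a made-up three-row table, a hypothetical cell identifier of the form (row id, column name), and a hypothetical abstract value `Dep` that keeps nothing but the set of input cells a result depends on. The same query shape is evaluated twice: once over values, once over dependency sets.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple

# Hypothetical cell identifier (an assumption of this sketch): (row id, column name).
Cell = Tuple[int, str]


@dataclass(frozen=True)
class Dep:
    """Abstract value: keeps only the input cells a result depends on."""
    cells: FrozenSet[Cell] = frozenset()

    def __add__(self, other: "Dep") -> "Dep":
        # Abstract '+': the sum depends on whatever either operand depends on.
        return Dep(self.cells | other.cells)


# Made-up input table, for illustration only.
ROWS = [
    {"id": 1, "qty": 2, "price": 10},
    {"id": 2, "qty": 1, "price": 99},
    {"id": 3, "qty": 5, "price": 7},
]


def total_concrete(rows):
    """Value-level evaluation of: SELECT SUM(price) FROM rows WHERE qty > 1."""
    return sum(r["price"] for r in rows if r["qty"] > 1)


def total_abstract(rows) -> Dep:
    """Same query shape, evaluated over dependency sets instead of values."""
    result = Dep()
    for r in rows:
        # The filter is still decided on the concrete qty value.
        if r["qty"] > 1:
            # The output depends on the price cell (its value flows into the sum)
            # and on the qty cell (it decided that the row passes the filter).
            result = result + Dep(frozenset({(r["id"], "price"), (r["id"], "qty")}))
    return result


print(total_concrete(ROWS))                # 17
print(sorted(total_abstract(ROWS).cells))  # [(1, 'price'), (1, 'qty'), (3, 'price'), (3, 'qty')]
```

The abstract run reports that the result 17 depends on the price and qty cells of rows 1 and 3 and on no cell of row 2, which is cell-level provenance for this one query. Handling nesting, window functions, or recursion would require a richer abstract domain than this sketch provides.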
Project Outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other Grants Held by Professor Dr. Torsten Grust
ALIEN: Abstractions, Languages, and Implementation Techniques That Cross the Program/Query Divide
- Grant number: 282458149
- Fiscal year: 2016
- Funding amount: --
- Project category: Research Grants
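Relationale Datenbanksysteme als hocheffiziente XQuery-Prozessoren: Compilationstechniken und Laufzeitsysteme
Relational Database Systems as Highly Efficient XQuery Processors: Compilation Techniques and Runtime Systems
- Grant number: 27645166
- Fiscal year: 2006
- Funding amount: --
- Project category: Research Grants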
Recursive Computation Over Relational Data (RECORD)
- Grant number: 511062611
- Fiscal year:
- Funding amount: --
- Project category: Research Grants
Similar Overseas Grants
Using Fine-grained Programming Trace Data to Inform Disciplinary Models of Self-Regulated Learning in Computing Education
- Grant number: 2300612
- Fiscal year: 2023
- Funding amount: --
- Project category: Continuing Grant
Using Fine-grained Programming Trace Data to Inform Disciplinary Models of Self-Regulated Learning in Computing Education
- Grant number: 2300613
- Fiscal year: 2023
- Funding amount: --
- Project category: Continuing Grant
Study on an intelligent sensing system for fine-grained data of urban garbage discharge
- Grant number: 21K17735
- Fiscal year: 2021
- Funding amount: --
- Project category: Grant-in-Aid for Early-Career Scientists
SHF: Small: Beyond Accelerators - Using FPGAs to Achieve Fine-grained Control of Data-flows in Embedded SoCs
- Grant number: 2008799
- Fiscal year: 2020
- Funding amount: --
- Project category: Standard Grant
CC* Integration-Large: mGuard: A Secure Real-time Data Distribution System with Fine-Grained Access Control for mHealth Research
- Grant number: 2019085
- Fiscal year: 2020
- Funding amount: --
- Project category: Standard Grant
Using Fine-Grained Quantitative and Qualitative Data to Enhance Curricula and Broaden Participation in Computer Science
- Grant number: 2030070
- Fiscal year: 2020
- Funding amount: --
- Project category: Standard Grant
EAGER: A Fine-Grained Data-Driven Approach to Studying Sequential Decision-Making in Engineering Systems Design
- Grant number: 1842588
- Fiscal year: 2018
- Funding amount: --
- Project category: Standard Grant
Resource-intensive and data-intensive methods for robust fine-grained sentiment-analysis
- Grant number: 253706877
- Fiscal year: 2014
- Funding amount: --
- Project category: Research Grants
Research on the system to investigate fine-grained software process data analysis
- Grant number: 22500027
- Fiscal year: 2010
- Funding amount: --
- Project category: Grant-in-Aid for Scientific Research (C)
Fine-Grained Semantic Markup of Descriptive Data for Knowledge Applications in Biodiversity Domains
- Grant number: 0849982
- Fiscal year: 2009
- Funding amount: --
- Project category: Standard Grant