III: Small: An end-to-end pipeline for interactive visual analysis of big data
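Basic information
- Award number: 1815238
- Principal investigator:
- Amount: $485,800
- Host institution:
- Host institution country: United States
- Project type: Standard Grant
- Fiscal year: 2018
- Funding country: United States
- Period: 2018-09-01 to 2021-08-31
- Project status: Completed
- Source:
- Keywords:
Project abstract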
Computer scientists, statisticians, and data scientists use sophisticated analysis techniques to extract insights from massive data sources, ranging from astronomical data gathered by telescopes to climate simulations run on supercomputers to user activity in online social networks. At the same time, they would like to use interactive visualization so that they can understand and explore their data through graphics and visual interfaces. Such visual analytics systems are more intuitive and more powerful, and they allow analysts to make better decisions more confidently. Currently, these visualization systems are not fast enough for broad applicability in large-scale settings. This project develops novel techniques to speed up the methods used in data analysis, so that stakeholders can combine the sophisticated analyses they need with the interactive visualization systems they prefer to use. The project has the potential to transform how current infrastructure and systems for interactive and exploratory data analysis are designed. Open-source software that integrates directly with the libraries and programming languages used by scientists and other data analysts will be broadly disseminated. In addition, the concepts and technologies developed here will be used in classrooms to train future generations of researchers and computer scientists.
A major obstacle currently stands in the way of applying interactive visualization systems to large-scale data analysis: many techniques require repeated loops (or scans) over the dataset to collect the appropriate aggregation information. Recently developed hierarchical, spatiotemporal data cube data structures replace many of these scans, but they are suitable only for basic bar charts, histograms, and heatmaps, since they accelerate only a small number of the queries available to database management systems. In contrast, this project aims to develop novel data structures that support a broader swath of the exploratory data analysis and visualization pipeline, such as k-means, logistic regression, least-squares optimization, and dimensionality reduction, and to connect these data structures directly to the APIs and calls made by visualization libraries that use these methods. The performance of the proposed infrastructure for interactive and exploratory data analysis will be evaluated on specifically designed benchmarks that compare existing and novel interactive data cube systems. The benchmarks will enable synthesis of knowledge that is spread across a somewhat fractured research area. They will, in turn, guide the evaluation of improvements to these data structures, aiming at a 30% to 80% decrease in storage costs and likely comparable gains in preprocessing time, which translate directly into better interactive visualization capabilities. APIs for integrating these data structures into modern data science environments such as R and Python will be developed and widely disseminated to increase the impact of this project.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
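The abstract's central idea, replacing repeated scans with a pre-aggregated structure that can answer more than counting queries, can be illustrated with a small sketch. The code below is not the project's data structure, which the abstract does not describe. It is a hypothetical one-dimensional example (the class name `MomentCube` and its methods are invented here for illustration) that scans the data once, stores per-bin sufficient statistics as prefix sums, and then answers both histogram-style counts and a simple least-squares line fit for any contiguous range of bins without touching the raw rows again.

```python
from bisect import bisect_right


class MomentCube:
    """Hypothetical 1-D pre-aggregation (illustrative only, not the project's design).

    Each bin stores the sufficient statistics for simple least squares:
    n, sum(x), sum(y), sum(x*x), sum(x*y). Prefix sums over the bins let any
    contiguous bin range be aggregated with one subtraction per statistic.
    """

    def __init__(self, rows, edges):
        # rows: iterable of (t, x, y); edges: sorted boundaries of half-open
        # bins [edges[i], edges[i+1]) over the binning attribute t.
        self.edges = list(edges)
        nbins = len(self.edges) - 1
        bins = [[0.0] * 5 for _ in range(nbins)]
        for t, x, y in rows:  # the only scan over the raw data
            i = bisect_right(self.edges, t) - 1
            if 0 <= i < nbins:  # rows outside the binned range are skipped
                b = bins[i]
                b[0] += 1
                b[1] += x
                b[2] += y
                b[3] += x * x
                b[4] += x * y
        # prefix[k] holds the statistics summed over bins[0:k]
        self.prefix = [[0.0] * 5]
        for b in bins:
            last = self.prefix[-1]
            self.prefix.append([p + q for p, q in zip(last, b)])

    def stats(self, lo_bin, hi_bin):
        """Aggregate statistics for bins lo_bin .. hi_bin - 1."""
        return [h - l for h, l in zip(self.prefix[hi_bin], self.prefix[lo_bin])]

    def count(self, lo_bin, hi_bin):
        """Histogram-style query: number of rows in the bin range."""
        return int(self.stats(lo_bin, hi_bin)[0])

    def fit_line(self, lo_bin, hi_bin):
        """Least-squares fit y = slope * x + intercept over the bin range."""
        n, sx, sy, sxx, sxy = self.stats(lo_bin, hi_bin)
        denom = n * sxx - sx * sx
        if denom == 0:
            raise ValueError("need at least two distinct x values in range")
        slope = (n * sxy - sx * sy) / denom
        intercept = (sy - slope * sx) / n
        return slope, intercept


# Usage: ten rows lying exactly on y = 2x + 1, binned on t into two bins.
rows = [(float(t), float(t), 2.0 * t + 1.0) for t in range(10)]
cube = MomentCube(rows, [0, 5, 10])
print(cube.count(0, 1))     # rows falling in the first bin
print(cube.fit_line(0, 2))  # line fitted over both bins
```

Under these assumptions a brushing interaction that changes the selected range costs a constant number of arithmetic operations rather than a new pass over the data. Subtracting floating-point prefix sums can lose precision on large inputs, so a production structure would need a more careful numerical scheme.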
Project outcomes
Journal articles (3)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Disentangling Influence: Using disentangled representations to audit model predictions
- DOI:
- Published: 2019
- Journal:
- Impact factor: 0
- Authors: Marx, Charles; Phillips, Richard; Friedler, Sorelle A.; Scheidegger, Carlos; Venkatasubramanian, Suresh
- Corresponding author: Venkatasubramanian, Suresh
A Structured Review of Data Management Technology for Interactive Visualization and Analysis
- DOI: 10.1109/tvcg.2020.3028891
- Published: 2020-10
- Journal:
- Impact factor: 5.2
- Authors: L. Battle; C. Scheidegger
- Corresponding authors: L. Battle; C. Scheidegger
Other publications by Carlos Scheidegger
Aardvark: Comparative Visualization of Data Analysis Scripts
- DOI:
- Published: 2023
- Journal:
- Impact factor: 0
- Authors: Rebecca Faust; Carlos Scheidegger; Chris North
- Corresponding author: Chris North
Set Visualization and Uncertainty
- DOI:
- Published: 2023
- Journal:
- Impact factor: 0
- Authors: ∗. SusanneBleisch;∗. StevenChaplick;∗. Jan;∗. EvaMayr;∗. MarcvanKreveld;†. AnnikaBonerath;Dagstuhl Reports;Markus Wallinger;MosaicSets Sara;Irina Fabrikant;Alexander Wolff;StoryLines;Marc van;Peter Rodgers;D. Archambault;Bei Wang;Nathan van Beusekom;Amy Griffin;Martin Krzywinski;Paolo Simonetto;Carlos Scheidegger;David Auber Main;S. Miksch;TU Wien;T. Gschwandtner;M. Bögl;P. Federico;Silvia Miksch Main;NL TU Eindhoven;Wouter Meulemans;Bettina Speckmann License;C. Tominski;Michael Behrisch;S. Fabrikant;Helen C. Purchase License;Hsiang
- Corresponding author: Hsiang
Other grants held by Carlos Scheidegger
III: Medium: Collaborative Research: Evaluating and Maximizing Fairness in Information Flow on Networks
- Award number: 1955162
- Fiscal year: 2020
- Funding amount: $485,800
- Project type: Continuing Grant
III: Medium: Collaborative Research: Topological Data Analysis for Large Network Visualization
- Award number: 1513651
- Fiscal year: 2015
- Funding amount: $485,800
- Project type: Standard Grant
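Similar National Natural Science Foundation of China (NSFC) grants
Forensic application of circadian-rhythmic small RNAs in estimating bloodstain deposition time
- Award number:
- Approval year: 2024
- Funding amount: CNY 0
- Project type: Provincial/municipal project
Mechanism by which tRNA-derived small RNA upregulates the YBX1/CCL5 pathway in bortezomib-induced chronic pain
- Award number: n/a
- Approval year: 2022
- Funding amount: CNY 100,000
- Project type: Provincial/municipal project
Small RNA regulation of the type I-F CRISPR-Cas adaptive immune response and its molecular mechanism
- Award number: 32000033
- Approval year: 2020
- Funding amount: CNY 240,000
- Project type: Young Scientists Fund
Mechanism of small RNA regulation of the biocontrol function of Bacillus amyloliquefaciens FZB42
- Award number: 31972324
- Approval year: 2019
- Funding amount: CNY 580,000
- Project type: General Program
Mechanism by which Streptococcus mutans small RNAs link LuxS quorum sensing to biofilm formation
- Award number: 81900988
- Approval year: 2019
- Funding amount: CNY 210,000
- Project type: Young Scientists Fund
Molecular mechanism of pigeon milk secretion analyzed by small RNA sequencing
- Award number: 31802058
- Approval year: 2018
- Funding amount: CNY 260,000
- Project type: Young Scientists Fund
Function and mechanism of key gut bacterial small RNAs in the onset and progression of Crohn's disease
- Award number: 31870821
- Approval year: 2018
- Funding amount: CNY 560,000
- Project type: General Program
Pathogenic mechanism of rice grassy stunt virus regulated by small RNA-mediated DNA methylation
- Award number: 31772128
- Approval year: 2017
- Funding amount: CNY 600,000
- Project type: General Program
Immune regulatory mechanism of acupuncture treatment of Hashimoto's thyroiditis based on small RNA-seq
- Award number: 81704176
- Approval year: 2017
- Funding amount: CNY 200,000
- Project type: Young Scientists Fund
Regulation of small RNA biosynthesis by rice OsSGS3 and OsHEN1 and its modulation of disease resistance
- Award number: 91640114
- Approval year: 2016
- Funding amount: CNY 850,000
- Project type: Major Research Plan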
Similar overseas grants
Collaborative Research: HCC: Small: End-User Guided Search and Optimization for Accessible Product Customization and Design
- Award number: 2327136
- Fiscal year: 2023
- Funding amount: $485,800
- Project type: Standard Grant
Collaborative Research: OAC Core: Small: Anomaly Detection and Performance Optimization for End-to-End Data Transfers at Scale
- Award number: 2412329
- Fiscal year: 2023
- Funding amount: $485,800
- Project type: Standard Grant
Small molecules targeting RuvBL complex for triple negative breast cancer
- Award number: 10751401
- Fiscal year: 2023
- Funding amount: $485,800
- Project type:
Collaborative Research: RI: Small: End-to-end Learning of Fair and Explainable Schedules for Court Systems
- Award number: 2232055
- Fiscal year: 2023
- Funding amount: $485,800
- Project type: Standard Grant
Comprehensive, Real Time Monitoring of the Accumulation and Clearance of Small Molecules in Kidney Disease
- Award number: 10863011
- Fiscal year: 2023
- Funding amount: $485,800
- Project type:
Perivascular Inflammation and Vascular Remodeling: A Common Cause of Hemorrhage in Cerebral Small Vessel Diseases?
- Award number: 10861499
- Fiscal year: 2023
- Funding amount: $485,800
- Project type:
Collaborative Research: RI: Small: End-to-end Learning of Fair and Explainable Schedules for Court Systems
- Award number: 2232054
- Fiscal year: 2023
- Funding amount: $485,800
- Project type: Standard Grant
CyberTraining: Implementation: Small: COMPrehensive Learning for end-users to Effectively utilize CyberinfraStructure (COMPLECS)
- Award number: 2320934
- Fiscal year: 2023
- Funding amount: $485,800
- Project type: Standard Grant
Collaborative Research: HCC: Small: End-User Guided Search and Optimization for Accessible Product Customization and Design
- Award number: 2327137
- Fiscal year: 2023
- Funding amount: $485,800
- Project type: Standard Grant
Discovery of GPR171 small molecule ligands for the treatment of chronic pain
- Award number: 10604177
- Fiscal year: 2023
- Funding amount: $485,800
- Project type: