BIGDATA: F: Collaborative Research: Design and Computation of Scalable Graph Distances in Metric Spaces: A Unified Multiscale Interpretable Perspective
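Basic information
- Award number: 1741129
- Principal investigator:
- Amount: $599,200
- Host institution:
- Host institution country: United States
- Award type: Standard Grant
- Fiscal year: 2017
- Funding country: United States
- Period: 2017-09-01 to 2023-08-31
- Status: Completed
- Source:
- Keywords:
Project abstract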
Representations of real-world phenomena as graphs (a.k.a. networks) are ubiquitous, ranging from social and information networks to technological, biological, chemical, and brain networks. Many graph mining tasks -- including clustering, anomaly detection, nearest neighbor, similarity search, pattern recognition, and transfer learning -- require a distance measure between graphs to be computed efficiently. The existing distance measures between graphs leave a lot to be desired. They are overwhelmingly based on heuristics. Many do not scale to graphs with millions of nodes; others do not satisfy the metric properties of non-negativity, positive definiteness, symmetry, and triangle inequality. This project studies a formal mathematical foundation covering a family of graph distances that overcome these limitations, focusing on real-world applications in biology and social network analysis. It also provides a universal methodology for parallelizing the computation of graph distance metrics within this family over massive graphs with millions of nodes, and scaling it over cloud computing resources.

This project studies, designs, and evaluates graph distances that satisfy the following six properties: (1) They are scalable -- i.e., they are strictly subquadratic in runtime and achieve a speedup when computed in parallel. (2) They are metrics -- i.e., they satisfy non-negativity, positive definiteness, symmetry, and triangle inequality. (3) They are discriminative, as measured by comparisons to the "chemical distance", which finds the optimal mapping between two graphs that minimizes edge discrepancies. (4) They are statistically robust -- i.e., they have confidence intervals. (5) They can incorporate auxiliary information available on nodes and links. (6) They are interpretable to subject matter experts. Rather than providing a single metric, this project explores a family of such graph distance metrics. It also provides a universal methodology, using the Alternating Direction Method of Multipliers (ADMM), for parallelizing the computation of graph distance metrics within this family over massive graphs with millions of nodes. The proposed metrics are evaluated over massive real-world graphs using Apache Spark on a cloud computing infrastructure.
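The "chemical distance" named in the abstract can be illustrated with a small brute-force sketch. It is an illustration only, not the project's method: it assumes unweighted, undirected graphs with the same number of nodes, given as 0/1 adjacency matrices, and it counts mismatched node pairs under the best node relabeling. The function name `chemical_distance` and the example graphs are hypothetical, and the project's own formulation may differ. The search tries every permutation, so it is only usable for very small graphs, which is the scalability problem the abstract describes.

```python
from itertools import permutations


def chemical_distance(a, b):
    """Brute-force edge-discrepancy distance between two small graphs.

    a, b: square 0/1 adjacency matrices (lists of lists) of undirected
    graphs with the same number of nodes. This is an assumed, simplified
    definition used only for illustration.
    """
    n = len(a)
    if len(b) != n:
        raise ValueError("graphs must have the same number of nodes")
    best = None
    # Try every mapping of nodes of `a` onto nodes of `b` (n! candidates).
    for perm in permutations(range(n)):
        # Count node pairs that are an edge in one graph but not the other.
        mismatches = sum(
            a[i][j] != b[perm[i]][perm[j]]
            for i in range(n)
            for j in range(i + 1, n)
        )
        if best is None or mismatches < best:
            best = mismatches
    return best


# Three graphs on 3 nodes: a path, a triangle, and the empty graph.
path = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
empty = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
# The same path with its nodes relabeled (edges 0-2 and 1-2).
relabeled_path = [[0, 0, 1], [0, 0, 1], [1, 1, 0]]

d_pt = chemical_distance(path, triangle)
d_pe = chemical_distance(path, empty)
d_te = chemical_distance(triangle, empty)
d_pp = chemical_distance(path, relabeled_path)
```

On these inputs the relabeled path is at distance 0 from the original path, the distance is symmetric, and the three pairwise distances satisfy the triangle inequality, which matches the metric properties listed in the abstract when graphs are compared up to relabeling.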
Project outcomes
Journal articles (5)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
An Explicit Convergence Rate for Nesterov's Method from SDP
- DOI: 10.1109/isit.2018.8437794
- Published: 2018
- Journal:
- Impact factor: 0
- Authors: S. Safavi; Bikash Joshi; G. França; José Bento
- Corresponding author: José Bento
Efficient Projection onto the Perfect Phylogeny Model
- DOI:
- Published: 2018-11
- Journal:
- Impact factor: 0
- Authors: Bei Jia; Surjyendu Ray; S. Safavi; José Bento
- Corresponding authors: Bei Jia; Surjyendu Ray; S. Safavi; José Bento
A Family of Tractable Graph Distances
- DOI:
- Published: 2018
- Journal:
- Impact factor: 0
- Authors: Bento, José; Ioannidis, Stratis
- Corresponding author: Ioannidis, Stratis
Estimating Cellular Goals from High-Dimensional Biological Data
- DOI: 10.1145/3292500.3330775
- Published: 2018-07
- Journal:
- Impact factor: 0
- Authors: Laurence Yang; José Bento; Jean-Christophe Lachance; B. Palsson
- Corresponding authors: Laurence Yang; José Bento; Jean-Christophe Lachance; B. Palsson
Other publications by Jose Bento
Mystery-driven Institutionalism: The Jesuit Spiritual Exercises as a Book of Practices Leading Nowhere
- DOI: 10.1108/s0733-558x20200000071006
- Published: 2020
- Journal:
- Impact factor: 0
- Authors: Jose Bento; P. Quattrone
- Corresponding author: P. Quattrone
Similar overseas grants
BIGDATA: IA: Collaborative Research: Asynchronous Distributed Machine Learning Framework for Multi-Site Collaborative Brain Big Data Mining
- Award number: 2348159
- Fiscal year: 2023
- Amount: $599,200
- Award type: Standard Grant
BIGDATA: IA: Collaborative Research: Intelligent Solutions for Navigating Big Data from the Arctic and Antarctic
- Award number: 2308649
- Fiscal year: 2022
- Amount: $599,200
- Award type: Standard Grant
BIGDATA: Collaborative Research: F: Holistic Optimization of Data-Driven Applications
- Award number: 2027516
- Fiscal year: 2020
- Amount: $599,200
- Award type: Standard Grant
BIGDATA: F: Collaborative Research: Practical Analysis of Large-Scale Data with Lyme Disease Case Study
- Award number: 1934319
- Fiscal year: 2019
- Amount: $599,200
- Award type: Standard Grant
BIGDATA: IA: Collaborative Research: Protecting Yourself from Wildfire Smoke: Big Data-Driven Adaptive Air Quality Prediction Methodologies
- Award number: 1838022
- Fiscal year: 2019
- Amount: $599,200
- Award type: Standard Grant
BIGDATA: F: Collaborative Research: Foundations of Responsible Data Management
- Award number: 1926250
- Fiscal year: 2019
- Amount: $599,200
- Award type: Standard Grant
BIGDATA: IA: Collaborative Research: Intelligent Solutions for Navigating Big Data from the Arctic and Antarctic
- Award number: 1947584
- Fiscal year: 2019
- Amount: $599,200
- Award type: Standard Grant
BIGDATA: IA: Collaborative Research: Asynchronous Distributed Machine Learning Framework for Multi-Site Collaborative Brain Big Data Mining
- Award number: 1837964
- Fiscal year: 2019
- Amount: $599,200
- Award type: Standard Grant
BIGDATA: F: Collaborative Research: Optimizing Log-Structured-Merge-Based Big Data Management Systems
- Award number: 1838222
- Fiscal year: 2019
- Amount: $599,200
- Award type: Standard Grant
BIGDATA: F: Collaborative Research: Optimizing Log-Structured-Merge-Based Big Data Management Systems
- Award number: 1838248
- Fiscal year: 2019
- Amount: $599,200
- Award type: Standard Grant