Structural Comparison of Labelled Graph Data

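Basic information

  • Grant number:
    EP/G012407/1
  • Principal investigator:
  • Amount:
    $103,300
  • Host institution:
  • Host institution country:
    United Kingdom
  • Project category:
    Research Grant
  • Fiscal year:
    2009
  • Funding country:
    United Kingdom
  • Duration:
    2009 to (no data)
  • Project status:
    Completed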

Project abstract

Semistructured data is an important data format, essentially embodied by the XML standard. It is increasingly used in many critical applications, especially in business-to-business communications and peer-to-peer traffic on the Internet.

The main advantage of the semistructured format is that it increases the flexibility of the way the data may be structured: as new situations arise, the structure of the data may evolve as well as the values.

For example, given an established stream of business messages about car insurance between multiple brokers and underwriters, one underwriter may decide that the colour of a car is a significant factor not currently included in the data being supplied. They can advertise this fact, and brokers may choose to start asking their clients for the colour of their cars. From this point onwards, a field may start to appear in the messages between brokers and underwriters, the field being optionally included by brokers and optionally acted upon by underwriters where it is present.

When much of this kind of activity takes place, it becomes important to consider the structural attributes of the whole, potentially large, pool of data, as well as the individual items. For example, given a set of data items: do they have anything much in common with each other? Do they all have at least something in common, and if so, what? How different is one given item from another, and are there any others exactly the same as this one? Can one or more clusters of similarly or identically structured items be identified within the pool? Is the pool of data itself evolving or becoming quiescent; that is, over time, are individual items becoming, on the whole, more different or more similar?

Recent work we have done on the inherent complexity, and thus regularity, of semistructured data items has led us to an observation that we believe will give great insights into how to answer these and other similar questions. Using some long-established results from Information Theory, we have applied the concept of mechanical entropy to the domain of semistructured data to give a metric for the complexity of individual data items. We have also discovered an efficient way of calculating this by use of a data structure, the structural fingerprint, which represents the essential structure of the item. We now believe that reapplying this work in the above context will give great leverage in producing useful, quantified answers to the above questions and others, while the use of the structural fingerprint will make it computationally feasible to perform these calculations upon large pools of semistructured data in the global domain of the Internet.
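The abstract does not define its entropy measure or the structural fingerprint, so the following is only an illustrative sketch of the general idea, not the project's method. It assumes, purely for illustration, that the structure of an XML item can be summarised by counting its root-to-element label paths, and that the Shannon entropy of those counts serves as a rough complexity score. The function names `label_paths` and `shannon_entropy` and the sample messages are hypothetical; the messages mirror the car-colour example above.

```python
import math
import xml.etree.ElementTree as ET
from collections import Counter


def label_paths(xml_text):
    """Count each root-to-element label path, e.g. 'quote/car/colour'.

    Assumption: this path multiset is a stand-in for a structural summary;
    it is not the 'structural fingerprint' named in the abstract.
    """
    counts = Counter()

    def walk(node, prefix):
        path = f"{prefix}/{node.tag}" if prefix else node.tag
        counts[path] += 1
        for child in node:
            walk(child, path)

    walk(ET.fromstring(xml_text), "")
    return counts


def shannon_entropy(counts):
    """Shannon entropy, in bits, of the distribution given by the counts."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical broker messages before and after the optional field appears.
old_message = "<quote><car><make/><model/></car></quote>"
new_message = "<quote><car><make/><model/><colour/></car></quote>"

old_paths = label_paths(old_message)
new_paths = label_paths(new_message)

# Entropy of each item's path distribution.
print(shannon_entropy(old_paths), shannon_entropy(new_paths))
# Paths present only in the newer message.
print(sorted(set(new_paths) - set(old_paths)))
```

Comparing the two path sets shows which structure the newer message adds, which is the kind of item-to-item question the abstract raises. A real measure would also need to account for values, attributes, and repeated substructure.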

Project outputs

Journal articles (7)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Similarity Search and Applications - 9th International Conference, SISAP 2016, Tokyo, Japan, October 24-26, 2016, Proceedings
  • DOI:
    10.1007/978-3-319-46759-7_4
  • Publication date:
    2016
  • Journal:
  • Impact factor:
    0
  • Authors:
    Connor R
  • Corresponding author:
    Connor R
A New Probabilistic Ranking Model
  • DOI:
    10.1145/2499178.2499185
  • Publication date:
    2013
  • Journal:
  • Impact factor:
    0
  • Authors:
    Connor R
  • Corresponding author:
    Connor R
Towards a universal information distance for structured data
  • DOI:
    10.1145/1995412.1995426
  • Publication date:
    2011
  • Journal:
  • Impact factor:
    0
  • Authors:
    Connor R
  • Corresponding author:
    Connor R
Hilbert Exclusion Improved Metric Search through Finite Isometric Embeddings

Other publications by Richard Connor

Supermetric search
  • DOI:
    10.1016/j.is.2018.01.002
  • Publication date:
    2019-02-01
  • Journal:
  • Impact factor:
  • Authors:
    Richard Connor; Lucia Vadicamo; Franco Alberto Cardillo; Fausto Rabitti
  • Corresponding author:
    Fausto Rabitti
Query filtering using two-dimensional local embeddings
  • DOI:
    10.1016/j.is.2021.101808
  • Publication date:
    2021-11-01
  • Journal:
  • Impact factor:
  • Authors:
    Lucia Vadicamo; Richard Connor; Edgar Chávez
  • Corresponding author:
    Edgar Chávez
Bitpart: Exact metric search in high(er) dimensions
  • DOI:
    10.1016/j.is.2020.101493
  • Publication date:
    2021-01-01
  • Journal:
  • Impact factor:
  • Authors:
    Alan Dearle; Richard Connor
  • Corresponding author:
    Richard Connor
A bounded distance metric for comparing tree structure
  • DOI:
    10.1016/j.is.2010.12.003
  • Publication date:
    2011-06-01
  • Journal:
  • Impact factor:
  • Authors:
    Richard Connor; Fabio Simeoni; Michael Iakovos; Robert Moss
  • Corresponding author:
    Robert Moss
Re-ranking via local embeddings: A use case with permutation-based indexing and the nSimplex projection
  • DOI:
    10.1016/j.is.2020.101506
  • Publication date:
    2021-01-01
  • Journal:
  • Impact factor:
  • Authors:
    Lucia Vadicamo; Claudio Gennaro; Fabrizio Falchi; Edgar Chávez; Richard Connor; Giuseppe Amato
  • Corresponding author:
    Giuseppe Amato

Similar overseas grants

Investigating the dynamic nature of listening comprehension in EMI lectures: A comparison of Japan, Hong Kong, and Sweden
  • Grant number:
    23K25340
  • Fiscal year:
    2024
  • Funding amount:
    $103,300
  • Project category:
    Grant-in-Aid for Scientific Research (B)
Understanding Dike Propagation Through Comparison of High-fidelity Coupled Fracture and Fluid Flow Models and Field Observations
  • Grant number:
    2333837
  • Fiscal year:
    2024
  • Funding amount:
    $103,300
  • Project category:
    Continuing Grant
MCA Pilot PUI: Proxy-model comparison using carbon isotopes from annually banded marine calcifiers and ocean circulation inverse models to evaluate coastal carbon cycle processes
  • Grant number:
    2322042
  • Fiscal year:
    2024
  • Funding amount:
    $103,300
  • Project category:
    Standard Grant
Comparison Framework - Legal Services
  • Grant number:
    10091101
  • Fiscal year:
    2024
  • Funding amount:
    $103,300
  • Project category:
    Collaborative R&D
Comparison of Machine Learning and Conventional Statistical Modeling for Predicting Readmission Following Acute Heart Failure Hospitalization
  • Grant number:
    495410
  • Fiscal year:
    2023
  • Funding amount:
    $103,300
  • Project category:
An International Comparison of the Art Production Strategies among Fringe Theatre Producers in Diverse Social Conditions
  • Grant number:
    22KJ3016
  • Fiscal year:
    2023
  • Funding amount:
    $103,300
  • Project category:
    Grant-in-Aid for JSPS Fellows
Contemporary Long-Term Homicide Trends in England and Wales in the Period 1977-2019 and a Comparison with Non-Lethal Violence Trends
  • Grant number:
    ES/X000575/1
  • Fiscal year:
    2023
  • Funding amount:
    $103,300
  • Project category:
    Research Grant
Examining the long-term social legacy of mega urban projects - A comparison of Shanghai and Dujiangyan City
  • Grant number:
    ES/W003104/2
  • Fiscal year:
    2023
  • Funding amount:
    $103,300
  • Project category:
    Research Grant
Collaborative Research: Ultra-High Resolution Paleostreamflow in Southeast Asia--Proxy/Model Comparison
  • Grant number:
    2302669
  • Fiscal year:
    2023
  • Funding amount:
    $103,300
  • Project category:
    Standard Grant
Comparison of clinical and health economic outcomes of COVID-19 inpatients: an ecological study
  • Grant number:
    23K16292
  • Fiscal year:
    2023
  • Funding amount:
    $103,300
  • Project category:
    Grant-in-Aid for Early-Career Scientists