ESPRESSO: Efficient Search over Personal Repositories - Secure and Sovereign

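Basic information

  • Grant number:
    EP/W025868/1
  • Principal investigator:
  • Amount:
    $ 1.0055 million
  • Host institution:
  • Host institution country:
    United Kingdom
  • Project category:
    Research Grant
  • Fiscal year:
    2022
  • Funding country:
    United Kingdom
  • Duration:
    2022 to (no data)
  • Project status:
    Ongoing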

Project abstract

Recent controversies over access to and processing of personal data have highlighted the significance of the sovereignty of individuals over their personal data and are leading to new paradigms for application development based on personal online datastores (pods), where individuals have complete control over which applications can gain access to their personal data and for what purpose. Emerging frameworks and ecosystems, such as SOLID and the one by Dataswift, support the development of such decentralised applications which, when granted access by the individuals concerned, can access the data stored in pods to provide services to users in areas such as health and well-being, social networking, and collaborative authoring. However, this decentralisation presents significant performance challenges when searching or querying data stored in pods on a large scale, which is critical to fulfil the potential of such applications. The current state of the art in searching for data by supplying keywords or phrases would require separate indexes to be created and maintained for each user (or group of users who share identical access rights to all available pods), leading to significant increases in storage, network and computation costs. Similarly, the current state of the art in searching for data by supplying a database query (e.g. using the SPARQL query language) would require separate metadata to be created and maintained for each user, and additional checks to control access to, and caching of, data during the query evaluation process. The current state of the art does not provide techniques for the efficient generation and maintenance of the necessary indexes and meta-information data structures, nor algorithms for evaluating search queries and aggregating query results on a large scale in such decentralised settings.
The ESPRESSO project will research, develop and evaluate appropriate algorithms, indexes and meta-information data structures to enable large-scale data search across distributed pods. Our techniques will handle varying access rights and data caching requirements, as set by each individual pod owner. We will address both keyword-based search where the most important (top-k) or all search results may be required, and distributed querying using SPARQL. We will evaluate our techniques over pods that are implemented using the SOLID framework and SOLID-compatible data ecosystems such as the one by Dataswift based on HAT Microserver technology. The numbers, distribution and content of these pods will be determined by existing Information Retrieval benchmarks, extended with ESPRESSO-specific data about owners' access and caching restrictions; and by the requirements of real-world scenarios elicited from the Health domain, which provides a wealth of settings to investigate and demonstrate the efficiency gains in pod data search achieved by our new techniques. We will not use real personal data, but will employ synthetic data generated using statistical patterns obtained in the aggregate from publicly available anonymised datasets of human subjects. The project will collaborate with the NExT++ centre in Singapore and with the SOLID and HAT project teams. We will actively engage with the academic community and industrial stakeholders through dedicated events. The project's findings will inform current research in distributed systems, databases, the digital economy and cybersecurity, as well as the design of innovative decentralised applications, and will help to address the policy challenges relating to data sovereignty and privacy.
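
To make the top-k keyword search setting in the abstract concrete, the following is a minimal illustrative sketch and not the project's actual algorithm or data structures. It assumes a hypothetical in-memory Pod class, a simple term-count score, and per-document reader sets standing in for each owner's access rights. Each pod filters by access rights locally and returns only its own k best matches, which are then merged into a global top-k.

```python
import heapq
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Pod:
    """Hypothetical stand-in for a personal online datastore (not a SOLID API)."""
    owner: str
    # doc_id -> (text, set of user ids allowed to read the document)
    docs: dict = field(default_factory=dict)

    def add(self, doc_id, text, readers):
        self.docs[doc_id] = (text, set(readers))

    def local_top_k(self, user, keywords, k):
        """Score only the documents this user may read; return the k best."""
        scored = []
        for doc_id, (text, readers) in self.docs.items():
            if user not in readers:
                continue  # access check is enforced inside the pod
            counts = Counter(text.lower().split())
            # Assumed scoring: total occurrences of the query keywords.
            score = sum(counts[w.lower()] for w in keywords)
            if score > 0:
                scored.append((score, self.owner, doc_id))
        return heapq.nlargest(k, scored)


def federated_top_k(pods, user, keywords, k):
    """Merge the per-pod top-k lists into a single global top-k."""
    candidates = []
    for pod in pods:
        candidates.extend(pod.local_top_k(user, keywords, k))
    return heapq.nlargest(k, candidates)
```

Because the assumed score depends only on the document itself, any document in the global top-k must also be in its own pod's local top-k, so each pod needs to send back at most k candidates. A call such as federated_top_k(pods, "bob", ["blood", "pressure"], 2) returns (score, owner, doc_id) tuples drawn only from documents that "bob" is allowed to read.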

Project outputs

Journal articles (1)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Web Information Systems Engineering - WISE 2023 - 24th International Conference, Melbourne, VIC, Australia, October 25-27, 2023, Proceedings
  • DOI:
    10.1007/978-981-99-7254-8_28
  • Publication date:
    2023
  • Journal:
  • Impact factor:
    0
  • Authors:
    Ragab M
  • Corresponding author:
    Ragab M

Other publications by Thanassis Tiropanis

ESPRESSO: A Framework to Empower Search on the Decentralized Web
  • DOI:
    10.1007/s41019-024-00263-w
  • Publication date:
    2024-11-26
  • Journal:
  • Impact factor:
    4.600
  • Authors:
    Mohamed Ragab;Yury Savateev;Helen Oliver;Thanassis Tiropanis;Alexandra Poulovassilis;Adriane Chapman;George Roussos
  • Corresponding author:
    George Roussos

Similar overseas grants

CIF: Small: Theory and Algorithms for Efficient and Large-Scale Monte Carlo Tree Search
  • Grant number:
    2327013
  • Fiscal year:
    2023
  • Funding amount:
    $ 1.0055 million
  • Project category:
    Standard Grant
Development of deep learning for efficient search of continuous gravitational waves toward exploring new physics and new particles
  • Grant number:
    23K13099
  • Fiscal year:
    2023
  • Funding amount:
    $ 1.0055 million
  • Project category:
    Grant-in-Aid for Early-Career Scientists
Privacy-Preserving and Action-Event Sequence Data Mining and Advanced Data Structures for Efficient Heuristic Search
  • Grant number:
    RGPIN-2019-07301
  • Fiscal year:
    2022
  • Funding amount:
    $ 1.0055 million
  • Project category:
    Discovery Grants Program - Individual
Development of a rational screening method for highly efficient search for neurotrophic compounds
  • Grant number:
    22K06689
  • Fiscal year:
    2022
  • Funding amount:
    $ 1.0055 million
  • Project category:
    Grant-in-Aid for Scientific Research (C)
Efficient and Effective Search over Graph-like Databases
  • Grant number:
    RGPIN-2017-04993
  • Fiscal year:
    2022
  • Funding amount:
    $ 1.0055 million
  • Project category:
    Discovery Grants Program - Individual
Efficient Nearest Neighbour Search for Wasserstein Distances
  • Grant number:
    557558-2021
  • Fiscal year:
    2022
  • Funding amount:
    $ 1.0055 million
  • Project category:
    Postgraduate Scholarships - Doctoral
ESPRESSO: Efficient Search over Personal Repositories - Secure and Sovereign
  • Grant number:
    EP/W024659/1
  • Fiscal year:
    2022
  • Funding amount:
    $ 1.0055 million
  • Project category:
    Research Grant
Efficient Search for Bioactive Substances from Difficult-to-Cultivate Microorganisms
  • Grant number:
    22K06662
  • Fiscal year:
    2022
  • Funding amount:
    $ 1.0055 million
  • Project category:
    Grant-in-Aid for Scientific Research (C)
CPS: Medium: Collaborative Research: Srch3D: Efficient 3D Model Search via Online Manufacturing-specific Object Recognition and Automated Deep Learning-Based Design Classification
  • Grant number:
    2240733
  • Fiscal year:
    2022
  • Funding amount:
    $ 1.0055 million
  • Project category:
    Standard Grant
Privacy-Preserving and Action-Event Sequence Data Mining and Advanced Data Structures for Efficient Heuristic Search
  • Grant number:
    RGPIN-2019-07301
  • Fiscal year:
    2021
  • Funding amount:
    $ 1.0055 million
  • Project category:
    Discovery Grants Program - Individual