Foundations of the Finite Model Theory of Concatenation

级联有限模型理论的基础

基本信息

  • 批准号:
    EP/T033762/1
  • 负责人:
  • 金额:
    $ 50.67万
  • 依托单位:
  • 依托单位国家:
    英国
  • 项目类别:
    Research Grant
  • 财政年份:
    2021
  • 资助国家:
    英国
  • 起止时间:
    2021 至 无数据
  • 项目状态:
    未结题

项目摘要

The topic of this proposal is the finite model theory of concatenation (FC), a new logic that combines approaches from two different subfields of theoretical computer science and that has direct applications in information extraction and database theory, and eventual applications in all areas that deal with searching in or filtering of textual data.Textual data is everywhere in modern life: No matter whether we are dealing with books, report, emails, social media, spreadsheets, or even HTML documents or log files, a huge amount of information is represented as a sequence of letters.Information extraction (IE) deals with the problem of extracting relevant information from such data. One model for this are document spanners, a relational framework for IE that can be understood as querying text like one would query a database. Due to their similarity to relational queries, document spanners have recently become a trending topic in database theory. Database theory uses methods from finite model theory (FMT), a sub-discipline of logic in computer science. FMT provided the inspiration for relational database, and both fields have maintained a close connection over these last fifty years. Of particular importance are methods for efficient evaluation of queries and deep insights of what can and cannot be expressed with certain types of queries. Every result in FMT is also a result on databases, as there is an immediate one-to-one correspondence between logical formulas and database queries: Every database query corresponds to a logical formula, and every logical formula corresponds to a database query.This is well-understood for relational databases, but it does directly translate to the textual setting of document spanners: The canonical FMT-approaches to text are provably too weak to model document spanners. But to be able to evaluate, optimize, and compare queries on texts, having this for document spanners would be tremendously useful.As first steps in this direction, previous research used "the theory of concatenation", a different logic from combinatorics on words (CoW), another sub-discipline of theoretical computer science, that studies patterns in texts. But this logic corresponds to infinite model theory, not finite model theory, which makes many useful techniques from FMT unavailable.To address this problem, the proposed new logic FC combines the theory of concatenation from CoW with the finite model approach from FMT. As shown in the preparatory research, FC has exactly the required expressive power of document spanners while still allowing the use of many classical FMT techniques. Apart from the connection to document spanners, FC can be seen as a natural variant of the logics that have previously been used: From an FMT-point of view, FC can be seen as extending the canonical first-order logic on strings with a string equality operator. From a CoW-point of view, FC is the finite model version of the theory of concatenation. As such, FC is a fundamental framework for using logic on textual data, and it can directly be extended to cover other operations, like length constraints as they are used in string verification.The preparatory research introduces FC, proves it has the desired one-to-one corresponds to document spanners, and shows that many FMT-techniques can be adapted. But many open questions remain, in particular regarding two of the most important aspects: The efficient evaluation of queries, and the fundamental limitations of the framework -- or, less formally, which queries can be evaluated quickly, which algorithms should be used for this, and what is possible in this querying model?This project will close this gap by creating a solid theoretical foundation for FC. This will require new proof techniques that at the very least merge those from CoW and FMT, or even go far beyond them. The ultimate goal is that FC will do for querying text what FMT did for databases.
本提案的主题是连接的有限模型理论(FC),这是一种新的逻辑,它结合了理论计算机科学两个不同子领域的方法,并直接应用于信息提取和数据库理论,最终应用于处理文本数据搜索或过滤的所有领域。文本数据在现代生活中无处不在:无论我们处理的是书籍、报告、电子邮件、社交媒体、电子表格,还是HTML文档或日志文件,大量的信息都表示为一系列字母。信息提取(IE)处理的是从这些数据中提取相关信息的问题。其中一个模型是文档空间,一个IE的关系框架,可以理解为像查询数据库一样查询文本。由于它们与关系查询的相似性,文档空间查询最近成为数据库理论中的一个热门话题。数据库理论使用有限模型理论(FMT)的方法,这是计算机科学中逻辑的一个子学科。FMT为关系数据库提供了灵感,这两个领域在过去的50年里保持着密切的联系。特别重要的是有效评估查询的方法,以及对某些类型的查询可以表达什么和不可以表达什么的深刻见解。FMT中的每个结果也是数据库上的结果,因为逻辑公式和数据库查询之间存在直接的一一对应关系:每个数据库查询对应于一个逻辑公式,每个逻辑公式对应于一个数据库查询。这对于关系数据库来说是很好理解的,但它确实直接转换为文档空间的文本设置:可以证明,针对文本的规范FMT方法对于建模文档生成器来说太弱了。但是,为了能够评估、优化和比较文本查询,将其用于文档空间将是非常有用的。作为这个方向的第一步,以前的研究使用了“连接理论”,这是一种与词组合学(CoW)不同的逻辑,CoW是理论计算机科学的另一个子学科,研究文本中的模式。但这种逻辑对应的是无限模型理论,而不是有限模型理论,这使得FMT中的许多有用技术无法使用。正如在预备研究中所示,FC完全具有所需的文档空间的表达能力,同时仍然允许使用许多经典的FMT技术。除了与文档空间的连接之外,FC可以被看作是以前使用的逻辑的自然变体:从FMT的角度来看,FC可以被看作是用字符串相等运算符扩展了字符串上的规范一阶逻辑。从牛的观点来看,FC是连接理论的有限模型版本。因此,FC是一个基本的框架,用于使用逻辑的文本数据,它可以直接扩展到涵盖其他操作,如长度约束,因为它们被用于字符串verifiation.The预备研究介绍FC,证明它有所需的一对一对应的文档spatial,并表明,许多FMT技术可以适应。但是仍然存在许多悬而未决的问题,特别是关于两个最重要的方面:查询的有效评估,以及框架的基本限制-或者,不太正式,哪些查询可以快速评估,哪些算法应该用于此,以及在此查询模型中可能有什么?本项目将通过为FC创建坚实的理论基础来缩小这一差距。这将需要新的证明技术,至少将CoW和FMT的证明技术合并,甚至远远超出它们。最终的目标是FC将像FMT对数据库那样对文本进行查询。

项目成果

期刊论文数量(5)
专著数量(0)
科研奖励数量(0)
会议论文数量(0)
专利数量(0)
Developments in Language Theory - 27th International Conference, DLT 2023, Umeå, Sweden, June 12-16, 2023, Proceedings
语言理论的发展 - 第 27 届国际会议,DLT 2023,瑞典于默,2023 年 6 月 12-16 日,会议记录
  • DOI:
    10.1007/978-3-031-33264-7_19
  • 发表时间:
    2023
  • 期刊:
  • 影响因子:
    0
  • 作者:
    Thompson S
  • 通讯作者:
    Thompson S
The Theory of Concatenation over Finite Models
有限模型的级联理论
  • DOI:
  • 发表时间:
    2021
  • 期刊:
  • 影响因子:
    0
  • 作者:
    Freydenberger DD
  • 通讯作者:
    Freydenberger DD
Splitting Spanner Atoms: A Tool for Acyclic Core Spanners
分裂扳手原子:非循环核心扳手的工具
  • DOI:
  • 发表时间:
    2022
  • 期刊:
  • 影响因子:
    0
  • 作者:
    Freydenberger DD
  • 通讯作者:
    Freydenberger DD
{{ item.title }}
{{ item.translation_title }}
  • DOI:
    {{ item.doi }}
  • 发表时间:
    {{ item.publish_year }}
  • 期刊:
  • 影响因子:
    {{ item.factor }}
  • 作者:
    {{ item.authors }}
  • 通讯作者:
    {{ item.author }}

数据更新时间:{{ journalArticles.updateTime }}

{{ item.title }}
  • 作者:
    {{ item.author }}

数据更新时间:{{ monograph.updateTime }}

{{ item.title }}
  • 作者:
    {{ item.author }}

数据更新时间:{{ sciAawards.updateTime }}

{{ item.title }}
  • 作者:
    {{ item.author }}

数据更新时间:{{ conferencePapers.updateTime }}

{{ item.title }}
  • 作者:
    {{ item.author }}

数据更新时间:{{ patent.updateTime }}

Dominik Freydenberger其他文献

Dominik Freydenberger的其他文献

{{ item.title }}
{{ item.translation_title }}
  • DOI:
    {{ item.doi }}
  • 发表时间:
    {{ item.publish_year }}
  • 期刊:
  • 影响因子:
    {{ item.factor }}
  • 作者:
    {{ item.authors }}
  • 通讯作者:
    {{ item.author }}

相似国自然基金

Finite-time Lyapunov 函数和耦合系统的稳定性分析
  • 批准号:
    11701533
  • 批准年份:
    2017
  • 资助金额:
    22.0 万元
  • 项目类别:
    青年科学基金项目

相似海外基金

CAREER: Live Programming for Finite Model Finders
职业:有限模型查找器的实时编程
  • 批准号:
    2337667
  • 财政年份:
    2024
  • 资助金额:
    $ 50.67万
  • 项目类别:
    Continuing Grant
Using paired CT/MRI images to create finite element model of metaphyseal fracture in young children
使用配对 CT/MRI 图像创建幼儿干骺端骨折有限元模型
  • 批准号:
    2883532
  • 财政年份:
    2023
  • 资助金额:
    $ 50.67万
  • 项目类别:
    Studentship
Connecting Parton Cascade to QCDS Shear Viscosity for a Dynamical Model of Finite Temperature Kinetic Theory
将 Parton Cascade 连接到 QCDS 剪切粘度以建立有限温度动力学理论的动态模型
  • 批准号:
    2310021
  • 财政年份:
    2023
  • 资助金额:
    $ 50.67万
  • 项目类别:
    Standard Grant
Model-theoretic dividing lines in the context of finite structures
有限结构背景下的模型理论分界线
  • 批准号:
    2882659
  • 财政年份:
    2023
  • 资助金额:
    $ 50.67万
  • 项目类别:
    Studentship
Establishment of large-scale finite element electromagnetic field analysis using a high-precision human body model in a 5G environment
5G环境下高精度人体模型建立大规模有限元电磁场分析
  • 批准号:
    23K16893
  • 财政年份:
    2023
  • 资助金额:
    $ 50.67万
  • 项目类别:
    Grant-in-Aid for Early-Career Scientists
Linking Head Kinematics and Multi-Modal Imaging Using a Finite Element Head Model to Assess mTBI Risk Mitigation
使用有限元头部模型将头部运动学和多模态成像联系起来以评估 mTBI 风险缓解
  • 批准号:
    558260-2020
  • 财政年份:
    2022
  • 资助金额:
    $ 50.67万
  • 项目类别:
    Alliance Grants
Finite-Temperature Modelling of One, Two, and Many Dopants in the 2D Fermi-Hubbard Model
二维 Fermi-Hubbard 模型中一种、两种和多种掺杂剂的有限温度建模
  • 批准号:
    567963-2022
  • 财政年份:
    2022
  • 资助金额:
    $ 50.67万
  • 项目类别:
    Postgraduate Scholarships - Doctoral
Theory and Applications of the empirical likelihood and finite mixture model
经验似然和有限混合模型的理论与应用
  • 批准号:
    RGPIN-2019-04204
  • 财政年份:
    2022
  • 资助金额:
    $ 50.67万
  • 项目类别:
    Discovery Grants Program - Individual
Linking Head Kinematics and Multi-Modal Imaging Using a Finite Element Head Model to Assess mTBI Risk Mitigation
使用有限元头部模型将头部运动学和多模态成像联系起来以评估 mTBI 风险缓解
  • 批准号:
    558260-2020
  • 财政年份:
    2021
  • 资助金额:
    $ 50.67万
  • 项目类别:
    Alliance Grants
Theory and Applications of the empirical likelihood and finite mixture model
经验似然和有限混合模型的理论与应用
  • 批准号:
    RGPIN-2019-04204
  • 财政年份:
    2021
  • 资助金额:
    $ 50.67万
  • 项目类别:
    Discovery Grants Program - Individual
{{ showInfoDetail.title }}

作者:{{ showInfoDetail.author }}

知道了