Natural Language Data Management

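Basic Information

  • Grant number:
    RGPIN-2018-04683
  • Principal investigator:
  • Amount:
    $20,400
  • Host institution:
  • Host institution country:
    Canada
  • Program:
    Discovery Grants Program - Individual
  • Fiscal year:
    2020
  • Funding country:
    Canada
  • Period:
    2020-01-01 to 2021-12-31
  • Status:
    Completed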

Project Abstract

A large volume of the data generated every day is in some form of natural language intended for human consumption; this includes news articles, blog posts, tweets, scientific articles, Wikipedia entries, and financial reports. However, our querying capabilities over this data remain largely limited to keyword search, which reduces both the efficiency of search and the scope of the information that can be retrieved. Specifically, keyword search is insufficient when the search is not limited to selection and involves join and set operations over the contents of the documents. Keyword search is also a poor fit when the granularity of the search result is smaller than a document. This proposal advances research on querying natural language data and studies the issues that hinder querying and managing such data (both structured and unstructured) in documents. The particular challenges to be studied are: (1) storage and indexing, (2) querying and query processing, and (3) data integration and aggregation.
(1) Standard text-based indices such as the inverted index often ignore the structure of natural language data and do not provide the best support for queries. A storage system for natural language data may track both the ordering and the lexical relations between words and between senses to better support certain classes of queries. For example, synonymy and hyponymy relationships may indicate a degree of locality, in the sense that a document that matches a word is likely to match the synonyms and hyponyms of that word as well.
(2) Natural language data may be stored in and queried using a relational database, but composing queries over such data can be cumbersome, and relational systems may not provide the best support for these queries.
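As a rough illustration of the point about relational storage, the sketch below loads tokenized sentences into an in-memory SQLite table and finds the sentences in which one word occurs before another. The schema, the table name `token`, the helper names, and the sample sentences are all hypothetical and do not come from the proposal; the only aim is to show that even a simple ordered co-occurrence query already needs a self-join.

```python
import sqlite3

# Hypothetical sample sentences; not taken from the proposal.
SENTENCES = [
    "Edmonton is a city in Alberta",
    "Calgary is a city in Alberta",
    "Alberta is a province in Canada",
]


def build_db(sentences):
    """Store one row per (sentence id, word position, lowercased word)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE token (sid INTEGER, pos INTEGER, word TEXT)")
    rows = [
        (sid, pos, word.lower())
        for sid, text in enumerate(sentences)
        for pos, word in enumerate(text.split())
    ]
    conn.executemany("INSERT INTO token VALUES (?, ?, ?)", rows)
    return conn


def sentences_with_ordered_pair(conn, first, second):
    """Return ids of sentences in which `first` occurs before `second`."""
    cur = conn.execute(
        """
        SELECT DISTINCT a.sid
        FROM token AS a JOIN token AS b ON a.sid = b.sid
        WHERE a.word = ? AND b.word = ? AND a.pos < b.pos
        ORDER BY a.sid
        """,
        (first.lower(), second.lower()),
    )
    return [row[0] for row in cur]


conn = build_db(SENTENCES)
print(sentences_with_ordered_pair(conn, "city", "Alberta"))
```

Each additional word or ordering constraint adds another join over the same table, which gives a sense of why composing such queries by hand becomes cumbersome.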
A natural language data management system is expected to be geared towards the needs of applications that use natural language data, providing native support and treating natural language data as a first-class citizen. In particular, natural language data may be transformed into a meaning representation to better support reasoning and entailment detection and to ease integration with other sources. The querying system can then provide some support for these transformations; it can also use the known relationships between fragments (e.g., distributional similarity) both in evaluating queries and in optimizing their evaluation.
(3) Natural language data residing in different sources can refer to the same entities differently; even references within the same source can be ambiguous when taken out of context. Such ambiguities introduce problems in integrating and aggregating data from multiple sources. Despite progress in entity resolution, many challenges remain. We will work toward addressing the challenges related to querying by exploiting new developments in knowledge bases and linked data.
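The integration problem in challenge (3) can be sketched in a few lines, under strong simplifying assumptions. The alias table below is a hypothetical stand-in for a knowledge base, and the mentions and counts are made up for illustration; none of it comes from the proposal. The sketch maps surface mentions from two sources to canonical entities before aggregating, and it sets aside any mention it cannot resolve instead of guessing.

```python
from collections import defaultdict

# Hypothetical alias table standing in for a knowledge base (an assumption).
# Keys are already in the normalized form produced by normalize().
ALIASES = {
    "nyc": "New York City",
    "new york city": "New York City",
    "the big apple": "New York City",
    "la": "Los Angeles",
    "los angeles": "Los Angeles",
}


def normalize(mention):
    """Lowercase, drop periods, and collapse whitespace."""
    return " ".join(mention.lower().replace(".", "").split())


def aggregate(records, aliases):
    """Sum counts per canonical entity; collect mentions that do not resolve."""
    totals = defaultdict(int)
    unresolved = []
    for mention, count in records:
        entity = aliases.get(normalize(mention))
        if entity is None:
            unresolved.append(mention)
        else:
            totals[entity] += count
    return dict(totals), unresolved


# Made-up (mention, count) pairs from two hypothetical sources.
source_a = [("NYC", 3), ("L.A.", 2)]
source_b = [("New York City", 4), ("Los Angeles", 1), ("Springfield", 5)]

totals, unresolved = aggregate(source_a + source_b, ALIASES)
print(totals)
print(unresolved)
```

A lookup table like this ignores context entirely, so it cannot handle a mention that names different entities in different documents; that is the kind of ambiguity the abstract points to.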

Project Outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other grants by Rafiei, Davood

Natural Language Data Management
  • Grant number:
    RGPIN-2018-04683
  • Fiscal year:
    2022
  • Funding amount:
    $20,400
  • Program:
    Discovery Grants Program - Individual
Integrating third-party and open data with internal corporate databases
  • Grant number:
    542303-2019
  • Fiscal year:
    2021
  • Funding amount:
    $20,400
  • Program:
    Collaborative Research and Development Grants
Natural Language Data Management
  • Grant number:
    RGPIN-2018-04683
  • Fiscal year:
    2021
  • Funding amount:
    $20,400
  • Program:
    Discovery Grants Program - Individual
Integrating third-party and open data with internal corporate databases
  • Grant number:
    542303-2019
  • Fiscal year:
    2020
  • Funding amount:
    $20,400
  • Program:
    Collaborative Research and Development Grants
Natural Language Data Management
  • Grant number:
    RGPIN-2018-04683
  • Fiscal year:
    2019
  • Funding amount:
    $20,400
  • Program:
    Discovery Grants Program - Individual
Integrating third-party and open data with internal corporate databases
  • Grant number:
    542303-2019
  • Fiscal year:
    2019
  • Funding amount:
    $20,400
  • Program:
    Collaborative Research and Development Grants
Natural Language Data Management
  • Grant number:
    RGPIN-2018-04683
  • Fiscal year:
    2018
  • Funding amount:
    $20,400
  • Program:
    Discovery Grants Program - Individual
Fact extraction from organizational corpora
  • Grant number:
    522032-2017
  • Fiscal year:
    2017
  • Funding amount:
    $20,400
  • Program:
    Engage Grants Program
Enabling queries on relational data on the Web
  • Grant number:
    239127-2013
  • Fiscal year:
    2017
  • Funding amount:
    $20,400
  • Program:
    Discovery Grants Program - Individual
Enabling queries on relational data on the Web
  • Grant number:
    239127-2013
  • Fiscal year:
    2016
  • Funding amount:
    $20,400
  • Program:
    Discovery Grants Program - Individual

Similar international grants

CAREER: Data-driven design of graphene oxide for environmental applications enabled by natural language processing and machine learning techniques
  • Grant number:
    2238415
  • Fiscal year:
    2023
  • Funding amount:
    $20,400
  • Program:
    Continuing Grant
Applying Natural Language Processing to real-world patient data to optimise cancer care
  • Grant number:
    2897525
  • Fiscal year:
    2023
  • Funding amount:
    $20,400
  • Program:
    Studentship
RFA-CE-23-006 - Rigorous examination of anonymous reporting system data to prevent youth suicide and firearm violence: an applied natural language approach
  • Grant number:
    10786629
  • Fiscal year:
    2023
  • Funding amount:
    $20,400
  • Program:
Collaborative Research: SHF: Medium: Natural Language Models with Execution Data for Software Testing
  • Grant number:
    2313028
  • Fiscal year:
    2023
  • Funding amount:
    $20,400
  • Program:
    Standard Grant
EAGER: SSMCDAT2023: Natural Language Processing and Large Language Models for Automated Extraction of Materials Chemistry Data from Scientific Literature
  • Grant number:
    2334411
  • Fiscal year:
    2023
  • Funding amount:
    $20,400
  • Program:
    Standard Grant
Collaborative Research: SHF: Medium: Natural Language Models with Execution Data for Software Testing
  • Grant number:
    2313027
  • Fiscal year:
    2023
  • Funding amount:
    $20,400
  • Program:
    Standard Grant
Characterizing Bias and Care Disparities with Physical Restraint Use in the Emergency Setting Using Natural Language and Cognitive Data
  • Grant number:
    10431043
  • Fiscal year:
    2022
  • Funding amount:
    $20,400
  • Program:
Integrated AI analysis of natural language in nursing records by GPT and sensor data to support ward management
  • Grant number:
    22K19684
  • Fiscal year:
    2022
  • Funding amount:
    $20,400
  • Program:
    Grant-in-Aid for Challenging Research (Exploratory)
Natural Language Data Management
  • Grant number:
    RGPIN-2018-04683
  • Fiscal year:
    2022
  • Funding amount:
    $20,400
  • Program:
    Discovery Grants Program - Individual
Characterizing Bias and Care Disparities with Physical Restraint Use in the Emergency Setting Using Natural Language and Cognitive Data
  • Grant number:
    10633167
  • Fiscal year:
    2022
  • Funding amount:
    $20,400
  • Program: