RAPID: Automated discovery of COVID-19 related hypotheses using publicly available scientific literature


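Basic information

  • Award number:
    2027864
  • Principal investigator:
  • Amount:
    $104,500
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Standard Grant
  • Fiscal year:
    2020
  • Funding country:
    United States
  • Start and end dates:
    2020-05-01 to 2021-04-30
  • Project status:
    Completed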

Project abstract

The vast amount of biomedical information accumulating in modern databases (such as MEDLINE of the National Library of Medicine) makes it very difficult for researchers to survey the literature broadly and efficiently when they try to evaluate new information against existing biomedical work, even when advanced search engines are used. Automated hypothesis generation systems are designed to help scientists overcome these difficulties and accelerate their research. The COVID-19 pandemic is precisely one of the cases in which such systems can play an extremely important role in coping with the coronavirus. Using two different AI approaches, we have developed two systems to discover plausible hypotheses in the biomedical domain. In this project, we will deploy the COVID-19 customized hypothesis generation and knowledge discovery system, run it at scale on any queries relevant to this research, and publish the results (including trained AI models and discovered information) in the open domain for the broad scientific community, with the goal of accelerating COVID-19 research. This work focuses heavily on addressing fundamental knowledge discovery questions by modeling and formulating scientific hypotheses using publicly available information in the biomedical domain. In general, however, these methods are not restricted to any specific information domain, i.e., they can be broadly used to discover knowledge in texts. Although our experimental work will be related to COVID-19, the methods can be applied, with some reservations, to any literature-based analysis. For example, one of the goals of the Materials Science Initiative is to establish a systematic understanding of material properties and to discover new materials, which can be done by analyzing the massive corpus of papers. In the legal world, identifying related patents can be done using a similar hypothesis modeling methodology.

At the heart of the proposed approach lies a large multi-modal and multi-relational semantic knowledge network of all biomedical objects extracted from a variety of heterogeneous databases of the National Library of Medicine. These objects include, but are not limited to, scientific papers, abstracts, keywords, phrases, thesaurus elements, genes, proteins, mutations, pathways, diseases, and diagnoses. We will leverage two systems, MOLIERE and AGATHA, which are based on structural and deep learning, respectively. We will customize them using the rapidly updated dataset of new papers that have not yet been processed by the National Library of Medicine but already exist in the open domain, for example in various preprint archives and reports. The MOLIERE system applies network analysis techniques to a graph constructed from low-dimensional embeddings of the papers, and interprets the results with methods based on probabilistic topic modeling. The AGATHA system processes texts at a much finer granularity and creates a semantic knowledge network using more accurate embedding techniques, followed by deep learning training for knowledge discovery. The two systems complement each other: AGATHA is of higher quality, while MOLIERE is more interpretable. A combination of both will be leveraged in this research.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
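
The graph-based idea in the abstract (embed objects in a low-dimensional space, connect nearby objects, then analyze the resulting network) can be illustrated with a minimal sketch. This is not the MOLIERE or AGATHA implementation: the term list, the 2-D coordinates, the helper names `knn_graph` and `shortest_path`, and the choice of a k-nearest-neighbor graph with Dijkstra's algorithm are all assumptions made for illustration. The intermediate terms on a short path between two query terms stand in for candidate bridging concepts.

```python
import heapq
import math

# Hypothetical 2-D embeddings for a handful of terms. A real system would
# learn high-dimensional vectors from the literature; these are made up.
EMBEDDINGS = {
    "virus": (0.0, 0.0),
    "spike protein": (1.0, 0.2),
    "receptor": (2.0, 0.1),
    "inhibitor": (3.0, 0.3),
    "drug": (4.0, 0.0),
    "weather": (0.0, 5.0),
}


def knn_graph(embeddings, k=2):
    """Connect each term to its k nearest neighbours (undirected, weighted)."""
    graph = {term: {} for term in embeddings}
    for term, vec in embeddings.items():
        others = sorted(
            (math.dist(vec, other_vec), other)
            for other, other_vec in embeddings.items()
            if other != term
        )
        for distance, other in others[:k]:
            graph[term][other] = distance
            graph[other][term] = distance
    return graph


def shortest_path(graph, source, target):
    """Dijkstra's algorithm; returns the terms on a cheapest path, or []."""
    best = {source: 0.0}
    previous = {}
    queue = [(0.0, source)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == target:
            break
        if cost > best.get(node, math.inf):
            continue  # stale queue entry
        for neighbour, weight in graph[node].items():
            new_cost = cost + weight
            if new_cost < best.get(neighbour, math.inf):
                best[neighbour] = new_cost
                previous[neighbour] = node
                heapq.heappush(queue, (new_cost, neighbour))
    if target not in best:
        return []
    path = [target]
    while path[-1] != source:
        path.append(previous[path[-1]])
    return path[::-1]


graph = knn_graph(EMBEDDINGS)
path = shortest_path(graph, "virus", "drug")
print(" -> ".join(path))
```

In this toy setting the unrelated term "weather" sits far from the others in the embedding space, so it does not appear on the path between "virus" and "drug". Interpreting such a path, for example with topic modeling as the abstract describes, is outside the scope of this sketch.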

Project outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other publications by Ilya Safro

Algebraic Distance on Graphs
FAIRLEARN: Configurable and Interpretable Algorithmic Fairness
  • DOI:
  • Publication date:
    2021
  • Journal:
  • Impact factor:
    0
  • Authors:
    Ankit Kulshrestha;Ilya Safro
  • Corresponding author:
    Ilya Safro
Multilevel Graph Partitioning for Three-Dimensional Discrete Fracture Network Flow Simulations
  • DOI:
    10.1007/s11004-021-09944-y
  • Publication date:
    2021-05-26
  • Journal:
  • Impact factor:
    3.600
  • Authors:
    Hayato Ushijima-Mwesigwa;Jeffrey D. Hyman;Aric Hagberg;Ilya Safro;Satish Karra;Carl W. Gable;Matthew R. Sweeney;Gowri Srinivasan
  • Corresponding author:
    Gowri Srinivasan
Randomized heuristics for exploiting Jacobian scarcity
A Measure of the Connection Strengths between Graph Vertices with Applications
  • DOI:
  • Publication date:
    2009
  • Journal:
  • Impact factor:
    0
  • Authors:
    Jie Chen;Ilya Safro
  • Corresponding author:
    Ilya Safro

Other grants by Ilya Safro

Collaborative Research: EAGER: QIA: Large Scale QAOA Quantum Simulator
  • Award number:
    2035606
  • Fiscal year:
    2020
  • Award amount:
    $104,500
  • Project type:
    Standard Grant
RAPID: Automated discovery of COVID-19 related hypotheses using publicly available scientific literature
  • Award number:
    2127776
  • Fiscal year:
    2020
  • Award amount:
    $104,500
  • Project type:
    Standard Grant
Collaborative Research: EAGER: QIA: Large Scale QAOA Quantum Simulator
  • Award number:
    2122793
  • Fiscal year:
    2020
  • Award amount:
    $104,500
  • Project type:
    Standard Grant
EAGER: SSDIM: Multiscale Methods for Generating Infrastructure Networks
  • Award number:
    1745300
  • Fiscal year:
    2017
  • Award amount:
    $104,500
  • Project type:
    Standard Grant
EAGER: Feedback-based Network Optimization for Smart Cities
  • Award number:
    1647361
  • Fiscal year:
    2016
  • Award amount:
    $104,500
  • Project type:
    Standard Grant
Fast and Scalable Multigrid Methods for Hypergraph Partitioning Problems
  • Award number:
    1522751
  • Fiscal year:
    2015
  • Award amount:
    $104,500
  • Project type:
    Standard Grant

Similar overseas grants

A Semi-Automated Antibody-Discovery Platform to Target Challenging Biomolecules
  • Award number:
    MR/Y003616/1
  • Fiscal year:
    2024
  • Award amount:
    $104,500
  • Project type:
    Fellowship
Automated reactor platforms for accelerated discovery of next generation polymers
  • Award number:
    2911012
  • Fiscal year:
    2024
  • Award amount:
    $104,500
  • Project type:
    Studentship
Automated Discovery and Screening of Stimuli-Responsive Porous Liquids
  • Award number:
    2896345
  • Fiscal year:
    2023
  • Award amount:
    $104,500
  • Project type:
    Studentship
Computational Infrastructure for Automated Force Field Development and Optimization
  • Award number:
    10699200
  • Fiscal year:
    2023
  • Award amount:
    $104,500
  • Project type:
SyncroPatch 384 Automated Patch Clamp Instrument
  • Award number:
    10721590
  • Fiscal year:
    2023
  • Award amount:
    $104,500
  • Project type:
DMREF/Collaborative Research: Accelerated Discovery of Sustainable Bioplastics: Automated, Tunable, Integrated Design, Processing and Modeling
  • Award number:
    2323976
  • Fiscal year:
    2023
  • Award amount:
    $104,500
  • Project type:
    Standard Grant
Collaborative Research: SaTC: CORE: Medium: Audacity of Exploration: Toward Automated Discovery of Security Flaws in Networked Systems through Intelligent Documentation Analysis
  • Award number:
    2409269
  • Fiscal year:
    2023
  • Award amount:
    $104,500
  • Project type:
    Standard Grant
DMREF/Collaborative Research: Accelerated Discovery of Sustainable Bioplastics: Automated, Tunable, Integrated Design, Processing and Modeling
  • Award number:
    2323977
  • Fiscal year:
    2023
  • Award amount:
    $104,500
  • Project type:
    Standard Grant
Automated Model Discovery for Soft Matter
  • Award number:
    2320933
  • Fiscal year:
    2023
  • Award amount:
    $104,500
  • Project type:
    Continuing Grant
An automated high-throughput robotic platform for accelerated battery and fuels discovery - DIGIBAT
  • Award number:
    EP/W036517/1
  • Fiscal year:
    2023
  • Award amount:
    $104,500
  • Project type:
    Research Grant