CRII: III: A Scalable Framework for Debugging Large Biological Ontologies
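Basic information
- Award number: 1657306
- Principal investigator:
- Amount: $151,000
- Host institution:
- Host institution country: United States
- Award type: Standard Grant
- Fiscal year: 2017
- Funding country: United States
- Project period: 2017-03-01 to 2019-02-28
- Project status: Completed
- Source:
- Keywords:
Project abstract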
To capitalize on the transformative opportunities offered by the increasingly large amounts of digital data produced by the biological research community, we need to systematically adopt data and metadata standards, such as the Gene Ontology (GO). Because of GO's fundamental role in codifying, managing, and sharing biological knowledge, quality issues, if not addressed, can cause misleading results or missed biological discoveries. Enhancing the quality of ontological systems such as GO, though a challenging and arduous task, can directly impact the very foundation of data-intensive research discovery. Most existing quality assurance approaches for GO have focused on the enrichment of concepts in order to keep pace with rapidly evolving biological knowledge. However, critical structural information represented by relations has been largely ignored in existing quality assurance approaches, making them inadequate for their intended roles. Principled, scalable, and automated approaches that can debug GO to generate programmable (rather than manual) suggestions, if successful, can be a game changer in developing a new generation of methods for enhancing the quality of GO. The PI proposes a Subsumption-based Sub-term Inference Framework, SSIF, for auditing GO by leveraging both its underlying graph structure and a novel term-algebra. SSIF combines the biological knowledge embedded in the terms, sub-terms, and relationships captured in GO to automatically detect semantic inconsistencies and generate change suggestions for future versions of GO.

In order to enhance the quality of the Gene Ontology and other biomedical ontologies, the PI proposes development of a Subsumption-based Sub-term Inference Framework, SSIF. The SSIF includes three main components: (1) a sequence-based representation of GO concept terms by using part-of-speech parsing and sub-concept matching; (2) the formulation of algebraic operations for the development of a term-algebra combining this sequence-based representation with antonyms and subsumption-based longest subsequence alignment; and (3) the construction of a set of conditional rules for backward subsumption inference aimed at uncovering semantic inconsistencies in GO and other ontological structures. SSIF will be implemented using scalable computational algorithms and applied to the GO distributions provided by the Gene Ontology Consortium. Two algorithmic strategies will be explored to perform large-scale, backward subsumption inference on GO using the conditional rules: (1) exhaustively, over all concept pairs, and (2) over the subspace of concept pairs within a special type of induced substructures called non-lattice subgraphs. If an existing relation in GO is inconsistent with the consequence of the conditional rules, it represents a likely candidate error. The semantic inconsistencies uncovered by a collection of conditional rules have the potential to automatically reveal local "bugs" as well as potential systemic patterns for review and revision, to enhance the quality of GO and other biomedical ontologies.
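As a rough illustration of the exhaustive, all-concept-pairs strategy mentioned in the abstract, the sketch below applies one very naive lexical rule: if the word sequence of one label contains another label's words as a subsequence, the longer term is assumed to be a subtype of the shorter one, and the pair is reported when the hierarchy does not already contain that relation. This is not the SSIF implementation. The rule, the function names, the `T:` identifiers, and the toy labels are all assumptions made for this example, and the abstract's part-of-speech parsing, antonym handling, and conditional rules are left out entirely.

```python
from itertools import permutations


def tokens(label: str) -> list[str]:
    """Split a concept label into a lowercase word sequence."""
    return label.lower().split()


def is_subsequence(short: list[str], long: list[str]) -> bool:
    """True if every word of `short` appears in `long`, in the same order."""
    remaining = iter(long)
    return all(word in remaining for word in short)


def ancestors(concept: str, is_a: dict[str, list[str]]) -> set[str]:
    """Collect all ancestors reachable through the is_a edges."""
    seen: set[str] = set()
    stack = [concept]
    while stack:
        for parent in is_a.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def suggest_missing_is_a(labels: dict[str, str],
                         is_a: dict[str, list[str]]) -> list[tuple[str, str]]:
    """Check every ordered concept pair against the naive lexical rule.

    Returns (child, parent) pairs where the rule infers a subtype relation
    that the existing hierarchy does not contain.
    """
    suggestions = []
    for child, parent in permutations(labels, 2):
        child_words, parent_words = tokens(labels[child]), tokens(labels[parent])
        if (len(parent_words) < len(child_words)
                and is_subsequence(parent_words, child_words)
                and parent not in ancestors(child, is_a)):
            suggestions.append((child, parent))
    return sorted(suggestions)


# Toy data with hypothetical identifiers (not real GO accessions).
LABELS = {
    "T:1": "regulation of cell growth",
    "T:2": "negative regulation of cell growth",
    "T:3": "positive regulation of cell growth",
}
IS_A = {"T:2": ["T:1"]}  # the T:3 -> T:1 relation is deliberately missing

print(suggest_missing_is_a(LABELS, IS_A))
```

On this toy input the only reported pair is `("T:3", "T:1")`. A rule this simple over-generates on real labels (it would, for instance, treat "regulation of X" as a subtype of "X"), so it should be read only as a picture of the pairwise search, not of the inference rules themselves.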
Project outcomes
Journal articles (10)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Quality Assurance of NCI Thesaurus by Mining Structural-Lexical Patterns
- DOI:
- Published: 2017
- Journal:
- Impact factor: 0
- Authors: Abeysinghe, Rashmie; Brooks, Michael A; Talbert, Jeffery; Cui, Licong
- Corresponding author: Cui, Licong
Exploring Deep Learning-based Approaches for Predicting Concept Names in SNOMED CT
- DOI: 10.1109/bibm.2018.8621076
- Published: 2018-01-01
- Journal:
- Impact factor: 0
- Authors: Zheng, Fengbo; Cui, Licong
- Corresponding author: Cui, Licong
Auditing Subtype Inconsistencies among Gene Ontology Concepts
- DOI: 10.1109/bibm.2017.8217835
- Published: 2017-01-01
- Journal:
- Impact factor: 0
- Authors: Abeysinghe, Rashmie; Hinderer, Eugene W., III; Cui, Licong
- Corresponding author: Cui, Licong
Identifying Similar Non-Lattice Subgraphs in Gene Ontology based on Structural Isomorphism and Semantic Similarity of Concept Labels
- DOI:
- Published: 2018
- Journal:
- Impact factor: 0
- Authors: Abeysinghe, Rashmie; Qu, Xufeng; Cui, Licong
- Corresponding author: Cui, Licong
Query-constraint-based Association Rule Mining from Diverse Clinical Datasets in the National Sleep Research Resource
- DOI: 10.1109/bibm.2017.8217834
- Published: 2017-01-01
- Journal:
- Impact factor: 0
- Authors: Abeysinghe, Rashmie; Cui, Licong
- Corresponding author: Cui, Licong
Other publications by Licong Cui
Identifying Sleep-Related Factors Associated with Cognitive Function in a Hispanics/Latinos Cohort: A Dual Random Forest Approach
- DOI:
- Published: 2021
- Journal:
- Impact factor: 0
- Authors: Xiaojin Li; Licong Cui; Fei Wang; P. Schulz; Guo
- Corresponding author: Guo
Ontology-guided Health Information Extraction, Organization, and Exploration
- DOI:
- Published: 2014
- Journal:
- Impact factor: 0
- Authors: Licong Cui
- Corresponding author: Licong Cui
Design and Implementation of a Comprehensive Web-based Survey for Ovarian Cancer Survivorship with an Analysis of Prediagnosis Symptoms via Text Mining
- DOI:
- Published: 2014
- Journal:
- Impact factor: 2
- Authors: Jiayang Sun; K. Bogie; Joseph Teagno; Yu; Rebecca R. Carter; Licong Cui; Guoqiang Zhang
- Corresponding author: Guoqiang Zhang
Multi-topic assignment for exploratory navigation of consumer health information in NetWellness using formal concept analysis
- DOI:
- Published: 2014
- Journal:
- Impact factor: 3.5
- Authors: Licong Cui; Rong Xu; Zhihui Luo; S. Wentz; Kyle Scarberry; Guo
- Corresponding author: Guo
A Data Capture Framework for Large-scale Interventional Studies with Survey Workflow Management
- DOI:
- Published: 2017
- Journal:
- Impact factor: 0
- Authors: Shiqiang Tao; Ningzhou Zeng; Xi Wu; Wei Zhu; Xiaojin Li; Licong Cui; Guoqiang Zhang
- Corresponding author: Guoqiang Zhang
Other grants awarded to Licong Cui
CAREER: Advancing the Role of Ontologies for Data Science in Biomedicine
- Award number: 2047001
- Fiscal year: 2021
- Award amount: $151,000
- Award type: Continuing Grant
III: Small: Methods for Auditing and Enhancing Completeness of Ontologies
- Award number: 1931134
- Fiscal year: 2019
- Award amount: $151,000
- Award type: Standard Grant
III: Small: Methods for Auditing and Enhancing Completeness of Ontologies
- Award number: 1816805
- Fiscal year: 2018
- Award amount: $151,000
- Award type: Standard Grant
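Similar grants from the National Natural Science Foundation of China
Process regulation of iron-cycling-driven As(III) oxidation in constructed wetlands and the mechanism of enhanced arsenic removal
- Award number: 52370204
- Approval year: 2023
- Award amount: 510,000 CNY
- Award type: General Program
Structural biology of the type III-E CRISPR-Cas system and its applications
- Award number: 32371276
- Approval year: 2023
- Award amount: 500,000 CNY
- Award type: General Program
Imidazole propionate regulation of glucose and lipid metabolic reprogramming in type III innate lymphoid cells via the mTORC1 pathway during hepatitis B liver fibrosis, and its mechanism
- Award number: 82370622
- Approval year: 2023
- Award amount: 490,000 CNY
- Award type: General Program
Mechanisms by which biochar surface structural properties affect Fe(II)-oxidation-induced As(III) oxidation and pollutant retention
- Award number: 42307492
- Approval year: 2023
- Award amount: 300,000 CNY
- Award type: Young Scientists Fund
Simulation experiments on iron isotope fractionation and its mechanisms during siderophore-Fe(III) interactions
- Award number: 42377264
- Approval year: 2023
- Award amount: 490,000 CNY
- Award type: General Program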
Similar overseas grants
CRII: III: Multiresolution Tensor Learning for Scalable and Interpretable Spatiotemporal Analysis
- Award number: 2037745
- Fiscal year: 2020
- Award amount: $151,000
- Award type: Standard Grant
CRII: III: Scalable Noise-filtering and Community Queries on User-generated Data
- Award number: 1849971
- Fiscal year: 2019
- Award amount: $151,000
- Award type: Standard Grant
CRII: III: Multiresolution Tensor Learning for Scalable and Interpretable Spatiotemporal Analysis
- Award number: 1850349
- Fiscal year: 2019
- Award amount: $151,000
- Award type: Standard Grant
CRII: III: A Scalable Probabilistic Model Selection Method for Deep Learning in Gene-Protein Network Inference and Integration
- Award number: 1850492
- Fiscal year: 2019
- Award amount: $151,000
- Award type: Standard Grant
CRII: III: Scalable and Interactive Dependency Visualization to Accelerate Parallel Program Analysis
- Award number: 1656958
- Fiscal year: 2017
- Award amount: $151,000
- Award type: Standard Grant