Pathway Hypotheses Knowledge-base: A Knowledge Source for the Biomedical Data Translator

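Basic information

Grant number:
10333496
Principal investigator:
Eugene Santos
Amount:
approx. $666,100
Host institution:
DARTMOUTH COLLEGE
Host institution country:
United States
Project category:
Fiscal year:
2020
Funding country:
United States
Project period:
2020-01-24 to 2022-01-23
Project status:
Completed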

Project abstract

We propose to develop a Pathway Hypotheses Knowledgebase (PHK), a new knowledge source that will analyze and hypothesize novel relationships and interactions, driven by researcher data together with the wealth of knowledge captured by the Biomedical Data Translator project. Bringing together different biomedical knowledge sources, including experimental data, to discover as-yet-unknown relationships has been difficult to realize. The fundamental gap lies in the lack of a rigorous and robust framework for linking or creating the sophisticated lattices of relationships needed to connect different lines of evidence from heterogeneous knowledge sources. Such a framework must provide systematic, mathematically driven algorithms for hypothesis exploration, construction, and assessment in order to bridge gaps and ultimately derive and discover new knowledge beyond existing sources, by assessing existing molecular relationships and generating additional actionable pathway information that can be queried directly or via an API. A successful framework must include formal, well-defined mechanisms for sensitivity analyses (measures of the fragility and reliability of constructed hypotheses), impact analyses (measures of importance and knowledge novelty), and parsimony analyses (holistic measures of the congruence of hypotheses). Lastly, the functional mechanism behind newly derived knowledge must be invertible and provide an unambiguous, precise, auditable provenance from the original knowledge sources, serving as the basis for explainability. To realize this vision, PHK will employ a mature AI knowledge representation called Bayesian Knowledge Bases (BKBs). BKBs model knowledge, relationships, and uncertainty within a rigorous graph-based probabilistic framework capable of managing inconsistent, incomplete, and cyclic knowledge.
The representation fully subsumes a variety of well-known models, including (dynamic) Bayesian networks and temporal representations such as hidden Markov models. BKBs can be learned from data, but their most critical contribution is the ability to fuse multiple BKBs and their underlying distributions without any loss of information. This inherently provides end-to-end, forward-to-backward auditability of computational derivations, which admits ready sensitivity, contribution, and impact analysis. It further leads to a formal mechanism to explore, discover, and create new hypotheses that link multiple heterogeneous knowledge sources. PHK will provide a rich, AI-ready encoding of data, information, and knowledge from any number of new and existing sources, including curated databases (e.g., the NIH Cancer Genome Atlas (TCGA)) and raw experimental data. PHK’s encoding enables additional knowledge augmentation through probabilistic and statistical inferencing capabilities. This augmentation is further enhanced through knowledge unification and fusion algorithms that closely couple disparate knowledge in a well-defined, rigorous manner. Altogether, PHK can systematically discover and develop novel hypotheses. BKBs have been applied and deployed across a number of projects for over two decades, with a mature software base for ready integration into the Translator framework’s target prototypes. Our multi-institutional team (Dartmouth College and the Tufts University Clinical and Translational Sciences Institute (CTSI)) comprises senior researchers and software engineers in the computer and data sciences, cheminformatics, bioinformatics, molecular biology, and biochemistry. Dr. Eugene Santos Jr. is Professor of Engineering at Dartmouth. He will serve as the PI and will also lead the technical development of the core BKB component for PHK. Joseph Gormley is the Director for Advanced Systems Development at Tufts/CTSI. He will serve as Project Manager for all software deliverables under this proposal.
The PHK capabilities proposed herein will also build on strategies developed by our team for scalable intelligent information retrieval, where greater transparency when reasoning over experimental data is a primary aim. PHK will provide a powerful new computational representation of pathway structures and molecular components in support of both human- and machine-driven interpretation and of pathway-based biomarker discovery and drug development. PHK will enable more efficient joint human-machine exploration with explanation.
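
The abstract describes fusing knowledge from several sources while keeping an auditable provenance for every derived statement. The following is only a toy sketch of that general idea, not the BKB formalism or the project's software: the `Rule` class, the `fuse` and `provenance` helpers, and all variable names and probabilities are hypothetical and invented for illustration. It stores if-then rules with a probability and a source tag, combines rule sets without discarding conflicting entries, and then looks up which sources support a given conclusion.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Tuple

# A (variable, value) pair, e.g. ("gene_X", "mutated"). Hypothetical naming.
Assignment = Tuple[str, str]


@dataclass(frozen=True)
class Rule:
    """If every condition holds, `head` holds with probability `prob`."""
    conditions: FrozenSet[Assignment]
    head: Assignment
    prob: float
    source: str  # provenance tag: which knowledge source supplied the rule


def fuse(*fragments: List[Rule]) -> List[Rule]:
    """Combine rule sets without dropping or merging any rule.

    Rules that disagree are all kept; the source tag tells them apart.
    """
    fused: List[Rule] = []
    for fragment in fragments:
        fused.extend(fragment)
    return fused


def provenance(rules: List[Rule], head: Assignment) -> Dict[str, List[Rule]]:
    """Group the rules that conclude `head` by the source that supplied them."""
    by_source: Dict[str, List[Rule]] = {}
    for rule in rules:
        if rule.head == head:
            by_source.setdefault(rule.source, []).append(rule)
    return by_source


# Made-up example data: two sources disagree on the same relationship.
curated = [
    Rule(frozenset({("gene_X", "mutated")}), ("pathway_P", "active"), 0.8, "curated_db"),
]
experimental = [
    Rule(frozenset({("gene_X", "mutated")}), ("pathway_P", "active"), 0.6, "lab_data"),
    Rule(frozenset(), ("gene_X", "mutated"), 0.1, "lab_data"),
]

fused = fuse(curated, experimental)
support = provenance(fused, ("pathway_P", "active"))
for source, rules in sorted(support.items()):
    print(source, [r.prob for r in rules])
```

Keeping both conflicting rules, each labeled with its origin, is what lets a later analysis trace a conclusion back to the sources behind it. A real probabilistic treatment would also need a principled way to weigh the sources against each other, which this sketch deliberately leaves out.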
