A Complex Disease Genetics Knowledge Provider for Biomedical Data Translator

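Basic Information

Award number:
10705402
Principal investigator:
Jason Flannick
Amount:
approx. $484,000
Host institution:
BROAD INSTITUTE, INC.
Host institution country:
United States
Project type:
Fiscal year:
2020
Funding country:
United States
Project period:
2020-01-23 to 2023-11-30
Project status:
Completed

Source:
https://reporter.nih.gov/project-details/10705402
Keywords: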
Address Aloral Architecture Bioinformatics Biological Catalogs Cell physiology Chromatin ClinVar Collaborations Collection Complex Computational Biology Computational algorithm Data Data Set Data Sources Disease Disease susceptibility Feasibility Studies Foundations Genes Genetic Genetic Diseases Genomic Segment Goals Gold Human Human Genetics Interview Knowledge Knowledge Portal Link Mediating Methodology Methods Modeling Molecular Online Mendelian Inheritance In Man Pathogenesis Pathway interactions Positioning Attribute Protocols documentation Provider Rare Diseases Resources Risk Running Science Service provision Source Specific qualifier value Susceptibility Gene System Techniques Testing Tissues Translating Translations United States National Institutes of Health Untranslated RNA Vision causal variant complex data computational pipelines computer science data analysis pipeline data harmonization data integration data translator disease classification disorder risk epigenomic profiling functional genomics genetic association genetic variant genomic data graph database human disease improved insight open source programs tool web portal

Project Abstract

A major goal of the Biomedical Data Translator Program is to facilitate disease classification based on molecular and cellular abnormalities. While many experimental approaches exist to interrogate molecular or cellular processes, few can discern which among a host of potential abnormalities are relevant to disease in the human system. Genetic variants associated with disease are unique in providing molecular alterations causally related to human disease risk. There are two types of genetic associations. Rare disease associations can (usually) be clearly linked to a gene and are well represented by catalogs such as ClinVar, OMIM, and Monarch. Complex disease associations are harder to interpret because they (a) are statistical rather than qualitative and (b) usually lie in noncoding genomic regions that cannot be immediately translated to molecular or cellular abnormalities. Many complementary resources to help in the biological translation of complex disease associations have recently emerged, broadly classifiable as either “functional genomic” datasets (e.g. from epigenomic profiling or chromatin capture) or predictive bioinformatic methods (e.g. that integrate various genetic and functional genomic datasets to predict disease-susceptibility genes or pathways). These resources require expertise to curate and interpret, and there is as yet no knowledge source that integrates them to interpret complex disease associations. Furthermore, techniques for harmonizing heterogeneous functional genomic datasets with respect to one another are not yet established, most predictive bioinformatic methods specify complex data-processing pipelines that have not yet been scaled to run across many diseases, and there are few if any “gold standards” to evaluate the molecular or cellular abnormalities identified by these resources. The goal of our proposed project is to address these gaps within a complex disease genetics Knowledge Provider for Translator. 
We are experts in complex disease genetics and maintain the Knowledge Portal Network (KPN), a collection of open source web portals and Smart APIs that make integrated genetic and genomic datasets publicly accessible for >180 complex diseases. We have built the KPN by developing a protocol for working with disease experts to aggregate and curate high-confidence genetic datasets, building computational pipelines to harmonize these data and apply predictive bioinformatic methods upon them, and extracting relationships mined from these data into a Neo4J graph database. We propose to use the KPN as a foundation to implement a Translator Knowledge Provider of high-confidence complex disease associations and predicted disease-relevant molecular and cellular abnormalities. We will implement this Knowledge Provider by (a) expanding the data sources, data types, and bioinformatic methods integrated within the KPN; (b) developing new computational algorithms to improve the ability of genetic data to identify molecular and cellular abnormalities underlying complex disease; (c) maintaining REST services provisioning Translator with these resources; and (d) developing methodologies for evaluating the accuracy and internal consistency of these data, further curating them, and defining use cases of them within Translator. In so doing, we will enable Translator users to address questions such as:
• What genes are causally linked to complex disease [X], and with what confidence?
• What is the increase in risk for complex disease [X] when gene [Y] is perturbed?
• What pathways are enriched for associations with complex disease [X]?
• What tissues mediate the pathogenesis of complex disease [X]?
• What other diseases are genetically correlated with complex disease [X]?
We participated in the Translator feasibility study and contributed important insights to the project vision including (a) a unifying architectural model of Translator (based on interviews with each Translator team) closely followed by OTA-19-009; (b) the concept of Translator as a tool to augment (rather than replace) human reasoning; and (c) the idea of a “Turing test” to evaluate Translator capabilities. Our expertise in human genetics and hypothesis-driven science, but also computer science and computational biology, ideally positions us to collaborate with NIH staff and other awardees to help guide Translator data integration in a scientifically rigorous manner.
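
As a rough illustration of the first question in the abstract ("What genes are causally linked to complex disease [X], and with what confidence?"), the sketch below filters and ranks gene-disease association records held in memory. It is not the KPN's schema, REST interface, or scoring method: the `Association` record, the `genes_for_disease` helper, the `confidence` field, and every gene and disease name are hypothetical placeholders, and the scores are invented for demonstration only.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Association:
    """Hypothetical gene-disease link; not a real KPN record type."""
    gene: str
    disease: str
    confidence: float  # assumed score in [0, 1], higher = stronger evidence


# Placeholder records for illustration only; these are not real findings.
ASSOCIATIONS: List[Association] = [
    Association("GENE_A", "DISEASE_X", 0.92),
    Association("GENE_B", "DISEASE_X", 0.40),
    Association("GENE_A", "DISEASE_Y", 0.75),
    Association("GENE_C", "DISEASE_Y", 0.10),
]


def genes_for_disease(
    records: List[Association],
    disease: str,
    min_confidence: float = 0.0,
) -> List[Association]:
    """Return associations for one disease, strongest first."""
    hits = [
        r for r in records
        if r.disease == disease and r.confidence >= min_confidence
    ]
    return sorted(hits, key=lambda r: r.confidence, reverse=True)


if __name__ == "__main__":
    for hit in genes_for_disease(ASSOCIATIONS, "DISEASE_X", min_confidence=0.5):
        print(hit.gene, hit.confidence)
```

A production Knowledge Provider would presumably answer such a question through a graph query and a REST endpoint rather than a list scan; the list is used here only to keep the example self-contained.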
