Illuminating the Druggable Genome by Knowledge Graphs

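Basic Information

Grant number:
10348825
Principal investigator:
CHRISTOPHER J MUNGALL
Award amount:
about $536,600
Host institution:
JACKSON LABORATORY
Host institution country:
United States
Project category:
Fiscal year:
2019
Funding country:
United States
Project period:
2019-03-01 to 2022-02-28
Project status:
Completed

Source:
https://reporter.nih.gov/project-details/10348825
Keywords: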
Address Algorithms Aloral Amino Acids Animal Model Antineoplastic Agents Area Binding Binding Sites Bioinformatics Biological Biological Models Cancer Model Catalogs Categories Clinical Code Computer Analysis Computer software Data Data Sources Disease Documentation Drug Design Drug Targeting Emerging Technologies Enzymes FDA approved Future Gene Targeting Genes Genome Genomics Goals Graph Human Human Genome Information Networks Information Resources Management Investigation Knowledge Libraries Link Machine Learning Medical Medicine Molecular Biology Ontology Outcome Outcomes Research Pathology Pattern Pharmaceutical Preparations Phenotype Phosphotransferases Pilot Projects Process Protein Kinase Proteins Public Health Pythons Research Resources Scientist Semantics Signal Transduction System The Jackson Laboratory Training Validation anti-cancer base cheminformatics computer science computer studies computing resources dark matter deep learning design disease phenotype drug discovery drug mechanism drug repurposing gene function gene therapy genome resource high risk human disease improved inorganic phosphate knowledge base knowledge graph knowledge integration learning algorithm machine learning algorithm machine learning method mouse model new therapeutic target novel novel drug class open source patient derived xenograft model protein kinase inhibitor protein kinase modulator real world application small molecule tool validation studies

Project Summary

About 1500 of the ~20,000 protein-coding genes of the human genome can bind drug-like molecules, and yet only about 600 are currently targeted by FDA-approved drugs. Therefore, at least 930 proteins are potential drug targets that are not yet being utilized for human medicine and, given our incomplete state of knowledge about the human genome, the actual number could be much higher. There is therefore a substantial unmet need to improve our understanding of this so-called genomic dark matter in order to develop novel classes of drugs to improve treatment of disease. Comprehensive experimental investigation of these proteins in the context of hundreds of thousands of compounds and thousands of diseases would be prohibitively expensive, but computational approaches could significantly refine the list.

In this project we will apply two sophisticated computational approaches to the task of predicting the most promising novel drug targets. We will integrate the knowledge base DrugCentral and other resources with the disease and phenotype knowledge base of the Monarch Initiative into a semantically harmonized knowledge graph (KG). This will result in a KG with comprehensive coverage of diseases, genes, gene functions, phenotypic abnormalities, drugs, drug mechanisms, and drug targets. Machine learning (ML) identifies patterns from training sets and applies the patterns to predict entities and relations in new data. ML using KGs has become a hot new research area in computer science, but remains difficult to use for real-world applications, owing to the lack of adequate software packages. We will therefore implement state-of-the-art learning algorithms based on deep learning on KGs by extending and adapting selected algorithms to the task of drug and drug target discovery. We will develop an easy-to-use software library and demonstrate its use by means of notebooks that will be designed to serve as starting points for future computational research by other scientists, since they will contain the analysis workflow along with documentation about each step.

The human genome encodes more than 500 protein kinases, which are enzymes that add a phosphate group to specific amino acid residues and thereby transmit a biological signal. There are currently 35 FDA-approved protein kinase modulators acting on 38 protein kinases, which are thus one of the most important groups of druggable proteins encoded by our genome. We will perform a detailed computational study of this group and experimentally validate our top novel candidate using a patient-derived xenograft model system.
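The idea of predicting new relations from a knowledge graph can be illustrated with a very small sketch. The example below is not the project's software and does not use deep learning: it stores a toy graph as (subject, predicate, object) triples and ranks candidate targets for a drug by how many graph neighbors they share with it, a simple common-neighbors heuristic. All entity names, predicates, and function names are hypothetical placeholders, not real biomedical data.

```python
from collections import defaultdict

# Hypothetical toy triples (subject, predicate, object); placeholders only,
# not drawn from DrugCentral, Monarch, or any real resource.
TRIPLES = [
    ("drug_A", "targets", "kinase_1"),
    ("drug_A", "treats", "disease_X"),
    ("kinase_1", "associated_with", "disease_X"),
    ("kinase_2", "associated_with", "disease_X"),
    ("kinase_2", "associated_with", "disease_Y"),
    ("kinase_3", "associated_with", "disease_Y"),
    ("drug_B", "treats", "disease_Y"),
]


def build_neighbors(triples):
    """Map each entity to the set of entities it shares an edge with."""
    neighbors = defaultdict(set)
    for subj, _pred, obj in triples:
        neighbors[subj].add(obj)
        neighbors[obj].add(subj)
    return neighbors


def rank_candidate_targets(drug, candidates, triples):
    """Rank candidates by the number of graph neighbors shared with the drug.

    Genes the drug is already known to target are skipped, so the result
    lists only potential new "targets" edges.
    """
    neighbors = build_neighbors(triples)
    known = {o for s, p, o in triples if s == drug and p == "targets"}
    scores = {
        gene: len(neighbors[drug] & neighbors[gene])
        for gene in candidates
        if gene not in known
    }
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


ranking = rank_candidate_targets(
    "drug_A", ["kinase_1", "kinase_2", "kinase_3"], TRIPLES
)
print(ranking)
```

In this toy graph, `kinase_2` ranks above `kinase_3` because it shares the neighbor `disease_X` with `drug_A`. A learned model on a real knowledge graph would replace the hand-written score with one fitted to training edges.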
