Multi-modal data integration to identify kinase substrates
Basic information
- Grant number: 10451941
- Principal investigator:
- Amount: $498,500
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2022
- Funding country: United States
- Project period: 2022-07-05 to 2024-06-30
- Project status: Completed
- Source:
- Keywords: Address; Binding; Biochemical; Bioinformatics; Biology; Cell Cycle; Cell Cycle Regulation; Cell physiology; Central Nervous System Diseases; Computer software; Computing Methodologies; Consensus; Data; Data Sources; Development; Disease; Drug Targeting; FDA approved; G-Protein-Coupled Receptors; Gene Expression; Genetic Transcription; Genome; Goals; Human; Human Biology; Individual; Internet; Ion Channel; Knowledge; Lighting; Malignant Neoplasms; Metabolic Diseases; Methodology; Methods; Modality; Modeling; Names; Neuraxis; Pharmacology; Phosphorylation; Phosphotransferases; Physiological; Protein Dynamics; Protein Family; Protein Kinase; Protein Kinase Interaction; Proteins; Proteome; Signal Transduction; Substrate Interaction; Testing; Training; Work; base; computer framework; data integration; data repository; data sharing; diverse data; drug development; drug discovery; experience; flexibility; improved; machine learning method; machine learning prediction; member; multidisciplinary; multimodal data; novel; predictive modeling; protein structure; small molecule; software repository; web server
Project abstract
PROJECT SUMMARY
Kinases are involved in a variety of physiological functions, such as signal transduction, transcription,
development, and cell cycle regulation. Thus, dysregulation of protein kinases is associated with a range of
diseases, including cancer, metabolic diseases, and central nervous system disorders. More than 60 drugs
targeting kinases have been approved by the FDA, making them one of the most druggable protein families.
Despite their biomedical importance, a large group of human protein kinases remains highly understudied. For these
proteins, often referred to as “dark kinases”, including by the Illuminating the Druggable Genome (IDG) program,
little is known about their substrate(s), which ultimately determine their cellular function. To address this
challenge, we will develop a novel computational framework to predict kinase-substrate interactions by
combining biologically relevant multi-modal data sources with cutting-edge machine learning methodologies.
Specifically, we will first derive features that quantify potential interactions between kinases and substrates from
diverse data sources, such as protein structure and dynamics, gene expression profiles, protein-protein and
protein-small molecule interaction networks, and evolutionary information (Aim 1). We will then develop
predictors of kinase-substrate interactions using a powerful machine learning methodology named Ensemble
Integration (EI; Aim 2). EI is based on the concept of heterogeneous ensembles that can aggregate an
unrestricted number and variety of base predictors derived from the above diverse data sources, and can benefit
from both the consensus and the diversity among these predictors. Owing to this flexibility, EI can produce
more accurate predictions from multi-modal datasets than other established data integration methodologies,
and we expect the same in this project. Finally, we will evaluate the kinase-substrate interactions predicted by the
EI-based predictive model developed in Aim 2 using both computational and experimental methods (Aim 3). We
will also share the experimentally validated interactions, the most confident predictions from the EI model, and
all the data and software generated during this project through our KinaMetrix web server, as well as other public
data and software repositories. At its culmination, this project will produce novel and validated computational
methods and software to predict substrates of kinases, validated and high-confidence kinase-substrate
interactions for IDG dark kinases, and a public web server (KinaMetrix) to share these products. We expect that
these products will be highly useful for the study of dark kinases, especially in the IDG effort, and will help to better
understand kinase function and improve the use of kinases in drug development efforts. Our approach is also
expected to be generally applicable to other druggable protein families, such as ion channels and GPCRs.
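The heterogeneous-ensemble idea described in the abstract can be illustrated with a small sketch. This is not the project's EI implementation: the function names (centroid_scorer, knn_scorer, fit_ensemble, predict), the two toy modalities, their feature values, and the choice of an unweighted mean as the aggregator are all assumptions made for illustration. Each kinase-substrate pair is described by one feature vector per data modality, two different kinds of base predictor are trained on every modality, and their scores are averaged.

```python
import math
from statistics import mean


def centroid_scorer(X, y):
    """Base predictor 1 (hypothetical): relative distance to the class centroids."""
    def centroid(label):
        rows = [x for x, lab in zip(X, y) if lab == label]
        return [mean(col) for col in zip(*rows)]

    pos, neg = centroid(1), centroid(0)

    def score(x):
        d_pos, d_neg = math.dist(x, pos), math.dist(x, neg)
        # Closer to the positive centroid gives a score nearer to 1.
        return 0.5 if d_pos + d_neg == 0 else d_neg / (d_pos + d_neg)

    return score


def knn_scorer(X, y, k=3):
    """Base predictor 2 (hypothetical): fraction of positives among k nearest pairs."""
    def score(x):
        nearest = sorted(zip(X, y), key=lambda pair: math.dist(x, pair[0]))[:k]
        return sum(lab for _, lab in nearest) / k

    return score


def fit_ensemble(modalities, y, learners=(centroid_scorer, knn_scorer)):
    """Train every learner type on every modality; return (modality, scorer) pairs."""
    return [(name, learner(X, y))
            for name, X in modalities.items()
            for learner in learners]


def predict(ensemble, query):
    """Aggregate the base scores with an unweighted mean (an assumed aggregator)."""
    return mean(scorer(query[name]) for name, scorer in ensemble)


# Toy training set: 4 interacting (1) and 4 non-interacting (0) pairs.
# All feature values are made up for illustration.
y = [1, 1, 1, 1, 0, 0, 0, 0]
train = {
    "structure": [[1.0, 1.0], [0.9, 1.1], [1.1, 0.9], [1.0, 0.8],
                  [0.0, 0.0], [0.1, -0.1], [-0.1, 0.1], [0.0, 0.2]],
    # The expression modality is deliberately noisy (one outlier per class).
    "expression": [[0.8], [0.9], [0.7], [0.2], [0.1], [0.2], [0.3], [0.8]],
}

ensemble = fit_ensemble(train, y)

# A pair that looks positive by structure but negative by expression.
query = {"structure": [1.0, 1.0], "expression": [0.2]}
ei_score = predict(ensemble, query)
expr_only = predict([m for m in ensemble if m[0] == "expression"], query)
print(f"integrated: {ei_score:.2f}, expression only: {expr_only:.2f}")
```

In this toy case the expression-only predictors score the pair below 0.5, while the integrated score stays above 0.5 because the structure-based predictors agree strongly. A learned aggregator trained on the base scores could replace the mean, but that choice is likewise an assumption here, since the abstract does not specify the aggregation step.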
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants by Gaurav Pandey
Multi-modal data integration to identify kinase substrates
- Grant number: 10659156
- Fiscal year: 2022
- Funding amount: $498,500
- Project category:
Integrating genomic and clinical data to predict disease phenotypes using heterogeneous ensembles
- Grant number: 10218766
- Fiscal year: 2021
- Funding amount: $498,500
- Project category:
Integrating genomic and clinical data to predict disease phenotypes using heterogeneous ensembles
- Grant number: 10589827
- Fiscal year: 2021
- Funding amount: $498,500
- Project category:
Integrating genomic and clinical data to predict disease phenotypes using heterogeneous ensembles
- Grant number: 10409755
- Fiscal year: 2021
- Funding amount: $498,500
- Project category:
Boosting the Translational Impact of Scientific Competitions by Ensemble Learning
- Grant number: 8864679
- Fiscal year: 2015
- Funding amount: $498,500
- Project category:
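Similar National Natural Science Foundation of China (NSFC) grants
Study of the binding mechanism between yeast polysaccharides and mycotoxins
- Grant number:
- Approval year: 2024
- Funding amount: CNY 0
- Project category: Provincial/municipal project
Effect of binding proteins on PEP loading and their function in the light signaling pathway
- Grant number: LZ21C020001
- Approval year: 2020
- Funding amount: CNY 0
- Project category: Provincial/municipal project
Biochemical mechanism by which the EIN2 protein regulates mRNA translation and activates ethylene signaling in Arabidopsis
- Grant number: 31870254
- Approval year: 2018
- Funding amount: CNY 600,000
- Project category: General Program
How the binding mechanism of rice protein and ferulic acid regulates the antioxidant and simulated gastrointestinal digestion properties of the complex
- Grant number: 31760433
- Approval year: 2017
- Funding amount: CNY 380,000
- Project category: Regional Science Fund project
Cytological basis and biochemical mechanism of pollen tube growth regulation by Arabidopsis fimbrin5
- Grant number: 31671390
- Approval year: 2016
- Funding amount: CNY 600,000
- Project category: General Program
Physiological and biochemical functions of subclass III members of the Arabidopsis actin-depolymerizing factor family
- Grant number: 31670180
- Approval year: 2016
- Funding amount: CNY 650,000
- Project category: General Program
Quantitative analysis of biochemical indicators by molecular hyperspectral imaging combined with neural classification
- Grant number: 61240006
- Approval year: 2012
- Funding amount: CNY 100,000
- Project category: Special Fund project
Biohydrogen production combining synthetic biology and reactor technology
- Grant number: 21176153
- Approval year: 2011
- Funding amount: CNY 650,000
- Project category: General Program
Biochemical function and expression regulation of a carotenoid-binding protein of Deinococcus radiodurans
- Grant number: 31170079
- Approval year: 2011
- Funding amount: CNY 600,000
- Project category: General Program
Proteomic study of proteins binding the candidate tumor suppressor 14-3-3σ in esophageal squamous cell carcinoma
- Grant number: 30700366
- Approval year: 2007
- Funding amount: CNY 170,000
- Project category: Young Scientists Fund project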
Similar overseas grants
Biochemical characterization of an inflammation related protein, mTOC (Celastramycin binding protein)
- Grant number: 17K07346
- Fiscal year: 2017
- Funding amount: $498,500
- Project category: Grant-in-Aid for Scientific Research (C)
Characterization of the impact of Arginine Methylation of RNA Binding Proteins on Their Biochemical
- Grant number: 511321-2017
- Fiscal year: 2017
- Funding amount: $498,500
- Project category: University Undergraduate Student Research Awards
Biochemical & Genetic Analysis of Low Complexity Domains in RNA-binding protein biology
- Grant number: 9335978
- Fiscal year: 2016
- Funding amount: $498,500
- Project category:
Biochemical & Genetic Analysis of Low Complexity Domains in RNA-binding protein biology
- Grant number: 9158657
- Fiscal year: 2016
- Funding amount: $498,500
- Project category:
EAGER: Biochemical Mechanism of Oomycete RXLR Effector Binding to PI3P
- Grant number: 1449122
- Fiscal year: 2014
- Funding amount: $498,500
- Project category: Standard Grant
Biochemical analysis of plant calcium-binding proteins
- Grant number: 448832-2013
- Fiscal year: 2013
- Funding amount: $498,500
- Project category: University Undergraduate Student Research Awards
Genetic and biochemical analysis of the CaMK family of calmodulin-binding kinases in root and nodule function of Glycine max and Medicago truncatula
- Grant number: 409766-2011
- Fiscal year: 2013
- Funding amount: $498,500
- Project category: Postgraduate Scholarships - Doctoral
Genetic and biochemical analysis of the CaMK family of calmodulin-binding kinases in root and nodule function of Glycine max and Medicago truncatula
- Grant number: 409766-2011
- Fiscal year: 2012
- Funding amount: $498,500
- Project category: Postgraduate Scholarships - Doctoral
Biochemical, cellular and molecular studies to dissect the contribution of the soluble host carbohydrate binding proteins to HIV-1 pathogenesis
- Grant number: 239201
- Fiscal year: 2011
- Funding amount: $498,500
- Project category: Operating Grants
Genetic and biochemical analysis of the CaMK family of calmodulin-binding kinases in root and nodule function of Glycine max and Medicago truncatula
- Grant number: 409766-2011
- Fiscal year: 2011
- Funding amount: $498,500
- Project category: Postgraduate Scholarships - Doctoral