Random Forest Prediction of Protein-Ligand Binding Affinities

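Basic information

  • Grant number:
    BB/G000247/1
  • Principal investigator:
  • Amount:
    $102,800
  • Host institution:
  • Host institution country:
    United Kingdom
  • Project type:
    Research Grant
  • Fiscal year:
    2009
  • Funding country:
    United Kingdom
  • Duration:
    2009 to (no data)
  • Project status:
    Completed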

Project abstract

The binding affinity between a small-molecule ligand and the protein with which it interacts is not easy to calculate. Indeed, its computational prediction remains one of the most important and difficult unsolved problems in computational biochemical science. Most medicines, and many other molecules with uses ranging from agrochemicals to deodorants, are ligands that bind to proteins. The proteins may come from humans, or from a pathogenic or undesirable organism such as a bacterium. It would be very beneficial to be able to predict binding affinities using a computer, because the alternative experimental approach of making very many molecules and assaying them against the relevant protein or proteins is difficult, expensive and time-consuming. The computer calculates an estimated binding affinity using a mathematical formula known as a scoring function. The development of suitable scoring functions for ranking possible three-dimensional protein-ligand interaction geometries, and especially for accurate prediction of protein-ligand binding affinities, remains a considerable challenge. The scoring function must capture all the important aspects of the interaction in order to give an accurate and reliable prediction of the binding affinity. In order to develop better scoring functions, we are looking to the fields of machine learning and informatics, and will require the known binding affinities and structures of numerous well-characterised protein-ligand complexes. Fortunately, many hundreds of protein-ligand complexes have both structures and binding affinities available. The method we will use is called Random Forest. The forest is a set of several hundred 'decision trees', each of which is basically a flow diagram. We will train them to learn patterns in the known properties of existing protein-ligand complexes, their binding affinities and their patterns of atom-atom interaction distances.
However, the way in which we will generate the trees involves computer-simulated dice-rolling. This will ensure that they are all different, though based on the same underlying information. The decision trees then each make a prediction of the unknown binding affinity. These predictions are averaged to give the final computed value. This averaging over many decision trees maximises the use of the information contained in the underlying data and produces results which are much more accurate than those of any one decision tree. Our models will be validated by using them to predict binding affinities of protein-ligand complexes that the algorithm has not seen before. This ensures that the computer is not simply learning the idiosyncrasies of the data on which it is being trained.
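The procedure the abstract describes (randomised tree growth, averaging of tree predictions, and validation on complexes the model has not seen) can be illustrated with a minimal pure-Python sketch. Everything specific here is an assumption made for illustration and is not taken from the project: the data are synthetic, the six features stand in for hypothetical atom-pair contact counts, and the tree depth, forest size and function names are arbitrary choices.

```python
import random

def avg(values):
    return sum(values) / len(values)

def sse(values):
    """Sum of squared deviations from the mean."""
    m = avg(values)
    return sum((v - m) ** 2 for v in values)

def grow_tree(X, y, rng, depth=0, max_depth=4, min_leaf=3, n_try=3):
    """Grow one regression tree; each split examines a random subset of features."""
    if depth >= max_depth or len(y) < 2 * min_leaf:
        return avg(y)                                  # leaf: mean affinity of its samples
    best = None
    for f in rng.sample(range(len(X[0])), n_try):      # the "dice roll" at each split
        for t in sorted({row[f] for row in X}):
            left = [i for i, row in enumerate(X) if row[f] <= t]
            right = [i for i, row in enumerate(X) if row[f] > t]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            score = sse([y[i] for i in left]) + sse([y[i] for i in right])
            if best is None or score < best[0]:
                best = (score, f, t, left, right)
    if best is None:                                   # no admissible split found
        return avg(y)
    _, f, t, left, right = best
    kids = [grow_tree([X[i] for i in idx], [y[i] for i in idx], rng,
                      depth + 1, max_depth, min_leaf, n_try)
            for idx in (left, right)]
    return (f, t, kids[0], kids[1])

def predict_tree(node, row):
    """Follow the flow diagram from the root down to a leaf value."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node

def grow_forest(X, y, n_trees=30, seed=0):
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(y)) for _ in y]       # bootstrap: sample with replacement
        forest.append(grow_tree([X[i] for i in idx], [y[i] for i in idx], rng))
    return forest

def predict_forest(forest, row):
    """Final prediction: the average over all trees."""
    return avg([predict_tree(tree, row) for tree in forest])

def make_complexes(n, rng):
    """Synthetic stand-in data: 6 hypothetical contact counts and a noisy affinity."""
    X, y = [], []
    for _ in range(n):
        c = [rng.randint(0, 20) for _ in range(6)]
        X.append(c)
        y.append(4.0 + 0.15 * c[0] + 0.10 * c[1] - 0.08 * c[2] + rng.gauss(0, 0.3))
    return X, y

def rmse(pred, true):
    return (sum((p - t) ** 2 for p, t in zip(pred, true)) / len(true)) ** 0.5

data_rng = random.Random(42)
X_train, y_train = make_complexes(150, data_rng)
X_test, y_test = make_complexes(50, data_rng)          # held out: never seen in training

forest = grow_forest(X_train, y_train)
forest_rmse = rmse([predict_forest(forest, r) for r in X_test], y_test)
baseline_rmse = rmse([avg(y_train)] * len(y_test), y_test)
print(f"forest RMSE on unseen data:   {forest_rmse:.2f}")
print(f"mean-only baseline RMSE:      {baseline_rmse:.2f}")
```

Comparing the forest against a baseline that always predicts the training-set mean, on data held out from training, is one simple way to check that the model has learned more than the idiosyncrasies of its training set.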

Project outputs

Journal articles (4)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Hierarchical virtual screening for the discovery of new molecular scaffolds in antibacterial hit identification.
  • DOI:
    10.1098/rsif.2012.0569
  • Publication date:
    2012-12-07
  • Journal:
  • Impact factor:
    0
  • Authors:
    Ballester PJ; Mangold M; Howard NI; Robinson RL; Abell C; Blumberger J; Mitchell JB
  • Corresponding author:
    Mitchell JB
Informatics, machine learning and computational medicinal chemistry.
  • DOI:
    10.4155/fmc.11.11
  • Publication date:
    2011
  • Journal:
  • Impact factor:
    4.2
  • Authors:
    Mitchell JB
  • Corresponding author:
    Mitchell JB

Other publications by John Mitchell

The Origin, Nature, and Importance of Soil Organic Constituents having Base Exchange Properties
  • DOI:
    10.2134/agronj1932.00021962002400040002x
  • Publication date:
    1932
  • Journal:
  • Impact factor:
    2.1
  • Authors:
    John Mitchell
  • Corresponding author:
    John Mitchell
Securing the Future of GenAI: Policy and Technology
  • DOI:
  • Publication date:
  • Journal:
  • Impact factor:
    0
  • Authors:
    Mihai Christodorescu; Ryan Craven; S. Feizi; Neil Gong; Mia Hoffmann; Somesh Jha; Zhengyuan Jiang; Mehrdad Saberi Kamarposhti; John Mitchell; Jessica Newman; Emelia Probasco; Yanjun Qi; Khawaja Shams; Matthew Turek
  • Corresponding author:
    Matthew Turek
The creativity quotient: An objective scoring of ideational fluency
  • DOI:
    10.1080/10400410409534552
  • Publication date:
    2004
  • Journal:
  • Impact factor:
    2.6
  • Authors:
    A. Snyder; John Mitchell; T. Bossomaier; G. Pallier
  • Corresponding author:
    G. Pallier
Uncertainty in the IPCC's Third Assessment Report
  • DOI:
    10.1126/science.1062823
  • Publication date:
    2001
  • Journal:
  • Impact factor:
    56.9
  • Authors:
    M. Allen; S. Raper; John Mitchell
  • Corresponding author:
    John Mitchell
Identification of organic compounds by microscopy and X-ray diffractometry
  • DOI:
    10.1007/bf01216628
  • Publication date:
    1956-01-01
  • Journal:
  • Impact factor:
    5.300
  • Authors:
    John Mitchell; Ada L. Ryland
  • Corresponding author:
    Ada L. Ryland

Other grants held by John Mitchell

AMPS: Mathematical Foundations of Market Operations with Renewable Bidders
  • Grant number:
    2229335
  • Fiscal year:
    2023
  • Funding amount:
    $102,800
  • Project type:
    Standard Grant
AMPS: Rank Minimization Algorithms for Wide-Area Phasor Measurement Data Processing
  • Grant number:
    1736326
  • Fiscal year:
    2017
  • Funding amount:
    $102,800
  • Project type:
    Standard Grant
SaTC-EDU: EAGER: Cybersecurity education for public policy
  • Grant number:
    1500089
  • Fiscal year:
    2015
  • Funding amount:
    $102,800
  • Project type:
    Standard Grant
Collaborative Research: Binary Constrained Convex Quadratic Programs with Complementarity Constraints and Extensions
  • Grant number:
    1334327
  • Fiscal year:
    2013
  • Funding amount:
    $102,800
  • Project type:
    Standard Grant
Machine Learning Approaches to Predict Enzyme Function
  • Grant number:
    BB/I00596X/1
  • Fiscal year:
    2011
  • Funding amount:
    $102,800
  • Project type:
    Research Grant
Machine Learning Methods for Predicting Phospholipidosis
  • Grant number:
    EP/F049102/1
  • Fiscal year:
    2008
  • Funding amount:
    $102,800
  • Project type:
    Research Grant
Collaborative Research: CT-M: Privacy, Compliance and Information Risk in Complex Organizational Processes
  • Grant number:
    0831199
  • Fiscal year:
    2008
  • Funding amount:
    $102,800
  • Project type:
    Continuing Grant
Cutting Planes and Surfaces, and Conic Programming
  • Grant number:
    0715446
  • Fiscal year:
    2007
  • Funding amount:
    $102,800
  • Project type:
    Standard Grant
Collaborative research: High-Fidelity Methods for Security Protocols
  • Grant number:
    0430594
  • Fiscal year:
    2004
  • Funding amount:
    $102,800
  • Project type:
    Continuing Grant
Polyhedral and Non-polyhedral Cutting Plane Methods: Theory, Algorithms and Applications
  • Grant number:
    0317323
  • Fiscal year:
    2003
  • Funding amount:
    $102,800
  • Project type:
    Standard Grant

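Similar NSFC (National Natural Science Foundation of China) grants

Research on surface-enhanced Raman spectroscopy analysis methods based on the Deep Forest model
  • Grant number:
    2020A151501709
  • Approval year:
    2020
  • Funding amount:
    100,000 CNY
  • Project type:
    Provincial/municipal-level project
Response mechanisms of soil microorganisms to fire disturbance in Dahurian larch (Larix gmelinii) forest
  • Grant number:
    31870644
  • Approval year:
    2018
  • Funding amount:
    600,000 CNY
  • Project type:
    General Program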

Similar overseas grants

Human Forests versus Random Forest Models in Prediction
  • Grant number:
    2050727
  • Fiscal year:
    2020
  • Funding amount:
    $102,800
  • Project type:
    Standard Grant
Technology Transfer of New Forest-fire Smoke Prediction System
  • Grant number:
    553150-2020
  • Fiscal year:
    2020
  • Funding amount:
    $102,800
  • Project type:
    University Undergraduate Student Research Awards
Human Forests versus Random Forest Models in Prediction
  • Grant number:
    1919333
  • Fiscal year:
    2019
  • Funding amount:
    $102,800
  • Project type:
    Standard Grant
Transfer of radiocesium from canopy to forest floor in deciduous oak dominated forests and the prediction of resume of leaf litter origin compost production
  • Grant number:
    19K06140
  • Fiscal year:
    2019
  • Funding amount:
    $102,800
  • Project type:
    Grant-in-Aid for Scientific Research (C)
Research about estimation of roots condition in forest using LiDAR data and slope collapse prediction
  • Grant number:
    19H01369
  • Fiscal year:
    2019
  • Funding amount:
    $102,800
  • Project type:
    Grant-in-Aid for Scientific Research (B)
Collaborative Research: Unfolding the Link between Forest Canopy Structure and Flow Morphology: A Physics-based Representation for Numerical Weather Prediction Simulations
  • Grant number:
    1712530
  • Fiscal year:
    2017
  • Funding amount:
    $102,800
  • Project type:
    Standard Grant
Collaborative Research: Unfolding the Link between Forest Canopy Structure and Flow Morphology: A Physics-based Representation for Numerical Weather Prediction Simulations
  • Grant number:
    1712532
  • Fiscal year:
    2017
  • Funding amount:
    $102,800
  • Project type:
    Standard Grant
Collaborative Research: Unfolding the Link between Forest Canopy Structure and Flow Morphology: A Physics-based Representation for Numerical Weather Prediction Simulations
  • Grant number:
    1712538
  • Fiscal year:
    2017
  • Funding amount:
    $102,800
  • Project type:
    Standard Grant
Development of prediction model for ambient dose rate based on the analysis of radiocesium dynamics in forest environment
  • Grant number:
    16K16201
  • Fiscal year:
    2016
  • Funding amount:
    $102,800
  • Project type:
    Grant-in-Aid for Young Scientists (B)
Changes and future prediction of provisioning services of non-timber forest products in Fukushima Prefecture
  • Grant number:
    15K18717
  • Fiscal year:
    2015
  • Funding amount:
    $102,800
  • Project type:
    Grant-in-Aid for Young Scientists (B)