Building machine learning models and neural networks trained on structural information of drug targets to predict antimicrobial resistance
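Basic information
- Grant number: 2597363
- Principal investigator:
- Funding amount: --
- Host institution:
- Host institution country: United Kingdom
- Project category: Studentship
- Fiscal year: 2021
- Funding country: United Kingdom
- Period: 2021 to (no data)
- Project status: Ongoing
- Source:
- Keywords:
Project abstract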
The proposed project focuses on training machine learning models on protein structural, chemical and evolutionary features of relevant antibiotic targets to predict antimicrobial resistance (AMR) in Mycobacterium tuberculosis (Mtb). Whilst many researchers use genetic features to predict resistance, we have previously demonstrated that traditional machine learning models trained on structural and biophysical features of RNA polymerase can robustly and accurately predict the effect of a missense mutation on rifampicin susceptibility. However, these models are inherently unable to predict the combined effect of multiple mutations, which restricts the usable data to a subset of the available mutation data and so limits the clinical applicability of the models. The primary goal of the DPhil project is to address this. The student will have access to the dataset of around 70,000 clinical TB samples amassed by the international CRyPTIC project, which was led by Oxford and is reporting its main findings through a series of publications. CRyPTIC collected 15,211 samples, each of which was whole-genome sequenced and had its susceptibility to 13 antibiotics measured using a 96-well broth microdilution plate. A limitation of this dataset was the lack of resistance to new compounds, such as bedaquiline. One of the CRyPTIC partners has recently provided c. 1000 high-value samples that are extensively resistant, and the initial aims (Y1) of this DPhil are to analyse this additional dataset, including retraining previously developed machine learning models, and to develop a rigorous statistical analysis pipeline that enables continuous, robust and easily accessible performance assessment and benchmarking.
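As an illustration of the kind of performance assessment mentioned above, the following is a minimal sketch, not the project's actual pipeline. It assumes binary resistant ("R") or susceptible ("S") labels and reports sensitivity and specificity with Wilson score confidence intervals. The function names `wilson_interval` and `assess` and the toy labels are hypothetical and chosen only for this example.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    if total == 0:
        return (0.0, 1.0)  # no data: the proportion is unconstrained
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return (max(0.0, centre - half), min(1.0, centre + half))


def assess(y_true, y_pred):
    """Sensitivity and specificity for 'R' (resistant) vs 'S' (susceptible) calls."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == "R" and p == "R" for t, p in pairs)  # resistant, called resistant
    fn = sum(t == "R" and p == "S" for t, p in pairs)  # resistant, missed
    tn = sum(t == "S" and p == "S" for t, p in pairs)  # susceptible, called susceptible
    fp = sum(t == "S" and p == "R" for t, p in pairs)  # susceptible, called resistant

    def rate(hits, total):
        return hits / total if total else float("nan")

    return {
        "sensitivity": (rate(tp, tp + fn), wilson_interval(tp, tp + fn)),
        "specificity": (rate(tn, tn + fp), wilson_interval(tn, tn + fp)),
    }


# Toy labels invented purely for illustration (not CRyPTIC data).
y_true = ["R", "R", "R", "R", "S", "S", "S", "S"]
y_pred = ["R", "R", "R", "S", "S", "S", "S", "R"]

report = assess(y_true, y_pred)
for name, (estimate, (low, high)) in report.items():
    print(f"{name}: {estimate:.2f} (95% CI {low:.2f} to {high:.2f})")
```

Reporting an interval alongside each point estimate makes it easier to compare retrained models on small, high-value subsets, where a point estimate alone can be misleading.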
This will facilitate the primary objective of the project: developing graph convolutional neural networks (gCNNs) featurised with structural and chemical data to predict AMR conferred by multiple mutations against first- and second-line anti-TB compounds. The hypothesis underlying this approach is that the topology of gCNNs can accurately capture all the information from a resistant allele, thereby allowing machine learning models to be trained efficiently and permitting protein targets with high levels of genetic variability to be considered for the first time. A logical extension, time permitting, would be to incorporate dynamic data extracted from molecular dynamics trajectories into the feature sets and assess the impact on model performance. Besides being a more intuitive architecture for representing structural data than conventional convolutional neural networks (CNNs), gCNNs preserve the concepts of the atom and the chemical bond until the final layers of the network, thereby preserving spatial information about the drug target. This allows atom embeddings to be interrogated to improve model attribution, a concept particularly relevant in clinically applicable molecular diagnostics. Although the use of structural data to predict AMR is still a relatively new approach, the real novelty of this methodology is that, to date, the field of AMR prediction has been largely unable to benefit from neural networks trained on sufficiently large datasets, and particularly from neural networks trained on structural, physicochemical and spatial information about the drug target. Furthermore, equivariant graph neural networks (which arguably show the most potential) are extremely new (2021) and, for structural modelling problems, have mostly been adopted by groups focussing on binding affinity prediction, not AMR prediction.
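To make the graph idea concrete, here is a minimal sketch of a single graph convolution over a structure-derived graph, written in plain Python. It is an assumption-laden toy, not the project's architecture: nodes are residues rather than atoms, the two node features, the 8 angstrom contact cutoff, the identity weight matrix and the names `contact_graph`, `gcn_layer` and `mean_pool` are all hypothetical. It shows how several mutations can be encoded at once, since each mutation simply changes the features of its own node.

```python
import math


def contact_graph(coords, cutoff=8.0):
    """Adjacency matrix (with self-loops) linking nodes within `cutoff` angstroms."""
    n = len(coords)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j or math.dist(coords[i], coords[j]) <= cutoff:
                adj[i][j] = 1.0
    return adj


def gcn_layer(adj, feats, weights):
    """One graph convolution: ReLU(D^-1/2 A D^-1/2 X W), with self-loops in A."""
    n = len(adj)
    deg = [sum(row) for row in adj]
    # Aggregate neighbour features with symmetric degree normalisation.
    agg = [
        [
            sum(adj[i][j] / math.sqrt(deg[i] * deg[j]) * feats[j][k] for j in range(n))
            for k in range(len(feats[0]))
        ]
        for i in range(n)
    ]
    # Linear transform followed by ReLU; one embedding is kept per node.
    return [
        [
            max(0.0, sum(agg[i][k] * weights[k][m] for k in range(len(weights))))
            for m in range(len(weights[0]))
        ]
        for i in range(n)
    ]


def mean_pool(node_embeddings):
    """Average node embeddings into one graph-level vector."""
    n = len(node_embeddings)
    return [sum(col) / n for col in zip(*node_embeddings)]


# Toy coordinates and features, invented for illustration only.
coords = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (10.0, 0.0, 0.0), (30.0, 0.0, 0.0)]
wild_type = [[0.5, 0.0], [0.2, 1.0], [0.9, 0.0], [0.1, -1.0]]

# A double mutant: two nodes change their features at the same time.
double_mutant = [row[:] for row in wild_type]
double_mutant[0] = [0.1, 1.0]
double_mutant[2] = [0.3, -1.0]

weights = [[1.0, 0.0], [0.0, 1.0]]  # untrained placeholder weights
adj = contact_graph(coords)
wt_nodes = gcn_layer(adj, wild_type, weights)
mut_nodes = gcn_layer(adj, double_mutant, weights)

print("wild type graph embedding:", mean_pool(wt_nodes))
print("double mutant graph embedding:", mean_pool(mut_nodes))
```

Because each node keeps its own embedding until the pooling step, the per-node vectors in `wt_nodes` and `mut_nodes` can be compared directly, which is the property the abstract relies on for attribution. In this toy, the node in contact with both mutated sites changes, while the isolated node is unaffected.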
This project would fall within the following EPSRC research themes: AI and data science; Antimicrobial resistance; Biological informatics; Biophysics; Clinical technologies; Software engineering.
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants
An implantable biosensor microsystem for real-time measurement of circulating biomarkers
- Grant number: 2901954
- Fiscal year: 2028
- Funding amount: --
- Project category: Studentship
Exploiting the polysaccharide breakdown capacity of the human gut microbiome to develop environmentally sustainable dishwashing solutions
- Grant number: 2896097
- Fiscal year: 2027
- Funding amount: --
- Project category: Studentship
A Robot that Swims Through Granular Materials
- Grant number: 2780268
- Fiscal year: 2027
- Funding amount: --
- Project category: Studentship
Likelihood and impact of severe space weather events on the resilience of nuclear power and safeguards monitoring.
- Grant number: 2908918
- Fiscal year: 2027
- Funding amount: --
- Project category: Studentship
Proton, alpha and gamma irradiation assisted stress corrosion cracking: understanding the fuel-stainless steel interface
- Grant number: 2908693
- Fiscal year: 2027
- Funding amount: --
- Project category: Studentship
Field Assisted Sintering of Nuclear Fuel Simulants
- Grant number: 2908917
- Fiscal year: 2027
- Funding amount: --
- Project category: Studentship
Assessment of new fatigue capable titanium alloys for aerospace applications
- Grant number: 2879438
- Fiscal year: 2027
- Funding amount: --
- Project category: Studentship
Developing a 3D printed skin model using a Dextran - Collagen hydrogel to analyse the cellular and epigenetic effects of interleukin-17 inhibitors in
- Grant number: 2890513
- Fiscal year: 2027
- Funding amount: --
- Project category: Studentship
Understanding the interplay between the gut microbiome, behavior and urbanisation in wild birds
- Grant number: 2876993
- Fiscal year: 2027
- Funding amount: --
- Project category: Studentship
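Similar grants from the National Natural Science Foundation of China
Understanding structural evolution of galaxies with machine learning
- Grant number: n/a
- Approval year: 2022
- Funding amount: CNY 100,000
- Project category: Provincial or municipal project
Optimal dynamic policies for non-standard stochastic scheduling models
- Grant number: 71071056
- Approval year: 2010
- Funding amount: CNY 280,000
- Project category: General Program
Self-organizing modeling and optimal control of microbial fermentation processes
- Grant number: 60704036
- Approval year: 2007
- Funding amount: CNY 210,000
- Project category: Young Scientists Fund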
Similar overseas grants
TRUST2 - Improving TRUST in artificial intelligence and machine learning for critical building management
- Grant number: 10093095
- Fiscal year: 2024
- Funding amount: --
- Project category: Collaborative R&D
CAREER: Building the Merger Tree of the Milky Way with Machine Learning
- Grant number: 2337864
- Fiscal year: 2024
- Funding amount: --
- Project category: Continuing Grant
Analysis and design of building frames using machine learning considering uncertainty of parameters
- Grant number: 23K04104
- Fiscal year: 2023
- Funding amount: --
- Project category: Grant-in-Aid for Scientific Research (C)
Building predictive algorithms to identify resilience and resistance to Alzheimer's disease
- Grant number: 10659007
- Fiscal year: 2023
- Funding amount: --
- Project category:
A macromolecular structure building toolkit for machine learning and cloud applications
- Grant number: BB/X006492/1
- Fiscal year: 2023
- Funding amount: --
- Project category: Research Grant
Building Resources to Assess Impaired Neurocognition for Care and Research among Adults Aging with HIV (BRAIN Care HIV)
- Grant number: 10756384
- Fiscal year: 2023
- Funding amount: --
- Project category:
Building Health System Capacity to Develop, Validate, and Deploy Novel Machine Learning Population-Based Risk Tools to Support Population Health Management
- Grant number: 484592
- Fiscal year: 2023
- Funding amount: --
- Project category: Fellowship Programs
TRUST - Improving TRUST in artificial intelligence and machine learning for critical building management
- Grant number: 10068043
- Fiscal year: 2023
- Funding amount: --
- Project category: Collaborative R&D
Historical Building Classification Learning System using "Reward" Indicators based on Machine Learning
- Grant number: 23K16903
- Fiscal year: 2023
- Funding amount: --
- Project category: Grant-in-Aid for Early-Career Scientists
“TrustScore” fintech machine-learning: growing the UK economy from enabling underserved communities to access mainstream financial services by building a credit-file through their established practice of saving clubs
- Grant number: 10052433
- Fiscal year: 2023
- Funding amount: --
- Project category: Collaborative R&D