GPU-based Machine Learning System for fundamental biological research
Basic Information
- Grant number: BB/V019805/1
- Principal investigator:
- Amount: $517,800
- Host institution:
- Host institution country: United Kingdom
- Project category: Research Grant
- Fiscal year: 2021
- Funding country: United Kingdom
- Duration: 2021 to (no data)
- Project status: Completed
- Source:
- Keywords:
Project Summary
In the past decade, the biological sciences have witnessed a major shift towards data-driven research. Consequently, high-performance computing has become a standard research tool in the life sciences. Biological data, however, is not only large but also highly complex. This complexity requires alternatives to conventional data analysis. Machine learning (ML) has emerged as a powerful methodology that can successfully tackle the analysis of complex biological data.

It is a hopeless task to attempt to develop a mathematical model of an elephant. Yet a three-year-old child can easily point to an elephant in a photo. The child was once shown a picture of an elephant and told that the object in the picture was an elephant. In other words, she learnt to recognise an elephant by seeing photos of it, and now she can identify one on her own. ML emulates this learning process on a computer. Instead of building a precise description of patterns, the computer is "taught" to recognise them. This is a paradigm shift from conventional computing and thus brings its own challenges. Notably, the brain is well suited to learning by example, yet it performs poorly at long division. Computers, on the other hand, have been designed to perform numerical operations with great speed and precision. It is therefore not surprising that emulating an inherently heuristic process such as learning on a computer requires substantial computational effort. With recent advances in Graphics Processing Unit (GPU) and Solid State Drive (SSD) technologies, the necessary computing power has become broadly available. Traditional High-Performance Computing facilities, however, are not well suited to ML applications.

ML has been used successfully in biology for more than two decades. An excellent example is the prediction of the viability of cancer cells when exposed to a drug. The idea is to associate a response (e.g., whether a cancer cell survives or not) with a set of characteristics, or features (e.g., which genes are mutated and what the chemical properties of the drug are). In so-called supervised learning, the machine is presented with a large set of training data that contains the correct responses for given input parameters. Based on that data, the machine learns to predict the response for new, previously unseen parameters. A major challenge is that it is often not easy to identify the appropriate features. Cells are very complex, and it is often unclear which features are most relevant to a specific response, e.g., mutations in which genes should be considered. An expert is therefore required to prepare an appropriate training set.

In recent years, so-called deep learning techniques have revolutionised the learning process by allowing the machine to extract the key features from raw data automatically. This is achieved by a set of model neurons, inspired by biological neural cells, organised in a layered network (i.e., a neural network). Information propagates through the layers of the network, which enables each layer to capture progressively more abstract features in the data. This drastically reduces the need for carefully tailored training sets and makes ML applicable to a wider range of problems, especially those for which expert-made training sets are unavailable or too costly to produce. Deep learning approaches, however, require substantial computational resources.
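To make the supervised-learning and deep-learning ideas above concrete, the following is a minimal sketch of the kind of workload involved. It is purely illustrative and not project code: it assumes PyTorch is available (the proposal does not name a framework), and the feature matrix, labels, and layer sizes are invented placeholders standing in for, say, gene-mutation flags and a survive/die response.

```python
# Purely illustrative sketch (not project code): a small layered network trained
# in a supervised fashion on made-up "features -> response" data. Assumes PyTorch.
import torch
import torch.nn as nn

# Train on a GPU when one is present, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Hypothetical training data: 1024 samples, each with 128 input features
# (e.g. gene-mutation flags and drug descriptors) and a binary response
# (e.g. cell survives / dies). The numbers are random placeholders.
x = torch.randn(1024, 128, device=device)
y = torch.randint(0, 2, (1024,), device=device)

# A small layered ("deep") network; each layer builds on the previous one.
model = nn.Sequential(
    nn.Linear(128, 256), nn.ReLU(),
    nn.Linear(256, 256), nn.ReLU(),
    nn.Linear(256, 2),
).to(device)

optimiser = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for step in range(200):            # supervised training loop
    optimiser.zero_grad()
    loss = loss_fn(model(x), y)    # compare predictions with the known responses
    loss.backward()                # gradients are computed on the chosen device
    optimiser.step()

print("final training loss:", loss.item())
```

Even this toy network is trained fastest on a GPU; the `device` line simply moves the model and data onto one when it is available.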
Typical deep learning neural networks contain tens to hundreds of layers, thousands of neurons, and hundreds of thousands of links between them. Training them therefore requires hardware that operates at TFLOPS speeds (trillions of operations per second) and can access data at several GB/s.

The aim of this proposal is to build a dedicated GPU-based system for applying deep-learning ML methods in fundamental biological research at the University of Dundee.
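As a rough illustration of the TFLOPS figure quoted above, the sketch below times a large matrix multiplication, the core operation of neural-network training, and reports the achieved throughput. It is an assumption-laden micro-benchmark, not a specification of the proposed system: it assumes PyTorch, uses an arbitrary matrix size, and falls back to the CPU if no GPU is present.

```python
# Rough, illustrative throughput check (not a rigorous benchmark): times a large
# matrix multiplication and reports the achieved TFLOPS. Assumes PyTorch.
import time
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
n = 4096
a = torch.randn(n, n, device=device)
b = torch.randn(n, n, device=device)

# Warm-up so one-off initialisation costs are excluded from the timing.
torch.matmul(a, b)
if device.type == "cuda":
    torch.cuda.synchronize()

reps = 10
start = time.perf_counter()
for _ in range(reps):
    torch.matmul(a, b)
if device.type == "cuda":
    torch.cuda.synchronize()
elapsed = time.perf_counter() - start

flops = 2 * n**3 * reps   # multiplications plus additions in an n x n matmul
print(f"approx. {flops / elapsed / 1e12:.2f} TFLOPS on {device}")
```

The gap between a CPU and a GPU on this measurement is typically orders of magnitude, which is the motivation for a dedicated GPU-based system.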
Project Outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other Publications by Rastko Sknepnek
Vertex model with internal dissipation enables sustained flows
- DOI: 10.1038/s41467-025-55820-2
- Publication date: 2025-01-09
- Journal:
- Impact factor: 15.700
- Authors: Jan Rozman; KVS Chaithanya; Julia M. Yeomans; Rastko Sknepnek
- Corresponding author: Rastko Sknepnek
Cell-Level Modelling of Homeostasis in Confined Epithelial Monolayers
- DOI: 10.1007/s10659-025-10120-0
- Publication date: 2025-02-24
- Journal:
- Impact factor: 1.400
- Authors: KVS Chaithanya; Jan Rozman; Andrej Košmrlj; Rastko Sknepnek
- Corresponding author: Rastko Sknepnek
Similar NSFC Grants
Data-driven Recommendation System Construction of an Online Medical Platform Based on the Fusion of Information
- Grant number:
- Year approved: 2024
- Amount:
- Project category: Research Fund for International Young Scientists
Incentive and governance mechanism study of corporate greenwashing behavior in China: based on an integrated view of the configuration of environmental authority and decoupling logic
- Grant number:
- Year approved: 2024
- Amount:
- Project category: Research Fund for International Scientists
Exploring the Intrinsic Mechanisms of CEO Turnover and Market Reaction: An Explanation Based on Information Asymmetry
- Grant number: W2433169
- Year approved: 2024
- Amount:
- Project category: Research Fund for International Scientists
In situ dynamic study of the nucleation and growth mechanisms of TCP phases in advanced Re- and Ru-containing Ni-based single-crystal superalloys
- Grant number: 52301178
- Year approved: 2023
- Amount: CNY 300,000
- Project category: Young Scientists Fund
Study of the effect of chemical inhomogeneity on the irradiation behaviour of NbZrTi-based multi-principal-element alloys
- Grant number: 12305290
- Year approved: 2023
- Amount: CNY 300,000
- Project category: Young Scientists Fund
Population-based epidemiological study of how the ocular surface microbiota influences the development of dry eye in diabetic patients
- Grant number: 82371110
- Year approved: 2023
- Amount: CNY 490,000
- Project category: General Program
Study of the evolution mechanisms of irradiation-induced dislocation loops in Ni-based UNS N10003 alloy and their effect on mechanical properties
- Grant number: 12375280
- Year approved: 2023
- Amount: CNY 530,000
- Project category: General Program
Study of the structural characteristics and structure-property relationships of CuAgSe-based thermoelectric materials
- Grant number: 22375214
- Year approved: 2023
- Amount: CNY 500,000
- Project category: General Program
A study on a prototype flexible multifunctional graphene foam-based sensing grid
- Grant number:
- Year approved: 2020
- Amount: CNY 200,000
- Project category:
Quantitative big-data study of the impact of urbanisation on seasonal influenza transmission in China and its underlying mechanisms
- Grant number: 82003509
- Year approved: 2020
- Amount: CNY 240,000
- Project category: Young Scientists Fund
Similar Overseas Grants
CAREER: Mitigating the Lack of Labeled Training Data in Machine Learning Based on Multi-level Optimization
- Grant number: 2339216
- Fiscal year: 2024
- Amount: $517,800
- Project category: Continuing Grant
Investigating the potential for developing self-regulation in foreign language learners through the use of computer-based large language models and machine learning
- Grant number: 24K04111
- Fiscal year: 2024
- Amount: $517,800
- Project category: Grant-in-Aid for Scientific Research (C)
STTR Phase II: Optimized manufacturing and machine learning based automation of Endothelium-on-a-chip microfluidic devices for drug screening applications.
- Grant number: 2332121
- Fiscal year: 2024
- Amount: $517,800
- Project category: Cooperative Agreement
A Novel Contour-based Machine Learning Tool for Reliable Brain Tumour Resection (ContourBrain)
- Grant number: EP/Y021614/1
- Fiscal year: 2024
- Amount: $517,800
- Project category: Research Grant
Synergising Process-Based and Machine Learning Models for Accurate and Explainable Crop Yield Prediction along with Environmental Impact Assessment
- Grant number: BB/Y513763/1
- Fiscal year: 2024
- Amount: $517,800
- Project category: Research Grant
SBIR Phase I: An inclusive machine learning-based digital platform to credential soft skills
- Grant number: 2317077
- Fiscal year: 2024
- Amount: $517,800
- Project category: Standard Grant
Adaptive Ising-machine-based Solvers for Large-scale Real-world Geospatial Optimization Problems
- Grant number: 24K20779
- Fiscal year: 2024
- Amount: $517,800
- Project category: Grant-in-Aid for Early-Career Scientists
DeepMARA - Deep Reinforcement Learning based Massive Random Access Toward Massive Machine-to-Machine Communications
- Grant number: EP/Y028252/1
- Fiscal year: 2024
- Amount: $517,800
- Project category: Fellowship
Toxicology-testing platform integrating immunocompetent in vitro/ex vivo modules with real-time sensing and machine learning based in silico models for life cycle assessment and SSbD
- Grant number: 10100967
- Fiscal year: 2024
- Amount: $517,800
- Project category: EU-Funded
Machine learning-based prediction models for morbidity and mortality risk of cardiometabolic diseases in post-disaster residents by using the Fukushima longitudinal health data
- Grant number: 24K13482
- Fiscal year: 2024
- Amount: $517,800
- Project category: Grant-in-Aid for Scientific Research (C)