Development of a Rapid Processing Pipeline and Graph-based Visualization for the Analysis of Next Generation Sequencing Data
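Basic information
- Grant number: BB/J019267/1
- Principal investigator:
- Amount: $246,600
- Host institution:
- Host institution country: United Kingdom
- Project type: Research Grant
- Fiscal year: 2012
- Funding country: United Kingdom
- Start and end dates: 2012 to (no data)
- Project status: Completed
- Source:
- Keywords:
Project abstract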
Over the last decade or so there has been an explosion of biological data emanating from new laboratory analysis platforms. These data are increasingly complex and large-scale. DNA sequencing in particular has revolutionized the biomedical and biological sciences. The recent availability of new DNA sequencing platforms means that orders of magnitude more data can be produced than was possible just a few years ago. These advances have further changed the way we think about scientific approaches to basic, applied and clinical research. For example, the ability to sequence the whole genome of many related organisms has allowed large-scale comparative and evolutionary studies to be performed that were until recently unimaginable. Sequencing can also be used to determine which genes are active in any given state or at any given time, through RNA sequencing for gene-expression analysis. In gene-expression studies, RNA sequencing can identify and quantify rare genes without prior knowledge and can provide information about sequence variation in the identified genes. When combined with 'pull-down' technologies, these approaches can also answer important questions about gene regulation, such as transcription factor or microRNA target binding. These advances in technology, however, come with significant analytical challenges, in particular with respect to the sheer scale of data now being produced. For example, a single run of an Illumina Solexa GA-2 machine alone produces approximately 100 Gb of sequence data. A number of approaches exist for the analysis of these data; however, they are usually slow and extremely computationally intensive, requiring large-memory computers or high-performance computing clusters. How best to analyse this information is the subject of ongoing and active discussion.
One approach to resolving some of these issues is both to develop fast, optimal algorithms for data analysis and to visualise and analyse data as network graphs. This proposal is to develop an optimised system for the analysis of such data. It will involve the development of extremely fast and optimised algorithms for processing the data, for which we have already created prototypes. We will utilise the relatively new field of GPU hardware acceleration to allow these algorithms to run significantly faster by using the specialised hardware on a consumer 3D graphics card. Data processed through the system will be visualised using a customised 3D visualisation environment designed around the existing BioLayout Express3D system. These sequence graphs have already proved useful in identifying novel sequence elements and aiding the assembly of their consensus sequences, in many cases helping to identify where issues lie. Furthermore, we intend to harness the power of correlation analysis for working with RNA-seq data, providing an integrated solution for moving from primary sequence data through to co-expression analysis of tags-per-gene summaries. In doing so, however, we will also provide network- and alignment-based views of the primary data that underpin the summary analyses. This will provide novel ways for users to see their data and how reads interact with each other and with the genome itself. The entire system will be modular, and each module will be accessed from a graphical user interface written in Java, which gives the user control over analysis modules and allows rapid analysis of large-scale datasets from the primary data to genome/gene-level analyses.
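The correlation-based co-expression step mentioned in the abstract can be illustrated with a minimal sketch. It assumes that per-gene read counts ("tags per gene") are already available for several samples, uses Pearson correlation as the similarity measure, and keeps gene pairs above a threshold as graph edges. The function names, the example counts, and the 0.9 threshold are illustrative assumptions and are not taken from the project or from BioLayout Express3D.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # A gene with constant counts has no defined correlation;
        # treat it as uncorrelated so it never forms an edge.
        return 0.0
    return cov / sqrt(var_x * var_y)


def coexpression_edges(counts, threshold=0.9):
    """Return (gene1, gene2, r) for every gene pair with r >= threshold."""
    edges = []
    for g1, g2 in combinations(sorted(counts), 2):
        r = pearson(counts[g1], counts[g2])
        if r >= threshold:
            edges.append((g1, g2, r))
    return edges


# Hypothetical tags-per-gene counts across four samples.
counts = {
    "geneA": [10, 20, 30, 40],
    "geneB": [12, 22, 29, 41],
    "geneC": [40, 30, 20, 10],
    "geneD": [5, 5, 5, 5],
}

edges = coexpression_edges(counts, threshold=0.9)
for g1, g2, r in edges:
    print(f"{g1} -- {g2} (r = {r:.3f})")
```

With these made-up counts only geneA and geneB are connected; geneC is anti-correlated with both and geneD has constant counts. The pairwise comparison is quadratic in the number of genes, which is the kind of workload the abstract proposes to accelerate on GPU hardware.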
Project outputs
Journal articles (9)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Network-based visualization and analysis of next-generation sequencing (NGS) data
- DOI:
- Published: 2017
- Journal:
- Impact factor: 0
- Authors: Nazarie W.F.
- Corresponding author: Nazarie W.F.
Visualisation and analysis of RNA-Seq assembly networks
- DOI:
- Published: 2018
- Journal:
- Impact factor: 0
- Authors: Nazarie W.F.
- Corresponding author: Nazarie W.F.
Visualisation and analysis of RNA-Seq assembly graphs
- DOI: 10.1101/409573
- Published: 2018
- Journal:
- Impact factor: 0
- Authors: Nazarie F
- Corresponding author: Nazarie F
Modelling the Structure and Dynamics of Biological Pathways.
- DOI: 10.1371/journal.pbio.1002530
- Published: 2016-08
- Journal:
- Impact factor: 9.8
- Authors: O'Hara L; Livigni A; Theo T; Boyer B; Angus T; Wright D; Chen SH; Raza S; Barnett MW; Digard P; Smith LB; Freeman TC
- Corresponding author: Freeman TC
Visualisation of BioPAX Networks using BioLayout Express (3D).
- DOI: 10.12688/f1000research.5499.1
- Published: 2014
- Journal:
- Impact factor: 0
- Authors: Wright DW; Angus T; Enright AJ; Freeman TC
- Corresponding author: Freeman TC
Other publications by Tom Freeman
Task-independent Acute Effects of delta-9-tetrahydrocannabinol on Human Brain Function and Its Relationship With Cannabinoid Receptor Gene Expression: A Neuroimaging Meta-regression Analysis (Short title: Acute effects of THC on the human brain)
- DOI:
- Published: 2022
- Journal:
- Impact factor: 0
- Authors: B. Gunasekera; Cathy Davies; G. Blest; M. Veronese; F. Nick Ramsey; M. Bossong; Joaquim Radua; S. Bhattacharyya; Gráinne McAlonan; Carmen Walter; Jörn Lötsch; Tom Freeman; Valerie Curran; Giovanni Battistella; Eleonora Fornari; Geraldo Busatto Filho; José Alexandre Crippa; Fabio Duran; A. Zuardi
- Corresponding author: A. Zuardi
Family Medicine’s academic contributions. Family Medicine Research Days, İzmir, Turkey
- DOI: 10.2399/tahd.12.181
- Published: 2012
- Journal:
- Impact factor: 0
- Authors: Tom Freeman
- Corresponding author: Tom Freeman
Teenagers, Compared to Adults, are More Vulnerable to the Psychotic-Like and Addiction-Forming Risks Associated With Chronic Cannabis Use
- DOI: 10.1016/j.biopsych.2020.02.589
- Published: 2020-05-01
- Journal:
- Impact factor:
- Authors: Will Lawn; Claire Mokrysz; Katherine Petrilli; Rachel Lees; Anya Borissova; Michael Bloomfield; Tom Freeman; Val Curran
- Corresponding author: Val Curran
Wait Times for Women With Abnormal Uterine Bleeding in South-Western Ontario
- DOI: 10.1016/s1701-2163(16)34577-7
- Published: 2010-07-01
- Journal:
- Impact factor:
- Authors: Jennifer N. Bondy; Amardeep Thind; Moira Stewart; Doug Manuel; Tom Freeman
- Corresponding author: Tom Freeman
The Association Between Polygenic Risk for Schizophrenia and Brain Age in a Population-Based Sample of Young Adults: A Recall-by-Genotype-Based Approach
- DOI: 10.1016/j.euroneuro.2022.07.534
- Published: 2022-10-01
- Journal:
- Impact factor:
- Authors: Constantinos Constantinides; Doretta Caramaschi; Tom Freeman; Thomas Lancaster; Stanley Zammit; Esther Walton
- Corresponding author: Esther Walton
Other grants held by Tom Freeman
BioLayout Express3D: A Community Resource for the Network Visualisation and Analysis of Biological Data and Pathways
- Grant number: BB/I001107/1
- Fiscal year: 2010
- Funding amount: $246,600
- Project type: Research Grant
Development of network analysis tool BioLayout Express3D
- Grant number: BB/F003722/1
- Fiscal year: 2008
- Funding amount: $246,600
- Project type: Research Grant
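Similar grants from the National Natural Science Foundation of China
Research on the Rapid Growth Mechanism of KDP Crystal
- Grant number: 10774081
- Year approved: 2007
- Funding amount: CNY 450,000
- Project type: General Program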
Similar overseas grants
Self-supervised feature learning for rapid processing of marine imagery
- Grant number: LP220200949
- Fiscal year: 2023
- Funding amount: $246,600
- Project type: Linkage Projects
RAPID: Learning from the Maui community to understand layers of trauma and trauma-informed STEM education as a tool to support processing, recovery, and healing
- Grant number: 2345383
- Fiscal year: 2023
- Funding amount: $246,600
- Project type: Standard Grant
NMR-Based Rapid Fluid Assessment: Device Design and Signal Processing
- Grant number: 10441674
- Fiscal year: 2022
- Funding amount: $246,600
- Project type:
STTR Phase I: Feasibility of multi-layer microplate test for rapid detection and enumeration of Salmonella spp. in raw poultry and processing environment samples
- Grant number: 2135699
- Fiscal year: 2022
- Funding amount: $246,600
- Project type: Standard Grant
NMR-Based Rapid Fluid Assessment: Device Design and Signal Processing
- Grant number: 10617808
- Fiscal year: 2022
- Funding amount: $246,600
- Project type:
An ultra-rapid 3D imaging method for crop root systems in soil using image processing
- Grant number: 22K14871
- Fiscal year: 2022
- Funding amount: $246,600
- Project type: Grant-in-Aid for Early-Career Scientists
Automated processing of sexual assault samples to enable rapid DNA analysis
- Grant number: 562031-2021
- Fiscal year: 2022
- Funding amount: $246,600
- Project type: Alliance Grants
Targeted Biomarker Panels and Pre-processing Device for the Rapid Assessment of Radiation Injury in Easily Accessible Biofluids
- Grant number: 10459218
- Fiscal year: 2021
- Funding amount: $246,600
- Project type:
SBIR Phase I: Handheld graphene-based sensors for rapid detection of Salmonella species in food processing facilities
- Grant number: 2111881
- Fiscal year: 2021
- Funding amount: $246,600
- Project type: Standard Grant
SBIR Phase II: An automated digital pathology lab for rapid on-site processing and imaging of tissue biopsies
- Grant number: 2039417
- Fiscal year: 2021
- Funding amount: $246,600
- Project type: Cooperative Agreement