COMPUTATIONAL METHODS FOR MICROBIAL NEXT GENERATION RE-SEQUENCING DATA
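Basic information
- Grant number: BB/M001121/1
- Principal investigator:
- Amount: $349,300
- Host institution:
- Host institution country: United Kingdom
- Project type: Research Grant
- Fiscal year: 2014
- Funding country: United Kingdom
- Period: 2014 to (no data)
- Project status: Completed
- Source:
- Keywords:
Project abstract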
The overwhelming majority of life that exists or has ever existed is invisible to the naked eye. These organisms, collectively termed microbes or microorganisms and including the viruses, form large and complex communities. Characterising the species present, genome composition and genetic variation in these communities has been a major focus of 'metagenomics', the genomic study of mixed samples from the environment, or from animals or humans, for example from an animal's gut or a soil microbial ecosystem. Contemporary sequencing technologies (next generation sequencing, NGS) have massively parallelised the determination of nucleotide order within genetic material, giving us the ability to sequence different microbes rapidly. This introduces the potential to explore microbial communities and genetic diversity on an unprecedented scale. Computational methods play a central role in the analysis, alignment and assembly of NGS data. However, data are being generated faster than they can be analysed routinely, let alone subjected to appropriate comparative analysis. This lack of software arises because most research effort is directed at assembling single complete genomes from NGS data. With microbes, however, many interesting questions concern the diversity of sequences present in a community and population variation, as revealed by 'ultra-deep' sequencing. Emerging approaches aim to build a de novo assembly of the NGS reads (each read is an individual sequence fragment corresponding to a region of a genome), much as a jigsaw puzzle picture is constructed by joining matching pieces together. In de novo assembly the genome sequence is constructed by joining matching short reads together. The majority of existing de novo assembly approaches for NGS data make extensive use of the de Bruijn graph method.
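As a rough illustration of the de Bruijn graph idea mentioned in the abstract, the sketch below splits reads into k-mers, links each k-mer's (k-1)-base prefix to its (k-1)-base suffix, and then follows an unbranched path to recover a contig. The function names, the toy reads and the choice of k are assumptions made for this example; real assemblers also handle sequencing errors, coverage and branching, which this sketch does not.

```python
from collections import defaultdict


def de_bruijn_graph(reads, k):
    """Return a dict mapping each (k-1)-mer to the set of (k-1)-mers that follow it."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            # Each k-mer is an edge from its prefix node to its suffix node.
            graph[kmer[:-1]].add(kmer[1:])
    return dict(graph)


def walk_unbranched_path(graph):
    """Follow single-successor edges from a node with no incoming edge (toy contig builder)."""
    targets = {t for successors in graph.values() for t in successors}
    starts = sorted(node for node in graph if node not in targets)
    if not starts:
        return ""
    node = starts[0]
    contig = node
    visited = {node}
    # Stop at a dead end, at a branch, or when a cycle would be re-entered.
    while len(graph.get(node, ())) == 1:
        (nxt,) = graph[node]
        if nxt in visited:
            break
        contig += nxt[-1]
        visited.add(nxt)
        node = nxt
    return contig


# Hypothetical overlapping reads drawn from the toy sequence "ACGTCAG".
toy_reads = ["ACGTC", "CGTCA", "GTCAG"]
toy_graph = de_bruijn_graph(toy_reads, k=3)
toy_contig = walk_unbranched_path(toy_graph)
```

With these toy reads the graph is a single unbranched chain, so the walk reconstructs the full toy sequence. The memory needed to hold every distinct k-mer is what makes this structure expensive for very large data sets.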
However, building de Bruijn graphs for very large NGS data sets is demanding because it requires substantial computational resources. In this project we propose to develop novel computational methods based on compressing the individual NGS reads by recasting them as numerical sequences, and working with the transformed, compressed data directly. These methods will be generically useful for all types of microbial data sets. To do this we will explore novel methods for representing short-read sequence data graphically and apply established mathematical approaches for efficient data mining. The particular problem we will address is the assembly of NGS data sets where the variation in the sample needs to be considered in the analysis. In metagenomic data, variation between reads corresponds both to distinct microbial species and to variation within individual species or viral populations. A particularly important focus is the ability to assemble a genome without a reference sequence for comparison (de novo assembly), as an appropriate reference genome is frequently not available for many microbes and, even when a reference is available, genome architecture can vary within a species.
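The abstract does not say which numerical representation or compression scheme the project uses, so the following is only a minimal sketch of the general idea under stated assumptions. It maps each base to a hypothetical step value, accumulates the steps into a numeric walk, and shortens the walk with piecewise aggregate approximation (segment means), a standard time-series reduction chosen here purely for illustration. The mapping `STEP` and all function names are invented for this example.

```python
# Hypothetical base-to-step mapping; the source does not specify one.
STEP = {"A": 2, "G": 1, "C": -1, "T": -2}


def to_numeric(read):
    """Recast a read as a cumulative numeric walk, one value per base."""
    total = 0
    walk = []
    for base in read:
        total += STEP[base]
        walk.append(total)
    return walk


def paa(series, segments):
    """Compress a numeric series to `segments` values by averaging equal-width chunks."""
    n = len(series)
    if not 0 < segments <= n:
        raise ValueError("segments must be between 1 and len(series)")
    compressed = []
    for s in range(segments):
        start = s * n // segments
        end = (s + 1) * n // segments
        chunk = series[start:end]
        compressed.append(sum(chunk) / len(chunk))
    return compressed


def euclidean(a, b):
    """Euclidean distance between two equal-length compressed reads."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


walk = to_numeric("ACGTACGT")   # eight values, one per base
short = paa(walk, 4)            # four values after compression
same = euclidean(short, paa(to_numeric("ACGTACGT"), 4))
```

Comparing reads through their shortened numeric forms means each comparison touches fewer values than the original read length, which is the kind of saving the abstract is aiming for when it proposes working on the compressed data directly.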
Project outputs
Journal articles (8)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Marginalised stack denoising autoencoders for metagenomic data binning
- DOI: 10.1109/cibcb.2017.8058552
- Published: 2017
- Journal:
- Impact factor: 0
- Authors: Kouchaki S
- Corresponding author: Kouchaki S
Local binary patterns as a feature descriptor in alignment-free visualisation of metagenomic data
- DOI: 10.1109/ssci.2016.7849955
- Published: 2016-12
- Journal:
- Impact factor: 0
- Authors: S. Kouchaki; Santosh Tirunagari; Avraam Tapinos; D. Robertson
- Corresponding author: S. Kouchaki; Santosh Tirunagari; Avraam Tapinos; D. Robertson
Alignment by numbers: sequence assembly using compressed numerical representations
- DOI: 10.1101/011940
- Published: 2014
- Journal:
- Impact factor: 0
- Authors: Tapinos A
- Corresponding author: Tapinos A
Challenges in the analysis of viral metagenomes.
- DOI: 10.1093/ve/vew022
- Published: 2016-07
- Journal:
- Impact factor: 5.3
- Authors: Rose R; Constantinides B; Tapinos A; Robertson DL; Prosperi M
- Corresponding author: Prosperi M
The Utility of Data Transformation for Alignment, de novo Assembly and Classification of Short Read Virus Sequences
- DOI: 10.20944/preprints201904.0014.v1
- Published: 2019
- Journal:
- Impact factor: 0
- Authors: Tapinos A
- Corresponding author: Tapinos A
Other publications by David Robertson
Airway response to sublingual nitroglycerin in acute asthma.
- DOI: 10.1001/jama.1981.03320020037020
- Published: 1981
- Journal:
- Impact factor: 0
- Authors: Thomas P. Kennedy; Warren R. Summer; Jimmie Sylvester; David Robertson
- Corresponding author: David Robertson
Ideology, Strategy and Party Change: Spatial Analyses of Post-War Election Programmes in 19 Democracies: Do parties differ, and how? Comparative discriminant and factor analyses.
- DOI: 10.1017/cbo9780511558771.019
- Published: 1987
- Journal:
- Impact factor: 0
- Authors: I. Budge; David Robertson
- Corresponding author: David Robertson
Serum immunoactive inhibin levels in early pregnancy after in vitro fertilization and embryo transfer.
- DOI: 10.1016/s0015-0282(16)55932-9
- Published: 1993
- Journal:
- Impact factor: 6.7
- Authors: Takashi Yohkaichiya; David W. Polson; Edward G. Hughes; V. Maclachlan; David Robertson; David L. Healy; David M. de Kretser
- Corresponding author: David M. de Kretser
Elevation of follicular phase inhibin and luteinizing hormone levels in mothers of dizygotic twins suggests nonovarian control of human multiple ovulation.
- DOI:
- Published: 1991
- Journal:
- Impact factor: 6.7
- Authors: Nicholas G. Martin; David Robertson; Georgia Chenevix; D. M. D. Kretser; John Osborne; Henry G. Burger
- Corresponding author: Henry G. Burger
Bioinformatics experimentation in the OpenKnowledge peer to peer infrastructure
- DOI:
- Published: 2008
- Journal:
- Impact factor: 0
- Authors: Xueping Quan; Paolo Besana; Siu; David Robertson; D. Gerloff
- Corresponding author: D. Gerloff
Other grants by David Robertson
Integrative viral genomics and bioinformatics platform
- Grant number: MC_UU_00034/5
- Fiscal year: 2023
- Funding amount: $349,300
- Project type: Intramural
ISCF HDRUK DIH Sprint Exemplar: Graph-Based Data Federation for Healthcare Data Science
- Grant number: MC_PC_18029
- Fiscal year: 2019
- Funding amount: $349,300
- Project type: Intramural
Capital Award in Support of Early Career Researchers: "Edinburgh Vishub"
- Grant number: EP/S018042/1
- Fiscal year: 2019
- Funding amount: $349,300
- Project type: Research Grant
eBase: Evidence-Base; growing the Big Grant Club
- Grant number: EP/S012087/1
- Fiscal year: 2018
- Funding amount: $349,300
- Project type: Research Grant
RUI: Assessing the Environmental and Human Drivers and Cultural Dimensions of Changes in Oak Forests of the Eastern U.S.
- Grant number: 1660388
- Fiscal year: 2017
- Funding amount: $349,300
- Project type: Continuing Grant
Cryo-FIB-SEM-CT: a 'three-in-one' imaging facility for opaque soft matter
- Grant number: EP/P030564/1
- Fiscal year: 2017
- Funding amount: $349,300
- Project type: Research Grant
Telescope Windows: low-vision scopes to cloaks
- Grant number: EP/M010767/1
- Fiscal year: 2015
- Funding amount: $349,300
- Project type: Research Grant
University of Edinburgh - Equipment Account
- Grant number: EP/M507258/1
- Fiscal year: 2014
- Funding amount: $349,300
- Project type: Research Grant
Analysis of HIV-1 resistance to the CCR5 antagonist maraviroc
- Grant number: G1001806/1
- Fiscal year: 2011
- Funding amount: $349,300
- Project type: Fellowship
Understanding the evolution and diversity of viral pathogens using next generation sequencing technologies
- Grant number: BB/H012419/1
- Fiscal year: 2010
- Funding amount: $349,300
- Project type: Research Grant
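Similar National Natural Science Foundation of China (NSFC) grants
Computational Methods for Analyzing Toponome Data
- Grant number: 60601030
- Approval year: 2006
- Funding amount: 170,000 CNY
- Project type: Young Scientists Fund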
Similar overseas grants
Machine learning-based methods for the analysis of microbial glycomes and proteomes in inflammatory bowel disease.
- Grant number: 10591842
- Fiscal year: 2023
- Funding amount: $349,300
- Project type:
Development of methods to prevent microbial infection in shrimp aquaculture using useful microorganisms
- Grant number: 22H00379
- Fiscal year: 2022
- Funding amount: $349,300
- Project type: Grant-in-Aid for Scientific Research (A)
Microbial conversion of plant resources by three different controlled Koji-mold solid cultivation methods
- Grant number: 22H02242
- Fiscal year: 2022
- Funding amount: $349,300
- Project type: Grant-in-Aid for Scientific Research (B)
Improving the representation of microbial reference genomes using pangenome reference sequence graphs - Methods for efficient construction and representation of pangenome graphs for microbial data
- Grant number: 557997-2021
- Fiscal year: 2022
- Funding amount: $349,300
- Project type: Postdoctoral Fellowships
Elucidation of the inter-microbial interaction and application to effective screening methods
- Grant number: 22K14820
- Fiscal year: 2022
- Funding amount: $349,300
- Project type: Grant-in-Aid for Early-Career Scientists
New Glycosylation Methods for Microbial Glycan Synthesis
- Grant number: 10515058
- Fiscal year: 2022
- Funding amount: $349,300
- Project type:
Improving the representation of microbial reference genomes using pangenome reference sequence graphs - Methods for efficient construction and representation of pangenome graphs for microbial data
- Grant number: 557997-2021
- Fiscal year: 2021
- Funding amount: $349,300
- Project type: Postdoctoral Fellowships
Improving the representation of microbial reference genomes using pangenome reference sequence graphs - Methods for efficient construction and representation of pangenome graphs for microbial data
- Grant number: 557997-2021
- Fiscal year: 2020
- Funding amount: $349,300
- Project type: Postdoctoral Fellowships
Modular Automation and Chemical Methods for S- and O-linked Plant and Microbial Carbohydrate Synthesis
- Grant number: 1955936
- Fiscal year: 2020
- Funding amount: $349,300
- Project type: Standard Grant
Computational Methods for Microbial and Microbiome Sequence Analysis
- Grant number: 10331733
- Fiscal year: 2019
- Funding amount: $349,300
- Project type: