
Tool for finding linked genetic polymorphisms in reference-less complex plant genomes from unassembled next-generation reads.

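Basic information

Grant number:
BB/I023798/1
Principal investigator:
Dan MacLean
Amount:
$143,800
Host institution:
University of East Anglia
Host institution country:
United Kingdom
Project category:
Research Grant
Fiscal year:
2011
Funding country:
United Kingdom
Duration:
2011 to (no data)
Project status:
Completed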

Source:
https://gtr.ukri.org/projects?ref=BB%2FI023798%2F1
Keywords:
Tool finding linked genetic polymorphisms

Project abstract

Differences between the genomes of individuals of the same species, called polymorphisms, are the genetic basis of traits such as resistance or susceptibility to disease. By identifying polymorphisms it is possible to pinpoint the agents of resistance or susceptibility, or at least to locate nearby regions of the genome that can act as positional markers associated with the trait of interest. Many wild populations of plant species that are closely related to domesticated varieties important for food and industry are resistant to common diseases that could devastate important crops across the world. Combating these diseases chemically is both costly and environmentally damaging, so breeding resistant varieties is necessary for food security. Genetic methods for identifying markers are time-consuming and require large amounts of expensive, slow laboratory work. New methods in high-throughput DNA sequencing can comprehensively sample entire genomes at an affordable cost. These next-generation sequencing (NGS) technologies return many millions of small fragments rather than a continuous sequence. The volumes of data generated by NGS instruments have created a need for new methods to assemble the fragments or to align them to an existing, previously assembled reference sequence. Currently, polymorphism identification relies on having some sort of reference to which sequence reads can be aligned. Aligned reads are then examined for consensus differences from the reference, which indicate a genetic difference between the sampled genome and the reference. Naturally, this is only possible where a reference genome exists. Since creating even a rough draft genome sequence can take many months, detecting polymorphisms that specify disease resistance in relatives of agriculturally important organisms that have no such reference becomes a massively time-consuming and difficult task.
Even when a reference sequence is available, identifying polymorphisms among many individuals from a population, for example to associate genotypes with specific phenotypes, requires many cycles of alignment and comparison. Our objective is to develop a tool that takes advantage of recent developments in high-throughput DNA sequencing and new computational methods to identify polymorphisms between multiple sources without comparison with a reference sequence. These methods will allow us to detect genetic variants directly from the raw sequence reads. The time required would be on the order of hours, rather than the months or years needed where assembly is required. The tool will produce short but useful genomic mini-assemblies with embedded polymorphisms that bench scientists can use in downstream experiments. We will be able to rank and classify SNPs (single-nucleotide polymorphisms) based on the provenance of different reads, for example by detecting SNPs common to individuals with a trait. The tool will be an important addition to the repertoire of methods available to bioinformaticians involved in polymorphism detection and could be invaluable to projects without an available reference sequence. It will also prove useful to bioinformaticians who do have a reference sequence: it will remove the need for many sequential alignments to a reference and compress subsequent polymorphism detection into a single step.
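
The idea of comparing read sets directly, without a reference, can be illustrated with a toy sketch. The Python below is not the project's tool and does not walk a de Bruijn graph; it is a much simpler stand-in that labels each k-mer with the samples ("colours") it occurs in, then reports pairs of k-mers that share both flanks, differ only at the middle base, and occur in disjoint sets of samples. The function names, the sample names, and the choice of k = 5 are hypothetical, and the sketch ignores reverse complements, sequencing errors, and coverage thresholds that a real tool would have to handle.

```python
from collections import defaultdict


def kmers(read, k):
    """Yield every k-length substring of a read."""
    for i in range(len(read) - k + 1):
        yield read[i:i + k]


def colour_kmers(samples, k):
    """Map each k-mer to the set of sample names ("colours") it occurs in."""
    colours = defaultdict(set)
    for name, reads in samples.items():
        for read in reads:
            for kmer in kmers(read, k):
                colours[kmer].add(name)
    return colours


def candidate_snps(samples, k=5):
    """Report k-mer pairs that differ only at the middle base and whose
    colour sets are disjoint (toy illustration, hypothetical helper)."""
    if k % 2 == 0:
        raise ValueError("k must be odd so that a middle base exists")
    mid = k // 2
    colours = colour_kmers(samples, k)

    # Group k-mers by their flanking context (everything except the middle base).
    by_context = defaultdict(list)
    for kmer in colours:
        by_context[(kmer[:mid], kmer[mid + 1:])].append(kmer)

    hits = []
    for (left, right), group in by_context.items():
        group.sort()
        for i, a in enumerate(group):
            for b in group[i + 1:]:
                # Keep only pairs never seen together in the same sample.
                if colours[a].isdisjoint(colours[b]):
                    hits.append((left, a[mid], b[mid], right,
                                 sorted(colours[a]), sorted(colours[b])))
    return sorted(hits)


if __name__ == "__main__":
    # Two made-up reads that differ at a single base (C versus T).
    samples = {
        "resistant": ["ACGTACGGTCA"],
        "susceptible": ["ACGTATGGTCA"],
    }
    for hit in candidate_snps(samples, k=5):
        print(hit)
    # prints ('TA', 'C', 'T', 'GG', ['resistant'], ['susceptible'])
```

Grouping by flanking context keeps the comparison to k-mers that could be alleles of the same site, and the colour sets carry the read provenance that would let candidates be ranked by which samples share them.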

Project outputs

Journal articles (4)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Identifying and classifying trait linked polymorphisms in non-reference species by walking coloured de bruijn graphs.

DOI:
10.1371/journal.pone.0060058
Publication date:
2013
Journal:
PloS one
Impact factor:
3.7
Authors:
Leggett RM;Ramirez-Gonzalez RH;Verweij W;Kawashima CG;Iqbal Z;Jones JD;Caccamo M;Maclean D
Corresponding author:
Maclean D

Reference-free SNP detection: dealing with the data deluge.

DOI:
10.1186/1471-2164-15-s4-s10
Publication date:
2014
Journal:
BMC genomics
Impact factor:
4.4
Authors:
Leggett RM;MacLean D
Corresponding author:
MacLean D

Using 2k + 2 bubble searches to find SNPs in k-mer graphs

DOI:
10.1101/004507
Publication date:
2014
Journal:
Impact factor:
0
Authors:
Younsi R
Corresponding author:
Younsi R

Other publications by Dan MacLean

Complexity of the lichen symbiosis revealed by metagenome and transcriptome analysis of Xanthoria parietina

DOI:
10.1016/j.cub.2024.12.041
Publication date:
2025-02-24
Journal:
CURRENT BIOLOGY
Impact factor:
7.500
Authors:
Gulnara Tagirdzhanova;Klara Scharnagl;Neha Sahu;Xia Yan;Angus Bucknell;Adam R. Bentham;Clara Jégousse;Sandra Lorena Ament-Velásquez;Ioana Onuț-Brännström;Hanna Johannesson;Dan MacLean;Nicholas J. Talbot
Corresponding author:
Nicholas J. Talbot

Crowdsourcing genomic analyses of ash and ash dieback – power to the people

DOI:
10.1186/2047-217x-2-2
Publication date:
2013-02-12
Journal:
GigaScience
Impact factor:
3.900
Authors:
Dan MacLean;Kentaro Yoshida;Anne Edwards;Lisa Crossman;Bernardo Clavijo;Matt Clark;David Swarbreck;Matthew Bashton;Patrick Chapman;Mark Gijzen;Mario Caccamo;Allan Downie;Sophien Kamoun;Diane GO Saunders
Corresponding author:
Diane GO Saunders