Understanding recombination through tractable statistical analysis of whole genome sequences
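Basic information
- Grant number: BB/N00874X/1
- Principal investigator:
- Amount: $329,800
- Host institution:
- Host institution country: United Kingdom
- Project category: Research Grant
- Fiscal year: 2016
- Funding country: United Kingdom
- Start and end dates: 2016 to no data
- Project status: Completed
- Source:
- Keywords:
Project abstract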
This project concerns the analysis of whole genome sequence data: that is, the complete DNA sequence, the genetic code, of an organism. The technology for acquiring such a sequence is relatively new. It was used to sequence a single human genome in the "Human Genome Project", which finished in 2003. That project took 13 years and cost $2.7 billion. However, technological advances in the past 10 years have caused the cost of sequencing genomes to drop dramatically. It now costs only $1,000 to sequence a human genome, and this price continues to fall.
Why should one wish to sequence a genome? The human genome can be thought of as the blueprint for building a human. Each person has a slightly different genome: the parts that are common to everyone are what make us human; the parts that differ are responsible for the (genetic) differences between us. Understanding our genomes, through studying both the common parts of the genome and the differences, promises scientific breakthroughs in many areas, particularly medicine. For example, Genomics England are currently planning to sequence 100,000 genomes for the purposes of improving clinical practice in dealing with rare disease, cancer and infectious disease.
The study of DNA sequences is not restricted to humans. It is also useful to obtain whole genome sequences from many other living things. This project is focussed on analysing genetic information obtained from bacteria. There are many reasons for studying bacteria, one of the most obvious being that some bacteria cause disease in plants and livestock (affecting our food supply) and in humans (affecting our health). Obtaining a better understanding of the genetic code of bacteria offers the promise of both tracking the spread of infections and reducing the occurrence of disease.
However, although sequence data is relatively easy to obtain, extracting useful information from it can be very difficult. Genome sequences can be stored on computers in text files as a long sequence of letters. A single gene might consist of a few hundred or a few thousand letters. A whole bacterial genome, containing thousands of genes, might be 3 million letters long. Most scientific projects involve studying a population (tens, hundreds or thousands) of these genomes, so it is not unusual for a dataset to consist of over a billion letters! To make sense of such a large, complicated data set, mathematical methods, implemented as computer programs, are required.
This project is concerned with the development of such mathematical methods, and their implementation. The aim of the project is to learn about the evolution of bacteria by studying their genome sequences. As a rule, bacteria reproduce clonally: each individual has only a single parent. However, in some cases they can also exchange DNA, in a manner related to sexual reproduction in humans. It is of great scientific interest to understand such exchanges for several reasons, including that this is one of the main ways in which bacteria can become resistant to antibiotics. MRSA is an example of an antibiotic-resistant bacterial strain that has been of significant concern to the NHS. Understanding how antibiotic resistance is acquired is one way in which scientists can help to tackle such problems. Analysing whole genome sequences using mathematical methods, as is done in this project, is fundamental to these investigations.
Currently there are several computer programs that can be used to investigate the exchange of DNA, and important discoveries have been made through using them. However, when they are used on whole genome sequences, some of the programs run too slowly to be useful in many cases (sometimes they take months), and others cannot detect genetic exchange events, or provide only an incomplete picture of them. This project is developing new programs that are both accurate and fast enough to be useful.
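The dataset-size figures quoted in the abstract can be checked with a short back-of-envelope calculation. The sketch below is illustrative only: the function names are hypothetical, and it assumes every genome has the same round length of 3 million letters mentioned in the abstract.

```python
# Back-of-envelope dataset sizes, using the round figures quoted in the abstract.
# Assumption: every genome in the population has the same length.
GENOME_LENGTH = 3_000_000  # letters in one bacterial genome (figure from the abstract)


def dataset_letters(n_genomes: int, genome_length: int = GENOME_LENGTH) -> int:
    """Total number of letters across n_genomes sequences of equal length."""
    return n_genomes * genome_length


def genomes_needed(target_letters: int, genome_length: int = GENOME_LENGTH) -> int:
    """Smallest number of genomes whose combined length reaches target_letters."""
    return -(-target_letters // genome_length)  # ceiling division on integers


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(f"{n:>5} genomes -> {dataset_letters(n):,} letters")
    print("genomes needed to reach a billion letters:", genomes_needed(1_000_000_000))
```

Under these assumptions, a population in the low hundreds of genomes is already enough to pass a billion letters, which matches the abstract's remark that such datasets are not unusual.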
Project outputs
Journal articles (10)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Speeding up Inference of Homologous Recombination in Bacteria
- DOI: 10.1101/2020.05.10.087007
- Publication date: 2020
- Journal:
- Impact factor: 0
- Authors: Medina-Aguayo F
- Corresponding author: Medina-Aguayo F
Sequential Monte Carlo with transformations
- DOI: 10.48550/arxiv.1612.06468
- Publication date: 2016
- Journal:
- Impact factor: 0
- Authors: Everitt Richard G
- Corresponding author: Everitt Richard G
Perturbation bounds for Monte Carlo within Metropolis via restricted approximations
- DOI: 10.1016/j.spa.2019.06.015
- Publication date: 2018-09
- Journal:
- Impact factor: 1.4
- Authors: F. Medina-Aguayo; Daniel Rudolf; Nikolaus Schweizer
- Corresponding authors: F. Medina-Aguayo; Daniel Rudolf; Nikolaus Schweizer
Delayed acceptance ABC-SMC
- DOI: 10.48550/arxiv.1708.02230
- Publication date: 2017
- Journal:
- Impact factor: 0
- Authors: Everitt Richard G.
- Corresponding author: Everitt Richard G.
Revisiting the balance heuristic for estimating normalising constants
- DOI: 10.48550/arxiv.1908.06514
- Publication date: 2019
- Journal:
- Impact factor: 0
- Authors: Medina-Aguayo Felipe J
- Corresponding author: Medina-Aguayo Felipe J
Other publications by Richard Everitt
Subsurface fracture distribution and its correlation with the shape and thickness of the Lac du Bonnet batholith
- DOI: 10.1007/s10064-018-1385-4
- Publication date: 2018-10-22
- Journal:
- Impact factor: 4.200
- Authors: Richard Everitt
- Corresponding author: Richard Everitt
Other grants held by Richard Everitt
Real-time phylogenetics using sequential Monte Carlo with tree sequences
- Grant number: EP/W006790/1
- Fiscal year: 2022
- Funding amount: $329,800
- Project category: Research Grant
Statistical inference and uncertainty quantification for complex process-based models using multiple data sets
- Grant number: NE/T00973X/1
- Fiscal year: 2020
- Funding amount: $329,800
- Project category: Research Grant
Tractable inference for statistical network models with local dependence
- Grant number: EP/N023927/1
- Fiscal year: 2016
- Funding amount: $329,800
- Project category: Research Grant
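Similar grants from the National Natural Science Foundation of China
Mechanism by which AMPKα regulates DNA double-strand break repair and its function in ovarian cancer
- Grant number: 31900511
- Year approved: 2019
- Funding amount: CNY 260,000
- Project category: Young Scientists Fund
Function and mechanism of the long non-coding RNA BGL3 in the DNA damage repair response
- Grant number: 31871364
- Year approved: 2018
- Funding amount: CNY 600,000
- Project category: General Program
Role and mechanism of structural maintenance of chromosomes protein 1 in the repair of telomeric DNA double-strand breaks
- Grant number: 31801145
- Year approved: 2018
- Funding amount: CNY 250,000
- Project category: Young Scientists Fund
Molecular mechanisms of SOSS and RPA involvement in homologous recombination repair
- Grant number: 31701181
- Year approved: 2017
- Funding amount: CNY 250,000
- Project category: Young Scientists Fund
Role of Wnt-β-catenin signalling in articular chondrocytes and its relationship with osteoarthritis
- Grant number: 30973040
- Year approved: 2009
- Funding amount: CNY 340,000
- Project category: General Program
Surface plasmon resonance enhanced silicon-based light emission
- Grant number: 60606001
- Year approved: 2006
- Funding amount: CNY 280,000
- Project category: Young Scientists Fund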
Similar overseas grants
Targeting ER+ Breast Cancer Through Induced Viral Mimicry
- Grant number: 10416945
- Fiscal year: 2022
- Funding amount: $329,800
- Project category:
PG545 synergizes with PARP inhibitors in ovarian cancer to disrupt DNA repair through modulation of DEK-RAD51 axis
- Grant number: 10553686
- Fiscal year: 2022
- Funding amount: $329,800
- Project category:
Understanding the Co-Evolution of Phylogenomic Signal, Gene Linkage, and Recombination Rate Through Comparative Genomics
- Grant number: 2150664
- Fiscal year: 2022
- Funding amount: $329,800
- Project category: Standard Grant
Establishment of the meiotic cell cycle program through post-transcriptional regulation by MEIOC and YTHDC2
- Grant number: 10553538
- Fiscal year: 2022
- Funding amount: $329,800
- Project category:
Reprogramming cell-fate decisions through predictive modeling and synthetic biology
- Grant number: 10344041
- Fiscal year: 2022
- Funding amount: $329,800
- Project category:
PG545 synergizes with PARP inhibitors in ovarian cancer to disrupt DNA repair through modulation of DEK-RAD51 axis
- Grant number: 10426460
- Fiscal year: 2022
- Funding amount: $329,800
- Project category:
The Role of ZEB1 in promoting therapeutic resistance through its interaction with 53BP1
- Grant number: 10551845
- Fiscal year: 2022
- Funding amount: $329,800
- Project category:
Reprogramming cell-fate decisions through predictive modeling and synthetic biology
- Grant number: 10706965
- Fiscal year: 2022
- Funding amount: $329,800
- Project category:
Targeting ER+ Breast Cancer Through Induced Viral Mimicry
- Grant number: 10578719
- Fiscal year: 2022
- Funding amount: $329,800
- Project category:
The Role of ZEB1 in promoting therapeutic resistance through its interaction with 53BP1
- Grant number: 10445498
- Fiscal year: 2022
- Funding amount: $329,800
- Project category: