Improving the rat reference genome annotation and building community engagement
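Basic information
- Grant number: BB/K009524/1
- Principal investigator:
- Amount: $755,200
- Host institution:
- Host institution country: United Kingdom
- Project type: Research Grant
- Fiscal year: 2013
- Funding country: United Kingdom
- Duration: 2013 to (no data)
- Project status: Completed
- Source:
- Keywords:
Project abstract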
Rats have been used in research for over 100 years as a model for examining physiology and behaviour, providing insight into human disease. Owing to its well-characterised physiology, the rat is also the favoured rodent model in the pharmaceutical industry for assessing drug efficacy and toxicity. In 2004 the first reference rat genome sequence was made public, and this changed the direction of research using the rat as a model organism by enabling the identification of rat genes associated with specific diseases.
The first release of the rat genome sequence was not of high quality and contained many gaps and missing genes. It was updated in 2012 by the Baylor College of Medicine Human sequencing group, which integrated sequence generated with new sequencing technologies and so increased the amount of the genome covered. Recently, new experimental techniques have enabled scientists to knock out genes in the rat genome, making it possible to observe what happens to the rat when a gene is deleted. As a result, it is essential that the genes targeted in this type of genetic experiment are correctly identified, i.e. "annotated", on the rat genome. The main aim of this project is to correctly identify all the rat genes on the new release of the reference rat genome. This will be achieved by combining two strategies. Initially the genes will be identified using state-of-the-art bioinformatics programs and pipelines developed by the Ensembl gene build team. Genes are identified by matching known rat proteins, other transcribed data such as mRNAs and ESTs, or conserved proteins from other species to the genome. As this is an automatic pipeline, there may be complex gene families that cannot be correctly identified and require manual inspection. The HAVANA team have been involved in manual annotation of the human, mouse and zebrafish reference genomes and have developed specialist in-house tools to help identify genes accurately in different genomes.
Since manual inspection is expensive and time-consuming, the manual effort will be targeted at complex gene families and genes of specific interest to the rat research community. Engaging with the community will be essential, both to receive feedback on where to target annotation and to generate community participation in the manual inspection of genes of interest. Over 22,000 protein-coding genes are predicted on the original rat assembly, so community input could improve and refine these gene models. Automatic annotation identifies around 70% of genes correctly, so the aim is to use bioinformatics analysis and feedback from researchers to target the roughly 30% of genes that are incorrectly annotated and improve them.
The HAVANA team have previously worked with pig researchers on a community annotation project to identify immune-response genes on the pig genome. Approximately 8% of protein-coding genes were annotated by these researchers, who used the HAVANA annotation tools remotely on their own laptops in their labs after attending a workshop on how to use the in-house tools. Regular contact with the professional annotators ensured that the resulting models were consistent across all researchers and adhered to the guidelines produced by the HAVANA group. This model of community annotation will be presented to the rat community as an opportunity to improve the annotation of rat genes.
The reference rat genes can be viewed online using the Ensembl genome browser. This reference gene set will be updated approximately every three months, and updates from the manual annotation effort will be merged into the automatic gene set by the Ensembl gene builders. In addition, any new rat-specific data that helps identify new genes, such as transcriptome data from new sequencing technologies, can be integrated into this complex gene-building pipeline.
Project outputs
Journal articles (8)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
The Vertebrate Genome Annotation browser 10 years on.
- DOI: 10.1093/nar/gkt1241
- Published: 2014-01
- Journal:
- Impact factor: 14.9
- Authors: Harrow JL;Steward CA;Frankish A;Gilbert JG;Gonzalez JM;Loveland JE;Mudge J;Sheppard D;Thomas M;Trevanion S;Wilming LG
- Corresponding author: Wilming LG
Ensembl 2014.
- DOI: 10.1093/nar/gkt1196
- Published: 2014-01
- Journal:
- Impact factor: 14.9
- Authors: Flicek P;Amode MR;Barrell D;Beal K;Billis K;Brent S;Carvalho-Silva D;Clapham P;Coates G;Fitzgerald S;Gil L;Girón CG;Gordon L;Hourlier T;Hunt S;Johnson N;Juettemann T;Kähäri AK;Keenan S;Kulesha E;Martin FJ;Maurel T;McLaren WM;Murphy DN;Nag R;Overduin B;Pignatelli M;Pritchard B;Pritchard E;Riat HS;Ruffier M;Sheppard D;Taylor K;Thormann A;Trevanion SJ;Vullo A;Wilder SP;Wilson M;Zadissa A;Aken BL;Birney E;Cunningham F;Harrow J;Herrero J;Hubbard TJ;Kinsella R;Muffato M;Parker A;Spudich G;Yates A;Zerbino DR;Searle SM
- Corresponding author: Searle SM
Ensembl 2015.
- DOI: 10.1093/nar/gku1010
- Published: 2015-01
- Journal:
- Impact factor: 14.9
- Authors: Cunningham F;Amode MR;Barrell D;Beal K;Billis K;Brent S;Carvalho-Silva D;Clapham P;Coates G;Fitzgerald S;Gil L;Girón CG;Gordon L;Hourlier T;Hunt SE;Janacek SH;Johnson N;Juettemann T;Kähäri AK;Keenan S;Martin FJ;Maurel T;McLaren W;Murphy DN;Nag R;Overduin B;Parker A;Patricio M;Perry E;Pignatelli M;Riat HS;Sheppard D;Taylor K;Thormann A;Vullo A;Wilder SP;Zadissa A;Aken BL;Birney E;Harrow J;Kinsella R;Muffato M;Ruffier M;Searle SM;Spudich G;Trevanion SJ;Yates A;Zerbino DR;Flicek P
- Corresponding author: Flicek P
Rat Genome Community Annotation project: A call to arms to manually improve the rat gene set
- DOI:
- Published:
- Journal:
- Impact factor: 0
- Authors: Gaurab Mukherjee
- Corresponding author: Gaurab Mukherjee
Ensembl 2016.
- DOI: 10.1093/nar/gkv1157
- Published: 2016-01-04
- Journal:
- Impact factor: 14.9
- Authors: Yates A;Akanni W;Amode MR;Barrell D;Billis K;Carvalho-Silva D;Cummins C;Clapham P;Fitzgerald S;Gil L;Girón CG;Gordon L;Hourlier T;Hunt SE;Janacek SH;Johnson N;Juettemann T;Keenan S;Lavidas I;Martin FJ;Maurel T;McLaren W;Murphy DN;Nag R;Nuhn M;Parker A;Patricio M;Pignatelli M;Rahtz M;Riat HS;Sheppard D;Taylor K;Thormann A;Vullo A;Wilder SP;Zadissa A;Birney E;Harrow J;Muffato M;Perry E;Ruffier M;Spudich G;Trevanion SJ;Cunningham F;Aken BL;Zerbino DR;Flicek P
- Corresponding author: Flicek P
Other publications by Tim Hubbard
SCOP: A structural classification of proteins database for the investigation of sequences and structures
- DOI: 10.1016/s0022-2836(05)80134-2
- Published: 1995-04-07
- Journal:
- Impact factor:
- Authors: Alexey G. Murzin;Steven E. Brenner;Tim Hubbard;Cyrus Chothia
- Corresponding author: Cyrus Chothia
ITFoM - The IT Future of Medicine
- DOI:
- Published: 2011
- Journal:
- Impact factor: 0
- Authors: Hans Lehrach;Ralf Sudbrak;Peter Boyle;Markus Pasterk;Kurt Zatloukal;Heimo Müller;Tim Hubbard;Angela Brand;M. Girolami;Daniel Jameson;F. Bruggeman;Hans V. Westerhoff
- Corresponding author: Hans V. Westerhoff
Protein structure prediction: playing the fold.
- DOI: 10.1016/s0968-0004(96)20018-0
- Published: 1996
- Journal:
- Impact factor: 13.8
- Authors: Tim Hubbard;Jong Park;Armin Lahm;Raphael Leplae;Anna Tramontano
- Corresponding author: Anna Tramontano
Update on protein structure prediction: results of the 1995 IRBM workshop.
- DOI: 10.1016/s1359-0278(96)00028-4
- Published: 1996
- Journal:
- Impact factor: 0
- Authors: Tim Hubbard;Anna Tramontano
- Corresponding author: Anna Tramontano
New horizons in sequence analysis.
- DOI: 10.1016/s0959-440x(97)80024-3
- Published: 1997
- Journal:
- Impact factor: 6.8
- Authors: Tim Hubbard
- Corresponding author: Tim Hubbard
Other grants by Tim Hubbard
Ensembl and enabling genetics and genomics research in farmed animal species
- Grant number: BB/I025360/2
- Fiscal year: 2014
- Funding amount: $755,200
- Project type: Research Grant
A lightweight genome browser for data integration, exploration and interactive figures
- Grant number: BB/K015427/1
- Fiscal year: 2013
- Funding amount: $755,200
- Project type: Research Grant
Ensembl and enabling genetics and genomics research in farmed animal species
- Grant number: BB/I025360/1
- Fiscal year: 2012
- Funding amount: $755,200
- Project type: Research Grant
Pig genome annotation and analysis
- Grant number: BB/E011640/1
- Fiscal year: 2007
- Funding amount: $755,200
- Project type: Research Grant
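Similar National Natural Science Foundation of China (NSFC) grants
Effects and mechanisms of the Dai medicine Ya Ming Ji and its monomer isorhamnetin in treating skeletal muscle atrophy by regulating protein synthesis and catabolism
- Grant number: 82305431
- Year approved: 2023
- Funding amount: CNY 300,000
- Project type: Young Scientists Fund
Effects and mechanisms of the interaction between coprophagy and gut microbiota on urea nitrogen utilization in plateau pikas
- Grant number: 32301301
- Year approved: 2023
- Funding amount: CNY 300,000
- Project type: Young Scientists Fund
Mechanisms by which polyphenolic compounds interfere with the quorum sensing system of Salmonella Typhimurium
- Grant number: 32300022
- Year approved: 2023
- Funding amount: CNY 300,000
- Project type: Young Scientists Fund
Phenotypic characteristics and molecular mechanisms underlying adaptation to high-frequency hearing in the Chinese pygmy dormouse (Typhlomys)
- Grant number: 32300357
- Year approved: 2023
- Funding amount: CNY 300,000
- Project type: Young Scientists Fund
Mechanisms of plant-soil feedback in community assembly during secondary succession on plateau zokor mounds
- Grant number: 32360345
- Year approved: 2023
- Funding amount: CNY 330,000
- Project type: Regional Science Fund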
Similar overseas grants
Pangenomics of nicotine abuse in the hybrid rat diversity panel
- Grant number: 10582448
- Fiscal year: 2023
- Funding amount: $755,200
- Project type:
Cardiovascular Risk, Vascular and Kidney Damage in COVID-19 Survivors
- Grant number: 10364096
- Fiscal year: 2022
- Funding amount: $755,200
- Project type:
Cardiovascular Risk, Vascular and Kidney Damage in COVID-19 Survivors
- Grant number: 10553207
- Fiscal year: 2022
- Funding amount: $755,200
- Project type:
Multi-scale MRI Assessment of Bone Quality and Function in a Chronic Rat Spinal Cord Injury Model
- Grant number: 10579470
- Fiscal year: 2022
- Funding amount: $755,200
- Project type:
Long-read assembly and annotation of rat genomes that are important models of complex genetic disease
- Grant number: 10449388
- Fiscal year: 2021
- Funding amount: $755,200
- Project type: