Dfam: sustainable growth, curation support, and improved quality for mobile element annotation

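Basic information

  • Grant number:
    10407543
  • Principal investigator:
  • Amount:
    $600,700
  • Host institution:
  • Host institution country:
    United States
  • Project category:
  • Fiscal year:
    2018
  • Funding country:
    United States
  • Project period:
    2018-08-15 to 2023-09-14
  • Project status:
    Completed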

Project abstract

Repetitive DNA, especially that due to transposable elements (TEs), makes up a large fraction of many genomes. Thorough and accurate annotation of repetitive content in genomes depends on a comprehensive database of known TEs, along with robust statistical and procedural methods for recognizing decayed instances of elements and disentangling their complex relationships.

Annotation of TE instances is usually performed using our RepeatMasker software, which compares a genome to a database containing representations of known repeat families. These have historically been consensus sequences, which generally approximate the sequences of the original TEs. The largest repository of such consensus sequences is Repbase, whose restrictive license and limited interface for curators have led to a lack of input from third parties and the creation of many unaffiliated, often organism-specific open databases. The parallel existence of these many databases has led to a divergence in nomenclature and repeat definition.

Our Dfam database is an open access collection of repetitive DNA families, in which each family is represented by a multiple sequence alignment and a profile hidden Markov model (HMM). We have demonstrated that profile HMMs support improved annotation sensitivity, and Dfam provides numerous aids to both curators of TE families and those who make use of the resulting annotations. In this proposal, we describe a plan to develop the infrastructure of Dfam to expand to thousands of genomes, and to establish a self-sustaining TE Data Commons dependent on limited centralized curation. We further describe plans to improve the quality of repeat annotation through development of methods for more reliable alignment adjudication, to expand approaches to visualization of this complex data type, and to improve the modeling of TE subfamilies. By further developing this open access database, we will provide a strong disincentive for the proliferation of unaffiliated non-standard repeat datasets and ease the burden of data management for those developing TE libraries.

Project outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other grants by Robert MacDonald Hubley

Development and Maintenance of RepeatMasker and RepeatModeler
  • Grant number:
    10367846
  • Fiscal year:
    2022
  • Funding amount:
    $600,700
  • Project category:
Development and Maintenance of RepeatMasker and RepeatModeler
  • Grant number:
    10563214
  • Fiscal year:
    2022
  • Funding amount:
    $600,700
  • Project category:
Dfam: sustainable growth, curation support, and improved quality for mobile element annotation
  • Grant number:
    10165778
  • Fiscal year:
    2018
  • Funding amount:
    $600,700
  • Project category:
Dfam: sustainable growth, curation support, and improved quality for mobile element annotation
  • Grant number:
    10714226
  • Fiscal year:
    2018
  • Funding amount:
    $600,700
  • Project category:
Dfam: sustainable growth, curation support, and improved quality for mobile element annotation
  • Grant number:
    9764454
  • Fiscal year:
    2018
  • Funding amount:
    $600,700
  • Project category:

Similar overseas grants

CAREER: Efficient Algorithms for Modern Computer Architecture
  • Grant number:
    2339310
  • Fiscal year:
    2024
  • Funding amount:
    $600,700
  • Project category:
    Continuing Grant
CAREER: Creating Tough, Sustainable Materials Using Fracture Size-Effects and Architecture
  • Grant number:
    2339197
  • Fiscal year:
    2024
  • Funding amount:
    $600,700
  • Project category:
    Standard Grant
Travel: Student Travel Support for the 51st International Symposium on Computer Architecture (ISCA)
  • Grant number:
    2409279
  • Fiscal year:
    2024
  • Funding amount:
    $600,700
  • Project category:
    Standard Grant
Understanding Architecture Hierarchy of Polymer Networks to Control Mechanical Responses
  • Grant number:
    2419386
  • Fiscal year:
    2024
  • Funding amount:
    $600,700
  • Project category:
    Standard Grant
I-Corps: Highly Scalable Differential Power Processing Architecture
  • Grant number:
    2348571
  • Fiscal year:
    2024
  • Funding amount:
    $600,700
  • Project category:
    Standard Grant
Collaborative Research: Merging Human Creativity with Computational Intelligence for the Design of Next Generation Responsive Architecture
  • Grant number:
    2329759
  • Fiscal year:
    2024
  • Funding amount:
    $600,700
  • Project category:
    Standard Grant
Hardware-aware Network Architecture Search under ML Training workloads
  • Grant number:
    2904511
  • Fiscal year:
    2024
  • Funding amount:
    $600,700
  • Project category:
    Studentship
The architecture and evolution of host control in a microbial symbiosis
  • Grant number:
    BB/X014657/1
  • Fiscal year:
    2024
  • Funding amount:
    $600,700
  • Project category:
    Research Grant
RACCTURK: Rock-cut Architecture and Christian Communities in Turkey, from Antiquity to 1923
  • Grant number:
    EP/Y028120/1
  • Fiscal year:
    2024
  • Funding amount:
    $600,700
  • Project category:
    Fellowship
NSF Convergence Accelerator Track M: Bio-Inspired Surface Design for High Performance Mechanical Tracking Solar Collection Skins in Architecture
  • Grant number:
    2344424
  • Fiscal year:
    2024
  • Funding amount:
    $600,700
  • Project category:
    Standard Grant