CAREER: Advancing evolutionary genomics and eukaryotic biodiversity research through accurate, scalable, and flexible frameworks for structural genome annotation
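Basic information
- Award number: 1943371
- Principal investigator:
- Amount: $563,400
- Host institution:
- Host institution country: United States
- Award type: Continuing Grant
- Fiscal year: 2020
- Funding country: United States
- Period: 2020-03-15 to 2025-02-28
- Status: Ongoing
- Source:
- Keywords: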
Project abstract
A high-quality annotation and associated genome assembly are necessary to understand how genes in a given organism work. Variation associated with genes, and their structure, provides a framework for examining morphological, physiological, and behavioral traits. In the era of high-throughput sequencing, the size and complexity of the genomes attempted have increased dramatically. Yet over 91% of these genomes contain numerous gene annotation errors. The Earth BioGenome Project intends to sequence 1.5 million eukaryotic genomes in the next ten years. Related projects, such as the Vertebrate Genomes Project, the Global Invertebrate Genome Alliance, and the 10,000 Plant Genomes Project, will make exciting genomic contributions to biodiversity research. Reliable, efficient, and well-integrated software that maintains connectivity to community data standards will be critical for handling the tremendous volume of data generated by these initiatives. The EASEL (Efficient, Accurate, Scalable Eukaryotic modeLs) framework will greatly ease the burden on researchers, many of whom are attempting to assemble and annotate genomes with small teams. Collaborations with these small teams will support the development of an annotation platform that implements machine learning in a user-friendly package. At the same time, EASEL will improve efficiency and accuracy in response to the needs of larger and more complex genomes. Collaborations with these large-scale initiatives will support an intensive undergraduate internship program to connect biology students to big data, bioinformatics, and machine learning in the context of genome annotation.
EASEL, an integrated and accessible deep learning framework for the annotation of eukaryotic reference genomes with limited or extensive external evidence, will be developed. The software will improve both evidence-based and ab initio derived gene models through a full workflow that spans repeat identification through gene model annotation. Software development will be paired with research partnerships representing over 30 new eukaryotic genomes, including insects, plants, and animals. Following successful implementation, EASEL will be translated into a framework compatible with the Galaxy Toolshed so that it can be freely installed and executed through any local instance. A Tripal/Galaxy database module will be developed for installation on any Tripal clade or model organism web-based repository to provide analytical capacity in proximity to the genomic resources housed in community databases. Integration at the database level will be evaluated first within the forest tree genomics and phenomics resource, TreeGenes. Software development will be integrated into a multidisciplinary, research- and education-driven model. A new undergraduate summer training opportunity, Genome Assembly and Annotation, will guide students through a three-module research experience that will culminate in an Annotation-thon and a total of ten new genome annotations. Software and results from this project will be distributed here: https://gitlab.com/PlantGenomicsLab/HBEF/-/tree/master/Annotation
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Project outputs
Journal articles (8)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Urban living can rescue Darwin's finches from the lethal effects of invasive vampire flies
- DOI: 10.1111/gcb.17145
- Published: 2024-01-01
- Journal:
- Impact factor: 11.6
- Authors: Knutie, Sarah A.; Webster, Cynthia N.; Wegrzyn, Jill L.
- Corresponding author: Wegrzyn, Jill L.
Other publications by Jill Wegrzyn
Similar overseas grants
Advancing Governance and Resilience for Climate Adaptation through Cultural Heritage (AGREE)
- Award number: AH/Z000017/1
- Fiscal year: 2024
- Amount: $563,400
- Award type: Research Grant
Advancing Child and Youth-led Climate Change Education with Country
- Award number: DP240100968
- Fiscal year: 2024
- Amount: $563,400
- Award type: Discovery Projects
Governing Sustainable Futures: Advancing the use of Participatory Mechanisms for addressing Place-based Contestations of Sustainable Living
- Award number: ES/Z502789/1
- Fiscal year: 2024
- Amount: $563,400
- Award type: Research Grant
Collaborative Research: Conference: DESC: Type III: Eco Edge - Advancing Sustainable Machine Learning at the Edge
- Award number: 2342498
- Fiscal year: 2024
- Amount: $563,400
- Award type: Standard Grant
HSI Implementation and Evaluation Project: Green Chemistry: Advancing Equity, Relevance, and Environmental Justice
- Award number: 2345355
- Fiscal year: 2024
- Amount: $563,400
- Award type: Continuing Grant
AUC-GRANTED: Advancing Transformation of the Research Enterprise through Shared Resource Support Model for Collective Impact and Synergistic Effect.
- Award number: 2341110
- Fiscal year: 2024
- Amount: $563,400
- Award type: Cooperative Agreement
ALPACA - Advancing the Long-range Prediction, Attribution, and forecast Calibration of AMOC and its climate impacts
- Award number: 2406511
- Fiscal year: 2024
- Amount: $563,400
- Award type: Standard Grant
Planning: Advancing Discovery on a Sustainable National Research Enterprise
- Award number: 2412406
- Fiscal year: 2024
- Amount: $563,400
- Award type: Standard Grant
Collaborative Research: CHIPS: TCUP Cyber Consortium Advancing Computer Science Education (TCACSE)
- Award number: 2414607
- Fiscal year: 2024
- Amount: $563,400
- Award type: Standard Grant
Photonic-Enabled THz Duplex Metasurface: Advancing Communication and Sensing
- Award number: 24K17324
- Fiscal year: 2024
- Amount: $563,400
- Award type: Grant-in-Aid for Early-Career Scientists