ABI Development: Leveraging NSF-funded national cyberinfrastructure to spearhead biological discovery with Galaxy

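Basic Information

  • Award number:
    1661497
  • Principal investigator:
  • Amount:
    $1.6599 million
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Continuing Grant
  • Fiscal year:
    2017
  • Funding country:
    United States
  • Project period:
    2017-07-01 to 2021-06-30
  • Project status:
    Completed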

Project Abstract

Due to the rapidly increasing volume of biological data from sequencing, imaging, and other technologies, data processing needs in the life sciences are now on par with those of the physical and engineering disciplines. Importantly, the distributed nature of data generation in biology makes this situation even more challenging. Today one can hardly find a research institution or university without multiple high-throughput DNA sequencing machines, and there are often references to a "data crisis" in biology. Federal agencies, and the NSF in particular, are investing heavily in cyberinfrastructure by supporting development of high performance computing (HPC) resources such as the Extreme Science and Engineering Discovery Environment (XSEDE). Yet to a large extent, these resources remain unknown to biological researchers, who overwhelmingly continue to rely on fragile in-house computation. The goal of this project is to ensure effective utilization of federal funds that have been invested into development of the national computing infrastructure. This project will extend the Galaxy software platform to leverage existing NSF hardware resources, increasing the value of existing infrastructure for biology researchers who were previously unable to take full advantage of these resources.
This project will follow a comprehensive approach that addresses the needs of experimental scientists, tool developers, and administrators of high performance computing (HPC) systems.
(1) Access to national compute infrastructure will be expanded so that Galaxy will function as a middleware interface to existing heterogeneous environments such as XSEDE or individual systems such as Jetstream. Software components necessary to optimize Galaxy as a link between researchers and existing HPC will be developed based on pilot projects with the Texas Advanced Computing Center (TACC), XSEDE, PSC, and Indiana University.
(2) XSEDE resources to enable interactive data exploration and visualization will be leveraged to expand Galaxy's current capacity for dynamic scientific data analysis. Integration with Interactive Analysis Environments, such as Jupyter or RStudio, will allow manipulation and creation of Galaxy datasets using common scripting languages. Taking advantage of XSEDE resources will enable Galaxy's interactive environments and visual analytics to scale to large datasets and sophisticated workflows.
(3) Sustainable training and outreach will focus on creating and disseminating curricula that enable investigators to learn the skills needed to analyze large datasets. Creating pre-configured infrastructure components for running workshops and developing modules for undergraduate and graduate face-to-face and online classes will expand the current educational portfolio to scale support for increasing numbers of Galaxy users, including those in disciplines beyond the life sciences such as Natural Language Processing. Outcomes of this project will be available at http://galaxyproject.org and https://github.com/galaxyproject.

Project Outcomes

Journal articles (2)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other Publications by Anton Nekrutenko

Next-generation sequencing data interpretation: enhancing reproducibility and accessibility
  • DOI:
    10.1038/nrg3305
  • Published:
    2012-08-17
  • Journal:
  • Impact factor:
    52.000
  • Authors:
    Anton Nekrutenko; James Taylor
  • Corresponding author:
    James Taylor

In memory of James Taylor: the birth of Galaxy
  • DOI:
    10.1186/s13059-020-02016-0
  • Published:
    2020-04-30
  • Journal:
  • Impact factor:
    9.400
  • Authors:
    Anton Nekrutenko; Michael C. Schatz
  • Corresponding author:
    Michael C. Schatz

mNSC1 shows no evidence of protein-coding capacity
  • DOI:
    10.1016/j.gene.2005.11.016
  • Published:
    2006-03-29
  • Journal:
  • Impact factor:
  • Authors:
    Christina Wilson; Paula Goetting-Minesky; Anton Nekrutenko
  • Corresponding author:
    Anton Nekrutenko

Other Grants by Anton Nekrutenko

Rapid: Collaborative Research: Agile and effective responses to emerging pathogen threats through open data and open analytics
  • Award number:
    2027194
  • Fiscal year:
    2020
  • Award amount:
    $1.6599 million
  • Project type:
    Standard Grant

CIBR: Collaborative Research: Providing sustainable Galaxy service on XSEDE resources
  • Award number:
    1929694
  • Fiscal year:
    2019
  • Award amount:
    $1.6599 million
  • Project type:
    Continuing Grant

Collaborative Research: CC-NIE Integration: Developing Applications with Networking Capabilities via End-to-End SDN (DANCES)
  • Award number:
    1340953
  • Fiscal year:
    2014
  • Award amount:
    $1.6599 million
  • Project type:
    Standard Grant

Cyberinfrastructure for Accessible and Reproducible Research in Life Sciences
  • Award number:
    0850103
  • Fiscal year:
    2009
  • Award amount:
    $1.6599 million
  • Project type:
    Continuing Grant

Tailoring Genomic Data to the Needs of Experimental Biologists and Educators
  • Award number:
    0543285
  • Fiscal year:
    2006
  • Award amount:
    $1.6599 million
  • Project type:
    Continuing Grant

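Similar NSFC Grants

Gene cloning and functional analysis of abnormal boundary development (abd), a rice mutant defective in boundary development
  • Award number:
    32070202
  • Approval year:
    2020
  • Award amount:
    CNY 580,000
  • Project type:
    General Program

Development of a Linear Stochastic Model for Wind Field Reconstruction from Limited Measurement Data
  • Award number:
  • Approval year:
    2020
  • Award amount:
    CNY 400,000
  • Project type: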

Similar Overseas Grants

CAREER: Leveraging Data Science & Policy to Promote Sustainable Development Via Resource Recovery
  • Award number:
    2339025
  • Fiscal year:
    2024
  • Award amount:
    $1.6599 million
  • Project type:
    Continuing Grant

NSF Engines Development Award: Leveraging innovations for water and energy security (NM, TX)
  • Award number:
    2315274
  • Fiscal year:
    2024
  • Award amount:
    $1.6599 million
  • Project type:
    Cooperative Agreement

EAGER GERMINATION Collaborative Research: Leveraging a Research Development Professional Network to Catalyze Statewide Innovative and Societally Relevant Research
  • Award number:
    2409875
  • Fiscal year:
    2023
  • Award amount:
    $1.6599 million
  • Project type:
    Standard Grant

Leveraging a novel health records platform to predict the development of cardiovascular disease following kidney transplantation
  • Award number:
    10679322
  • Fiscal year:
    2023
  • Award amount:
    $1.6599 million
  • Project type:

EAGER GERMINATION Collaborative Research: Leveraging a Research Development Professional Network to Catalyze Statewide Innovative and Societally Relevant Research
  • Award number:
    2203442
  • Fiscal year:
    2022
  • Award amount:
    $1.6599 million
  • Project type:
    Standard Grant

EAGER GERMINATION Collaborative Research: Leveraging a Research Development Professional Network to Catalyze Statewide Innovative and Societally Relevant Research
  • Award number:
    2203425
  • Fiscal year:
    2022
  • Award amount:
    $1.6599 million
  • Project type:
    Standard Grant

EAGER GERMINATION Collaborative Research: Leveraging a Research Development Professional Network to Catalyze Statewide Innovative and Societally Relevant Research
  • Award number:
    2203470
  • Fiscal year:
    2022
  • Award amount:
    $1.6599 million
  • Project type:
    Standard Grant

Intelligent API Engineering: Systematically Leveraging APIs Through Development Knowledge and Usage Data
  • Award number:
    RGPIN-2022-03505
  • Fiscal year:
    2022
  • Award amount:
    $1.6599 million
  • Project type:
    Discovery Grants Program - Individual

Leveraging Context in Open Software Development Ecosystems
  • Award number:
    RGPIN-2016-05257
  • Fiscal year:
    2022
  • Award amount:
    $1.6599 million
  • Project type:
    Discovery Grants Program - Individual

EAGER GERMINATION Collaborative Research: Leveraging a Research Development Professional Network to Catalyze Statewide Innovative and Societally Relevant Research
  • Award number:
    2203459
  • Fiscal year:
    2022
  • Award amount:
    $1.6599 million
  • Project type:
    Standard Grant