Tuning big data analysis infrastructure for HIV research

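Basic information

  • Grant number:
    10214719
  • Principal investigator:
  • Amount:
    $368,300
  • Host institution:
  • Host institution country:
    United States
  • Project type:
  • Fiscal year:
    2020
  • Funding country:
    United States
  • Project period:
    2020-07-09 to 2024-05-31
  • Project status:
    Completed

Project abstract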

The rapid worldwide spread and severe regional outbreaks of COVID-19 following its emergence in Wuhan in November 2019 have created a sense of urgency and alarm. There are many more cases (>100,000) and deaths (~5,000) than in other recent viral outbreaks/epidemics (SARS, MERS, Ebola and Zika viruses); but in many other respects the epidemic is “typical” – zoonotic introduction from an (as yet undetermined) animal reservoir, followed by a period of undetected transmission among humans (with possible adaptation to the new host), and then generalized transmission. The same types of questions arise during each of these emerging outbreaks: Where did the pathogen come from? Is it evolving in the human population? How is it spreading? How can reliable diagnostics be developed? What are promising vaccine targets? Many, if not all, of these questions depend on rapid and reliable genomic analysis of diverse viral sample sequences by multiple laboratories. Yet, time and time again, including with COVID-19, we encounter the same avoidable shortcomings early in the viral investigation: lack of reproducibility, rigor, and data/analytic sharing. The initial publications describing genomic features of COVID-19 [1–4] used Illumina and Oxford Nanopore data to elucidate the sequence composition of patient specimens (although only Wu et al. [3] explicitly provided the accession numbers for their raw short read sequencing data). However, their approaches to processing, assembly, and analysis of raw data differed widely and ranged from transparent [3] to entirely opaque [4]. Such lack of analytical transparency sets a dangerous precedent. Infectious disease outbreaks often occur in locations where the infrastructure necessary for data analysis may be inaccessible, or where unbiased interpretation of results may be politically untenable.
Essential questions such as the extent of intra-host genomic variability (indicative of adaptation or multiple infection), viral evolution (selection, recombination), and transmission (phylogenetic and phylogeographic) cannot be answered reliably if researchers cannot trust/replicate the source data and analytical approaches. The key goals/deliverables of this supplement will be open analytic workflows that can be used to curate and standardize genomic data, and high-quality annotated variation data for SARS-CoV-2 and potential future outbreaks. These workflows will be distributed through proven, fully open, and highly used infrastructure provided by the Galaxy (http://covid19.galaxyproject.org) and HyPhy/Datamonkey (http://covid19.datamonkey.org/) projects.

Project outcomes

Journal articles (43)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Natural selection in the evolution of SARS-CoV-2 in bats created a generalist virus and highly capable human pathogen.
  • DOI:
    10.1371/journal.pbio.3001115
  • Publication date:
    2021-03
  • Journal:
  • Impact factor:
    9.8
  • Authors:
    MacLean OA;Lytras S;Weaver S;Singer JB;Boni MF;Lemey P;Kosakovsky Pond SL;Robertson DL
  • Corresponding author:
    Robertson DL
Sequencing error profiles of Illumina sequencing instruments.
  • DOI:
    10.1093/nargab/lqab019
  • Publication date:
    2021-03
  • Journal:
  • Impact factor:
    4.6
  • Authors:
    Stoler N;Nekrutenko A
  • Corresponding author:
    Nekrutenko A
Predicting runtimes of bioinformatics tools based on historical data: five years of Galaxy usage.
  • DOI:
    10.1093/bioinformatics/btz054
  • Publication date:
    2019
  • Journal:
  • Impact factor:
    0
  • Authors:
    Tyryshkina,Anastasia;Coraor,Nate;Nekrutenko,Anton
  • Corresponding author:
    Nekrutenko,Anton
Exploring the Natural Origins of SARS-CoV-2 in the Light of Recombination.
  • DOI:
    10.1093/gbe/evac018
  • Publication date:
    2022-02-04
  • Journal:
  • Impact factor:
    3.3
  • Authors:
    Lytras S;Hughes J;Martin D;Swanepoel P;de Klerk A;Lourens R;Kosakovsky Pond SL;Xia W;Jiang X;Robertson DL
  • Corresponding author:
    Robertson DL
Galaxy Training: A powerful framework for teaching!
  • DOI:
    10.1371/journal.pcbi.1010752
  • Publication date:
    2023-01
  • Journal:
  • Impact factor:
    4.3
  • Authors:
  • Corresponding author:

Other grants held by ANTON NEKRUTENKO

Tuning big data analysis infrastructure for HIV research
  • Grant number:
    9511742
  • Fiscal year:
    2017
  • Funding amount:
    $368,300
  • Project type:
Tuning big data analysis infrastructure for HIV research
  • Grant number:
    10170221
  • Fiscal year:
    2017
  • Funding amount:
    $368,300
  • Project type:
Democratization of Data Analysis in Life Sciences Through Galaxy
  • Grant number:
    8432034
  • Fiscal year:
    2012
  • Funding amount:
    $368,300
  • Project type:
Democratization of Data Analysis in Life Sciences Through Galaxy
  • Grant number:
    10576907
  • Fiscal year:
    2012
  • Funding amount:
    $368,300
  • Project type:
Democratization of Data Analysis in Life Sciences Through Galaxy
  • Grant number:
    10356796
  • Fiscal year:
    2012
  • Funding amount:
    $368,300
  • Project type:
Democratization of Data Analysis in Life Sciences Through Galaxy
  • Grant number:
    10090025
  • Fiscal year:
    2012
  • Funding amount:
    $368,300
  • Project type:
Democratization of Data Analysis in Life Sciences Through Galaxy
  • Grant number:
    8243028
  • Fiscal year:
    2012
  • Funding amount:
    $368,300
  • Project type:
Democratization of Data Analysis in Life Sciences Through Galaxy
  • Grant number:
    8606866
  • Fiscal year:
    2012
  • Funding amount:
    $368,300
  • Project type:
An Efficient Lightweight Environment for Biomedical Computation
  • Grant number:
    8035956
  • Fiscal year:
    2009
  • Funding amount:
    $368,300
  • Project type:
An Efficient Lightweight Environment for Biomedical Computation
  • Grant number:
    7566686
  • Fiscal year:
    2009
  • Funding amount:
    $368,300
  • Project type:

Similar overseas grants

The earliest exploration of land by animals: from trace fossils to numerical analyses
  • Grant number:
    EP/Z000920/1
  • Fiscal year:
    2025
  • Funding amount:
    $368,300
  • Project type:
    Fellowship
Animals and geopolitics in South Asian borderlands
  • Grant number:
    FT230100276
  • Fiscal year:
    2024
  • Funding amount:
    $368,300
  • Project type:
    ARC Future Fellowships
The function of the RNA methylome in animals
  • Grant number:
    MR/X024261/1
  • Fiscal year:
    2024
  • Funding amount:
    $368,300
  • Project type:
    Fellowship
Ecological and phylogenomic insights into infectious diseases in animals
  • Grant number:
    DE240100388
  • Fiscal year:
    2024
  • Funding amount:
    $368,300
  • Project type:
    Discovery Early Career Researcher Award
Zootropolis: Multi-species archaeological, ecological and historical approaches to animals in Medieval urban Scotland
  • Grant number:
    2889694
  • Fiscal year:
    2023
  • Funding amount:
    $368,300
  • Project type:
    Studentship
Using novel modelling approaches to investigate the evolution of symmetry in early animals.
  • Grant number:
    2842926
  • Fiscal year:
    2023
  • Funding amount:
    $368,300
  • Project type:
    Studentship
Study of human late fetal lung tissue and 3D in vitro organoids to replace and reduce animals in lung developmental research
  • Grant number:
    NC/X001644/1
  • Fiscal year:
    2023
  • Funding amount:
    $368,300
  • Project type:
    Training Grant
RUI: Unilateral Lasing in Underwater Animals
  • Grant number:
    2337595
  • Fiscal year:
    2023
  • Funding amount:
    $368,300
  • Project type:
    Continuing Grant
RUI:OSIB:The effects of high disease risk on uninfected animals
  • Grant number:
    2232190
  • Fiscal year:
    2023
  • Funding amount:
    $368,300
  • Project type:
    Continuing Grant
A method for identifying taxonomy of plants and animals in metagenomic samples
  • Grant number:
    23K17514
  • Fiscal year:
    2023
  • Funding amount:
    $368,300
  • Project type:
    Grant-in-Aid for Challenging Research (Exploratory)