GTEx engagement with the CFDE-CC and other DCCs towards building a data ecosystem spanning the Common Fund projects
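Basic information
- Grant number: 10468521
- Principal investigator:
- Amount: $414,100
- Host institution:
- Host institution country: United States
- Project type:
- Fiscal year: 2020
- Funding country: United States
- Project period: 2020-09-23 to 2023-09-22
- Project status: Completed
- Source:
- Keywords:
Project abstract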
Detailed Engagement Plan
The National Institutes of Health Common Fund’s Genotype-Tissue Expression (GTEx) project was
launched in 2010 with a goal of providing the scientific community with a resource for the study of human
gene expression and regulation across multiple tissues, to specifically provide insights into the mechanisms
of gene regulation and disease-related perturbations, and to further our understanding of the role that
inherited genetic variation plays in susceptibility to complex diseases. The project enrolled 960 recently
deceased, adult donors and collected close to 49,000 tissue samples. Core data generation was
completed at the end of 2017, with the primary data types including whole genome (WGS, 30X) and whole
exome (WES, 100X) sequence data on all donors, and RNA-sequence data from at least 25,000 samples
spanning 53 human tissues/organs. This dataset constitutes the largest multi-tissue RNA sequence data
resource generated to date (a previous study of genetic effects on gene expression, TwinsUK/EUROBATS,
generated ~2,700 RNA-seq samples from four accessible tissue sites). The GTEx resource also includes
a rich and well annotated collection of donor, sample, and experiment metadata. Furthermore, additional
molecular data types, aimed at enhancing the core data sets, are still being produced, including mass
spectrometry-based proteomics, measurements of DNA methylation, histone marks (ChIP-seq), somatic
DNA sequencing, and DNase I hypersensitivity sites.
The GTEx resource includes both protected-access and open-access data (Fig. 1). The protected-access
data include extensive sample, subject and technical metadata and raw sequence BAM files from RNA-seq,
whole genome (WGS) and whole exome (WES) sequencing, ChIP-seq and m6A RNA-seq, as well as
protected data derived from these such as genotype calls in VCF format. An approved dbGaP application
is required to obtain all protected-access data, including access to the raw sequence data, which are
accessible on the AnVIL platform (on Google Cloud Platform; GCP). The GTEx data also include a large
amount of open-access data, such as gene and transcript expression quantifications, cis- and trans-expression
and splicing QTLs, histology images of every tissue, some eGTEx data summaries, the sample
biobank, and a very limited set of de-identified sample and subject metadata. All of these public data are
available for download, and as interactive visualizations and summary tables on the GTEx portal.
The GTEx project has developed an extensive suite of tools and analysis pipelines that have been
benchmarked, optimized and implemented in GCP for the project (such as the RNA-seq alignment,
quantification, and QC pipeline, and the QTL analysis pipeline). These pipelines were also selected by the
TOPMed project to produce a harmonized resource of RNA sequence data across the large number of
cohorts being sequenced for that project (>20,000 samples to date); our team was involved in initial
benchmarking and harmonization tests of our pipeline across TOPMed sequencing centers and is actively
involved in ongoing data production and analyses. Moreover, very similar pipelines are used by the
ENCODE project, thus facilitating comparisons across large datasets that would be prohibitive in terms of
costs and computational resources in the absence of harmonized pipelines. We have also created
numerous visualizations developed specifically for the open access data on the GTEx portal. The GTEx
project has a very large user community: the GTEx data have the second largest number of Data Access
Requests for protected data in dbGaP (behind TCGA), and it is the most frequently downloaded dbGaP
project. An even larger number of users access the data, tools and interactive visualizations on the GTEx
portal: in the 2019 calendar year, the GTEx portal had 135,000 users (~12,000-18,000/month) worldwide,
with users spiking in October 2019 following the release of the V8 data. The GTEx consortium has published
numerous papers describing the dataset and analyses of the data, and two additional data releases are
still planned.
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants by KRISTIN ARDLIE
Whole Individual Comprehensive KnowlEDge: Somatic Mosaicism across Human Tissues (WICKed SMaHT)
- Grant number: 10662869
- Fiscal year: 2023
- Funding amount: $414,100
- Project type:
Developmental GTEx Laboratory, Data Analysis and Coordination Center
- Grant number: 10662497
- Fiscal year: 2021
- Funding amount: $414,100
- Project type:
Developmental GTEx Laboratory, Data Analysis and Coordination Center
- Grant number: 10492761
- Fiscal year: 2021
- Funding amount: $414,100
- Project type:
Developmental GTEx Laboratory, Data Analysis and Coordination Center
- Grant number: 10302863
- Fiscal year: 2021
- Funding amount: $414,100
- Project type:
GTEx engagement with the CFDE-CC and other DCCs towards building a data ecosystem spanning the Common Fund projects
- Grant number: 10444364
- Fiscal year: 2020
- Funding amount: $414,100
- Project type:
GTEx engagement with the CFDE-CC and other DCCs towards building a data ecosystem spanning the Common Fund projects
- Grant number: 10905807
- Fiscal year: 2020
- Funding amount: $414,100
- Project type:
GTEx engagement with the CFDE-CC and other DCCs towards building a data ecosystem spanning the Common Fund projects
- Grant number: 10683507
- Fiscal year: 2020
- Funding amount: $414,100
- Project type:
A portal and integrative collaborative analysis platform for GTEx
- Grant number: 10181004
- Fiscal year: 2017
- Funding amount: $414,100
- Project type:
Similar overseas grants
GTEx engagement with the CFDE-CC and other DCCs towards building a data ecosystem spanning the Common Fund projects
- Grant number: 10444364
- Fiscal year: 2020
- Funding amount: $414,100
- Project type:
The LINCS DCIC Engagement Plan with the CFDE
- Grant number: 10837964
- Fiscal year: 2020
- Funding amount: $414,100
- Project type:
GTEx engagement with the CFDE-CC and other DCCs towards building a data ecosystem spanning the Common Fund projects
- Grant number: 10905807
- Fiscal year: 2020
- Funding amount: $414,100
- Project type:
Illuminating the Druggable Genome Data Coordinating Center - Engagement Plan with the CFDE
- Grant number: 10217890
- Fiscal year: 2020
- Funding amount: $414,100
- Project type:
The LINCS DCIC Engagement Plan with the CFDE
- Grant number: 10468520
- Fiscal year: 2020
- Funding amount: $414,100
- Project type:
The LINCS DCIC Engagement Plan with the CFDE
- Grant number: 10444350
- Fiscal year: 2020
- Funding amount: $414,100
- Project type:
GTEx engagement with the CFDE-CC and other DCCs towards building a data ecosystem spanning the Common Fund projects
- Grant number: 10683507
- Fiscal year: 2020
- Funding amount: $414,100
- Project type:
Illuminating the Druggable Genome Data Coordinating Center - Engagement Plan with the CFDE
- Grant number: 10683510
- Fiscal year: 2020
- Funding amount: $414,100
- Project type:
Illuminating the Druggable Genome Data Coordinating Center - Engagement Plan with the CFDE
- Grant number: 10907966
- Fiscal year: 2020
- Funding amount: $414,100
- Project type:
Illuminating the Druggable Genome Data Coordinating Center - Engagement Plan with the CFDE
- Grant number: 10468527
- Fiscal year: 2020
- Funding amount: $414,100
- Project type:













