A Genome Data Analysis Center Focused on Batch Effect Analysis and Data Integration
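Basic Information
- Grant number: 10300778
- Principal investigator:
- Amount: $403,300
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2021
- Funding country: United States
- Project period: 2021-09-22 to 2026-08-31
- Project status: Ongoing
- Source: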
- Keywords: Algorithmic Software; Atlases; Back; Bioinformatics; Biological; Biology; Biometry; Cancer Biology; Cancer Center; Cells; Clinical; Cluster Analysis; Collaborations; Communities; Competence; Complex; Computational Biology; Computing Methodologies; Consult; Consultations; Data; Data Analyses; Data Set; Detection; Development; Diagnosis; Disease; Doctor of Philosophy; Experimental Designs; FAIR principles; Faculty; Future; Generations; Genome Data Analysis Center; Goals; Instruction; Laboratory Research; Leadership; Malignant Neoplasms; Medicine; Mission; Modeling; Molecular Profiling; Morphologic artifacts; Pathway interactions; Phase; Process; Protein Array; Protein Array Analysis; Proteomics; Quality Control; Reproducibility; Research; Research Personnel; Research Project Grants; Sampling; Schedule; Software Engineering; System; The Cancer Genome Atlas; Time; Translating; Visualization; Work; bioinformatics tool; cancer genomics; computerized tools; computing resources; data integration; experience; innovation; member; molecular scale; multidisciplinary; multiple omics; prevent; programs; resistance mechanism; single cell sequencing; software systems; surveillance data; therapy resistant; tool; trend; tumor; working group
Project Summary
Abstract: Technical batch effects pose a fundamental challenge to quality control and reproducibility of even single-laboratory research projects, but the possibilities for serious error are greatly magnified in complex, multi-institutional enterprises such as the cancer molecular profiling projects being undertaken by the NCI Center for Cancer Genomics (CCG). To aid in detection, quantitation, interpretation, and (when appropriate) correction for technical batch effects in such data, we have developed the MBatch software system. MBatch proved indispensable for quality-control “surveillance” of data in The Cancer Genome Atlas (TCGA) and ongoing CCG projects. But detecting and quantitating batch effects (or trend effects or statistical outliers) are just the first steps in a process. The next steps involve detective work in collaboration with those who generated the data, drawing upon expertise in integrative analysis across data types, pathways, and systems-level biology. That detective work usually succeeds in diagnosing the cause of a batch effect as technical or biological. If technical, then computational methods to ameliorate the batch effect can be applied (judiciously).
The primary aim of the proposed Genome Data Analysis Center (GDAC) is to continue to translate that successful quality-control model to the CCG’s other current and future large-scale molecular profiling projects. We will be ready to do that on Day 1. We will continue to enhance and extend the power of MBatch and incorporate a number of innovative new algorithms, tools, and interactive visualizations into it (OmicPioneer-sc, MutBatch, CarDEC, and CorNet). Evaluating and correcting batch effects is a complex process, so we will collaborate with other GDACs and data-generating centers to determine the influence of artifacts on any analysis results they produce. The second aim is to contribute and enhance additional competencies. We are prepared to (i) provide integrated cluster solutions to segregate cases into biologically relevant groups; (ii) provide tools and expertise for high-level visualization of omic data (including single-cell data); and (iii) analyze RPPA proteomic data from the subset of projects that generate such data. Our final aim is to communicate results and distribute corrected data back to other network members, project stakeholders, and the scientific community.
We bring a number of assets to the table, including multidisciplinary expertise in bioinformatics, biostatistics, software engineering, cancer biology and cancer medicine; PIs with a combined 40+ years of experience in molecular profiling of cancers; expertise gained in 10 years of doing the batch effects surveillance for TCGA and other CCG projects; a highly professional software engineering team with a track record of producing high-end bioinformatics tools; extensive computing resources, including one of the most powerful academic clusters in the world; and close working relationships with first-class basic, translational, and clinical researchers across MD Anderson, one of the foremost cancer centers in the U.S. The bottom-line mission of the GDAC will be to aid the research community’s effort to understand cancer and to prevent, detect, diagnose, and treat it more effectively.
Project Outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants by Rehan Akbani
The Cancer Proteome Atlas: an Integrated Bioinformatics Resource for Functional Cancer Proteomic Data
- Grant number: 10653202
- Fiscal year: 2022
- Funding amount: $403,300
- Project category:
A Genome Data Analysis Center Focused on Batch Effect Analysis and Data Integration
- Grant number: 10689115
- Fiscal year: 2021
- Funding amount: $403,300
- Project category:
Computational Tools for Analysis and Visualization of Quality Control Issues in Metabolomic Data
- Grant number: 9615762
- Fiscal year: 2018
- Funding amount: $403,300
- Project category:
Computational Tools for Analysis and Visualization of Quality Control Issues in Metabolomic Data
- Grant number: 10251093
- Fiscal year: 2018
- Funding amount: $403,300
- Project category:
Computational Tools for Analysis and Visualization of Quality Control Issues in Metabolomic Data
- Grant number: 10005202
- Fiscal year: 2018
- Funding amount: $403,300
- Project category:
Batch effects in molecular profiling data on cancers: detection, quantitation, interpretation, and correction
- Grant number: 9352299
- Fiscal year: 2016
- Funding amount: $403,300
- Project category:
Integrated analysis of protein expression data from the Reverse Phase Protein Array (RPPA) platform
- Grant number: 10005168
- Fiscal year: 2016
- Funding amount: $403,300
- Project category:
Batch effects in molecular profiling data on cancers: detection, quantitation, interpretation, and correction
- Grant number: 9789027
- Fiscal year: 2016
- Funding amount: $403,300
- Project category:
Integrated analysis of protein expression data from the Reverse Phase Protein Array (RPPA) platform
- Grant number: 9789028
- Fiscal year: 2016
- Funding amount: $403,300
- Project category:
Integrative Pipeline for Analysis & Translational Application of TCGA Data (GDAC)
- Grant number: 8546703
- Fiscal year: 2009
- Funding amount: $403,300
- Project category:
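Similar NSFC (National Natural Science Foundation of China) Grants
Multi-element coupled information design patterns for urban-regional thematic atlases
- Grant number: 41871374
- Approval year: 2018
- Funding amount: CNY 580,000
- Project category: General Program
Construction of a membrane protein atlas of Synechocystis
- Grant number: 31670234
- Approval year: 2016
- Funding amount: CNY 650,000
- Project category: General Program
Collection, collation, study, and compilation of ancient Chinese city maps
- Grant number: 49771008
- Approval year: 1997
- Funding amount: CNY 130,000
- Project category: General Program
Applying systems science to the systems engineering and standardization of atlas design
- Grant number: 49271061
- Approval year: 1992
- Funding amount: CNY 70,000
- Project category: General Program
An Atlas of Ancient Maps in China (Qing Dynasty)
- Grant number: 49171004
- Approval year: 1991
- Funding amount: CNY 50,000
- Project category: General Program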
Similar Overseas Grants
An acquisition and analysis pipeline for integrating MRI and neuropathology in TBI-related dementia and VCID
- Grant number: 10810913
- Fiscal year: 2023
- Funding amount: $403,300
- Project category:
Multi-Scale 3-D Image Analytics for High Dimensional Spatial Mapping of Normal Tissues
- Grant number: 9893208
- Fiscal year: 2019
- Funding amount: $403,300
- Project category:
Multi-Scale 3-D Image Analytics for High Dimensional Spatial Mapping of Normal Tissues
- Grant number: 10251375
- Fiscal year: 2019
- Funding amount: $403,300
- Project category:
Multi-Scale 3-D Image Analytics for High Dimensional Spatial Mapping of Normal Tissues
- Grant number: 10246250
- Fiscal year: 2019
- Funding amount: $403,300
- Project category:
$ 40.33万 - 项目类别: