Removing batch effects in genomic and epigenomic studies

Project summary

Combining high-throughput biomedical data sets from multiple studies increases statistical power when logistical considerations restrict sample size or require data to be generated sequentially. However, substantial technical heterogeneity is commonly observed across batches of data generated with different processing or reagent batches, experimenters, protocols, or profiling platforms. These so-called batch effects confound true relationships in the data, reduce the power gained by combining batches, and may even lead to spurious results.

Many methods have been proposed to filter technical heterogeneity from genomic data. They are designed to remove batch effects, unmeasured or “surrogate” variation, or other “unwanted” variation arising from biological or technical sources. Although these approaches represent impactful advances, significant gaps remain in appropriately filtering technical heterogeneity from -omics and other high-throughput data sets. For example, many existing methods assume that the relevant covariates are known or that the raw data are generally independent. Some applications require more nuanced correction, including single-cell transcriptomics data, which often lack cell-type identifiers; microbiome and mRNA-seq data, which are compositional; and imaging and spatial transcriptomics data, whose data points are spatially correlated. Furthermore, batch correction introduces correlation into the adjusted data that must be accounted for in downstream analyses; most researchers who perform batch correction are unaware of this effect and often apply downstream analysis tools incorrectly. Finally, there remains a significant need for additional software tools and benchmark data sets for evaluating batch effect methods and their efficacy in specific data sets.

We propose to develop algorithms and software to address these research gaps for researchers who combine data from multiple experimental batches.
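The abstract's remark that batch correction introduces correlation into the adjusted data can be seen in a small simulation. The sketch below is an illustration only, not the project's method: it assumes the simplest possible adjustment (subtracting each batch's mean), independent Gaussian noise, and hypothetical names such as `center_by_batch` and `simulate`. Because the adjusted values within a batch must sum to zero, two samples from the same batch of size n become negatively correlated, with a theoretical correlation of -1/(n-1).

```python
import random


def mean(xs):
    """Arithmetic mean of an iterable of numbers."""
    xs = list(xs)
    return sum(xs) / len(xs)


def center_by_batch(values, batches):
    """Subtract each batch's mean from its samples (location-only adjustment)."""
    batch_means = {
        b: mean(v for v, vb in zip(values, batches) if vb == b)
        for b in set(batches)
    }
    return [v - batch_means[b] for v, b in zip(values, batches)]


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(n_per_batch=4, n_features=20000, seed=0):
    """Correlation, across features, between two adjusted samples of one batch.

    The raw samples are independent by construction, so any correlation
    measured here is introduced by the adjustment itself.
    """
    rng = random.Random(seed)
    batches = ["A"] * n_per_batch + ["B"] * n_per_batch
    shift = {"A": 0.0, "B": 3.0}  # assumed additive batch effect
    first, second = [], []
    for _ in range(n_features):
        raw = [rng.gauss(0.0, 1.0) + shift[b] for b in batches]
        adjusted = center_by_batch(raw, batches)
        first.append(adjusted[0])   # sample 1 of batch A
        second.append(adjusted[1])  # sample 2 of batch A
    return pearson(first, second)


if __name__ == "__main__":
    n = 4
    print("empirical correlation:", simulate(n_per_batch=n))
    print("theoretical -1/(n-1): ", -1 / (n - 1))
```

Downstream tools that assume independent samples ignore this induced dependence, which is the concern the abstract raises; smaller batches make the effect stronger.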

Project outputs

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other grants by William Evan Johnson

Microbiome-based biomarkers and models of lung cancer development and treatment
  • Grant number: 10739531
  • Fiscal year: 2022
  • Funding amount: $120,500

Systems Biology Core
  • Grant number: 10493266
  • Fiscal year: 2021
  • Funding amount: $120,500

Microbiome-based biomarkers and models of lung cancer development and treatment
  • Grant number: 10366665
  • Fiscal year: 2021
  • Funding amount: $120,500

Systems Biology Core
  • Grant number: 10665023
  • Fiscal year: 2021
  • Funding amount: $120,500

Systems Biology Core
  • Grant number: 10271647
  • Fiscal year: 2021
  • Funding amount: $120,500

Signature of profiling and staging the progression of TB from infection to disease.
  • Grant number: 10214482
  • Fiscal year: 2020
  • Funding amount: $120,500

Removing batch effects in genomic and epigenomic studies
  • Grant number: 10155560
  • Fiscal year: 2018
  • Funding amount: $120,500

Removing batch effects in genomic and epigenomic studies
  • Grant number: 9926913
  • Fiscal year: 2018
  • Funding amount: $120,500

Removing batch effects in high-throughput biomedical studies
  • Grant number: 10659898
  • Fiscal year: 2018
  • Funding amount: $120,500

An interactive analysis toolkit for single cell RNA-seq in cancer research
  • Grant number: 9389818
  • Fiscal year: 2017
  • Funding amount: $120,500

Similar overseas grants

More sustainable biocatalytic imine reductions to chiral amines with hydrogen-driven NADPH recycling operated in batch and continuous flow
  • Grant number: 2889869
  • Fiscal year: 2023
  • Funding amount: $120,500
  • Project category: Studentship

Oakdale: a step-change in UK materials and manufacturing using carbon negative materials to achieve carbon neutral batch designs
  • Grant number: 10080073
  • Fiscal year: 2023
  • Funding amount: $120,500
  • Project category: Collaborative R&D

Ultra Clean Cast DLMM - Batch 39 main application
  • Grant number: 10065261
  • Fiscal year: 2023
  • Funding amount: $120,500
  • Project category: BEIS-Funded Programmes

Selective trapping of transient aryllithiums in halogen dance and batch synthesis of constitutional isomers
  • Grant number: 22KJ2277
  • Fiscal year: 2023
  • Funding amount: $120,500
  • Project category: Grant-in-Aid for JSPS Fellows

Accounting for batch claim arrivals and fluctuation in premium income: from practice to theory
  • Grant number: RGPIN-2019-06586
  • Fiscal year: 2022
  • Funding amount: $120,500
  • Project category: Discovery Grants Program - Individual

Sorption of palladium(II) on biotite, quartz and feldspar in brackish groundwater by batch experiment, sorption model and DFT calculation
  • Grant number: 561141-2020
  • Fiscal year: 2022
  • Funding amount: $120,500
  • Project category: Alliance Grants

Data Driven Modeling and Control of Batch and Batch Like Processes
  • Grant number: 573854-2022
  • Fiscal year: 2022
  • Funding amount: $120,500
  • Project category: University Undergraduate Student Research Awards

Manufacturing of New Batch AV-1959D Drug Product and Placebo for Phase 1 Trial
  • Grant number: 10732215
  • Fiscal year: 2022
  • Funding amount: $120,500

Development of a Hotbox Mechanism For Batch Pyrolysis Kiln Technology
  • Grant number: 10046194
  • Fiscal year: 2022
  • Funding amount: $120,500
  • Project category: Grant for R&D

SBIR Phase II: Large-Scale Synthesis of Hollow Metal Nanospheres: Conversion of Batch Synthesis to Continuous Flow
  • Grant number: 2127133
  • Fiscal year: 2022
  • Funding amount: $120,500
  • Project category: Cooperative Agreement