Removing batch effects in high-throughput biomedical studies
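Basic information
- Grant number: 10659898
- Principal investigator:
- Amount: $300,800
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2018
- Funding country: United States
- Project period: 2018-05-01 to 2027-08-31
- Project status: Ongoing
- Source: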
- Keywords: Address; Algorithms; Benchmarking; Biological; Biological Assay; Cells; Cluster Analysis; Computer software; Data; Data Adjustments; Data Correlations; Data Set; Detection; Dimensions; Evaluation; Excision; Generations; Heterogeneity; Image; Lead; Methods; Modeling; Nature; Outcome; Pathway Analysis; Performance; Protocols documentation; Reagent; Research; Research Personnel; Sample Size; Software Tools; Source; Standardization; Structure; Taxonomy; Variant; Work; biomarker development; cell type; combat; data integration; data resource; design; differential expression; epigenomics; experimental study; genomic data; heterogenous data; imaging platform; improved; mRNA sequencing; microbiome; multiple data sources; multiple datasets; novel; outcome prediction; single-cell RNA sequencing; software development; tool; transcriptomics
Project Summary/Abstract
Combining high-throughput biomedical data sets from multiple studies is advantageous to increase statistical
power in studies where logistical considerations restrict sample size or require the sequential generation of data.
However, significant technical heterogeneity is commonly observed across multiple batches of data that are
generated from different processing or reagent batches, experimenters, protocols, or profiling platforms. These
so-called batch effects confound true relationships in the data, reducing the power benefits of combining multiple
batches of data, and may even lead to spurious results. Many methods have been proposed to filter technical
heterogeneity from genomic data. These methods are designed to remove batch effects, unmeasured or
“surrogate” variation, or other “unwanted” variation caused by biological or technical sources. Although these
approaches represent impactful advances in the field, there are still significant gaps that need to be addressed
to appropriately filter technical heterogeneity from -omics data and other high-throughput datasets. For example,
many existing methods assume relevant covariates are known or that raw data are generally independent. Some
applications require more specific and direct correction methods, including single-cell transcriptomics data that
are often missing cell-type identifiers, microbiome data that are compositional in nature, and imaging and spatial
transcriptomics data that have spatially correlated data points. Furthermore, batch correction introduces
correlation into the adjusted data, which needs to be accounted for in downstream analyses, and most
researchers performing batch correction are unaware of this negative impact and often incorrectly apply
downstream analysis tools. Finally, there is still a significant need for additional software tools and benchmark
datasets for evaluating batch effect methods and their efficacy in specific datasets. We propose to develop
algorithms and software to address these specific research gaps facing researchers combining data from
multiple experimental batches.
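
The statement that batch correction introduces correlation into the adjusted data can be illustrated with a minimal sketch. It assumes the simplest possible correction, subtracting each batch's sample mean, which is an illustrative stand-in and not the method proposed in this project. For independent samples with equal variance, centering a batch of n samples gives residuals whose pairwise correlation is -1/(n-1). All function names below are hypothetical.

```python
import random
import statistics


def center_batch(values):
    """Subtract the batch mean from every sample in one batch (assumed correction)."""
    batch_mean = statistics.fmean(values)
    return [v - batch_mean for v in values]


def pearson(x, y):
    """Plain Pearson correlation between two equal-length sequences."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def residual_correlation(n_samples, n_features=20000, seed=0):
    """Estimate the correlation between two samples of one batch after centering.

    Each feature is simulated as n_samples independent standard-normal values,
    so any correlation in the output is introduced by the adjustment itself.
    """
    rng = random.Random(seed)
    first, second = [], []
    for _ in range(n_features):
        adjusted = center_batch([rng.gauss(0.0, 1.0) for _ in range(n_samples)])
        first.append(adjusted[0])
        second.append(adjusted[1])
    return pearson(first, second)


if __name__ == "__main__":
    for n in (2, 4, 10):
        print(n, round(residual_correlation(n), 3), "theory:", round(-1 / (n - 1), 3))
```

With four samples per batch the estimate should sit close to -1/3, and the induced correlation shrinks as batches grow. A downstream test that treats the adjusted samples as independent ignores this dependence, which is the kind of misuse the summary describes.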
Project outcomes
Journal articles (6)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Robustifying genomic classifiers to batch effects via ensemble learning.
- DOI: 10.1093/bioinformatics/btaa986
- Published: 2021
- Journal:
- Impact factor: 0
- Authors: Zhang, Yuqing; Patil, Prasad; Johnson, W. Evan; Parmigiani, Giovanni
- Corresponding author: Parmigiani, Giovanni
Exploring Host-Microbe Interactions in Lung Cancer.
- DOI: 10.1164/rccm.201807-1225ed
- Published: 2018
- Journal:
- Impact factor: 24.7
- Authors: Zhao, Yue; Johnson, W. Evan
- Corresponding author: Johnson, W. Evan
animalcules: interactive microbiome analytics and visualization in R.
- DOI: 10.1186/s40168-021-01013-0
- Published: 2021-03-28
- Journal:
- Impact factor: 15.5
- Authors: Zhao Y; Federico A; Faits T; Manimaran S; Segrè D; Monti S; Johnson WE
- Corresponding author: Johnson WE
Interactive analysis of single-cell data using flexible workflows with SCTK2.
- DOI: 10.1016/j.patter.2023.100814
- Published: 2023-08-11
- Journal:
- Impact factor: 6.5
- Authors: Wang, Yichen; Sarfraz, Irzam; Pervaiz, Nida; Hong, Rui; Koga, Yusuke; Akavoor, Vidya; Cao, Xinyun; Alabdullatif, Salam; Zaib, Syed Ali; Wang, Zhe; Jansen, Frederick; Yajima, Masanao; Johnson, W. Evan; Campbell, Joshua D.
- Corresponding author: Campbell, Joshua D.
Other grants by William Evan Johnson
Microbiome-based biomarkers and models of lung cancer development and treatment
- Grant number: 10739531
- Fiscal year: 2022
- Funding amount: $300,800
- Project category:
Microbiome-based biomarkers and models of lung cancer development and treatment
- Grant number: 10366665
- Fiscal year: 2021
- Funding amount: $300,800
- Project category:
Signature of profiling and staging the progression of TB from infection to disease.
- Grant number: 10214482
- Fiscal year: 2020
- Funding amount: $300,800
- Project category:
Removing batch effects in genomic and epigenomic studies
- Grant number: 10155560
- Fiscal year: 2018
- Funding amount: $300,800
- Project category:
Removing batch effects in genomic and epigenomic studies
- Grant number: 9926913
- Fiscal year: 2018
- Funding amount: $300,800
- Project category:
Removing batch effects in genomic and epigenomic studies
- Grant number: 10739064
- Fiscal year: 2018
- Funding amount: $300,800
- Project category:
An interactive analysis toolkit for single cell RNA-seq in cancer research
- Grant number: 9389818
- Fiscal year: 2017
- Funding amount: $300,800
- Project category:
Similar overseas grants
DMS-EPSRC: Asymptotic Analysis of Online Training Algorithms in Machine Learning: Recurrent, Graphical, and Deep Neural Networks
- Grant number: EP/Y029089/1
- Fiscal year: 2024
- Funding amount: $300,800
- Project category: Research Grant
CAREER: Blessing of Nonconvexity in Machine Learning - Landscape Analysis and Efficient Algorithms
- Grant number: 2337776
- Fiscal year: 2024
- Funding amount: $300,800
- Project category: Continuing Grant
CAREER: From Dynamic Algorithms to Fast Optimization and Back
- Grant number: 2338816
- Fiscal year: 2024
- Funding amount: $300,800
- Project category: Continuing Grant
CAREER: Structured Minimax Optimization: Theory, Algorithms, and Applications in Robust Learning
- Grant number: 2338846
- Fiscal year: 2024
- Funding amount: $300,800
- Project category: Continuing Grant
CRII: SaTC: Reliable Hardware Architectures Against Side-Channel Attacks for Post-Quantum Cryptographic Algorithms
- Grant number: 2348261
- Fiscal year: 2024
- Funding amount: $300,800
- Project category: Standard Grant
CRII: AF: The Impact of Knowledge on the Performance of Distributed Algorithms
- Grant number: 2348346
- Fiscal year: 2024
- Funding amount: $300,800
- Project category: Standard Grant
CRII: CSR: From Bloom Filters to Noise Reduction Streaming Algorithms
- Grant number: 2348457
- Fiscal year: 2024
- Funding amount: $300,800
- Project category: Standard Grant
EAGER: Search-Accelerated Markov Chain Monte Carlo Algorithms for Bayesian Neural Networks and Trillion-Dimensional Problems
- Grant number: 2404989
- Fiscal year: 2024
- Funding amount: $300,800
- Project category: Standard Grant
CAREER: Efficient Algorithms for Modern Computer Architecture
- Grant number: 2339310
- Fiscal year: 2024
- Funding amount: $300,800
- Project category: Continuing Grant
CAREER: Improving Real-world Performance of AI Biosignal Algorithms
- Grant number: 2339669
- Fiscal year: 2024
- Funding amount: $300,800
- Project category: Continuing Grant


















