Statistical methods for analyzing messy microbiome data: detection of hidden artifacts and robust modeling approaches
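Basic information
- Grant number: 10503637
- Principal investigator:
- Amount: $380,200
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2022
- Funding country: United States
- Project period: 2022-09-23 to 2027-08-31
- Project status: Ongoing
- Source: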
- Keywords: Address; Algorithms; Birth; Cations; Cells; Characteristics; Clinical Research; Cohort Studies; Collection; Communities; Complex; Computer software; DNA; Data; Data Analyses; Data Set; Deposition; Detection; Disease; Etiology; Evaluation; Failure; Foundations; Genes; Health; Human; Human Microbiome; Investigation; Methods; Modeling; Morphologic artifacts; New Hampshire; Observational Study; Odds Ratio; Performance; Personal Satisfaction; Phenotype; Phylogenetic Analysis; Play; Prevention strategy; Procedures; Process; Protocols documentation; Public Domains; Reproducibility; Research; Research Personnel; Resistance; Role; Sampling; Shotguns; Statistical Methods; Structure; Survival Analysis; Taxon; Taxonomy; Testing; Time; Work; analytical method; bacterial community; base; beta diversity; data tools; data visualization; design; disorder risk; epidemiology study; experimental study; high dimensionality; human microbiota; improved; interest; metagenomic sequencing; microbial; microbiome; microbiome analysis; microbiome research; microbiome sequencing; microorganism; novel; novel strategies; open source; research study; semiparametric; simulation; tool; transcriptome sequencing; treatment strategy; trustworthiness; user friendly software; vaping
Project abstract
Recent research has highlighted the importance of human-associated microbiota in many diseases and health
conditions. Marker-gene amplicon and shotgun metagenomic sequencing (jointly, MGS) are now routinely
used in epidemiological and clinical studies to investigate the health impact of the microbiome community.
Many researchers also deposit MGS data, together with other data, in the public domain for others
to investigate. Although MGS data are increasingly available, their analysis remains difficult. In addition to the
classic statistical challenges inherent to MGS data, such as compositionality, sparsity, overdispersion, and the
phylogenetic relationships among taxa, large-scale MGS studies feature additional complications, including
experimental bias and hidden artifacts (batch effects), which will invalidate downstream analysis if not properly
accounted for. Current analytic approaches largely ignore these difficulties or handle them insufficiently.
This proposal aims to develop powerful and robust statistical methods for reproducible microbiome discoveries
that adjust for unknown batch effects and are resistant to sequencing biases. In Aim 1, we will develop an
approach to search for unmeasured artifacts through a novel surrogate variable analysis and multiple quantile
thresholding. Our approach extends existing surrogate variable analysis to specifically address
the characteristics of MGS data, including differences in variability, sparsity, and zero inflation. In
Aims 2 and 3, we will develop bias-resistant models for assessing microbiome-phenotype associations and for
community-level analysis. We will also develop, distribute, and support user-friendly software for the proposed
methods to benefit the entire research community. The proposed methods will be evaluated through extensive
simulations and analysis of real microbiome data, including data from our motivating studies, such as the VAPing
Observational Research Study (VAPORS) and the New Hampshire birth cohort study. Successful completion of this
proposal will fill the gap between the increasing research interest in the microbiome and the lack of robust,
bias-resistant tools, and will facilitate an in-depth understanding of the human microbiome in health and disease.
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants by Ni Zhao
Statistical methods for analyzing messy microbiome data: detection of hidden artifacts and robust modeling approaches
- Grant number: 10708908
- Fiscal year: 2022
- Funding amount: $380,200
- Project category:
Statistical methods for integrative analysis of multiple microbiome datasets
- Grant number: 10380772
- Fiscal year: 2021
- Funding amount: $380,200
- Project category:
Statistical methods for integrative analysis of multiple microbiome datasets
- Grant number: 10217316
- Fiscal year: 2021
- Funding amount: $380,200
- Project category:
Similar overseas grants
CAREER: Blessing of Nonconvexity in Machine Learning - Landscape Analysis and Efficient Algorithms
- Grant number: 2337776
- Fiscal year: 2024
- Funding amount: $380,200
- Project category: Continuing Grant
CAREER: From Dynamic Algorithms to Fast Optimization and Back
- Grant number: 2338816
- Fiscal year: 2024
- Funding amount: $380,200
- Project category: Continuing Grant
CAREER: Structured Minimax Optimization: Theory, Algorithms, and Applications in Robust Learning
- Grant number: 2338846
- Fiscal year: 2024
- Funding amount: $380,200
- Project category: Continuing Grant
CRII: SaTC: Reliable Hardware Architectures Against Side-Channel Attacks for Post-Quantum Cryptographic Algorithms
- Grant number: 2348261
- Fiscal year: 2024
- Funding amount: $380,200
- Project category: Standard Grant
CRII: AF: The Impact of Knowledge on the Performance of Distributed Algorithms
- Grant number: 2348346
- Fiscal year: 2024
- Funding amount: $380,200
- Project category: Standard Grant
CRII: CSR: From Bloom Filters to Noise Reduction Streaming Algorithms
- Grant number: 2348457
- Fiscal year: 2024
- Funding amount: $380,200
- Project category: Standard Grant
EAGER: Search-Accelerated Markov Chain Monte Carlo Algorithms for Bayesian Neural Networks and Trillion-Dimensional Problems
- Grant number: 2404989
- Fiscal year: 2024
- Funding amount: $380,200
- Project category: Standard Grant
CAREER: Efficient Algorithms for Modern Computer Architecture
- Grant number: 2339310
- Fiscal year: 2024
- Funding amount: $380,200
- Project category: Continuing Grant
CAREER: Improving Real-world Performance of AI Biosignal Algorithms
- Grant number: 2339669
- Fiscal year: 2024
- Funding amount: $380,200
- Project category: Continuing Grant
DMS-EPSRC: Asymptotic Analysis of Online Training Algorithms in Machine Learning: Recurrent, Graphical, and Deep Neural Networks
- Grant number: EP/Y029089/1
- Fiscal year: 2024
- Funding amount: $380,200
- Project category: Research Grant