Robust, scalable, and accurate discovery of mutational signatures
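Basic information
- Grant number: 10491360
- Principal investigator:
- Amount: $199,300
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2021
- Funding country: United States
- Project period: 2021-09-20 to 2024-06-30
- Project status: Completed
- Source: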
- Keywords: Aging; Algorithms; Bayesian Analysis; Biological; Computing Methodologies; DNA Repair; DNA metabolism; Data; Data Reporting; Data Set; Diagnostic; Disease; Ensure; Environmental Exposure; Etiology; Genome; Goals; Histologic; Human; Malignant Neoplasms; Methods; Modality; Modeling; Molecular; Mutation; National Institute of General Medical Sciences; Normal tissue morphology; Pathogenesis; Pathologic Mutagenesis; Positioning Attribute; Problem Solving; Process; RNA metabolism; Research; Single base substitution; Statistical Methods; Structure; Tissues; Tumor Tissue; Variant; Work; base; cancer genome; carcinogenesis; computing resources; design; experimental study; falls; genomic data; heuristics; improved; insertion/deletion mutation; insight; large datasets; novel; repaired; simulation; single-cell RNA sequencing; tool
Project abstract
The mutational signatures inferred from tumor genome sequences have the potential to provide a record of environmental exposure and can give clues about the etiology of carcinogenesis. However, for inferred signatures to be biologically meaningful, each signature must accurately represent the contribution of different mutation types in each mutagenic process. Heuristic algorithms using non-negative matrix factorization (NMF) have primarily been used to discover mutational signatures, but these approaches are inflexible, non-robust, and require massive amounts of computation. The objective of the proposed project is to develop computationally efficient algorithms that, despite imperfect modeling assumptions, can discover biologically meaningful signatures. Aim 1 supports this objective by developing a new framework for scalable, easy-to-use, and accurate variational inference, a widely used approach to approximate Bayesian inference, that is applicable to mutational signature discovery models. Aim 2 develops statistical methods to extract biologically meaningful signatures from the inferences obtained using the proposed variational inference framework. The accuracy and statistical validity of the methods developed in Aims 1 and 2 are ensured through theoretical analysis and numerical experiments on synthetic and real data. Finally, Aim 3 improves upon the current understanding of mutational processes by (1) applying the methods developed in Aims 1 and 2 to a large Pan-Cancer dataset and (2) developing a novel model that allows for the structured incorporation of single-base and double-base substitutions, and insertions and deletions, in each signature. The proposed work is well-positioned to replace heuristics used for discovering meaningful representations of data, and so to have long-term impact on how other genomic data types, such as single-cell RNA-seq, are analyzed.
This work is also directly relevant to the NIGMS, as it falls under "DNA and RNA metabolisms (repair)": many mutational processes are related to aberrant DNA repair or to "clock-like" molecular mechanisms that are associated with aging and can be observed in histologically normal-appearing tissue.
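To make the NMF baseline named in the abstract concrete, the following is a minimal sketch of signature discovery by NMF on synthetic data. It is an illustration only, not the project's variational inference method. The function name `nmf_kl`, the Dirichlet/gamma/Poisson data generator, the choice of 3 signatures and 50 samples, and the use of Lee-Seung multiplicative updates for the generalized KL divergence are all assumptions made for this example; the 96 rows correspond to the usual trinucleotide-context categories for single-base substitutions.

```python
import numpy as np


def nmf_kl(counts, n_signatures, n_iter=500, seed=0, eps=1e-12):
    """Factor counts (types x samples) ~= W @ H with non-negative W and H.

    Uses multiplicative updates for the generalized KL divergence.
    W holds signatures (columns sum to 1); H holds per-sample exposures.
    """
    rng = np.random.default_rng(seed)
    n_types, n_samples = counts.shape
    W = rng.random((n_types, n_signatures)) + eps
    H = rng.random((n_signatures, n_samples)) + eps
    for _ in range(n_iter):
        WH = W @ H + eps
        H *= (W.T @ (counts / WH)) / (W.sum(axis=0)[:, None] + eps)
        WH = W @ H + eps
        W *= ((counts / WH) @ H.T) / (H.sum(axis=1)[None, :] + eps)
    # Normalize each signature to a probability vector and move the
    # scale into the exposures, so that W @ H is unchanged.
    scale = W.sum(axis=0)
    return W / scale, H * scale[:, None]


# Synthetic example (all values below are illustrative assumptions).
rng = np.random.default_rng(1)
n_types, n_true, n_samples = 96, 3, 50
true_signatures = rng.dirichlet(np.full(n_types, 0.5), size=n_true).T
true_exposures = rng.gamma(shape=2.0, scale=200.0, size=(n_true, n_samples))
counts = rng.poisson(true_signatures @ true_exposures).astype(float)

W, H = nmf_kl(counts, n_signatures=3)
print("signature matrix shape:", W.shape)
print("exposure matrix shape:", H.shape)
```

A sketch like this also shows the limitations the abstract points to: the number of signatures must be fixed in advance, the result depends on the random initialization, and the point estimate carries no measure of uncertainty.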
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants by Jonathan Huggins
Robust, scalable, and accurate discovery of mutational signatures
- Grant number: 10378273
- Fiscal year: 2021
- Funding amount: $199,300
- Project category:
Robust, scalable, and accurate discovery of mutational signatures
- Grant number: 10665756
- Fiscal year: 2021
- Funding amount: $199,300
- Project category:
Similar overseas grants
DMS-EPSRC: Asymptotic Analysis of Online Training Algorithms in Machine Learning: Recurrent, Graphical, and Deep Neural Networks
- Grant number: EP/Y029089/1
- Fiscal year: 2024
- Funding amount: $199,300
- Project category: Research Grant
CAREER: Blessing of Nonconvexity in Machine Learning - Landscape Analysis and Efficient Algorithms
- Grant number: 2337776
- Fiscal year: 2024
- Funding amount: $199,300
- Project category: Continuing Grant
CAREER: From Dynamic Algorithms to Fast Optimization and Back
- Grant number: 2338816
- Fiscal year: 2024
- Funding amount: $199,300
- Project category: Continuing Grant
CAREER: Structured Minimax Optimization: Theory, Algorithms, and Applications in Robust Learning
- Grant number: 2338846
- Fiscal year: 2024
- Funding amount: $199,300
- Project category: Continuing Grant
CRII: SaTC: Reliable Hardware Architectures Against Side-Channel Attacks for Post-Quantum Cryptographic Algorithms
- Grant number: 2348261
- Fiscal year: 2024
- Funding amount: $199,300
- Project category: Standard Grant
CRII: AF: The Impact of Knowledge on the Performance of Distributed Algorithms
- Grant number: 2348346
- Fiscal year: 2024
- Funding amount: $199,300
- Project category: Standard Grant
CRII: CSR: From Bloom Filters to Noise Reduction Streaming Algorithms
- Grant number: 2348457
- Fiscal year: 2024
- Funding amount: $199,300
- Project category: Standard Grant
EAGER: Search-Accelerated Markov Chain Monte Carlo Algorithms for Bayesian Neural Networks and Trillion-Dimensional Problems
- Grant number: 2404989
- Fiscal year: 2024
- Funding amount: $199,300
- Project category: Standard Grant
CAREER: Efficient Algorithms for Modern Computer Architecture
- Grant number: 2339310
- Fiscal year: 2024
- Funding amount: $199,300
- Project category: Continuing Grant
CAREER: Improving Real-world Performance of AI Biosignal Algorithms
- Grant number: 2339669
- Fiscal year: 2024
- Funding amount: $199,300
- Project category: Continuing Grant