IIBR Informatics: Taming Complexity Through Simulations: Scalable Inference Under the Coalescent with Recombination
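Basic information
- Award number: 2030604
- Principal investigator:
- Amount: $753,800
- Host institution:
- Host institution country: United States
- Award type: Standard Grant
- Fiscal year: 2020
- Funding country: United States
- Period: 2020-09-15 to 2024-08-31
- Status: Completed
- Source:
- Keywords:
Project abstract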
Species have been evolving, diverging, and adapting to their environments for billions of years. While we have no direct access to the history of species, their genomes carry signal that allows us to reconstruct this history. Understanding the evolution of genomes helps shed light on how species evolve and diverge, how genes emerge and evolve, and how traits evolve. However, the evolution of genomes is a very complex process that results in scenarios where different regions of the genome have different evolutionary histories. One process that leads to such a scenario is recombination. This project aims to develop methods for inferring evolutionary histories of genes and genomes in the presence of recombination. Currently, this task is not feasible for large sets of genomes because of the difficulty of deriving mathematical models and computationally feasible inference solutions. This project will enable the task by automatically deriving and inferring the evolutionary history of a set of genomes in the presence of recombination. The project will support graduate student and post-doc mentoring, and will allow for broadening participation in computing, especially given its interdisciplinary nature. Results obtained by this project will facilitate new types of genomic analyses and, consequently, biological discoveries. The aim of this project is to devise methods that make practical and scalable the inference of evolutionary histories (topologies and parameters) under a model called the multispecies coalescent with recombination and migration (MSC-RM). This model allows for analyzing data that consist of genomic sequences from different species and different individuals within species while accounting simultaneously for recombination, incomplete lineage sorting, and gene flow, in addition to various models of DNA sequence evolution.
For inferring the topology of the species phylogeny, a deep learning approach is taken, where a neural network is trained on simulated data. For inferring the phylogeny's parameters (divergence times and population sizes), a hidden Markov model is built from simulated data, and a proxy to the likelihood is computed by means of the quadratic Forward algorithm. This combination of novel techniques helps achieve automated and scalable inference under the MSC-RM model. All methods will be implemented and made publicly available in open source, and all results will be disseminated via publications, public lectures, and tutorials. Results of this project will be available at http://bioinfocs.rice.edu. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
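The abstract's likelihood proxy rests on the Forward algorithm for hidden Markov models. The following is a minimal, generic sketch of that algorithm, not the project's implementation: the function name `forward_log_likelihood`, the toy two-state parameters, and the reading of "quadratic" as cost quadratic in the number of hidden states are all assumptions made for illustration. In the project, the transition and emission probabilities would be estimated from simulated data rather than written by hand.

```python
import math


def forward_log_likelihood(init, trans, emit, obs):
    """Return log P(obs) under an HMM using the scaled Forward algorithm.

    init[i]     : probability of starting in hidden state i
    trans[i][j] : probability of moving from state i to state j
    emit[i][o]  : probability that state i emits symbol o
    obs         : sequence of observed symbol indices
    Cost is O(len(obs) * K^2) for K hidden states.
    """
    if not obs:
        return 0.0  # the empty sequence has probability 1
    k = len(init)
    # Initialisation: joint probability of the first symbol and each state.
    alpha = [init[i] * emit[i][obs[0]] for i in range(k)]
    log_lik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Recursion: sum over all predecessor states (the K^2 step).
            alpha = [
                emit[j][obs[t]] * sum(alpha[i] * trans[i][j] for i in range(k))
                for j in range(k)
            ]
        # Rescale to avoid underflow and accumulate the log of the scale.
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # observation impossible under the model
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_lik


# Hypothetical toy model: two hidden states, two observable symbols.
INIT = [0.6, 0.4]
TRANS = [[0.7, 0.3], [0.4, 0.6]]
EMIT = [[0.9, 0.1], [0.2, 0.8]]

if __name__ == "__main__":
    print(forward_log_likelihood(INIT, TRANS, EMIT, [0, 1, 0]))
```

Rescaling at every position keeps the computation stable on long sequences, which matters for genome-length alignments where unscaled probabilities would underflow.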
Project outcomes
Journal articles (4)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Comparing inference under the multispecies coalescent with and without recombination
- DOI: 10.1016/j.ympev.2023.107724
- Published: 2023-02-03
- Journal:
- Impact factor: 4.1
- Authors: Yan, Zhi; Ogilvie, Huw A.; Nakhleh, Luay
- Corresponding author: Nakhleh, Luay
Other publications by Luay Nakhleh
A survey of computational approaches for characterizing microbial interactions in microbial mats
- DOI: 10.1186/s13059-025-03634-2
- Published: 2025-06-16
- Journal:
- Impact factor: 9.400
- Authors: Vanesa L. Perillo; Michael Nute; Nicolae Sapoval; Kristen D. Curry; Logan Golia; Yongze Yin; Huw A. Ogilvie; Luay Nakhleh; Santiago Segarra; Devaki Bhaya; Diana G. Cuadrado; Todd J. Treangen
- Corresponding author: Todd J. Treangen
Comments on the model parameters in “SiFit: inferring tumor trees from single-cell sequencing data under finite-sites models”
- DOI: 10.1186/s13059-019-1692-5
- Published: 2019-05-16
- Journal:
- Impact factor: 9.400
- Authors: Hamim Zafar; Anthony Tzen; Nicholas Navin; Ken Chen; Luay Nakhleh
- Corresponding author: Luay Nakhleh
Stranger in a strange land: the experiences of immigrant researchers
- DOI: 10.1186/s13059-017-1370-4
- Published: 2017-12-01
- Journal:
- Impact factor: 9.400
- Authors: Sophien Kamoun; Rosa Lozano-Durán; Luay Nakhleh
- Corresponding author: Luay Nakhleh
Other grants by Luay Nakhleh
DMS/NIGMS 2: Scalable Bayesian Inference with Applications to Phylogenetics
- Award number: 2153704
- Fiscal year: 2022
- Amount: $753,800
- Award type: Continuing Grant
III: Medium: Scalable Evolutionary Analysis of SNVs and CNAs in Cancer Using Single-Cell DNA Sequencing Data
- Award number: 2106837
- Fiscal year: 2021
- Amount: $753,800
- Award type: Continuing Grant
The AGEP Data Engineering and Science Alliance Model: Training and Resources to Advance Minority Graduate Students and Postdoctoral Researchers into Faculty Careers
- Award number: 1916093
- Fiscal year: 2019
- Amount: $753,800
- Award type: Continuing Grant
III: Small: Models and Methods for Simultaneous Genotyping and Phylogeny Inference from Single-Cell DNA Data
- Award number: 1812822
- Fiscal year: 2018
- Amount: $753,800
- Award type: Standard Grant
AF: Medium: Algorithms for Scalable Phylogenetic Network Inference
- Award number: 1800723
- Fiscal year: 2018
- Amount: $753,800
- Award type: Continuing Grant
AF: Medium: Statistical Inference of Complex Evolutionary Histories
- Award number: 1514177
- Fiscal year: 2015
- Amount: $753,800
- Award type: Continuing Grant
AF: Medium: Algorithmic Foundations for Phylogenetic Networks
- Award number: 1302179
- Fiscal year: 2013
- Amount: $753,800
- Award type: Continuing Grant
ABI Innovation: Collaborative Research: Novel Methodologies for Genome-scale Evolutionary Analysis of Multi-locus Data
- Award number: 1062463
- Fiscal year: 2011
- Amount: $753,800
- Award type: Standard Grant
CAREER: Computational Tools for Evolutionary Analysis of Biological Interaction Networks
- Award number: 0845336
- Fiscal year: 2009
- Amount: $753,800
- Award type: Continuing Grant
SGER: NET HMMs and Their Applications to Biological Network Alignment
- Award number: 0829276
- Fiscal year: 2008
- Amount: $753,800
- Award type: Standard Grant
Similar international grants
REU Site: Program for Access to Training in Health Informatics (PATHI)
- Award number: 2348793
- Fiscal year: 2024
- Amount: $753,800
- Award type: Standard Grant
Travel: IEEE International Conference on Healthcare Informatics (IEEE ICHI 2024) Doctoral Consortium Travel Scholarship
- Award number: 2414093
- Fiscal year: 2024
- Amount: $753,800
- Award type: Standard Grant
Reliable Tensor-Network Fusion Approach to Medical Informatics: Novel Techniques and Benchmarks
- Award number: 24K03005
- Fiscal year: 2024
- Amount: $753,800
- Award type: Grant-in-Aid for Scientific Research (B)
Development of Informatics Materials with an Awareness of the High School-University connection and a Learning Support Environment for Data-Driven Instruction
- Award number: 23H01019
- Fiscal year: 2023
- Amount: $753,800
- Award type: Grant-in-Aid for Scientific Research (B)
Travel: NSF Student Travel Grant for 2023 IEEE-EMBS International Conference on Biomedical and Health Informatics (BHI)
- Award number: 2331680
- Fiscal year: 2023
- Amount: $753,800
- Award type: Standard Grant
CAREER: Transforming Personal Informatics Systems to Support Routine Transitions in Healthy Eating
- Award number: 2414270
- Fiscal year: 2023
- Amount: $753,800
- Award type: Continuing Grant
Pioneering Research of industrial materials informatics for innovative lithium battery anodes
- Award number: 23K18465
- Fiscal year: 2023
- Amount: $753,800
- Award type: Grant-in-Aid for Challenging Research (Exploratory)
Categorical Duality and Semantics Across Mathematics, Informatics and Physics and their Applications to Categorical Machine Learning and Quantum Computing
- Award number: 23K13008
- Fiscal year: 2023
- Amount: $753,800
- Award type: Grant-in-Aid for Early-Career Scientists
ACTS (AD Clinical Trial Simulation): Developing Advanced Informatics Approaches for an Alzheimer's Disease Clinical Trial Simulation System
- Award number: 10753675
- Fiscal year: 2023
- Amount: $753,800
- Award type:
CAREER: Transforming Personal Informatics Systems to Support Routine Transitions in Healthy Eating
- Award number: 2239727
- Fiscal year: 2023
- Amount: $753,800
- Award type: Continuing Grant