CAREER: Future phylogenies: novel computational frameworks for biomolecular sequence analysis involving complex evolutionary origins

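Basic information

  • Award number:
    2144121
  • Principal investigator:
  • Amount:
    $585,700
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Continuing Grant
  • Fiscal year:
    2022
  • Funding country:
    United States
  • Start and end dates:
    2022-03-01 to 2027-02-28
  • Project status:
    Ongoing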

Project abstract

This award is funded in whole or in part under the American Rescue Plan Act of 2021 (Public Law 117-2).
Phylogenetics is the discipline that seeks to reconstruct and analyze the phylogeny, or evolutionary history, of a set of organisms. Phylogenetic reconstruction is primarily accomplished through computational analysis of DNA and other biomolecular sequence data. Phylogenies and the evolutionary insights that they provide are essential to biology and other disciplines, as well as many applications: important examples include reconstructing and studying the Tree of Life - the evolutionary history of all life on Earth, understanding human origins, infectious disease epidemiology and discovery of new solutions to future pandemics, crop improvement and agriculture, and forensic science. One of the two key ingredients needed for phylogenetic studies has seen a major leap forward thanks to advances in biomolecular sequencing technology: the scale of available biomolecular data is now among the largest in any domain and, in 2025, biomolecular data velocity and storage are projected to be comparable to or larger than those of Twitter and YouTube. On the other hand, recent "big data" phylogenetic studies point to a critical gap regarding the second of the two key ingredients in phylogenetics: existing computational algorithms need to move beyond their traditional simplifying assumptions about biomolecular sequence evolution. Two of the most important assumptions are: (1) "sequence-unaware" methods that ignore the inherently sequential nature of biomolecular sequences, and (2) the pre hoc assumption that evolutionary relationships have a simple branching structure and are "tree-like" - i.e., can be accurately described by a tree or other simple representation. New computational approaches and infrastructure are needed to move beyond these traditional assumptions and unlock the study of "future phylogenies" and next-generation phylogenetics.
This project will therefore create new pathbreaking models and algorithms for complex phylogenetic analyses of biomolecular sequence data. The project also addresses gaps in STEM education through new curriculum development and a collaboration with the Impression 5 Science Center, a children’s science museum in mid-Michigan. Project impacts will be broadened through open-source software distributions and open data resources, new scientific discoveries enabled by the developed software and data infrastructure, scientific outreach activities, and student training and mentoring with a strong emphasis on diversity, equity, and inclusion (DEI).
This project will advance the field of computational phylogenetics along multiple frontiers. The first research objective is to develop new statistical resampling algorithms that move beyond "uninformed" analysis where biomolecular data are assumed to be independent and identically distributed (i.i.d.), and towards "informed" sequence-aware analysis; a central approach will be to make use of the latest advances in machine learning. The new algorithms will be used to better assess rigor and reproducibility during phylogenetic analyses and other critical-path analytical tasks. The second research objective is to create mathematical theory, statistical models, and computational algorithms to move beyond traditional phylogenetic representations (e.g., phylogenetic trees), and towards more general graph-theoretic models of complex genome evolution. The third research objective is to conduct comprehensive validation and performance assessment studies of the first two research objectives’ computational frameworks. The studies will utilize both synthetic and empirical benchmarking datasets that capture a wide range of evolutionary conditions and dataset features.
The project also includes two educational objectives: a new course on DEI topics in interdisciplinary computer science, and a new museum exhibit on technology and computer programming that will be exhibited at the Impression 5 Science Center. Open-source software and open data deliverables will drive future methodological research and enable otherwise inaccessible scientific discoveries, and scientific outreach will help seed and drive uptake of the project’s contributions. The project also includes student training and mentoring activities at the undergraduate and graduate levels. Project deliverables and other results can be found at https://gitlab.msu.edu/liulab.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
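The contrast the abstract draws between i.i.d. resampling and sequence-aware resampling can be illustrated with a small sketch. It is not the project's method: the block bootstrap below is a generic, well-known stand-in, and the function names, the toy alignment, and the `block_len` parameter are all assumptions made for illustration.

```python
import random


def iid_bootstrap(alignment, rng):
    """Resample alignment columns independently with replacement (i.i.d. assumption)."""
    n_cols = len(alignment[0])
    cols = [rng.randrange(n_cols) for _ in range(n_cols)]
    return ["".join(row[c] for c in cols) for row in alignment]


def block_bootstrap(alignment, block_len, rng):
    """Resample contiguous blocks of columns so that neighboring sites stay together."""
    n_cols = len(alignment[0])
    if not 1 <= block_len <= n_cols:
        raise ValueError("block_len must be between 1 and the alignment length")
    cols = []
    while len(cols) < n_cols:
        start = rng.randrange(n_cols - block_len + 1)
        cols.extend(range(start, start + block_len))
    cols = cols[:n_cols]  # trim the final block back to the original length
    return ["".join(row[c] for c in cols) for row in alignment]


# Hypothetical toy alignment: three taxa, eight aligned sites (made-up data).
toy = ["ACGTACGT", "ACGTTCGA", "ACCTACGA"]
rng = random.Random(0)
iid_rep = iid_bootstrap(toy, rng)
block_rep = block_bootstrap(toy, block_len=4, rng=rng)
```

In the i.i.d. replicate every site is drawn on its own, so any dependence between adjacent sites is lost; the block replicate keeps runs of `block_len` adjacent sites intact, which is one simple way a resampling scheme can respect sequence order.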

Project outcomes

Journal articles (6)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
The impact of gene sequence alignment and gene tree estimation error on summary-based species network estimation
  • DOI:
    10.1145/3535508.3545559
  • Publication date:
    2022
  • Journal:
  • Impact factor:
    0
  • Authors:
    Gao, Meijun; Wang, Wei; Liu, Kevin J.
  • Corresponding author:
    Liu, Kevin J.
The Impact of Species Tree Estimation Error on Cophylogenetic Reconstruction
  • DOI:
    10.1145/3584371.3612964
  • Publication date:
    2023
  • Journal:
  • Impact factor:
    0
  • Authors:
    Zheng, Julia; Nishida, Yuya; Okrasinska, Alicja; Bonito, Gregory M.; Heath-Heckman, Elizabeth A.; Liu, Kevin J.
  • Corresponding author:
    Liu, Kevin J.
Reconstructing Phylogenies Using Branch-Variable Substitution Models and Unaligned Biomolecular Sequences: A Performance Study and New Resampling Method
  • DOI:
    10.1145/3584371.3613011
  • Publication date:
    2023
  • Journal:
  • Impact factor:
    0
  • Authors:
    Doko, Rei; Liu, Kevin
  • Corresponding author:
    Liu, Kevin

Other publications by Kevin Liu

Medulloblastoma. Treatment results.
Characterizing Planar Tanglegram Layouts and Applications to Edge Insertion Problems
Permutation Statistics in Conjugacy Classes of the Symmetric Group
  • DOI:
  • Publication date:
    2023
  • Journal:
  • Impact factor:
    0
  • Authors:
    Michael Levet; Kevin Liu; Jesse Campion Loth; E. Stucky; S. Sundaram; Mei Yin
  • Corresponding author:
    Mei Yin
Collisions, rebounds and skimming
632: The FLASH Effect is dependent on Dose per Pulse and not Mean Dose Rate for Abdominal Irradiations
  • DOI:
    10.1016/s0167-8140(24)01200-3
  • Publication date:
    2024-05-01
  • Journal:
  • Impact factor:
    5.300
  • Authors:
    Kevin Liu; Trey Waldrop; Edgardo Aguilar; Nefititi Mims; Denae Neill; Abagail Delahoussaye; Cullen Taniguchi; Devarati Mitra; Emil Schueler
  • Corresponding author:
    Emil Schueler

Other grants by Kevin Liu

AF: Small: Fast and accurate computational tools for large-scale evolutionary inference: a phylogenetic network approach
  • Award number:
    1714417
  • Fiscal year:
    2017
  • Funding amount:
    $585,700
  • Project type:
    Standard Grant
CRII: AF: Novel evolutionary models and algorithms to connect genomic sequence and phenotypic data
  • Award number:
    1565719
  • Fiscal year:
    2016
  • Funding amount:
    $585,700
  • Project type:
    Standard Grant

Similar overseas grants

Home helper robots: Understanding our future lives with human-like AI
  • Award number:
    FT230100021
  • Fiscal year:
    2025
  • Funding amount:
    $585,700
  • Project type:
    ARC Future Fellowships
FABB-HVDC (Future Aerospace power conversion Building Blocks for High Voltage DC electrical power systems)
  • Award number:
    10079892
  • Fiscal year:
    2024
  • Funding amount:
    $585,700
  • Project type:
    Legacy Department of Trade & Industry
Human-Robot Co-Evolution: Achieving the full potential of future workplaces
  • Award number:
    DP240100938
  • Fiscal year:
    2024
  • Funding amount:
    $585,700
  • Project type:
    Discovery Projects
Cities as transformative agents for a climate-safe future
  • Award number:
    FL230100021
  • Fiscal year:
    2024
  • Funding amount:
    $585,700
  • Project type:
    Australian Laureate Fellowships
Cloud immersion and the future of tropical montane forests
  • Award number:
    EP/Y027736/1
  • Fiscal year:
    2024
  • Funding amount:
    $585,700
  • Project type:
    Fellowship
Mem-Fast Membranes as Enablers for Future Biorefineries: from Fabrication to Advanced Separation Technologies
  • Award number:
    EP/Y032004/1
  • Fiscal year:
    2024
  • Funding amount:
    $585,700
  • Project type:
    Research Grant
International Centre-to-Centre Collaboration: New catalysts for acetylene processes enabling a sustainable future
  • Award number:
    EP/Z531285/1
  • Fiscal year:
    2024
  • Funding amount:
    $585,700
  • Project type:
    Research Grant
Addressing the complexity of future power system dynamic behaviour
  • Award number:
    MR/S034420/2
  • Fiscal year:
    2024
  • Funding amount:
    $585,700
  • Project type:
    Fellowship
CAREER: Advances to the EMT Modeling and Simulation of Restoration Processes for Future Grids
  • Award number:
    2338621
  • Fiscal year:
    2024
  • Funding amount:
    $585,700
  • Project type:
    Continuing Grant
Securing the Future: Inclusive Cybersecurity Education for All
  • Award number:
    2350448
  • Fiscal year:
    2024
  • Funding amount:
    $585,700
  • Project type:
    Standard Grant