III: Small: Expressiveness of Genome Graphs: Construction, Comparison, and Heterogeneity

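Basic information

  • Award number:
    2232121
  • Principal investigator:
  • Amount:
    $600,000
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Standard Grant
  • Fiscal year:
    2023
  • Funding country:
    United States
  • Start and end dates:
    2023-04-01 to 2026-03-31
  • Project status:
    Ongoing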

Project abstract

Differences (also known as variants) in a person's genome help to determine specific characteristics such as their susceptibility to disease, their response to drugs, and other significant aspects of their biology. Similarly, differences in the genomes of bacteria and viruses help to determine their specific characteristics such as, for example, whether they are harmful to humans or animals. These genetic differences are important to understand and to take into account when studying the biology of an organism because they play such an important role in how even individual cells in the organism function. Advances in genome sequencing technology have generated huge catalogs of such differences in many organisms, including humans. These rich repositories of genomic information cannot be fully integrated and analyzed due to a lack of effective computational methods, and existing methods suffer from computational inefficiencies and lapses of accuracy when drawing conclusions from collections of genomic differences. This project will develop new computational methods to increase accuracy and decrease computational resource requirements for storing, comparing, and evaluating catalogs of genomic differences. It will result in new scientific software that will better organize catalogs of differences to make computational analyses more tractable. It will also result in software that more accurately measures the diversity of a population of individuals and software that supports making better comparisons between populations. The project will validate these methods by subtyping cancer tumors, assessing the diversity of cells in various types of tumors, and by comparing populations of bacteria found in different environments. The project will result in faster, more accurate software for the analysis of many genomic differences that will advance our understanding of how genomic variants affect human health and biological processes. 
To better explain the innovations developed during this project and the importance of studying genomic differences, the project will also produce a series of educational videos that will help other people understand the main ideas behind the techniques developed in this project.
Genome graphs have emerged as an important data structure in the analysis of collections of genomic variants. These are graphs in which nodes (or edges) are labeled with genomic sequences (strings) and paths in the graph represent substrings that are present in the population that the graph represents. They can be used as representations of a “reference” genome for a population of organisms. Genome graphs have been used to reduce bias in the reference genome, form more inclusive reference genomes, and to reduce space and time requirements to perform genomic sequence analyses. For this reason, many tools are being adapted to use genome graphs as references in lieu of traditional linear (single sequence) references. While genome graphs have consistently proved useful in these areas, the algorithms for a number of problems associated with them suffer from poor computational scaling and lack of formalization. The project will develop and validate algorithms for several central genome graph problems, specifically to (goal 1) construct genome graphs, to (goal 2) compare genome graphs, and to (goal 3) assess the complexity of genome graphs. The framework that the project will use to solve these problems is innovative in that it involves exploiting the under-explored connection between graph flow decompositions and genome graphs. This approach reveals natural relationships between genome graphs and the population of strings they represent. This global view of the expressive power of a genome graph is central to the formulations that the project will explore.
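As a rough illustration of the ideas in the paragraph above, the following minimal sketch builds a toy node-labeled genome graph, spells the strings represented by its paths, and splits an edge flow into weighted paths. Everything in it is an assumption made for illustration: the `labels` and `edges` dictionaries, the function names `spell_paths` and `decompose_flow`, and the example flow values are hypothetical, and the greedy decomposition is a standard textbook technique, not the formulation the project develops.

```python
# Hypothetical toy genome graph: node id -> sequence label.
labels = {1: "ACG", 2: "T", 3: "G", 4: "TTA"}
# Directed acyclic edges; nodes 2 and 3 form a "bubble" (a single-base variant).
edges = {1: [2, 3], 2: [4], 3: [4], 4: []}


def spell_paths(node, sink):
    """Yield the string spelled by every path from `node` to `sink`."""
    if node == sink:
        yield labels[node]
        return
    for nxt in edges[node]:
        for suffix in spell_paths(nxt, sink):
            yield labels[node] + suffix


def decompose_flow(flow, source, sink):
    """Greedily split an edge flow on a DAG into weighted source-to-sink paths.

    `flow` maps (u, v) -> non-negative integer and is assumed to satisfy
    flow conservation at every node other than `source` and `sink`.
    Returns a list of (weight, spelled string) pairs.
    """
    flow = dict(flow)  # work on a copy so the caller's flow is untouched
    paths = []
    while True:
        path, node = [source], source
        while node != sink:
            # Follow any outgoing edge that still carries flow.
            nxt = next((v for v in edges[node] if flow.get((node, v), 0) > 0), None)
            if nxt is None:
                break
            path.append(nxt)
            node = nxt
        if node != sink:
            return paths  # no flow left to route
        steps = list(zip(path, path[1:]))
        weight = min(flow[step] for step in steps)  # bottleneck of this path
        for step in steps:
            flow[step] -= weight
        paths.append((weight, "".join(labels[n] for n in path)))


# Hypothetical flow: three genomes take the "T" branch, one takes the "G" branch.
flow = {(1, 2): 3, (1, 3): 1, (2, 4): 3, (3, 4): 1}
print(sorted(spell_paths(1, 4)))
print(decompose_flow(flow, 1, 4))
```

In this toy case the decomposition recovers one weighted string per branch of the bubble, which is one simple way to picture how a flow on a genome graph relates to the population of strings the graph represents.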
The problems that the project will tackle bridge graph theory and genomics, leading to greater interactions and connections between those fields. Our algorithms will allow genome graphs to more accurately reflect desired populations, will allow information from multiple genomes to be better integrated, and will advance the informatics tools needed to exploit large collections of genomic variants. The project will apply and evaluate these algorithms to (1) improve sequence alignment for mapping populations of genomes, (2) improve clustering of cancer tumor sequences and metagenomic samples, and (3) better model the progression of heterogeneity in metastatic cancer samples. The developed algorithms will be implemented in an open-source library to encourage their use in other systems. Finally, the project will create open-source, free instructional videos to introduce concepts such as pan-genomics, genome graphs, and the developed algorithms to a wider audience.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project outcomes

Journal articles (4)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Computationally Efficient High-Dimensional Bayesian Optimization via Variable Selection
  • DOI:
  • Publication date:
    2023
  • Journal:
  • Impact factor:
    0
  • Authors:
    Shen, Yihang; Kingsford, Carl
  • Corresponding author:
    Kingsford, Carl
Reinforcement Learning for Robotic Liquid Handler Planning
  • DOI:
  • Publication date:
    2023
  • Journal:
  • Impact factor:
    0
  • Authors:
    Ferdosi, Mohsen; Ge, Yuejun; Kingsford, Carl
  • Corresponding author:
    Kingsford, Carl
Revisiting the complexity of and algorithms for the graph traversal edit distance and its variants
  • DOI:
    10.1186/s13015-024-00262-6
  • Publication date:
    2024-04-29
  • Journal:
  • Impact factor:
    1
  • Authors:
    Qiu, Yutong; Shen, Yihang; Kingsford, Carl
  • Corresponding author:
    Kingsford, Carl

Other grants by Carleton Kingsford

Conference: NSF-NIH Joint Workshop on Foundational AI in Biology
  • Award number:
    2325301
  • Fiscal year:
    2023
  • Funding amount:
    $600,000
  • Project type:
    Standard Grant
IIBR: Informatics: Toward an Automated RNA-seq Bioinformatician
  • Award number:
    1937540
  • Fiscal year:
    2020
  • Funding amount:
    $600,000
  • Project type:
    Standard Grant
Workshop on Future Directions for Algorithms in Biology
  • Award number:
    1748493
  • Fiscal year:
    2017
  • Funding amount:
    $600,000
  • Project type:
    Standard Grant
AF: Small: Multiscale Spectral Signatures for Local and Multi-objective Biological Network Alignment
  • Award number:
    1319998
  • Fiscal year:
    2013
  • Funding amount:
    $600,000
  • Project type:
    Standard Grant
CAREER: Model-based Reconstruction of Ancient Biological Networks
  • Award number:
    1256087
  • Fiscal year:
    2012
  • Funding amount:
    $600,000
  • Project type:
    Continuing Grant
CAREER: Model-based Reconstruction of Ancient Biological Networks
  • Award number:
    1053918
  • Fiscal year:
    2011
  • Funding amount:
    $600,000
  • Project type:
    Continuing Grant

Similar grants from the National Natural Science Foundation of China (NSFC)

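Forensic application of circadian-rhythm small RNAs in estimating the time of bloodstain formation
  • Award number:
  • Approval year:
    2024
  • Funding amount:
    CNY 0
  • Project type:
    Provincial/municipal project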
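Mechanism by which tRNA-derived small RNA upregulates the YBX1/CCL5 pathway in bortezomib-induced chronic pain
  • Award number:
  • Approval year:
    2022
  • Funding amount:
    CNY 100,000
  • Project type:
    Provincial/municipal project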
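Small RNA regulation of the type I-F CRISPR-Cas adaptive immune response and its molecular mechanism
  • Award number:
    32000033
  • Approval year:
    2020
  • Funding amount:
    CNY 240,000
  • Project type:
    Young Scientists Fund project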
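Mechanisms by which small RNAs regulate the biocontrol function of Bacillus amyloliquefaciens FZB42
  • Award number:
    31972324
  • Approval year:
    2019
  • Funding amount:
    CNY 580,000
  • Project type:
    General Program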
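Mechanisms by which Streptococcus mutans small RNAs link LuxS quorum sensing to biofilm formation
  • Award number:
    81900988
  • Approval year:
    2019
  • Funding amount:
    CNY 210,000
  • Project type:
    Young Scientists Fund project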
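Functions and mechanisms of key gut bacterial small RNAs in the onset and progression of Crohn's disease
  • Award number:
    31870821
  • Approval year:
    2018
  • Funding amount:
    CNY 560,000
  • Project type:
    General Program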
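Molecular mechanism of pigeon crop milk secretion, analyzed by small RNA sequencing
  • Award number:
    31802058
  • Approval year:
    2018
  • Funding amount:
    CNY 260,000
  • Project type:
    Young Scientists Fund project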
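Pathogenic mechanism of rice grassy stunt virus regulated by small RNA-mediated DNA methylation
  • Award number:
    31772128
  • Approval year:
    2017
  • Funding amount:
    CNY 600,000
  • Project type:
    General Program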
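Immunoregulatory mechanism of acupuncture treatment for Hashimoto's thyroiditis, based on small RNA-seq
  • Award number:
    81704176
  • Approval year:
    2017
  • Funding amount:
    CNY 200,000
  • Project type:
    Young Scientists Fund project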
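Regulation of small RNA biogenesis by rice OsSGS3 and OsHEN1 and its effect on disease resistance
  • Award number:
    91640114
  • Approval year:
    2016
  • Funding amount:
    CNY 850,000
  • Project type:
    Major Research Plan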

Similar overseas grants

Powering Small Craft with a Novel Ammonia Engine
  • Award number:
    10099896
  • Fiscal year:
    2024
  • Funding amount:
    $600,000
  • Project type:
    Collaborative R&D
"Small performances": investigating the typographic punches of John Baskerville (1707-75) through heritage science and practice-based research
  • Award number:
    AH/X011747/1
  • Fiscal year:
    2024
  • Funding amount:
    $600,000
  • Project type:
    Research Grant
Fragment to small molecule hit discovery targeting Mycobacterium tuberculosis FtsZ
  • Award number:
    MR/Z503757/1
  • Fiscal year:
    2024
  • Funding amount:
    $600,000
  • Project type:
    Research Grant
Bacteriophage control of host cell DNA transactions by small ORF proteins
  • Award number:
    BB/Y004426/1
  • Fiscal year:
    2024
  • Funding amount:
    $600,000
  • Project type:
    Research Grant
Windows for the Small-Sized Telescope (SST) Cameras of the Cherenkov Telescope Array (CTA)
  • Award number:
    ST/Z000017/1
  • Fiscal year:
    2024
  • Funding amount:
    $600,000
  • Project type:
    Research Grant
CSR: Small: Leveraging Physical Side-Channels for Good
  • Award number:
    2312089
  • Fiscal year:
    2024
  • Funding amount:
    $600,000
  • Project type:
    Standard Grant
CSR: Small: Multi-FPGA System for Real-time Fraud Detection with Large-scale Dynamic Graphs
  • Award number:
    2317251
  • Fiscal year:
    2024
  • Funding amount:
    $600,000
  • Project type:
    Standard Grant
AF: Small: Problems in Algorithmic Game Theory for Online Markets
  • Award number:
    2332922
  • Fiscal year:
    2024
  • Funding amount:
    $600,000
  • Project type:
    Standard Grant
Collaborative Research: FET: Small: Algorithmic Self-Assembly with Crisscross Slats
  • Award number:
    2329908
  • Fiscal year:
    2024
  • Funding amount:
    $600,000
  • Project type:
    Standard Grant
NeTS: Small: ML-Driven Online Traffic Analysis at Multi-Terabit Line Rates
  • Award number:
    2331111
  • Fiscal year:
    2024
  • Funding amount:
    $600,000
  • Project type:
    Standard Grant