III: Small: Expressiveness of Genome Graphs: Construction, Comparison, and Heterogeneity
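Basic information
- Award number: 2232121
- Principal investigator:
- Amount: $600,000
- Host institution:
- Host institution country: United States
- Project category: Standard Grant
- Fiscal year: 2023
- Funding country: United States
- Project period: 2023-04-01 to 2026-03-31
- Project status: Not yet concluded
- Source:
- Keywords:
Project abstract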
Differences (also known as variants) in a person's genome help to determine specific characteristics such as their susceptibility to disease, their response to drugs, and other significant aspects of their biology. Similarly, differences in the genomes of bacteria and viruses help to determine their specific characteristics such as, for example, whether they are harmful to humans or animals. These genetic differences are important to understand and to take into account when studying the biology of an organism because they play such an important role in how even individual cells in the organism function. Advances in genome sequencing technology have generated huge catalogs of such differences in many organisms, including humans. These rich repositories of genomic information cannot be fully integrated and analyzed due to a lack of effective computational methods, and existing methods suffer from computational inefficiencies and lapses of accuracy when drawing conclusions from collections of genomic differences. This project will develop new computational methods to increase accuracy and decrease computational resource requirements for storing, comparing, and evaluating catalogs of genomic differences. It will result in new scientific software that will better organize catalogs of differences to make computational analyses more tractable. It will also result in software that more accurately measures the diversity of a population of individuals and software that supports making better comparisons between populations. The project will validate these methods by subtyping cancer tumors, assessing the diversity of cells in various types of tumors, and by comparing populations of bacteria found in different environments. The project will result in faster, more accurate software for the analysis of many genomic differences that will advance our understanding of how genomic variants affect human health and biological processes. To better explain the innovations developed during this project and the importance of studying genomic differences, the project will also produce a series of educational videos that will help other people understand the main ideas behind the techniques developed in this project.
Genome graphs have emerged as an important data structure in the analysis of collections of genomic variants. These are graphs in which nodes (or edges) are labeled with genomic sequences (strings) and paths in the graph represent substrings that are present in the population that the graph represents. They can be used as representations of a “reference” genome for a population of organisms. Genome graphs have been used to reduce bias in the reference genome, form more inclusive reference genomes, and to reduce space and time requirements to perform genomic sequence analyses. For this reason, many tools are being adapted to use genome graphs as references in lieu of traditional linear (single sequence) references. While genome graphs have consistently proved useful in these areas, the algorithms for a number of problems associated with them suffer from poor computational scaling and lack of formalization. The project will develop and validate algorithms for several central genome graph problems, specifically to (goal 1) construct genome graphs, to (goal 2) compare genome graphs, and to (goal 3) assess the complexity of genome graphs. The framework that the project will use to solve these problems is innovative in that it involves exploiting the under-explored connection between graph flow decompositions and genome graphs. This approach reveals natural relationships between genome graphs and the population of strings they represent. This global view of the expressive power of a genome graph is central to the formulations that the project will explore.
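To make the link between flow decompositions and the strings a genome graph represents more concrete, here is a minimal Python sketch. It is not the project's algorithm: the toy graph, the names `labels`, `flows`, `spell`, and `greedy_flow_decomposition`, and the greedy peeling strategy are all assumptions chosen for illustration. Nodes carry sequence labels, each edge carries a flow (read here as the number of genomes traversing it), and the flow is split into weighted source-to-sink paths, each of which spells one string.

```python
from collections import defaultdict

# Hypothetical toy genome graph: node id -> sequence label.
labels = {"s": "AC", "a": "G", "b": "T", "t": "CA"}

# Hypothetical edge flows: (u, v) -> number of genomes that traverse the edge.
flows = {("s", "a"): 3, ("s", "b"): 2, ("a", "t"): 3, ("b", "t"): 2}


def spell(path, labels):
    """Concatenate node labels along a path to get the string it represents."""
    return "".join(labels[node] for node in path)


def greedy_flow_decomposition(flows, source, sink):
    """Greedily peel weighted source-to-sink paths off a flow on a DAG.

    Assumes flow is conserved at every internal node. Greedy peeling gives
    a valid decomposition, but not necessarily one with the fewest paths.
    """
    residual = dict(flows)
    out_edges = defaultdict(list)
    for u, v in flows:
        out_edges[u].append(v)

    decomposition = []
    while True:
        path = [source]
        current = source
        while current != sink:
            # Out-neighbors still reachable through positive residual flow.
            candidates = [v for v in out_edges[current] if residual[(current, v)] > 0]
            if not candidates:
                break
            # Follow the edge with the largest remaining flow.
            current = max(candidates, key=lambda v, u=current: residual[(u, v)])
            path.append(current)
        if path[-1] != sink:
            break  # no flow left to route from the source
        edges = list(zip(path, path[1:]))
        weight = min(residual[e] for e in edges)  # bottleneck along the path
        for e in edges:
            residual[e] -= weight
        decomposition.append((path, weight))
    return decomposition


for path, weight in greedy_flow_decomposition(flows, "s", "t"):
    print(spell(path, labels), weight)
```

On this toy input the decomposition yields two paths, spelling `ACGCA` with weight 3 and `ACTCA` with weight 2, so the weighted paths can be read as a small population of strings that the graph expresses.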
The problems that the project will tackle bridge graph theory and genomics, leading to greater interactions and connections between those fields. Our algorithms will allow genome graphs to more accurately reflect desired populations, will allow information from multiple genomes to be better integrated, and will advance the informatics tools needed to exploit large collections of genomic variants. The project will apply and evaluate these algorithms to (1) improve sequence alignment for mapping populations of genomes, (2) improve clustering of cancer tumor sequences and metagenomic samples, and (3) better model the progression of heterogeneity in metastatic cancer samples. The developed algorithms will be implemented in an open-source library to encourage their use in other systems. Finally, the project will create open-source, free instructional videos to introduce concepts such as pan-genomics, genome graphs, and the developed algorithms to a wider audience.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Project outcomes
Journal articles (4)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Computationally Efficient High-Dimensional Bayesian Optimization via Variable Selection
- DOI:
- Published: 2023
- Journal:
- Impact factor: 0
- Authors: Shen, Yihang; Kingsford, Carl
- Corresponding author: Kingsford, Carl
Reinforcement Learning for Robotic Liquid Handler Planning
- DOI:
- Published: 2023
- Journal:
- Impact factor: 0
- Authors: Ferdosi, Mohsen; Ge, Yuejun; Kingsford, Carl
- Corresponding author: Kingsford, Carl
Revisiting the complexity of and algorithms for the graph traversal edit distance and its variants
- DOI: 10.1186/s13015-024-00262-6
- Published: 2024-04-29
- Journal:
- Impact factor: 1
- Authors: Qiu, Yutong; Shen, Yihang; Kingsford, Carl
- Corresponding author: Kingsford, Carl
Other grants by Carleton Kingsford
Conference: NSF-NIH Joint Workshop on Foundational AI in Biology
- Award number: 2325301
- Fiscal year: 2023
- Award amount: $600,000
- Project category: Standard Grant
IIBR: Informatics: Toward an Automated RNA-seq Bioinformatician
- Award number: 1937540
- Fiscal year: 2020
- Award amount: $600,000
- Project category: Standard Grant
Workshop on Future Directions for Algorithms in Biology
- Award number: 1748493
- Fiscal year: 2017
- Award amount: $600,000
- Project category: Standard Grant
AF: Small: Multiscale Spectral Signatures for Local and Multi-objective Biological Network Alignment
- Award number: 1319998
- Fiscal year: 2013
- Award amount: $600,000
- Project category: Standard Grant
CAREER: Model-based Reconstruction of Ancient Biological Networks
- Award number: 1256087
- Fiscal year: 2012
- Award amount: $600,000
- Project category: Continuing Grant
CAREER: Model-based Reconstruction of Ancient Biological Networks
- Award number: 1053918
- Fiscal year: 2011
- Award amount: $600,000
- Project category: Continuing Grant
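Similar grants from the National Natural Science Foundation of China
Forensic application of circadian-rhythm small RNAs in estimating the time of bloodstain formation
- Award number:
- Approval year: 2024
- Award amount: CNY 0
- Project category: Provincial/municipal project
Mechanism by which tRNA-derived small RNA upregulates the YBX1/CCL5 pathway in bortezomib-induced chronic pain
- Award number: n/a
- Approval year: 2022
- Award amount: CNY 100,000
- Project category: Provincial/municipal project
Small RNA regulation of the type I-F CRISPR-Cas adaptive immune response and its molecular mechanism
- Award number: 32000033
- Approval year: 2020
- Award amount: CNY 240,000
- Project category: Young Scientists Fund
Mechanism by which small RNAs regulate the biocontrol function of Bacillus amyloliquefaciens FZB42
- Award number: 31972324
- Approval year: 2019
- Award amount: CNY 580,000
- Project category: General Program
Mechanism by which Streptococcus mutans small RNAs link LuxS quorum sensing to biofilm formation
- Award number: 81900988
- Approval year: 2019
- Award amount: CNY 210,000
- Project category: Young Scientists Fund
Molecular mechanism of pigeon crop-milk secretion analyzed by small RNA sequencing
- Award number: 31802058
- Approval year: 2018
- Award amount: CNY 260,000
- Project category: Young Scientists Fund
Function and mechanism of key gut-bacterial small RNAs in the onset and progression of Crohn's disease
- Award number: 31870821
- Approval year: 2018
- Award amount: CNY 560,000
- Project category: General Program
Pathogenic mechanism of rice grassy stunt virus regulated by small RNA-mediated DNA methylation
- Award number: 31772128
- Approval year: 2017
- Award amount: CNY 600,000
- Project category: General Program
Immune-regulatory mechanism of acupuncture treatment for Hashimoto's thyroiditis based on small RNA-seq
- Award number: 81704176
- Approval year: 2017
- Award amount: CNY 200,000
- Project category: Young Scientists Fund
Regulation of small RNA biogenesis by rice OsSGS3 and OsHEN1 and its modulation of disease resistance
- Award number: 91640114
- Approval year: 2016
- Award amount: CNY 850,000
- Project category: Major Research Plan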
Similar overseas grants
CSR: Small: Leveraging Physical Side-Channels for Good
- Award number: 2312089
- Fiscal year: 2024
- Award amount: $600,000
- Project category: Standard Grant
NeTS: Small: NSF-DST: Modernizing Underground Mining Operations with Millimeter-Wave Imaging and Networking
- Award number: 2342833
- Fiscal year: 2024
- Award amount: $600,000
- Project category: Standard Grant
CPS: Small: NSF-DST: Autonomous Operations of Multi-UAV Uncrewed Aerial Systems using Onboard Sensing to Monitor and Track Natural Disaster Events
- Award number: 2343062
- Fiscal year: 2024
- Award amount: $600,000
- Project category: Standard Grant
Collaborative Research: FET: Small: Reservoir Computing with Ion-Channel-Based Memristors
- Award number: 2403559
- Fiscal year: 2024
- Award amount: $600,000
- Project category: Standard Grant
Comprehensive characterization of staphylococcal small colony variants using omics analysis
- Award number: 24K13443
- Fiscal year: 2024
- Award amount: $600,000
- Project category: Grant-in-Aid for Scientific Research (C)
AF: Small: Problems in Algorithmic Game Theory for Online Markets
- Award number: 2332922
- Fiscal year: 2024
- Award amount: $600,000
- Project category: Standard Grant
Collaborative Research: FET: Small: Algorithmic Self-Assembly with Crisscross Slats
- Award number: 2329908
- Fiscal year: 2024
- Award amount: $600,000
- Project category: Standard Grant
NeTS: Small: ML-Driven Online Traffic Analysis at Multi-Terabit Line Rates
- Award number: 2331111
- Fiscal year: 2024
- Award amount: $600,000
- Project category: Standard Grant
Collaborative Research: SHF: Small: LEGAS: Learning Evolving Graphs At Scale
- Award number: 2331302
- Fiscal year: 2024
- Award amount: $600,000
- Project category: Standard Grant
Collaborative Research: SHF: Small: LEGAS: Learning Evolving Graphs At Scale
- Award number: 2331301
- Fiscal year: 2024
- Award amount: $600,000
- Project category: Standard Grant