CSR: Medium: Approximate Membership Query Data Structures in Computational Biology and Storage

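Basic Information

  • Award number:
    2317838
  • Principal investigator:
  • Amount:
    $1.2 million
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Continuing Grant
  • Fiscal year:
    2022
  • Funding country:
    United States
  • Period:
    2022-10-01 to 2024-08-31
  • Project status:
    Completed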

Project Abstract

This project will develop new data structures and software for computational biology and big data storage systems. The data structures created in this project will allow computational biology and big data applications to maintain compact summaries of huge data sets. Because the summaries are small, they can be stored in a computer's fast memory, enabling applications to run much more quickly and to scale to larger data sets. For example, this project will develop a tool for searching through genetic information for thousands (to millions) of individuals to detect genetic variations that are correlated with disease or other traits. A major challenge that this project will address is that applications need compact, feature-rich summary data structures. Applications need summaries that can represent a set of elements, count duplicates in a set of input data, be resized as the data set grows, support deletions of items, be merged with other summaries, and support high concurrency on today's multi-core systems. However, current summary data structures offer limited features. As a result, today's applications must design around these limitations, resulting in software that is slower, uses more memory, and is more complex than necessary.

The project will impact core computer science applications, such as databases and file systems, and medical and biological applications, such as genome and transcriptome analysis. Databases and file systems will run faster and use less memory. They will be able to combine fast, expensive solid-state storage devices with cheap, slow, but capacious hard drives to get the best of both devices: low cost and high performance. Biologists will be able to analyze sequencing data more quickly and cheaply, using fewer computational resources. They will be able to search through huge datasets to make new discoveries.

All papers, documentation, and software created by this project will be released as open source, typically on popular open-source development websites, such as GitHub, under the COMBINE-lab (https://github.com/COMBINE-lab) and splatlab (https://github.com/splatlab) organizations. Papers will be hosted by the publishers, as well as on the authors' personal websites.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
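
The feature list in the abstract (represent a set, count duplicates, support deletions, merge with other summaries) can be made concrete with a small sketch. The code below is a toy counting Bloom filter in Python, chosen only as a familiar stand-in: it is not the data structure this project develops, and the class name, parameters, and hashing scheme are all assumptions made for illustration. Resizing and concurrency, which the abstract also asks for, are not shown.

```python
import hashlib


class CountingBloomFilter:
    """Toy approximate summary (hypothetical, for illustration only).

    Counts may be overestimated but are never underestimated,
    provided remove() is only called for items that were added.
    """

    def __init__(self, num_counters=1024, num_hashes=3):
        self.num_counters = num_counters
        self.num_hashes = num_hashes
        self.counters = [0] * num_counters

    def _slots(self, item):
        # Derive num_hashes counter positions from salted BLAKE2b digests.
        for seed in range(self.num_hashes):
            digest = hashlib.blake2b(
                item.encode("utf-8"),
                digest_size=8,
                salt=seed.to_bytes(8, "little"),
            ).digest()
            yield int.from_bytes(digest, "little") % self.num_counters

    def add(self, item):
        for slot in self._slots(item):
            self.counters[slot] += 1

    def count(self, item):
        # The minimum over the item's counters bounds its true count from above.
        return min(self.counters[slot] for slot in self._slots(item))

    def __contains__(self, item):
        return self.count(item) > 0

    def remove(self, item):
        # Refuse the obvious misuse of deleting something that is not present.
        if self.count(item) == 0:
            raise KeyError(item)
        for slot in self._slots(item):
            self.counters[slot] -= 1

    def merge(self, other):
        # Summaries built with identical parameters merge by adding counters.
        if (self.num_counters, self.num_hashes) != (other.num_counters, other.num_hashes):
            raise ValueError("filters must share the same parameters")
        merged = CountingBloomFilter(self.num_counters, self.num_hashes)
        merged.counters = [a + b for a, b in zip(self.counters, other.counters)]
        return merged


# Usage: summarize two batches of short sequences, then merge the summaries.
a = CountingBloomFilter()
b = CountingBloomFilter()
for kmer in ["ACGT", "ACGT", "TTGA"]:
    a.add(kmer)
b.add("ACGT")
combined = a.merge(b)
a.remove("TTGA")
```

Each item costs a fixed number of counter updates regardless of how large the data set is, which is what lets a summary like this stay in fast memory. The trade-off is that membership and count answers are approximate: a query can report an item that was never inserted, but it will not miss one that was.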

Project Outcomes

Journal articles (14)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Automatic HBM Management: Models and Algorithms
On the optimal time/space tradeoff for hash tables
Fulgor: A Fast and Compact k-mer Index for Large-Scale Matching and Color Queries
Online List Labeling: Breaking the log² n Barrier
Mosaic Pages: Big TLB Reach with Small Pages

Other publications by Robert Patro

Social Snapshot: A System for Temporally Coupled Social Photography
Detecting isoform-level allelic imbalance accounting for inferential uncertainty
  • DOI:
    10.1101/2022.08.12.503785
  • Published:
    2022
  • Journal:
  • Impact factor:
    0
  • Authors:
    Euphy Y. Wu;N. P. Singh;Kwangbom Choi;Mohsen Zakeri;Matt Vincent;G. Churchill;Cheryl L. Ackert;Robert Patro;M. Love
  • Corresponding author:
    M. Love
MDMap: A system for data-driven layout and exploration of molecular dynamics simulations
ChromoVis: Feature-Rich Layouts of Chromosome Conformation Graphs
  • DOI:
  • Published:
    2013
  • Journal:
  • Impact factor:
    0
  • Authors:
    Darya Filippova;Geet Duggal;Robert Patro;Carl Kingsford
  • Corresponding author:
    Carl Kingsford
Modeling and Visualization of Human Activities for Multicamera Networks
  • DOI:
    10.1155/2009/259860
  • Published:
    2009-10-22
  • Journal:
  • Impact factor:
    1.800
  • Authors:
    Aswin C. Sankaranarayanan;Robert Patro;Pavan Turaga;Amitabh Varshney;Rama Chellappa
  • Corresponding author:
    Rama Chellappa

Other grants held by Robert Patro

CAREER: A Comprehensive and Lightweight Framework for Transcriptome Analysis
  • Award number:
    2029424
  • Fiscal year:
    2020
  • Award amount:
    $1.2 million
  • Project type:
    Continuing Grant
CAREER: A Comprehensive and Lightweight Framework for Transcriptome Analysis
  • Award number:
    1750472
  • Fiscal year:
    2018
  • Award amount:
    $1.2 million
  • Project type:
    Continuing Grant
CSR: Medium: Approximate Membership Query Data Structures in Computational Biology and Storage
  • Award number:
    1763680
  • Fiscal year:
    2018
  • Award amount:
    $1.2 million
  • Project type:
    Continuing Grant
Bilateral BBSRC-NSF/BIO: ABI Innovation: Data-driven hierarchical analysis of de novo transcriptomes
  • Award number:
    1564917
  • Fiscal year:
    2016
  • Award amount:
    $1.2 million
  • Project type:
    Standard Grant

Similar overseas grants

Collaborative Research: SHF: Medium: Approximate Computing for Machine Learning Security: Foundations and Accelerator Design
  • Award number:
    2212426
  • Fiscal year:
    2022
  • Award amount:
    $1.2 million
  • Project type:
    Continuing Grant
Collaborative Research: SHF: Medium: Approximate Computing for Machine Learning Security: Foundations and Accelerator Design
  • Award number:
    2212427
  • Fiscal year:
    2022
  • Award amount:
    $1.2 million
  • Project type:
    Continuing Grant
CSR: Medium: Approximate Membership Query Data Structures in Computational Biology and Storage
  • Award number:
    1763680
  • Fiscal year:
    2018
  • Award amount:
    $1.2 million
  • Project type:
    Continuing Grant
CSR: Medium: Optimal Control of Approximate Computing Systems
  • Award number:
    1705092
  • Fiscal year:
    2017
  • Award amount:
    $1.2 million
  • Project type:
    Standard Grant
AF: Medium: Collaborative Research: Sequential and Parallel Algorithms for Approximate Sequence Matching with Applications to Computational Biology
  • Award number:
    1704552
  • Fiscal year:
    2017
  • Award amount:
    $1.2 million
  • Project type:
    Standard Grant
AF: Medium: Collaborative Research: Sequential and Parallel Algorithms for Approximate Sequence Matching with Applications to Computational Biology
  • Award number:
    1703489
  • Fiscal year:
    2017
  • Award amount:
    $1.2 million
  • Project type:
    Standard Grant
AF: Medium: Collaborative Research: The Power of Randomness for Approximate Counting
  • Award number:
    1563838
  • Fiscal year:
    2016
  • Award amount:
    $1.2 million
  • Project type:
    Continuing Grant
AF: Medium: Collaborative Research: The Power of Randomness for Approximate Counting
  • Award number:
    1563757
  • Fiscal year:
    2016
  • Award amount:
    $1.2 million
  • Project type:
    Continuing Grant
AF: Medium: Collaborative Research: Approximate Computational Geometry via Controlled Linear Perturbation
  • Award number:
    0904832
  • Fiscal year:
    2009
  • Award amount:
    $1.2 million
  • Project type:
    Standard Grant
AF: Medium: Collaborative Research: Approximate Computational Geometry via Controlled Linear Perturbation
  • Award number:
    0904707
  • Fiscal year:
    2009
  • Award amount:
    $1.2 million
  • Project type:
    Standard Grant