CAREER: Statistical methods and algorithms for the analysis of combinatorial mass spectrometry data

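Basic Information

  • Award number:
    1845465
  • Principal investigator:
  • Amount:
    $1.0525 million
  • Host institution:
  • Host institution country:
    United States
  • Program type:
    Continuing Grant
  • Fiscal year:
    2019
  • Funding country:
    United States
  • Project period:
    2019-06-01 to 2022-02-28
  • Status:
    Completed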

Project Abstract

Mass spectrometry is a crucial modern research tool that allows analysis of the components of samples at several scales: nuclear, small chemicals and biological molecules. In biological research, mass spectrometry is used in the analysis of protein ("proteomics") and metabolic ("metabolomics") data, while in non-research areas it is deployed, for example, to detect bomb-associated chemicals in routine airport security screenings. This research addresses three unmet needs in the processing of the data from mass spectrometry machines: The first is statistical identification of proteins in a biological sample; this is important for understanding what makes cells different, e.g., what makes a skin cell different from a blood cell. The second is identification of which biological species are in a sample; this is crucial in applications such as, for example, enabling accurate and automated disease diagnostics. The third is finding the "alphabet" of basic molecular ingredients in a sample. This research addresses these aims by developing new algorithmic and statistical methods that can correctly separate the basic elements of a complex mixture. The researchers working on this project create mathematical tools that are implemented as researcher-friendly software tools for solving the listed problems. To help make the ideas more accessible to both scientific and non-scientific audiences, the researchers will create teaching modules and podcast episodes to explain how the algorithms work, and what math tricks were developed to break down the complexity of the problem so it is amenable to a useful solution.

Problems with combinatorial dependencies are ubiquitous in mass spectrometry. Symmetries in combinatorial dependencies can be exploited to construct special dynamic programming algorithms: convolution trees, fast numeric max-convolution, and other approaches, all of which were invented and developed by the researchers. The researchers will use and improve these symmetry-exploiting algorithms to implement superior mass spectrometry-based protein identification, species classification, and small molecule analysis. Convolution trees can be used to solve these problems in quasilinear time, and so they can be applied to a very large number of proteins, species, or small molecules (or to a large number of spectra from any of these problems). The researchers will construct a library of software implementations of these algorithms with permissive open source licensing for unrestricted academic and industrial use. As they further develop these combinatorial methods, the researchers will create a combinatorics curriculum for intuitively teaching these concepts to K-12 students and create podcast episodes explaining these ideas in an accessible manner.

The fruits of this research will be freely available at https://alg.cs.umt.edu/nsf-career.html.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
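The abstract names convolution trees without describing them. As a rough illustration only, and not the project's own implementation, the sketch below shows the general idea behind the forward pass: the distribution of a sum of independent discrete variables is built by merging their distributions pairwise, level by level, as in a balanced binary tree. The function names `convolve` and `sum_distribution` are hypothetical, and the direct convolution used here is a stand-in chosen for readability; reaching quasilinear running time would need an FFT-based convolution in its place.

```python
from fractions import Fraction  # exact arithmetic keeps the example easy to check


def convolve(p, q):
    """Distribution of X + Y for independent X ~ p and Y ~ q (list index = value).

    Direct O(len(p) * len(q)) convolution, used here as a simple stand-in.
    """
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def sum_distribution(pmfs):
    """Merge the distributions pairwise, one tree level at a time."""
    if not pmfs:
        raise ValueError("need at least one distribution")
    layer = [list(p) for p in pmfs]
    while len(layer) > 1:
        merged = [convolve(layer[i], layer[i + 1])
                  for i in range(0, len(layer) - 1, 2)]
        if len(layer) % 2:  # an unpaired distribution moves up unchanged
            merged.append(layer[-1])
        layer = merged
    return layer[0]


# Hypothetical input: four independent fair 0/1 indicators
# (for example, "is this candidate protein present?").
half = Fraction(1, 2)
total = sum_distribution([[half, half]] * 4)
```

Here `total[k]` is the probability that exactly `k` of the four indicators equal 1. Merging in a balanced tree rather than left to right keeps the two operands of each convolution at similar lengths, which is the setting where a fast convolution pays off.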

Project Outcomes

Journal articles (6)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Performing Selection on a Monotonic Function in Lieu of Sorting Using Layer-Ordered Heaps
  • DOI:
    10.1021/acs.jproteome.0c00711
  • Published:
    2021
  • Journal:
  • Impact factor:
    4.4
  • Authors:
    Lucke, Kyle; Pennington, Jake; Kreitzberg, Patrick; Serang, Oliver
  • Corresponding author:
    Serang, Oliver
Fast Exact Computation of the k Most Abundant Isotope Peaks with Layer-Ordered Heaps
  • DOI:
    10.1021/acs.analchem.0c01670
  • Published:
    2020-08-04
  • Journal:
  • Impact factor:
    7.4
  • Authors:
    Kreitzberg, Patrick; Pennington, Jake; Serang, Oliver
  • Corresponding author:
    Serang, Oliver

Similar Overseas Grants

CAREER: Next-Generation Methods for Statistical Integration of High-Dimensional Disparate Data Sources
  • Award number:
    2422478
  • Fiscal year:
    2024
  • Funding amount:
    $1.0525 million
  • Program type:
    Continuing Grant
CAREER: Statistical Inference in Observational Studies -- Theory, Methods, and Beyond
  • Award number:
    2338760
  • Fiscal year:
    2024
  • Funding amount:
    $1.0525 million
  • Program type:
    Continuing Grant
CAREER: Practical algorithms and high dimensional statistical methods for multimodal haplotype modelling
  • Award number:
    2239870
  • Fiscal year:
    2023
  • Funding amount:
    $1.0525 million
  • Program type:
    Standard Grant
CAREER: Statistical Models and Parallel-computing Methods for Analyzing Sparse and Large Single-cell Chromatin Interaction Datasets
  • Award number:
    2239350
  • Fiscal year:
    2023
  • Funding amount:
    $1.0525 million
  • Program type:
    Continuing Grant
CAREER: Fast and Accurate Statistical Learning and Inference from Large-Scale Data: Theory, Methods, and Algorithms
  • Award number:
    2046874
  • Fiscal year:
    2021
  • Funding amount:
    $1.0525 million
  • Program type:
    Continuing Grant
CAREER: Next-Generation Methods for Statistical Integration of High-Dimensional Disparate Data Sources
  • Award number:
    2044823
  • Fiscal year:
    2021
  • Funding amount:
    $1.0525 million
  • Program type:
    Continuing Grant
CAREER: Foundational statistical theory and methods for analyzing populations of attributed connectomes
  • Award number:
    1942963
  • Fiscal year:
    2020
  • Funding amount:
    $1.0525 million
  • Program type:
    Continuing Grant
CAREER: Computational and statistical methods for allele-specific chromatin structure analysis
  • Award number:
    1751317
  • Fiscal year:
    2018
  • Funding amount:
    $1.0525 million
  • Program type:
    Continuing Grant
CAREER: New Statistical Methods for Classification and Analysis of High Dimensional and Functional Data
  • Award number:
    1812354
  • Fiscal year:
    2017
  • Funding amount:
    $1.0525 million
  • Program type:
    Continuing Grant
CAREER: Research and training in advanced computational methods for quantum and statistical mechanics
  • Award number:
    1454939
  • Fiscal year:
    2015
  • Funding amount:
    $1.0525 million
  • Program type:
    Continuing Grant