CAREER: Machine learning and signal processing methods for analyzing single-cell sequencing data

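Basic information

  • Award number:
    1942303
  • Principal investigator:
  • Amount:
    $500,000
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Continuing Grant
  • Fiscal year:
    2020
  • Funding country:
    United States
  • Project period:
    2020-06-01 to 2025-05-31
  • Project status:
    Not yet concluded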

Project abstract

High-throughput genome sequencing technologies, also known as next-generation sequencing technologies, have revolutionized genomics and medicine. This achievement is owed largely to computational methods that extract information from the massive numbers of small genome fragments produced by the new sequencing technologies. More recently, sequencing individual cells, known as single-cell sequencing, has shown significant advantages over conventional sequencing of a large population of cells, called bulk cell sequencing. This is because single-cell sequencing enables discovery of new biological knowledge at the cellular level and a better understanding of the function of an individual cell, which cannot be obtained via bulk sequencing. The emerging and fast-growing single-cell sequencing technology has attracted much interest and has had major impacts on several fields, such as microbiology, neurobiology, immunology, and developmental biology. With rapid advances in single-cell technologies, single-cell sequencing data and their applications continue to grow, while computational methods for analyzing these data are lagging behind. Compared to bulk sequencing, single-cell sequencing introduces new challenges in data analysis due to the low amount of DNA and RNA from a single cell and the extra steps in the sequencing process for single cells. Moreover, thousands or millions of cells are sequenced in parallel in any given experiment, leading to massive data sets to analyze. This project will build a strong computational foundation for analyzing single-cell sequencing data. The project outcome will contribute to advancing biological sciences and improving human health by providing insight into critical biological unknowns requiring single-cell resolution, such as evolution of cancer cells and development of stem cells. The project will also contribute to education and training in the high-demand, multidisciplinary field of bioinformatics and computational genomics.
Data analysis using advanced computational methods plays an essential role in extracting accurate and meaningful information from single-cell sequencing data. Current single-cell sequencing data analysis methods have been adapted from bulk sequencing technologies. The current methods, however, are not designed to cope with the new challenges in single-cell sequencing data analysis, such as extensive noise, zero inflation and missing data, non-uniform genome coverage, data multimodality, and large amounts of data. This project aims to address the new challenges in analyzing single-cell sequencing data by developing novel computational methods and algorithms based on signal processing and machine learning techniques. The focus of this research is on identifying genomic variations in the form of copy number variations using DNA single-cell sequencing data, and clustering cells using RNA single-cell sequencing data. The developed methods and algorithms will significantly advance knowledge in extracting accurate information from complex and massive single-cell sequencing data by (i) providing optimal representation of genome coverage data by applying sparse optimization, (ii) modeling and reducing noise by employing denoising methods in signal processing, (iii) exploring information across cells by applying data-driven learning models, and (iv) incorporating prior knowledge by adapting network and word embedding models.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project outcomes

Journal articles (8)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
scGEMOC, A Graph Embedded Contrastive Learning Single-cell Multiomics Clustering Model
Semi-supervised classification of disease prognosis using CR images with clinical data structured graph
Copy number variation detection using single cell sequencing data
Contrastive Learning in Single-cell Multiomics Clustering
  • DOI:
    10.1145/3584371.3613010
  • Year published:
    2023
  • Journal:
  • Impact factor:
    0
  • Authors:
    Li, Bingjun;Nabavi, Sheida
  • Corresponding author:
    Nabavi, Sheida

Similar NSFC (National Natural Science Foundation of China) grants

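Research on rock mass identification methods for TBM tunneling based on deep learning and multi-source information fusion
  • Award number:
  • Year approved:
    2025
  • Funding amount:
    CNY 0
  • Project type:
    Provincial/municipal project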
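Research on key technologies for intelligent non-destructive quality inspection of gate hoists based on feature engineering and deep learning
  • Award number:
  • Year approved:
    2025
  • Funding amount:
    CNY 0
  • Project type:
    Provincial/municipal project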
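Machine-learning-assisted construction of highly stable metallic antimony-based electrodes and study of their sodium storage mechanism
  • Award number:
    2024JJ5126
  • Year approved:
    2024
  • Funding amount:
    CNY 0
  • Project type:
    Provincial/municipal project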
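Research on multi-view, multi-hyperplane twin support vector regression algorithms incorporating deep features
  • Award number:
  • Year approved:
    2024
  • Funding amount:
    CNY 0
  • Project type:
    Provincial/municipal project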
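Optimization of the microwave dielectric properties of Ba12B′Nb9O36-based ceramics through octahedral feature-parameter engineering and machine learning
  • Award number:
    2024JJ6385
  • Year approved:
    2024
  • Funding amount:
    CNY 0
  • Project type:
    Provincial/municipal project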
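Construction and validation of a machine-learning-based ventilator weaning decision model for neurocritical care patients
  • Award number:
    2024JJ7005
  • Year approved:
    2024
  • Funding amount:
    CNY 0
  • Project type:
    Provincial/municipal project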
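Research on prognosis assessment methods for patients with disorders of consciousness based on multi-task EEG features
  • Award number:
  • Year approved:
    2024
  • Funding amount:
    CNY 150,000
  • Project type:
    Provincial/municipal project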
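Prediction of pathological complete response in rectal cancer based on multi-stain deep learning and a multi-task dual-stream attention mechanism
  • Award number:
    Q24H160054
  • Year approved:
    2024
  • Funding amount:
    CNY 0
  • Project type:
    Provincial/municipal project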
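Research on sequential brain-computer interfaces based on brain-inspired motor learning
  • Award number:
  • Year approved:
    2024
  • Funding amount:
    CNY 0
  • Project type:
    General Program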
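Research on gait rehabilitation behavior recognition methods based on hybrid brain-computer interfaces and multimodal deep learning
  • Award number:
    LQ23F030015
  • Year approved:
    2023
  • Funding amount:
    CNY 0
  • Project type:
    Provincial/municipal project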

Similar overseas grants

CAREER: Blessing of Nonconvexity in Machine Learning - Landscape Analysis and Efficient Algorithms
  • Award number:
    2337776
  • Fiscal year:
    2024
  • Funding amount:
    $500,000
  • Project type:
    Continuing Grant
CAREER: Mitigating the Lack of Labeled Training Data in Machine Learning Based on Multi-level Optimization
  • Award number:
    2339216
  • Fiscal year:
    2024
  • Funding amount:
    $500,000
  • Project type:
    Continuing Grant
CAREER: Integrated and end-to-end machine learning pipeline for edge-enabled IoT systems: a resource-aware and QoS-aware perspective
  • Award number:
    2340075
  • Fiscal year:
    2024
  • Funding amount:
    $500,000
  • Project type:
    Continuing Grant
CAREER: Gaussian Processes for Scientific Machine Learning: Theoretical Analysis and Computational Algorithms
  • Award number:
    2337678
  • Fiscal year:
    2024
  • Funding amount:
    $500,000
  • Project type:
    Continuing Grant
CAREER: Heterogeneous Neuromorphic and Edge Computing Systems for Realtime Machine Learning Technologies
  • Award number:
    2340249
  • Fiscal year:
    2024
  • Funding amount:
    $500,000
  • Project type:
    Continuing Grant
CAREER: From Fragile to Fortified: Harnessing Causal Reasoning for Trustworthy Machine Learning with Unreliable Data
  • Award number:
    2337529
  • Fiscal year:
    2024
  • Funding amount:
    $500,000
  • Project type:
    Continuing Grant
CAREER: Ethical Machine Learning in Health: Robustness in Data, Learning and Deployment
  • Award number:
    2339381
  • Fiscal year:
    2024
  • Funding amount:
    $500,000
  • Project type:
    Continuing Grant
CAREER: Towards Trustworthy Machine Learning via Learning Trustworthy Representations: An Information-Theoretic Framework
  • Award number:
    2339686
  • Fiscal year:
    2024
  • Funding amount:
    $500,000
  • Project type:
    Continuing Grant
CAREER: Intelligent Battery Management with Safe, Efficient, Fast-Adaption Reinforcement Learning and Physics-Inspired Machine Learning: From Cells to Packs
  • Award number:
    2340194
  • Fiscal year:
    2024
  • Funding amount:
    $500,000
  • Project type:
    Continuing Grant
CAREER: From Dirty Data to Fair Prediction: Data Preparation Framework for End-to-End Equitable Machine Learning
  • Award number:
    2341055
  • Fiscal year:
    2024
  • Funding amount:
    $500,000
  • Project type:
    Continuing Grant