Novel algorithm development, user support and maintenance for STAR

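Basic information

  • Grant number:
    9330440
  • Principal investigator:
  • Amount:
    $480,000
  • Host institution:
  • Host institution country:
    United States
  • Project category:
  • Fiscal year:
    2017
  • Funding country:
    United States
  • Project period:
    2017-08-18 to 2022-05-31
  • Project status:
    Completed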

Project abstract

Abstract. Sequencing of transcribed RNA molecules (RNA-seq) is an invaluable tool for studying cell transcriptomes at high resolution and depth. STAR is a popular RNA-seq analysis suite that combines high accuracy and ultra-fast mapping speed with a rich collection of built-in features and tools. STAR is used by hundreds of researchers, including several major consortia and institutions. We propose to significantly enhance and expand STAR capabilities in the following important areas.

1. Develop novel algorithms and tools integrated directly into STAR. RNA-seq analyses require combining multiple tools into "processing pipelines", which is a demanding task owing to bottlenecks and compatibility issues. We aim to overcome these impediments by integrating novel tools directly into the STAR software: (i) mapping of RNA-seq reads to personal genomes, utilizing genotype information to produce more accurate allele-aware alignments and thus increasing the precision of personal genomics analyses; (ii) mapping of long RNA reads from emerging sequencing technologies such as PacBio and Oxford Nanopore.

2. Increase the accuracy and speed of the core mapping algorithm. New applications, such as personal genomics, require significant improvements in mapping accuracy. We will enhance the STAR mapping algorithm with (i) spliced seed extension through mismatches/indels; and (ii) limited local alignment of the read ends. The tremendous increase in sequencing throughput has placed significant emphasis on the efficiency of computational algorithms. To keep up with the increasing sequencing throughput, we will boost the STAR algorithm with (i) vectorization of query-text comparisons using SIMD/SSE instructions; (ii) dynamic programming for seed stitching. The improvements in accuracy and speed will be validated on both simulated and real RNA-seq data. Mapping accuracy depends strongly on choosing the best mapping parameters for a particular dataset. We will devise automated parameter optimization procedures to eliminate guesswork in parameter selection.

3. Enhance user-friendliness, user support/education, and software maintenance. User-friendliness is crucial for making bioinformatics software useful to the broadest audience. We aim to significantly enhance the user experience by developing STAR web user interfaces for both pre-run data input and post-run exploration of results. To enable STAR analysis in the cloud, we will create STAR virtual machines on the popular Amazon and Google cloud computing services, and develop Hadoop-based tools for distributed processing of big datasets. We will also expand user support and education, and continue to implement user-requested features and debug user-reported issues.
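
To make the phrase "dynamic programming for seed stitching" concrete, here is a minimal sketch of the general idea: choosing the best collinear chain of exact-match seeds. It is not STAR's implementation. The function name `chain_seeds`, the `(read_start, genome_start, length)` seed layout, and the scoring rule (total seed length minus a fixed penalty for every junction where the genome gap differs from the read gap) are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical seed layout: (read_start, genome_start, length) of an exact match.
Seed = Tuple[int, int, int]


def chain_seeds(seeds: List[Seed], gap_penalty: int = 1) -> Tuple[int, List[Seed]]:
    """Return (best_score, chain) for the highest-scoring collinear seed chain.

    Toy scoring (an assumption, not STAR's): sum of seed lengths, minus
    gap_penalty for each junction whose genome gap differs from its read gap.
    """
    if not seeds:
        return 0, []
    ordered = sorted(seeds)          # by read start, then genome start
    n = len(ordered)
    best = [s[2] for s in ordered]   # best[i]: best chain score ending at seed i
    prev = [-1] * n                  # back-pointers for chain recovery
    for i in range(n):
        r_i, g_i, len_i = ordered[i]
        for j in range(i):
            r_j, g_j, len_j = ordered[j]
            read_gap = r_i - (r_j + len_j)
            genome_gap = g_i - (g_j + len_j)
            if read_gap < 0 or genome_gap < 0:
                continue             # seeds overlap or are out of order
            penalty = gap_penalty if genome_gap != read_gap else 0
            score = best[j] + len_i - penalty
            if score > best[i]:
                best[i] = score
                prev[i] = j
    end = max(range(n), key=best.__getitem__)
    chain = []
    k = end
    while k != -1:                   # walk back-pointers from the best end seed
        chain.append(ordered[k])
        k = prev[k]
    chain.reverse()
    return best[end], chain


if __name__ == "__main__":
    # Two seeds separated by a large genome gap (intron-like), plus a stray seed.
    example = [(0, 100, 20), (20, 1120, 30), (10, 5000, 15)]
    print(chain_seeds(example))
```

The quadratic loop over seed pairs is the simplest form of the recurrence. A realistic scorer would also account for mismatches, indels, and splice-junction motifs, which this sketch ignores.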

Project outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other publications by ALEXANDER DOBIN

Similar overseas grants

Medcircuit, the algorithmic software reducing waiting times in emergency department and general practice waiting rooms.
  • Grant number:
    133416
  • Fiscal year:
    2018
  • Funding amount:
    $480,000
  • Project category:
    Feasibility Studies
SHF: Small: Programming Abstractions for Algorithmic Software Synthesis
  • Grant number:
    0916351
  • Fiscal year:
    2009
  • Funding amount:
    $480,000
  • Project category:
    Standard Grant