ABI Innovation: Fast Algorithms and Tools for Single-Molecule Sequencing Reads

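Basic information

  • Award number:
    1759856
  • Principal investigator:
  • Amount:
    $898,900
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Standard Grant
  • Fiscal year:
    2018
  • Funding country:
    United States
  • Period:
    2018-06-01 to 2024-05-31
  • Project status:
    Completed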

Project abstract

Genomics, which studies the structure, function and evolution of DNA and RNA sequences of organisms, now has significant impact on every aspect of life sciences, such as agriculture, environment, medicine and biology. The rapid advance of sequencing technologies is one of the most important reasons behind the evolution of genomics research. Next-generation sequencing (NGS), which has significantly lowered the cost for sequencing DNA and RNA, has remarkably increased the application of genomics in every aspect of life sciences. More recently, we have seen the emergence of third-generation long-read Single-Molecule Sequencing (SMS) technologies from companies like PacBio and Oxford Nanopore. Unlike short (100-500 bp) NGS reads, the SMS reads have the distinguishing characteristics of long read length (2,000-50,000 bp), unbiased sequencing, a different type and frequency of random errors, and detection of additional modifications to the DNA bases, called epigenetic modification information. These characteristics make SMS reads useful in many genomics investigations, such as de novo genome assemblies (where there is no guiding framework available), methylation detection, gene isoform detection (small sequence changes that identify different alleles of a gene) and structural variation detection (large rearrangements in the organization of the genes). This project will develop efficient algorithms and tools to improve the effectiveness, usefulness and applicability domain of SMS reads. The successful completion of this project will significantly transform genomics research. The new tools will enable biologists to perform genomics studies, such as de novo assembly and global methylation detection, on large genomes using SMS. The tools will significantly lower the cost of analysis and increase the utility of the data for biologists so that they can advance their research. 
All algorithms, tools and demonstrations resulting from this project will be made publicly available to educators, researchers and students through our project website and GitHub. This project will be useful to train computer science students, including women and minority students, on bioinformatics problems and algorithm design.
Although SMS is now widely used in the genomics studies of small bacterial and archaeal genomes, the computational cost and high data volume currently prevent its use in the study of mid-to-large size genomes. The overall goal of this project is to develop fast algorithms and tools to investigate remedies for problems in three SMS applications: pairwise and reference alignment, error correction, and base modification detection. First, we will develop a tool for pairwise and reference genome alignments of SMS reads at least 5X faster than those currently available by designing and integrating fast k-mer matching, linear positional chaining and SIMD (Single-Instruction-Multiple-Data) based banded Smith-Waterman-Gotoh algorithms. Then, we will develop a linear space and linear time algorithm for reads alignment graph (RAG) based method, as well as a multiple reads alignment graph (MRAG) based method to efficiently correct processing for Oxford Nanopore technology data output. Furthermore, we will design an optimized and parallelized Spark pipeline for base modification detection using SMS reads, as well as a two-step classification method for effectively detecting base modification in SMS reads using neural networks. This research will substantially advance the state-of-the-art algorithms and tools for SMS reads. Project pages will be linked from https://people.cs.clemson.edu/~luofeng/research.html.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
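The first two alignment stages named in the abstract (k-mer matching followed by positional chaining) can be illustrated with a minimal sketch. This is not the project's tool: the function names `kmer_index`, `find_seeds` and `chain_seeds`, the `max_gap` parameter and the example sequences are hypothetical, and the chaining step is a simple quadratic-time dynamic program rather than the linear chaining the project proposes. The SIMD banded Smith-Waterman-Gotoh stage is omitted.

```python
from collections import defaultdict


def kmer_index(ref, k):
    """Map every k-mer of the reference to the list of positions where it starts."""
    index = defaultdict(list)
    for i in range(len(ref) - k + 1):
        index[ref[i:i + k]].append(i)
    return index


def find_seeds(read, index, k):
    """Return sorted (ref_pos, read_pos) pairs where a read k-mer matches the reference."""
    seeds = []
    for j in range(len(read) - k + 1):
        for i in index.get(read[j:j + k], ()):
            seeds.append((i, j))
    seeds.sort()
    return seeds


def chain_seeds(seeds, max_gap=50):
    """Pick the longest colinear chain of seeds.

    Assumption: a plain O(n^2) dynamic program, used here for clarity only.
    A seed may follow another if it lies strictly later in both sequences
    and within max_gap positions of it.
    """
    n = len(seeds)
    if n == 0:
        return []
    score = [1] * n   # best chain length ending at each seed
    prev = [-1] * n   # back-pointer to the previous seed in that chain
    for b in range(n):
        rb, qb = seeds[b]
        for a in range(b):
            ra, qa = seeds[a]
            colinear = ra < rb and qa < qb
            close = rb - ra <= max_gap and qb - qa <= max_gap
            if colinear and close and score[a] + 1 > score[b]:
                score[b] = score[a] + 1
                prev[b] = a
    best = max(range(n), key=score.__getitem__)
    chain = []
    while best != -1:
        chain.append(seeds[best])
        best = prev[best]
    chain.reverse()
    return chain


if __name__ == "__main__":
    ref = "GATTACAGGCTTAACCGT"   # toy reference (hypothetical)
    read = "TTACAGGCTTA"         # toy read taken from the reference
    idx = kmer_index(ref, 4)
    print(chain_seeds(find_seeds(read, idx, 4)))
```

In a full aligner the resulting chain would only bound the region in which a banded dynamic-programming alignment is run, which is where most of the speed-up over unrestricted alignment would come from.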

Project outputs

Journal articles (9)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
NeuralPolish: a novel Nanopore polishing method based on alignment matrix construction and orthogonal Bi-GRU Networks
  • DOI:
    10.1093/bioinformatics/btab354
  • Publication date:
    2021-05-11
  • Journal:
  • Impact factor:
    5.8
  • Authors:
    Huang, Neng;Nie, Fan;Wang, Jianxin
  • Corresponding author:
    Wang, Jianxin
Self-stabilizing algorithm for two disjoint minimal dominating sets
  • DOI:
    10.1016/j.ipl.2019.03.007
  • Publication date:
    2019-07
  • Journal:
  • Impact factor:
    0
  • Authors:
    P. Srimani;J. Wang
  • Corresponding author:
    P. Srimani;J. Wang
Self-Stabilizing Master-Slave Token Circulation in Unoriented Cactus Graphs
BlockPolish: accurate polishing of long-read assembly via block divide-and-conquer
  • DOI:
    10.1093/bib/bbab405
  • Publication date:
    2021-10
  • Journal:
  • Impact factor:
    9.5
  • Authors:
    Neng Huang;Fan Nie;Peng Ni;Xin Gao;F. Luo;Jianxin Wang
  • Corresponding author:
    Neng Huang;Fan Nie;Peng Ni;Xin Gao;F. Luo;Jianxin Wang
DeepSignal: detecting DNA methylation state from Nanopore sequencing reads using deep-learning
  • DOI:
    10.1093/bioinformatics/btz276
  • Publication date:
    2019-11-15
  • Journal:
  • Impact factor:
    5.8
  • Authors:
    Ni, Peng;Huang, Neng;Wang, Jianxin
  • Corresponding author:
    Wang, Jianxin

Other publications by Feng Luo

Function and potential application of quorum sensing in nitrogen-removing functional bacteria: a review
  • DOI:
    10.5004/dwt.2021.27373
  • Publication date:
    2021
  • Journal:
  • Impact factor:
    1.1
  • Authors:
    Feng Luo;Huizhi Hu;Yirong Liu
  • Corresponding author:
    Yirong Liu
Diagnosis prevention and treatment for PICC‐related upper extremity deep vein thrombosis in breast cancer patients
  • DOI:
    10.1111/j.1743-7563.2011.01508.x
  • Publication date:
    2012
  • Journal:
  • Impact factor:
    0
  • Authors:
    L. Xing;Vishnu Prasad Adhikari;Hong Liu;Ling;Sheng;Hong Yuan Li;G. Ren;Feng Luo;Kai
  • Corresponding author:
    Kai
Degradation of sulfonamides and formation of trihalomethanes by chlorination after pre-oxidation with Fe(VI)
  • DOI:
    10.1016/j.jes.2018.01.016
  • Publication date:
    2018
  • Journal:
  • Impact factor:
    6.9
  • Authors:
    Tuqiao Zhang;Feilong Dong;Feng Luo;Cong Li
  • Corresponding author:
    Cong Li
Abnormal elastic behaviour of poly(2-ureidoethyl methacrylate) physical hydrogels
  • DOI:
  • Publication date:
    2015
  • Journal:
  • Impact factor:
    0
  • Authors:
    Taolin Sun;Takayuki Nonoyama;Yoshiyuki Saruwatari;Feng Luo;Takayuki Kurokawa;Tasuku Nakajima;Abu Bin Ihsan;Jian Ping Gong
  • Corresponding author:
    Jian Ping Gong
Synthesis and characterization of PLGA-PEG-PLGA based thermosensitive polyurethane micelles for potential drug delivery
  • DOI:
    10.1080/09205063.2020.1854413
  • Publication date:
    2020-11
  • Journal:
  • Impact factor:
    3.6
  • Authors:
    Min Wang;Jianghao Zhan;Laijun Xu;Yanjun Wang;Dan Lu;Zhen Li;Jiyao Li;Feng Luo;Hong Tan
  • Corresponding author:
    Hong Tan

Other grants held by Feng Luo

ATD: Algorithms and Geometric Methods for Community and Anomaly Detection and Robust Learning in Complex Networks
  • Award number:
    2220271
  • Fiscal year:
    2023
  • Funding amount:
    $898,900
  • Project type:
    Standard Grant
Travel: NSF Student Travel Grant for 2021 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)
  • Award number:
    2131662
  • Fiscal year:
    2021
  • Funding amount:
    $898,900
  • Project type:
    Standard Grant
MRI: Acquisition of a Cyberinstrument for AI-Enabled Computational Science & Engineering
  • Award number:
    2018069
  • Fiscal year:
    2020
  • Funding amount:
    $898,900
  • Project type:
    Standard Grant
FRG: Collaborative Research: Geometric and Topological Methods for Analyzing Shapes
  • Award number:
    1760527
  • Fiscal year:
    2018
  • Funding amount:
    $898,900
  • Project type:
    Standard Grant
Discrete Conformal Geometry of Surfaces and Applications
  • Award number:
    1811878
  • Fiscal year:
    2018
  • Funding amount:
    $898,900
  • Project type:
    Standard Grant
Collaborative Research: ATD: Theory and Algorithms for Discrete Curvatures on Network Data from Human Mobility and Monitoring
  • Award number:
    1737876
  • Fiscal year:
    2017
  • Funding amount:
    $898,900
  • Project type:
    Standard Grant
Geometry and Topology of Polyhedral Surfaces
  • Award number:
    1405106
  • Fiscal year:
    2014
  • Funding amount:
    $898,900
  • Project type:
    Standard Grant
COLLABORATIVE RESEARCH: ATD: Algorithmic Aspects of Geometry for Using LIDAR and Wireless Sensor Networks for Combating Chemical Terror Attacks
  • Award number:
    1222663
  • Fiscal year:
    2012
  • Funding amount:
    $898,900
  • Project type:
    Standard Grant
Teichmuller Theory and Quantum Topology
  • Award number:
    1207832
  • Fiscal year:
    2012
  • Funding amount:
    $898,900
  • Project type:
    Standard Grant
Volume Optimization on Triangulated 3-Manifolds.
  • Award number:
    1105808
  • Fiscal year:
    2011
  • Funding amount:
    $898,900
  • Project type:
    Standard Grant

Similar overseas grants

Garnet speed dating: Innovation for fast tectonic problem solving
  • Award number:
    DP220103037
  • Fiscal year:
    2022
  • Funding amount:
    $898,900
  • Project type:
    Discovery Projects
Active Tape Matrix Fast Start Innovation.
  • Award number:
    10042801
  • Fiscal year:
    2022
  • Funding amount:
    $898,900
  • Project type:
    Grant for R&D
Made Smarter Smart Factory Innovation Hub – Fast Start Test Bed Pilots (the “Project”)
  • Award number:
    900178
  • Fiscal year:
    2020
  • Funding amount:
    $898,900
  • Project type:
    Responsive Strategy and Planning
Single Microparticle Impact Facilities at Low, Fast and Hypervelocity Regimes: Innovation from Biomedical and Material Sciences to Space Exploration
  • Award number:
    18KK0128
  • Fiscal year:
    2018
  • Funding amount:
    $898,900
  • Project type:
    Fund for the Promotion of Joint International Research (Fostering Joint International Research (B))
Fast-tracking Health Innovation for NHS Scotland (Strathclyde CiC 2017)
  • Award number:
    MC_PC_17178
  • Fiscal year:
    2018
  • Funding amount:
    $898,900
  • Project type:
    Intramural
iCorps PA-16-414 Application related to Fast-Track Small Business Innovation Research Grant Application, PA-15-269: OpenBeds: Improving the Delivery of Care for Patients with Drug Addiction
  • Award number:
    9379425
  • Fiscal year:
    2017
  • Funding amount:
    $898,900
  • Project type:
ABI Innovation: Authors in the driver's seat: fast, consistent, computable phenotype data and ontology production
  • Award number:
    1661485
  • Fiscal year:
    2017
  • Funding amount:
    $898,900
  • Project type:
    Standard Grant
Fast-tracking Health Innovation for NHS Scotland
  • Award number:
    MC_PC_16060
  • Fiscal year:
    2017
  • Funding amount:
    $898,900
  • Project type:
    Intramural
iFLAG: innovation in Fast buLk Analysis of Graphene
  • Award number:
    103715
  • Fiscal year:
    2017
  • Funding amount:
    $898,900
  • Project type:
    Collaborative R&D
Fast-scanning suveillance radar antenna--Idea to Innovation Phase 1
  • Award number:
    463157-2014
  • Fiscal year:
    2014
  • Funding amount:
    $898,900
  • Project type:
    Idea to Innovation