Interpretable and extendable deep learning model for biological sequence analysis and prediction

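Basic information

  • Grant number:
    9925232
  • Principal investigator:
  • Amount:
    $378,200
  • Host institution:
  • Host institution country:
    United States
  • Project category:
  • Fiscal year:
    2018
  • Funding country:
    United States
  • Project period:
    2018-05-01 to 2023-04-30
  • Project status:
    Completed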

Project abstract

Bioinformatics and computational biology have become the core of biomedical research. The work of the PI, Dr. Dong Xu, in this area focuses on the development of novel computational algorithms, software, and information systems, as well as on broad applications of these tools and other informatics resources to diverse biological and medical problems. He works on many research problems in protein structure prediction, post-translational modification prediction, high-throughput biological data analyses, in silico studies of plants, microbes, and cancers, biological information systems, and mobile app development for healthcare. He has published more than 300 papers, with about 12,000 citations and an h-index of 55. In this project, the PI proposes to develop deep-learning algorithms, tools, and web resources for analyses and predictions of biological sequences, including DNA, RNA, and protein sequences. The availability of these data provides emerging opportunities for precision medicine and other areas, while deep learning, a cutting-edge technology in machine learning, presents a powerful new method for analyses and predictions of biological sequences. With rapidly accumulating sequence data and the fast development of deep-learning methods, there is an urgent need to systematically investigate how best to apply deep learning in sequence analyses and predictions.
For this purpose, the PI will develop cutting-edge deep-learning methods with the following goals for the next five years:
(1) Develop a series of novel deep-learning methods and models that specifically target biological sequence analyses and predictions, covering: (a) general unsupervised representations of DNA/RNA, protein, and SNP/mutation sequences that capture both local and global features for various applications; (b) methods to make deep-learning models interpretable for understanding biological mechanisms and generating hypotheses; and (c) “rule learning”, which abstracts the underlying “rules” by combining unsupervised learning on large unlabeled data with supervised learning on small labeled data so that new unlabeled data can be classified.
(2) Apply the proposed deep-learning models to DNA/RNA sequence annotation, genotype-phenotype analyses, cancer mutation analyses, protein function/structure prediction, protein localization prediction, and protein post-translational modification prediction. The PI will exploit particular properties associated with each of these problems to improve the deep-learning models. He will develop a set of related prediction and analysis tools, which will improve on state-of-the-art performance and shed some light on related biological mechanisms.
(3) Make the data, models, and tools freely accessible to the research community. The system will be designed to be modular and open source, and will be available through GitHub. Its modules will work like integrated circuit modules, which are universal and ready to plug in for different applications. The PI will develop a web resource for biological sequence representations, analyses, and predictions, as well as tutorials to help biologists with no computational background apply deep learning to their specific research problems.

Project outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other grants by DONG XU

Multi-view self-supervised deep learning for biological sequences and beyond
  • Grant number:
    10623063
  • Fiscal year:
    2018
  • Funding amount:
    $378,200
  • Project category:
Interpretable and extendable deep learning model for biological sequence analysis and prediction
  • Grant number:
    10395451
  • Fiscal year:
    2018
  • Funding amount:
    $378,200
  • Project category:
Deep learning for protein subcellular/sub-organelle localizations and localization motifs
  • Grant number:
    9768571
  • Fiscal year:
    2018
  • Funding amount:
    $378,200
  • Project category:
Interpretable and extendable deep learning model for biological sequence analysis and prediction
  • Grant number:
    10409152
  • Fiscal year:
    2018
  • Funding amount:
    $378,200
  • Project category:
Development of MUFOLD for Building High-Accuracy Protein Structure Models
  • Grant number:
    8656715
  • Fiscal year:
    2012
  • Funding amount:
    $378,200
  • Project category:
Development of MUFOLD for Building High-Accuracy Protein Structure Models
  • Grant number:
    8258610
  • Fiscal year:
    2012
  • Funding amount:
    $378,200
  • Project category:
Development of MUFOLD for Building High-Accuracy Protein Structure Models
  • Grant number:
    8469528
  • Fiscal year:
    2012
  • Funding amount:
    $378,200
  • Project category:
Development of MUFOLD for Building High-Accuracy Protein Structure Models
  • Grant number:
    9086384
  • Fiscal year:
    2012
  • Funding amount:
    $378,200
  • Project category:
New Scoring, Assembly and Evaluation Techniques for Protein Structure Prediction
  • Grant number:
    7648313
  • Fiscal year:
    2006
  • Funding amount:
    $378,200
  • Project category:
New Scoring, Assembly and Evaluation Techniques for Protein Structure Prediction
  • Grant number:
    7267931
  • Fiscal year:
    2006
  • Funding amount:
    $378,200
  • Project category:

Similar overseas grants

Cerebral infarction treatment strategy using collagen-like "triple helix peptide" containing functional amino acid sequence
  • Grant number:
    23K06972
  • Fiscal year:
    2023
  • Funding amount:
    $378,200
  • Project category:
    Grant-in-Aid for Scientific Research (C)
Establishment of a screening method for functional microproteins independent of amino acid sequence conservation
  • Grant number:
    23KJ0939
  • Fiscal year:
    2023
  • Funding amount:
    $378,200
  • Project category:
    Grant-in-Aid for JSPS Fellows
Effects of amino acid sequence and lipids on the structure and self-association of transmembrane helices
  • Grant number:
    19K07013
  • Fiscal year:
    2019
  • Funding amount:
    $378,200
  • Project category:
    Grant-in-Aid for Scientific Research (C)
Construction of electron-transfer amino acid sequence probe with an interaction for protein and cell
  • Grant number:
    16K05820
  • Fiscal year:
    2016
  • Funding amount:
    $378,200
  • Project category:
    Grant-in-Aid for Scientific Research (C)
Development of artificial antibody of anti-bitter taste receptor using random amino acid sequence library
  • Grant number:
    16K08426
  • Fiscal year:
    2016
  • Funding amount:
    $378,200
  • Project category:
    Grant-in-Aid for Scientific Research (C)
The aa15-17 amino acid sequence in the terminal protein domain of HBV polymerase as a viral factor affecting in vivo as well as in vitro replication activity of the virus.
  • Grant number:
    25461010
  • Fiscal year:
    2013
  • Funding amount:
    $378,200
  • Project category:
    Grant-in-Aid for Scientific Research (C)
Amino acid sequence analysis of fossil proteins using mass spectrometry
  • Grant number:
    23654177
  • Fiscal year:
    2011
  • Funding amount:
    $378,200
  • Project category:
    Grant-in-Aid for Challenging Exploratory Research
Precise hybrid synthesis of glycoprotein through amino acid sequence-specific introduction of oligosaccharide followed by enzymatic transglycosylation reaction
  • Grant number:
    22550105
  • Fiscal year:
    2010
  • Funding amount:
    $378,200
  • Project category:
    Grant-in-Aid for Scientific Research (C)
Estimating selection on amino-acid sequence polymorphisms in Drosophila
  • Grant number:
    NE/D00232X/1
  • Fiscal year:
    2006
  • Funding amount:
    $378,200
  • Project category:
    Research Grant
Construction of a neural network for detecting novel domains from amino acid sequence information only
  • Grant number:
    16500189
  • Fiscal year:
    2004
  • Funding amount:
    $378,200
  • Project category:
    Grant-in-Aid for Scientific Research (C)