Effective prediction of microRNAs in the face of class imbalance

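Basic information

  • Grant number:
    RGPIN-2016-06179
  • Principal investigator:
  • Amount:
    $16,000
  • Host institution:
  • Host institution country:
    Canada
  • Project category:
    Discovery Grants Program - Individual
  • Fiscal year:
    2019
  • Funding country:
    Canada
  • Project period:
    2019-01-01 to 2020-12-31
  • Project status:
    Completed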

Project abstract

MicroRNA (miRNA) are short expressed genomic sequences that encode small RNA molecules adopting a "hairpin" secondary structure. Computational prediction of miRNA is important, since miRNA are now believed to disrupt or otherwise control the expression of 60-90% of mammalian genes. Sequence-based de novo prediction of miRNA is made difficult by acute class imbalance: for each true miRNA within a genome, we expect 1000 pseudo-miRNA (i.e. genomic regions producing miRNA-like hairpin structures). Therefore, effective miRNA prediction systems must have extremely high specificity (i.e. the ability to reject pseudo-miRNA), while also retaining the ability to correctly detect true miRNA (i.e. recall).

We have recently introduced the Species-specific miRNA Prediction (SMIRP) framework for training highly effective species-specific miRNA prediction systems. When applied to three popular miRNA prediction methods, we observe significant improvements in precision (i.e. the proportion of predictions expected to be true miRNA) while maintaining the same high recall rates achieved by the original methods. We propose to extend our research in three key areas:

1) Existing miRNA prediction methods perform well on canonical pre-miRNA, but are not well suited to high-throughput annotation of entire genomes. Therefore, new classification techniques will be developed that optimally differentiate between real and pseudo-miRNA sequences within predicted hairpin structures. This will include the development of novel methods to compute general-purpose, information-rich DNA/RNA descriptors. In addition to miRNA prediction, these descriptors will benefit other nucleic acid classification problem domains.

2) With the increasing availability of transcriptomic data, there is a need and an opportunity to develop an integrated miRNA discovery pipeline that leverages both next-generation sequencing (NGS) read patterns and powerful sequence-based methods such as SMIRP. We will develop and apply advanced machine learning approaches to optimally combine NGS- and sequence-based approaches, improving our ability to discover novel miRNA of potential importance to human health.

3) Contributions will also be made in the broader field of machine learning in the presence of extreme class imbalance, where many classic performance metrics, such as ROC curves, become inappropriate because they do not adequately reflect the impact of false positive predictions. To address this and other issues, we will develop novel performance metrics for cases of acute class imbalance. While these new metrics will find immediate application in the development of miRNA prediction tools, they will also be widely applicable to other problem domains within bioinformatics and beyond.
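As a rough illustration of why the abstract stresses specificity at a 1:1000 imbalance, expected precision can be written in terms of recall, specificity, and the ratio of negatives to positives. The sketch below is not part of the proposal; the function name `precision_under_imbalance` and the example recall of 0.9 are assumptions chosen only for illustration.

```python
def precision_under_imbalance(recall, specificity, negatives_per_positive):
    """Expected precision for a classifier with the given recall and
    specificity when there are `negatives_per_positive` negatives
    (pseudo-miRNA) for every positive (true miRNA)."""
    true_pos = recall                                          # expected TP per positive
    false_pos = (1.0 - specificity) * negatives_per_positive   # expected FP per positive
    total_predicted = true_pos + false_pos
    if total_predicted == 0.0:
        return 0.0  # nothing is predicted positive; report 0 by convention
    return true_pos / total_predicted


if __name__ == "__main__":
    # Hypothetical operating points: recall fixed at 0.9, ratio 1000:1 as in the abstract.
    for spec in (0.99, 0.999, 0.9999):
        p = precision_under_imbalance(recall=0.9, specificity=spec,
                                      negatives_per_positive=1000)
        print(f"specificity={spec:.4f}  precision={p:.3f}")
```

Under these assumed numbers, a specificity of 0.99 (a false positive rate of only 0.01, which looks excellent on a ROC curve) gives a precision of roughly 0.08, while a specificity of 0.9999 is needed to reach a precision of about 0.9. This is the sense in which ROC-based metrics can understate the impact of false positives.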

Project outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other publications by Green, James

Quality and Variability of Patient Directions in Electronic Prescriptions in the Ambulatory Care Setting.
  • DOI:
    10.18553/jmcp.2018.17404
  • Publication date:
    2018-07
  • Journal:
  • Impact factor:
    2.1
  • Authors:
    Yang, Yuze;Ward-Charlerie, Stacy;Dhavle, Ajit A.;Rupp, Michael T.;Green, James
  • Corresponding author:
    Green, James
Internet use in an orthopaedic outpatient population
  • DOI:
    10.1097/bco.0b013e31828e542b
  • Publication date:
    2013-05-01
  • Journal:
  • Impact factor:
    0.3
  • Authors:
    Baker, Joseph F.;Green, James;Mulhall, Kevin J.
  • Corresponding author:
    Mulhall, Kevin J.
Child pedestrian casualties and deprivation
  • DOI:
    10.1016/j.aap.2010.10.016
  • Publication date:
    2011-05-01
  • Journal:
  • Impact factor:
    5.9
  • Authors:
    Green, James;Muir, Helen;Maher, Mike
  • Corresponding author:
    Maher, Mike
Correlates of head circumference growth in infants later diagnosed with autism spectrum disorders
  • DOI:
    10.1177/0883073807304005
  • Publication date:
    2007-06-01
  • Journal:
  • Impact factor:
    1.9
  • Authors:
    Mraz, Krista D.;Green, James;Fein, Deborah
  • Corresponding author:
    Fein, Deborah
Call for a framework for reporting evidence for life beyond Earth
  • DOI:
    10.1038/s41586-021-03804-9
  • Publication date:
    2021-10-28
  • Journal:
  • Impact factor:
    64.8
  • Authors:
    Green, James;Hoehler, Tori;Voytek, Mary
  • Corresponding author:
    Voytek, Mary

Other grants by Green, James

Reciprocal Perspective Machine Learning to Identify Relationships in Sparse Biological Networks
  • Grant number:
    RGPIN-2021-04184
  • Fiscal year:
    2022
  • Funding amount:
    $16,000
  • Project category:
    Discovery Grants Program - Individual
Metal Mediated and Catalyzed Organic Synthetic Methods
  • Grant number:
    RGPIN-2022-04761
  • Fiscal year:
    2022
  • Funding amount:
    $16,000
  • Project category:
    Discovery Grants Program - Individual
Unobtrusive neonatal patient monitoring using video and pressure data
  • Grant number:
    543940-2019
  • Fiscal year:
    2021
  • Funding amount:
    $16,000
  • Project category:
    Collaborative Research and Development Grants
Reciprocal Perspective Machine Learning to Identify Relationships in Sparse Biological Networks
  • Grant number:
    RGPIN-2021-04184
  • Fiscal year:
    2021
  • Funding amount:
    $16,000
  • Project category:
    Discovery Grants Program - Individual
Metal Mediated and Catalyzed Organic Synthetic Methods
  • Grant number:
    RGPIN-2016-04946
  • Fiscal year:
    2021
  • Funding amount:
    $16,000
  • Project category:
    Discovery Grants Program - Individual
Effective prediction of microRNAs in the face of class imbalance
  • Grant number:
    RGPIN-2016-06179
  • Fiscal year:
    2020
  • Funding amount:
    $16,000
  • Project category:
    Discovery Grants Program - Individual
Metal Mediated and Catalyzed Organic Synthetic Methods
  • Grant number:
    RGPIN-2016-04946
  • Fiscal year:
    2020
  • Funding amount:
    $16,000
  • Project category:
    Discovery Grants Program - Individual
Unobtrusive neonatal patient monitoring using video and pressure data
  • Grant number:
    543940-2019
  • Fiscal year:
    2020
  • Funding amount:
    $16,000
  • Project category:
    Collaborative Research and Development Grants
Unobtrusive neonatal patient monitoring using video and pressure data
  • Grant number:
    543940-2019
  • Fiscal year:
    2019
  • Funding amount:
    $16,000
  • Project category:
    Collaborative Research and Development Grants
Metal Mediated and Catalyzed Organic Synthetic Methods
  • Grant number:
    RGPIN-2016-04946
  • Fiscal year:
    2019
  • Funding amount:
    $16,000
  • Project category:
    Discovery Grants Program - Individual

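Similar grants from the National Natural Science Foundation of China (NSFC)

Noninvasive detection and depth prediction of deep lesions at safe light doses based on deep-penetration Raman spectroscopy
  • Grant number:
    82372016
  • Approval year:
    2023
  • Funding amount:
    CNY 480,000
  • Project category:
    General Program
Strength prediction for the blast resistance of high-performance fiber-reinforced concrete members
  • Grant number:
    51708391
  • Approval year:
    2017
  • Funding amount:
    CNY 250,000
  • Project category:
    Young Scientists Fund
Mechanism and application of three-component fiber-optic seismic acceleration detection for tunnel advance detection
  • Grant number:
    51079080
  • Approval year:
    2010
  • Funding amount:
    CNY 320,000
  • Project category:
    General Program
Research on algorithms for predicting interactions between non-coding RNAs and proteins
  • Grant number:
    31000586
  • Approval year:
    2010
  • Funding amount:
    CNY 180,000
  • Project category:
    Young Scientists Fund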

Similar overseas grants

Post-transcriptional regulation of gene expression by microRNAs in antibody-mediated rejection
  • Grant number:
    10522285
  • Fiscal year:
    2022
  • Funding amount:
    $16,000
  • Project category:
Post-transcriptional regulation of gene expression by microRNAs in antibody-mediated rejection
  • Grant number:
    10693399
  • Fiscal year:
    2022
  • Funding amount:
    $16,000
  • Project category:
Prediction of homologous recombination deficiency by urinary microRNAs in ovarian cancer
  • Grant number:
    22K09613
  • Fiscal year:
    2022
  • Funding amount:
    $16,000
  • Project category:
    Grant-in-Aid for Scientific Research (C)
Validation of a saliva test using methylated microRNAs for head and neck cancer recurrence
  • Grant number:
    10929597
  • Fiscal year:
    2021
  • Funding amount:
    $16,000
  • Project category:
The use of microRNAs in detection and prediction of Johne's disease in cattle.
  • Grant number:
    10003360
  • Fiscal year:
    2021
  • Funding amount:
    $16,000
  • Project category:
    Collaborative R&D
Validation of a saliva test using methylated microRNAs for head and neck cancer recurrence
  • Grant number:
    10491773
  • Fiscal year:
    2021
  • Funding amount:
    $16,000
  • Project category:
Validation of a saliva test using methylated microRNAs for head and neck cancer recurrence
  • Grant number:
    10281743
  • Fiscal year:
    2021
  • Funding amount:
    $16,000
  • Project category:
Effective prediction of microRNAs in the face of class imbalance
  • Grant number:
    RGPIN-2016-06179
  • Fiscal year:
    2020
  • Funding amount:
    $16,000
  • Project category:
    Discovery Grants Program - Individual
Exploratory Analysis of the Functional Implications of MicroRNAs Associated with Incident Type 2 Diabetes and Related Risk Factors.
  • Grant number:
    10404815
  • Fiscal year:
    2020
  • Funding amount:
    $16,000
  • Project category:
The Impact of Interventions to Treat Incident Diabetes on Circulating microRNAs in the Diabetes Prevention Program
  • Grant number:
    10545053
  • Fiscal year:
    2020
  • Funding amount:
    $16,000
  • Project category: