Network-based machine learning framework for data integration in medical applications

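Basic information

  • Grant number:
    RGPIN-2014-04442
  • Principal investigator:
  • Amount:
    $18,200
  • Host institution:
  • Host institution country:
    Canada
  • Program:
    Discovery Grants Program - Individual
  • Fiscal year:
    2017
  • Funding country:
    Canada
  • Period:
    2017-01-01 to 2018-12-31
  • Status:
    Completed

Project abstract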

Recent technological advances have made it possible to assemble very large data collections (Big Data). Beyond sheer scale, we now have access to multiple data types, each describing the same phenomenon in its own way. For example, in computer vision it is now common to combine unlabeled images with text to label images more accurately. In biology and medicine, it has recently become cost-effective to collect genomic, transcriptomic, epigenetic, microbiome and other measurements that describe the state of cells and human bodies in health and disease. It has thus become essential to develop robust and efficient machine learning methods that can integrate multiple data types to gain a deeper understanding of the phenomena these data measure. The majority of existing integrative methods have limitations: they often require significantly more samples than features (not readily available in biological and medical applications); they do not scale to the large number of available features and so require ad hoc feature pre-selection (Shen et al, 2009); and they do not handle missing data and noise, so substantial data pre-processing is required. In human studies, especially of childhood diseases, where the goal is to combine multiple types of measurements to understand disease mechanisms and the reasons for phenotypic heterogeneity, the number of patients is very limited, whereas the number of measurements available for each patient is very large. The majority of existing methods are not applicable in this scenario. It is thus essential to ensure that new methods for biological and clinical data integration are scalable and robust to small sample sizes and many features. Together with my advisees, we have developed a robust unsupervised approach to integrating multiple types of biological data, called Patient Network Fusion (PNF) (Wang et al, 2013), that addresses the issues above. Our results on five cancers show that we obtain more clinically relevant subtypes than those previously reported.
We believe that networks in general, and our approach to patient network fusion in particular, provide an ideal foundation for the comprehensive integrative framework that we propose to build over the next five years. Our research program addresses the problem of data integration from several angles. Building on our recent successes, we will develop methods to integrate more data types, specifically data with a low signal-to-noise ratio such as single nucleotide polymorphisms, into our network-based framework. We will also extend non-negative matrix factorization to integrate biological data using network regularization. These developments will serve as a broad base for the integration framework, which will be tested on novel data obtained through our established clinical collaborations. Further, we will investigate several methods for integrative feature selection, i.e. identifying a small set of features drawn from multiple data types that explain the majority of the variation in the data. Such methods are essential for shedding light on the inner workings of biological processes and disease mechanisms. I work in close collaboration with clinicians at the Hospital for Sick Children and internationally to make sure that the methods are applied to real data and can be used and evaluated by clinicians while they are being developed. By working with a selected set of end-user collaborators, we will evaluate and refine our machine learning methods and user interfaces, and ultimately develop a system that will benefit the larger community of researchers interested in biological and medical data analysis.
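
To make the network-fusion idea in the abstract concrete, the following is a minimal sketch in Python, not the PNF implementation from Wang et al. It assumes one patients-by-features matrix per data type, builds a Gaussian patient-similarity network for each, and then repeatedly diffuses each network through the average of the others, so that similarities supported by several data types are reinforced. The function names (`patient_affinity`, `fuse_networks`), the `sigma` and `n_iter` parameters, and the toy `expression` and `methylation` matrices are all hypothetical, and the sketch omits refinements such as nearest-neighbour sparsification that a practical method would likely need.

```python
import numpy as np


def patient_affinity(X, sigma=0.5):
    """Row-normalized Gaussian affinity between patients (rows of X)."""
    diff = X[:, None, :] - X[None, :, :]
    d2 = (diff ** 2).sum(axis=-1)  # pairwise squared Euclidean distances
    # Scale by the mean squared distance so sigma is unit-free
    # (assumes the patients are not all identical).
    W = np.exp(-d2 / (2 * sigma ** 2 * d2.mean()))
    return W / W.sum(axis=1, keepdims=True)


def fuse_networks(views, n_iter=5, sigma=0.5):
    """Fuse one patient network per data type by iterative cross-diffusion.

    Simplified illustration: expects at least two views that share the
    same patients in the same row order.
    """
    S = [patient_affinity(X, sigma) for X in views]  # fixed per-view kernels
    P = [s.copy() for s in S]                        # networks being updated
    for _ in range(n_iter):
        new_P = []
        for v in range(len(views)):
            others = [P[u] for u in range(len(views)) if u != v]
            avg_other = sum(others) / len(others)
            # Diffuse the other views' similarities through this view's kernel.
            updated = S[v] @ avg_other @ S[v].T
            new_P.append(updated / updated.sum(axis=1, keepdims=True))
        P = new_P
    fused = sum(P) / len(P)
    return (fused + fused.T) / 2  # symmetrize the averaged network


# Toy data (hypothetical): 6 patients, two data types, two obvious groups.
expression = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                       [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])
methylation = np.array([[1.0], [1.2], [0.9], [4.0], [4.1], [3.9]])

fused = fuse_networks([expression, methylation])
print(np.round(fused, 3))
```

In the toy data, patients 0 to 2 and patients 3 to 5 form two groups in both data types, so fused similarities within a group come out larger than those across groups. Clustering the fused matrix, for example with spectral clustering, would be one way to derive candidate patient subtypes.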

Project outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other publications by Goldenberg, Anna

Dr.VAE: improving drug response prediction via modeling of drug perturbation effects
  • DOI:
    10.1093/bioinformatics/btz158
  • Published:
    2019-10-01
  • Journal:
  • Impact factor:
    5.8
  • Authors:
    Rampasek, Ladislav; Hidru, Daniel; Goldenberg, Anna
  • Corresponding author:
    Goldenberg, Anna
Predicting Node Characteristics from Molecular Networks
  • DOI:
    10.1007/978-1-61779-276-2_20
  • Published:
    2011-01-01
  • Journal:
  • Impact factor:
    0
  • Authors:
    Mostafavi, Sara; Goldenberg, Anna; Morris, Quaid
  • Corresponding author:
    Morris, Quaid
Subtyping: What It Is and Its Role in Precision Medicine
  • DOI:
    10.1109/mis.2015.60
  • Published:
    2015-07-01
  • Journal:
  • Impact factor:
    6.4
  • Authors:
    Saria, Suchi; Goldenberg, Anna
  • Corresponding author:
    Goldenberg, Anna
Similarity network fusion for aggregating data types on a genomic scale
  • DOI:
    10.1038/nmeth.2810
  • Published:
    2014-03-01
  • Journal:
  • Impact factor:
    48
  • Authors:
    Wang, Bo; Mezlini, Aziz M.; Goldenberg, Anna
  • Corresponding author:
    Goldenberg, Anna
Multiple Germline Events Contribute to Cancer Development in Patients with Li-Fraumeni Syndrome
  • DOI:
    10.1158/2767-9764.crc-22-0402
  • Published:
    2023-05
  • Journal:
  • Impact factor:
    0
  • Authors:
    Subasri, Vallijah; Light, Nicholas; Kanwar, Nisha; Brzezinski, Jack; Luo, Ping; Hansford, Jordan R.; Cairney, Elizabeth; Portwine, Carol; Elser, Christine; Finlay, Jonathan L.; Nichols, Kim E.; Alon, Noa; Brunga, Ledia; Anson, Jo; Kohlmann, Wendy; de Andrade, Kelvin C.; Khincha, Payal P.; Savage, Sharon A.; Schiffman, Joshua D.; Weksberg, Rosanna; Pugh, Trevor J.; Villani, Anita; Shlien, Adam; Goldenberg, Anna; Malkin, David
  • Corresponding author:
    Malkin, David

Other grants held by Goldenberg, Anna

Robust machine learning for healthcare
  • Grant number:
    RGPIN-2020-05777
  • Fiscal year:
    2022
  • Amount:
    $18,200
  • Program:
    Discovery Grants Program - Individual
Robust machine learning for healthcare
  • Grant number:
    RGPIN-2020-05777
  • Fiscal year:
    2021
  • Amount:
    $18,200
  • Program:
    Discovery Grants Program - Individual
Utilizing high resolution physiological data and artificial intelligence to develop a pediatric cardiac arrest prediction tool for integration into bedside clinical practice
  • Grant number:
    538815-2019
  • Fiscal year:
    2020
  • Amount:
    $18,200
  • Program:
    Collaborative Health Research Projects
Robust machine learning for healthcare
  • Grant number:
    RGPIN-2020-05777
  • Fiscal year:
    2020
  • Amount:
    $18,200
  • Program:
    Discovery Grants Program - Individual
Utilizing high resolution physiological data and artificial intelligence to develop a pediatric cardiac arrest prediction tool for integration into bedside clinical practice
  • Grant number:
    538815-2019
  • Fiscal year:
    2019
  • Amount:
    $18,200
  • Program:
    Collaborative Health Research Projects
Network-based machine learning framework for data integration in medical applications
  • Grant number:
    RGPIN-2014-04442
  • Fiscal year:
    2019
  • Amount:
    $18,200
  • Program:
    Discovery Grants Program - Individual
Network-based machine learning framework for data integration in medical applications
  • Grant number:
    RGPIN-2014-04442
  • Fiscal year:
    2018
  • Amount:
    $18,200
  • Program:
    Discovery Grants Program - Individual
Network-based machine learning framework for data integration in medical applications
  • Grant number:
    RGPIN-2014-04442
  • Fiscal year:
    2016
  • Amount:
    $18,200
  • Program:
    Discovery Grants Program - Individual
Network-based machine learning framework for data integration in medical applications
  • Grant number:
    RGPIN-2014-04442
  • Fiscal year:
    2015
  • Amount:
    $18,200
  • Program:
    Discovery Grants Program - Individual
Network-based machine learning framework for data integration in medical applications
  • Grant number:
    RGPIN-2014-04442
  • Fiscal year:
    2014
  • Amount:
    $18,200
  • Program:
    Discovery Grants Program - Individual

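Similar NSFC grants

Data-driven Recommendation System Construction of an Online Medical Platform Based on the Fusion of Information
  • Grant number:
  • Year approved:
    2024
  • Amount:
  • Program:
    Research Fund for International Young Scientists
Exploring the Intrinsic Mechanisms of CEO Turnover and Market Reaction: An Explanation Based on Information Asymmetry
  • Grant number:
    W2433169
  • Year approved:
    2024
  • Amount:
  • Program:
    Research Fund for International Scientists
In situ dynamic study of the nucleation and growth mechanism of TCP phases in advanced Re- and Ru-containing nickel-based single-crystal superalloys
  • Grant number:
    52301178
  • Year approved:
    2023
  • Amount:
    CNY 300,000
  • Program:
    Young Scientists Fund
Effect of chemical inhomogeneity on irradiation behavior in NbZrTi-based multi-principal-element alloys
  • Grant number:
    12305290
  • Year approved:
    2023
  • Amount:
    CNY 300,000
  • Program:
    Young Scientists Fund
Population-based epidemiological study of how the ocular surface microbiota affects the onset of dry eye in patients with diabetes
  • Grant number:
    82371110
  • Year approved:
    2023
  • Amount:
    CNY 490,000
  • Program:
    General Program
Evolution mechanism of irradiation-induced dislocation loops in the nickel-based UNS N10003 alloy and its effect on mechanical properties
  • Grant number:
    12375280
  • Year approved:
    2023
  • Amount:
    CNY 530,000
  • Program:
    General Program
Structural characteristics and structure-property relationships of CuAgSe-based thermoelectric materials
  • Grant number:
    22375214
  • Year approved:
    2023
  • Amount:
    CNY 500,000
  • Program:
    General Program
Big-data-based quantitative study of the impact of urbanization on seasonal influenza transmission in China and its mechanisms
  • Grant number:
    82003509
  • Year approved:
    2020
  • Amount:
    CNY 240,000
  • Program:
    Young Scientists Fund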

Similar overseas grants

Linking endotype and phenotype to understand COPD heterogeneity via deep learning and network science
  • Grant number:
    10569732
  • Fiscal year:
    2023
  • Amount:
    $18,200
  • Program:
Self-adaptive and Cooperative Multi-agent Reinforcement Learning-based Network Traffic Control
  • Grant number:
    23K19982
  • Fiscal year:
    2023
  • Amount:
    $18,200
  • Program:
    Grant-in-Aid for Research Activity Start-up
TMJ SYMPHONY Systems-integrated model and mechanisms of patient-centered holistic outcomes and network-supported training and therapy
  • Grant number:
    10829112
  • Fiscal year:
    2023
  • Amount:
    $18,200
  • Program:
Examining the electroencephalographic fingerprint of default mode network hyperconnectivity for scalable and personalized neurofeedback in schizophrenia
  • Grant number:
    10509002
  • Fiscal year:
    2022
  • Amount:
    $18,200
  • Program:
Investigating electroencephalographic predictors of default mode network anticorrelation for personalized neurofeedback
  • Grant number:
    10447471
  • Fiscal year:
    2022
  • Amount:
    $18,200
  • Program:
Multiscale, Multi-fidelity and Multiphysics Bayesian Neural Network (BNN) Machine Learning (ML) Surrogate Models for Modelling Design Based Accidents
  • Grant number:
    2764855
  • Fiscal year:
    2022
  • Amount:
    $18,200
  • Program:
    Studentship
Innovative biostatistical approaches to network level analyses of connectome-behavior relationships
  • Grant number:
    10630851
  • Fiscal year:
    2022
  • Amount:
    $18,200
  • Program:
Sequence-based Machine Learning for Inference of Dynamic Cell State Gene Network Models
  • Grant number:
    10665735
  • Fiscal year:
    2022
  • Amount:
    $18,200
  • Program:
Multimodal imaging biomarkers of cognitive control network deficits in youths with disruptive behavior
  • Grant number:
    10705654
  • Fiscal year:
    2022
  • Amount:
    $18,200
  • Program:
Examining the electroencephalographic fingerprint of default mode network hyperconnectivity for scalable and personalized neurofeedback in schizophrenia
  • Grant number:
    10675554
  • Fiscal year:
    2022
  • Amount:
    $18,200
  • Program: