Interpretable Machine Learning Approaches Applied to Omics Datasets
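Basic information
- Grant number: RGPIN-2022-04262
- Principal investigator:
- Amount: $28,400
- Host institution:
- Host institution country: Canada
- Project category: Discovery Grants Program - Individual
- Fiscal year: 2022
- Funding country: Canada
- Period: 2022-01-01 to 2023-12-31
- Project status: Completed
- Source:
- Keywords: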
Project abstract
In recent decades, advances in sequencing technologies have set off a revolution, producing an explosion of genetic data and propelling human genomics into the era of big data. More recently, novel biotechnologies have allowed us to obtain information at the molecular level for each individual, such as the concentrations of metabolites (small molecules such as antioxidants and vitamins) or the quantification of RNA transcripts and proteins. These so-called 'omics' data sets, combined with genetics established at birth, hold the promise of revealing the molecular causes of differences between humans for traits such as height, weight or risk of developing disease. In parallel, recent advances in artificial intelligence have led to the development of powerful methods for making predictions from large data sets, in areas ranging from autonomous driving to natural language processing.
However, machine learning methods for individual omics signatures are lagging behind, because several challenges still need to be addressed. First, these "black box" methods produce predictions without providing interpretations, preventing experts from getting the evidence necessary to validate the results. Second, current methods tend to learn the datasets "by heart" instead of extracting general knowledge from them. Indeed, although our datasets are large, they contain far fewer participants than variables measured for each participant, which reduces the ability to generalize. Finally, the sources of biological and technical variation, which are generally inconsistent from one dataset to another, must be considered to obtain reliable predictions in the real world.
My research program offers concrete solutions to make these methodologies applicable to omics data, for specific biological problems. One of these problems is to predict, from an individual's genetics, changes in the levels of RNA transcripts and metabolites. Another is to predict the risk of developing a complex disease based on an individual's omics data. For these concrete applications, we will use omics data from several biobanks, including local (Montreal Heart Institute Biobank), national (CanPath cohort) and international (UK Biobank) cohorts. We will develop these methodologies while ensuring that the results are interpretable and plausible, generalize well, and are independent of noise sources. Our program is interdisciplinary and offers a rich training opportunity for students. Our results have the potential to improve the appropriate use of machine learning in molecular biology research, providing researchers and Canadian industry with robust tools for omics data analysis that can be interpreted by humans.
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Hussin, Julie
A Family-Based Probabilistic Method for Capturing De Novo Mutations from High-Throughput Short-Read Sequencing Data
- DOI: 10.2202/1544-6115.1713
- Published: 2012-01-01
- Journal:
- Impact factor: 0.9
- Authors: Cartwright, Reed A.; Hussin, Julie; Awadalla, Philip
- Corresponding author: Awadalla, Philip
The race to understand immunopathology in COVID-19: Perspectives on the impact of quantitative approaches to understand within-host interactions.
- DOI: 10.1016/j.immuno.2023.100021
- Published: 2023-03
- Journal:
- Impact factor: 0
- Authors: Gazeau, Sonia; Deng, Xiaoyan; Ooi, Hsu Kiang; Mostefai, Fatima; Hussin, Julie; Heffernan, Jane; Jenner, Adrianne L; Craig, Morgan
- Corresponding author: Craig, Morgan
Rare allelic forms of PRDM9 associated with childhood leukemogenesis
- DOI: 10.1101/gr.144188.112
- Published: 2013-03-01
- Journal:
- Impact factor: 7
- Authors: Hussin, Julie; Sinnett, Daniel; Awadalla, Philip
- Corresponding author: Awadalla, Philip
Other grants held by Hussin, Julie
Interpretable Machine Learning Approaches Applied to Omics Datasets
- Grant number: DGECR-2022-00208
- Fiscal year: 2022
- Amount: $28,400
- Project category: Discovery Launch Supplement
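Inférences des mutations et recombinants de novo par l'analyse de données génétiques familiales
Inference of de novo mutations and recombinants through the analysis of family genetic data
- Grant number: 378841-2009
- Fiscal year: 2011
- Amount: $28,400
- Project category: Postgraduate Scholarships - Doctoral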
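Inférences des mutations et recombinants de novo par l'analyse de données génétiques familiales
Inference of de novo mutations and recombinants through the analysis of family genetic data
- Grant number: 378841-2009
- Fiscal year: 2010
- Amount: $28,400
- Project category: Postgraduate Scholarships - Doctoral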
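Inférences des mutations et recombinants de novo par l'analyse de données génétiques familiales
Inference of de novo mutations and recombinants through the analysis of family genetic data
- Grant number: 378841-2009
- Fiscal year: 2009
- Amount: $28,400
- Project category: Postgraduate Scholarships - Doctoral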
Similar NSFC grants
Understanding structural evolution of galaxies with machine learning
- Grant number:
- Approval year: 2022
- Amount: CNY 100,000
- Project category: Provincial/municipal-level project
Similar overseas grants
22-BBSRC/NSF-BIO - Interpretable & Noise-robust Machine Learning for Neurophysiology
- Grant number: BB/Y008758/1
- Fiscal year: 2024
- Amount: $28,400
- Project category: Research Grant
Interpretable Machine Learning Modelling of Future Extreme Floods under Climate Change
- Grant number: 2889015
- Fiscal year: 2023
- Amount: $28,400
- Project category: Studentship
UKRI/BBSRC-NSF/BIO: Interpretable and Noise-Robust Machine Learning for Neurophysiology
- Grant number: 2321840
- Fiscal year: 2023
- Amount: $28,400
- Project category: Continuing Grant
CAREER: Interpretable and Robust Machine Learning Models: Analysis and Algorithms
- Grant number: 2239787
- Fiscal year: 2023
- Amount: $28,400
- Project category: Continuing Grant
Macroeconomic structural changes and their characteristics: Applications of interpretable machine learning
- Grant number: 23K01319
- Fiscal year: 2023
- Amount: $28,400
- Project category: Grant-in-Aid for Scientific Research (C)
Optimization and Validation of a Cost-effective Image-Guided Automated Extracapsular Extension Detection Framework through Interpretable Machine Learning in Head and Neck Cancer
- Grant number: 10648372
- Fiscal year: 2023
- Amount: $28,400
- Project category:
Accurate, reliable, and interpretable machine learning for assessment of neonatal and pediatric brain micro-structure
- Grant number: 10566299
- Fiscal year: 2023
- Amount: $28,400
- Project category:
Improving Interpretable Machine Learning for Plasmas: Towards Physical Insight, Data-Driven Models, and Optimal Sensing
- Grant number: 2329765
- Fiscal year: 2023
- Amount: $28,400
- Project category: Continuing Grant
Interpretable machine learning to synergize brain age estimation and neuroimaging genetics
- Grant number: 10568234
- Fiscal year: 2023
- Amount: $28,400
- Project category:
Collaborative Research: CIF: Small: Interpretable Fair Machine Learning: Frameworks, Robustness, and Scalable Algorithms
- Grant number: 2343869
- Fiscal year: 2023
- Amount: $28,400
- Project category: Standard Grant