Correcting biases in deep learning models
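Basic information
- Grant number: 10584314
- Principal investigator:
- Amount: $344,400
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2023
- Funding country: United States
- Start and end dates: 2023-01-20 to 2027-12-31
- Project status: Ongoing
- Source: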
- Keywords: 3-Dimensional; Accounting; Address; Algorithms; Alzheimer's Disease; Alzheimer's disease diagnosis; Architecture; Artificial Intelligence; Automation; Biological; Biological Assay; Biological Sciences; Blood; Brain; Cells; Classification; Code; Cognitive; Communities; Complex; Computer Models; Computer software; Data; Data Analyses; Data Correlations; Data Set; Dependence; Dimensions; Future; Generations; Glioma; Goals; Health Sciences; Image; Individual; Institution; Learning; Machine Learning; Magnetic Resonance Imaging; Measurement; Measures; Medical; Medical Imaging; Methods; Methylation; Microscope; Microscopy; Modeling; Modification; Nerve Degeneration; Outcome; Participant; Patients; Performance; Peripheral Blood Mononuclear Cell; Pythons; Recovery; Sampling; Site; TensorFlow; Testing; The Cancer Imaging Archive; Three-Dimensional Image; Tissues; Training; Validation; Visualization; autoencoder; base; biomedical imaging; cancer cell; convolutional neural network; deep learning; deep learning model; feedforward neural network; human disease; improved; innovation; learning community; learning network; live cell imaging; machine learning framework; mild cognitive impairment; multimodality; neural network; neural network architecture; novel; open source; predictive modeling; preservation; resilience; statistics; success; tool; transcriptome sequencing; vector
Project abstract
Project Summary/Abstract
Deep learning (DL) has been widely applied across all life sciences to construct predictive models. However, it
relies on the assumption that training samples are independent and identically distributed. This is frequently
violated in the life sciences, where data is “grouped” by measurements from the same sample (patient, cell,
tissue), by the same observer, or at the same site. This leads to clusters of correlated data (random effects), and
when the models are fit to such data, the model parameters can be severely biased, leading to type I and II
errors. Properly accounting for such dependencies in DL models remains an unsolved problem. The objective of this proposal
is to develop the appropriate DL modifications to separately model global fixed effects and random effects that
increase model interpretability and performance for precise unbiased predictions related to human disease.
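
The bias described above can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the proposal's method: the function names, the cluster sizes, and the noise levels are all hypothetical, and the correction used here is a simple within-cluster centering rather than a mixed effects DL model. It only shows that a fit which treats clustered samples as independent can misestimate a shared (fixed) effect when a cluster-level effect also shifts the feature.

```python
import numpy as np


def simulate_clustered(n_clusters=20, n_per_cluster=50, true_slope=2.0, seed=0):
    """Simulate data whose feature and outcome both depend on a cluster effect."""
    rng = np.random.default_rng(seed)
    cluster = np.repeat(np.arange(n_clusters), n_per_cluster)
    # Hypothetical cluster-level random intercepts (e.g. site or patient effects).
    u = rng.normal(0.0, 3.0, size=n_clusters)
    # The feature is shifted by the cluster effect, so cluster acts as a confounder.
    x = rng.normal(0.0, 1.0, size=cluster.size) + u[cluster]
    noise = rng.normal(0.0, 0.5, size=cluster.size)
    y = true_slope * x + u[cluster] + noise
    return x, y, cluster


def pooled_slope(x, y):
    """Ordinary least squares slope that treats all samples as i.i.d."""
    return float(np.polyfit(x, y, 1)[0])


def within_cluster_slope(x, y, cluster):
    """Least squares slope after subtracting each cluster's mean from x and y."""
    counts = np.bincount(cluster)
    x_centered = x - (np.bincount(cluster, weights=x) / counts)[cluster]
    y_centered = y - (np.bincount(cluster, weights=y) / counts)[cluster]
    return float(x_centered @ y_centered / (x_centered @ x_centered))


x, y, cluster = simulate_clustered()
print("pooled slope:", pooled_slope(x, y))
print("within-cluster slope:", within_cluster_slope(x, y, cluster))
```

Under these simulated assumptions the pooled slope drifts upward from the true value of 2.0, while the within-cluster estimate stays close to it.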
Our proposal is based on a novel, model-agnostic framework to transform conventional DL models into proper
mixed effects DL (MEDL) models. This affords capabilities of statistical linear mixed effects models, including
the separation of cluster-invariant fixed effects from cluster-specific random effects, while preserving the ability
of DL to learn data-driven nonlinear associations. The core premise is that proper MEDL models 1) are more
resilient to confounding effects and more attentive to true predictive features, 2) can capture, quantify, and
visualize random effects to enhance interpretability, and 3) attain better generalization to new clusters. We
propose to incorporate MEDL into three of the most important DL model types including dense feed-forward
neural networks (DFNNs), convolutional neural networks (CNNs), and autoencoders. Our preliminary results
demonstrate multiple advantages of MEDL over conventional DL in both accuracy and interpretability. MEDL
outperforms previous clustered data approaches, including domain adversarial models, meta-learning, and the
inclusion of cluster membership as an input covariate. We developed an ME-DFNN to predict conversion from
mild cognitive impairment to Alzheimer’s Disease (AD) from tabular data, an ME-CNN to diagnose AD from MRI,
and an ME-autoencoder to compress and classify live cell images. Across these test cases, MEDL models were
the most discriminative between known confounded and real features; they were able to quantify or visualize the
random effects and outperformed other models on clusters both seen and unseen during training. This proposal
further develops the methods to handle complex architectures and hierarchical effects, with external validation,
through these aims: 1) Develop ME-DFNNs for classification and regression. 2) Develop 3D ME-CNNs and
multimodal 3D ME-CNNs for medical image classification. 3) Develop convolutional and vector ME-autoencoders for
image and omics data. We describe the innovative incorporation of an adversarial classifier to constrain the base
model to learn fixed effects, a Bayesian random effects subnetwork, and an approach to apply random effects
to unseen clusters. All these solutions will be released as open-source software that improves existing DL models
to ultimately support precision biomedicine for the study and treatment of human disease.
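
For readers unfamiliar with the statistical model the abstract refers to, the standard linear mixed effects model can be written as below. The second line is only an assumed nonlinear analogue in illustrative notation (the symbols f, g, theta, and phi are not taken from the proposal); it sketches one way to read "separating cluster-invariant fixed effects from cluster-specific random effects" when the linear terms are replaced by learned functions.

```latex
% Classical linear mixed effects model for sample i in cluster j:
% beta = fixed effects shared by all clusters, u_j = random effects of cluster j.
y_{ij} = \mathbf{x}_{ij}^{\top}\boldsymbol{\beta}
       + \mathbf{z}_{ij}^{\top}\mathbf{u}_{j}
       + \varepsilon_{ij},
\qquad \mathbf{u}_{j} \sim \mathcal{N}(\mathbf{0}, \boldsymbol{\Sigma}),
\qquad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^{2})

% Assumed nonlinear analogue (illustrative only, not the proposal's formulation):
% f_theta models the cluster-invariant part, g_phi the cluster-specific part.
y_{ij} = f_{\theta}(\mathbf{x}_{ij})
       + g_{\phi}(\mathbf{x}_{ij}, \mathbf{u}_{j})
       + \varepsilon_{ij}
```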
Project outputs
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Albert Amos Montillo
Similar overseas grants
Unraveling the Dynamics of International Accounting: Exploring the Impact of IFRS Adoption on Firms' Financial Reporting and Business Strategies
- Grant number: 24K16488
- Fiscal year: 2024
- Funding amount: $344,400
- Project category: Grant-in-Aid for Early-Career Scientists
Mighty Accounting - Accountancy Automation for 1-person limited companies.
- Grant number: 10100360
- Fiscal year: 2024
- Funding amount: $344,400
- Project category: Collaborative R&D
Accounting for the Fall of Silver? Western exchange banking practice, 1870-1910
- Grant number: 24K04974
- Fiscal year: 2024
- Funding amount: $344,400
- Project category: Grant-in-Aid for Scientific Research (C)
A New Direction in Accounting Education for IT Human Resources
- Grant number: 23K01686
- Fiscal year: 2023
- Funding amount: $344,400
- Project category: Grant-in-Aid for Scientific Research (C)
An empirical and theoretical study of the double-accounting system in 19th-century American and British public utility companies
- Grant number: 23K01692
- Fiscal year: 2023
- Funding amount: $344,400
- Project category: Grant-in-Aid for Scientific Research (C)
An Empirical Analysis of the Value Effect: An Accounting Viewpoint
- Grant number: 23K01695
- Fiscal year: 2023
- Funding amount: $344,400
- Project category: Grant-in-Aid for Scientific Research (C)
Accounting model for improving performance on the health and productivity management
- Grant number: 23K01713
- Fiscal year: 2023
- Funding amount: $344,400
- Project category: Grant-in-Aid for Scientific Research (C)
CPS: Medium: Making Every Drop Count: Accounting for Spatiotemporal Variability of Water Needs for Proactive Scheduling of Variable Rate Irrigation Systems
- Grant number: 2312319
- Fiscal year: 2023
- Funding amount: $344,400
- Project category: Standard Grant
New Role of Not-for-Profit Entities and Their Accounting Standards to Be Unified
- Grant number: 23K01715
- Fiscal year: 2023
- Funding amount: $344,400
- Project category: Grant-in-Aid for Scientific Research (C)
Improving Age- and Cause-Specific Under-Five Mortality Rates (ACSU5MR) by Systematically Accounting Measurement Errors to Inform Child Survival Decision Making in Low Income Countries
- Grant number: 10585388
- Fiscal year: 2023
- Funding amount: $344,400
- Project category: