PREMIERE: A PREdictive Model Index and Exchange REpository

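Basic Information

Grant number:
10668938
Principal investigator:
ALEX BUI
Amount:
$673,500
Host institution:
UNIVERSITY OF CALIFORNIA LOS ANGELES
Host institution country:
United States
Project category:
Fiscal year:
2019
Funding country:
United States
Project period:
2019-09-15 to 2024-05-31
Project status:
Completed

Source:
https://reporter.nih.gov/project-details/10668938
Keywords: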
Acceleration Access to Information Address Algorithms Area Attention Bayesian Network Big Data Biological Markers Calibration Characteristics Clinical Communities Computational Biology Computer software Data Data Science Data Set Data Sources Decision Making Decision Trees Dermatology Development Diagnosis Diagnostic Diagnostic Imaging Ecosystem Educational workshop Electronic Health Record Environment Evaluation FAIR principles Fostering Foundations Goals Human Image Image Analysis Informatics Language Link Literature Machine Learning Measures Medical Metadata Methodology Methods Modeling Nature Ophthalmology Parameter Estimation Pathway interactions Performance Population Proliferating Publications Publishing Radiology Specialty Receiver Operating Characteristics Reporting Reproducibility Reproduction Research Personnel Risk Assessment Source Techniques Testing Training Validation Variant Work biomarker discovery biomedical imaging case-based cohort collaborative environment comparative computer aided detection convolutional neural network data harmonization data sharing deep learning deep learning model design experience feature selection improved indexing innovation insight interest interoperability learning network lung basal segment lung cancer screening mHealth machine learning method machine learning prediction model development novel novel strategies online repository predictive modeling prognostic repository software repository statistical and machine learning statistics stem tool web portal

Project Abstract

The confluence of new machine learning (ML) data-driven approaches; increased computational power; and access to the wealth of electronic health records (EHRs) and other emergent types of data (e.g., omics, imaging, mHealth) are accelerating the development of biomedical predictive models. Such models range from traditional statistical approaches (e.g., regression) through to more advanced deep learning techniques (e.g., convolutional neural networks, CNNs), and span different tasks (e.g., biomarker/pathway discovery, diagnostic, prognostic). Two issues have become evident: 1) as there are no comprehensive standards to support the dissemination of these models, scientific reproducibility is problematic, given challenges in interpretation and implementation; and 2) as new models are put forth, methods to assess differences in performance, as well as insights into external validity (i.e., transportability), are necessary. Tools moving beyond the sharing of data and model “executables” are needed, capturing the (meta)data necessary to fully reproduce a model and its evaluation. The objective of this R01 is the development of an informatics standard supporting the requisite information for scientific reproducibility for statistical and ML-based biomedical predictive models; from this foundation, we then develop new computational approaches to compare models' performance. We begin by extending the current Predictive Model Markup Language (PMML) standard to fully characterize biomedical datasets and harmonize variable definitions; to elucidate the algorithms involved in model creation (e.g., data preprocessing, parameter estimation); and to explain the validation methodology. Importantly, models in this PMML format will become findable, accessible, interoperable, and reusable (i.e., following FAIR principles). We then propose novel methods to compare and contrast predictive models, assessing transportability across datasets.
While metrics exist for comparing models (e.g., c-statistics, calibration), often the required case-level information is not available to calculate these measures. We thus introduce an approach to simulate cases based on a model's reported dataset statistics, enabling such calculations. Different levels of transportability are then assigned to the metrics, determining the extent to which a selected model is applicable to a given population/cohort (i.e., helping answer the question, can I use this published model with my own data?). We tie these efforts together in our proposed framework, the PREdictive Model Index & Exchange REpository (PREMIERE). We will develop an online portal and repository for model sharing around PREMIERE, and our efforts will include fostering a community of users to guide its development through workshops, model-thons, and other activities. To demonstrate these efforts, we will bootstrap PREMIERE with predictive models from a targeted domain (risk assessment in imaging-based lung cancer screening). Our efforts to evaluate these developments will engage a range of stakeholders (model developers, users) to inform the completeness of our standard; and biostatisticians and clinical experts to guide assessment of model transportability.
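
The abstract does not specify how cases are simulated, so the following Python sketch is only an illustration of the general idea, not the project's method. It assumes a published model reports per-variable summary statistics (mean and standard deviation, or prevalence) and logistic-regression coefficients. The variable names, statistics, and coefficients below are all hypothetical, variables are drawn independently, and outcomes are drawn from the model's own predicted risks, so the resulting c-statistic reflects discrimination only under the assumption that the model holds in the simulated population.

```python
import math
import random


def simulate_cases(stats, n, rng):
    """Draw n synthetic cases from reported summary statistics.

    Assumption: variables are independent; continuous ones are normal
    (mean, sd) and binary ones are Bernoulli (prevalence).
    """
    cases = []
    for _ in range(n):
        case = {}
        for name, spec in stats.items():
            if spec["kind"] == "continuous":
                case[name] = rng.gauss(spec["mean"], spec["sd"])
            else:
                case[name] = 1 if rng.random() < spec["prevalence"] else 0
        cases.append(case)
    return cases


def predict_risk(case, intercept, coefs):
    """Logistic model: risk = 1 / (1 + exp(-(intercept + sum(b * x))))."""
    z = intercept + sum(coefs[name] * case[name] for name in coefs)
    return 1.0 / (1.0 + math.exp(-z))


def c_statistic(risks, outcomes):
    """Fraction of (event, non-event) pairs in which the event case has
    the higher predicted risk; ties count as one half."""
    pos = [r for r, y in zip(risks, outcomes) if y == 1]
    neg = [r for r, y in zip(risks, outcomes) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                concordant += 1.0
            elif p == q:
                concordant += 0.5
    return concordant / (len(pos) * len(neg))


# Hypothetical reported statistics and coefficients (not from any real model).
reported_stats = {
    "age": {"kind": "continuous", "mean": 62.0, "sd": 5.0},
    "smoker": {"kind": "binary", "prevalence": 0.4},
}
intercept = -8.0
coefs = {"age": 0.1, "smoker": 0.8}

rng = random.Random(0)  # fixed seed so the sketch is repeatable
cases = simulate_cases(reported_stats, 500, rng)
risks = [predict_risk(c, intercept, coefs) for c in cases]
# Outcomes are sampled from the predicted risks (a strong assumption).
outcomes = [1 if rng.random() < r else 0 for r in risks]
auc = c_statistic(risks, outcomes)
print(f"simulated c-statistic: {auc:.3f}")
```

To probe transportability in this spirit, one could replace `reported_stats` with the summary statistics of a different cohort and compare the resulting values; a realistic version would also need to model correlations between variables.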

Project Outcomes

Journal articles (5)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Prevention of Bias and Discrimination in Clinical Practice Algorithms.

DOI:
10.1001/jama.2022.23867
Publication date:
2023
Journal:
JAMA
Impact factor:
0
Authors:
Shachar, Carmel; Gerke, Sara
Corresponding author:
Gerke, Sara

How AI can learn from the law: putting humans in the loop only on appeal.

DOI:
10.1038/s41746-023-00906-8
Publication date:
2023-08-25
Journal:
NPJ digital medicine
Impact factor:
15.2
Authors:
Corresponding author:

Health Care AI and Patient Privacy-Dinerstein v Google.

DOI:
10.1001/jama.2024.1110
Publication date:
2024
Journal:
JAMA
Impact factor:
0
Authors:
Duffourc, Mindy Nunez; Gerke, Sara
Corresponding author:
Gerke, Sara

Artificial intelligence tools in clinical neuroradiology: essential medico-legal aspects.

DOI:
10.1007/s00234-023-03152-7
Publication date:
2023-07
Journal:
NEURORADIOLOGY
Impact factor:
2.8
Authors:
Hedderich, Dennis M.; Weisstanner, Christian; Van Cauter, Sofie; Federau, Christian; Edjlali, Myriam; Radbruch, Alexander; Gerke, Sara; Haller, Sven
Corresponding author:
Haller, Sven