Scalable Biomedical Pattern Recognition Via Deep Learning

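Basic Information

  • Grant number:
    9302040
  • Principal investigator:
  • Amount:
    $7,700
  • Host institution:
  • Host institution country:
    United States
  • Project category:
  • Fiscal year:
    2013
  • Funding country:
    United States
  • Project period:
    2013-07-01 to 2016-06-30
  • Project status:
    Completed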

Project Abstract

DESCRIPTION (provided by applicant): Patterns extracted from Electronic Medical Records (EMRs) and other biomedical datasets can provide valuable feedback to a learning healthcare system, but our ability to find them is limited by certain manual steps. The dominant approach to finding the patterns uses supervised learning, where a computational algorithm searches for patterns among input variables (or features) that model an outcome variable (or label). This usually requires an expert to specify the learning task, construct input features, and prepare the outcome labels. This workflow has served us well for decades, but the dependence on human effort prevents it from scaling, and it misses the most informative patterns, which are almost by definition the ones that nobody anticipates. It is poorly suited to the emerging era of population-scale data, in which we can conceive of massive new undertakings such as surveilling for all emerging diseases, detecting all unanticipated medication effects, or inferring the complete clinical phenotype of all genetic variants. The approach of unsupervised feature learning overcomes these limitations by identifying meaningful patterns in massive, unlabeled datasets with little or no human involvement. While there is a large literature on feature creation, a new surge of interest in unsupervised methods is being driven by the recent development of deep learning, in which a compact hierarchy of expressive features is learned from large unlabeled datasets. In the domains of image and speech recognition, deep learning has produced features that meet or exceed (by as much as 70%) the previous state of the art on difficult standardized tasks. Unfortunately, the noisy, sparse, and irregular data typically found in an EMR is a poor substrate for deep learning.
Our approach uses Gaussian process regression to convert such an irregular sequence of observations into a longitudinal probability density that is suitable for use with a deep architecture. With this approach, we can learn continuous unsupervised features that capture the longitudinal structure of sparse and irregular observations. In our preliminary results, unsupervised features were as powerful (0.96 AUC) in an unanticipated classification task as gold-standard features engineered by an expert with full knowledge of the domain, the classification task, and the class labels. In this project we will learn unsupervised features for records of all individuals in our de-identified EMR image, for each of 100 laboratory tests and 200 medications of relevance to type 1 or type 2 diabetes. We will evaluate the features using three pattern recognition tasks that were unknown to the feature-learning algorithm: 1) an easy supervised classification task of distinguishing diabetics vs. nondiabetics, 2) a much more difficult task of distinguishing type 1 vs. type 2 diabetics, and 3) a genetic association task that considers the features as micro-phenotypes and measures their association with 29 different single nucleotide polymorphisms with known associations to type 1 or type 2 diabetes.
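
As an illustration of the Gaussian process step described in the abstract, the following is a minimal sketch of how a sparse, irregular sequence of laboratory values might be converted into a per-time-point Gaussian density (mean and variance) on a regular grid. It is not the applicants' implementation: the squared-exponential kernel, the hyperparameter values, the constant-mean choice, the function names `rbf_kernel` and `gp_posterior`, and the sample data are all assumptions made for this example.

```python
import numpy as np


def rbf_kernel(t1, t2, length_scale, signal_var):
    """Squared-exponential covariance between two sets of time points."""
    d = t1[:, None] - t2[None, :]
    return signal_var * np.exp(-0.5 * (d / length_scale) ** 2)


def gp_posterior(t_obs, y_obs, t_grid,
                 length_scale=30.0, signal_var=1.0, noise_var=0.1):
    """GP regression posterior mean and variance on a regular time grid.

    Hyperparameters are illustrative defaults (assumed, not from the source).
    The prior mean is taken to be the sample mean of the observations.
    """
    prior_mean = y_obs.mean()
    K = rbf_kernel(t_obs, t_obs, length_scale, signal_var)
    K += noise_var * np.eye(len(t_obs))            # observation noise
    K_s = rbf_kernel(t_grid, t_obs, length_scale, signal_var)

    L = np.linalg.cholesky(K)                       # K = L @ L.T
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_obs - prior_mean))
    mu = prior_mean + K_s @ alpha                   # posterior mean

    v = np.linalg.solve(L, K_s.T)
    var = signal_var - np.sum(v ** 2, axis=0)       # posterior variance
    return mu, np.maximum(var, 0.0)                 # clip round-off negatives


# Hypothetical irregular observations: days since first visit, lab value.
t_obs = np.array([0.0, 3.0, 40.0, 45.0, 200.0])
y_obs = np.array([6.1, 6.3, 7.0, 7.2, 6.5])

# Regular 5-day grid over one year: 0, 5, ..., 365.
t_grid = np.arange(0.0, 366.0, 5.0)
mu, var = gp_posterior(t_obs, y_obs, t_grid)
```

Each grid point then carries a Gaussian density with mean `mu[i]` and variance `var[i]`. The variance is small near actual observations and reverts to the prior variance far from them, which is what lets a downstream model distinguish measured values from interpolated ones.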

Project Outcomes

Journal articles (2)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Nonstationary Gaussian Process Regression for Evaluating Clinical Laboratory Test Sampling Strategies.
Efficient Inference of Gaussian-Process-Modulated Renewal Processes with Application to Medical Event Data.

Other Grants by Thomas Lasko

Data-Driven Guidance for Timing Repeated Inpatient Laboratory Tests
  • Grant number:
    10450243
  • Fiscal year:
    2022
  • Funding amount:
    $7,700
  • Project category:
Data-Driven Guidance for Timing Repeated Inpatient Laboratory Tests
  • Grant number:
    10599337
  • Fiscal year:
    2022
  • Funding amount:
    $7,700
  • Project category:
Identification, Extraction and Display of Clinical Data Patterns with Application to Anesthesia Workflows
  • Grant number:
    9248768
  • Fiscal year:
    2016
  • Funding amount:
    $7,700
  • Project category:
Identification, Extraction and Display of Clinical Data Patterns with Application to Anesthesia Workflows
  • Grant number:
    9051683
  • Fiscal year:
    2016
  • Funding amount:
    $7,700
  • Project category:
Identification, Extraction and Display of Clinical Data Patterns with Application to Anesthesia Workflows
  • Grant number:
    9420613
  • Fiscal year:
    2016
  • Funding amount:
    $7,700
  • Project category:
Scalable Biomedical Pattern Recognition Via Deep Learning
  • Grant number:
    8689173
  • Fiscal year:
    2013
  • Funding amount:
    $7,700
  • Project category:

Similar Overseas Grants

Investigation of deep learning neural network architectures for biomedical pattern classification problems
  • Grant number:
    485676-2015
  • Fiscal year:
    2015
  • Funding amount:
    $7,700
  • Project category:
    Engage Grants Program
Scalable Biomedical Pattern Recognition Via Deep Learning
  • Grant number:
    8689173
  • Fiscal year:
    2013
  • Funding amount:
    $7,700
  • Project category:
Cell image tracking over time to extract spatial and temporal features of individual cells by combination of biomedical/biological image processing and pattern recognition techniques
  • Grant number:
    319408-2005
  • Fiscal year:
    2007
  • Funding amount:
    $7,700
  • Project category:
    Postgraduate Scholarships - Doctoral
CISM course "Pattern formation at interfaces with applications to material science, biomedical and physico-chemical processes" (16-20 October 2006, Udine, Italy)
  • Grant number:
    36678258
  • Fiscal year:
    2006
  • Funding amount:
    $7,700
  • Project category:
    Research Grants
CISM course "Pattern formation at interfaces with applications to materials science, biomedical and physico-chemical processes" (16-20 October 2006, Udine, Italy)
  • Grant number:
    33237560
  • Fiscal year:
    2006
  • Funding amount:
    $7,700
  • Project category:
    Research Grants
CISM course "Pattern formation at interfaces with applications to materials science, biomedical and physico-chemical processes" (16-20 October 2006, Udine, Italy)
  • Grant number:
    36678035
  • Fiscal year:
    2006
  • Funding amount:
    $7,700
  • Project category:
    Research Grants
Cell image tracking over time to extract spatial and temporal features of individual cells by combination of biomedical/biological image processing and pattern recognition techniques
  • Grant number:
    319408-2005
  • Fiscal year:
    2006
  • Funding amount:
    $7,700
  • Project category:
    Postgraduate Scholarships - Doctoral
Cell image tracking over time to extract spatial and temporal features of individual cells by combination of biomedical/biological image processing and pattern recognition techniques
  • Grant number:
    319408-2005
  • Fiscal year:
    2005
  • Funding amount:
    $7,700
  • Project category:
    Postgraduate Scholarships - Doctoral
Pattern recognition with neural networks for biomedical, hands written text and human face recognition
  • Grant number:
    1749-2003
  • Fiscal year:
    2004
  • Funding amount:
    $7,700
  • Project category:
    Discovery Grants Program - Individual
Pattern recognition with neural networks for biomedical, hands written text and human face recognition
  • Grant number:
    1749-2003
  • Fiscal year:
    2003
  • Funding amount:
    $7,700
  • Project category:
    Discovery Grants Program - Individual