Scalable Biomedical Pattern Recognition Via Deep Learning
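Basic information
- Grant number: 8689173
- Principal investigator:
- Amount: $202,700
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2013
- Funding country: United States
- Project period: 2013-07-01 to 2016-04-29
- Project status: Completed
- Source: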
- Keywords: Acquired Immunodeficiency Syndrome; Algorithms; Architecture; Area; Biomedical Research; Caring; Classification; Clinical; Clinical Data; Computational algorithm; Computerized Medical Record; Couples; Data; Data Set; Data Sources; Dependence; Development; Diabetes Mellitus; Diagnosis; Disease; Engineering; Evaluation; Exhibits; Feedback; Goals; Gold; Healthcare Systems; Human; Image; Individual; Judgment; Knowledge; Label; Laboratories; Learning; Literature; Manuals; Measures; Medical; Metformin; Methods; Modeling; Myocardial Infarction; Nature; Non-Insulin-Dependent Diabetes Mellitus; Outcome; PTGS2 gene; Pattern; Pattern Recognition; Performance; Pharmaceutical Preparations; Phenotype; Pilot Projects; Population; Probability; Process; ROC Curve; Records; Research; Risk; Sampling; Sea; Serum; Single Nucleotide Polymorphism; Source; Specific qualifier value; Structure; Testing; Time; To specify; Uncertainty; Uric Acid; Work; cell growth; clinical care; clinical phenotype; density; diabetic; genetic association; genetic variant; inhibitor/antagonist; interest; meetings; neoplastic cell; non-diabetic; outcome forecast; prevent; speech recognition; success; type I and type II diabetes
Project abstract
DESCRIPTION (provided by applicant): Patterns extracted from Electronic Medical Records (EMRs) and other biomedical datasets can provide valuable feedback to a learning healthcare system, but our ability to find them is limited by certain manual steps. The dominant approach to finding the patterns uses supervised learning, where a computational algorithm searches for patterns among input variables (or features) that model an outcome variable (or label). This usually requires an expert to specify the learning task, construct input features, and prepare the outcome labels. This workflow has served us well for decades, but the dependence on human effort prevents it from scaling, and it misses the most informative patterns, which are almost by definition the ones that nobody anticipates. It is poorly suited to the emerging era of population-scale data, in which we can conceive of massive new undertakings such as surveilling for all emerging diseases, detecting all unanticipated medication effects, or inferring the complete clinical phenotype of all genetic variants.
The approach of unsupervised feature learning overcomes these limitations by identifying meaningful patterns in massive, unlabeled datasets with little or no human involvement. While there is a large literature on feature creation, a new surge of interest in unsupervised methods is being driven by the recent development of deep learning, in which a compact hierarchy of expressive features is learned from large unlabeled datasets. In the domains of image and speech recognition, deep learning has produced features that meet or exceed (by as much as 70%) the previous state of the art on difficult standardized tasks.
Unfortunately, the noisy, sparse, and irregular data typically found in an EMR is a poor substrate for deep learning. Our approach uses Gaussian process regression to convert such an irregular sequence of observations into a longitudinal probability density that is suitable for use with a deep architecture. With this approach, we can learn continuous unsupervised features that capture the longitudinal structure of sparse and irregular observations. In our preliminary results, unsupervised features were as powerful (0.96 AUC) in an unanticipated classification task as gold-standard features engineered by an expert with full knowledge of the domain, the classification task, and the class labels.
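As a rough illustration of the Gaussian process step, the sketch below turns a handful of irregularly timed lab values into a posterior mean and standard deviation on a regular time grid, i.e. one Gaussian density per grid point. It is not the applicant's implementation: the squared-exponential kernel, the hyperparameters (`length_scale`, `variance`, `noise`), the function names, and the sample values are all assumptions chosen for illustration.

```python
import math

def rbf(t1, t2, length_scale=30.0, variance=1.0):
    """Squared-exponential covariance between two time points (assumed kernel)."""
    return variance * math.exp(-0.5 * ((t1 - t2) / length_scale) ** 2)

def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low

def solve_lower(low, b):
    """Forward substitution: solve L x = b."""
    x = []
    for i in range(len(b)):
        x.append((b[i] - sum(low[i][k] * x[k] for k in range(i))) / low[i][i])
    return x

def solve_lower_transposed(low, b):
    """Back substitution: solve L^T x = b."""
    n = len(b)
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(low[k][i] * x[k] for k in range(i + 1, n))
        x[i] = (b[i] - s) / low[i][i]
    return x

def gp_posterior(times, values, grid, noise=0.1, length_scale=30.0, variance=1.0):
    """Posterior mean and standard deviation of the latent function at each grid time."""
    mu = sum(values) / len(values)  # constant prior mean (an assumption)
    y = [v - mu for v in values]
    # Covariance of the noisy observations: K + noise^2 * I
    k = [[rbf(a, b, length_scale, variance) + (noise ** 2 if i == j else 0.0)
          for j, b in enumerate(times)] for i, a in enumerate(times)]
    low = cholesky(k)
    alpha = solve_lower_transposed(low, solve_lower(low, y))  # (K + noise^2 I)^-1 y
    means, stds = [], []
    for t in grid:
        ks = [rbf(t, a, length_scale, variance) for a in times]
        v = solve_lower(low, ks)
        means.append(mu + sum(p * q for p, q in zip(ks, alpha)))
        stds.append(math.sqrt(max(variance - sum(x * x for x in v), 0.0)))
    return means, stds

# Made-up, irregularly spaced measurements (time in days, value in arbitrary units).
times = [0.0, 12.0, 15.0, 90.0, 200.0]
values = [7.1, 6.8, 7.0, 5.2, 5.5]
grid = [float(d) for d in range(0, 201, 10)]
means, stds = gp_posterior(times, values, grid)
for t, m, s in zip(grid, means, stds):
    print(f"day {t:5.0f}: mean {m:5.2f}, std {s:4.2f}")
```

The standard deviation stays small near observed time points and widens in the gaps between them, which is how the density carries the uncertainty of a sparse record forward to whatever model consumes the grid.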
In this project we will learn unsupervised features for records of all individuals in our deidentified EMR image, for each of 100 laboratory tests and 200 medications of relevance to type 1 or type 2 diabetes. We will evaluate the features using three pattern recognition tasks that were unknown to the feature-learning algorithm: 1) an easy supervised classification task of distinguishing diabetics vs. nondiabetics, 2) a much more difficult task of distinguishing type 1 vs. type 2 diabetics, and 3) a genetic association task that considers the features as micro-phenotypes and measures their association with 29 different single nucleotide polymorphisms with known associations to type 1 or type 2 diabetes.
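The abstract reports classification performance as AUC (area under the ROC curve). As a minimal sketch of how such a score could be computed for a two-class task like diabetic vs. nondiabetic, the function below uses the pairwise-ranking definition of AUC. The function name `roc_auc` and the labels and scores are hypothetical and do not come from the project.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical classifier scores: 1 = diabetic, 0 = nondiabetic.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
print(round(roc_auc(labels, scores), 3))  # 8 of 9 pairs ranked correctly -> 0.889
```

This quadratic pairwise form is only meant to make the definition concrete; for large cohorts a sort-based computation would be the usual choice.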
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants by Thomas Lasko
Data-Driven Guidance for Timing Repeated Inpatient Laboratory Tests
- Grant number: 10450243
- Fiscal year: 2022
- Amount: $202,700
- Project category:
Data-Driven Guidance for Timing Repeated Inpatient Laboratory Tests
- Grant number: 10599337
- Fiscal year: 2022
- Amount: $202,700
- Project category:
Identification, Extraction and Display of Clinical Data Patterns with Application to Anesthesia Workflows
- Grant number: 9248768
- Fiscal year: 2016
- Amount: $202,700
- Project category:
Identification, Extraction and Display of Clinical Data Patterns with Application to Anesthesia Workflows
- Grant number: 9051683
- Fiscal year: 2016
- Amount: $202,700
- Project category:
Identification, Extraction and Display of Clinical Data Patterns with Application to Anesthesia Workflows
- Grant number: 9420613
- Fiscal year: 2016
- Amount: $202,700
- Project category:
Scalable Biomedical Pattern Recognition Via Deep Learning
- Grant number: 9302040
- Fiscal year: 2013
- Amount: $202,700
- Project category:
Similar overseas grants
CAREER: Efficient Algorithms for Modern Computer Architecture
- Grant number: 2339310
- Fiscal year: 2024
- Amount: $202,700
- Project category: Continuing Grant
Collaborative Research: SHF: Small: Artificial Intelligence of Things (AIoT): Theory, Architecture, and Algorithms
- Grant number: 2221742
- Fiscal year: 2022
- Amount: $202,700
- Project category: Standard Grant
Collaborative Research: SHF: Small: Artificial Intelligence of Things (AIoT): Theory, Architecture, and Algorithms
- Grant number: 2221741
- Fiscal year: 2022
- Amount: $202,700
- Project category: Standard Grant
Algorithms and Architecture for Super Terabit Flexible Multicarrier Coherent Optical Transmission
- Grant number: 533529-2018
- Fiscal year: 2020
- Amount: $202,700
- Project category: Collaborative Research and Development Grants
OAC Core: Small: Architecture and Network-aware Partitioning Algorithms for Scalable PDE Solvers
- Grant number: 2008772
- Fiscal year: 2020
- Amount: $202,700
- Project category: Standard Grant
Algorithms and Architecture for Super Terabit Flexible Multicarrier Coherent Optical Transmission
- Grant number: 533529-2018
- Fiscal year: 2019
- Amount: $202,700
- Project category: Collaborative Research and Development Grants
Visualization of FPGA CAD Algorithms and Target Architecture
- Grant number: 541812-2019
- Fiscal year: 2019
- Amount: $202,700
- Project category: University Undergraduate Student Research Awards
Collaborative Research: ABI Innovation: Algorithms for recovering root architecture from 3D imaging
- Grant number: 1759836
- Fiscal year: 2018
- Amount: $202,700
- Project category: Standard Grant
Collaborative Research: ABI Innovation: Algorithms for recovering root architecture from 3D imaging
- Grant number: 1759796
- Fiscal year: 2018
- Amount: $202,700
- Project category: Standard Grant
Collaborative Research: ABI Innovation: Algorithms for recovering root architecture from 3D imaging
- Grant number: 1759807
- Fiscal year: 2018
- Amount: $202,700
- Project category: Standard Grant