Big Data Methods for Comprehensive Similarity based Risk Prediction
Basic Information
- Grant number: 10551349
- Principal investigator:
- Amount: $455,700
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2019
- Funding country: United States
- Start and end dates: 2019-02-12 to 2025-01-31
- Project status: Not concluded
- Source:
- Keywords: Address; Automation; Big Data; Big Data Methods; Biological Markers; Biological Process; Biometry; Case Study; Characteristics; Chronic; Chronic Disease; Chronic Kidney Failure; Classification; Clinical; Clinical Data; Clinical Medicine; Complex; Data; Data Reporting; Data Science; Derivation procedure; Diagnosis; Disease; Disease Progression; Electronic Health Record; End stage renal failure; Environment; Etiology; Exhibits; Exposure to; Genetic; Genomics; Goals; Health Professional; Healthcare; Heterogeneity; Human; Individual; Informatics; Interdisciplinary Study; Intervention; Knowledge; Length; Life; Literature; Machine Learning; Medical; Medical Genetics; Medical Records; Methods; Modeling; Natural Language Processing; Outcome; Patients; Pharmaceutical Preparations; Population; Preparation; Reporting; Reproducibility; Research; Risk; Social Environment; Source; Surveys; Techniques; biomedical informatics; clinical decision support; clinical decision-making; clinical phenotype; clinical risk; data analysis pipeline; data modeling; data standards; design; disease diagnosis; feature selection; health assessment; health data; health determinants; improved; interoperability; mortality risk; novel; open data; open source; outcome prediction; patient health information; patient population; precision medicine; predict clinical outcome; risk prediction; socioeconomics; support tools; vector
Project Summary
Electronic health records (EHRs) provide a rich source of data about representative populations but have
yet to be fully utilized to enhance clinical decision-making. Conventional approaches to clinical
decision-making start with the identification of relevant biomarkers based on subject-matter knowledge,
followed by detailed but limited analysis using these biomarkers exclusively. As the current scientific
literature indicates, many human disorders share a complex etiological basis and exhibit correlated
disease progression. It is therefore desirable to use comprehensive patient data to assess patient
similarity. This proposal focuses on deriving a comprehensive and integrated score of patient similarity
from the complete patient characteristics currently available, including but not limited to
1) demographic similarity; 2) genetic similarity; 3) clinical phenotype similarity; 4) treatment
similarity; and 5) exposome similarity (here the exposome is defined as all available attributes of the
living environment an individual is exposed to), where some of these aspects may overlap and interact. We
will optimize information fusion and task-dependent feature selection for assessing patient similarity
for clinical risk prediction. Because no pipeline currently exists that can extract complete, executable
patient determinant data, we propose to first deliver an open-source data preparation pipeline based on
a widely used clinical data standard, the OMOP (Observational Medical Outcomes Partnership) Common Data
Model (CDM) version 5.2. Moreover, to mitigate the missingness and sparsity challenges common in clinical
data, we describe the first attempt to represent patients' sparse and incomplete clinical information,
including diagnosis information, medication data, and treatment interventions, as a fixed-length feature
vector (i.e., Patient2Vec). This project has four specific aims. Aim 1 is to develop a clinical data
processing pipeline for harmonizing patient information from multiple sources into a standards-based
uniform data representation and to evaluate its efficiency, interoperability, and accuracy. Aim 2 is to
leverage a powerful machine learning technique, Document2Vec, from the natural language processing
literature, to create an open-source Patient2Vec framework for deriving informative numerical
representations of patients. Aim 3 is to develop a unified machine learning clinical-outcome-prediction
framework for Optimized Patient Similarity Fusion (OptPSF) that integrates traditional medical covariates
with the numerical patient representations derived from Patient2Vec (Aim 2) for improved clinical risk
prediction. Aim 4 is to evaluate our similarity framework for predicting 1) the risk of end-stage kidney
disease (ESKD) in the general EHR patient population and 2) the risk of death among patients with chronic
kidney disease (CKD).
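The summary does not specify how the per-domain similarities are fused or turned into a risk estimate. As a purely illustrative sketch, and not the project's OptPSF method, the idea of an integrated similarity score can be shown with a weighted average of per-domain cosine similarities between fixed-length patient vectors, followed by a similarity-weighted nearest-neighbor risk estimate. The domain names, weights, toy vectors, and the functions `cosine`, `fused_similarity`, and `predict_risk` are all assumptions introduced here for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def fused_similarity(p, q, weights):
    """Weighted average of per-domain cosine similarities.

    p and q map a (hypothetical) domain name to that patient's feature vector;
    weights maps the same domain names to non-negative weights.
    """
    total = sum(weights.values())
    return sum(w * cosine(p[d], q[d]) for d, w in weights.items()) / total

def predict_risk(query, cohort, outcomes, weights, k=2):
    """Similarity-weighted mean outcome of the k most similar cohort patients."""
    scored = sorted(
        ((fused_similarity(query, p, weights), y) for p, y in zip(cohort, outcomes)),
        key=lambda t: t[0],
        reverse=True,
    )[:k]
    # Negative similarities are clamped to zero so they cannot act as weights.
    sims = [max(s, 0.0) for s, _ in scored]
    denom = sum(sims)
    if denom == 0:
        # No informative neighbor: fall back to the unweighted mean.
        return sum(y for _, y in scored) / len(scored)
    return sum(s * y for s, (_, y) in zip(sims, scored)) / denom

# Toy data (made up): two domains, three cohort patients with binary outcomes.
weights = {"demographic": 1.0, "clinical": 1.0}
cohort = [
    {"demographic": [1.0, 0.0], "clinical": [1.0, 0.0]},
    {"demographic": [0.0, 1.0], "clinical": [0.0, 1.0]},
    {"demographic": [1.0, 0.0], "clinical": [0.0, 1.0]},
]
outcomes = [1, 0, 1]
query = {"demographic": [0.0, 1.0], "clinical": [0.0, 1.0]}
risk = predict_risk(query, cohort, outcomes, weights, k=2)
```

In this sketch the weights are fixed by hand. A task-dependent scheme of the kind the summary mentions would presumably learn them from outcome data, which is not attempted here.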
Project Outcomes
Journal articles (1)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants by KRZYSZTOF KIRYLUK
Non-APOL1 genetic factors and kidney transplant outcomes
- Grant number: 10717171
- Fiscal year: 2023
- Funding amount: $455,700
- Project category:
MHC and KIR Sequencing and Association Analyses in the iGeneTRAiN Studies
- Grant number: 10438855
- Fiscal year: 2020
- Funding amount: $455,700
- Project category:
MHC and KIR Sequencing and Association Analyses in the iGeneTRAiN Studies
- Grant number: 10251946
- Fiscal year: 2020
- Funding amount: $455,700
- Project category:
MHC and KIR Sequencing and Association Analyses in the iGeneTRAiN Studies
- Grant number: 10020606
- Fiscal year: 2020
- Funding amount: $455,700
- Project category:
Big Data Methods for Comprehensive Similarity based Risk Prediction
- Grant number: 10323033
- Fiscal year: 2019
- Funding amount: $455,700
- Project category:
Big Data Methods for Comprehensive Similarity based Risk Prediction
- Grant number: 10087958
- Fiscal year: 2019
- Funding amount: $455,700
- Project category:
Genetics of IgA nephropathy by integrative network-based association studies
- Grant number: 9258422
- Fiscal year: 2015
- Funding amount: $455,700
- Project category:
Similar Overseas Grants
Treecle - data and automation to unlock woodland creation in the UK to achieve net zero
- Grant number: 10111492
- Fiscal year: 2024
- Funding amount: $455,700
- Project category: SME Support
STTR Phase II: Optimized manufacturing and machine learning based automation of Endothelium-on-a-chip microfluidic devices for drug screening applications.
- Grant number: 2332121
- Fiscal year: 2024
- Funding amount: $455,700
- Project category: Cooperative Agreement
Improving access to AI automation to support new digital offerings within Professional/Financial Services
- Grant number: 10095096
- Fiscal year: 2024
- Funding amount: $455,700
- Project category: Collaborative R&D
Cost-Effective, AI-driven Automation Technology for Cell Culture Monitoring: Boosting Efficiency and Sustainability in Industrial Biomanufacturing and Streamlining Supply Chains
- Grant number: 10104748
- Fiscal year: 2024
- Funding amount: $455,700
- Project category: Launchpad
Sustainable Remanufacturing solution with increased automation and recycled content in laser and plasma based process (RESTORE)
- Grant number: 10112149
- Fiscal year: 2024
- Funding amount: $455,700
- Project category: EU-Funded
Next-generation automation and PAT implementation for QbD and enhanced approaches for cell and gene therapy
- Grant number: 10087446
- Fiscal year: 2024
- Funding amount: $455,700
- Project category: Collaborative R&D
SBIR Phase II: Radar-based Building Automation
- Grant number: 2335079
- Fiscal year: 2024
- Funding amount: $455,700
- Project category: Cooperative Agreement
Automation and cost reduction of the hardware and software components of a novel indoor sustainable vertical growing solution
- Grant number: 83007861
- Fiscal year: 2024
- Funding amount: $455,700
- Project category: Innovation Loans
Artificial intelligence coupled to automation for accelerated medicine design
- Grant number: EP/Z533038/1
- Fiscal year: 2024
- Funding amount: $455,700
- Project category: Research Grant
CAREER: Algorithm-Hardware Co-design of Efficient Large Graph Machine Learning for Electronic Design Automation
- Grant number: 2340273
- Fiscal year: 2024
- Funding amount: $455,700
- Project category: Continuing Grant













