Integrative data science approaches for rare disease discovery in health records
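Basic information
- Grant number: 10541283
- Principal investigator:
- Amount: $235,900
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2022
- Funding country: United States
- Period: 2022-06-01 to 2025-05-31
- Project status: Ongoing
- Source: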
- Keywords: Adult; Affect; American; Award; Basic Science; Behavioral; Bioinformatics; Clinical; Clinical Data; Clinical Medicine; Clinical Research; Computing Methodologies; Consensus; Data; Data Science; Data Set; Detection; Diagnosis; Diagnostic; Diagnostics Research; Disease; Economic Burden; Electronic Health Record; Environment; Faculty; Family; Genes; Genetic; Genomics; Genotype; Goals; Healthcare; Healthcare Systems; Individual; Informatics; Knowledge; Machine Learning; Manuscripts; Markov Chains; Medical; Medical Genetics; Mental disorders; Mentors; Mentorship; Methods; Mining; Modeling; Molecular; Names; Natural Language Processing; Natural Language Processing pipeline; Ontology; Outcome; Pacific Northwest; Patient Recruitments; Patients; Pattern; Persons; Phase; Phenotype; Population; Positioning Attribute; Prevalence; Principal Investigator; Rare Diseases; Recording of previous events; Research; Research Personnel; Standardization; Symptoms; System; Testing; Time; Training; Universities; Validation; Vocabulary; Washington; Work; accurate diagnosis; base; biomedical data science; biomedical informatics; career; causal variant; clinical data warehouse; clinical decision-making; cohort; diagnostic accuracy; disease phenotype; early onset disorder; exome sequencing; gene discovery; genomic data; health care delivery; health data; health record; improved; member; multimodal data; novel; open source; patient health information; phenotypic data; prototype; psychologic; rare condition; rare genetic disorder; recruit; skills; software development; support tools; tool; trait
Project abstract
ABSTRACT: There are nearly 7,000 diseases that have a prevalence of only one in 2,000 individuals or less.
Yet, such rare diseases are estimated to collectively affect over 300 million people worldwide, representing a
significant healthcare concern. Although rare diseases have predominantly genetic origins, nearly half of them
do not manifest symptoms until adulthood and frequently confound discovery and diagnosis. Even in the case
of early onset disorders, the sheer number of possible diagnoses can often overwhelm clinicians. As a result,
rare diseases are often diagnosed with delay, misdiagnosed or even remain undiagnosed, not only disrupting
patient lives but also hindering progress on our understanding of such diseases. Data science methods that
mine large-scale retrospective health record data for phenotypic information will aid in timely and accurate
diagnoses of rare diseases, especially when combined with additional data types, thus having significant
real-world impact. This proposal will integrate electronic health record (EHR) data sets with publicly available
vocabularies and ontologies, and genomic data for the improved identification and characterization of patients
with rare diseases, using approaches from machine learning, natural language processing (NLP) and basic
bioinformatics. The work has three specific aims and will be carried out in two phases. During the mentored
phase, the principal investigator (PI) will develop data-driven methods to extract standardized concepts related
to rare diseases from clinical notes and infer the occurrence of each disease (Aim 1). He will also develop data
science approaches to compare and contrast longitudinal patterns associated with patients' journeys through
the healthcare system when seeking a diagnosis for a rare disease, and aid in clinical decision-making by
leveraging these patterns (Aim 2). During the independent phase (Aim 3), computational methods will be
developed for the integrated modeling and analysis of genotypic (from Aim 3) and phenotypic information (from
Aims 1 and 2). Cohorts to be sequenced will cover diseases for which causal genes or disease definitions are
unclear (discovery), as well as those for which these are well known (validation). This work will be carried out
under the mentorship of four faculty members with complementary expertise in biomedical informatics, data
science, NLP, and rare disease genomics at the University of Washington, which offers the largest medical system in the
Pacific Northwest (four million EHRs), world-renowned researchers in medical genetics, and a robust data
science environment. In addition, under the direction of the mentoring team, the PI will complete advanced
coursework, receive training in translational bioinformatics and clinical research informatics, submit
manuscripts, and seek an independent research position. This proposal will yield preliminary results for
subsequent studies on data-driven phenotyping and enable the realization of the PI's career goals by providing
him with the necessary training to build on his machine learning and basic bioinformatics expertise to transition
into an independent investigator in biomedical data science.
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants by Vikas Rao Pejaver
Integrative data science approaches for rare disease discovery in health records
- Grant number: 10626148
- Fiscal year: 2022
- Funding amount: $235,900
- Project category:
Integrative data science approaches for rare disease discovery in health records
- Grant number: 9884791
- Fiscal year: 2019
- Funding amount: $235,900
- Project category:
Similar overseas grants
How Does Particle Material Properties Insoluble and Partially Soluble Affect Sensory Perception Of Fat based Products
- Grant number: BB/Z514391/1
- Fiscal year: 2024
- Funding amount: $235,900
- Project category: Training Grant
BRC-BIO: Establishing Astrangia poculata as a study system to understand how multi-partner symbiotic interactions affect pathogen response in cnidarians
- Grant number: 2312555
- Fiscal year: 2024
- Funding amount: $235,900
- Project category: Standard Grant
RII Track-4:NSF: From the Ground Up to the Air Above Coastal Dunes: How Groundwater and Evaporation Affect the Mechanism of Wind Erosion
- Grant number: 2327346
- Fiscal year: 2024
- Funding amount: $235,900
- Project category: Standard Grant
Graduating in Austerity: Do Welfare Cuts Affect the Career Path of University Students?
- Grant number: ES/Z502595/1
- Fiscal year: 2024
- Funding amount: $235,900
- Project category: Fellowship
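Construction of Affect-X, an index of individual differences in kansei (sensibility), and establishment of a foundation for bespoke AI services
- Grant number: 23K24936
- Fiscal year: 2024
- Funding amount: $235,900
- Project category: Grant-in-Aid for Scientific Research (B)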
Insecure lives and the policy disconnect: How multiple insecurities affect Levelling Up and what joined-up policy can do to help
- Grant number: ES/Z000149/1
- Fiscal year: 2024
- Funding amount: $235,900
- Project category: Research Grant
How does metal binding affect the function of proteins targeted by a devastating pathogen of cereal crops?
- Grant number: 2901648
- Fiscal year: 2024
- Funding amount: $235,900
- Project category: Studentship
Investigating how double-negative T cells affect anti-leukemic and GvHD-inducing activities of conventional T cells
- Grant number: 488039
- Fiscal year: 2023
- Funding amount: $235,900
- Project category: Operating Grants
New Tendencies of French Film Theory: Representation, Body, Affect
- Grant number: 23K00129
- Fiscal year: 2023
- Funding amount: $235,900
- Project category: Grant-in-Aid for Scientific Research (C)
The Protruding Void: Mystical Affect in Samuel Beckett's Prose
- Grant number: 2883985
- Fiscal year: 2023
- Funding amount: $235,900
- Project category: Studentship













