Semi-structured Information Retrieval in Clinical Text for Cohort Identification
Basic information
- Grant number: 10879792
- Principal investigator:
- Amount: $637,500
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2014
- Funding country: United States
- Duration: 2014-09-20 to 2026-04-30
- Project status: Ongoing
- Source:
- Keywords: Acceleration; Adopted; Adoption; Algorithms; Architecture; COVID-19; Clinic; Clinical; Clinical Data; Clinical Research; Clinical Trials; Cohort Studies; Collection; Communities; Complex; Data; Development; Electronic Health Record; Eligibility Determination; Evaluation; Feedback; Formulation; Goals; Healthcare; Healthcare Systems; Human; Informatics; Information Retrieval; Institution; Language; Learning; Medical; Methods; Modeling; Morphologic artifacts; Natural Language Processing; Outcome; Patient Recruitments; Patients; Performance; Pharmaceutical Preparations; Process; Research; Resources; Retrieval; Semantics; Site; Structure; System; Techniques; Text; Time; Training; Translational Research; Visit; Work; clinical data warehouse; clinical research site; clinical trial recruitment; cohort; data modeling; data standards; density; design; experimental study; health care delivery; heterogenous data; indexing; innovation; learning strategy; neural; novel; open source; portability; predictive modeling; query tools; recruit; search engine; structured data; tool
Project Summary
The widespread adoption of Electronic Health Records (EHRs) has enabled the use of clinical data for clinical
research and healthcare delivery. Many institutions have established clinical data warehouses (CDWs) in
conjunction with cohort discovery tools (e.g., i2b2) to support the use of clinical data for clinical research
including retrospective clinical studies as well as feasibility assessment or patient recruitment for clinical trials.
However, a significant portion of relevant patient information is embedded in clinical narratives and natural
language processing (NLP) techniques such as information extraction are critical when using EHR data for
clinical research. Many clinical NLP systems have been developed to extract information from text for various
downstream applications but have had unsatisfactory performance and portability issues. Information retrieval
(IR), a technique used in search engines for storing, retrieving, and ranking documents from a large collection of
text documents based on users’ queries, can provide an alternative approach to leverage clinical narratives for
cohort discovery as it is less dependent on semantics. In order to accomplish this, additional work is needed
since current IR approaches are generally document-based and the formulation of cohort discovery as an IR
task requires the development of innovative IR approaches to handle complex EHR data and cohort criteria with
contextual (e.g., spatial or temporal) constraints.
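
The idea of treating cohort discovery as a retrieval task can be illustrated with a minimal sketch. This is not the project's method: it assumes plain TF-IDF scoring of individual notes, aggregation to the patient level by taking each patient's best-scoring note, a temporal constraint on the note date, and a structured constraint on patient age. The data, the field names, and the function `rank_patients` are all hypothetical.

```python
import math
from collections import Counter, defaultdict
from datetime import date

# Hypothetical toy notes: (patient_id, note_date, note_text). Not real EHR content.
NOTES = [
    ("p1", date(2020, 3, 1), "patient reports chest pain and shortness of breath"),
    ("p1", date(2015, 6, 9), "routine visit no complaints"),
    ("p2", date(2012, 1, 5), "chest pain resolved after medication"),
    ("p3", date(2021, 7, 2), "knee pain after fall"),
]
# Hypothetical structured data, e.g. as it might come from a clinical data warehouse.
AGES = {"p1": 67, "p2": 71, "p3": 34}


def tokenize(text):
    """Lowercase whitespace tokenizer (a deliberate simplification)."""
    return text.lower().split()


def rank_patients(query, notes, ages, since, min_age):
    """Score notes with TF-IDF, then rank patients by their best-scoring note."""
    docs = [tokenize(text) for _, _, text in notes]
    n_docs = len(docs)
    # Document frequency: number of notes containing each term.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    scores = defaultdict(float)
    for (pid, note_date, _), doc in zip(notes, docs):
        # Temporal constraint on the note, structured constraint on the patient.
        if note_date < since or ages[pid] < min_age:
            continue
        tf = Counter(doc)
        score = sum(
            tf[t] * math.log(1 + n_docs / doc_freq[t])
            for t in tokenize(query)
            if t in tf
        )
        # Patient-level aggregation: keep the maximum note score.
        scores[pid] = max(scores[pid], score)
    ranked = [(p, s) for p, s in scores.items() if s > 0]
    return sorted(ranked, key=lambda item: -item[1])


if __name__ == "__main__":
    # Patients aged 60+ with a matching note dated 2018 or later.
    for pid, score in rank_patients("chest pain", NOTES, AGES, date(2018, 1, 1), 60):
        print(pid, round(score, 3))
```

The sketch ranks patients rather than documents, which is the shift from document-based retrieval that the paragraph above describes. A real system would need far richer handling of context (negation, section, temporality) than a date filter.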
Our long-term goal is to develop informatics solutions to accelerate the use of EHR data for clinical research.
The main goal of this proposal is to develop innovative IR methods, which formulate cohort discovery from EHR
data as an IR task, aiming to accelerate the identification of patient cohorts for cohort studies or the recruitment
of eligible patients for clinical trials. In our current R01-supported study (R01LM011934), we introduced novel
language models to enable the reuse of NLP-produced artifacts for IR-based cohort retrieval and developed
parallel resources for IR evaluation at two institutions (Mayo Clinic and OHSU). We hypothesize that, given
complex cohort criteria with contextual constraints, an IR framework with tailored architecture components (e.g.,
indexing, ranking, evaluation, and query processing) for storing and querying EHR data has an advantage over
traditional cohort discovery tools for querying unstructured EHR data as well as an advantage over text-based
search engines for querying both structured and unstructured EHR data. For the proposed renewal, we plan to
i) adopt common data models (CDMs) and deploy the framework at one additional site to assess the
generalizability of methods, ii) extend the IR framework to incorporate contextual information, and iii)
incorporate deep semantic representations into the IR framework. If successful, the proposed project will
advance informatics research on cohort discovery and identification, which impacts many applications based on
EHR data such as learning healthcare systems, predictive modeling, or AI in healthcare.
Project outcomes
Journal articles (34)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Extracting chemical-protein relations using attention-based neural networks.
- DOI: 10.1093/database/bay102
- Publication date: 2018-01-01
- Journal:
- Impact factor: 0
- Authors: Liu S; Shen F; Komandur Elayavilli R; Wang Y; Rastegar-Mojarad M; Chaudhary V; Liu H
- Corresponding author: Liu H
Clinical concept extraction: A methodology review.
- DOI: 10.1016/j.jbi.2020.103526
- Publication date: 2020-09
- Journal:
- Impact factor: 4.5
- Authors: Fu S; Chen D; He H; Liu S; Moon S; Peterson KJ; Shen F; Wang L; Wang Y; Wen A; Zhao Y; Sohn S; Liu H
- Corresponding author: Liu H
A Frequency-based Strategy of Obtaining Sentences from Clinical Data Repository for Crowdsourcing.
- DOI:
- Publication date: 2015
- Journal:
- Impact factor: 0
- Authors: Li, Dingcheng; RastegarMojarad, Majid; Li, Yanpeng; Sohn, Sunghwan; Mehrabi, Saeed; KomandurElayavilli, Ravikumar; Yu, Yue; Liu, Hongfang
- Corresponding author: Liu, Hongfang
Contextual Variation of Clinical Notes induced by EHR Migration.
- DOI:
- Publication date: 2023
- Journal:
- Impact factor: 0
- Authors: Miller, Kurt; Moon, Sungrim; Fu, Sunyang; Liu, Hongfang
- Corresponding author: Liu, Hongfang
The IMPACT framework and implementation for accessible in silico clinical phenotyping in the digital era.
- DOI: 10.1038/s41746-023-00878-9
- Publication date: 2023-07-21
- Journal:
- Impact factor: 15.2
- Authors:
- Corresponding author:
Other grants held by WILLIAM R HERSH
Attracting Talented and Diverse Students to Biomedical Informatics and Data Science Careers Through Short-Term Study at OHSU
- Grant number: 10630618
- Fiscal year: 2022
- Funding amount: $637,500
- Project category:
Attracting Talented and Diverse Students to Biomedical Informatics and Data Science Careers Through Short-Term Study at OHSU
- Grant number: 10701083
- Fiscal year: 2022
- Funding amount: $637,500
- Project category:
Computational Omics and Biomedical Informatics Program (COBIP)
- Grant number: 10319196
- Fiscal year: 2021
- Funding amount: $637,500
- Project category:
Computational Omics and Biomedical Informatics Program (COBIP)
- Grant number: 10490403
- Fiscal year: 2021
- Funding amount: $637,500
- Project category:
Computational Omics and Biomedical Informatics Program (COBIP)
- Grant number: 10676322
- Fiscal year: 2021
- Funding amount: $637,500
- Project category:
Research Training in Biomedical Informatics and Data Science at Oregon Health & Science University
- Grant number: 9524502
- Fiscal year: 2017
- Funding amount: $637,500
- Project category:
Biomedical Informatics Research Training at Oregon Health & Science University
- Grant number: 9369268
- Fiscal year: 2016
- Funding amount: $637,500
- Project category:
Semi-structured Information Retrieval in Clinical Text for Cohort Identification
- Grant number: 10450805
- Fiscal year: 2014
- Funding amount: $637,500
- Project category:
Semi-structured Information Retrieval in Clinical Text for Cohort Identification
- Grant number: 10207950
- Fiscal year: 2014
- Funding amount: $637,500
- Project category:
OHSU Summer Internship in Biomedical Informatics for College Undergraduates
- Grant number: 8281433
- Fiscal year: 2011
- Funding amount: $637,500
- Project category:
Similar overseas grants
How novices write code: discovering best practices and how they can be adopted
- Grant number: 2315783
- Fiscal year: 2023
- Funding amount: $637,500
- Project category: Standard Grant
One or Several Mothers: The Adopted Child as Critical and Clinical Subject
- Grant number: 2719534
- Fiscal year: 2022
- Funding amount: $637,500
- Project category: Studentship
A comparative study of disabled children and their adopted maternal figures in French and English Romantic Literature
- Grant number: 2633211
- Fiscal year: 2020
- Funding amount: $637,500
- Project category: Studentship
A material investigation of the ceramic shards excavated from the Omuro Ninsei kiln site: Production techniques adopted by Nonomura Ninsei.
- Grant number: 20K01113
- Fiscal year: 2020
- Funding amount: $637,500
- Project category: Grant-in-Aid for Scientific Research (C)
A comparative study of disabled children and their adopted maternal figures in French and English Romantic Literature
- Grant number: 2436895
- Fiscal year: 2020
- Funding amount: $637,500
- Project category: Studentship
A comparative study of disabled children and their adopted maternal figures in French and English Romantic Literature
- Grant number: 2633207
- Fiscal year: 2020
- Funding amount: $637,500
- Project category: Studentship
The limits of development: State structural policy, comparing systems adopted in two European mountain regions (1945-1989)
- Grant number: 426559561
- Fiscal year: 2019
- Funding amount: $637,500
- Project category: Research Grants
Securing a Sense of Safety for Adopted Children in Middle Childhood
- Grant number: 2236701
- Fiscal year: 2019
- Funding amount: $637,500
- Project category: Studentship
A Study on Mutual Funds Adopted for Individual Defined Contribution Pension Plans
- Grant number: 19K01745
- Fiscal year: 2019
- Funding amount: $637,500
- Project category: Grant-in-Aid for Scientific Research (C)
Structural and functional analyses of a bacterial protein translocation domain that has adopted diverse pathogenic effector functions within host cells
- Grant number: 415543446
- Fiscal year: 2019
- Funding amount: $637,500
- Project category: Research Fellowships