Transfer Learning for Digital Curation of the EMR Clinical Narrative

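Basic Information

  • Grant number:
    10647748
  • Principal investigator:
  • Amount:
    $376,100
  • Host institution:
  • Host institution country:
    United States
  • Project category:
  • Fiscal year:
    2021
  • Funding country:
    United States
  • Project period:
    2021-08-12 to 2025-05-31
  • Project status:
    Ongoing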

Project Summary

This proposal responds to PAR 18-796 and seeks support for advancing methodologies for a transfer learning framework for the digital curation of the Electronic Medical Record (EMR) clinical narrative. In the current era of increasing importance of Artificial Intelligence (AI) in biomedicine, our proposal tackles a critical AI component: automated annotation of health-related text. Since 2015, the development and application of machine learning (ML) methods has exploded, propelled by the convergence of plentiful digitized unstructured data (text, speech, images), hardware, and the refinement of neural networks, or deep learning. 2018 marked a turning point in Natural Language Processing (NLP), particularly for transfer learning through pre-trained models such as Universal Language Model Fine-tuning for Text Classification (ULMFiT), Allen AI's ELMo, and OpenAI's GPT. In November 2018, Google published Bidirectional Encoder Representations from Transformers (BERT), a transformer-based model pre-trained on massive general text corpora (3.3B words total). The publication reported using BERT representations to build classifiers for 11 NLP tasks that outperformed the state of the art (SOTA) by large margins. The NLP research community jumped at the idea of exploring this new framework but quickly came to the realization that building BERT-style models from scratch is affordable and feasible for only a few. Thus, research proceeded in the direction of using these gigantic models as resources for language representations. Scientific efforts focused on pre-trained models (e.g., BERT) either as a source of high-quality language features or as a starting point for fine-tuning on a specific task, i.e., using a model as a checkpoint and re-training it with much smaller amounts of task-specific data to produce predictions, typically by adding one fully connected layer on top of the representations and training for a few epochs.
This watershed shift in NLP toward transfer learning, which parallels developments in computer vision a few years earlier, coupled with our latest work, brings to the forefront a critical NLP research topic ripe for exploration: a transfer learning framework for the digital curation of the EMR clinical narrative. The proposed work investigates novel scientific methods for extracting detailed information from health-related text, especially the EMR, the major source of phenotype data for patients. Precise phenotype information is needed to advance translational research, particularly to unravel the effects of genetic, epigenetic, and systems changes on responsiveness. This research is in line with the latest developments in neural deep learning approaches and AI in general, and is expected to enhance biomedical research and, through it, the health of the public.
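
The recipe mentioned in the summary (a pre-trained model used as a source of representations, with one fully connected layer trained on top for a few epochs) can be illustrated with a minimal sketch. This is not the project's method. The function `frozen_encoder`, the vocabulary `VOCAB`, and the toy sentences and labels are all hypothetical stand-ins invented for illustration; a real system would obtain features from an actual pre-trained model such as BERT, and full fine-tuning would also update the encoder, which this sketch keeps frozen.

```python
import math

# Hypothetical vocabulary for the stand-in encoder (illustration only).
VOCAB = ["fever", "cough", "pain", "normal", "stable"]


def frozen_encoder(text):
    """Stand-in for a pre-trained encoder: maps text to a fixed-length
    feature vector (word counts). It is never updated during training."""
    tokens = text.lower().split()
    return [float(tokens.count(word)) for word in VOCAB]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def score(features, weights, bias):
    """One fully connected layer followed by a sigmoid."""
    return sigmoid(sum(w * x for w, x in zip(weights, features)) + bias)


def train_head(examples, epochs=5, lr=0.5):
    """Train only the added layer (weights and bias) with plain SGD
    on the logistic loss, for a few passes over the labeled data."""
    weights = [0.0] * len(VOCAB)
    bias = 0.0
    # Features are computed once because the encoder stays frozen.
    encoded = [(frozen_encoder(text), label) for text, label in examples]
    for _ in range(epochs):
        for features, label in encoded:
            error = score(features, weights, bias) - label
            weights = [w - lr * error * x for w, x in zip(weights, features)]
            bias -= lr * error
    return weights, bias


def predict(text, weights, bias):
    """Probability that the text belongs to the positive class."""
    return score(frozen_encoder(text), weights, bias)


# Toy task-specific data (invented): 1 = symptom mentioned, 0 = not.
EXAMPLES = [
    ("patient reports fever and cough", 1),
    ("severe chest pain", 1),
    ("vitals normal and stable", 0),
    ("exam normal", 0),
]

weights, bias = train_head(EXAMPLES)
print(predict("patient reports fever and cough", weights, bias))
print(predict("exam normal", weights, bias))
```

The point of the design is the split of responsibilities: the expensive representation is reused as-is, and only a small number of parameters are fitted to the task-specific labels.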

Project Outputs

Journal articles (5)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Improving the Transferability of Clinical Note Section Classification Models with BERT and Large Language Model Ensembles
  • DOI:
    10.18653/v1/2023.clinicalnlp-1.16
  • Publication date:
    2023
  • Journal:
  • Impact factor:
    0
  • Authors:
    Weipeng Zhou;M. Afshar;Dmitriy Dligach;Yanjun Gao;Timothy Miller
  • Corresponding author:
    Timothy Miller
Large language models to identify social determinants of health in electronic health records.
  • DOI:
    10.1038/s41746-023-00970-0
  • Publication date:
    2024-01-11
  • Journal:
  • Impact factor:
    15.2
  • Authors:
  • Corresponding author:

Other grants by GUERGANA K. SAVOVA

Transfer Learning for Digital Curation of the EMR Clinical Narrative
  • Grant number:
    10092340
  • Fiscal year:
    2021
  • Funding amount:
    $376,100
  • Project category:
Transfer Learning for Digital Curation of the EMR Clinical Narrative
  • Grant number:
    10468604
  • Fiscal year:
    2021
  • Funding amount:
    $376,100
  • Project category:
Cancer Deep Phenotype Extraction from Electronic Medical Records
  • Grant number:
    9538366
  • Fiscal year:
    2014
  • Funding amount:
    $376,100
  • Project category:
Multi-source clinical Question Answering system
  • Grant number:
    7842799
  • Fiscal year:
    2009
  • Funding amount:
    $376,100
  • Project category:
Multi-source clinical Question Answering system
  • Grant number:
    7936991
  • Fiscal year:
    2009
  • Funding amount:
    $376,100
  • Project category:

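Similar NSFC Grants

Research on scalable metagenomic sequence assembly methods based on Apache Spark
  • Grant number:
    61802246
  • Approval year:
    2018
  • Funding amount:
    CNY 260,000
  • Project category:
    Young Scientists Fund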

Similar Overseas Grants

DISES: Restoration of a southwestern cultural keystone species: Integrating socio-ecological systems to predict resilience of traditional acorn harvest by western Apache communities
  • Grant number:
    2206810
  • Fiscal year:
    2023
  • Funding amount:
    $376,100
  • Project category:
    Standard Grant
Developing and evaluating scalable and culturally relevant interventions to improve breast cancer screening among White Mountain Apache women
  • Grant number:
    10223758
  • Fiscal year:
    2021
  • Funding amount:
    $376,100
  • Project category:
Arsenic and other co-metals in the San Carlos Apache drinking water
  • Grant number:
    10302159
  • Fiscal year:
    2021
  • Funding amount:
    $376,100
  • Project category:
NARCH XI White Mountain Apache Tribe (WMAT)- Johns Hopkins University (JHU) Administrative Core
  • Grant number:
    10223755
  • Fiscal year:
    2021
  • Funding amount:
    $376,100
  • Project category:
Community-informed interventions to address the large burden of Staphylococcus aureus infections on the White Mountain Apache Tribal lands
  • Grant number:
    10494072
  • Fiscal year:
    2021
  • Funding amount:
    $376,100
  • Project category:
Community-informed interventions to address the large burden of Staphylococcus aureus infections on the White Mountain Apache Tribal lands
  • Grant number:
    10223757
  • Fiscal year:
    2021
  • Funding amount:
    $376,100
  • Project category:
Arsenic and other co-metals in the San Carlos Apache drinking water
  • Grant number:
    10480930
  • Fiscal year:
    2021
  • Funding amount:
    $376,100
  • Project category:
Arsenic and other co-metals in the San Carlos Apache drinking water
  • Grant number:
    10693969
  • Fiscal year:
    2021
  • Funding amount:
    $376,100
  • Project category:
Developing and evaluating scalable and culturally relevant interventions to improve breast cancer screening among White Mountain Apache women
  • Grant number:
    10494075
  • Fiscal year:
    2021
  • Funding amount:
    $376,100
  • Project category:
NARCH XI White Mountain Apache Tribe (WMAT)- Johns Hopkins University (JHU) Administrative Core
  • Grant number:
    10494066
  • Fiscal year:
    2021
  • Funding amount:
    $376,100
  • Project category: