POET-2: High-performance computing for advanced clinical narrative preprocessing

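Basic information

  • Grant number:
    8182025
  • Principal investigator:
  • Amount:
    $325,200
  • Host institution:
  • Host institution country:
    United States
  • Project type:
  • Fiscal year:
    2011
  • Funding country:
    United States
  • Project period:
    2011-09-01 to 2014-08-31
  • Project status:
    Completed

Project abstract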

DESCRIPTION (provided by applicant): This project focuses on clinical natural language processing (cNLP), a field of emerging importance in informatics. Starting with the Linguistic String Project's Medical Language Processor (New York University) in the 1970s, researchers have made steady gains in cNLP through empirical studies and by building sophisticated high-level cNLP software applications (e.g., Columbia's MedLEE). There are no fewer than four scientific conferences devoted exclusively to biomedical/clinical NLP. The cNLP literature has been growing over the past decade, and this growth will gain momentum as more clinical text repositories are released, such as the MIMIC II and University of Pittsburgh BLU Lab corpora.

However, sustained success in the field of cNLP is hampered by the reality that clinical texts have far more noise than do texts traditionally studied in NLP, such as newswire articles, biomedical abstracts, and discharge summaries. Noise in this context is defined by the parseability characteristics of the language and the linguistic structures that appear in text. Clinical texts come in a striking variety of note types, with the best studied types being discharge summaries, radiology reports, and pathology reports. These note types share an important feature: they are written to communicate care issues between healthcare providers and hence typically are well-composed, well-edited, and often are dictated. But the vast majority of notes in the electronic health record are written primarily to document care issues. They communicate as well, of course, but much less care is used in their creation than with discharge summaries and reports. As a result they are often ungrammatical; are composed of short, telegraphic phrases; are replete with misspellings and shorthand (e.g., abbreviations); are ill-formatted with templates and liberal use of white space; and are embedded with "non-prose" (e.g., strings of laboratory values). All of these sources of noise complicate otherwise straightforward NLP tasks like tokenization, sentence segmentation, and ultimately information extraction itself.

We propose a systematic study of ways to increase the signal-to-noise ratio in clinical narratives to improve cNLP. This work extends our preliminary research (under the POET project) and has the following aims:

  • Develop and implement a suite of parseability improvement tools designed for all clinical note types from multiple healthcare institutions.
  • Evaluate the empirical and the functional success of the parseability improvement tools.
  • Design and implement a HIPAA-compliant UIMA-based pipeline cNLP framework for use in a typical high-performance, multi-processor computing environment.
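
To make the noise sources named in the abstract concrete, the following is a minimal Python sketch of two of them: irregular white space with shorthand, and embedded "non-prose" such as strings of laboratory values. It is not the POET or POET-2 tooling, which the abstract does not describe in detail. The abbreviation table, the 0.4 numeric-token threshold, and the function names are all hypothetical choices made for illustration.

```python
import re
from typing import List, Tuple

# Hypothetical abbreviation table (assumption); a real system would need a
# curated, context-aware lexicon rather than a three-entry dictionary.
ABBREVIATIONS = {"pt": "patient", "hx": "history", "c/o": "complains of"}

# Heuristic (assumption): a token counts as "numeric" if it is an integer
# or a simple decimal number.
NUMBER = re.compile(r"^\d+(\.\d+)?$")


def is_non_prose(line: str, threshold: float = 0.4) -> bool:
    """Flag a line as non-prose when a large share of its tokens are numeric."""
    tokens = line.split()
    if not tokens:
        return False
    numeric = sum(1 for t in tokens if NUMBER.match(t))
    return numeric / len(tokens) >= threshold


def normalize_line(line: str) -> str:
    """Collapse runs of white space and expand known abbreviations."""
    tokens = line.split()  # split() with no arguments drops extra white space
    expanded = [ABBREVIATIONS.get(t.lower(), t) for t in tokens]
    return " ".join(expanded)


def preprocess(note: str) -> List[Tuple[str, str]]:
    """Label each non-empty line as 'prose' or 'non-prose' and clean it up."""
    result = []
    for raw in note.splitlines():
        if not raw.strip():
            continue  # skip blank lines left behind by templates
        if is_non_prose(raw):
            # Leave lab-value strings unexpanded; only tidy the spacing.
            result.append(("non-prose", " ".join(raw.split())))
        else:
            result.append(("prose", normalize_line(raw)))
    return result


if __name__ == "__main__":
    sample = "Pt   c/o   chest pain\n\nNa 140 K 4.1 Cl 102 CO2 24\n"
    for label, text in preprocess(sample):
        print(label, "|", text)
```

Separating non-prose lines before tokenization and sentence segmentation is one plausible way to raise the signal-to-noise ratio, since downstream tools tuned for prose then never see the laboratory strings.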

Project outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other grants by JOHN F. HURDLE

University of Utah Biomedical Informatics Training Grant Supplement
  • Grant number:
    9380137
  • Fiscal year:
    2016
  • Funding amount:
    $325,200
  • Project type:
POET-2: High-performance computing for advanced clinical narrative preprocessing
  • Grant number:
    8326648
  • Fiscal year:
    2011
  • Funding amount:
    $325,200
  • Project type:
POET: Consolidated, Comprehensive Clinical Text Preprocessing
  • Grant number:
    7570254
  • Fiscal year:
    2008
  • Funding amount:
    $325,200
  • Project type:
POET: Consolidated, Comprehensive Clinical Text Preprocessing
  • Grant number:
    7689273
  • Fiscal year:
    2008
  • Funding amount:
    $325,200
  • Project type:
POET: Consolidated, Comprehensive Clinical Text Preprocessing
  • Grant number:
    7847940
  • Fiscal year:
    2008
  • Funding amount:
    $325,200
  • Project type:
Statistical NLP Analysis of Cross-discipline Clinical Text
  • Grant number:
    6836781
  • Fiscal year:
    2004
  • Funding amount:
    $325,200
  • Project type:
Statistical NLP Analysis of Cross-discipline Clinical Text
  • Grant number:
    6944955
  • Fiscal year:
    2004
  • Funding amount:
    $325,200
  • Project type:
University of Utah Biomedical Informatics Training Grant
  • Grant number:
    8681515
  • Fiscal year:
    1997
  • Funding amount:
    $325,200
  • Project type:
University of Utah Biomedical Informatics Training Grant
  • Grant number:
    8261299
  • Fiscal year:
    1997
  • Funding amount:
    $325,200
  • Project type:
University of Utah Biomedical Informatics Training Grant
  • Grant number:
    9086432
  • Fiscal year:
    1997
  • Funding amount:
    $325,200
  • Project type:

Similar overseas grants

Collaborative Research: New to IUSE: EDU DCL: Diversifying Economics Education through Plug and Play Video Modules with Diverse Role Models, Relevant Research, and Active Learning
  • Grant number:
    2315700
  • Fiscal year:
    2024
  • Funding amount:
    $325,200
  • Project type:
    Standard Grant
Building a Calculus Active Learning Environment Equally Beneficial Across a Diverse Student Population
  • Grant number:
    2315747
  • Fiscal year:
    2024
  • Funding amount:
    $325,200
  • Project type:
    Standard Grant
Collaborative Research: New to IUSE: EDU DCL: Diversifying Economics Education through Plug and Play Video Modules with Diverse Role Models, Relevant Research, and Active Learning
  • Grant number:
    2315699
  • Fiscal year:
    2024
  • Funding amount:
    $325,200
  • Project type:
    Standard Grant
CyberCorps Scholarship for Service: Defending Cyberspace through Active Learning
  • Grant number:
    2336586
  • Fiscal year:
    2024
  • Funding amount:
    $325,200
  • Project type:
    Continuing Grant
Project Visibility: Understanding the Experiences of Black Students in Active Learning Mathematics Courses in a Hispanic-Serving Institution Context
  • Grant number:
    2337029
  • Fiscal year:
    2024
  • Funding amount:
    $325,200
  • Project type:
    Standard Grant
Collaborative Research: New to IUSE: EDU DCL: Diversifying Economics Education through Plug and Play Video Modules with Diverse Role Models, Relevant Research, and Active Learning
  • Grant number:
    2315697
  • Fiscal year:
    2024
  • Funding amount:
    $325,200
  • Project type:
    Standard Grant
Collaborative Research: New to IUSE: EDU DCL: Diversifying Economics Education through Plug and Play Video Modules with Diverse Role Models, Relevant Research, and Active Learning
  • Grant number:
    2315696
  • Fiscal year:
    2024
  • Funding amount:
    $325,200
  • Project type:
    Standard Grant
Conference: Active Learning Communities in Biochemistry
  • Grant number:
    2411535
  • Fiscal year:
    2024
  • Funding amount:
    $325,200
  • Project type:
    Standard Grant
Collaborative Research: New to IUSE: EDU DCL: Diversifying Economics Education through Plug and Play Video Modules with Diverse Role Models, Relevant Research, and Active Learning
  • Grant number:
    2315698
  • Fiscal year:
    2024
  • Funding amount:
    $325,200
  • Project type:
    Standard Grant
Collaborative Research: New to IUSE: EDU DCL: Diversifying Economics Education through Plug and Play Video Modules with Diverse Role Models, Relevant Research, and Active Learning
  • Grant number:
    2315701
  • Fiscal year:
    2024
  • Funding amount:
    $325,200
  • Project type:
    Standard Grant