Fine-grained spatial information extraction for radiology reports

Basic information

Project summary

ABSTRACT

Automated biomedical image classification has improved enormously in recent years, particularly in radiology. However, the machine learning (ML) methods behind this performance often require enormous amounts of labeled training data. An increasingly accepted means of acquiring this data is to apply natural language processing (NLP) to the free-text reports associated with an image. For example, take the following brain MRI report snippet:

"There is evidence of left parietal encephalomalacia consistent with known history of prior stroke. Small focal area of hemosiderin deposition along the lateral margins of the left lateral ventricle."

Here, the associated MRI could be labeled for both Encephalomalacia and Hemosiderin. NLP methods that automatically label images in this way have been used to create several large image classification datasets. However, as this example demonstrates, radiology reports often contain far more granular information than prior NLP methods attempted to extract. Both findings in the above example mention their anatomical location, which linguistically is referred to as a spatial grounding, as the location anchors the finding in a spatial reference. Further, the encephalomalacia finding is connected to the related diagnosis of stroke, while the hemosiderin finding provides a morphological description (small focal area).

This granular information is important for image classification, as advanced deep learning methods are capable of utilizing highly granular structured data. This is logical: a lung tumor, for instance, has a slightly different presentation than a liver tumor. If an ML algorithm can leverage the coarse information (the general presentation of a tumor) while also recognizing the subtle granular differences, it can find an optimal balance between specificity and generalizability.

From an imaging perspective, this can also be seen as a middle ground between image-level labels (which are cheap but require significant data for training; a typical dataset has thousands of images or more) and segmentation (which is expensive to obtain but provides better training data; a typical dataset has 40 to 200 images), as the fine-grained spatial labels correspond to natural anatomical segments.

Our fundamental hypothesis in this project is that if granular information can be extracted from radiology reports with NLP, this will improve downstream radiological image classification when training on a sufficiently large dataset. For radiology, the primary form of granularity is spatial (location, shape, orientation, etc.), so this will be the focus of our efforts. We further hypothesize that these NLP techniques will be generalizable to most types of radiology reports. For the purpose of this R21-scale project, however, we will focus on three distinct types of reports with different challenges: chest X-rays (one of the most-studied and largest-scale image classification types), extremity X-rays (which offer different findings than chest X-rays), and brain MRIs (which present a different image modality and the additional complexity of three dimensions).
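As a rough illustration of the kind of structured output described above, the following Python sketch pulls each finding, its spatial grounding, a related diagnosis, and a morphological description out of the example brain MRI snippet. It is a toy dictionary-matching sketch and not the project's method: the term lists, the `FindingRecord` type, and the `extract` function are all hypothetical and cover only this one snippet.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical toy lexicons, chosen only to cover the example snippet.
FINDINGS = ["encephalomalacia", "hemosiderin"]
LOCATIONS = ["left parietal", "left lateral ventricle"]
DIAGNOSES = ["stroke"]
MORPHOLOGY = ["small focal area"]

REPORT = (
    "There is evidence of left parietal encephalomalacia consistent with "
    "known history of prior stroke. Small focal area of hemosiderin "
    "deposition along the lateral margins of the left lateral ventricle."
)


@dataclass
class FindingRecord:
    finding: str
    location: Optional[str]    # spatial grounding of the finding
    diagnosis: Optional[str]   # related diagnosis, if mentioned
    morphology: Optional[str]  # morphological description, if mentioned


def first_match(terms: List[str], sentence: str) -> Optional[str]:
    """Return the first lexicon term that occurs in the sentence, or None."""
    lowered = sentence.lower()
    for term in terms:
        if term in lowered:
            return term
    return None


def extract(report: str) -> List[FindingRecord]:
    """Pair each finding with the terms that co-occur in the same sentence."""
    records = []
    # Naive sentence split: break on whitespace that follows a period.
    for sentence in re.split(r"(?<=\.)\s+", report.strip()):
        finding = first_match(FINDINGS, sentence)
        if finding is None:
            continue
        records.append(
            FindingRecord(
                finding=finding,
                location=first_match(LOCATIONS, sentence),
                diagnosis=first_match(DIAGNOSES, sentence),
                morphology=first_match(MORPHOLOGY, sentence),
            )
        )
    return records


for record in extract(REPORT):
    print(record)
```

An image-level labeler would keep only the `finding` field; the remaining fields are the fine-grained information the abstract argues is being left unused. A real system would need far more than same-sentence string matching, for example to handle negation, multiple findings per sentence, and varied anatomical phrasing.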

Project outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other grants by Kirk Edward Roberts

Fine-grained spatial information extraction for radiology reports
  • Grant number: 10373961
  • Fiscal year: 2020
  • Funding amount: $195,000
  • Project category: (not listed)

Fine-Grained Spatial Information Extraction For Radiology Reports
  • Grant number: 10288320
  • Fiscal year: 2020
  • Funding amount: $195,000
  • Project category: (not listed)

Natural Language Question Understanding for Electronic Health Records
  • Grant number: 9228509
  • Fiscal year: 2016
  • Funding amount: $195,000
  • Project category: (not listed)

Natural Language Question Understanding for Electronic Health Records
  • Grant number: 9479293
  • Fiscal year: 2016
  • Funding amount: $195,000
  • Project category: (not listed)

Similar overseas grants

Rational design of rapidly translatable, highly antigenic and novel recombinant immunogens to address deficiencies of current snakebite treatments
  • Grant number: MR/S03398X/2
  • Fiscal year: 2024
  • Funding amount: $195,000
  • Project category: Fellowship

Re-thinking drug nanocrystals as highly loaded vectors to address key unmet therapeutic challenges
  • Grant number: EP/Y001486/1
  • Fiscal year: 2024
  • Funding amount: $195,000
  • Project category: Research Grant

CAREER: FEAST (Food Ecosystems And circularity for Sustainable Transformation) framework to address Hidden Hunger
  • Grant number: 2338423
  • Fiscal year: 2024
  • Funding amount: $195,000
  • Project category: Continuing Grant

Metrology to address ion suppression in multimodal mass spectrometry imaging with application in oncology
  • Grant number: MR/X03657X/1
  • Fiscal year: 2024
  • Funding amount: $195,000
  • Project category: Fellowship

CRII: SHF: A Novel Address Translation Architecture for Virtualized Clouds
  • Grant number: 2348066
  • Fiscal year: 2024
  • Funding amount: $195,000
  • Project category: Standard Grant

The Abundance Project: Enhancing Cultural & Green Inclusion in Social Prescribing in Southwest London to Address Ethnic Inequalities in Mental Health
  • Grant number: AH/Z505481/1
  • Fiscal year: 2024
  • Funding amount: $195,000
  • Project category: Research Grant

ERAMET - Ecosystem for rapid adoption of modelling and simulation METhods to address regulatory needs in the development of orphan and paediatric medicines
  • Grant number: 10107647
  • Fiscal year: 2024
  • Funding amount: $195,000
  • Project category: EU-Funded

BIORETS: Convergence Research Experiences for Teachers in Synthetic and Systems Biology to Address Challenges in Food, Health, Energy, and Environment
  • Grant number: 2341402
  • Fiscal year: 2024
  • Funding amount: $195,000
  • Project category: Standard Grant

Ecosystem for rapid adoption of modelling and simulation METhods to address regulatory needs in the development of orphan and paediatric medicines
  • Grant number: 10106221
  • Fiscal year: 2024
  • Funding amount: $195,000
  • Project category: EU-Funded

Recite: Building Research by Communities to Address Inequities through Expression
  • Grant number: AH/Z505341/1
  • Fiscal year: 2024
  • Funding amount: $195,000
  • Project category: Research Grant