Regional Oncology Research Center (LLMs for Unstructured Data Extraction)
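Basic information
- Grant number: 10891024
- Principal investigator:
- Amount: $300,000
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2023
- Funding country: United States
- Project period: 2023-06-01 to 2027-05-31
- Project status: Active (not yet concluded)
- Source: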
- Keywords: Address; Artificial Intelligence; Clinical; Clinical Data; Clinical Management; Data Commons; Data Element; Diagnosis; Engineering; Equity; Healthcare; Individual; Information Retrieval; Language; Malignant Neoplasms; Manuals; Modeling; Molecular; Natural Language Processing; Oncology; Online Systems; Output; Pathologic; Pathology Report; Patients; Performance; Play; Process; Prognosis; Reporting; Research; Research Design; Role; Structure; Supervision; Techniques; Text; Training; Work; cancer diagnosis; cohort; compare effectiveness; design; heuristics; improved; innovation; interdisciplinary approach; medical schools; model building; model development; neoplasm registry; precision medicine; unstructured data
Project abstract
Abstract
Artificial intelligence (AI) has the potential to revolutionize healthcare by leveraging clinical data to advance
research and improve oncology practice. Within free-text pathology reports, crucial information about primary
cancer diagnoses and evolving molecular features is embedded. Extracting and interpreting this information
accurately is essential for determining cancer stage, which plays a decisive role in prognosis and guiding clinical
management. Although natural language processing (NLP) techniques have been applied to extract focused
information from pathology reports, there is still a need for adaptable, generalizable, and interpretable strategies
to enhance clinical data abstraction. To address this need, we propose a multidisciplinary approach to develop
an integrative clinical information extraction pipeline. This work aims to improve, assess, and enhance the
abstraction of relevant features of pathological diagnosis from pathology reports by leveraging large language
models.
Our research design involves several steps. First, we will establish a diverse and equitable cohort of patients
from our Cancer Registry and collect free-text pathology reports, along with structured clinical data obtained from
the Johns Hopkins School of Medicine Precision Medicine Analytics Platform (PMAP) Data Commons. Next, we
will employ an information extraction platform to identify pathological features from the reports. This platform will
utilize a suite of models, including BERT-like models, GPT-3.5, and GPT-4, provided by Microsoft, specifically
designed for identifying key cancer attributes. Subsequently, we will evaluate the output of individual models
using the CASPER interactive model development framework, enhancing and refining the results through
heuristics and weak supervision. The augmented model output will be presented through a web-based user
interface, allowing expert curators to provide further input. We will then compare the effectiveness of each
CASPER-augmented model and its derived pathological features against the established gold standard
annotations from the Cancer Registry. Finally, we will enhance the GPT-based language models based on the
assessment, curation, and comparison process, employing prompt engineering techniques to improve
performance and mitigate bias.
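The weak-supervision and gold-standard comparison steps described above can be sketched roughly as follows. This is a minimal illustration only, not the CASPER framework's API or the project's actual pipeline: the labeling functions, the `weak_label` and `accuracy_and_coverage` helpers, the tumor-grade target, and the sample reports are all hypothetical, and the language-model output is assumed to arrive as a precomputed vote.

```python
import re
from collections import Counter

ABSTAIN = None  # a labeling function returns this when it has no opinion


def lf_grade_phrase(text):
    """Hypothetical heuristic: read an explicit 'grade N' mention."""
    match = re.search(r"\bgrade\s*([1-3])\b", text, re.IGNORECASE)
    return match.group(1) if match else ABSTAIN


def lf_differentiation(text):
    """Hypothetical heuristic: map differentiation wording to a grade."""
    lowered = text.lower()
    for phrase, grade in (
        ("poorly differentiated", "3"),
        ("moderately differentiated", "2"),
        ("well differentiated", "1"),
    ):
        if phrase in lowered:
            return grade
    return ABSTAIN


LABELING_FUNCTIONS = [lf_grade_phrase, lf_differentiation]


def weak_label(text, model_vote=ABSTAIN):
    """Combine heuristic votes and one model vote by simple majority.

    Ties and all-abstain cases return ABSTAIN so that an expert curator
    can resolve them instead of the pipeline guessing.
    """
    votes = [lf(text) for lf in LABELING_FUNCTIONS] + [model_vote]
    votes = [v for v in votes if v is not ABSTAIN]
    if not votes:
        return ABSTAIN
    ranked = Counter(votes).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return ABSTAIN
    return ranked[0][0]


def accuracy_and_coverage(predicted, gold):
    """Score non-abstaining predictions against gold-standard labels."""
    labeled = [(p, g) for p, g in zip(predicted, gold) if p is not ABSTAIN]
    coverage = len(labeled) / len(gold) if gold else 0.0
    accuracy = (
        sum(p == g for p, g in labeled) / len(labeled) if labeled else 0.0
    )
    return accuracy, coverage


# Made-up report snippets paired with a made-up model vote for each.
SAMPLE_REPORTS = [
    ("Invasive ductal carcinoma, poorly differentiated, grade 3.", "3"),
    ("Adenocarcinoma, moderately differentiated.", "3"),
    ("Well differentiated tumor; Grade 1.", "2"),
    ("No grade stated.", ABSTAIN),
]
SAMPLE_GOLD = ["3", "2", "1", "2"]  # stand-in for registry annotations

if __name__ == "__main__":
    predictions = [weak_label(text, vote) for text, vote in SAMPLE_REPORTS]
    print(predictions, accuracy_and_coverage(predictions, SAMPLE_GOLD))
```

Abstaining on ties rather than picking a winner is a deliberate choice in this sketch: it routes ambiguous reports to the curation step instead of silently lowering agreement with the registry annotations.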
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants by WILLIAM George NELSON
Regional Oncology Research Center (American Eurasian Cancer Alliance Supplement)
- Grant number: 10923392
- Fiscal year: 2023
- Funding amount: $300,000
- Project category:
MBD2 as a Target for Cancer Prevention and Treatment
- Grant number: 7070564
- Fiscal year: 2005
- Funding amount: $300,000
- Project category:
MBD2 as a Target for Cancer Prevention and Treatment
- Grant number: 7245006
- Fiscal year: 2005
- Funding amount: $300,000
- Project category:
MBD2 as a Target for Cancer Prevention and Treatment
- Grant number: 6899546
- Fiscal year: 2005
- Funding amount: $300,000
- Project category:
AUA/SBUR Res. Conf.-"Inflammation in Prostate Diseases"
- Grant number: 7001935
- Fiscal year: 2005
- Funding amount: $300,000
- Project category:
Similar grants from the National Natural Science Foundation of China (NSFC)
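Establishment and clinical application of artificial intelligence-assisted diagnostic technology for papillary thyroid carcinoma
- Grant number: 32201234
- Approval year: 2022
- Funding amount: 300,000 CNY
- Project category: Young Scientists Fund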
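New clinical metabolomics techniques based on microscale multidimensional LC-MS big data and artificial intelligence
- Grant number: 22274151
- Approval year: 2022
- Funding amount: 540,000 CNY
- Project category: General Program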
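Brain function study of metabolism-based, artificial intelligence-assisted clinical subtyping of obesity
- Grant number: 82170861
- Approval year: 2021
- Funding amount: 800,000 CNY
- Project category: General Program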
Similar overseas grants
A multicenter study in bronchoscopy combining Stimulated Raman Histology with Artificial intelligence for rapid lung cancer detection - The ON-SITE study
- Grant number: 10698382
- Fiscal year: 2023
- Funding amount: $300,000
- Project category:
Artificial Intelligence assisted echocardiography to facilitate optimal image extraction for congenital heart defects diagnosis in Sub-Saharan Africa
- Grant number: 10710681
- Fiscal year: 2023
- Funding amount: $300,000
- Project category:
HEAR-HEARTFELT (Identifying the risk of Hospitalizations or Emergency depARtment visits for patients with HEART Failure in managed long-term care through vErbaL communicaTion)
- Grant number: 10723292
- Fiscal year: 2023
- Funding amount: $300,000
- Project category:
ISimcha Technology Platform for Recruiting a Diverse Population of Older Adults into Clinical Trials
- Grant number: 10761602
- Fiscal year: 2023
- Funding amount: $300,000
- Project category: