Leveraging large language models and knowledge graphs on clinical, pathological, and sequencing data to inform precision cancer therapy

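Basic Information

  • Grant number:
    10888730
  • Principal investigator:
  • Amount:
    $300,000
  • Host institution:
  • Host institution country:
    United States
  • Project category:
  • Fiscal year:
    2007
  • Funding country:
    United States
  • Project period:
    2007-01-20 to 2023-12-31
  • Project status:
    Completed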

Project Summary

Precision medicine and targeted therapy are emerging domains in cancer biology that aim to incorporate individual-level clinical, pathological, and genomic profiles to tailor treatment strategies for cancer patients. Several precision oncology knowledge bases, such as OncoKB and My Cancer Genome, have been established to democratize clinical decision-making by leveraging expert curation of the biological and clinical significance of alterations using publicly available resources. These knowledge bases, while extremely powerful, have limitations, including the scope of annotated genes and alterations and the difficulty of identifying precise therapies for specific combinations of a patient's genomic and clinical profiles. In this proposal, we plan to develop new computational methodologies that will integrate (i) the broad range of implicit cancer knowledge accrued by Large Language Models (LLMs) with (ii) the explicit structured clinical, pathological, and genomic knowledge derived from cancer patients in the Memorial Sloan Kettering Cancer Center's (MSKCC) Clinical Sequencing cohort and the AACR Project GENIE cohort. This will be further reinforced by expert curation, with the aim of predicting combinations of genomic alterations and clinical or pathological profiles that can be matched to a specific cancer therapy. The goal of this research is to develop computational models fundamentally anchored around knowledge graphs and LLMs to bridge the gap between clinical and functional risk factors of cancer and cancer therapeutics, and to inform and enhance personalized therapies.

The first aim of this proposal is to develop a knowledge graph, MSK-CancerKG, based on patient-specific clinical, pathological, and genomic alteration information from more than 100,000 patients in the MSKCC Clinical Sequencing cohort and the AACR Project GENIE cohort. This multi-relational knowledge graph will integrate a wide spectrum of clinical features associated with each patient and features abstracted from pathology reports on the patient-derived tumor samples, along with a comprehensive characterization of genomic alterations and the implicated genes.

The second aim is to fine-tune pre-trained Large Language Models (LLMs) using the structured, detailed, and more reliable cancer-specific knowledge from MSK-CancerKG. We will benchmark these fine-tuned models against four state-of-the-art pre-trained language models, ultimately deriving an optimized combined predictive model, named MSK-CancerLLM. Benchmarking will measure the accuracy of clinical, alteration, and treatment predictions on held-out patient data.

The third aim is to further fine-tune MSK-CancerLLM using clinical practice guidelines and feedback on model output from cancer domain experts. The resulting model will be integrated into an AI chatbot, called MSK-Assistant, which connects the backend model to a frontend chatbot interface. Like the ChatGPT application, this will allow the research community to ask about cancer biology, personalized drug recommendations, and therapeutic interventions.
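As a rough illustration of the first two aims, the sketch below stores patient facts as (head, relation, tail) triples and serializes one patient's edges into a prompt/completion pair of the kind a fine-tuning step might consume. It is not the project's implementation: the relation names, record fields, helper functions, and the two placeholder records are all hypothetical and carry no real patient, gene, or drug information.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Triple:
    """One edge of a multi-relational graph: head --relation--> tail."""
    head: str
    relation: str
    tail: str


def build_graph(records):
    """Turn flat patient records into triples (relation names are assumptions)."""
    triples = []
    for rec in records:
        patient = f"patient:{rec['id']}"
        triples.append(Triple(patient, "has_cancer_type", rec["cancer_type"]))
        for alteration in rec["alterations"]:
            triples.append(Triple(patient, "has_alteration", alteration))
        for drug in rec.get("therapies", []):
            triples.append(Triple(patient, "treated_with", drug))
    return triples


def index_by_head(triples):
    """Group triples by head node so a patient's edges can be looked up directly."""
    index = defaultdict(list)
    for triple in triples:
        index[triple.head].append(triple)
    return index


def to_training_example(patient, index):
    """Serialize a patient's edges; therapy edges become the prediction target."""
    facts = [t for t in index[patient] if t.relation != "treated_with"]
    targets = [t.tail for t in index[patient] if t.relation == "treated_with"]
    prompt = "; ".join(f"{t.relation}={t.tail}" for t in facts)
    return {"prompt": prompt, "completion": ", ".join(targets)}


# Placeholder records, invented purely for illustration.
records = [
    {"id": "P1", "cancer_type": "cancer_type_A",
     "alterations": ["GENE_A p.X100Y"], "therapies": ["drug_alpha"]},
    {"id": "P2", "cancer_type": "cancer_type_B",
     "alterations": ["GENE_B amplification", "GENE_C loss"], "therapies": []},
]

triples = build_graph(records)
index = index_by_head(triples)
example = to_training_example("patient:P1", index)
print(example)
```

Keeping therapy edges out of the prompt mirrors the held-out treatment prediction task described in the second aim; a real pipeline would also need pathology-derived features and a proper train/test split.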

Project Outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other grants by SELWYN M VICKERS

UAB/TU FIRST Administrative Core
  • Grant number:
    10361942
  • Fiscal year:
    2021
  • Funding amount:
    $300,000
  • Project category:
UAB/TU FIRST Administrative Core
  • Grant number:
    10705179
  • Fiscal year:
    2021
  • Funding amount:
    $300,000
  • Project category:
Clinical Management and Trials Core and Advocacy Sub-Core
  • Grant number:
    7962152
  • Fiscal year:
    2010
  • Funding amount:
    $300,000
  • Project category:
Research Training/Education Core
  • Grant number:
    7771813
  • Fiscal year:
    2009
  • Funding amount:
    $300,000
  • Project category:
Surgical Oncology Research Training Program
  • Grant number:
    7914439
  • Fiscal year:
    2008
  • Funding amount:
    $300,000
  • Project category:
Surgical Oncology Research Training Program
  • Grant number:
    8305781
  • Fiscal year:
    2008
  • Funding amount:
    $300,000
  • Project category:
Surgical Oncology Research Training Program
  • Grant number:
    7693807
  • Fiscal year:
    2008
  • Funding amount:
    $300,000
  • Project category:
Surgical Oncology Research Training Program
  • Grant number:
    8131578
  • Fiscal year:
    2008
  • Funding amount:
    $300,000
  • Project category:
Surgical Oncology Research Training Program
  • Grant number:
    7560791
  • Fiscal year:
    2008
  • Funding amount:
    $300,000
  • Project category:
Developmental Funds
  • Grant number:
    10921265
  • Fiscal year:
    2007
  • Funding amount:
    $300,000
  • Project category:

Similar Overseas Grants

Dance4Healing: a feasibility study to reduce health disparity and increase engagement of an intergenerational telehealth program for minority diabetes patients and their care partners.
  • Grant number:
    10604415
  • Fiscal year:
    2022
  • Funding amount:
    $300,000
  • Project category:
Prevalence and temporal dynamics of clonal mutations associated with the risk of hematological cancer in a cohort of clinically healthy Nigerians
  • Grant number:
    10490839
  • Fiscal year:
    2021
  • Funding amount:
    $300,000
  • Project category:
Prevalence and temporal dynamics of clonal mutations associated with the risk of hematological cancer in a cohort of clinically healthy Nigerians
  • Grant number:
    10610478
  • Fiscal year:
    2021
  • Funding amount:
    $300,000
  • Project category:
Prevalence and temporal dynamics of clonal mutations associated with the risk of hematological cancer in a cohort of clinically healthy Nigerians
  • Grant number:
    10292857
  • Fiscal year:
    2021
  • Funding amount:
    $300,000
  • Project category:
Technology Innovations for Supporting Health in Alaska Native People
  • Grant number:
    8911525
  • Fiscal year:
    2014
  • Funding amount:
    $300,000
  • Project category: