Flexible NLP toolkit for automatic curation of outcomes for breast cancer patients
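Basic information
- Grant number: 10675009
- Principal investigator:
- Amount: $540,900
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2022
- Funding country: United States
- Project period: 2022-08-01 to 2027-07-31
- Project status: Ongoing
- Source: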
- Keywords: Adherence; Adopted; Age; Anxiety; Biological; Black Populations; Breast; Breast Cancer Patient; Breast Cancer Risk Factor; Breast Cancer Treatment; California; Cancer Burden; Cancer Patient; Cessation of life; Clinic; Clinical; Collaborations; Communication; Computerized Medical Record; Computers; Data; Data Collection; Databases; Detection; Development; Diagnosis; Diagnostic Neoplasm Staging; Disease; Disparity; Distant Metastasis; Early Diagnosis; Epidemiologist; Ethnic Origin; Evaluation; Fatigue; Funding; Future; Guidelines; Health system; Healthcare; Hispanic; Hispanic Populations; Hospitals; Hour; Human; Immunotherapy; Informatics; Institution; Insurance Coverage; Intervention; Lead; Learning; Malignant Neoplasms; Malignant neoplasm of prostate; Manuals; Medical center; Mental Depression; Methodology; Methods; Mission; Modeling; Morbidity - disease rate; Natural Language Processing; Nausea; Neoplasm Metastasis; Not Hispanic or Latino; Oncologist; Operative Surgical Procedures; Outcome; Paper; Pathology; Pathology Report; Patient-Focused Outcomes; Patients; Pattern; Performance; Population; Population Heterogeneity; Primary Neoplasm; Process; Prognosis; Psyche structure; Quality of life; Race; Radiation; Radiology Specialty; Recording of previous events; Recurrence; Recurrent Malignant Neoplasm; Registries; Reporting; Research Personnel; Role; Running; Scientist; Site; Software Tools; Stage at Diagnosis; Structure; Team Nursing; Technology; Testing; Text; Time; Tumor Subtype; Tumor stage; University Hospitals; Validation; Visit; Woman; anticancer research; artificial intelligence algorithm; biological systems; breast cancer survival; cancer classification; cancer prevention; cancer recurrence; cancer site; cancer survival; cancer therapy; chemotherapy; clinical center; clinical encounter; comorbidity; cost; data curation; data integration; experience; flexibility; follow-up; hormone therapy; informatics tool; low income country; malignant breast neoplasm; molecular subtypes; multidisciplinary; multimodal data; neoplasm registry; open source; outcome prediction; patient population; physical conditioning; population based; radiologist; relational database; socioeconomic disparity; surveillance data; survival disparity; tool; treatment and outcome; treatment planning; trend
Project abstract
Project summary/Abstract
Breast cancer accounts for the largest share of new cancer cases worldwide (11.7%). Although the prognosis of
breast cancer patients is generally favorable due to early detection and comprehensive treatment,
20%–30% of patients still develop distant metastases, and patients with progressive-stage disease
have a median survival of only two years. Breast cancer is widely recognized as a heterogeneous
disease in terms of both the metastatic capacity of the primary tumor and the time to metastatic spread of
disease. High-quality population-based cancer surveillance data are needed to: (1) describe
cancer burden, patterns, and outcomes in order to (2) inform cancer prevention, detection and
control activities; and (3) evaluate interventions on the basis of past and future trends so that
optimal approaches to alleviate burden and suffering from cancer can be adopted. However, the
laborious manual curation process makes population-wide collection of surveillance data
challenging. Studies have shown that a large percentage of total registry cost is devoted
to labor for data curation, even in low-income countries. In this project, our mission is to build
a flexible NLP toolset that can be executed locally at the institution level and will curate the clinical
and patient-centered outcomes of breast cancer patients by parsing longitudinally acquired clinic
notes, radiology reports, and pathology reports. To test the generalizability of the tools and to
initiate their deployment for data collection, we will partner with both the Georgia SEER registry and the California
state cancer registry and will curate outcome data for breast cancer patients from the past 10 years
at two US institutions representing diverse patient populations: Emory University
Hospital (Georgia) and Stanford Medical Center (California). We will leverage the previously
developed tools and technologies and extend them to automatically curate clinical and patient-centered
outcome data (recurrence date and site of recurrence, treatment administered, and mental
and physical outcomes) from clinic notes and convert them into a structured, queryable
format. The NLP tools will be dockerized and run locally at the hospital registry level for automated
outcome curation. Finally, the NLP-extracted outcomes will be shared with the state cancer registries
for evaluation. From a methodological perspective, the framework and the open-source software
tools developed can be employed in cancer research beyond the scope of our project to curate
outcomes regardless of the problem domain.
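The abstract does not describe how the toolkit is implemented. As a purely illustrative sketch of what "converting clinic notes into a structured, queryable format" could look like, the Python example below uses simple keyword matching and an in-memory SQLite table. Everything in it is an assumption made for illustration: the sample notes are invented, and the vocabularies, the `extract` and `build_database` helpers, and the `outcomes` table schema are hypothetical. It is not the project's method, and the keyword matching does not handle negation such as "no recurrence".

```python
import re
import sqlite3

# Invented example notes (patient id, note date, free text); not real patient data.
NOTES = [
    ("P001", "2021-03-14",
     "Biopsy confirms recurrence in the liver. Started chemotherapy. Patient reports fatigue."),
    ("P002", "2020-11-02",
     "No evidence of disease. Continues hormone therapy. Reports mild nausea."),
]

# Toy vocabularies, assumed for illustration only.
SITES = ["liver", "bone", "lung", "brain"]
TREATMENTS = ["chemotherapy", "hormone therapy", "radiation", "immunotherapy"]
SYMPTOMS = ["fatigue", "nausea", "anxiety", "depression"]


def find_terms(text, vocabulary):
    """Return the vocabulary terms that occur as whole words in the text."""
    lowered = text.lower()
    return [t for t in vocabulary if re.search(r"\b" + re.escape(t) + r"\b", lowered)]


def extract(patient_id, note_date, text):
    """Turn one free-text note into a flat record (naive: no negation handling)."""
    recurrence = re.search(r"\brecurrence\b", text, re.IGNORECASE) is not None
    sites = find_terms(text, SITES) if recurrence else []
    return {
        "patient_id": patient_id,
        "note_date": note_date,
        "recurrence": int(recurrence),
        "recurrence_site": ",".join(sites),
        "treatments": ",".join(find_terms(text, TREATMENTS)),
        "symptoms": ",".join(find_terms(text, SYMPTOMS)),
    }


def build_database(notes):
    """Load the extracted records into an in-memory relational table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE outcomes (patient_id TEXT, note_date TEXT, recurrence INTEGER, "
        "recurrence_site TEXT, treatments TEXT, symptoms TEXT)"
    )
    conn.executemany(
        "INSERT INTO outcomes VALUES (:patient_id, :note_date, :recurrence, "
        ":recurrence_site, :treatments, :symptoms)",
        [extract(*note) for note in notes],
    )
    return conn


conn = build_database(NOTES)
# Once structured, the outcomes can be queried like any relational data.
recurred = conn.execute(
    "SELECT patient_id, note_date, recurrence_site FROM outcomes WHERE recurrence = 1"
).fetchall()
print(recurred)
```

A real system would need clinical NLP models, negation and temporality handling, and validation against registry data; the sketch only shows the notes-to-table step in miniature.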
Project outcomes
Journal articles (5)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
A large language model-based generative natural language processing framework fine-tuned on clinical notes accurately extracts headache frequency from electronic health records.
- DOI: 10.1111/head.14702
- Year published: 2024
- Journal:
- Impact factor: 5
- Authors: Chiang, Chia-Chun; Luo, Man; Dumkrieger, Gina; Trivedi, Shubham; Chen, Yi-Chieh; Chao, Chieh-Ju; Schwedt, Todd J; Sarker, Abeed; Banerjee, Imon
- Corresponding author: Banerjee, Imon
A Large Language Model-Based Generative Natural Language Processing Framework Finetuned on Clinical Notes Accurately Extracts Headache Frequency from Electronic Health Records.
- DOI: 10.1101/2023.10.02.23296403
- Year published: 2023
- Journal:
- Impact factor: 0
- Authors: Chiang, Chia-Chun; Luo, Man; Dumkrieger, Gina; Trivedi, Shubham; Chen, Yi-Chieh; Chao, Chieh-Ju; Schwedt, Todd J; Sarker, Abeed; Banerjee, Imon
- Corresponding author: Banerjee, Imon
Graph convolutional network-based fusion model to predict risk of hospital acquired infections.
- DOI: 10.1093/jamia/ocad045
- Year published: 2023
- Journal:
- Impact factor: 0
- Authors: Tariq, Amara; Lancaster, Lin; Elugunti, Praneetha; Siebeneck, Eric; Noe, Katherine; Borah, Bijan; Moriarty, James; Banerjee, Imon; Patel, Bhavik N
- Corresponding author: Patel, Bhavik N
Other grants by Imon Banerjee
SCH: Artificial Intelligence enabled multi-modal sensor platform for at-home health monitoring of patients
- Grant number: 10816667
- Fiscal year: 2023
- Amount: $540,900
- Project category:
Flexible NLP toolkit for automatic curation of outcomes for breast cancer patients
- Grant number: 10420233
- Fiscal year: 2022
- Amount: $540,900
- Project category:
TCIA Sustainment and Scalability - Platforms for Quantitative Imaging Informatics in Precision Medicine
- Grant number: 10227670
- Fiscal year: 2017
- Amount: $540,900
- Project category:
TCIA Sustainment and Scalability - Platforms for Quantitative Imaging Informatics in Precision Medicine
- Grant number: 10013134
- Fiscal year: 2017
- Amount: $540,900
- Project category:
TCIA Sustainment and Scalability - Platforms for Quantitative Imaging Informatics in Precision Medicine
- Grant number: 9753190
- Fiscal year: 2017
- Amount: $540,900
- Project category:
Similar overseas grants
How novices write code: discovering best practices and how they can be adopted
- Grant number: 2315783
- Fiscal year: 2023
- Amount: $540,900
- Project category: Standard Grant
One or Several Mothers: The Adopted Child as Critical and Clinical Subject
- Grant number: 2719534
- Fiscal year: 2022
- Amount: $540,900
- Project category: Studentship
A comparative study of disabled children and their adopted maternal figures in French and English Romantic Literature
- Grant number: 2633211
- Fiscal year: 2020
- Amount: $540,900
- Project category: Studentship
A material investigation of the ceramic shards excavated from the Omuro Ninsei kiln site: Production techniques adopted by Nonomura Ninsei.
- Grant number: 20K01113
- Fiscal year: 2020
- Amount: $540,900
- Project category: Grant-in-Aid for Scientific Research (C)
A comparative study of disabled children and their adopted maternal figures in French and English Romantic Literature
- Grant number: 2436895
- Fiscal year: 2020
- Amount: $540,900
- Project category: Studentship
A comparative study of disabled children and their adopted maternal figures in French and English Romantic Literature
- Grant number: 2633207
- Fiscal year: 2020
- Amount: $540,900
- Project category: Studentship
The limits of development: State structural policy, comparing systems adopted in two European mountain regions (1945-1989)
- Grant number: 426559561
- Fiscal year: 2019
- Amount: $540,900
- Project category: Research Grants
Securing a Sense of Safety for Adopted Children in Middle Childhood
- Grant number: 2236701
- Fiscal year: 2019
- Amount: $540,900
- Project category: Studentship
A Study on Mutual Funds Adopted for Individual Defined Contribution Pension Plans
- Grant number: 19K01745
- Fiscal year: 2019
- Amount: $540,900
- Project category: Grant-in-Aid for Scientific Research (C)
Structural and functional analyses of a bacterial protein translocation domain that has adopted diverse pathogenic effector functions within host cells
- Grant number: 415543446
- Fiscal year: 2019
- Amount: $540,900
- Project category: Research Fellowships