MACE2K - Molecular And Clinical Extraction: A Natural Language Processing Tool for Personalized Medicine
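Basic information
- Grant number: 9146381
- Principal investigator:
- Amount: $457,100
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2015
- Funding country: United States
- Project period: 2015-09-22 to 2018-05-31
- Project status: Completed
- Source: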
- Keywords: Address; Algorithms; Big Data; Big Data to Knowledge; Biological; Biomedical Research; Cancer Center; Clinical; Clinical Decision Support Systems; Clinical Trials; Computer software; Computing Methodologies; Crowding; Data; Data Aggregation; Databases; Dictionary; Disease; Exclusion Criteria; Gene Expression; Gene Mutation; Genome; Goals; Gold; Health; Informatics; Information Retrieval; Investments; Letters; Literature; Malignant Neoplasms; Maps; Meta-Analysis; Methods; Molecular; Molecular Profiling; Molecular Target; Mutation; National Cancer Institute; Natural Language Processing; Oncologist; Online Systems; Outcome; Patients; Peer Review; Pharmaceutical Preparations; Pharmacotherapy; Phosphorylation; Process; PubMed; Publications; Recording of previous events; Reporting; Research; Research Design; Research Personnel; Software Validation; Source; Structure; System; Systems Biology; Testing; Therapeutic; Time; United States National Institutes of Health; abstracting; base; crowdsourcing; data to knowledge; data wrangling; design; improved; inclusion criteria; innovation; interest; knowledge base; meetings; novel; novel strategies; personalized cancer care; personalized cancer therapy; personalized medicine; programs; protein expression; search engine; software development; symposium; targeted treatment; tool; user friendly software; verification and validation
Project abstract
DESCRIPTION (provided by applicant): The velocity, variety, volume and veracity of data from relevant information sources make it extremely challenging for oncologists to collect and review pertinent data that can support routine personalized treatment for their patients. There is an urgent need to develop data wrangling approaches, including Natural Language Processing and information retrieval methods, to extract and curate personalized-therapy-related publications and clinical trials. Once curated, the structured data can be used by biomedical researchers to generate novel scientific hypotheses, design new studies, obtain a better understanding of biological mechanisms of disease, perform meta-analyses, and create clinical decision support systems. There is an urgent need to develop improved search interfaces specific to the field of personalized therapy, including ways for end users to display, rank, and save results. While several database and web-based keyword search engine algorithms exist, there is a lack of tools that meet the unique challenges of personalized medicine. There is also an urgent need to develop software that allows information extracted and ranked through computational methods to be verified and validated using subject matter expertise, in order to improve the gold standard corpus that can be used for biomedical research into personalized therapies.
To address these issues, we will build an innovative software stack (MACE2K) to adapt and extend widely tested BioCreative natural language processing (NLP) tools to automatically retrieve and pre-process targeted therapy information from clinicaltrials.gov, PubMed abstracts, open access articles, and conference proceedings. We will build an entity extraction cartridge to accurately parse gene mutations, translocations, gene expression, protein expression, and protein phosphorylation. A marker disambiguation cartridge will be built to assess trial inclusion or exclusion criteria and to determine marker-related primary endpoints. We will include a ranking cartridge that uses the disambiguated information on markers, drugs and trials to provide a rigorous scoring of trials and studies according to their relevance for personalized medicine. A novel gamification cartridge will be built to allow subject matter experts to verify and validate the information corpus. Our research leverages the National Cancer Institute's investments in several programs (many of which we are involved in), including the NCI drug dictionary, the National Cancer Informatics Program (NCIP), the I-SPY trials, and the Center for Cancer Systems Biology (CCSB), to efficiently accomplish our aims.
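As a rough illustration of the kind of work an entity extraction and ranking step might involve, the sketch below pulls single-letter amino acid substitutions (such as V600E) out of free text and ranks documents by how many distinct mutations they mention. It is a toy example and not the MACE2K implementation, which the abstract does not describe: the regular expression, the `Mutation` type, and the functions `extract_mutations` and `rank_by_marker_count` are all hypothetical names introduced here, and the scoring rule is an assumption chosen for simplicity.

```python
import re
from typing import NamedTuple

# Assumption: only single-letter protein substitutions (e.g. V600E, T790M)
# are handled. Real marker mentions are far more varied.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
MUTATION_RE = re.compile(rf"\b([{AMINO_ACIDS}])(\d+)([{AMINO_ACIDS}])\b")


class Mutation(NamedTuple):
    """Hypothetical record for one substitution mention."""
    ref: str       # reference amino acid
    position: int  # residue position
    alt: str       # substituted amino acid


def extract_mutations(text: str) -> list[Mutation]:
    """Return substitution mentions in order of appearance."""
    return [
        Mutation(m.group(1), int(m.group(2)), m.group(3))
        for m in MUTATION_RE.finditer(text)
    ]


def rank_by_marker_count(docs: dict[str, str]) -> list[tuple[str, int]]:
    """Toy relevance score: number of distinct mutations per document.

    Ties are broken by document id so the ordering is deterministic.
    """
    scores = {
        doc_id: len(set(extract_mutations(text)))
        for doc_id, text in docs.items()
    }
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
```

A real system would also need to link each mutation to its gene, handle other notations and marker types (translocations, expression, phosphorylation), and distinguish inclusion from exclusion criteria, none of which this sketch attempts.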
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants by Subha Madhavan
MACE2K - Molecular And Clinical Extraction: A Natural Language Processing Tool for Personalized Medicine
- Grant number: 8874546
- Fiscal year: 2015
- Funding amount: $457,100
- Project category:
MACE2K - Molecular And Clinical Extraction: A Natural Language Processing Tool for Personalized Medicine
- Grant number: 9282279
- Fiscal year: 2015
- Funding amount: $457,100
- Project category:
Informatics Support Center for the Cancer Family Registries
- Grant number: 8537027
- Fiscal year: 2009
- Funding amount: $457,100
- Project category:
Similar overseas grants
Big Data Analytics: Optimization Models and Algorithms with Applications in Smart Food Supply Chains and Networks
- Grant number: RGPIN-2020-06792
- Fiscal year: 2022
- Funding amount: $457,100
- Project category: Discovery Grants Program - Individual
Large Systems and Big Data: Models, Tools, Analysis, and Algorithms
- Grant number: RGPIN-2020-04075
- Fiscal year: 2022
- Funding amount: $457,100
- Project category: Discovery Grants Program - Individual
Algorithms and Tools for Big Data Analysis and Automated Real Time Optimal or Near Optimal Decision Making for Industrial Systems
- Grant number: RGPIN-2017-05785
- Fiscal year: 2022
- Funding amount: $457,100
- Project category: Discovery Grants Program - Individual
Novel Learning-Based Visual Algorithms and Fusion Methods for High-Dimensional/Multi-Modality Big Data
- Grant number: RGPIN-2022-02948
- Fiscal year: 2022
- Funding amount: $457,100
- Project category: Discovery Grants Program - Individual
(Re)designing Clustering Algorithms for Big Data
- Grant number: RGPIN-2017-05617
- Fiscal year: 2022
- Funding amount: $457,100
- Project category: Discovery Grants Program - Individual
NCS-FO: Connectome mapping algorithms with application to community services for big data neuroscience
- Grant number: 2203524
- Fiscal year: 2021
- Funding amount: $457,100
- Project category: Standard Grant
Big Data Analytics: Optimization Models and Algorithms with Applications in Smart Food Supply Chains and Networks
- Grant number: RGPIN-2020-06792
- Fiscal year: 2021
- Funding amount: $457,100
- Project category: Discovery Grants Program - Individual
Exploring Novel Mathematical Models and Efficient Algorithms to Discover Periodic Spatial Patterns in Irregular Spatiotemporal Big Data
- Grant number: 21K12034
- Fiscal year: 2021
- Funding amount: $457,100
- Project category: Grant-in-Aid for Scientific Research (C)
A comprehensive study of big data clustering algorithms
- Grant number: 571110-2018
- Fiscal year: 2021
- Funding amount: $457,100
- Project category: Alexander Graham Bell Canada Graduate Scholarships - Master's
(Re)designing Clustering Algorithms for Big Data
- Grant number: RGPIN-2017-05617
- Fiscal year: 2021
- Funding amount: $457,100
- Project category: Discovery Grants Program - Individual