Dynamic Mining and Contextualization of the Scientific Literature. This project creates interactive science articles and collects data metrics, accelerating scientific discovery and reproducibility.
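Basic information
- Grant number: 9345927
- Principal investigator:
- Amount: $211,700
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2017
- Funding country: United States
- Project period: 2017-04-01 to 2018-08-22
- Project status: Completed
- Source: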
- Keywords: Address; Alleles; Americas; Biological; Biological Sciences; Biological databases; Biology; Biotechnology; Code; Collaborations; Communication; Communities; Computer software; Data; Databases; Equipment and Supplies; Fission Yeast; Funding; Genes; Genetic; Genetic Enhancement; Grant; Hour; Human; Income; Individual; Intuition; Journals; Link; Literature; Marketing; Measures; Metadata; Methods; Mining; Modeling; Modernization; Monitor; National Human Genome Research Institute; Nomenclature; Outcome; Phase; Process; Production; Publications; Publishing; Readability; Reader; Reagent; Reproducibility; Research; Research Personnel; Resolution; Resources; Saccharomyces; Science; Services; Societies; Source; Standardization; Testing; Time; TimeLine; Vendor; Vocabulary; Work; Writing; Zebrafish; base; crosslink; data mining; database query; design; experience; flexibility; genome database; improved; insight; instrumentation; journal article; lexical; model organisms databases; mouse genome; novel; rat genome; research and development; statistics; success; text searching; tool
Project abstract
The proposed Dynamic Mining and Contextualization of the Scientific Literature (DMCSL) provides an open lane of communication between authors, science journals, readers, and databases. The outcome of this communication portal will be a database of mineable metadata for researchers, reagent suppliers, and biotech companies. Data will be available to companies through individualized subscription models. The pipeline identifies biological entities (e.g., genes and alleles) and embeds hyperlinks from these entities to NHGRI-funded curated Model Organism Databases (MODs). DMCSL is an enhancement of a markup pipeline that has been in operation since 2009 and has linked biological entities in over 850 research articles in GENETICS and G3, published by the Genetics Society of America (GSA), to pages in the MODs WormBase, FlyBase, and the Saccharomyces Genome Database. This proposal seeks funding to expand the scope of the GSA markup pipeline in all aspects: the biological entities linked; the authoritative databases linked to (Rat Genome Database, Mouse Genome Informatics, Zebrafish Model Organism Database, and the fission yeast genome database); and the journals linked from. This expansion will also include collecting information on the supplies and equipment described in the Materials and Methods sections of articles, along with supplier information. The DMCSL will collect and store link information along with author and journal metadata and link access statistics. By doing so, the DMCSL will provide valuable metrics to all stakeholders, including biotech companies and life science vendors, as well as a comprehensive and queryable view of biology that is not currently available. In Phase I, we will develop code flexible enough to scale the pipeline so that a single article can be linked to more lexica and more databases within the strict turnaround time set by the publisher's production process.
We will also test the software on publications from other journals and develop tools to query and mine the relationships identified through the data extraction process. We will develop basic APIs to serve as a core API database resource, a linking API to store created links and monitor link activity, and a portal, built with modern API management, for key-based access to other API data. Once the stability and flexibility of the software have been proven under current parameters, in Phase II we will collaborate with a wider range of stakeholders (more journals, more databases, including human biomedical databases, and more companies) to develop experience-based APIs for each stakeholder group. These APIs will be designed around how each group interacts with the basic API developed in Phase I and will be used to develop subscription-based access for commercial companies; access for academic stakeholders and collaborating journals will remain free.
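As a rough illustration of the markup step described in the abstract (recognizing entity names from a lexicon, embedding hyperlinks, and keeping a link record with an access counter), the following Python sketch may help. It is not the GSA or DMCSL implementation: the lexicon contents, the `example.org` URLs, and the names `LEXICON`, `LinkRecord`, `mark_up`, and `record_click` are all assumptions made for this example, and a production pipeline would need real disambiguation (many gene names are also ordinary words).

```python
import re
from dataclasses import dataclass

# Hypothetical lexicon: entity name -> (database label, page URL).
# The URLs are placeholders, not real MOD endpoints.
LEXICON = {
    "unc-22": ("WormBase", "https://example.org/wormbase/unc-22"),
    "dpp": ("FlyBase", "https://example.org/flybase/dpp"),
    "CDC28": ("SGD", "https://example.org/sgd/CDC28"),
}


@dataclass
class LinkRecord:
    """One created link plus a simple access counter (assumed schema)."""
    entity: str
    database: str
    url: str
    article_id: str
    clicks: int = 0


def mark_up(text, article_id, lexicon=LEXICON):
    """Wrap each lexicon term found in text in a hyperlink.

    Returns the marked-up text and one LinkRecord per distinct entity.
    """
    records = {}
    # Longest names first, so a longer name wins over one of its prefixes.
    names = sorted(lexicon, key=len, reverse=True)
    # Match whole tokens only: no word character or hyphen on either side.
    pattern = re.compile(
        r"(?<![\w-])(" + "|".join(map(re.escape, names)) + r")(?![\w-])"
    )

    def replace(match):
        name = match.group(1)
        database, url = lexicon[name]
        records.setdefault(name, LinkRecord(name, database, url, article_id))
        return f'<a href="{url}">{name}</a>'

    return pattern.sub(replace, text), list(records.values())


def record_click(record):
    """Increment and return the access count for a link."""
    record.clicks += 1
    return record.clicks


if __name__ == "__main__":
    html, links = mark_up("Mutations in unc-22 and CDC28 were tested.", "article-1")
    print(html)
    for link in links:
        print(link)
```

Keeping the link records separate from the marked-up text is one way to let access statistics be queried without re-parsing articles; whether DMCSL stores them this way is not stated in the abstract.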
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Karen J Yook
Similar overseas grants
Linkage of HIV amino acid variants to protective host alleles at CHD1L and HLA class I loci in an African population
- Grant number: 502556
- Fiscal year: 2024
- Funding amount: $211,700
- Project category:
Olfactory Epithelium Responses to Human APOE Alleles
- Grant number: 10659303
- Fiscal year: 2023
- Funding amount: $211,700
- Project category:
Deeply analyzing MHC class I-restricted peptide presentation mechanistics across alleles, pathways, and disease coupled with TCR discovery/characterization
- Grant number: 10674405
- Fiscal year: 2023
- Funding amount: $211,700
- Project category:
An off-the-shelf tumor cell vaccine with HLA-matching alleles for the personalized treatment of advanced solid tumors
- Grant number: 10758772
- Fiscal year: 2023
- Funding amount: $211,700
- Project category:
Identifying genetic variants that modify the effect size of ApoE alleles on late-onset Alzheimer's disease risk
- Grant number: 10676499
- Fiscal year: 2023
- Funding amount: $211,700
- Project category:
New statistical approaches to mapping the functional impact of HLA alleles in multimodal complex disease datasets
- Grant number: 2748611
- Fiscal year: 2022
- Funding amount: $211,700
- Project category: Studentship
Genome and epigenome editing of induced pluripotent stem cells for investigating osteoarthritis risk alleles
- Grant number: 10532032
- Fiscal year: 2022
- Funding amount: $211,700
- Project category:
Recessive lethal alleles linked to seed abortion and their effect on fruit development in blueberries
- Grant number: 22K05630
- Fiscal year: 2022
- Funding amount: $211,700
- Project category: Grant-in-Aid for Scientific Research (C)
Investigating the Effect of APOE Alleles on Neuro-Immunity of Human Brain Borders in Normal Aging and Alzheimer's Disease Using Single-Cell Multi-Omics and In Vitro Organoids
- Grant number: 10525070
- Fiscal year: 2022
- Funding amount: $211,700
- Project category:
Leveraging the Evolutionary History to Improve Identification of Trait-Associated Alleles and Risk Stratification Models in Native Hawaiians
- Grant number: 10689017
- Fiscal year: 2022
- Funding amount: $211,700
- Project category: