SGER: Mining Metadata for Metagenomics

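Basic information

  • Grant number:
    0746650
  • Principal investigator:
  • Amount:
    --
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Standard Grant
  • Fiscal year:
    2007
  • Funding country:
    United States
  • Project period:
    2007-09-01 to 2009-02-28
  • Project status:
    Completed

Project abstract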

Both the metagenomics and the biodiversity communities face a major challenge: capturing detailed data on the environmental and the ecological context from which samples are drawn. Without these 'metadata' (data about the primary, biological data), the value of the primary data is greatly diminished. To capture these different types of metadata, biologists/ecologists need tools that allow them to survey the literature in the relevant areas, identify the key concepts and vocabulary in the articles, and extract data and metadata into an appropriate representation for further processing, querying and exchange. The goal of this research is to create a proof-of-concept demonstration of interactive tools for the capture and curation of metadata, working in close collaboration with the metagenomics and biodiversity communities to understand their requirements. The text mining community has demonstrated significant progress: results from the recent BioCreative workshop (April 2007) show that text mining tools can identify mentions of key biological entities in running text and map these mentions to associated unique identifiers (e.g., Entrez Gene identifiers) at 80-90% accuracy. Much of this progress has been driven by a close association between curators of mature biological databases and the text mining community. Groups such as GOA, MINT, IntAct, Flybase, MGI, SGD, Wormbase have expert curators, a documented curation process, and they produce large quantities of expert curated data. These resources have provided good testbeds for evaluating new tools against human curated "gold standard" test sets. The question now is how to apply this progress to create tools to support curators in an interactive process for extraction and mapping of critical information, such as gene/protein identifiers, geospatial information, or habitat information. 
These tools will support emerging communities, such as the metagenomics and biodiversity communities; they can also be put into the hands of ontology builders, to speed the design of ontologies, and of authors, to capture metadata at the source rather than relying on post-publication expert curation. This work will have impact in four distinct areas: first, it will support the metagenomics and biodiversity communities by speeding the capture of metadata; second, it will pose new challenges to the text mining community, namely integrating tools into an interactive pipeline that supports real curation activities; third, such interactive tools can have a major impact on the ability to create new ontologies and controlled vocabularies through exploration of concepts in the literature; and finally, such tools can provide a prototype for author-driven annotation, supporting the capture and encoding of metadata at the source, as a kind of "automated spell-checker" for annotations.

Project outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other publications by Lynette Hirschman

The 15th Genomic Standards Consortium meeting
  • DOI:
    10.4056/sigs.3457
  • Publication date:
    2013-01-01
  • Journal:
  • Impact factor:
    5.400
  • Authors:
    Lynn Schriml;Ilene Mizrachi;Peter Sterk;Dawn Field;Lynette Hirschman;Tatiana Tatusova;Susanna Sansone;Jack Gilbert;David Schindel;Neil Davies;Chris Meyer;Folker Meyer;George Garrity;Lita Proctor;M. H. Medema;Yemin Lan;Anna Klindworth;Frank Oliver Glöckner;Tonia Korves;Antonia Gonzalez;Peter Dwayndt;Markus Göker;Anjette Johnston;Evangelos Pafilis;Susanne Schneider;K. Baker;Cynthia Parr;G. Sutton;H. H. Creasy;Nikos Kyrpides;K. Eric Wommack;Patricia L. Whetzel;Daniel Nasko;Hilmar Lapp;Takamoto Fujisawa;Adam M. Phillippy;Renzo Kottman;Judith A. Blake;Junhua Li;Elizabeth M. Glass;Petra ten Hoopen;Rob Knight;Susan Holmes;Curtis Huttenhower;Steven L. Salzberg;Bing Ma;Owen White
  • Corresponding author:
    Owen White
Meeting Report from the Genomic Standards Consortium (GSC) Workshop 8
  • DOI:
    10.4056/sigs.1022942
  • Publication date:
    2010-08-20
  • Journal:
  • Impact factor:
    5.400
  • Authors:
    Nikos Kyrpides;Dawn Field;Peter Sterk;Renzo Kottmann;Frank Oliver Glöckner;Lynette Hirschman;George M. Garrity;Guy Cochrane;John Wooley
  • Corresponding author:
    John Wooley

Other grants held by Lynette Hirschman

SGER: Utility and Usability of Text Mining for Biological Curation
  • Grant number:
    0844419
  • Fiscal year:
    2008
  • Funding amount:
    --
  • Project type:
    Standard Grant
Critical Assessment of Information Extraction in Biology
  • Grant number:
    0640153
  • Fiscal year:
    2006
  • Funding amount:
    --
  • Project type:
    Standard Grant
Evaluating Bioinformatics Technology
  • Grant number:
    0326404
  • Fiscal year:
    2003
  • Funding amount:
    --
  • Project type:
    Standard Grant
BioLINK Workshop: Biological Language, Information and Knowledge
  • Grant number:
    0228162
  • Fiscal year:
    2002
  • Funding amount:
    --
  • Project type:
    Standard Grant
Industry/University Cooperative Research Program Acquisition and Use of Semantic Information for Natural Language
  • Grant number:
    8502205
  • Fiscal year:
    1985
  • Funding amount:
    --
  • Project type:
    Continuing Grant
Industry/University Cooperative Research Activity: Robust Natural Language Parsing Using Graded Acceptability (Computer Research)
  • Grant number:
    8202397
  • Fiscal year:
    1982
  • Funding amount:
    --
  • Project type:
    Continuing Grant

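Similar National Natural Science Foundation of China (NSFC) grants

Genome mining-based study of secondary metabolites that inhibit Staphylococcus epidermidis biofilm formation
  • Grant number:
    21242003
  • Year approved:
    2012
  • Funding amount:
    CNY 100,000
  • Project type:
    Special Fund Project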

Similar overseas grants

NeTS: Small: NSF-DST: Modernizing Underground Mining Operations with Millimeter-Wave Imaging and Networking
  • Grant number:
    2342833
  • Fiscal year:
    2024
  • Funding amount:
    --
  • Project type:
    Standard Grant
Development of social attention indicators of emerging technologies and science policies with network analysis and text mining
  • Grant number:
    24K16438
  • Fiscal year:
    2024
  • Funding amount:
    --
  • Project type:
    Grant-in-Aid for Early-Career Scientists
FightAMR: Novel global One Health surveillance approach to fight AMR using Artificial Intelligence and big data mining
  • Grant number:
    MR/Y034422/1
  • Fiscal year:
    2024
  • Funding amount:
    --
  • Project type:
    Research Grant
ART: Mining the Rich Vein of Research in Montana
  • Grant number:
    2331325
  • Fiscal year:
    2024
  • Funding amount:
    --
  • Project type:
    Cooperative Agreement
DISES Investigating mercury biogeochemical cycling via mixed-methods in complex artisanal gold mining landscapes and implications for community health
  • Grant number:
    2307870
  • Fiscal year:
    2024
  • Funding amount:
    --
  • Project type:
    Standard Grant
Toward carbon-neutral society: Development of a full-sustainable eco-friendly green mining process for gold recovery
  • Grant number:
    24K17540
  • Fiscal year:
    2024
  • Funding amount:
    --
  • Project type:
    Grant-in-Aid for Early-Career Scientists
Generating green hydrogen from mining wastes
  • Grant number:
    IM240100202
  • Fiscal year:
    2024
  • Funding amount:
    --
  • Project type:
    Mid-Career Industry Fellowships
Novel Hydrophobic Concrete for Durable and Resilient Mining Infrastructure
  • Grant number:
    LP230100288
  • Fiscal year:
    2024
  • Funding amount:
    --
  • Project type:
    Linkage Projects
SBIR Phase I: Electromagnetic-ablative PGM Refining for In-situ Asteroid Mining
  • Grant number:
    2327078
  • Fiscal year:
    2024
  • Funding amount:
    --
  • Project type:
    Standard Grant
Temporal Graph Mining for Anomaly Detection
  • Grant number:
    DP240101547
  • Fiscal year:
    2024
  • Funding amount:
    --
  • Project type:
    Discovery Projects