Textpresso, an information retrieval and extraction system for biological literature

Basic information

Project summary

DESCRIPTION (provided by applicant): An information retrieval and extraction system that processes the full text of biological papers will be developed. A prototype system has been in operation at WormBase for over a year, used by C. elegans researchers as well as WormBase biological curators, and has recently been implemented for yeast at SGD. The system, called Textpresso, separates text into sentences, labels words and phrases according to an ontology (an organized lexicon), and allows queries to be performed on a database of labeled sentences. The current ontology comprises 37 categories of terms, such as "gene," "regulation," "method," etc. Extraction of particular biological facts, such as gene-gene interactions, can be accelerated significantly by ontologies, with Textpresso automatically performing nearly as well as expert curators at identifying sentences; in searches for two uniquely named genes and an interaction term, the ontology confers a threefold increase in search efficiency.

This system will be further developed in three ways. First, the core system will be refined and altered to allow expansion to multiple domains of interest, e.g., model organisms, human disease. Simple modifications to the system and website functionality will be made, including synonyms, search phrases, and case sensitivity. A software package for local installation will be supported. The project team will maintain the Textpresso site (www.textpresso.org), which will include C. elegans and pilot systems, but a software package will be available for installing Textpresso at local sites, e.g., SGD and FlyBase. Second, the ontology will be structured somewhat more deeply and lexica expanded for organism- and field-specific terms. Third, algorithms for information extraction will be implemented. One approach will be the implementation of similarity measures using categories (high-level nodes) of the Textpresso ontology to reduce the dimensionality of associated vector spaces. A second approach will be the development of hidden Markov models to fill slots of a fact template based on the marked-up text. Information extracted will be presented to the user or expert curator.

Public Description: The quality and pace of research depend upon rapid access to published information. This project will provide researchers with a search engine that rapidly gives them the detailed, technical information they want by indexing the complete text of research articles.
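The sentence labeling, category-based query, and category-level similarity described in the abstract can be illustrated with a minimal Python sketch. It is not the Textpresso implementation: the lexicon, the three categories, the sample text, and every function name below are hypothetical, and the sentence splitter is deliberately naive (it would break on abbreviations such as "C. elegans"). The point is only to show how tagging terms with ontology categories lets a query ask for "two genes plus an interaction term", and how counting categories instead of words gives low-dimensional vectors for similarity measures.

```python
import math
import re
from collections import Counter

# Hypothetical toy lexicon mapping terms to ontology categories.
# The real ontology has 37 categories; these entries are assumptions.
LEXICON = {
    "lin-12": "gene",
    "glp-1": "gene",
    "let-23": "gene",
    "interacts": "regulation",
    "suppresses": "regulation",
    "rnai": "method",
}
CATEGORIES = ["gene", "regulation", "method"]


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def label(sentence):
    """Return (term, category) pairs for tokens found in the lexicon."""
    tokens = re.findall(r"[A-Za-z0-9-]+", sentence.lower())
    return [(t, LEXICON[t]) for t in tokens if t in LEXICON]


def query(labeled, required):
    """Keep sentences with at least required[c] distinct terms in category c."""
    hits = []
    for sentence, tags in labeled:
        distinct = Counter(cat for _, cat in set(tags))
        if all(distinct[c] >= n for c, n in required.items()):
            hits.append(sentence)
    return hits


def category_vector(tags, categories):
    """Count labeled terms per category: one dimension per category, not per word."""
    counts = Counter(cat for _, cat in tags)
    return [counts[c] for c in categories]


def cosine(u, v):
    """Cosine similarity; defined as 0.0 when either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


TEXT = (
    "lin-12 interacts with glp-1 in the gonad. "
    "RNAi of let-23 was performed. "
    "The worms were grown at 20 degrees."
)
SENTENCES = split_sentences(TEXT)
labeled = [(s, label(s)) for s in SENTENCES]

# Two distinct genes plus one interaction ("regulation") term.
hits = query(labeled, {"gene": 2, "regulation": 1})

# Compare the first two sentences in category space.
vectors = [category_vector(tags, CATEGORIES) for _, tags in labeled]
similarity = cosine(vectors[0], vectors[1])
```

In this toy run only the first sentence satisfies the query, because it is the only one containing two distinct gene names and a regulation term. The vectors have three dimensions regardless of vocabulary size, which is the dimensionality reduction the abstract alludes to.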

Project outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other grants by PAUL Warren STERNBERG

Curation at scale: Integrating AI into community curation
  • Grant number:
    10621338
  • Fiscal year:
    2021
  • Funding amount:
    $300,000
  • Project category:
Curation at scale: Integrating AI into community curation
  • Grant number:
    10344771
  • Fiscal year:
    2021
  • Funding amount:
    $300,000
  • Project category:
Bipartite gene expression system for C. elegans genetic and neural circuit analysis
  • Grant number:
    9437389
  • Fiscal year:
    2017
  • Funding amount:
    $300,000
  • Project category:
Genetics 2012: Model Organism to Human Cancer
  • Grant number:
    8319996
  • Fiscal year:
    2012
  • Funding amount:
    $300,000
  • Project category:
C. elegans transcriptional regulatory elements
  • Grant number:
    8064423
  • Fiscal year:
    2010
  • Funding amount:
    $300,000
  • Project category:
C. elegans transcriptional regulatory elements
  • Grant number:
    8258290
  • Fiscal year:
    2010
  • Funding amount:
    $300,000
  • Project category:
C. elegans transcriptional regulatory elements
  • Grant number:
    8460166
  • Fiscal year:
    2010
  • Funding amount:
    $300,000
  • Project category:
C. elegans transcriptional regulatory elements
  • Grant number:
    7785896
  • Fiscal year:
    2010
  • Funding amount:
    $300,000
  • Project category:
Textpresso, information retrieval and extraction system for biological literature
  • Grant number:
    7347569
  • Fiscal year:
    2006
  • Funding amount:
    $300,000
  • Project category:
Textpresso, information retrieval and extraction system for biological literature
  • Grant number:
    7212077
  • Fiscal year:
    2006
  • Funding amount:
    $300,000
  • Project category:

Similar overseas grants

Elucidation candidate genes of cleft lip and palate using gene-gene interaction analysis
  • Grant number:
    16K11689
  • Fiscal year:
    2016
  • Funding amount:
    $300,000
  • Project category:
    Grant-in-Aid for Scientific Research (C)
Bioinformatics strategies to relate age of onset with gene-gene interaction
  • Grant number:
    9097781
  • Fiscal year:
    2015
  • Funding amount:
    $300,000
  • Project category:
Omega-3 PUFA-gene interaction in prostate cancer
  • Grant number:
    8215565
  • Fiscal year:
    2012
  • Funding amount:
    $300,000
  • Project category:
Omega-3 PUFA-gene interaction in prostate cancer
  • Grant number:
    8607164
  • Fiscal year:
    2012
  • Funding amount:
    $300,000
  • Project category:
Omega-3 PUFA-gene interaction in prostate cancer
  • Grant number:
    8434839
  • Fiscal year:
    2012
  • Funding amount:
    $300,000
  • Project category:
Omega-3 PUFA-gene interaction in prostate cancer
  • Grant number:
    8815275
  • Fiscal year:
    2012
  • Funding amount:
    $300,000
  • Project category:
GENE-GENE INTERACTION NETWORKS IN GENOME WIDE ASSOCIATION STUDIES
  • Grant number:
    8364348
  • Fiscal year:
    2011
  • Funding amount:
    $300,000
  • Project category:
Diesel, Allergens and Gene Interaction and Child Atopy
  • Grant number:
    7834176
  • Fiscal year:
    2009
  • Funding amount:
    $300,000
  • Project category:
Gene Interaction in Development and Disease
  • Grant number:
    7881930
  • Fiscal year:
    2009
  • Funding amount:
    $300,000
  • Project category:
BIOINFORMATICS OF A GENE INTERACTION MAP
  • Grant number:
    6693150
  • Fiscal year:
    2004
  • Funding amount:
    $300,000
  • Project category: