Text Processing and Geospatial Uncertainty for Phylogeography of Zoonotic Viruses

Basic Information

Project Abstract

DESCRIPTION (provided by applicant): Phylogeography of zoonotic viruses studies the geographical spread and genetic lineages of viruses that are transmissible between animals and humans, such as avian influenza and rabies. This science can help state public health and agriculture agencies identify the animal hosts that most impact virus propagation in a particular geographic region, the migration path of the virus including its origin, and the patterns of infection in various host populations, including humans, over time. The National Center for Biotechnology Information (NCBI), specifically GenBank, provides an abundance of viral sequence data for phylogeography. Sequences and their metadata can be downloaded and imported into software applications that generate phylogeographic trees and models for surveillance. However, geospatial metadata such as host location is sparse and inconsistently represented across GenBank entries: our preliminary studies show that only about 20% of GenBank records contain specific information such as a county, town, or region within a state. While this detailed geospatial information might be included in the corresponding journal article, it is not available for immediate use in a bioinformatics or GIS application unless it is manually extracted and linked back to the appropriate sequence. The absence of precise sampling locations from easily computable secondary data sources such as GenBank makes accurate phylogeographic models of virus migration harder to achieve. We propose an infrastructure to improve phylogeographic models of virus migration by linking relevant geospatial data from the literature. This work represents the first effort to use geospatial data automatically extracted from journal articles corresponding to GenBank records to enhance modeling of virus migration.
Our research will extend phylogeography and zoonotic surveillance by: creating a Natural Language Processing (NLP) infrastructure that will improve the level of detail of geospatial data for phylogeography of zoonotic viruses (Aim 1), developing phylogeographic models using the data extracted in Aim 1 with adequate biostatistical models (Aim 2), and evaluating the impact of our approach on phylogeography and surveillance of zoonotic viruses (Aim 3). Thus, this work will provide researchers with a framework for population surveillance using an integrated biomedical informatics approach including NLP, biostatistics, bioinformatics, and database design.

Project Outcomes

Journal articles (3)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Knowledge-driven geospatial location resolution for phylogeographic models of virus migration.
  • DOI:
    10.1093/bioinformatics/btv259
  • Publication date:
    2015-06-15
  • Journal:
  • Impact factor:
    0
  • Authors:
    Weissenbacher D;Tahsin T;Beard R;Figaro M;Rivera R;Scotch M;Gonzalez G
  • Corresponding author:
    Gonzalez G
Natural language processing methods for enhancing geographic metadata for phylogeography of zoonotic viruses.

Other grants by GRACIELA GONZALEZ HERNANDEZ

Enriching SARS-CoV-2 sequence data in public repositories with information extracted from full text articles
  • Grant number:
    10681068
  • Fiscal year:
    2022
  • Funding amount:
    $451,500
  • Project category:
AD/ADRD Pilot Core
  • Grant number:
    10491793
  • Fiscal year:
    2021
  • Funding amount:
    $451,500
  • Project category:
AD/ADRD Pilot Core
  • Grant number:
    10274453
  • Fiscal year:
    2021
  • Funding amount:
    $451,500
  • Project category:
AD/ADRD Pilot Core
  • Grant number:
    10907321
  • Fiscal year:
    2021
  • Funding amount:
    $451,500
  • Project category:
Enriching SARS-CoV-2 sequence data in public repositories with information extracted from full text articles
  • Grant number:
    10701081
  • Fiscal year:
    2021
  • Funding amount:
    $451,500
  • Project category:
Enriching SARS-CoV-2 sequence data in public repositories with information extracted from full text articles
  • Grant number:
    10390667
  • Fiscal year:
    2021
  • Funding amount:
    $451,500
  • Project category:
AD/ADRD Pilot Core
  • Grant number:
    10685544
  • Fiscal year:
    2021
  • Funding amount:
    $451,500
  • Project category:
Tracking Evolution and Spread of Viral Genomes by Geospatial Observation Error
  • Grant number:
    9249484
  • Fiscal year:
    2016
  • Funding amount:
    $451,500
  • Project category:
Social Media Mining for Pharmacovigilance
  • Grant number:
    10407315
  • Fiscal year:
    2012
  • Funding amount:
    $451,500
  • Project category:
Mining Social Network Postings for Mentions of Potential Adverse Drug Reactions
  • Grant number:
    8222740
  • Fiscal year:
    2012
  • Funding amount:
    $451,500
  • Project category:

Similar Overseas Grants

Rational design of rapidly translatable, highly antigenic and novel recombinant immunogens to address deficiencies of current snakebite treatments
  • Grant number:
    MR/S03398X/2
  • Fiscal year:
    2024
  • Funding amount:
    $451,500
  • Project category:
    Fellowship
CAREER: FEAST (Food Ecosystems And circularity for Sustainable Transformation) framework to address Hidden Hunger
  • Grant number:
    2338423
  • Fiscal year:
    2024
  • Funding amount:
    $451,500
  • Project category:
    Continuing Grant
Re-thinking drug nanocrystals as highly loaded vectors to address key unmet therapeutic challenges
  • Grant number:
    EP/Y001486/1
  • Fiscal year:
    2024
  • Funding amount:
    $451,500
  • Project category:
    Research Grant
Metrology to address ion suppression in multimodal mass spectrometry imaging with application in oncology
  • Grant number:
    MR/X03657X/1
  • Fiscal year:
    2024
  • Funding amount:
    $451,500
  • Project category:
    Fellowship
CRII: SHF: A Novel Address Translation Architecture for Virtualized Clouds
  • Grant number:
    2348066
  • Fiscal year:
    2024
  • Funding amount:
    $451,500
  • Project category:
    Standard Grant
The Abundance Project: Enhancing Cultural & Green Inclusion in Social Prescribing in Southwest London to Address Ethnic Inequalities in Mental Health
  • Grant number:
    AH/Z505481/1
  • Fiscal year:
    2024
  • Funding amount:
    $451,500
  • Project category:
    Research Grant
ERAMET - Ecosystem for rapid adoption of modelling and simulation METhods to address regulatory needs in the development of orphan and paediatric medicines
  • Grant number:
    10107647
  • Fiscal year:
    2024
  • Funding amount:
    $451,500
  • Project category:
    EU-Funded
BIORETS: Convergence Research Experiences for Teachers in Synthetic and Systems Biology to Address Challenges in Food, Health, Energy, and Environment
  • Grant number:
    2341402
  • Fiscal year:
    2024
  • Funding amount:
    $451,500
  • Project category:
    Standard Grant
Ecosystem for rapid adoption of modelling and simulation METhods to address regulatory needs in the development of orphan and paediatric medicines
  • Grant number:
    10106221
  • Fiscal year:
    2024
  • Funding amount:
    $451,500
  • Project category:
    EU-Funded
Recite: Building Research by Communities to Address Inequities through Expression
  • Grant number:
    AH/Z505341/1
  • Fiscal year:
    2024
  • Funding amount:
    $451,500
  • Project category:
    Research Grant