Data integration for large scale ecological models

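Basic information

  • Grant number:
    NE/R005133/1
  • Principal investigator:
  • Amount:
    $40,900
  • Host institution:
  • Host institution country:
    United Kingdom
  • Project category:
    Research Grant
  • Fiscal year:
    2017
  • Funding country:
    United Kingdom
  • Duration:
    2017 to (no data)
  • Project status:
    Completed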

Project abstract

Ecological models are becoming larger and more complicated, and are being used for an increasingly wide range of applications, from describing trends and mapping distributions to understanding mechanistic relationships and predicting the impact of future scenarios. In response, there has been a huge growth in statistical methods for large-scale ecological models. However, most such methods do not account for the fact that ecological data is inherently heterogeneous, and large datasets typically contain many forms of bias.

Recently, a set of hierarchical Bayesian models (HBMs) has emerged as a promising way of dealing with biased data, particularly for occurrence records and other unstructured data. Many millions of unstructured occurrence records exist, so the potential of these new methods is enormous. Not all data contain biases, though. A minority of biodiversity data is highly structured in terms of the sample locations, fixed protocols and regular sampling. Ideally, we'd like to retain the information about this in our models, but combine it with the much larger sample sizes of unstructured datasets.

Integrated models provide a way to do this. They are a subclass of HBM in which data heterogeneity is modelled explicitly, by treating datasets with different observation processes as independent realisations of the same underlying state. For example, casual observations on GBIF and the Breeding Bird Survey both contain information about whether the population of a particular species was extant at a particular point in space and time.

At present, these integrated models are the preserve of highly competent statisticians. They are hard to specify and difficult to fit and diagnose. One goal of this partnership is to build an extensible framework for fitting integrated models that will make them accessible to a broad community of ecological modellers. This framework, in the form of open source tools, will make it easier for ecologists to handle biased data when addressing large-scale questions about biodiversity.

Although integrated models are attractive from a conceptual standpoint, it is unclear whether their sophistication delivers real benefits over simple ones. In particular, there is an urgent need for some general principles about how to proceed when both structured and unstructured data sources are available. Critical questions include:

Q1. When and how should we combine datasets with different properties?
Q2. Under what circumstances is simple aggregation (i.e. ignoring the different observation processes) better than integration?
Q3. If we suspect the data contain biases, can we detect them and handle them adequately?
Q4. What are the most appropriate metrics for information content and model fit?

These general questions lie at the intersection of the research interests of PI Isaac, Co-I Henrys and Project Partner O'Hara. Each has made some progress towards addressing specific aspects of these questions. Working in partnership would add significant value to each, by taking existing research beyond the specific context and toward general answers to these big questions. It would permit a co-ordinated effort and build a work program of international significance. This pump-priming award would provide a platform for this partnership. The overall aim is to build a framework for inference in large-scale models of species' distribution, and to test it using computer simulations.
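
The central idea of an integrated model, as described in the abstract (datasets with different observation processes treated as independent realisations of the same underlying state), can be sketched as a joint likelihood. What follows is a minimal illustration and not the project's framework: the function names, the single shared occupancy probability `psi`, the per-dataset detection probabilities `p_structured` and `p_unstructured`, and the assumption of no false positives are all choices made here for illustration only.

```python
import math


def _obs_prob(y, p):
    """Probability of observation y (1, 0, or None = site not visited)
    given the species is present and the dataset detects it with probability p."""
    if y is None:
        return 1.0  # an unvisited site contributes no information
    return p if y == 1 else 1.0 - p


def site_log_lik(psi, p_structured, p_unstructured, y_structured, y_unstructured):
    """Log-likelihood for one site.

    Both datasets observe the same latent presence/absence state, but each
    has its own detection probability (its own observation process).
    Assumes no false positives: a detection implies presence.
    """
    # Species present: the two datasets are independent given the state.
    lik_present = (
        psi
        * _obs_prob(y_structured, p_structured)
        * _obs_prob(y_unstructured, p_unstructured)
    )
    # Species absent: only possible if neither dataset recorded a detection.
    any_detection = 1 in (y_structured, y_unstructured)
    lik_absent = 0.0 if any_detection else 1.0 - psi
    return math.log(lik_present + lik_absent)


def joint_log_lik(psi, p_structured, p_unstructured, sites):
    """Sum the per-site log-likelihoods over (y_structured, y_unstructured) pairs."""
    return sum(
        site_log_lik(psi, p_structured, p_unstructured, ys, yu) for ys, yu in sites
    )


if __name__ == "__main__":
    # Hypothetical data: a structured survey visits few sites,
    # unstructured records cover more of them.
    sites = [(1, 0), (0, 0), (None, 1), (None, 0)]
    print(joint_log_lik(psi=0.5, p_structured=0.8, p_unstructured=0.2, sites=sites))
```

Simple aggregation, by contrast, would pool the two columns and fit a single detection probability; comparing the two approaches is the kind of question the abstract raises in Q2.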

Project outputs

Journal articles (7)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Integrated species distribution models: A comparison of approaches under different data quality scenarios
  • DOI:
    10.1111/ddi.13255
  • Published:
    2021
  • Journal:
  • Impact factor:
    4.6
  • Authors:
    Ahmad Suhaimi S
  • Corresponding author:
    Ahmad Suhaimi S
A century of social wasp occupancy trends from natural history collections: spatiotemporal resolutions have little effect on model performance
  • DOI:
    10.1111/icad.12494
  • Published:
    2021-03-30
  • Journal:
  • Impact factor:
    3.5
  • Authors:
    Jonsson, Galina M.;Broad, Gavin R.;Isaac, Nick J. B.
  • Corresponding author:
    Isaac, Nick J. B.
Is more data always better? A simulation study of benefits and limitations of integrated distribution models
  • DOI:
    10.1111/ecog.05146
  • Published:
    2020-07-14
  • Journal:
  • Impact factor:
    5.9
  • Authors:
    Simmonds, Emily G.;Jarvis, Susan G.;O'Hara, Robert B.
  • Corresponding author:
    O'Hara, Robert B.
Integrated species distribution models fitted in INLA are sensitive to mesh parameterisation
  • DOI:
    10.1111/ecog.06391
  • Published:
    2023
  • Journal:
  • Impact factor:
    5.9
  • Authors:
    Dambly L
  • Corresponding author:
    Dambly L
Integrating data from different taxonomic resolutions to better estimate community alpha diversity
  • DOI:
    10.22541/au.169823343.32048210/v1
  • Published:
    2023
  • Journal:
  • Impact factor:
    0
  • Authors:
    Adjei K
  • Corresponding author:
    Adjei K
Other grants held by Nicholas Isaac

GLobal Insect Threat-Response Synthesis (GLiTRS): a comprehensive and predictive assessment of the pattern and consequences of insect declines
  • Grant number:
    NE/V007548/1
  • Fiscal year:
    2020
  • Funding amount:
    $40,900
  • Project category:
    Research Grant
A unified approach to studying animal abundance: integrating evolution, ecology and scale dependency
  • Grant number:
    NE/D009448/2
  • Fiscal year:
    2008
  • Funding amount:
    $40,900
  • Project category:
    Fellowship
A unified approach to studying animal abundance: integrating evolution, ecology and scale dependency
  • Grant number:
    NE/D009448/1
  • Fiscal year:
    2007
  • Funding amount:
    $40,900
  • Project category:
    Fellowship

Similar overseas grants

AI models of multi-omic data integration for ming longevity core signaling pathways
  • Grant number:
    10745189
  • Fiscal year:
    2023
  • Funding amount:
    $40,900
  • Project category:
Collaborative Research: PPoSS: LARGE: Research into the Use and iNtegration of Data Movement Accelerators (RUN-DMX)
  • Grant number:
    2316176
  • Fiscal year:
    2023
  • Funding amount:
    $40,900
  • Project category:
    Continuing Grant
Leveraging complementary big data methods and patient intervention designs to optimize neural markers of adolescent cannabis use
  • Grant number:
    10739527
  • Fiscal year:
    2023
  • Funding amount:
    $40,900
  • Project category:
Harmonizing and Integrating Nursing Data into Multidisciplinary Datasets to Evaluate Hospital Care and Readmissions of Older Adults with Alzheimer's Disease-Related Dementias
  • Grant number:
    10789306
  • Fiscal year:
    2023
  • Funding amount:
    $40,900
  • Project category:
Collaborative Research: PPoSS: LARGE: Research into the Use and iNtegration of Data Movement Accelerators (RUN-DMX)
  • Grant number:
    2316177
  • Fiscal year:
    2023
  • Funding amount:
    $40,900
  • Project category:
    Continuing Grant
Predicting 3D physical gene-enhancer interactions through integration of GTEx and 4DN data
  • Grant number:
    10776871
  • Fiscal year:
    2023
  • Funding amount:
    $40,900
  • Project category:
Biostatistics and Data Analysis
  • Grant number:
    10555807
  • Fiscal year:
    2023
  • Funding amount:
    $40,900
  • Project category:
In silico screening for immune surveillance adaptation in cancer using Common Fund data resources
  • Grant number:
    10773268
  • Fiscal year:
    2023
  • Funding amount:
    $40,900
  • Project category:
MOSAIC: Data Integration and Computation Core
  • Grant number:
    10729426
  • Fiscal year:
    2023
  • Funding amount:
    $40,900
  • Project category:
Implementing a coupled system of integrative ML modeling and data validation for elucidating microglial therapeutic targets in neurodegenerative disease
  • Grant number:
    10699794
  • Fiscal year:
    2023
  • Funding amount:
    $40,900
  • Project category: