Approximate Inference for Latent Position Models
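Basic information
- Grant number: RGPIN-2022-03012
- Principal investigator:
- Amount: $22,600
- Host institution:
- Host institution country: Canada
- Program: Discovery Grants Program - Individual
- Fiscal year: 2022
- Funding country: Canada
- Period: 2022-01-01 to 2023-12-31
- Project status: Completed
- Source:
- Keywords:
Project abstract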
Toy Example and Introduction: Fans of competitive games, from hockey to videogames such as DOTA, enjoy ranking teams based on their performance. A very simple statistical model for ranking might assume that every team has a single unobserved latent characteristic (their "true skill"), then model the chance of winning a match as a simple function of the "true skills" of the teams playing. A statistician could use this model to infer the ranking and "true skills" of various teams based on the observed games. Furthermore, the statistician could use tools such as Markov chain Monte Carlo (MCMC) to calculate the uncertainty of each part of the estimated ranking - they might be 98% sure that Tampa Bay is better than Columbus, but only 54% sure that it is better than Vegas. In practice, statisticians develop more complicated models that incorporate many types of team skill, but the basic principles and goals are the same. The NHL has only 32 teams, and so it is easy to fit very complicated models. On the other hand, over 400,000 people play DOTA every day. An algorithm that runs in minutes on your phone for NHL data could take months on your desktop for DOTA data. This discrepancy becomes even worse for calculating certainty estimates. The fundamental "big data" problem illustrated by this example is: the computational cost of fitting latent-position models grows very quickly in the size of the dataset, making many natural statistical analyses computationally intractable. The computational costs grow much more quickly than linearly in the size of the dataset, which means that the problem can't easily be solved by simply buying a slightly better computer. The primary goal of this proposal is to develop algorithms that ameliorate this problem, allowing researchers to fit sophisticated models to substantially larger datasets. The secondary goal is to develop a deeper understanding of the limits of this approach - when one must "give up" and try a different approach. 
Impact: Latent-position models (LPMs) that are almost identical to the "ranking" model described above are not primarily used for sports analysis. They are ubiquitous in cybersecurity (for detecting and prioritizing anomalies), neuroscience (for visualizing functional relationships), and many other areas. Achieving the primary goal would allow more sophisticated versions of these models to be applied to larger datasets, improving inference.
Methodology and Relation to Existing Literature: Achieving the primary goal requires new (i) point estimators related to LPMs and (ii) methods for incorporating such point estimators into MCMC algorithms. The secondary goal is based on new probabilistic "anti-concentration" bounds. Both rely on expertise in MCMC theory. In the long term, solutions to the "big data" problem for MCMC and LPMs will lead to solutions for other MCMC "big-data" problems, such as those for tensor models.
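The toy ranking model in the abstract can be made concrete with a small sketch. The abstract only says that the chance of winning is "a simple function" of the two latent skills, so everything specific here is an assumption made for illustration: the logistic (Bradley-Terry style) link, the normal prior on skills, the random-walk Metropolis sampler, and the three hypothetical teams with made-up results. It is not the method proposed in the grant, only a minimal instance of fitting a one-dimensional latent-skill model with MCMC and reading off the posterior certainty that one team is better than another.

```python
import math
import random


def win_prob(skill_a, skill_b):
    """Assumed logistic link: chance that A beats B given latent skills."""
    return 1.0 / (1.0 + math.exp(-(skill_a - skill_b)))


def log_posterior(skills, games, prior_sd=2.0):
    """Unnormalised log posterior.

    Assumes independent N(0, prior_sd^2) priors on each skill and
    independent game outcomes; `games` is a list of (winner, loser) indices.
    """
    lp = sum(-0.5 * (s / prior_sd) ** 2 for s in skills)
    for winner, loser in games:
        lp += math.log(win_prob(skills[winner], skills[loser]))
    return lp


def metropolis(games, n_teams, n_iter=20000, step=0.5, seed=0):
    """Random-walk Metropolis that perturbs one team's skill per iteration."""
    rng = random.Random(seed)
    skills = [0.0] * n_teams
    current = log_posterior(skills, games)
    samples = []
    for _ in range(n_iter):
        i = rng.randrange(n_teams)
        proposal = list(skills)
        proposal[i] += rng.gauss(0.0, step)
        candidate = log_posterior(proposal, games)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            skills, current = proposal, candidate
        samples.append(list(skills))
    return samples[n_iter // 4:]  # discard the first quarter as burn-in


def prob_better(samples, a, b):
    """Posterior fraction of samples in which team a out-skills team b."""
    return sum(s[a] > s[b] for s in samples) / len(samples)


# Hypothetical results for three teams (indices 0, 1, 2):
# team 0 beats team 1 in 8 of 10 games and splits 10 games with team 2.
games = [(0, 1)] * 8 + [(1, 0)] * 2 + [(0, 2)] * 5 + [(2, 0)] * 5
samples = metropolis(games, n_teams=3)
print("P(team 0 better than team 1):", prob_better(samples, 0, 1))
print("P(team 0 better than team 2):", prob_better(samples, 0, 2))
```

Each Metropolis step above re-evaluates the likelihood over every observed game, so the cost per step already grows with the number of games. That is one simple way to see why a sampler that is comfortable for a small league becomes expensive for a dataset the size of the DOTA example.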
Project outputs
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Smith, Aaron
Estimating the market effect of a food scare: The case of genetically modified StarLink corn
- DOI: 10.1162/rest.89.3.522
- Published: 2007-08-01
- Journal:
- Impact factor: 8
- Authors: Carter, Colin A.; Smith, Aaron
- Corresponding author: Smith, Aaron
Traveling surface undulation on a Ni-Mn-Ga single crystal element
- DOI:
- Published: 2021
- Journal:
- Impact factor: 4.1
- Authors: Armstrong, Andrew; Karki, Bibek; Smith, Aaron; Müllner, Peter
- Corresponding author: Müllner, Peter
Growth-Inhibitory and Immunomodulatory Activities of Wild Mushrooms from North-Central British Columbia (Canada)
- DOI: 10.1615/intjmedmushrooms.v19.i6.10
- Published: 2017-01-01
- Journal:
- Impact factor: 1.2
- Authors: Smith, Aaron; Javed, Sumreen; Lee, Chow H.
- Corresponding author: Lee, Chow H.
Accelerating Asymptotically Exact MCMC for Computationally Intensive Models via Local Approximations
- DOI: 10.1080/01621459.2015.1096787
- Published: 2016-12-01
- Journal:
- Impact factor: 3.7
- Authors: Conrad, Patrick R.; Marzouk, Youssef M.; Smith, Aaron
- Corresponding author: Smith, Aaron
Parallel Local Approximation MCMC for Expensive Models
- DOI: 10.1137/16m1084080
- Published: 2018-01-01
- Journal:
- Impact factor: 2
- Authors: Conrad, Patrick R.; Davis, Andrew D.; Smith, Aaron
- Corresponding author: Smith, Aaron
Other grants held by Smith, Aaron
Mixing Regimes for Adaptive Markov Chain Monte Carlo
- Grant number: RGPIN-2015-05460
- Fiscal year: 2021
- Amount: $22,600
- Program: Discovery Grants Program - Individual
Mixing Regimes for Adaptive Markov Chain Monte Carlo
- Grant number: RGPIN-2015-05460
- Fiscal year: 2020
- Amount: $22,600
- Program: Discovery Grants Program - Individual
Mixing Regimes for Adaptive Markov Chain Monte Carlo
- Grant number: RGPIN-2015-05460
- Fiscal year: 2019
- Amount: $22,600
- Program: Discovery Grants Program - Individual
Mixing Regimes for Adaptive Markov Chain Monte Carlo
- Grant number: RGPIN-2015-05460
- Fiscal year: 2018
- Amount: $22,600
- Program: Discovery Grants Program - Individual
Mixing Regimes for Adaptive Markov Chain Monte Carlo
- Grant number: RGPIN-2015-05460
- Fiscal year: 2017
- Amount: $22,600
- Program: Discovery Grants Program - Individual
Mixing Regimes for Adaptive Markov Chain Monte Carlo
- Grant number: RGPIN-2015-05460
- Fiscal year: 2016
- Amount: $22,600
- Program: Discovery Grants Program - Individual
Examination and Assessment of Hydrologic Controls Utilizing Stable Water Isotopes in Northern Canadian Sparsely Gauged Basins
- Grant number: 460643-2014
- Fiscal year: 2016
- Amount: $22,600
- Program: Postgraduate Scholarships - Doctoral
Mixing Regimes for Adaptive Markov Chain Monte Carlo
- Grant number: RGPIN-2015-05460
- Fiscal year: 2015
- Amount: $22,600
- Program: Discovery Grants Program - Individual
Examination and Assessment of Hydrologic Controls Utilizing Stable Water Isotopes in Northern Canadian Sparsely Gauged Basins
- Grant number: 460643-2014
- Fiscal year: 2015
- Amount: $22,600
- Program: Postgraduate Scholarships - Doctoral
Examination and Assessment of Hydrologic Controls Utilizing Stable Water Isotopes in Northern Canadian Sparsely Gauged Basins
- Grant number: 460643-2014
- Fiscal year: 2014
- Amount: $22,600
- Program: Postgraduate Scholarships - Doctoral
Similar overseas grants
Cognitive and Neural Strategies for Latent Feature Inference
- Grant number: 10662877
- Fiscal year: 2023
- Amount: $22,600
- Program:
Using latent growth modelling approach for integration and inference from multiple longitudinal microbiome datasets.
- Grant number: 475717
- Fiscal year: 2022
- Amount: $22,600
- Program: Studentship Programs
Developing New Algebraic Geometric Information Criteria for Monte Carlo Inference and Model Selection in Latent Variable and Missing Data Problems
- Grant number: 261488-2012
- Fiscal year: 2017
- Amount: $22,600
- Program: Discovery Grants Program - Individual
Theoretical Analysis of Variational Bayesian Inference for Latent Variable Models
- Grant number: 17K12743
- Fiscal year: 2017
- Amount: $22,600
- Program: Grant-in-Aid for Young Scientists (B)
Identification and Statistical Inference in Graphical Models with Feedback and Latent Variables
- Grant number: 1712535
- Fiscal year: 2017
- Amount: $22,600
- Program: Continuing Grant
Design Principles of Learning and Inference Models with Optimal Latent Distributions
- Grant number: 15K16050
- Fiscal year: 2015
- Amount: $22,600
- Program: Grant-in-Aid for Young Scientists (B)
Developing New Algebraic Geometric Information Criteria for Monte Carlo Inference and Model Selection in Latent Variable and Missing Data Problems
- Grant number: 261488-2012
- Fiscal year: 2015
- Amount: $22,600
- Program: Discovery Grants Program - Individual
Effective inference and selection of statistical models to represent latent structure in spatial data
- Grant number: 26330042
- Fiscal year: 2014
- Amount: $22,600
- Program: Grant-in-Aid for Scientific Research (C)
Methods of analysis and inference for social survey data within the framework of latent variable modeling and pairwise likelihood
- Grant number: ES/L009838/1
- Fiscal year: 2014
- Amount: $22,600
- Program: Research Grant
Latent Dirichlet Allocation for Protein Inference in Quantitative Proteomics
- Grant number: 8771434
- Fiscal year: 2014
- Amount: $22,600
- Program: