Computational Methods for Structured Data and Models
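Basic information
- Award number: 2113079
- Principal investigator:
- Amount: $150,000
- Host institution:
- Host institution country: United States
- Project type: Standard Grant
- Fiscal year: 2021
- Funding country: United States
- Project period: 2021-08-15 to 2024-07-31
- Project status: Completed
- Source:
- Keywords: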
Project abstract
Scientists in many fields, including genetics, neuroscience, ecology, and economics, are obtaining richer measurements of more complex processes than ever before. This offers thrilling opportunities to answer scientific questions that were previously out of reach. For instance, measurements of disease prevalence collected at many locations over time provide the opportunity to estimate the spread of disease. Latent variable models, which model the observed data as a simple transformation of unobserved latent random variables, are a popular approach to extracting answers to scientific questions from measurements of complex processes. They are flexible, but they can also be computationally prohibitive. As a result, scientists using latent variable models may have to settle for approximations of unknown quality or ad-hoc simplifications that provide poor estimates or fail to answer questions of interest. This project aims to develop novel methods for fitting latent variable models that are more computationally efficient, reliable, and accessible. Involvement in the project will train statisticians at all levels, with a focus on statisticians from populations that are underrepresented in statistics research. Specifically, the investigator will supervise graduate student involvement in the research, guide the development of engaging outreach materials by undergraduate students, participate in outreach at local high schools, and lead writing groups for early-career faculty.
A variety of computational challenges arise when fitting latent variable models. It can be difficult to characterize and simulate from the conditional distribution of the latent variables given observed data, even when the small number of parameters characterizing the latent variable model are known. Furthermore, it can be difficult to estimate the unknown parameters of a latent variable model because the likelihood of the data corresponds to a high dimensional integral for which a closed-form expression may be unavailable or expensive to evaluate. Even when feasible methods are available, it can be difficult for practitioners to implement latent variable models without access to open-source software and detailed tutorials. Accordingly, this project aims to contribute (i) novel methods for simulating from the conditional distributions of high dimensional latent variables, (ii) improved methods for maximum likelihood estimation of latent variable model parameters, and (iii) versatile statistical software that allows practitioners to implement them. Regarding (i), the PI plans to develop novel pathwise methods for simulating from the conditional distributions of high dimensional latent variables given data that leverage the relationship of the target conditional distribution to related or approximate distributions. Regarding (ii), the PI will develop improved methods for maximum likelihood estimation of latent variable model parameters that leverage the pathwise simulation methods introduced in the first aim. Regarding (iii), the PI will apply the new methods to disease mapping and genome-wide association studies and develop R packages that allow other practitioners to implement the methods.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
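The two computational difficulties named in the abstract, a likelihood that is an integral over latent variables and simulation from the conditional distribution of the latent variables given data, can be illustrated on a toy model. The sketch below is not the project's method, which the abstract does not specify. It assumes a one-dimensional Gaussian model chosen only because the exact answers are known: a latent z ~ N(0, tau2) and an observation y = z + noise with noise ~ N(0, sigma2). The function names and parameter values are hypothetical. The likelihood p(y) is approximated by plain Monte Carlo averaging of p(y | z) over prior draws of z, and conditional draws of z given y are produced with Matheron's rule, a standard Gaussian identity that corrects a joint prior draw toward the observed value.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def mc_likelihood(y, tau2, sigma2, n_draws, rng):
    """Monte Carlo estimate of p(y), the integral of p(y | z) p(z) over z.

    Toy assumption: z ~ N(0, tau2) and y | z ~ N(z, sigma2).
    """
    prior_sd = math.sqrt(tau2)
    total = 0.0
    for _ in range(n_draws):
        z = rng.gauss(0.0, prior_sd)        # draw the latent variable from its prior
        total += normal_pdf(y, z, sigma2)   # evaluate p(y | z) at that draw
    return total / n_draws


def exact_likelihood(y, tau2, sigma2):
    """Closed form for the toy model: marginally, y ~ N(0, tau2 + sigma2)."""
    return normal_pdf(y, 0.0, tau2 + sigma2)


def conditional_draw(y_obs, tau2, sigma2, rng):
    """One draw of z given y = y_obs using Matheron's rule.

    A joint prior draw (z, y_sim) is shifted by the regression coefficient
    Cov(z, y) / Var(y) = tau2 / (tau2 + sigma2) times the residual.
    """
    z = rng.gauss(0.0, math.sqrt(tau2))
    y_sim = z + rng.gauss(0.0, math.sqrt(sigma2))
    return z + tau2 / (tau2 + sigma2) * (y_obs - y_sim)


if __name__ == "__main__":
    rng = random.Random(0)
    tau2, sigma2, y_obs = 2.0, 0.5, 1.0
    print("Monte Carlo likelihood:", mc_likelihood(y_obs, tau2, sigma2, 20000, rng))
    print("Exact likelihood:      ", exact_likelihood(y_obs, tau2, sigma2))
    draws = [conditional_draw(y_obs, tau2, sigma2, rng) for _ in range(20000)]
    print("Mean of conditional draws:", sum(draws) / len(draws))
```

In this toy model the conditional distribution of z given y is normal with mean tau2 / (tau2 + sigma2) * y and variance tau2 * sigma2 / (tau2 + sigma2), so both estimates can be checked against exact values. In the high dimensional, non-Gaussian settings the abstract describes, no such closed forms are available and naive Monte Carlo averaging of this kind typically becomes very inefficient, which is the gap the project's aims address.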
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Maryclare Griffin
Modeling Nonlinear Growth Followed by Long-Memory Equilibrium with Unknown Change Point
- DOI:
- Published: 2020
- Journal:
- Impact factor: 0
- Authors: Wenyu Zhang; Maryclare Griffin; D. Matteson
- Corresponding author: D. Matteson
Improved Pathwise Coordinate Descent for Power Penalties
- DOI:
- Published: 2022
- Journal:
- Impact factor: 2.4
- Authors: Maryclare Griffin
- Corresponding author: Maryclare Griffin
Lasso ANOVA decompositions for matrix and tensor data
- DOI:
- Published: 2017
- Journal:
- Impact factor: 1.8
- Authors: Maryclare Griffin; P. Hoff
- Corresponding author: P. Hoff
Likelihood Inference for Possibly Non-Stationary Processes via Adaptive Overdifferencing
- DOI:
- Published: 2020
- Journal:
- Impact factor: 0
- Authors: Maryclare Griffin; G. Samorodnitsky; David S. Matteson
- Corresponding author: David S. Matteson
Structured Shrinkage Priors
- DOI: 10.1080/10618600.2023.2233577
- Published: 2019
- Journal:
- Impact factor: 2.4
- Authors: Maryclare Griffin; P. Hoff
- Corresponding author: P. Hoff
Other grants held by Maryclare Griffin
Enhancing Underrepresented Participation in Mathematics & Statistics: Mentoring From Junior to Master’s
- Award number: 2130262
- Fiscal year: 2022
- Amount: $150,000
- Project type: Standard Grant
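Similar grants from the National Natural Science Foundation of China
Computational Methods for Analyzing Toponome Data
- Award number: 60601030
- Approval year: 2006
- Amount: 170,000 CNY
- Project type: Young Scientists Fund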
Similar overseas grants
Statistical Challenges and Methods in the Analysis of High Dimensional and Complex Structured Data
- Award number: RGPIN-2018-05475
- Fiscal year: 2022
- Amount: $150,000
- Project type: Discovery Grants Program - Individual
Novel methods for network-structured time series modelling
- Award number: 2751518
- Fiscal year: 2022
- Amount: $150,000
- Project type: Studentship
Statistical Methods for Analyzing Complex Structured and Count Data
- Award number: 2210019
- Fiscal year: 2022
- Amount: $150,000
- Project type: Standard Grant
Statistical Challenges and Methods in the Analysis of High Dimensional and Complex Structured Data
- Award number: RGPIN-2018-05475
- Fiscal year: 2021
- Amount: $150,000
- Project type: Discovery Grants Program - Individual
Optimization Methods for Nonconvex Structured Optimization
- Award number: 2110722
- Fiscal year: 2021
- Amount: $150,000
- Project type: Standard Grant
Information-Based Complexity Analysis and Optimal Methods for Saddle-Point Structured Optimization
- Award number: 2053493
- Fiscal year: 2021
- Amount: $150,000
- Project type: Continuing Grant
III: Small: Accessible and Interpretable Machine Reading Methods for Extracting Structured Information from Text
- Award number: 2006583
- Fiscal year: 2020
- Amount: $150,000
- Project type: Continuing Grant
Statistical Challenges and Methods in the Analysis of High Dimensional and Complex Structured Data
- Award number: RGPIN-2018-05475
- Fiscal year: 2020
- Amount: $150,000
- Project type: Discovery Grants Program - Individual
Effective Algorithms for Structured Nonconvex Optimization Based on First- and Second-Order Methods and Convex Relaxations
- Award number: 2445089
- Fiscal year: 2020
- Amount: $150,000
- Project type: Studentship
Development of high-performance graph mining methods for graph structured data using various additional information
- Award number: 19K12102
- Fiscal year: 2019
- Amount: $150,000
- Project type: Grant-in-Aid for Scientific Research (C)