Data Analysis Using Finite Mixture Models

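Basic information

  • Award number:
    9404479
  • Principal investigator:
  • Amount:
    $116,000
  • Host institution:
  • Host institution country:
    United States
  • Award type:
    Standard Grant
  • Fiscal year:
    1994
  • Funding country:
    United States
  • Period:
    1994-07-01 to 1998-06-30
  • Status:
    Completed

Project abstract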

Finite mixture models are increasingly popular in the social and medical sciences for analyzing data thought to arise from a population consisting of categorical types. This research project considers Bayesian methods for analyzing data using finite mixture models. The application of classical methods to such models is difficult because the likelihoods for such models are nonstandard: they are inherently multimodal due to symmetries in the labeling of the mixture components, may be multimodal even for a single fixed labeling of the components, and fail to satisfy the regularity conditions required for classical likelihood ratio tests. The principal scientific questions of interest concern drawing inferences about model parameters, determining the number of mixture components, and stochastically classifying sampling units into the mixture components. Modern advances in statistical computing, such as the EM and ECM algorithms, data augmentation, and Gibbs sampling, are used to obtain inferences about the posterior distribution of model parameters; however, special care is needed in applying these methods because of the difficulties mentioned above. This research describes methods for distinguishing between the two types of modes described above, and for carrying out the data analysis given the existence of multiple modes. Draws from the posterior distribution of the model parameters can be used to obtain draws from the posterior predictive distribution of replicate experiments similar to the current experiment. The posterior predictive distribution of test statistics, or of other discrepancy measures, can be used to evaluate the fit of a model, e.g., comparing two-component and three-component mixture models even though the problem is irregular. Additionally, averaging over a prior distribution on plausible alternatives to the existing model can be used to estimate the sample size required to assess the appropriateness of the existing model against such alternatives and therefore can be used to inform design decisions.
It is not uncommon to consider a particular class of statistical models, called mixture models, that assume the population of interest consists of a number of relatively homogeneous subpopulations. Mixture models are useful when a relatively complex model would be required to describe the pattern of data that is observed within the entire population, whereas a relatively simple model applies within each subpopulation. Classical approaches are of limited use in such cases, e.g., classical methods do not apply to the crucial question of determining whether subpopulations are in evidence and, if so, how many. This research proposal aims to develop new methods for analyzing data using mixture models and for assessing the adequacy of such models. These new methods take advantage of recent theoretical and computational advances to draw accurate inferences about the important features of the mixture models. The basic approach is to average over all descriptions of the population that are supported by the data and thereby provide an accurate assessment of the variation and patterns to be expected in the population.
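The abstract names the EM algorithm and the labeling symmetry of mixture likelihoods but gives no implementation. The following is a minimal illustrative sketch, not the project's method: it fits a two-component normal mixture by EM to simulated data and then checks that swapping the component labels leaves the log-likelihood unchanged. The function names, the simulated data, the starting values, and the variance floor are all assumptions made for this example.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def log_likelihood(data, weight, means, sds):
    """Observed-data log-likelihood of a two-component normal mixture.

    `weight` is the mixing proportion of component 0.
    """
    total = 0.0
    for x in data:
        total += math.log(
            weight * normal_pdf(x, means[0], sds[0])
            + (1.0 - weight) * normal_pdf(x, means[1], sds[1])
        )
    return total


def em_two_component(data, weight, means, sds, n_iter=200):
    """Run EM from the given starting values; returns the fitted parameters."""
    means, sds = list(means), list(sds)
    for _ in range(n_iter):
        # E-step: posterior probability that each unit belongs to component 0.
        resp = []
        for x in data:
            p0 = weight * normal_pdf(x, means[0], sds[0])
            p1 = (1.0 - weight) * normal_pdf(x, means[1], sds[1])
            resp.append(p0 / (p0 + p1))
        # M-step: weighted proportion, means, and standard deviations.
        n0 = sum(resp)
        n1 = len(data) - n0
        weight = n0 / len(data)
        means[0] = sum(r * x for r, x in zip(resp, data)) / n0
        means[1] = sum((1.0 - r) * x for r, x in zip(resp, data)) / n1
        var0 = sum(r * (x - means[0]) ** 2 for r, x in zip(resp, data)) / n0
        var1 = sum((1.0 - r) * (x - means[1]) ** 2 for r, x in zip(resp, data)) / n1
        # Floor the variances (an assumption) to avoid degenerate solutions.
        sds[0] = math.sqrt(max(var0, 1e-6))
        sds[1] = math.sqrt(max(var1, 1e-6))
    return weight, means, sds


# Simulated data (hypothetical): about 30% from N(0, 1), the rest from N(5, 1).
rng = random.Random(0)
data = [rng.gauss(0.0, 1.0) if rng.random() < 0.3 else rng.gauss(5.0, 1.0)
        for _ in range(500)]

weight, means, sds = em_two_component(data, 0.5, [1.0, 4.0], [1.0, 1.0])

# Swapping the component labels gives a different parameter point with the
# same likelihood, so every mode has a mirror image.
ll = log_likelihood(data, weight, means, sds)
ll_swapped = log_likelihood(data, 1.0 - weight, means[::-1], sds[::-1])
print(round(weight, 2), [round(m, 2) for m in means], round(ll - ll_swapped, 8))
```

The label-swap check illustrates only the first kind of multimodality described in the abstract. Modes that persist under a single fixed labeling would show up as EM runs from different starting values converging to parameter sets that are not relabelings of one another.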

Project outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other publications by Donald Rubin

Trans-illumination Devices: Improving IV Insertion Accuracy and Success Rates
  • DOI:
    10.1016/j.jopan.2022.05.015
  • Published:
    2022-08-01
  • Journal:
  • Impact factor:
  • Authors:
    Ivy Mendoza; Patricia L. Ryan; Conny Villareal; Maria Latido; Maria Anicoche; Patricia Bulacan; Dawn McDowell; Dustin Te; Amarylis Ortega; Donald Rubin; Maelynn Mendoza
  • Corresponding author:
    Maelynn Mendoza
The Impact of Education Abroad Participation on College Student Success Among First-Generation Students
  • DOI:
  • Published:
    2023
  • Journal:
  • Impact factor:
    3.1
  • Authors:
    Anthony C. Ogden; H. Ho; Yeana W. Lam; Angela Bell; Rachana Bhatt; Leslie Hodges; Coryn Shiflet; Donald Rubin
  • Corresponding author:
    Donald Rubin
Happy Accidents: Serendipity in Modern Medical Breakthroughs in the Twentieth Century
Launching Effectiveness Research to Guide Practice in Neurosurgery: A National Institute Neurological Disorders and Stroke Workshop Report
  • DOI:
  • Published:
    2017
  • Journal:
  • Impact factor:
    4.8
  • Authors:
    P. Walicke; A. Abosch; A. Asher; Fred G. Barker; Z. Ghogawala; R. Harbaugh; L. Jehi; J. Kestle; W. Koroshetz; Roderick J. Little; Donald Rubin; A. Valadka; Stephen Wisniewski; E. A. Chiocca
  • Corresponding author:
    E. A. Chiocca

Other grants by Donald Rubin

Collaborative Research: Generalized Propensity Score Methods
  • Award number:
    0550887
  • Fiscal year:
    2006
  • Funding amount:
    $116,000
  • Award type:
    Continuing Grant
Multiple Imputation: Research for the Third Decade
  • Award number:
    9705158
  • Fiscal year:
    1997
  • Funding amount:
    $116,000
  • Award type:
    Continuing Grant
Bridging Randomized Experiments and Observational Studies
  • Award number:
    9709359
  • Fiscal year:
    1997
  • Funding amount:
    $116,000
  • Award type:
    Continuing Grant
Causal Inference Applied to Income Effects
  • Award number:
    9423018
  • Fiscal year:
    1995
  • Funding amount:
    $116,000
  • Award type:
    Standard Grant
Applications of Modern Statistical Thinking to the Social Sciences
  • Award number:
    9207456
  • Fiscal year:
    1992
  • Funding amount:
    $116,000
  • Award type:
    Continuing Grant
Mathematical Sciences: Topics in Nonparametric and Semiparametric Regression and Correlation Analysis
  • Award number:
    9106488
  • Fiscal year:
    1991
  • Funding amount:
    $116,000
  • Award type:
    Standard Grant
Mathematical Sciences Research Equipment 1990
  • Award number:
    9005696
  • Fiscal year:
    1990
  • Funding amount:
    $116,000
  • Award type:
    Standard Grant
Extending and Implementing Multiple Imputation Technology
  • Award number:
    8805433
  • Fiscal year:
    1988
  • Funding amount:
    $116,000
  • Award type:
    Continuing Grant
Collaborative Research on the Recalibration of Categorical Data to Achieve Comparability Over Time
  • Award number:
    8311428
  • Fiscal year:
    1983
  • Funding amount:
    $116,000
  • Award type:
    Standard Grant
A Consumer Health Policy Information and Resource Center For New York City
  • Award number:
    7923391
  • Fiscal year:
    1980
  • Funding amount:
    $116,000
  • Award type:
    Continuing Grant

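Similar NSFC (National Natural Science Foundation of China) grants

Scalable Learning and Optimization: High-dimensional Models and Online Decision-Making Strategies for Big Data Analysis
  • Award number:
  • Approval year:
    2024
  • Funding amount:
  • Award type:
    Collaborative Innovation Research Team
Intelligent Patent Analysis for Optimized Technology Stack Selection: Blockchain Business Registry Case Demonstration
  • Award number:
  • Approval year:
    2024
  • Funding amount:
  • Award type:
    Research Fund for International Scientists
Meta-analysis-based study of a model of irrigation-driven cotton yield increase in Xinjiang
  • Award number:
    41601604
  • Approval year:
    2016
  • Funding amount:
    CNY 220,000
  • Award type:
    Young Scientists Fund
Research on meta-analysis methods for large-scale microarray datasets
  • Award number:
    31100958
  • Approval year:
    2011
  • Funding amount:
    CNY 200,000
  • Award type:
    Young Scientists Fund
Elucidating the artemisinin biosynthetic pathway using retrobiosynthetic NMR analysis
  • Award number:
    30470153
  • Approval year:
    2004
  • Funding amount:
    CNY 220,000
  • Award type:
    General Program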

Similar overseas grants

ERI: Data-Driven Analysis and Dynamic Modeling of Residential Power Demand Behavior: Using Long-Term Real-World Data from Rural Electric Systems
  • Award number:
    2301411
  • Fiscal year:
    2024
  • Funding amount:
    $116,000
  • Award type:
    Standard Grant
I-Corps: Vision analysis system using inferred three-dimensional data to analyze and correct a user’s pose in relation to 3D space
  • Award number:
    2403992
  • Fiscal year:
    2024
  • Funding amount:
    $116,000
  • Award type:
    Standard Grant
Developing statistical methods for structural change analysis using panel data
  • Award number:
    24K16343
  • Fiscal year:
    2024
  • Funding amount:
    $116,000
  • Award type:
    Grant-in-Aid for Early-Career Scientists
Collaborative Research: III: Medium: Algorithms for scalable inference and phylodynamic analysis of tumor haplotypes using low-coverage single cell sequencing data
  • Award number:
    2415562
  • Fiscal year:
    2023
  • Funding amount:
    $116,000
  • Award type:
    Standard Grant
REU Site: University of North Carolina at Greensboro - Complex Data Analysis using Statistical and Machine Learning Tools
  • Award number:
    2244160
  • Fiscal year:
    2023
  • Funding amount:
    $116,000
  • Award type:
    Standard Grant
Optimising Air Transport Route & Demand Planning Using AI Powered Data Analysis
  • Award number:
    10079293
  • Fiscal year:
    2023
  • Funding amount:
    $116,000
  • Award type:
    Collaborative R&D
Development of multi omics data analysis method using short/long read integration and complete human reference sequences
  • Award number:
    23K11300
  • Fiscal year:
    2023
  • Funding amount:
    $116,000
  • Award type:
    Grant-in-Aid for Scientific Research (C)
AI-Based Support System for Problem Creation and Sharing by Learners and Teachers using Educational Data Analysis
  • Award number:
    23K17012
  • Fiscal year:
    2023
  • Funding amount:
    $116,000
  • Award type:
    Grant-in-Aid for Early-Career Scientists
Multimodal analysis using pen input data and gaze data in science and mathematics e-learning
  • Award number:
    23K17589
  • Fiscal year:
    2023
  • Funding amount:
    $116,000
  • Award type:
    Grant-in-Aid for Challenging Research (Exploratory)
Strengthening implementation science in Acute Respiratory Failure using multilevel analysis of existing data
  • Award number:
    10731311
  • Fiscal year:
    2023
  • Funding amount:
    $116,000
  • Award type: