SEER RRSS Using Multiple Imputation to Enhance the Utility of SEER Summary Stage

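Basic information

  • Grant number:
    8351006
  • Principal investigator:
  • Amount:
    $53,000
  • Host institution:
  • Host institution country:
    United States
  • Project category:
  • Fiscal year:
    2011
  • Funding country:
    United States
  • Project period:
    2011-09-30 to 2012-09-29
  • Project status:
    Completed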

Project abstract

Multiple imputation (MI) methods have been widely used in many scientific fields to address missing data. Several statistical software packages have implemented MI procedures, but their performance varies. Attempts have been made to compare the performance of MI procedures across statistical packages. Because of the complexity of MI, these comparisons were based on specific settings, such as assuming that the missing pattern was monotone, that the missing variable was semi-continuous, or that the missing data were simple artificial data. None of these comparisons addressed which methods are best used in a large data set, such as the SEER registry data, with a non-monotone missing pattern and many variables. This study will investigate issues specific to the SEER registry data when using MI methods to handle missing data. It will provide guidance to researchers in the cancer registry community on the proper handling of missing data in cancer registry data.

Missing data are a frequent problem in most scientific studies and a common feature of large data sets in general and medical data sets in particular. They can cause bias or lead to inefficient analyses if not handled properly. Because of the program's high quality standards, the SEER data have only a small fraction of missing data for most of the variables collected. However, one very important variable, the SEER Summary Stage (1977, 2000 and CS), contains a higher percentage of missing or unknown data, especially for certain cancer sites. For example, 9.8% of the lung cancer cases and 22% of the liver cancer cases were coded as unknown for the variable SEER Summary Stage 2000 in the 2001-2003 SEER data.

The complete case method (listwise deletion) is the method most commonly used by researchers in the cancer registry community to address this missing data issue. If data are not missing completely at random, using the complete case method can introduce bias and generate incorrect results. For the same study data mentioned above, the distributions of age at cancer diagnosis for cases with known stage and cases with unknown stage were significantly different: 34% of cases with known stage were 75 years old or older, compared with 54% of cases with unknown stage. This strongly suggests that cases with unknown stage were not missing completely at random. Hence, the complete case method is not an ideal method to analyze the data. Coding the cases with unknown stage as a separate sub-category does include all cases in the analysis, but severe bias has been found for this type of analysis when data are not missing completely at random.

Compared to the complete case method, MI, one of the more sophisticated methods for handling missing data, provides superior estimates when data are missing at random. First proposed in 1978, MI has become an important and influential approach in the statistical analysis of missing data in recent years because it is easy to use and readily available in many statistical packages. MI replaces each missing value with a set of plausible values that represent the uncertainty about the most appropriate value to impute, then combines the results of separate analyses of each completed data set to generate the final estimates. It has been suggested that MI often provides valid and robust inferences even when its assumptions are not met.
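
The combining step described above is commonly carried out with Rubin's rules. The following is a minimal Python sketch of that pooling step only, not the procedure used in this project; the function name `pool_rubin` and the example numbers are hypothetical and chosen purely for illustration. It assumes each of the m completed data sets has already been analyzed and has yielded a point estimate and its variance.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool m per-imputation estimates with Rubin's rules.

    estimates: point estimates, one per completed data set
    variances: squared standard errors of those estimates
    Returns (pooled estimate, total variance, pooled standard error).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations and matching lengths")

    q_bar = mean(estimates)        # pooled point estimate
    within = mean(variances)       # average within-imputation variance
    between = variance(estimates)  # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between  # total variance of the pooled estimate
    return q_bar, total, math.sqrt(total)


# Hypothetical example: an estimated proportion from five imputed data sets.
est = [0.52, 0.54, 0.53, 0.55, 0.51]
var = [0.0004] * 5
pooled, total_var, se = pool_rubin(est, var)
print(f"pooled estimate={pooled:.3f}, standard error={se:.4f}")
```

The between-imputation term is what distinguishes MI from single imputation: it inflates the variance to reflect uncertainty about the imputed values themselves.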

Project outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other grants held by THOMAS TUCKER

SEER-LINKED VIRTUAL TISSUE REPOSITORY (VTR) PROGRAM
  • Grant number:
    10976186
  • Fiscal year:
    2023
  • Funding amount:
    $53,000
  • Project category:
NCI SEER-LINKED PEDIATRIC WHOLE SLIDE IMAGING (POP: 8/17/2020 - 8/16/2021)
  • Grant number:
    10272821
  • Fiscal year:
    2020
  • Funding amount:
    $53,000
  • Project category:
Patterns of Care/Quality of Care Study: Diagnosis Year 2013 (SEER) Period of Performance: 08/15/2014-08/14/2015 Line item #: 1
  • Grant number:
    8928277
  • Fiscal year:
    2014
  • Funding amount:
    $53,000
  • Project category:
IGF::OT::IGF 402 - NORTH AMERICAN ASSOCIATION OF CENTRAL CANCER REGISTRIES, INC. (NAACCR); TECHNICAL SUPPORT FOR CANCER SURVEILLANCE; POP 07/01/2014-06/30/2014
  • Grant number:
    8885293
  • Fiscal year:
    2014
  • Funding amount:
    $53,000
  • Project category:
TAS::75 0849::TAS SURVEILLANCE, EPIDEMIOLOGY, AND END RESULTS (SEER) PROGRAM
  • Grant number:
    8522048
  • Fiscal year:
    2012
  • Funding amount:
    $53,000
  • Project category:
Patterns of Care (POC) Quality of Care Dx Yr 2011
  • Grant number:
    8565231
  • Fiscal year:
    2012
  • Funding amount:
    $53,000
  • Project category:
TAS::75 0849::TAS SURVEILLANCE, EPIDEMIOLOGY, AND END RESULTS (SEER) PROGRAM
  • Grant number:
    8317499
  • Fiscal year:
    2011
  • Funding amount:
    $53,000
  • Project category:
TAS::75 0849::TAS SURVEILLANCE, EPIDEMIOLOGY, AND END RESULTS (SEER) PROGRAM
  • Grant number:
    8317500
  • Fiscal year:
    2011
  • Funding amount:
    $53,000
  • Project category:
TAS::75 0849::TAS SURVEILLANCE, EPIDEMIOLOGY, AND END RESULTS (SEER) PROGRAM
  • Grant number:
    8131535
  • Fiscal year:
    2010
  • Funding amount:
    $53,000
  • Project category:
TAS::75 0849::TAS SURVEILLANCE, EPIDEMIOLOGY, AND END RESULTS (SEER) PROGRAM
  • Grant number:
    8163679
  • Fiscal year:
    2010
  • Funding amount:
    $53,000
  • Project category:

Similar overseas grants

How Does Particle Material Properties Insoluble and Partially Soluble Affect Sensory Perception Of Fat based Products
  • Grant number:
    BB/Z514391/1
  • Fiscal year:
    2024
  • Funding amount:
    $53,000
  • Project category:
    Training Grant
BRC-BIO: Establishing Astrangia poculata as a study system to understand how multi-partner symbiotic interactions affect pathogen response in cnidarians
  • Grant number:
    2312555
  • Fiscal year:
    2024
  • Funding amount:
    $53,000
  • Project category:
    Standard Grant
RII Track-4:NSF: From the Ground Up to the Air Above Coastal Dunes: How Groundwater and Evaporation Affect the Mechanism of Wind Erosion
  • Grant number:
    2327346
  • Fiscal year:
    2024
  • Funding amount:
    $53,000
  • Project category:
    Standard Grant
Graduating in Austerity: Do Welfare Cuts Affect the Career Path of University Students?
  • Grant number:
    ES/Z502595/1
  • Fiscal year:
    2024
  • Funding amount:
    $53,000
  • Project category:
    Fellowship
Insecure lives and the policy disconnect: How multiple insecurities affect Levelling Up and what joined-up policy can do to help
  • Grant number:
    ES/Z000149/1
  • Fiscal year:
    2024
  • Funding amount:
    $53,000
  • Project category:
    Research Grant
Construction of Affect-X, an index of individual differences in sensibility, and establishment of a foundation for bespoke AI services (original title: 感性個人差指標 Affect-X の構築とビスポークAIサービスの基盤確立)
  • Grant number:
    23K24936
  • Fiscal year:
    2024
  • Funding amount:
    $53,000
  • Project category:
    Grant-in-Aid for Scientific Research (B)
How does metal binding affect the function of proteins targeted by a devastating pathogen of cereal crops?
  • Grant number:
    2901648
  • Fiscal year:
    2024
  • Funding amount:
    $53,000
  • Project category:
    Studentship
Investigating how double-negative T cells affect anti-leukemic and GvHD-inducing activities of conventional T cells
  • Grant number:
    488039
  • Fiscal year:
    2023
  • Funding amount:
    $53,000
  • Project category:
    Operating Grants
New Tendencies of French Film Theory: Representation, Body, Affect
  • Grant number:
    23K00129
  • Fiscal year:
    2023
  • Funding amount:
    $53,000
  • Project category:
    Grant-in-Aid for Scientific Research (C)
The Protruding Void: Mystical Affect in Samuel Beckett's Prose
  • Grant number:
    2883985
  • Fiscal year:
    2023
  • Funding amount:
    $53,000
  • Project category:
    Studentship