Design and Analysis for Cancer Epidemiology Studies


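Basic Information

  • Grant number:
    7059077
  • Principal investigator:
  • Amount:
    $74,300
  • Host institution:
  • Host institution country:
    United States
  • Project category:
  • Fiscal year:
    2005
  • Funding country:
    United States
  • Project period:
    2005-09-30 to 2007-08-31
  • Project status:
    Completed

Project Abstract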

DESCRIPTION (provided by applicant): The overall goal of this research is to develop novel statistical methods for addressing the difficult issue of multiplicity in current cancer etiology. Cancer etiology studies, which aim to identify determinants of cancer and quantify their role, are intrinsically multi-factorial because of the multi-step nature of carcinogenesis and the multiple extrinsic factors that turn normal cells into malignant ones. Multiplicity inflates the false positive rate. In the simplest example, searching for a cutpoint of one quantitative biomarker for disease status, the common practice of examining different cutpoints and picking the one with the smallest p-value results in a highly inflated false positive rate. Even in the largest studies, statistical power for testing interactions quickly diminishes, sample sizes rapidly become inadequate with stratification, and risk estimates become unstable. Because there are so many risk factors, model overfitting is a common problem and the predictive performance of the statistical model is poor. It is thus not surprising that even main effects (e.g., candidate gene associations) have proven notoriously difficult to replicate, and reported interactions even harder. The multiplicity issue is acute today as more biomarkers of risk exposures, and even entire pathways comprising easily dozens of genes and their environmental substrates, become available. An effective means to reduce overfitting and prediction error is to constrain model parameters, as in the least absolute shrinkage and selection operator (lasso), to eliminate the large number of irrelevant variables (e.g., genes). Finding the maximum likelihood estimate (MLE) in such regression models with a large number of variables is challenging. Since some measures of exposure may not be indicative of cancer, and these irrelevant variables reduce the accuracy of the regression model, selecting the most relevant variables into the model would be a significant step.
However, classic methods for model/variable selection have not had much success in biomedical applications because they too aggressively eliminate significant predictors and are numerically unstable due to collinearity. This pilot project application focuses on the logistic regression model commonly used in cancer etiology studies. Building upon the novel accelerated expectation-maximization (EM) algorithm we developed for variable selection in linear models, we propose to develop fast variable selection procedures for the logistic regression model that reduce overfitting and have improved predictive properties; to develop computer programs; to conduct simulation studies to assess the performance of the method/algorithm; and to analyze the esophageal data from two currently NCI-funded studies. Upon completion of the proposed research, the methods/algorithms developed can be used to analyze cancer epidemiology data more effectively and efficiently. The research also provides a basis for further development of the approach into a potential R01 application. Future studies can include extensions to multinomial (i.e., multi-class) logistic regression models for cancer outcomes, to the Cox regression model for time-to-event data such as time to advanced cancer, and to Bayesian hierarchical modeling and model selection that incorporate prior biological knowledge about pathways, which will enhance the ability to detect real causal effects.
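
The abstract's cutpoint example can be illustrated with a small simulation. The sketch below is not the applicant's method; it is a minimal illustration under assumed settings (a standard normal biomarker, disease status drawn independently with probability 0.5, nine candidate cutpoints at the sample deciles, and an uncorrected Pearson chi-square test). All function names and parameter values are hypothetical. Because the biomarker and disease status are independent by construction, every "significant" result is a false positive.

```python
import math
import random


def chi2_pvalue_2x2(a, b, c, d):
    """Pearson chi-square p-value (1 df, no continuity correction) for a 2x2 table."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 1.0  # degenerate table: treat as no evidence of association
    stat = n * (a * d - b * c) ** 2 / denom
    # For 1 degree of freedom, P(chi2 > stat) = erfc(sqrt(stat / 2)).
    return math.erfc(math.sqrt(stat / 2.0))


def pvalue_at_cutpoint(x, y, cut):
    """Dichotomize biomarker x at `cut` and test association with status y."""
    a = b = c = d = 0
    for xi, yi in zip(x, y):
        if xi > cut:
            if yi:
                a += 1
            else:
                b += 1
        else:
            if yi:
                c += 1
            else:
                d += 1
    return chi2_pvalue_2x2(a, b, c, d)


def simulate(n_sims=1000, n=200, alpha=0.05, seed=0):
    """Return (false positive rate at one prespecified cutpoint,
    false positive rate when the smallest p-value over nine cutpoints is used)."""
    rng = random.Random(seed)
    hits_single = 0
    hits_min = 0
    for _ in range(n_sims):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        y = [rng.random() < 0.5 for _ in range(n)]  # independent of x: the null is true
        xs = sorted(x)
        cuts = [xs[n * k // 10] for k in range(1, 10)]  # sample deciles (assumed grid)
        pvals = [pvalue_at_cutpoint(x, y, cut) for cut in cuts]
        if pvals[4] < alpha:  # prespecified cutpoint near the median
            hits_single += 1
        if min(pvals) < alpha:  # data-driven "best" cutpoint
            hits_min += 1
    return hits_single / n_sims, hits_min / n_sims


if __name__ == "__main__":
    single_rate, min_rate = simulate()
    print(f"prespecified cutpoint: {single_rate:.3f}")
    print(f"minimum p-value over cutpoints: {min_rate:.3f}")
```

Under these assumptions, the prespecified cutpoint should reject at a rate close to the nominal 0.05, while the minimum-p-value rule rejects considerably more often. The exact rates depend on the sample size, the cutpoint grid, and the random seed.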

Project Outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other grants by MING Tony TAN

Robust Causal Comparisons of Nonrandomized Oncology Studies
  • Grant number:
    10614590
  • Fiscal year:
    2022
  • Funding amount:
    $74,300
  • Project category:
Robust Causal Comparisons of Nonrandomized Oncology Studies
  • Grant number:
    10434299
  • Fiscal year:
    2022
  • Funding amount:
    $74,300
  • Project category:
Statistical Methods for Multi-Drug Combinations
  • Grant number:
    8625912
  • Fiscal year:
    2012
  • Funding amount:
    $74,300
  • Project category:
Statistical Methods for Multi-Drug Combinations
  • Grant number:
    8392050
  • Fiscal year:
    2012
  • Funding amount:
    $74,300
  • Project category:
Statistical Methods for Multi-Drug Combinations
  • Grant number:
    8507643
  • Fiscal year:
    2012
  • Funding amount:
    $74,300
  • Project category:
Statistical Methods for Multi-Drug Combinations
  • Grant number:
    8657927
  • Fiscal year:
    2012
  • Funding amount:
    $74,300
  • Project category:
Statistical Methods for Multi-Drug Combinations
  • Grant number:
    8845174
  • Fiscal year:
    2012
  • Funding amount:
    $74,300
  • Project category:
Biostatistics
  • Grant number:
    7696609
  • Fiscal year:
    2008
  • Funding amount:
    $74,300
  • Project category:
Design and Analysis for Cancer Epidemiology Studies
  • Grant number:
    7127228
  • Fiscal year:
    2005
  • Funding amount:
    $74,300
  • Project category:
Design & Analysis of Preclinical Combination Studies
  • Grant number:
    6881429
  • Fiscal year:
    2004
  • Funding amount:
    $74,300
  • Project category:

Similar Overseas Grants

Urinary Phthalate Biomarker Concentrations and Breast and Prostate Cancer Risk in a National Cohort of Adults in Canada
  • Grant number:
    494953
  • Fiscal year:
    2023
  • Funding amount:
    $74,300
  • Project category:
    Operating Grants
Prospective metabolomics investigation of gastric cancer risk in African Americans and European Whites with a low socioeconomic status
  • Grant number:
    10912190
  • Fiscal year:
    2023
  • Funding amount:
    $74,300
  • Project category:
Deep Learning Image Analysis Algorithms to Improve Oral Cancer Risk Assessment for Oral Potentially Malignant Disorders
  • Grant number:
    10805177
  • Fiscal year:
    2023
  • Funding amount:
    $74,300
  • Project category:
Clinical breast cancer risk prediction models for women with a high-risk benign breast diagnosis
  • Grant number:
    10719777
  • Fiscal year:
    2023
  • Funding amount:
    $74,300
  • Project category:
Developing diagnostics of patient-specific cancer risk and early-stage tumorigenesis
  • Grant number:
    478999
  • Fiscal year:
    2023
  • Funding amount:
    $74,300
  • Project category:
    Operating Grants
Environmental Metal Exposures and Breast Cancer Risk: A Prospective Study of Nationally Representative Canadian Data
  • Grant number:
    495159
  • Fiscal year:
    2023
  • Funding amount:
    $74,300
  • Project category:
Mechanisms of tamoxifen-associated endometrial cancer risk
  • Grant number:
    10650054
  • Fiscal year:
    2023
  • Funding amount:
    $74,300
  • Project category:
Gut microbiota-related mechanisms that impact colorectal cancer risk after bariatric surgery
  • Grant number:
    10733566
  • Fiscal year:
    2023
  • Funding amount:
    $74,300
  • Project category:
Obesity, body fat distribution, and breast cancer risk: is visceral fat the culprit after menopause?
  • Grant number:
    10586626
  • Fiscal year:
    2023
  • Funding amount:
    $74,300
  • Project category:
Optimization of a personalized skin cancer risk intervention for at-risk young adults
  • Grant number:
    10582944
  • Fiscal year:
    2023
  • Funding amount:
    $74,300
  • Project category: