Design and Analysis for Cancer Epidemiology Studies

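Basic information

  • Grant number:
    7127228
  • Principal investigator:
  • Amount:
    $72,500
  • Host institution:
  • Host institution country:
    United States
  • Project type:
  • Fiscal year:
    2005
  • Funding country:
    United States
  • Project period:
    2005-09-30 to 2007-08-31
  • Project status:
    Completed

Project abstract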

DESCRIPTION (provided by applicant): The overall goal of this research is to develop novel statistical methods for addressing the difficult issue of multiplicity in current cancer etiology research. Cancer etiology studies, which aim to identify determinants of cancer and quantify their role, are intrinsically multi-factorial because of the multi-step nature of carcinogenesis and the multiple extrinsic factors that turn normal cells into malignant ones. Multiplicity inflates the false positive rate. In the simplest example, searching for a cutpoint of one quantitative biomarker for disease status, the common practice of examining different cutpoints and picking the one with the smallest p-value results in a highly inflated false positive rate. Even in the largest studies, statistical power for testing interactions quickly diminishes, sample sizes rapidly become inadequate with stratification, and risk estimates become unstable. Because there are so many risk factors, model overfitting is a common problem and the predictive performance of the statistical model is poor. It is thus not surprising that even main effects (e.g., candidate gene associations) have proven notoriously difficult to replicate, and reported interactions even harder. The multiplicity issue is acute today as more biomarkers of risk exposures, and even entire pathways easily comprising dozens of genes and their environmental substrates, become available. An effective means to reduce overfitting and prediction error is to constrain model parameters, as in the least absolute shrinkage and selection operator (lasso), to eliminate the large number of irrelevant variables (e.g., genes). Finding the maximum likelihood estimate (MLE) in such regression models with a large number of variables is challenging. Since some measures of exposure may not be indicative of cancer, and these irrelevant variables reduce the accuracy of the regression model, selecting the most relevant variables into the model would be a significant step.
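The cutpoint example in the paragraph above can be illustrated with a small simulation. The sketch below is not taken from the application; it assumes a standard-normal biomarker that is independent of disease status, nine candidate cutpoints at the sample deciles, and a pooled two-proportion z-test. The function names and all parameter values are hypothetical choices made for illustration.

```python
import math
import random


def two_proportion_p_value(k1, n1, k0, n0):
    """Two-sided p-value of the pooled two-proportion z-test."""
    pooled = (k1 + k0) / (n1 + n0)
    variance = pooled * (1 - pooled) * (1 / n1 + 1 / n0)
    if variance == 0:
        return 1.0
    z = (k1 / n1 - k0 / n0) / math.sqrt(variance)
    return math.erfc(abs(z) / math.sqrt(2))


def simulate(n_subjects=200, n_studies=1000, alpha=0.05, seed=1):
    """Return (false positive rate at a fixed median cutpoint,
    false positive rate when the smallest p-value over all cutpoints is used)."""
    rng = random.Random(seed)
    hits_fixed = 0
    hits_min_p = 0
    for _ in range(n_studies):
        # Null scenario (assumed): biomarker and disease status are independent.
        subjects = sorted(
            (rng.gauss(0, 1), rng.random() < 0.5) for _ in range(n_subjects)
        )
        cases = [int(diseased) for _, diseased in subjects]
        total_cases = sum(cases)
        p_values = []
        for decile in range(1, 10):
            # Split the biomarker-sorted subjects at the 10th, ..., 90th percentile.
            n_low = n_subjects * decile // 10
            cases_low = sum(cases[:n_low])
            p_values.append(
                two_proportion_p_value(
                    cases_low, n_low, total_cases - cases_low, n_subjects - n_low
                )
            )
        hits_fixed += p_values[4] < alpha      # prespecified cutpoint: the median
        hits_min_p += min(p_values) < alpha    # data-driven cutpoint: best of nine
    return hits_fixed / n_studies, hits_min_p / n_studies


if __name__ == "__main__":
    fixed_rate, min_p_rate = simulate()
    print(f"fixed cutpoint: {fixed_rate:.3f}  minimum p-value: {min_p_rate:.3f}")
```

Because the smallest of the nine p-values is never larger than the p-value at the median, the second rate cannot fall below the first. Under these assumptions the fixed-cutpoint rate should stay close to the nominal level, while the minimum-p-value rate is expected to exceed it clearly.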
Classic methods for model/variable selection, however, have not had much success in biomedical applications because they eliminate significant predictors too aggressively and are numerically unstable due to collinearity. This pilot project application focuses on the logistic regression model commonly used in cancer etiology studies. Building upon the novel accelerated expectation-maximization (EM) algorithm we developed for variable selection in linear models, we propose to develop fast variable selection procedures for the logistic regression model that reduce overfitting and improve predictive performance; to develop computer programs; to conduct simulation studies to assess the performance of the method/algorithm; and to analyze the esophageal data from two currently NCI-funded studies. Upon completion of the proposed research, the methods/algorithms developed can be used to analyze cancer epidemiology data more effectively and efficiently. The research also provides a basis for further development of the approach, potentially into an R01 application. Future work could include extensions to multinomial (i.e., multi-class) logistic regression models for cancer outcomes, to the Cox regression model for time-to-event data in cancer etiology, such as time to advanced cancer, and to Bayesian hierarchical modeling and model selection that incorporate prior biological knowledge about pathways, which would enhance the ability to detect real causal effects.
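To make the idea of lasso-type variable selection in logistic regression concrete, here is a minimal sketch. It does not implement the accelerated EM algorithm proposed in the application, which the abstract does not specify; proximal gradient descent (ISTA) is swapped in as a simple, well-known way to fit an L1-penalized logistic model. The function names, the simulated data, the step size, and the penalty value are all assumptions made for illustration, and the step size presumes roughly standardized predictors.

```python
import math
import random


def sigmoid(t):
    """Numerically stable logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def soft_threshold(value, threshold):
    """Proximal operator of the L1 penalty: shrink toward zero and set
    small values to exactly zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_l1_logistic(X, y, penalty, step=0.5, n_iter=300):
    """Minimize mean logistic loss + penalty * sum(|beta_j|) by proximal
    gradient descent. The intercept is left unpenalized."""
    n, p = len(X), len(X[0])
    intercept = 0.0
    beta = [0.0] * p
    for _ in range(n_iter):
        # Residuals: fitted probability minus observed outcome.
        residuals = [
            sigmoid(intercept + sum(b * x for b, x in zip(beta, row))) - yi
            for row, yi in zip(X, y)
        ]
        grad_intercept = sum(residuals) / n
        grad = [sum(r * row[j] for r, row in zip(residuals, X)) / n for j in range(p)]
        intercept -= step * grad_intercept
        # Gradient step followed by soft-thresholding gives exact zeros.
        beta = [soft_threshold(b - step * g, step * penalty) for b, g in zip(beta, grad)]
    return intercept, beta


def simulate_data(n, p, true_beta, seed=0):
    """Hypothetical data: independent standard-normal predictors, binary outcome."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
    y = []
    for row in X:
        eta = sum(b * x for b, x in zip(true_beta, row))
        y.append(1 if rng.random() < sigmoid(eta) else 0)
    return X, y


if __name__ == "__main__":
    # Two relevant predictors and eighteen irrelevant ones (assumed scenario).
    true_beta = [1.5, -1.5] + [0.0] * 18
    X, y = simulate_data(300, 20, true_beta, seed=3)
    _, beta = fit_l1_logistic(X, y, penalty=0.1)
    selected = [j for j, b in enumerate(beta) if b != 0.0]
    print("selected predictor indices:", selected)
```

In this simulated setting the penalty is meant to set the coefficients of most irrelevant predictors to exactly zero while keeping the two relevant ones, with their estimates shrunk toward zero. In practice the penalty would be tuned, for example by cross-validation, rather than fixed in advance.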

Project outputs

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other grants by MING Tony TAN

Robust Causal Comparisons of Nonrandomized Oncology Studies
  • Grant number:
    10614590
  • Fiscal year:
    2022
  • Funding amount:
    $72,500
  • Project type:
Robust Causal Comparisons of Nonrandomized Oncology Studies
  • Grant number:
    10434299
  • Fiscal year:
    2022
  • Funding amount:
    $72,500
  • Project type:
Statistical Methods for Multi-Drug Combinations
  • Grant number:
    8625912
  • Fiscal year:
    2012
  • Funding amount:
    $72,500
  • Project type:
Statistical Methods for Multi-Drug Combinations
  • Grant number:
    8507643
  • Fiscal year:
    2012
  • Funding amount:
    $72,500
  • Project type:
Statistical Methods for Multi-Drug Combinations
  • Grant number:
    8392050
  • Fiscal year:
    2012
  • Funding amount:
    $72,500
  • Project type:
Statistical Methods for Multi-Drug Combinations
  • Grant number:
    8657927
  • Fiscal year:
    2012
  • Funding amount:
    $72,500
  • Project type:
Statistical Methods for Multi-Drug Combinations
  • Grant number:
    8845174
  • Fiscal year:
    2012
  • Funding amount:
    $72,500
  • Project type:
Biostatistics
  • Grant number:
    7696609
  • Fiscal year:
    2008
  • Funding amount:
    $72,500
  • Project type:
Design and Analysis for Cancer Epidemiology Studies
  • Grant number:
    7059077
  • Fiscal year:
    2005
  • Funding amount:
    $72,500
  • Project type:
Design & Analysis of Preclinical Combination Studies
  • Grant number:
    6881429
  • Fiscal year:
    2004
  • Funding amount:
    $72,500
  • Project type:

Similar overseas grants

Urinary Phthalate Biomarker Concentrations and Breast and Prostate Cancer Risk in a National Cohort of Adults in Canada
  • Grant number:
    494953
  • Fiscal year:
    2023
  • Funding amount:
    $72,500
  • Project type:
    Operating Grants
Prospective metabolomics investigation of gastric cancer risk in African Americans and European Whites with a low socioeconomic status
  • Grant number:
    10912190
  • Fiscal year:
    2023
  • Funding amount:
    $72,500
  • Project type:
Deep Learning Image Analysis Algorithms to Improve Oral Cancer Risk Assessment for Oral Potentially Malignant Disorders
  • Grant number:
    10805177
  • Fiscal year:
    2023
  • Funding amount:
    $72,500
  • Project type:
Clinical breast cancer risk prediction models for women with a high-risk benign breast diagnosis
  • Grant number:
    10719777
  • Fiscal year:
    2023
  • Funding amount:
    $72,500
  • Project type:
Developing diagnostics of patient-specific cancer risk and early-stage tumorigenesis
  • Grant number:
    478999
  • Fiscal year:
    2023
  • Funding amount:
    $72,500
  • Project type:
    Operating Grants
Environmental Metal Exposures and Breast Cancer Risk: A Prospective Study of Nationally Representative Canadian Data
  • Grant number:
    495159
  • Fiscal year:
    2023
  • Funding amount:
    $72,500
  • Project type:
Mechanisms of tamoxifen-associated endometrial cancer risk
  • Grant number:
    10650054
  • Fiscal year:
    2023
  • Funding amount:
    $72,500
  • Project type:
Gut microbiota-related mechanisms that impact colorectal cancer risk after bariatric surgery
  • Grant number:
    10733566
  • Fiscal year:
    2023
  • Funding amount:
    $72,500
  • Project type:
Obesity, body fat distribution, and breast cancer risk: is visceral fat the culprit after menopause?
  • Grant number:
    10586626
  • Fiscal year:
    2023
  • Funding amount:
    $72,500
  • Project type:
Optimization of a personalized skin cancer risk intervention for at-risk young adults
  • Grant number:
    10582944
  • Fiscal year:
    2023
  • Funding amount:
    $72,500
  • Project type: