Statistical Methods for Dependent Data
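Basic information
- Award number: 0805050
- Principal investigator:
- Amount: $320,000
- Host institution:
- Host institution country: United States
- Project type: Continuing Grant
- Fiscal year: 2008
- Funding country: United States
- Period: 2008-07-01 to 2014-06-30
- Project status: Completed
- Source:
- Keywords:
Abstract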
This proposal concentrates on several topics in the statistical analysis of dependent data. The first project extends the spectral envelope concept for analyzing DNA sequences. A common problem in analyzing long DNA sequences is identifying protein-coding sequences that are dispersed throughout the sequence and separated by noncoding regions. DNA sequences are heterogeneous, so the methodology must be expanded to capture the local behavior of such sequences. To address the problem of local behavior, a local spectral envelope with estimation via mixtures of smoothing splines will be explored. The hope is that this methodology will help emphasize any periodic feature that exists in a categorical sequence of virtually any length in a quick and automated fashion. Projects such as the Human Genome Project have produced large amounts of data, and the methods established in this project will be useful in analyzing the vast quantities of data being produced by various genome projects.
Another project focuses on the analysis of longitudinal data and the development of a practical nonparametric procedure for estimating the within-subject correlation structure. This technique is used to develop a data-driven functional principal components analysis (FPCA) procedure. Because observations made within a subject are often correlated, an effective analysis of longitudinal data must account for this within-subject correlation. When the parametric form of the covariance structure is unknown, using a misspecified structure can result in biased and inefficient estimates. This project focuses on longitudinal data that can be modeled as noisy observations, made at discrete time points, of smooth subject trajectories that are realizations of a stochastic process.
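The data model described above (smooth subject trajectories observed at discrete time points with noise) can be illustrated with a minimal FPCA sketch. This is not the nonparametric procedure proposed in the award: it assumes all subjects are observed on a common, equally spaced grid and uses the plain sample covariance, and the function names `simulate_trajectories` and `fpca`, the mean curve, and the two modes of variation are hypothetical choices made only for illustration.

```python
import numpy as np


def simulate_trajectories(n_subjects=200, n_times=50, noise_sd=0.2, seed=0):
    """Simulate noisy observations of smooth trajectories on a common grid."""
    rng = np.random.default_rng(seed)
    t = np.linspace(0.0, 1.0, n_times)
    mean = np.sin(2 * np.pi * t)
    # Two approximately orthonormal modes of variation (illustrative choice).
    phi1 = np.sqrt(2) * np.cos(2 * np.pi * t)
    phi2 = np.sqrt(2) * np.sin(4 * np.pi * t)
    # Subject-specific scores with standard deviations 2 and 1.
    scores = rng.normal(size=(n_subjects, 2)) * np.array([2.0, 1.0])
    smooth = mean + scores[:, [0]] * phi1 + scores[:, [1]] * phi2
    observed = smooth + rng.normal(scale=noise_sd, size=smooth.shape)
    return t, observed, phi1, phi2


def fpca(observed, t, n_components=2):
    """Grid-based FPCA: eigendecomposition of the sample covariance matrix."""
    dt = t[1] - t[0]
    mean_hat = observed.mean(axis=0)
    centered = observed - mean_hat
    cov_hat = centered.T @ centered / (observed.shape[0] - 1)
    eigvals, eigvecs = np.linalg.eigh(cov_hat)  # ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    # Rescale so each eigenfunction has unit L2 norm on the grid.
    eigfuns = eigvecs[:, order].T / np.sqrt(dt)
    eigvals_fun = eigvals[order] * dt
    # Subject scores: Riemann-sum projection onto each eigenfunction.
    scores = centered @ eigfuns.T * dt
    return mean_hat, eigvals_fun, eigfuns, scores


if __name__ == "__main__":
    t, observed, phi1, phi2 = simulate_trajectories()
    mean_hat, eigvals, eigfuns, scores = fpca(observed, t)
    dt = t[1] - t[0]
    print("leading eigenvalues:", eigvals)
    print("alignment with phi1:", abs(np.sum(eigfuns[0] * phi1) * dt))
```

In this sketch the measurement noise inflates the diagonal of the sample covariance, and sparse or irregular observation times are not handled; those are the kinds of issues a smoothing-based covariance estimate is meant to address.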
The high dimensionality and complexity of longitudinal data have made FPCA a popular tool for data reduction and visualization, because it captures the primary modes of variation of the stochastic process generating the data. Scientists are often interested in using longitudinal data to determine the effect that a set of possibly time-varying covariates has on a given response over time. Functional linear models, and in particular the varying-coefficient model, provide a framework for analyzing such data. In many of these data sets, the functional coefficients have shapes that cannot be modeled parametrically. An effective analysis must both account for the within-subject correlation and allow for flexible coefficient shapes. Because a parametric form for the within-subject covariance is not always known, a third project focuses on creating an iterative, data-driven, spline-based procedure for fitting varying-coefficient models.
This proposal concentrates on solving problems involved in the analysis of dependent data. The first project will develop a method for detecting genes in long DNA sequences. Projects such as the Human Genome Project have produced large amounts of data, and the methods established in this project will be useful in analyzing the vast quantities of data being produced by various genome projects. A second proposed project focuses on the analysis of complex data collected over time. This project is also motivated by the analysis of DNA, and in particular by the analysis of gene expression data. In a third project, the investigators will focus on a class of techniques called functional linear models. For example, techniques will be developed for studying the effect that a growth factor should have on the decision to supplement chemotherapy with antiangiogenic therapy when treating ovarian cancer.
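To make the gene-detection idea in the first project more concrete, the following is a rough sketch of a raw (unsmoothed, global) sample spectral envelope for a categorical sequence. It is not the local spectral envelope with smoothing-spline mixtures proposed in the award. It assumes the common formulation in which the sequence is coded as indicator vectors and the envelope at each frequency is the largest eigenvalue of the variance-standardized real part of the periodogram matrix; normalization conventions vary. The function names and the simulated period-3 sequence are hypothetical.

```python
import numpy as np


def simulate_period3(n=600, p_g=0.8, seed=0):
    """Random DNA-like sequence in which every third base tends to be G."""
    rng = np.random.default_rng(seed)
    letters = rng.choice(list("ACGT"), size=n)
    force_g = (np.arange(n) % 3 == 0) & (rng.random(n) < p_g)
    letters[force_g] = "G"
    return "".join(letters)


def spectral_envelope(sequence, alphabet="ACGT"):
    """Raw sample spectral envelope of a categorical sequence.

    Assumes every letter of `alphabet` occurs in `sequence`; otherwise the
    indicator covariance matrix is singular.
    """
    n = len(sequence)
    # Indicators for all but the last category keep the covariance nonsingular.
    cats = alphabet[:-1]
    y = np.array([[1.0 if s == c else 0.0 for c in cats] for s in sequence])
    y -= y.mean(axis=0)
    cov = y.T @ y / n
    # Inverse square root of the covariance via its eigendecomposition.
    w, u = np.linalg.eigh(cov)
    cov_inv_sqrt = u @ np.diag(w ** -0.5) @ u.T
    # Discrete Fourier transform of each indicator series.
    d = np.fft.rfft(y, axis=0) / np.sqrt(n)
    freqs = np.fft.rfftfreq(n)  # cycles per position, from 0 to 1/2
    envelope = np.empty(len(freqs))
    for j, dj in enumerate(d):
        # Real part of the periodogram matrix at this frequency.
        per_re = np.real(np.outer(dj, np.conj(dj)))
        m = cov_inv_sqrt @ per_re @ cov_inv_sqrt
        envelope[j] = np.linalg.eigvalsh(m)[-1]  # largest eigenvalue
    return freqs, envelope


if __name__ == "__main__":
    seq = simulate_period3()
    freqs, env = spectral_envelope(seq)
    print("peak frequency:", freqs[np.argmax(env)])
```

In practice the periodogram would be smoothed before taking eigenvalues, and a local version would be computed over segments of the sequence rather than over the whole sequence at once.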
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants by David Stoffer
Nonlinear and Nonstationary Time Series
- Award number: 1506882
- Fiscal year: 2015
- Amount: $320,000
- Project type: Continuing Grant
Collaborative Research: The Analysis of Time Series Collected in Experimental Designs
- Award number: 0706723
- Fiscal year: 2007
- Amount: $320,000
- Project type: Standard Grant
Statistical Methods in the Frequency Domain
- Award number: 0102511
- Fiscal year: 2001
- Amount: $320,000
- Project type: Continuing Grant
Mathematical Sciences: Walsh-Fourier Analysis and Categorical Time Series
- Award number: 9000522
- Fiscal year: 1990
- Amount: $320,000
- Project type: Standard Grant
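Similar NSFC (National Natural Science Foundation of China) grants
Computational Methods for Analyzing Toponome Data
- Award number: 60601030
- Approval year: 2006
- Amount: 170,000 CNY
- Project type: Young Scientists Fund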
Similar overseas grants
Statistical methods for informative and outcome-dependent data
- Award number: DGECR-2022-00433
- Fiscal year: 2022
- Amount: $320,000
- Project type: Discovery Launch Supplement
Statistical methods for informative and outcome-dependent data
- Award number: RGPIN-2022-03068
- Fiscal year: 2022
- Amount: $320,000
- Project type: Discovery Grants Program - Individual
Statistical Methods for Complex Dependent Dental Data
- Award number: 6907997
- Fiscal year: 2005
- Amount: $320,000
- Project type:
Statistical Methods for Complex Dependent Dental Data
- Award number: 7077718
- Fiscal year: 2005
- Amount: $320,000
- Project type:
Statistical-dynamical methods for scale dependent model evaluation and short term precipitation forecasting (STAMPF)
- Award number: 5426405
- Fiscal year: 2004
- Amount: $320,000
- Project type: Priority Programmes