Statistical And Computational Methods For Gene Expression and Proteomic Analysis
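Basic information
- Grant number: 8148480
- Principal investigator:
- Amount: $1.0521 million
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year:
- Funding country: United States
- Project period:
- Project status: Not concluded
- Source:
- Keywords:
Project abstract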
Gene expression measurement using cDNA and oligo arrays continues to be a popular and useful technology for genomic analysis. High-throughput methods for measuring protein concentrations are also increasing in popularity. One of the more challenging problems results from the large volume of data generated in these experiments. Quality control and experimental design remain important fundamental issues. Analysis techniques that account for complex array designs and minimize artifacts are required. Many difficult statistical and bioinformatics issues remain and are addressed in this project.
Next-generation sequencing techniques are becoming popular for RNA expression measurement (RNAseq). As with microarrays, a host of technical and quality control issues remain as challenges, in addition to the new statistical problems posed by the discrete measurements (counts) that are returned.
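To make the count-data issue concrete, here is a minimal sketch of one standard way to compare a single gene's read counts between two sequencing libraries. It is not taken from the project. It assumes that, under equal expression, the count in library A given the combined count is binomial with success probability equal to library A's share of the total sequencing depth. The function names and the example depths are hypothetical.

```python
from math import comb


def binomial_two_sided_p(k: int, n: int, p: float) -> float:
    """Exact two-sided binomial p-value.

    Sums the probabilities of all outcomes that are no more likely than
    the observed outcome k. Intended for modest n (illustration only).
    """
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    threshold = pmf[k] * (1 + 1e-9)  # small tolerance for float ties
    return min(1.0, sum(q for q in pmf if q <= threshold))


def compare_counts(count_a: int, count_b: int, depth_a: int, depth_b: int) -> float:
    """P-value for equal expression of one gene in two libraries.

    Assumption: given n = count_a + count_b, count_a follows
    Binomial(n, depth_a / (depth_a + depth_b)) when expression is equal.
    """
    n = count_a + count_b
    p = depth_a / (depth_a + depth_b)
    return binomial_two_sided_p(count_a, n, p)


if __name__ == "__main__":
    # Hypothetical gene: 30 reads vs 5 reads in two equally deep libraries.
    print(compare_counts(30, 5, 1_000_000, 1_000_000))
```

Unlike a t-test on log intensities, this test works directly on the integer counts, which matters most for genes with few reads. Real RNAseq data usually show more variability between biological replicates than a binomial or Poisson model allows, so this sketch covers only the technical-replicate case.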
We continue to develop new methods for analysis of alternative gene splicing, based on microarray platforms designed especially for the purpose and, more recently, on RNAseq. Two measurement platforms, the Affymetrix exon array and the ExonHit junction probe array, are being studied. A major study of the effects of the cancer drug Topotecan, a topoisomerase inhibitor, has been completed and accepted for publication. A special version of our analysis package, the MSCL Toolbox, was written for this study, namely ExonSVD. This statistical technique was shown to be highly efficient at identifying genes undergoing alternative splicing, and was less susceptible to the false positives encountered with the earlier ExonANOVA method.
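The abstract does not describe how ExonSVD works, so the following is only a generic sketch of how a singular value decomposition can flag exon-level changes. It should not be read as the authors' method. It assumes that, without alternative splicing, the log intensities of a gene's exons are additive (exon effect plus sample effect). After both effects are removed by double centering, structured residuals show up as a large leading singular value. All function names are hypothetical, and the leading singular value is found by plain power iteration to keep the example dependency-free.

```python
import math
import random


def double_center(x):
    """Remove row (exon) and column (sample) effects from a matrix."""
    n_rows, n_cols = len(x), len(x[0])
    row_means = [sum(row) / n_cols for row in x]
    col_means = [sum(x[i][j] for i in range(n_rows)) / n_rows for j in range(n_cols)]
    grand = sum(row_means) / n_rows
    return [
        [x[i][j] - row_means[i] - col_means[j] + grand for j in range(n_cols)]
        for i in range(n_rows)
    ]


def leading_singular_value(r, iterations=500, seed=0):
    """Largest singular value of r, via power iteration on r^T r."""
    rng = random.Random(seed)
    n_rows, n_cols = len(r), len(r[0])
    v = [rng.gauss(0.0, 1.0) for _ in range(n_cols)]
    for _ in range(iterations):
        u = [sum(row[j] * v[j] for j in range(n_cols)) for row in r]  # u = r v
        w = [sum(r[i][j] * u[i] for i in range(n_rows)) for j in range(n_cols)]  # w = r^T u
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0.0:
            return 0.0  # residual matrix is exactly zero
        v = [c / norm for c in w]
    u = [sum(row[j] * v[j] for j in range(n_cols)) for row in r]
    return math.sqrt(sum(c * c for c in u))


def splicing_score(log_intensity):
    """Leading singular value of the exon-by-sample interaction residuals."""
    return leading_singular_value(double_center(log_intensity))


if __name__ == "__main__":
    # Hypothetical gene: 3 exons (rows) by 4 samples (columns), log2 scale.
    # Exon 0 is shifted by +2 in the last two samples only.
    gene = [
        [8.0, 8.5, 11.0, 11.5],
        [6.0, 6.5, 7.0, 7.5],
        [9.0, 9.5, 10.0, 10.5],
    ]
    print(splicing_score(gene))
```

A score near zero means the exons move together across samples. In practice the score would need to be calibrated against noise, for example by permutation, before any gene is called alternatively spliced.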
For almost a decade, our group has functioned as the "statistical analysis core" for a high-volume microarray laboratory in CCMD/CC. All microarray studies by this group now pass through our analysis pipeline. We now also serve as the analysis core for the NHLBI microarray core facility, which has more than tripled the throughput of microarray studies into our database and pipeline. This "core" facility has generated more than a dozen new collaborative projects per year, in which our staff are primarily responsible for statistical analysis and interpretation of microarray data.
The entire Framingham Heart Study SABRe project has begun to use this new technology, which increases the available transcriptional information by roughly a factor of 10 compared to standard expression arrays. This large project, which will eventually assay up to 5,000 samples, has now completed phase II, the case-control study, which our lab is currently analyzing. The third phase (the remainder of the samples, analyzed in a high-throughput manner) has begun and should be completed in FY11. We are carefully monitoring statistical quality control for this study as it proceeds to analyze almost 200 samples per week. In combination with clinical and other laboratory data, this dataset is expected to lead to major advances in the understanding of expression signatures and heart disease. The first phase, a feasibility study, analyzed samples from 50 individuals, with four blood-derived sample types per individual: PBMC, lymphoblastoid cell lines, PaxGene tubes and buffy coat. The technical goal was to choose the best, or at least usable, sample types for analysis in the larger study. The results show that PBMC and PaxGene tubes are roughly equally good in the quality of results. PaxGene was chosen as the sample type for the next two phases.
The availability of affordable, high-quality software has been one of the bottlenecks in the analysis of microarray data. We have continued development of the "MSCL Analyst's Toolbox" to address this need. Built upon the commercial statistical package JMP, this toolbox allows investigators to download Affymetrix microarray data from a central database, normalize and transform the data, inspect it for a variety of outliers or defects, perform a variety of statistical tests to select relevant genes affected in the experiment, and then visualize and classify various patterns of gene expression. Because our Toolbox is written in open source scripts, its statistical tests can be modified as needed to conform to novel or unique experimental designs. In collaboration with over forty investigators in CC, NHLBI, NIDCR and other ICs, this tool has been applied to several dozen microarray studies. One-day and two-day Toolbox training workshops are regularly presented on the NIH campus.
In a major NIH-wide project, we maintain NIHAGCC, a database for storage, retrieval and analysis of Affymetrix microarrays. As part of this collaboration, we have created a data analysis pipeline and bioinformatics toolset, including both commercial and freely available software. The database currently stores information from over 8000 microarrays. Our downloadable tool set (MSCL Analyst's Toolbox) is now mature, widely tested and applied in numerous studies. Working with investigators in NCI, CC, NHLBI, NINDS, NIAID, NHGRI, NICHD, NIA, NIDDK and NIDA, we have developed, customized and applied this software for the analysis of microarray-based studies. We also maintain a quarterly-updated set of annotation files for use with Affymetrix data, in a format for convenient download and use by our collaborators.
In another study with investigators in NEI, we identified a list of retinal pigment epithelium (RPE) "signature" genes, based on comparison of RPE gene expression to catalogs of gene expression levels in other tissues. This new RPE signature has proven extremely valuable when used in combination with recently completed GWAS studies of adult macular degeneration. The coincidence of signature genes with loci implicated in the GWAS studies was very high, further implicating the RPE tissue as the source of many problems possibly causative of macular degeneration.
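One common way to judge whether such a coincidence between a gene signature and GWAS-implicated genes is higher than chance is a hypergeometric tail probability. The sketch below illustrates that calculation under the assumption that GWAS-implicated genes are drawn at random from a fixed gene universe. It is not the analysis used in the study, and every number in the usage example is hypothetical.

```python
from math import comb


def overlap_p_value(universe: int, signature: int, gwas_hits: int, overlap: int) -> float:
    """Probability of at least `overlap` shared genes by chance.

    universe:  total number of genes considered
    signature: number of signature genes in the universe
    gwas_hits: number of genes near GWAS-implicated loci
    overlap:   number of genes observed in both lists
    """
    max_overlap = min(signature, gwas_hits)
    total = comb(universe, gwas_hits)
    tail = sum(
        comb(signature, k) * comb(universe - signature, gwas_hits - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Hypothetical numbers: 20,000 genes, 150 signature genes,
    # 30 GWAS-implicated genes, 5 of which are signature genes.
    print(overlap_p_value(20_000, 150, 30, 5))
```

The result depends strongly on how the gene universe is defined, for example all annotated genes versus only the genes expressed in the tissues that were compared, so that choice should be stated alongside any reported p-value.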
We are now investigating the properties of RNAseq, a method for more accurately assessing the transcriptome using next-generation sequencing technology. In one project, with investigators in NHGRI, we are assessing the reproducibility of the methodology, both within subject and within lane. This project has been extended to a comparison of expression in cells from individuals with or without cardiac calcification. In another, we have analyzed the transcriptome of rat pineal gland, both day and nighttime, and rhesus suprachiasmatic nucleus. We have found a large number of new, unexpected differences as well as dozens of expression differences already known from microarray analysis. Indeed, about 50% of the "reads" generated in this study do not belong to well-documented rat genes, and are presumably a result of novel transcription from portions of the genome not yet annotated. Further study has refined the list of unannotated but controlled regions to about 50 outstanding regions, likely producing non-coding RNAs (ncRNAs), some of which were found to be pseudo-genes of highly expressed genes. Interestingly, it is not the coding regions but the control regions that are found, suggesting that the expression might have a role in control of the true gene itself.
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants by Peter J Munson
Statistical And Computational Methods For Molecular Biology And Biomedicine
- Grant number: 8565482
- Fiscal year:
- Funding amount: $1.0521 million
- Project category:
Statistical And Computational Methods For Gene Expression and Proteomic Analysis
- Grant number: 8746528
- Fiscal year:
- Funding amount: $1.0521 million
- Project category:
Statistical And Computational Methods For Molecular Biol
- Grant number: 7296867
- Fiscal year:
- Funding amount: $1.0521 million
- Project category:
Statistical And Computational Methods For Gene Expression and Proteomic Analysis
- Grant number: 8941406
- Fiscal year:
- Funding amount: $1.0521 million
- Project category:
Statistical And Computational Methods For Molecular Biology And Biomedicine
- Grant number: 7966721
- Fiscal year:
- Funding amount: $1.0521 million
- Project category:
Statistical And Computational Methods For Gene Expression and Proteomic Analysis
- Grant number: 7966728
- Fiscal year:
- Funding amount: $1.0521 million
- Project category:
Similar overseas grants
Bridging Gene Expression Profiles, Proteomics, Multi-modal Neuroimaging and Cognitive Integrity on the Study of Alzheimer's Disease
- Grant number: 338710
- Fiscal year: 2015
- Funding amount: $1.0521 million
- Project category: Fellowship Programs
Analysis of protein regulating brain-specific aromatase gene expression through proteomics
- Grant number: 26461389
- Fiscal year: 2014
- Funding amount: $1.0521 million
- Project category: Grant-in-Aid for Scientific Research (C)
Molecular characterisation of early precursor lesions of a novel "serrated pathway" of colorectal cancer using gene expression and proteomics.
- Grant number: nhmrc : 1012157
- Fiscal year: 2011
- Funding amount: $1.0521 million
- Project category: Project Grants
Core C-Functional Proteomics and Gene Expression Analysis
- Grant number: 7264696
- Fiscal year: 2007
- Funding amount: $1.0521 million
- Project category:
High-Throughput Gene Expression and Proteomics Instrumentation for Functional Genomics
- Grant number: 0301761
- Fiscal year: 2003
- Funding amount: $1.0521 million
- Project category: Standard Grant
Core C: Functional Proteomics and Gene Expression Analysis
- Grant number: 8251178
- Fiscal year:
- Funding amount: $1.0521 million
- Project category: