Development of High-Dimensional Data Analysis Methods for the Identification of Differentially Expressed Gene Sets
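Basic information
- Award number: 0714978
- Principal investigator:
- Amount: $552,900
- Host institution:
- Host institution country: United States
- Project type: Continuing Grant
- Fiscal year: 2007
- Funding country: United States
- Period: 2007-08-15 to 2011-07-31
- Project status: Completed
- Source:
- Keywords:
Project abstract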
The main objectives of this proposal are to develop improved statistical methods for detecting sets of genes that are differentially expressed across two or more conditions and to apply these methods to discover new genetic and physiological mechanisms that control food intake, nutrient utilization, energy regulation, and metabolism. The developed statistical methods will provide powerful alternatives to tests of enrichment or overrepresentation that have become popular tools for interpreting microarray experiments. The proposed methods gain advantages over existing methods by (1) recognizing, accounting for, and utilizing dependence among genes; (2) maintaining continuous information about the degree of difference between gene set expression distributions; (3) identifying interesting gene sets by comparison of gene sets across treatments rather than comparing gene sets to one another; and (4) capturing information about differential expression contained in the joint expression distributions rather than using only marginal distributions. In addition to their use for identifying differentially expressed gene sets in traditional microarray experiments, the proposed methods offer a new and powerful approach for identifying genetic loci that control the expression of gene networks. The integrated research team of statisticians and biologists will identify the best of the proposed methods by theoretical study of their asymptotic properties, by comparisons of their performance on simulated data sets designed to mimic structures found in real data sets, and by weighing the value of biological insights provided by their application to actual data from a variety of microarray experiments. The asymptotic framework used in this research considers the statistical properties of the testing procedures as both the dimension of the data vectors (number of genes in a set) and the sample size (number of experimental units) grow large.
Such a framework permits evaluation of methods for use on data of very high dimension and produces results that are intrinsically interesting from the statistical point of view. The developed methods will be used to investigate genetic control of food intake and energy regulation in pigs and to discover genetic regions that control the expression of gene networks in a population of mice that serve as a model for human obesity. The insights provided by these studies may be used to develop treatment strategies for human obesity. In addition, the proposed methods have much broader application to nearly any microarray-based investigation of differential gene expression. Applications range from the identification of sets of genes that play a role in distinguishing cancerous tissue from non-cancerous tissue to the identification of sets of genes important for developing high-quality plant material suitable for conversion to biofuel. The general goal of this work is to provide scientific researchers with powerful tools for identifying the most important genes behind a wide variety of biological phenomena.
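Points (3) and (4) of the abstract can be illustrated with a permutation test that compares the joint expression distribution of one gene set across two treatments. The sketch below is not the project's method: it substitutes the energy distance as a stand-in multivariate statistic, and the function names (`mean_pairwise_distance`, `energy_statistic`, `gene_set_permutation_test`) are hypothetical, chosen only for illustration. Each sample is assumed to be a tuple of expression values for the genes in the set.

```python
import math
import random


def mean_pairwise_distance(a, b):
    """Average Euclidean distance between every vector in a and every vector in b."""
    total = sum(math.dist(x, y) for x in a for y in b)
    return total / (len(a) * len(b))


def energy_statistic(group1, group2):
    """Energy distance between two samples of expression vectors.

    Uses whole vectors, so it reflects the joint distribution of the
    gene set rather than gene-by-gene marginal summaries.
    """
    return (2.0 * mean_pairwise_distance(group1, group2)
            - mean_pairwise_distance(group1, group1)
            - mean_pairwise_distance(group2, group2))


def gene_set_permutation_test(group1, group2, n_perm=500, seed=0):
    """Return (observed statistic, permutation p-value) for one gene set.

    Treatment labels are shuffled across experimental units; each unit's
    expression vector is kept intact.
    """
    rng = random.Random(seed)
    observed = energy_statistic(group1, group2)
    pooled = list(group1) + list(group2)
    n1 = len(group1)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if energy_statistic(pooled[:n1], pooled[n1:]) >= observed:
            exceed += 1
    # Add-one correction keeps the p-value strictly positive.
    return observed, (exceed + 1) / (n_perm + 1)
```

Because the statistic is built from distances between whole expression vectors and the permutation moves entire vectors between treatments, dependence among genes in the set is preserved under the null rather than assumed away. The statistic is also continuous, so it retains the degree of difference between treatments instead of reducing each gene to a significant/non-significant call.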
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants by Daniel Nettleton
Conference on Predictive Inference and Its Applications
- Award number: 1810945
- Fiscal year: 2018
- Funding amount: $552,900
- Project type: Standard Grant
Joint NSF/ERA-CAPS: Host Targets of Fungal Effectors as Keys to Durable Disease Resistance
- Award number: 1339348
- Fiscal year: 2014
- Funding amount: $552,900
- Project type: Continuing Grant
Distance-based variable selection for high-dimensional biological data
- Award number: 1313224
- Fiscal year: 2013
- Funding amount: $552,900
- Project type: Standard Grant
Similar NSFC grants
Scalable Learning and Optimization: High-dimensional Models and Online Decision-Making Strategies for Big Data Analysis
- Award number:
- Approval year: 2024
- Funding amount:
- Project type: Collaborative Innovation Research Team
Similar overseas grants
Development of Artificial Intelligence by Deep Learning Based on Genomic High-Dimensional Data Analysis System for Gastrointestinal Cancer
- Award number: 21H02998
- Fiscal year: 2021
- Funding amount: $552,900
- Project type: Grant-in-Aid for Scientific Research (B)
Development of pattern recognition algorithm for ultra low frequency multivariate time-series data considering dimensional correlation
- Award number: 21K11938
- Fiscal year: 2021
- Funding amount: $552,900
- Project type: Grant-in-Aid for Scientific Research (C)
Development of quantum-inspired algorithms for decoding high-dimensional neural data
- Award number: 20K16465
- Fiscal year: 2020
- Funding amount: $552,900
- Project type: Grant-in-Aid for Early-Career Scientists
Development of novel statistical modeling based on functional data analysis for high-dimensional data and its application
- Award number: 20K11707
- Fiscal year: 2020
- Funding amount: $552,900
- Project type: Grant-in-Aid for Scientific Research (C)
Development of training and simulation programs for tracheobronchial reconstruction surgery using a three-dimensional operable airway model from clinical computed tomography data
- Award number: 20K17762
- Fiscal year: 2020
- Funding amount: $552,900
- Project type: Grant-in-Aid for Early-Career Scientists
Research and development of nonlinear Selective Inference for high-dimensional and small number of samples data
- Award number: 20H04243
- Fiscal year: 2020
- Funding amount: $552,900
- Project type: Grant-in-Aid for Scientific Research (B)
Development of quality control methods of soil embankment using 3-dimensional data
- Award number: 19K04605
- Fiscal year: 2019
- Funding amount: $552,900
- Project type: Grant-in-Aid for Scientific Research (C)
Development of a three dimensional structure analysis method of deciduous broadleaf forest using airborne LiDAR data
- Award number: 19K06123
- Fiscal year: 2019
- Funding amount: $552,900
- Project type: Grant-in-Aid for Scientific Research (C)
Novel methods for the integration of high dimensional single cell proteomic and RNA data to understand cell populations in development and disease.
- Award number: MR/S005471/1
- Fiscal year: 2018
- Funding amount: $552,900
- Project type: Fellowship
Development of high-dimensional data-adaptive causal inference methods to unravel the role of genetics in determining heart rhythm
- Award number: 2083410
- Fiscal year: 2018
- Funding amount: $552,900
- Project type: Studentship