Statistical Inferences on Massive Data

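Basic Information

  • Award number:
    1206464
  • Principal investigator:
  • Amount:
    $600,000
  • Host institution:
  • Host institution country:
    United States
  • Award type:
    Continuing Grant
  • Fiscal year:
    2012
  • Funding country:
    United States
  • Project period:
    2012-06-01 to 2018-05-31
  • Project status:
    Completed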

Project Abstract

The proposal plans to develop novel statistical theory and methods for processing massive data. Four interrelated avenues are proposed for theoretical research and methodological development: high-dimensional variable selection, large covariance estimation, large-scale hypothesis testing, and nonparametric statistical learning. In particular, novel statistical techniques are proposed to answer the following important questions: how to screen genes and risk factors with some acquired knowledge; what the advantages of folded concave penalized methods are; how to estimate the benchmark for classifications and regressions; how to deal with outliers, dependent data, and endogenous measurements; how to use homogeneity of geographical neighborhoods to enhance forecasting and inference; how to assess the uncertainty of risk measurements; how to conduct sparse principal component analysis; how to control false discovery rates under arbitrary dependence; and how to use nonparametric methods to enhance the flexibility of high-dimensional statistical learning. In addition, a novel statistical model, motivated by a financial economics theory, is proposed for estimating large covariance matrices, for better understanding of risk correlations and better assessment of risks. Methods for testing the presence of endogenous variables and the celebrated multifactor pricing models are also presented and will be thoroughly investigated.
Massive data collections have become routine in exploring the frontiers of science, from genomic studies to the measurement of economic risks. The proposed research will advance our understanding of molecular mechanisms, biological processes, genetic associations, brain functions, social networks, economic and financial risks, and supply and demand, and hence increase economic and global competitiveness. In addition, the proposed novel statistical techniques can be applied to other biological and engineering problems.
The project will integrate research and education by working closely with senior undergraduate students, graduate students, and postdoctoral fellows. It will increase collaboration between academia and industry by working closely with industrial partners and by developing publicly available computer code, with sound theoretical support, for processing massive data. The results will be disseminated broadly through presentations at seminars, conferences, and professional association meetings, and through the internet.

Project Outputs

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other publications by Jianqing Fan

Deep Neural Networks for Nonparametric Interaction Models with Diverging Dimension
  • DOI:
  • Publication year:
    2023
  • Journal:
  • Impact factor:
    0
  • Authors:
    Sohom Bhattacharya; Jianqing Fan; Debarghya Mukherjee
  • Corresponding author:
    Debarghya Mukherjee
Dynamic nonparametric filtering with application to volatility estimation
  • DOI:
    10.1016/b978-044451378-6/50021-1
  • Publication year:
    2003
  • Journal:
  • Impact factor:
    0
  • Authors:
    Ming; Jianqing Fan; V. Spokoiny
  • Corresponding author:
    V. Spokoiny
Approaches to High-Dimensional Covariance and Precision Matrix Estimations
  • DOI:
  • Publication year:
    2016
  • Journal:
  • Impact factor:
    0
  • Authors:
    Jianqing Fan; Yuan Liao; Han Liu
  • Corresponding author:
    Han Liu
Improving Covariate Balancing Propensity Score: A Doubly Robust and Efficient Approach
  • DOI:
  • Publication year:
    2016
  • Journal:
  • Impact factor:
    0
  • Authors:
    Jianqing Fan; K. Imai; Han Liu; Y. Ning; Xiaolin Yang
  • Corresponding author:
    Xiaolin Yang
Features of Big Data and sparsest solution in high confidence set
  • DOI:
    10.1201/b16720-48
  • Publication year:
    2014
  • Journal:
  • Impact factor:
    0
  • Authors:
    Jianqing Fan
  • Corresponding author:
    Jianqing Fan

Other grants held by Jianqing Fan

Interface of Statistical Learning and Optimal Decisions
  • Award number:
    2210833
  • Fiscal year:
    2022
  • Award amount:
    $600,000
  • Award type:
    Continuing Grant
DMS/NIGMS 2: Collaborative Research: Developing Statistical Learning Methods for Revealing the Molecular Signatures of Microvascular Changes in Neural Injury
  • Award number:
    2053832
  • Fiscal year:
    2021
  • Award amount:
    $600,000
  • Award type:
    Continuing Grant
FRG: Collaborative Research: Flexible Network Inference
  • Award number:
    2052926
  • Fiscal year:
    2021
  • Award amount:
    $600,000
  • Award type:
    Standard Grant
Collaborative Research: Statistical Methods for RNA-seq Based Transcriptomic Analysis of Macrophage Function in Spinal Cord Injury
  • Award number:
    1662139
  • Fiscal year:
    2017
  • Award amount:
    $600,000
  • Award type:
    Continuing Grant
Robust and Distributed Statistical Learning from Big Data
  • Award number:
    1712591
  • Fiscal year:
    2017
  • Award amount:
    $600,000
  • Award type:
    Continuing Grant
Collaborative Research: Interface of Probability and Statistics for High-dimensional Inference
  • Award number:
    1406266
  • Fiscal year:
    2014
  • Award amount:
    $600,000
  • Award type:
    Continuing Grant
Workshop on: Discovery in Complex or Massive Datasets: Common Statistical Themes
  • Award number:
    0751568
  • Fiscal year:
    2007
  • Award amount:
    $600,000
  • Award type:
    Standard Grant
Collaborative Research: Development of bioinformatic methods for studying gene expression network inflammation and neuronal regeneration
  • Award number:
    0714554
  • Fiscal year:
    2007
  • Award amount:
    $600,000
  • Award type:
    Continuing Grant
High-dimensional statistical learning and inference
  • Award number:
    0704337
  • Fiscal year:
    2007
  • Award amount:
    $600,000
  • Award type:
    Continuing Grant
Workshop on Frontiers of Statistics: Nonparametric Modeling of Complex Data
  • Award number:
    0531839
  • Fiscal year:
    2006
  • Award amount:
    $600,000
  • Award type:
    Standard Grant

Similar overseas grants

Statistical Inferences under Monotonic Hazard Trend in Survival Analysis
  • Award number:
    2311292
  • Fiscal year:
    2023
  • Award amount:
    $600,000
  • Award type:
    Continuing Grant
Bayesian Statistical Learning for Robust and Generalizable Causal Inferences in Alzheimer Disease and Related Disorders Research
  • Award number:
    10590913
  • Fiscal year:
    2023
  • Award amount:
    $600,000
  • Award type:
Population genomic inferences of history and selection across populations and time
  • Award number:
    10623079
  • Fiscal year:
    2023
  • Award amount:
    $600,000
  • Award type:
"She must not know much about that": Children's inferences based on others' listener design
  • Award number:
    2317559
  • Fiscal year:
    2023
  • Award amount:
    $600,000
  • Award type:
    Standard Grant
Improving inferences on health effects of chemical exposures
  • Award number:
    10753010
  • Fiscal year:
    2023
  • Award amount:
    $600,000
  • Award type:
Generative adversarial networks for demographic inferences of nonmodel species from genomic data
  • Award number:
    NE/X009637/1
  • Fiscal year:
    2023
  • Award amount:
    $600,000
  • Award type:
    Research Grant
Beyond 1D Structure of Earth's Core - Reconciling Inferences from Seismic and Geomagnetic Observations
  • Award number:
    NE/W005247/1
  • Fiscal year:
    2023
  • Award amount:
    $600,000
  • Award type:
    Research Grant
Interactions between ES-miRNAs and environmental risk factors are responsible for TNBC progression and associated racial health disparities: a novel analysis with multilevel moderation inferences
  • Award number:
    10594746
  • Fiscal year:
    2023
  • Award amount:
    $600,000
  • Award type:
Models and Inferences for Heterogeneous Interaction Patterns in Social Networks
  • Award number:
    2210735
  • Fiscal year:
    2022
  • Award amount:
    $600,000
  • Award type:
    Standard Grant
Distance-based robust inferences and model selection for semiparametric models
  • Award number:
    RGPIN-2018-04328
  • Fiscal year:
    2022
  • Award amount:
    $600,000
  • Award type:
    Discovery Grants Program - Individual