BIGDATA: Collaborative Research: F: Statistical Theory and Methods Beyond the Dimensionality Barrier

Basic information

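  • Award number:
    1633212
  • Principal investigator:
  • Amount:
    $200,000
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Standard Grant
  • Fiscal year:
    2016
  • Funding country:
    United States
  • Project period:
    2016-09-01 to 2019-08-31
  • Project status:
    Completed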

Project abstract

With recent advances in technology, it is now possible to measure and record significant numbers of features on a single individual. The volume, velocity, and variety, the "3Vs", of Big Data pose significant challenges for modeling and analysis of these massive datasets. For example, to understand cancer at the genetic level, researchers need to detect rare and weak signals from thousands, or even millions, of candidate genetic markers obtained from a limited number of subjects. Existing methods typically assume that the number of subjects is very large, an assumption often violated in practice. The main goal of this project is to develop efficient methods for extremely large-dimensional, small sample size data. The methodological advances will be extremely valuable in addressing Big Data challenges in different areas such as medical research, bioinformatics, financial analysis, and astronomic image analysis. Efficient software packages and algorithms to implement the proposed methods will be developed and made publicly available.

The key innovative idea motivating this research is viewing a high-dimensional problem from a novel packing perspective, which allows the number of variables, p, to be arbitrarily large and the number of observations, n, to be finite. The proposed research will systematically investigate three fundamental problems under this "finite n, arbitrarily large p" paradigm: (1) asymptotic theory of spurious correlations, (2) fast detection of low-rank correlation structures, and (3) detection boundary and optimal testing procedures for detecting rare and weak signals. This research will transform the current asymptotic framework, transitioning from the regimes of "large n, small p" and "large n, larger p" to the regime of "finite n, arbitrarily large p".
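
The spurious-correlation problem named in item (1) can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not the project's method or software: it draws one Gaussian response and p independent Gaussian noise features for a fixed small n, then reports the largest absolute sample correlation. The function names `sample_correlation` and `max_spurious_correlation`, the Gaussian noise model, and the values of n and p are all illustrative choices that do not come from the abstract.

```python
import math
import random


def sample_correlation(x, y):
    """Pearson sample correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def max_spurious_correlation(n, p, rng):
    """Largest |correlation| between a response and p independent noise features.

    Every feature is generated independently of the response, so any large
    correlation found here is spurious by construction.
    """
    y = [rng.gauss(0.0, 1.0) for _ in range(n)]
    best = 0.0
    for _ in range(p):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        best = max(best, abs(sample_correlation(x, y)))
    return best


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    n = 10  # illustrative "finite n": held fixed while p grows
    for p in (10, 100, 1000, 10000):
        print(p, round(max_spurious_correlation(n, p, rng), 3))
```

With n held fixed, the reported maximum typically moves toward 1 as p grows, even though no feature is truly related to the response. This is the kind of behavior that an asymptotic theory for "finite n, arbitrarily large p" would need to describe.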

Project outcomes

Journal articles (5)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Spherical Cap Packing Asymptotics and Rank-Extreme Detection
An Exploratory Statistical Cusp Catastrophe Model
JIVE integration of imaging and behavioral data
  • DOI:
    10.1016/j.neuroimage.2017.02.072
  • Published:
    2017
  • Journal:
  • Impact factor:
    5.7
  • Authors:
    Yu, Qunqun; Risk, Benjamin B.; Zhang, Kai; Marron, J.S.
  • Corresponding author:
    Marron, J.S.
Calibrated percentile double bootstrap for robust linear regression inference
  • DOI:
    10.5705/ss.202016.0546
  • Published:
    2018
  • Journal:
  • Impact factor:
    1.4
  • Authors:
    Daniel McCarthy, Kai Zhang
  • Corresponding author:
    Daniel McCarthy, Kai Zhang

Other publications by Kai Zhang

Exploring the Development of Research, Technology and Business of Machine Tool Domain in New-Generation Information Technology Environment Based on Machine Learning
  • DOI:
    10.3390/su11123316
  • Published:
    2019-05
  • Journal:
  • Impact factor:
    3.9
  • Authors:
    Jihong Chen; Kai Zhang; Yuan Zhou; Yufei Liu; Lingfeng Li; Zheng Chen; Li Yin
  • Corresponding author:
    Li Yin
Boundary Hölder regularity for elliptic equations
Adding ears to intelligent connected vehicles by combining microphone arrays and high definition map
  • DOI:
    10.1049/itr2.12091
  • Published:
    2021-07
  • Journal:
  • Impact factor:
    2.7
  • Authors:
    Kun Jiang; Diange Yang; Benny Wijaya; Bowei Zhang; Mengmeng Yang; Kai Zhang; Xuewei Tang
  • Corresponding author:
    Xuewei Tang
A universal method for hysteresis-free and stable perovskite solar cells using water pre-treatment
  • DOI:
    10.1016/j.cej.2020.126435
  • Published:
    2021
  • Journal:
  • Impact factor:
    15.1
  • Authors:
    Jingshu Wan; Li Tao; Qiao Wang; Kai Zhang; Jian Xie; Jun Zhang; Hao Wang
  • Corresponding author:
    Hao Wang
Robot‐assisted laparoscopic ureteroplasty for retrocaval ureter with three‐dimensional images navigation: technique and outcomes
  • DOI:
    10.1111/bju.16278
  • Published:
    2024
  • Journal:
  • Impact factor:
    4.5
  • Authors:
    Xiang Wang; Yiming Zhang; Zhihua Li; Xinfei Li; Silu Chen; G. Han; Mancheng Xia; Kunlin Yang; Liqun Zhou; Kai Zhang; Xuesong Li
  • Corresponding author:
    Xuesong Li

Other grants held by Kai Zhang

FRG: Collaborative Research: Mathematical and Statistical Analysis of Compressible Data on Compressive Networks
  • Award number:
    2152289
  • Fiscal year:
    2022
  • Funding amount:
    $200,000
  • Project type:
    Continuing Grant
Binary Expansion Statistics: A Nonparametric Inference Framework for Big Data
  • Award number:
    1916237
  • Fiscal year:
    2019
  • Funding amount:
    $200,000
  • Project type:
    Standard Grant
Geometric Perspectives on the Correlation
  • Award number:
    1613112
  • Fiscal year:
    2016
  • Funding amount:
    $200,000
  • Project type:
    Standard Grant
Collaborative Research: Inference for Linear Model Parameters in Model-free Populations
  • Award number:
    1309619
  • Fiscal year:
    2013
  • Funding amount:
    $200,000
  • Project type:
    Standard Grant

Similar overseas grants

BIGDATA: IA: Collaborative Research: Asynchronous Distributed Machine Learning Framework for Multi-Site Collaborative Brain Big Data Mining
  • Award number:
    2348159
  • Fiscal year:
    2023
  • Funding amount:
    $200,000
  • Project type:
    Standard Grant
BIGDATA: IA: Collaborative Research: Intelligent Solutions for Navigating Big Data from the Arctic and Antarctic
  • Award number:
    2308649
  • Fiscal year:
    2022
  • Funding amount:
    $200,000
  • Project type:
    Standard Grant
BIGDATA: Collaborative Research: F: Holistic Optimization of Data-Driven Applications
  • Award number:
    2027516
  • Fiscal year:
    2020
  • Funding amount:
    $200,000
  • Project type:
    Standard Grant
BIGDATA: F: Collaborative Research: Practical Analysis of Large-Scale Data with Lyme Disease Case Study
  • Award number:
    1934319
  • Fiscal year:
    2019
  • Funding amount:
    $200,000
  • Project type:
    Standard Grant
BIGDATA: IA: Collaborative Research: Protecting Yourself from Wildfire Smoke: Big Data-Driven Adaptive Air Quality Prediction Methodologies
  • Award number:
    1838022
  • Fiscal year:
    2019
  • Funding amount:
    $200,000
  • Project type:
    Standard Grant
BIGDATA: F: Collaborative Research: Foundations of Responsible Data Management
  • Award number:
    1926250
  • Fiscal year:
    2019
  • Funding amount:
    $200,000
  • Project type:
    Standard Grant
BIGDATA: IA: Collaborative Research: Intelligent Solutions for Navigating Big Data from the Arctic and Antarctic
  • Award number:
    1947584
  • Fiscal year:
    2019
  • Funding amount:
    $200,000
  • Project type:
    Standard Grant
BIGDATA: IA: Collaborative Research: Asynchronous Distributed Machine Learning Framework for Multi-Site Collaborative Brain Big Data Mining
  • Award number:
    1837964
  • Fiscal year:
    2019
  • Funding amount:
    $200,000
  • Project type:
    Standard Grant
BIGDATA: F: Collaborative Research: Optimizing Log-Structured-Merge-Based Big Data Management Systems
  • Award number:
    1838222
  • Fiscal year:
    2019
  • Funding amount:
    $200,000
  • Project type:
    Standard Grant
BIGDATA: F: Collaborative Research: Optimizing Log-Structured-Merge-Based Big Data Management Systems
  • Award number:
    1838248
  • Fiscal year:
    2019
  • Funding amount:
    $200,000
  • Project type:
    Standard Grant