Combinatorial Inference: Statistical Uncertainty Assessment for Discrete Structures

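Basic information

  • Award number:
    1916211
  • Principal investigator:
  • Amount:
    $147,300
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Continuing Grant
  • Fiscal year:
    2019
  • Funding country:
    United States
  • Project period:
    2019-08-15 to 2023-01-31
  • Project status:
    Completed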

Project abstract

Discrete structures such as networks, clusters, and rankings are observed in many real-life applications, including brain networks, protein clusters, and portfolio selection. This project will develop a unified statistical framework for inferring these unknown discrete structures from large-scale datasets in biology, neuroscience, and finance. The replicability crisis, which questions whether statistical inference for scientific discoveries can be trusted, has drawn attention from both the scientific community and the public. A new field developed by the PI, "combinatorial inference", aims to fill the gap between the many existing methods designed for continuous quantities and the lack of valid reproducibility assessment for discrete ones. This project will also train the next generation of data scientists in the statistical and computational skills needed for large-scale scientific data analysis.

This project aims to develop a unified inference framework for assessing uncertainty (e.g., constructing confidence intervals, testing hypotheses, and controlling false discovery rates) when inferring discrete structures, including networks, hypergraphs, clusterings, and rankings. Three types of problems will be considered: (1) testing the hypothesis that the discrete structure has certain combinatorial properties (e.g., the network is connected, or the graph is triangle-free); (2) constructing confidence sets that cover discrete quantities at a given significance level (e.g., the maximum degree of a graph, the elements belonging to the same cluster, or the top-k ranked items); (3) screening discrete structures of interest (e.g., the cycles, hubs, and cliques in a graph) with false discovery rate control. Classical inference mainly focuses on testing hypotheses and constructing confidence intervals for continuous parameters, where analytical methods and theory can be applied directly. Existing research on discrete structures in statistical models, on the other hand, mainly focuses on estimation and lacks systematic inferential methods for uncertainty assessment. This project seeks to advance the frontiers of modern statistical inference in four directions: (a) Methodology: developing efficient methods for hypothesis testing, confidence interval construction, and false discovery rate control; (b) Theory: developing new probabilistic tools, including non-asymptotic concentration inequalities, limiting theory, and universality phenomena for combinatorial random quantities; (c) Computation: designing computationally efficient algorithms that implement the combinatorial inferential methods for large-scale statistical models; and (d) Fundamental limits: developing new combinatorial information-theoretic lower bounds to justify the optimality of the proposed inferential methods.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
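To make the third problem type concrete, here is a minimal sketch of false discovery rate control when screening candidate edges of a graph. It uses the classical Benjamini-Hochberg step-up procedure as a stand-in, not the project's own combinatorial method, and it assumes one p-value per candidate edge is already available. The function name, the p-values, and the level `q` are all hypothetical.

```python
from typing import List


def benjamini_hochberg(p_values: List[float], q: float = 0.1) -> List[int]:
    """Return the indices rejected by the Benjamini-Hochberg procedure at level q.

    Classical BH step-up rule: find the largest rank k such that the k-th
    smallest p-value is at most q * k / m, then reject the k smallest p-values.
    """
    m = len(p_values)
    if m == 0:
        return []
    # Indices of the hypotheses, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= q * rank / m:
            k = rank  # keep the largest rank that passes its threshold
    return sorted(order[:k])


# Hypothetical p-values, one per candidate edge (illustrative numbers only).
edge_p_values = [0.001, 0.008, 0.039, 0.041, 0.27, 0.6]
selected = benjamini_hochberg(edge_p_values, q=0.05)
print(selected)
```

The classical guarantee for this rule holds under independence or certain forms of positive dependence among the p-values. Tests for overlapping structures such as cycles, hubs, and cliques are generally dependent, which is one reason the abstract calls for dedicated procedures.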

Project outcomes

Journal articles (3)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Multiview Incomplete Knowledge Graph Integration with application to cross-institutional EHR data harmonization
  • DOI:
    10.1016/j.jbi.2022.104147
  • Published:
    2022-07
  • Journal:
  • Impact factor:
    4.5
  • Authors:
    D. Zhou;Ziming Gan;Xu Shi;Alina Patwari;E. Rush;Clara-Lea Bonzel;V. A. Panickan;C. Hong;Y. Ho;T. Cai;L. Costa;Xiaoou Li;V. Castro;S. Murphy;G. Brat;G. Weber;P. Avillach;J. Gaziano;Kelly Cho;K. Liao;Junwei Lu;Tianxi Cai
  • Corresponding authors:
    D. Zhou;Ziming Gan;Xu Shi;Alina Patwari;E. Rush;Clara-Lea Bonzel;V. A. Panickan;C. Hong;Y. Ho;T. Cai;L. Costa;Xiaoou Li;V. Castro;S. Murphy;G. Brat;G. Weber;P. Avillach;J. Gaziano;Kelly Cho;K. Liao;Junwei Lu;Tianxi Cai
Lagrangian Inference for Ranking Problems
  • DOI:
    10.1287/opre.2022.2313
  • Published:
    2021-10
  • Journal:
  • Impact factor:
    0
  • Authors:
    Yue Liu;Ethan X. Fang;Junwei Lu
  • Corresponding authors:
    Yue Liu;Ethan X. Fang;Junwei Lu
Interstitial lung abnormalities in patients with stage I non-small cell lung cancer are associated with shorter overall survival: the Boston lung cancer study.

Other publications by Junwei Lu

Fuzzy fault-detection filtering for uncertain stochastic time-delay systems with randomly missing data
Robust non-fragile guaranteed cost control for singular Markovian jump time-delay systems
Consensus for nonlinear multi-agent systems with sampled data
Harmonic balance method used for harmonics calculation and prediction in power systems
A novel electrothermally actuated RF MEMS switch for wireless applications

Similar overseas grants

CAREER: Statistical foundations of particle tracking and trajectory inference
  • Award number:
    2339829
  • Fiscal year:
    2024
  • Funding amount:
    $147,300
  • Project type:
    Continuing Grant
CAREER: Statistical Inference in Observational Studies -- Theory, Methods, and Beyond
  • Award number:
    2338760
  • Fiscal year:
    2024
  • Funding amount:
    $147,300
  • Project type:
    Continuing Grant
STATISTICAL AND COMPUTATIONAL THRESHOLDS IN SPIN GLASSES AND GRAPH INFERENCE PROBLEMS
  • Award number:
    2347177
  • Fiscal year:
    2024
  • Funding amount:
    $147,300
  • Project type:
    Standard Grant
Collaborative Research: Urban Vector-Borne Disease Transmission Demands Advances in Spatiotemporal Statistical Inference
  • Award number:
    2414688
  • Fiscal year:
    2024
  • Funding amount:
    $147,300
  • Project type:
    Continuing Grant
CAREER: Distribution-Free and Adaptive Statistical Inference
  • Award number:
    2338464
  • Fiscal year:
    2024
  • Funding amount:
    $147,300
  • Project type:
    Continuing Grant
CAREER: Statistical Inference in High Dimensions using Variational Approximations
  • Award number:
    2239234
  • Fiscal year:
    2023
  • Funding amount:
    $147,300
  • Project type:
    Continuing Grant
CAREER: Towards Tight Guarantees of Markov Chain Sampling Algorithms in High Dimensional Statistical Inference
  • Award number:
    2237322
  • Fiscal year:
    2023
  • Funding amount:
    $147,300
  • Project type:
    Continuing Grant
Unravel machine learning blackboxes -- A general, effective and performance-guaranteed statistical framework for complex and irregular inference problems in data science
  • Award number:
    2311064
  • Fiscal year:
    2023
  • Funding amount:
    $147,300
  • Project type:
    Standard Grant
Development of statistical inference of extended Hawkes processes including missing data problem
  • Award number:
    23H03358
  • Fiscal year:
    2023
  • Funding amount:
    $147,300
  • Project type:
    Grant-in-Aid for Scientific Research (B)
Developing Statistical Tools for Data integration and Data Fusion for Finite Population Inference
  • Award number:
    2242820
  • Fiscal year:
    2023
  • Funding amount:
    $147,300
  • Project type:
    Standard Grant