Theory and practice for exploiting the underlying structure of probability models in big data analysis

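Basic information

  • Grant number:
    1622490
  • Principal investigator:
  • Amount:
    $250,000
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Continuing Grant
  • Fiscal year:
    2016
  • Funding country:
    United States
  • Start and end dates:
    2016-08-01 to 2019-07-31
  • Project status:
    Completed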

Project abstract

The ever-increasing use of data-intensive methods in scientific discovery has led to a paradigm shift in science in recent years. High-throughput scientific experiments, routine use of digital sensors, and intensive computer simulations have created a data deluge, challenging scientific communities to find effective and computationally feasible methods for processing and analyzing very large datasets. Despite many attempts, however, the development of theoretical and computational foundations for big data analysis is lagging far behind. Many existing statistical methods cannot handle such data-intensive problems, in terms of theoretical foundation as well as computational complexity and scalability. For analyzing high-dimensional data with possibly complex structures, this research will offer a set of fundamental solutions using principled statistical methods. The resulting methods will provide a robust framework for big data analysis and allow scientists to use statistical models beyond their current limited applicability. The techniques developed in this project are likely to gain widespread acceptance across a broad spectrum of scientific disciplines, as well as in industry.

The focus of this research is mainly on Bayesian statistics. Many recent methods aim to improve the computational efficiency of Bayesian models by approximating the likelihood function using a small subset of the data. In contrast, the objective of this research is to explore the underlying structures of probability models and exploit these features to design efficient and scalable computational methods and algorithms for Bayesian inference in big data analysis. To this end, (1) the PIs will define and study the structure of probability distributions in order to develop novel geometrically motivated methods for statistical inference; (2) the PIs will develop efficient and scalable computational methods that accurately approximate probability distributions by exploiting their geometric properties; and (3) the PIs will apply these methods to real, computationally intensive problems from the biological sciences. Due to its interdisciplinary nature, this research is expected to contribute to several fields, including statistics, machine learning, applied mathematics, and data-intensive computing.

Project outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other publications by Babak Shahbaba

A scalable reinforcement learning framework inspired by hippocampal memory mechanisms for efficient contextual and sequential decision making
  • DOI:
    10.1038/s41598-025-10586-x
  • Publication date:
    2025-07-12
  • Journal:
  • Impact factor:
    3.900
  • Authors:
    Hamed Poursiami;Ayana Moshruba;Keiland W. Cooper;Derek Gobin;Md Abdullah-Al Kaiser;Ankur Singh;Rouhan Noor;Babak Shahbaba;Akhilesh Jaiswal;Norbert J. Fortin;Maryam Parsa
  • Corresponding author:
    Maryam Parsa
MP33-06 COMBINED URINE AND PLASMA BIOMARKERS ARE HIGHLY ACCURATE FOR PREDICTING HIGH GRADE PROSTATE CANCER
  • DOI:
    10.1016/j.juro.2017.02.1002
  • Publication date:
    2017-04-01
  • Journal:
  • Impact factor:
  • Authors:
    Maher Albitar;Wanlong Ma;Lars Lund;Babak Shahbaba;Edward Uchio;Soren Feddersen;Donald Moylan;Kirk Wojno;Neal Shore
  • Corresponding author:
    Neal Shore

Other grants by Babak Shahbaba

Collaborative Research: HDR DSC: Data Science Training and Practices: Preparing a Diverse Workforce via Academic and Industrial Partnership
  • Grant number:
    2123366
  • Fiscal year:
    2021
  • Amount:
    $250,000
  • Project type:
    Continuing Grant
MODULUS: Data-Driven Mechanistic Modeling of Hierarchical Tissues
  • Grant number:
    1936833
  • Fiscal year:
    2019
  • Amount:
    $250,000
  • Project type:
    Standard Grant

Similar overseas grants

Identifying and exploiting therapeutic vulnerabilities of tumor-host interactions that drive bone-to-meninges breast cancer metastasis
  • Grant number:
    10826488
  • Fiscal year:
    2023
  • Amount:
    $250,000
  • Project type:
Exploiting alpha-ketoglutarate-dependent metabolism for therapeutic benefit in acute myeloid leukemia
  • Grant number:
    10684842
  • Fiscal year:
    2022
  • Amount:
    $250,000
  • Project type:
Exploiting alpha-ketoglutarate-dependent metabolism for therapeutic benefit in acute myeloid leukemia
  • Grant number:
    10523632
  • Fiscal year:
    2022
  • Amount:
    $250,000
  • Project type:
Understanding and exploiting novel therapeutic vulnerabilities of RIT1-driven lung cancer
  • Grant number:
    10211377
  • Fiscal year:
    2021
  • Amount:
    $250,000
  • Project type:
Understanding and exploiting novel therapeutic vulnerabilities of RIT1-driven lung cancer
  • Grant number:
    10641671
  • Fiscal year:
    2021
  • Amount:
    $250,000
  • Project type:
Understanding and exploiting novel therapeutic vulnerabilities of RIT1-driven lung cancer
  • Grant number:
    10378686
  • Fiscal year:
    2021
  • Amount:
    $250,000
  • Project type:
Exploiting synergistic and antagonistic interactions with antifungal drugs to improve disease treatment.
  • Grant number:
    10204979
  • Fiscal year:
    2019
  • Amount:
    $250,000
  • Project type:
Exploiting synergistic and antagonistic interactions with antifungal drugs to improve disease treatment.
  • Grant number:
    10456329
  • Fiscal year:
    2019
  • Amount:
    $250,000
  • Project type:
CAREER: Exploiting Antenna Capabilities in Wireless Mesh Networks: Theory, Protocols, and Practice
  • Grant number:
    1441638
  • Fiscal year:
    2014
  • Amount:
    $250,000
  • Project type:
    Standard Grant
Project 1: Determining and Exploiting Mechanisms of AR-Mediated Suppression of Cell Proliferation and Survival
  • Grant number:
    10576936
  • Fiscal year:
    2013
  • Amount:
    $250,000
  • Project type: