CAREER: Flexible Parsimonious Models for Complex Data

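Basic Information

  • Award number:
    1653017
  • Principal investigator:
  • Amount:
    $400,000
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Continuing Grant
  • Fiscal year:
    2017
  • Funding country:
    United States
  • Start and end dates:
    2017-07-01 to 2017-08-31
  • Project status:
    Completed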

Project Abstract

Researchers throughout academia, industry, and government are generating data at scales and levels of complexity far beyond what could previously have been imagined. Complex data demand statistical models that are sufficiently flexible to adapt to meaningful, underlying signals, allowing scientists to discover unexpected patterns. Yet as society relies more heavily on statistical algorithms to make decisions impacting everyday life, it becomes increasingly important for a method's output to be interpretable by non-experts. This demands parsimony: that simpler explanations be favored over more complicated ones. For example, the Internet has led to unprecedented quantities of data in the form of text (such as articles, blogs, webpages, consumer reviews, and many other social media products). Such text data represent a potential treasure trove of insights into the world -- what people are thinking, how this is changing over time, how this varies by location, etc. The investigator develops new statistical methods for overcoming major technical challenges to gleaning useful information from this data. This same methodology can be applied to the study of the microbiome, the vast community of microbes living in an environment such as the human gut. Better statistical methods are needed to identify types of microbes in the gut that play a crucial role in human health and disease. Another problem that is tackled in this project involves modeling data collected over time (such as wind-speed data and wildlife monitoring). The methods that are developed allow for more accurate forecasting, which is crucial in many areas including health and medicine and the development of lower cost energy systems. The last major area in this project is devoted to making the process of statistical research more efficient and its software of higher quality and easier to share across the community of statistical researchers. 
Finally, all three research objectives are closely integrated with educational outcomes, including the supervision and teaching of graduate students, outreach to non-statisticians and non-scientists, and the release of undergraduate-accessible mini-papers describing the investigator's new research findings.

This project focuses on the design of new statistical methods that balance two important and often opposing needs: flexibility and parsimony. (1) Building predictive regression and classification models is difficult when the features are highly sparse. While many methods focus on the challenge of high dimensionality, relatively few have considered the obstacle posed by features that are rarely nonzero. The investigator develops a new framework for feature selection when the features are highly sparse that succeeds where preexisting methods fail. This is studied from both theoretical and computational standpoints. (2) High-dimensional covariance estimation and time series modeling are two rich, but largely distinct, areas in statistics, which the investigator combines to develop new methods for modeling locally stationary time series. The added flexibility in going from stationarity to local stationarity must be carefully balanced with parsimony. (3) A series of area-specific software modules will be distributed freely online, building on the investigator's new platform for streamlining the process of performing simulation studies. Each module will implement some of the most common models, methods, and metrics used in a given area of statistics research. The goal is to facilitate the sharing of high-quality, reproducible simulation code in the statistics research community by creating an easily adaptable standardized format.

Project Outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other publications by Jacob Bien

Inference on the proportion of variance explained in principal component analysis
  • DOI:
  • Publication year:
    2024
  • Journal:
  • Impact factor:
    0
  • Authors:
    Ronan Perry; Snigdha Panigrahi; Jacob Bien; Daniela Witten
  • Corresponding author:
    Daniela Witten

Other grants by Jacob Bien

CAREER: Flexible Parsimonious Models for Complex Data
  • Award number:
    1748166
  • Fiscal year:
    2017
  • Funding amount:
    $400,000
  • Project type:
    Continuing Grant
High-Dimensional Covariance Estimation via Convex Optimization
  • Award number:
    1405746
  • Fiscal year:
    2014
  • Funding amount:
    $400,000
  • Project type:
    Continuing Grant

Similar NSFC grants

Similar overseas grants

Flexible fMRI-Compatible Neural Probes with Organic Semiconductor based Multi-modal Sensors for Closed Loop Neuromodulation
  • Award number:
    2336525
  • Fiscal year:
    2024
  • Funding amount:
    $400,000
  • Project type:
    Standard Grant
Flexible Thermoelectric Devices for Wearable Applications
  • Award number:
    2400221
  • Fiscal year:
    2024
  • Funding amount:
    $400,000
  • Project type:
    Standard Grant
Flexible metal-organic frameworks (MOFs) for hydrogen isotope separation: insights into smart recognition of gas molecules towards materials design
  • Award number:
    24K17650
  • Fiscal year:
    2024
  • Funding amount:
    $400,000
  • Project type:
    Grant-in-Aid for Early-Career Scientists
STTR Phase I: High-Sensitivity Flexible Quantum Dots/Graphene X-Ray Detectors and Imaging Systems
  • Award number:
    2322053
  • Fiscal year:
    2024
  • Funding amount:
    $400,000
  • Project type:
    Standard Grant
Collaborative Research: SaTC: CORE: Medium: Differentially Private SQL with flexible privacy modeling, machine-checked system design, and accuracy optimization
  • Award number:
    2317232
  • Fiscal year:
    2024
  • Funding amount:
    $400,000
  • Project type:
    Continuing Grant
Improving Flexible Attention to Numerical and Spatial Magnitudes in Young Children
  • Award number:
    2410889
  • Fiscal year:
    2024
  • Funding amount:
    $400,000
  • Project type:
    Continuing Grant
FlexNIR-PD: A resource efficient UK-based production process for patented flexible Near Infrared Sensors for LIDAR, Facial recognition and high-speed data retrieval
  • Award number:
    10098113
  • Fiscal year:
    2024
  • Funding amount:
    $400,000
  • Project type:
    Collaborative R&D
Flexible Air Source Heat pump for domestic heating decarbonisation (FASHION)
  • Award number:
    EP/V042033/2
  • Fiscal year:
    2024
  • Funding amount:
    $400,000
  • Project type:
    Research Grant
Toward next-generation flexible and interpretable deep learning: A novel evolutionary wide dendritic learning
  • Award number:
    23K24899
  • Fiscal year:
    2024
  • Funding amount:
    $400,000
  • Project type:
    Grant-in-Aid for Scientific Research (B)
Collaborative Research: SaTC: CORE: Medium: Differentially Private SQL with flexible privacy modeling, machine-checked system design, and accuracy optimization
  • Award number:
    2317233
  • Fiscal year:
    2024
  • Funding amount:
    $400,000
  • Project type:
    Continuing Grant