TensorLABE - Robust Characterization of Data Tensors and Synthetic Data Generation

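Basic Information

  • Award number:
    2223932
  • Principal investigator:
  • Amount:
    $156,500
  • Host institution:
  • Host institution country:
    United States
  • Award type:
    Standard Grant
  • Fiscal year:
    2022
  • Funding country:
    United States
  • Start and end dates:
    2022-09-01 to 2024-08-31
  • Project status:
    Completed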

Project Abstract

Modern computational methods, such as Machine Learning (ML) based approaches, have produced impressive gains in efficiency and performance but are increasingly dependent on massive amounts of data. These data-driven approaches transition from the classical techniques of human-engineered source code to algorithms trained on a dataset to produce the desired solution, placing the data in the driver's seat. The proliferation of these data-driven technologies is being enabled and hastened by new hardware and software systems specifically designed to support the complex data-driven computation associated with these algorithms and the massive volumes of data accompanying them. But despite the impressive performance gains of these new hardware and software systems, understanding their design's data component has languished in favor of performance-driven advances in software and hardware-based solutions. The lack of data understanding has led to a number of undesirable outcomes such as unwanted bias in the data-driven solution, an inability to determine the actual suitability of a data set to solving a given problem ahead of time, an inability to determine if a data set has been manipulated or corrupted, and an inability to produce accurate synthetic data that can be used to train and test the performance of these software and hardware systems. This project aims to provide a robust framework for the characterization of large-scale tensor-based datasets to improve understanding of the data itself and enable the production of synthetic data that more accurately replicates real-world data for use in system design testing and validation.
Specifically, this project proposes to advance knowledge in the fields of multilinear algebra, large-scale data analytics, machine learning, and artificial intelligence by incorporating a variety of tensor methods for statistical, structural, and performative data analyses to achieve more robust data characterization. A more holistic set of data characterizations will enable better assessment of data for bias and evaluation of datasets for suitability for a particular task. It will also allow the comparison of datasets to understand their differences and assess data for corruption or manipulation. A proof of concept will be established by incorporating the data characterization methods developed in the project into generating synthetic data with higher degrees of realism than conventional methods. The approach will be validated by testing the ability of the synthetic data to characterize software/hardware system performance more accurately.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project Outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other publications by Tim Andersen

Scale free projections arise from bipartite random networks
High-Throughput Virtual Screening Molecular Docking Software for Students and Educators
  • DOI:
  • Publication year:
    2018
  • Journal:
  • Impact factor:
    0
  • Authors:
    Tim Andersen;Owen M. McDougal
  • Corresponding author:
    Owen M. McDougal
Random Processes with High Variance Produce Scale Free Networks

Other grants by Tim Andersen

SemiSynBio: Nucleic Acid Memory
  • Award number:
    1807809
  • Fiscal year:
    2018
  • Award amount:
    $156,500
  • Award type:
    Continuing Grant
EAGER: Tensor500: A Streaming Analytics High Performance Computing Benchmark
  • Award number:
    1849463
  • Fiscal year:
    2018
  • Award amount:
    $156,500
  • Award type:
    Standard Grant
EAGER: Stream500: A New Benchmark and Infrastructure for Streaming Analytics
  • Award number:
    1641774
  • Fiscal year:
    2016
  • Award amount:
    $156,500
  • Award type:
    Standard Grant

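Similar grants from the National Natural Science Foundation of China

Robust strategy analysis and robust optimization methods in supply chain management
  • Award number:
    70601028
  • Approval year:
    2006
  • Award amount:
    CNY 70,000
  • Award type:
    Young Scientists Fund
Robust speech recognition methods under psychological tension and stress
  • Award number:
    60085001
  • Approval year:
    2000
  • Award amount:
    CNY 140,000
  • Award type:
    Special Fund Project
Research on robust speech recognition methods
  • Award number:
    69075008
  • Approval year:
    1990
  • Award amount:
    CNY 35,000
  • Award type:
    General Program
Improved robust sequential detection techniques
  • Award number:
    68671030
  • Approval year:
    1986
  • Award amount:
    CNY 20,000
  • Award type:
    General Program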

Similar overseas grants

Identification and characterization of microbiome-derived biomarkers via novel and robust systems-based approaches.
  • Award number:
    RGPIN-2022-05010
  • Fiscal year:
    2022
  • Award amount:
    $156,500
  • Award type:
    Discovery Grants Program - Individual
COLLABORATIVE RESEARCH: GCR: Characterization and Robust Multivariable Control of the Dynamics of Gas Exchange During Peritoneal Oxygenated Perfluorocarbon Perfusion
  • Award number:
    2121101
  • Fiscal year:
    2021
  • Award amount:
    $156,500
  • Award type:
    Continuing Grant
COLLABORATIVE RESEARCH: GCR: Characterization and Robust Multivariable Control of the Dynamics of Gas Exchange During Peritoneal Oxygenated Perfluorocarbon Perfusion
  • Award number:
    2227939
  • Fiscal year:
    2021
  • Award amount:
    $156,500
  • Award type:
    Continuing Grant
Robust Characterization of Brain-Heart Coupling Across Development and Modulations by Disordered Sleep
  • Award number:
    10293076
  • Fiscal year:
    2021
  • Award amount:
    $156,500
  • Award type:
Policy-Robust Processing Networks: Characterization and Design
  • Award number:
    2139566
  • Fiscal year:
    2021
  • Award amount:
    $156,500
  • Award type:
    Standard Grant
Collaborative Research: GCR: Characterization and Robust Multivariable Control of the Dynamics of Gas Exchange During Peritoneal Oxygenated Perfluorocarbon Perfusion
  • Award number:
    2121110
  • Fiscal year:
    2021
  • Award amount:
    $156,500
  • Award type:
    Continuing Grant
Robust Characterization of Brain-Heart Coupling Across Development and Modulations by Disordered Sleep
  • Award number:
    10443869
  • Fiscal year:
    2021
  • Award amount:
    $156,500
  • Award type:
FET: Small: Collaborative Research: Efficient and Robust Characterization of Quantum Systems
  • Award number:
    2100794
  • Fiscal year:
    2020
  • Award amount:
    $156,500
  • Award type:
    Standard Grant
Policy-Robust Processing Networks: Characterization and Design
  • Award number:
    1856511
  • Fiscal year:
    2019
  • Award amount:
    $156,500
  • Award type:
    Standard Grant
FET: Small: Collaborative Research: Efficient and Robust Characterization of Quantum Systems
  • Award number:
    1909141
  • Fiscal year:
    2019
  • Award amount:
    $156,500
  • Award type:
    Standard Grant