SOFTWARE: Framework for Mining Large and Complex Scientific Datasets

Basic Information

Project Abstract

Numerical simulations are replacing traditional experiments as a means of gaining insight into complex physical phenomena. Given recent advances in computer hardware and numerical methods, it is now possible to simulate physical phenomena at very fine temporal and spatial resolutions. As a result, the amount of data generated is overwhelming. Scientists are interested in analyzing and visualizing the data produced by such simulations to better understand the process being simulated. Analyzing such large-scale data is hard. Not only are the methods used computationally expensive, but current programming tools also make the analysis difficult to specify and modify. Thus, there is a dire need for a systematic approach, along with supporting algorithms and methodologies for flexible parallel implementations, to achieve scalable and interactive analysis of large scientific datasets. In this project, we propose the construction of such a scalable toolkit, namely the Computational Analysis Toolkit (CAT). This toolkit will exploit ongoing work in feature analysis, scalable data mining, and parallel programming environments. The crux of the approach is feature mining, a process whereby regions are delineated through various stages of detection, verification, de-noising, and tracking of points of interest. Additionally, we propose the use of some key data mining algorithms for achieving enhanced and robust implementations of feature-mining algorithms. It is our objective that CAT should not only allow for the detection of features but also provide a means to control the analysis in an interactive setting. For example, demographic and lifetime analysis of certain critical features, as determined by the user/scientist, may be an important way of understanding the underlying process being simulated.
These critical features, once tagged via a suitable interface, can be profiled, and a concise representation of this profile can then be presented to the user as needed. We believe that for long-term use of a tool for feature and data mining, it is important that a) the algorithms are parallelized on a variety of platforms, b) the parallel implementations are easy to maintain and modify, and c) APIs are available for users to rapidly create scalable implementations of new mining algorithms. We propose to achieve these goals by using and extending a parallelization framework developed locally. This framework, referred to as FRamework for Rapid Implementations of Datamining Engines (FREERIDE), offers high-level APIs and runtime techniques to enable parallelization of algorithms for data mining and related tasks. It allows parallelization on both distributed memory and shared memory configurations, and further supports efficient processing of disk-resident datasets. The proposal, besides providing a useful toolkit, is likely to engender the use of methodologies for large data exploration. Our efforts are likely to contribute to the literature on scalable data and feature mining algorithms and on feature profile summarization.
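
The feature-mining stages named in the abstract (detection, verification, de-noising, and tracking) can be illustrated on a toy problem. The sketch below is not CAT or FREERIDE code. The function names, the thresholds, the 4-connected grouping of cells into regions, and the overlap-based tracking rule are all assumptions made here for illustration, and each time step is assumed to be a small 2D grid of scalar values.

```python
from collections import deque


def detect(field, threshold):
    """Detection: collect the (row, col) cells whose value exceeds threshold."""
    return {(r, c) for r, row in enumerate(field)
            for c, value in enumerate(row) if value > threshold}


def delineate(points):
    """Group detected cells into 4-connected regions with a flood fill."""
    regions, unvisited = [], set(points)
    while unvisited:
        seed = unvisited.pop()
        region, queue = {seed}, deque([seed])
        while queue:
            r, c = queue.popleft()
            for nb in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                if nb in unvisited:
                    unvisited.remove(nb)
                    region.add(nb)
                    queue.append(nb)
        regions.append(frozenset(region))
    return regions


def verify(regions, field, peak):
    """Verification (assumed criterion): keep regions with a cell >= peak."""
    return [reg for reg in regions
            if max(field[r][c] for r, c in reg) >= peak]


def denoise(regions, min_cells):
    """De-noising (assumed criterion): drop regions smaller than min_cells."""
    return [reg for reg in regions if len(reg) >= min_cells]


def mine_features(field, threshold=0.5, peak=0.8, min_cells=2):
    """Run detection, verification, and de-noising on one time step."""
    regions = delineate(detect(field, threshold))
    kept = denoise(verify(regions, field, peak), min_cells)
    return sorted(kept, key=min)  # stable order: by smallest cell coordinate


def track(prev_regions, curr_regions):
    """Tracking: link each current region to the previous region it overlaps
    most, as (current_index, previous_index or None for a new feature)."""
    links = []
    for i, curr in enumerate(curr_regions):
        best, best_overlap = None, 0
        for j, prev in enumerate(prev_regions):
            overlap = len(curr & prev)
            if overlap > best_overlap:
                best, best_overlap = j, overlap
        links.append((i, best))
    return links


if __name__ == "__main__":
    step0 = [[0.9, 0.7, 0.0, 0.0],
             [0.0, 0.0, 0.0, 0.6],
             [0.0, 0.0, 0.0, 0.0]]
    step1 = [[0.0, 0.9, 0.7, 0.0],
             [0.0, 0.0, 0.0, 0.0],
             [0.9, 0.0, 0.0, 0.0]]
    features0 = mine_features(step0)
    features1 = mine_features(step1)
    print(features0, features1, track(features0, features1))
```

Each stage is an independent pass over one time step, so in principle time steps or sub-blocks of the grid could be processed in parallel and only the tracking step would need results from neighboring steps. How the proposed toolkit actually partitions this work is not stated in the abstract.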

Project Outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other grants by Raghu Machiraju

Collaborative Research: Autonomous Computing Materials
  • Award number:
    1940168
  • Fiscal year:
    2019
  • Award amount:
    $373,000
  • Project type:
    Continuing Grant
Spokes: MEDIUM: MIDWEST: Collaborative: Community-Driven Data Engineering for Substance Abuse Prevention in the Rural Midwest
  • Award number:
    1761969
  • Fiscal year:
    2018
  • Award amount:
    $373,000
  • Project type:
    Standard Grant
SCC-Planning: Using Innovations in Big Data and Technology to Address the High Rate of Infant Mortality in Greater Columbus Ohio
  • Award number:
    1737560
  • Fiscal year:
    2017
  • Award amount:
    $373,000
  • Project type:
    Standard Grant
BCSP: ABI Innovation: Collaborative Research: Predicting changes in protein activity from changes in sequence by identifying the underlying Biophysical Conditional Random Field
  • Award number:
    1262469
  • Fiscal year:
    2014
  • Award amount:
    $373,000
  • Project type:
    Standard Grant
G&V: Medium: Collaborative Research: Large Data Visualization Using An Interactive Machine Learning Framework
  • Award number:
    1065025
  • Fiscal year:
    2011
  • Award amount:
    $373,000
  • Project type:
    Standard Grant
ITR/NGS: A Framework for Discovery, Exploration and Analysis of Evolutionary Simulation Data (DEAS)
  • Award number:
    0326386
  • Fiscal year:
    2003
  • Award amount:
    $373,000
  • Project type:
    Continuing Grant
CAREER: On the Assessment of Volume Rendering Algorithms in Visual Computing
  • Award number:
    0196242
  • Fiscal year:
    2000
  • Award amount:
    $373,000
  • Project type:
    Continuing Grant
CAREER: On the Assessment of Volume Rendering Algorithms in Visual Computing
  • Award number:
    9734483
  • Fiscal year:
    1998
  • Award amount:
    $373,000
  • Project type:
    Continuing Grant

Similar National Natural Science Foundation of China (NSFC) grants

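A Covalent Organic Framework-Based Phage-Photocatalysis Synergistic Targeted Antibacterial Strategy for Refractory Bacterial Infections
  • Award number:
    22378279
  • Approval year:
    2023
  • Award amount:
    CNY 500,000
  • Project type:
    General Program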
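Robust Q-Matrix Refinement Methods under the Cognitive Diagnosis Framework
  • Award number:
    32300942
  • Approval year:
    2023
  • Award amount:
    CNY 300,000
  • Project type:
    Young Scientists Fund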
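Photosensitive Covalent Organic Framework Composites for Heterogeneous C-H Activation Reactions
  • Award number:
    22301083
  • Approval year:
    2023
  • Award amount:
    CNY 300,000
  • Project type:
    Young Scientists Fund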
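Seismic Impedance Inversion Imaging under a Multi-Source, Multi-Scale Information Fusion Framework
  • Award number:
    42374139
  • Approval year:
    2023
  • Award amount:
    CNY 510,000
  • Project type:
    General Program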
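Gas Sensors Based on Covalent Organic Framework Thin Films and Their Sensing Mechanisms
  • Award number:
    62371299
  • Approval year:
    2023
  • Award amount:
    CNY 520,000
  • Project type:
    General Program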

Similar overseas grants

Mechanism-Driven Virtual Adverse Outcome Pathway Modeling for Hepatotoxicity
  • Award number:
    10940417
  • Fiscal year:
    2023
  • Award amount:
    $373,000
  • Project type:
Mechanism-Driven Virtual Adverse Outcome Pathway Modeling for Hepatotoxicity
  • Award number:
    10675944
  • Fiscal year:
    2023
  • Award amount:
    $373,000
  • Project type:
SCH: New Advanced Machine Learning Framework for Mining Heterogeneous Ocular Data to Accelerate
  • Award number:
    10601180
  • Fiscal year:
    2022
  • Award amount:
    $373,000
  • Project type:
SCH: New Advanced Machine Learning Framework for Mining Heterogeneous Ocular Data to Accelerate
  • Award number:
    10665804
  • Fiscal year:
    2022
  • Award amount:
    $373,000
  • Project type:
SCH: AI-Enhanced Multimodal Sensor-on-a-chip for Alzheimer's Disease Detection
  • Award number:
    10685378
  • Fiscal year:
    2022
  • Award amount:
    $373,000
  • Project type: