BIGDATA: F: Collaborative Research: Practical Analysis of Large-Scale Data with Lyme Disease Case Study
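Basic information
- Award number: 1740325
- Principal investigator:
- Amount: $470,900
- Institution:
- Institution country: United States
- Award type: Standard Grant
- Fiscal year: 2017
- Funding country: United States
- Project period: 2017-09-01 to 2020-08-31
- Project status: Completed
- Source:
- Keywords:
Project abstract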
Recent technological and scientific advances have allowed the acquisition of vast amounts of various types of data. Such an abundance of information should lead to new scientific understanding and breakthroughs. However, the large-scale nature of this data introduces serious complications that choke classical data analysis techniques, leading to a stagnation of scientific progress in many areas. This issue requires novel mathematical techniques in order to effectively extract and analyze the information. This project will use Lyme disease data (through a collaboration with LymeDisease.org) as a motivating example in the design and testing of the methods, as it serves as a prime example of complex large-scale data with a very significant impact on a fast-growing community. The results of this project will thus have swift societal impact; for example, analysis of the LymeData will not only further the understanding of the disease itself, but will also lead to more accurate and precise diagnoses, and more personalized and effective treatments for patients. In addition, this proposal will support the education of postdoctoral, graduate and undergraduate students, and facilitate outreach efforts aimed especially at increasing the participation of under-represented populations. To accomplish this task, in addition to the activities funded by this proposal, the PIs will utilize existing programs such as the Women In Technology Sharing Online (WitsOn) program, Women in Data Science and Mathematics Research Collaboration Workshop (WiSDM), and MAPS 4 College of Los Angeles, in all of which the PIs are already actively involved, to recruit under-represented populations and to promote the mathematical and technical sciences.
The fundamental research in this project will center around three main objectives, each addressing a particularly important challenge that arises in large-scale data applications. The first goal is to design innovative data completion techniques that are practical for big data; this will involve the design and theoretical development of data completion methods using non-random (and non-uniform) observation patterns, adaptive sampling schemes, and additional structures hidden in the observations. Rather than using classical (computationally expensive) convex programming techniques, the project will focus on extremely efficient simple solvers that can be run in real time during an inference task. Second, the team proposes two novel deep learning approaches for inferential tasks that (i) are extremely computationally efficient and can thus be applied to massive datasets, and (ii) achieve the accuracy benefits of modern deep learning approaches, which improve upon state-of-the-art methods. Third, the project will develop critical data fusion techniques that allow data from a wide variety of sources to be analyzed in an aggregated manner. Lastly, the team proposes to combine these three data analysis tasks in a novel multi-stage feedback design where outputs from data completion, deep learning inferences and fusion will be cycled back as inputs to these mechanisms for an iterative and robust inference framework. Progress on these goals will yield new mathematical frameworks in data science, and provide techniques that will be directly applied to large-scale data to allow efficient and powerful data analysis.
Project outcomes
Journal articles (5)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Stochastic Gradient Descent for Linear Systems with Missing Data
- DOI: 10.4208/nmtma.oa-2018-0066
- Published: 2019
- Journal:
- Impact factor: 0
- Authors: Needell, Anna Ma
- Corresponding author: Needell, Anna Ma
On Motzkin’s method for inconsistent linear systems
- DOI: 10.1007/s10543-018-0737-6
- Published: 2018-02
- Journal:
- Impact factor: 1.5
- Authors: Jamie Haddock; D. Needell
- Corresponding author: Jamie Haddock; D. Needell
Iterative Methods for Solving Factorized Linear Systems
- DOI: 10.1137/17m1115678
- Published: 2017-01
- Journal:
- Impact factor: 0
- Authors: A. Ma; D. Needell; Aaditya Ramdas
- Corresponding author: A. Ma; D. Needell; Aaditya Ramdas
Randomized Projection Methods for Linear Systems with Arbitrarily Large Sparse Corruptions
- DOI: 10.1137/18m1179213
- Published: 2019
- Journal:
- Impact factor: 3.1
- Authors: Haddock, Jamie; Needell, Deanna
- Corresponding author: Needell, Deanna
Other publications by Deanna Needell
Stochastic iterative methods for online rank aggregation from pairwise comparisons
- DOI:
- Published: 2024
- Journal:
- Impact factor: 1.5
- Authors: B. Jarman; Lara Kassab; Deanna Needell; Alexander Sietsema
- Corresponding author: Alexander Sietsema
Stochastic gradient descent for streaming linear and rectified linear systems with Massart noise
- DOI:
- Published: 2024
- Journal:
- Impact factor: 0
- Authors: Halyun Jeong; Deanna Needell; E. Rebrova
- Corresponding author: E. Rebrova
An Introduction to Fourier Analysis with Applications to Music
- DOI: 10.5642/jhummath.201401.05
- Published: 2014
- Journal:
- Impact factor: 0.3
- Authors: N. Lenssen; Deanna Needell
- Corresponding author: Deanna Needell
Other grants held by Deanna Needell
Collaborative Research: Fast, Low-Memory Embeddings for Tensor Data with Applications
- Award number: 2108479
- Fiscal year: 2021
- Award amount: $470,900
- Award type: Continuing Grant
Tensors, Topics, Truth, and Time: Methods for Real Tensor Applications
- Award number: 2011140
- Fiscal year: 2020
- Award amount: $470,900
- Award type: Standard Grant
BIGDATA: F: Collaborative Research: Practical Analysis of Large-Scale Data with Lyme Disease Case Study
- Award number: 1934319
- Fiscal year: 2019
- Award amount: $470,900
- Award type: Standard Grant
Structured Random Matrices and Graphs in Signal Processing
- Award number: 1909457
- Fiscal year: 2019
- Award amount: $470,900
- Award type: Continuing Grant
BIGDATA: F: Collaborative Research: Practical Analysis of Large-Scale Data with Lyme Disease Case Study
- Award number: 1740312
- Fiscal year: 2017
- Award amount: $470,900
- Award type: Standard Grant
CAREER: Practical Compressive Signal Processing
- Award number: 1753879
- Fiscal year: 2017
- Award amount: $470,900
- Award type: Standard Grant
CAREER: Practical Compressive Signal Processing
- Award number: 1348721
- Fiscal year: 2014
- Award amount: $470,900
- Award type: Standard Grant
Similar overseas grants
BIGDATA: IA: Collaborative Research: Asynchronous Distributed Machine Learning Framework for Multi-Site Collaborative Brain Big Data Mining
- Award number: 2348159
- Fiscal year: 2023
- Award amount: $470,900
- Award type: Standard Grant
BIGDATA: IA: Collaborative Research: Intelligent Solutions for Navigating Big Data from the Arctic and Antarctic
- Award number: 2308649
- Fiscal year: 2022
- Award amount: $470,900
- Award type: Standard Grant
BIGDATA: Collaborative Research: F: Holistic Optimization of Data-Driven Applications
- Award number: 2027516
- Fiscal year: 2020
- Award amount: $470,900
- Award type: Standard Grant
BIGDATA: F: Collaborative Research: Practical Analysis of Large-Scale Data with Lyme Disease Case Study
- Award number: 1934319
- Fiscal year: 2019
- Award amount: $470,900
- Award type: Standard Grant
BIGDATA: IA: Collaborative Research: Protecting Yourself from Wildfire Smoke: Big Data-Driven Adaptive Air Quality Prediction Methodologies
- Award number: 1838022
- Fiscal year: 2019
- Award amount: $470,900
- Award type: Standard Grant
BIGDATA: F: Collaborative Research: Foundations of Responsible Data Management
- Award number: 1926250
- Fiscal year: 2019
- Award amount: $470,900
- Award type: Standard Grant
BIGDATA: IA: Collaborative Research: Intelligent Solutions for Navigating Big Data from the Arctic and Antarctic
- Award number: 1947584
- Fiscal year: 2019
- Award amount: $470,900
- Award type: Standard Grant
BIGDATA: IA: Collaborative Research: Asynchronous Distributed Machine Learning Framework for Multi-Site Collaborative Brain Big Data Mining
- Award number: 1837964
- Fiscal year: 2019
- Award amount: $470,900
- Award type: Standard Grant
BIGDATA: F: Collaborative Research: Optimizing Log-Structured-Merge-Based Big Data Management Systems
- Award number: 1838222
- Fiscal year: 2019
- Award amount: $470,900
- Award type: Standard Grant
BIGDATA: F: Collaborative Research: Optimizing Log-Structured-Merge-Based Big Data Management Systems
- Award number: 1838248
- Fiscal year: 2019
- Award amount: $470,900
- Award type: Standard Grant