BIGDATA: Collaborative Research: F: Nomadic Algorithms for Machine Learning in the Cloud
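Basic Information
- Award number: 1546452
- Principal investigator:
- Award amount: $610,400
- Host institution:
- Host institution country: United States
- Project type: Standard Grant
- Fiscal year: 2016
- Funding country: United States
- Project period: 2016-01-01 to 2021-09-30
- Project status: Completed
- Source:
- Keywords: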
Project Abstract
With an ever-increasing ability to collect and archive data, massive data sets are becoming increasingly common. These data sets are often too big to fit into the main memory of a single computer, so there is a great need for scalable and sophisticated machine learning methods to analyze them. In particular, one has to devise strategies to distribute the computation across multiple machines. However, the stochastic optimization and inference algorithms that are so effective for large-scale machine learning appear to be inherently sequential.
The main research goal of this project is to develop a novel "nomadic" framework that overcomes this barrier. This will be done by showing that many modern machine learning problems have a certain "double separability" property. The aim is to exploit this property to develop convergent, asynchronous, distributed, and fault-tolerant algorithms that are well suited to achieving high performance on the commodity hardware prevalent on today's cloud computing platforms. In particular, over a four-year period, the following will be developed: (i) parallel stochastic optimization algorithms for the multi-machine cloud computing setting, (ii) theoretical guarantees of convergence, (iii) open-source code under a permissive license, and (iv) applications of these techniques to a variety of problem domains such as topic models and mixture models.
In addition, a cohort of students who can transfer their skills to both industry and academia will be trained, and a graduate-level course on scalable machine learning will be developed. The proposed research will enable practitioners in different application areas to quickly solve their big data problems. The results of the project will be disseminated widely through papers and open-source software. Course material will be developed for the education of students in the area of scalable machine learning, and the course will be co-taught at UCSC and UT Austin. The project will recruit women and minority students.
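The abstract does not define "double separability". As the term is commonly used, an objective over row parameters w_1..w_m and column parameters h_1..h_n is doubly separable if it can be written as a sum over pairs (i, j) of terms f_ij(w_i, h_j), so each term touches only one row parameter and one column parameter. The sketch below is an illustration under that assumption, not the project's code: it uses low-rank matrix factorization as a stand-in problem (the abstract names topic and mixture models instead), and every function name, the row-ownership rule, and the random hand-off of column parameters are hypothetical choices. It simulates the workers sequentially in one process rather than running real threads or machines.

```python
import random
from collections import deque


def sgd_step(w_i, h_j, a_ij, lr=0.05, reg=0.01):
    """In-place SGD step on one term
    f_ij(w_i, h_j) = 0.5 * (a_ij - <w_i, h_j>)**2 + 0.5 * reg * (|w_i|^2 + |h_j|^2).
    Only w_i and h_j are read or written."""
    err = a_ij - sum(x * y for x, y in zip(w_i, h_j))
    for k in range(len(w_i)):
        w_k, h_k = w_i[k], h_j[k]
        w_i[k] += lr * (err * h_k - reg * w_k)
        h_j[k] += lr * (err * w_k - reg * h_k)


def squared_error(entries, W, H):
    """Sum of squared residuals over the observed entries {(i, j): a_ij}."""
    return sum(
        (a - sum(x * y for x, y in zip(W[i], H[j]))) ** 2
        for (i, j), a in entries.items()
    )


def nomadic_sgd(entries, W, H, n_workers=2, n_events=4000, seed=0):
    """Sequential simulation of a nomadic schedule (no real threads or network).

    Worker q permanently owns the rows i with i % n_workers == q.
    Column parameters H[j] are "nomadic": each sits in exactly one worker's
    queue at a time, is updated against that worker's rows, then moves on.
    """
    rng = random.Random(seed)
    # Observed entries grouped by (owning worker, column).
    local = {}
    for (i, j), a in entries.items():
        local.setdefault((i % n_workers, j), []).append((i, a))
    # Initially deal the column indices out to the workers round-robin.
    queues = [deque() for _ in range(n_workers)]
    for j in range(len(H)):
        queues[j % n_workers].append(j)
    for _ in range(n_events):
        q = rng.randrange(n_workers)  # whichever worker happens to act next
        if not queues[q]:
            continue  # this worker holds no column right now
        j = queues[q].popleft()
        for i, a in local.get((q, j), []):
            sgd_step(W[i], H[j], a)
        queues[rng.randrange(n_workers)].append(j)  # hand H[j] to a random worker


def random_factors(n, rank, rng):
    """n small random vectors of length `rank`."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(rank)] for _ in range(n)]


if __name__ == "__main__":
    # Toy rank-1 matrix with every entry observed.
    u, v = [0.2, 0.4, 0.6, 0.8], [1.0, 0.5, 0.25]
    entries = {(i, j): u[i] * v[j] for i in range(4) for j in range(3)}
    rng = random.Random(1)
    W, H = random_factors(4, 2, rng), random_factors(3, 2, rng)
    print("squared error before:", squared_error(entries, W, H))
    nomadic_sgd(entries, W, H)
    print("squared error after: ", squared_error(entries, W, H))
```

Because each column vector is held by exactly one worker at a time and each row vector by exactly one owner, two workers in this sketch never write to the same parameter, which is why no locking appears. The same observation explains why steps on terms that share neither a row nor a column can be applied in either order with identical results.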
Project Outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants by Inderjit Dhillon
I-Corps: Faster than Light Big Data Analytics
- Award number: 1507631
- Fiscal year: 2015
- Award amount: $610,400
- Project type: Standard Grant
AF: Small: Divide-and-Conquer Numerical Methods for Analysis of Massive Data Sets
- Award number: 1320746
- Fiscal year: 2013
- Award amount: $610,400
- Project type: Standard Grant
AF: Small: Fast and Memory-Efficient Dimensionality Reduction for Massive Networks
- Award number: 1117055
- Fiscal year: 2011
- Award amount: $610,400
- Project type: Standard Grant
Non-Negative Matrix and Tensor Approximations: Algorithms, Software and Applications
- Award number: 0728879
- Fiscal year: 2007
- Award amount: $610,400
- Project type: Standard Grant
Novel Matrix Problems in Modern Applications
- Award number: 0431257
- Fiscal year: 2004
- Award amount: $610,400
- Project type: Standard Grant
CAREER: Scalable Algorithms for Large-Scale Data Mining
- Award number: 0093404
- Fiscal year: 2001
- Award amount: $610,400
- Project type: Continuing Grant
Similar overseas grants
BIGDATA: IA: Collaborative Research: Asynchronous Distributed Machine Learning Framework for Multi-Site Collaborative Brain Big Data Mining
- Award number: 2348159
- Fiscal year: 2023
- Award amount: $610,400
- Project type: Standard Grant
BIGDATA: IA: Collaborative Research: Intelligent Solutions for Navigating Big Data from the Arctic and Antarctic
- Award number: 2308649
- Fiscal year: 2022
- Award amount: $610,400
- Project type: Standard Grant
BIGDATA: Collaborative Research: F: Holistic Optimization of Data-Driven Applications
- Award number: 2027516
- Fiscal year: 2020
- Award amount: $610,400
- Project type: Standard Grant
BIGDATA: F: Collaborative Research: Practical Analysis of Large-Scale Data with Lyme Disease Case Study
- Award number: 1934319
- Fiscal year: 2019
- Award amount: $610,400
- Project type: Standard Grant
BIGDATA: IA: Collaborative Research: Protecting Yourself from Wildfire Smoke: Big Data-Driven Adaptive Air Quality Prediction Methodologies
- Award number: 1838022
- Fiscal year: 2019
- Award amount: $610,400
- Project type: Standard Grant
BIGDATA: F: Collaborative Research: Foundations of Responsible Data Management
- Award number: 1926250
- Fiscal year: 2019
- Award amount: $610,400
- Project type: Standard Grant
BIGDATA: IA: Collaborative Research: Intelligent Solutions for Navigating Big Data from the Arctic and Antarctic
- Award number: 1947584
- Fiscal year: 2019
- Award amount: $610,400
- Project type: Standard Grant
BIGDATA: IA: Collaborative Research: Asynchronous Distributed Machine Learning Framework for Multi-Site Collaborative Brain Big Data Mining
- Award number: 1837964
- Fiscal year: 2019
- Award amount: $610,400
- Project type: Standard Grant
BIGDATA: F: Collaborative Research: Optimizing Log-Structured-Merge-Based Big Data Management Systems
- Award number: 1838222
- Fiscal year: 2019
- Award amount: $610,400
- Project type: Standard Grant
BIGDATA: F: Collaborative Research: Optimizing Log-Structured-Merge-Based Big Data Management Systems
- Award number: 1838248
- Fiscal year: 2019
- Award amount: $610,400
- Project type: Standard Grant


















