Beyond With-replacement Sampling for Large-Scale Data Analysis and Optimization
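Basic Information
- Award number: 1723085
- Principal investigator:
- Amount: $125,000
- Host institution:
- Host institution country: United States
- Project type: Continuing Grant
- Fiscal year: 2017
- Funding country: United States
- Duration: 2017-07-15 to 2020-12-31
- Project status: Completed
- Source:
- Keywords:
Project Abstract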
Advances in sensing and processing technologies, communication capabilities, and smart devices have enabled the deployment of systems that collect massive amounts of data to make decisions. Many key problems in analyzing and processing big data reduce to large-scale optimization problems. One core, very widely used optimization method is efficient for such problems when data points are sampled and processed sequentially, yet there is a large gap between its theory and its practice. This project aims to close that gap by providing novel performance guarantees relevant to practical problems and by developing novel, faster variants of the method. The methods and techniques developed in this project will contribute to the efficiency and mathematical foundations of optimization algorithms targeted at big data challenges, enabling more efficient decision making across a wide variety of large-scale data analysis problems.
Incremental gradient (IG) is the core method mentioned above. It subsumes optimization methods popular in data analysis and machine learning practice, such as stochastic gradient descent, randomized coordinate descent, and Kaczmarz methods. Various performance guarantees for IG are available when data points are sampled with replacement in an independent and identically distributed (i.i.d.) manner. These guarantees, however, are of little help in practical scenarios: in practice, data is often sampled without replacement, in a non-i.i.d. fashion, because the resulting convergence is typically much faster. The first goal of this project is to study and quantify this discrepancy over an interesting class of regression problems, which has been a key open problem. Several techniques and methods are proposed for obtaining asymptotic and non-asymptotic theoretical guarantees for without-replacement sampling schemes.
The second goal is to develop fast algorithms with convergence guarantees that go beyond the limitations of i.i.d. sampling. To this end, a new framework is developed for studying several alternative sampling schemes and their performance. Using this framework, the project develops novel sampling schemes, based on weighted without-replacement sampling and cyclic sampling, that adapt to the dataset and improve on traditional i.i.d. sampling in terms of limiting accuracy.
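To make the sampling schemes named in the abstract concrete, the following is a minimal sketch of incremental gradient on a small synthetic least-squares problem. It is not the project's code: the function names (`incremental_gradient`, `with_replacement`, `random_reshuffling`, `cyclic`, `compare`), the problem sizes, the constant step size, and the Gaussian test data are all assumptions chosen for illustration. The only difference between the three runs is the order in which data points are visited within each epoch.

```python
import random


def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(ui * vi for ui, vi in zip(u, v))


def incremental_gradient(A, b, order_fn, epochs, step, rng):
    """Incremental gradient on f(x) = (1/n) * sum_i 0.5 * (a_i . x - b_i)^2.

    Each inner step uses the gradient of a single component,
    (a_i . x - b_i) * a_i, with a constant step size (an assumption).
    """
    n, d = len(A), len(A[0])
    x = [0.0] * d
    for _ in range(epochs):
        for i in order_fn(n, rng):
            residual = dot(A[i], x) - b[i]
            x = [xj - step * residual * aij for xj, aij in zip(x, A[i])]
    return x


def with_replacement(n, rng):
    """i.i.d. uniform sampling: indices may repeat within an epoch."""
    return [rng.randrange(n) for _ in range(n)]


def random_reshuffling(n, rng):
    """Without replacement: a fresh random permutation every epoch."""
    order = list(range(n))
    rng.shuffle(order)
    return order


def cyclic(n, rng):
    """Deterministic cyclic order 0, 1, ..., n-1 (rng is unused)."""
    return list(range(n))


def mean_squared_residual(A, b, x):
    """Average squared residual of x over the whole dataset."""
    return sum((dot(row, x) - bi) ** 2 for row, bi in zip(A, b)) / len(A)


def make_problem(n=200, d=5, noise=0.1, seed=0):
    """Synthetic noisy linear regression data (hypothetical test problem)."""
    rng = random.Random(seed)
    A = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
    x_true = [rng.gauss(0.0, 1.0) for _ in range(d)]
    b = [dot(row, x_true) + noise * rng.gauss(0.0, 1.0) for row in A]
    return A, b


def compare(epochs=30, step=0.01):
    """Run all three orderings and report the final mean squared residual."""
    A, b = make_problem()
    schemes = {
        "with_replacement": with_replacement,
        "random_reshuffling": random_reshuffling,
        "cyclic": cyclic,
    }
    results = {}
    for name, order_fn in schemes.items():
        x = incremental_gradient(A, b, order_fn, epochs, step, random.Random(1))
        results[name] = mean_squared_residual(A, b, x)
    return results


if __name__ == "__main__":
    for name, value in compare().items():
        print(f"{name}: {value:.6f}")
```

Running the script prints one final residual per ordering. A single run on one synthetic dataset is only an illustration; it does not establish the convergence rates or limiting-accuracy guarantees that the project studies.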
Project Outputs
Journal articles (18)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Robust Accelerated Gradient Methods for Smooth Strongly Convex Functions
- DOI: 10.1137/19m1244925
- Published: 2018-05
- Journal:
- Impact factor: 0
- Authors: N. Aybat; Alireza Fallah; M. Gürbüzbalaban; A. Ozdaglar
- Corresponding authors: N. Aybat; Alireza Fallah; M. Gürbüzbalaban; A. Ozdaglar
Breaking Reversibility Accelerates Langevin Dynamics for Non-Convex Optimization
- DOI:
- Published: 2020
- Journal:
- Impact factor: 0
- Authors: Xuefeng Gao; M. Gürbüzbalaban; Lingjiong Zhu
- Corresponding authors: Xuefeng Gao; M. Gürbüzbalaban; Lingjiong Zhu
ASYNC: A Cloud Engine with Asynchrony and History for Distributed Machine Learning
- DOI: 10.1109/ipdps47924.2020.00052
- Published: 2019-07
- Journal:
- Impact factor: 0
- Authors: Saeed Soori; Bugra Can; M. Gürbüzbalaban; M. Dehnavi
- Corresponding authors: Saeed Soori; Bugra Can; M. Gürbüzbalaban; M. Dehnavi
When Cyclic Coordinate Descent Outperforms Randomized Coordinate Descent
- DOI:
- Published: 2017-12
- Journal:
- Impact factor: 0
- Authors: M. Gürbüzbalaban; A. Ozdaglar; P. Parrilo; N. D. Vanli
- Corresponding authors: M. Gürbüzbalaban; A. Ozdaglar; P. Parrilo; N. D. Vanli
DAve-QN: A Distributed Averaged Quasi-Newton Method with Local Superlinear Convergence Rate
- DOI:
- Published: 2019-06
- Journal:
- Impact factor: 0
- Authors: Saeed Soori; Konstantin Mischenko; Aryan Mokhtari; M. Dehnavi; Mert Gurbuzbalaban
- Corresponding authors: Saeed Soori; Konstantin Mischenko; Aryan Mokhtari; M. Dehnavi; Mert Gurbuzbalaban
Other publications by Mert Gurbuzbalaban
Entropic Risk-Averse Generalized Momentum Methods
- DOI:
- Published: 2022
- Journal:
- Impact factor: 0
- Authors: Bugra Can; Mert Gurbuzbalaban
- Corresponding author: Mert Gurbuzbalaban
Non-Convex Optimization via Non-Reversible Stochastic Gradient Langevin Dynamics
- DOI:
- Published: 2020
- Journal:
- Impact factor: 0
- Authors: Yuanhan Hu; Xiaoyu Wang; Xuefeng Gao; Mert Gurbuzbalaban; Lingjiong Zhu
- Corresponding author: Lingjiong Zhu
Other grants held by Mert Gurbuzbalaban
Collaborative Research: Langevin Markov Chain Monte Carlo Methods for Machine Learning
- Award number: 2053485
- Fiscal year: 2021
- Funding amount: $125,000
- Project type: Standard Grant
SHF: Small: Communication-Efficient Distributed Algorithms for Machine Learning
- Award number: 1814888
- Fiscal year: 2018
- Funding amount: $125,000
- Project type: Standard Grant
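Similar NSFC grants
Cytoplasmic diversity in cabbage and its effect on dominant genic male sterility
- Award number: 30700543
- Approval year: 2007
- Funding amount: CNY 190,000
- Project type: Young Scientists Fund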
Similar overseas grants
A Translational Randomized Clinical Trial of Varenicline Sampling to Promote Smoking Cessation and Scalable Treatment Dissemination
- Award number: 10212989
- Fiscal year: 2020
- Funding amount: $125,000
- Project type:
A Translational Randomized Clinical Trial of Varenicline Sampling to Promote Smoking Cessation and Scalable Treatment Dissemination
- Award number: 10455439
- Fiscal year: 2020
- Funding amount: $125,000
- Project type:
Effectiveness of Nicotine Replacement Therapy Sampling in Dental Practices
- Award number: 10678685
- Fiscal year: 2020
- Funding amount: $125,000
- Project type:
Effectiveness of Nicotine Replacement Therapy Sampling in Dental Practices
- Award number: 10065030
- Fiscal year: 2020
- Funding amount: $125,000
- Project type:
A Translational Randomized Clinical Trial of Varenicline Sampling to Promote Smoking Cessation and Scalable Treatment Dissemination
- Award number: 10669624
- Fiscal year: 2020
- Funding amount: $125,000
- Project type:
Effectiveness of Nicotine Replacement Therapy Sampling in Dental Practices
- Award number: 10221673
- Fiscal year: 2020
- Funding amount: $125,000
- Project type:
Effectiveness of Nicotine Replacement Therapy Sampling in Dental Practices
- Award number: 10629020
- Fiscal year: 2020
- Funding amount: $125,000
- Project type:
Development and Testing of a Depression-Specific Behavioral Activation Mobile App Paired with Nicotine Replacement Therapy Sampling for Smoking Cessation Treatment Via Primary Care
- Award number: 10364683
- Fiscal year: 2018
- Funding amount: $125,000
- Project type:
Development and Testing of a Depression-Specific Behavioral Activation Mobile App Paired with Nicotine Replacement Therapy Sampling for Smoking Cessation Treatment Via Primary Care
- Award number: 10112873
- Fiscal year: 2018
- Funding amount: $125,000
- Project type:
Oregon State University/Marine Sediment Sampling Group Ocean Instrumentation: Acquisition of a replacement shipboard multisensor track for at-sea sediment physical properties scans
- Award number: 1652959
- Fiscal year: 2017
- Funding amount: $125,000
- Project type: Standard Grant