MCS: AF: Small: Algorithms for Large Scale Prediction Problems

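Basic information

Award number:
1115788
Principal investigator:
Peter Bartlett
Amount:
Approximately $350,000
Host institution:
University of California-Berkeley
Host institution country:
United States
Award type:
Standard Grant
Fiscal year:
2011
Funding country:
United States
Duration:
2011-07-15 to 2015-06-30
Project status:
Completed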

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=1115788&HistoricalAwards=false
Keywords:
MCS AF Small Algorithms Large

Project abstract

In large scale prediction problems that arise in many application areas, data is plentiful, and it is computational resources that constrain the performance of prediction methods. The broad goal of this research project is the design and analysis of methods for large scale prediction problems that make effective use of limited computational resources. The main aims are: to improve our understanding of the tradeoff between the accuracy of a prediction method and its computational requirements; to develop model selection methods that adaptively choose the model complexity to give the best predictive accuracy for the available computational resources; to improve our understanding of the difficulty of solving large scale prediction problems using distributed computational resources; to develop analysis techniques and methods for asynchronous online prediction, which exploit the flexibility to respond to queries out of order; and hence to develop effective methods for large scale prediction problems.
As data acquisition and storage have become cheaper, enormous data sets have become available in many areas, including web information retrieval, the biological, medical, and physical sciences, manufacturing, finance and retail. Consequently, for many statistical prediction problems, the amount of data available is so huge that we can treat it as unlimited. For instance, in using image and caption data to train a prediction rule that can automatically choose appropriate labels for images, the web provides an effectively unlimited supply of training data. Similar situations arise in using click stream data to predict the choices of visitors to a popular web site, or in using customers' ratings of movies to make useful recommendations. For these large scale prediction problems, the bottleneck to performance is not the amount of data; rather, it is the computational resources that are available.
Many modern prediction methods have been designed and analyzed from the perspective that data is precious: they aim for optimal predictive accuracy for a given sample size. But for large scale problems, this is the wrong perspective; computation is the precious resource that must be used wisely. This shift in perspective introduces some novel tradeoffs. One of the most important tradeoffs arises in choosing the complexity of a prediction rule. Should we use our computational resources trying to optimize over a very complex family of prediction rules, which would not allow us to gather much data? Or should we save computation by using simpler prediction rules, and instead spend this computation on gathering more data? This research project is aimed at improving our understanding of these tradeoffs, and hence developing strategies for large scale prediction problems that best exploit the available computational resources.

Project outcomes

Journal articles (0)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Other publications by Peter Bartlett

Mathematical Foundations of Machine Learning

DOI:
10.4171/owr/2021/15
Publication date:
2022
Journal:
Oberwolfach Reports
Impact factor:
0
Authors:
Peter Bartlett;Cristina Butucea;Johannes Schmidt
Corresponding author:
Johannes Schmidt

Minimax Fixed-Design Linear Regression

DOI:
Publication date:
2015
Journal:
Proceedings of The 28th Conference on Learning Theory (COLT 2015), JMLR: Workshop and Conference Proceedings
Impact factor:
0
Authors:
Peter Bartlett;Wouter Koolen;Alan Malek;Eiji Takimoto;Manfred Warmuth
Corresponding author:
Manfred Warmuth

Sex and Capacity: Introduction to Special Edition of the Liverpool Law Review

DOI:
10.1007/s10991-010-9074-9
Publication date:
2010-10-22
Journal:
Liverpool Law Review
Impact factor:
0.300
Authors:
Peter Bartlett
Corresponding author:
Peter Bartlett

Articulating future directions of law reform for compulsory mental health admission and treatment in Hong Kong

DOI:
10.1016/j.ijlp.2019.101513
Publication date:
2020-01-01
Journal:
Research article
Impact factor:
Authors:
Daisy Cheung;Michael Dunn;Elizabeth Fistein;Peter Bartlett;John McMillan;Carole J. Petersen
Corresponding author:
Carole J. Petersen