Collaborative Research: CIF: Small: Learning from Multiple Biased Sources

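Basic Information

Award number:
2008074
Principal investigator:
Clayton Scott
Amount:
$300,000
Institution:
Regents of the University of Michigan - Ann Arbor
Institution country:
United States
Award type:
Standard Grant
Fiscal year:
2020
Funding country:
United States
Period:
2020-07-01 to 2024-06-30
Status:
Completed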

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=2008074&HistoricalAwards=false
Keywords:
Collaborative Research CIF Small Learning

Abstract

The field of artificial intelligence, and especially machine learning, is concerned with automating the performance of a task by learning from past performances of that task. Examples include classifying images and successfully navigating a maze. Classical machine learning methods assume that past occurrences of a task, or “training data,” accurately represent future occurrences of the task. In many applications, however, training data are drawn from multiple sources that reflect future occurrences with varying degrees of quality. Examples include images labeled by crowd-sourced users or navigation of randomly simulated mazes. The objective of this project is to develop theoretical foundations of learning from multiple biased sources. The work will be motivated by applications in crowdsourcing and autonomous navigation as described above, as well as in video surveillance and nuclear threat detection. This research will support the cross-disciplinary development of a diverse cohort of PhD and undergraduate students at the University of Michigan and at Boston University.

To achieve these goals, the investigators will establish theoretical foundations for four broad classes of machine learning problems for which virtually no theory presently exists: (1) Classification from multiple corrupted sources, (2) Clustering with overlapping, nonparametric clusters, (3) Sim2Real reinforcement learning, and (4) Zero-shot learning. This project's theoretical contributions will take the form of generalization error bounds, regret bounds, and sample complexity bounds, while also emphasizing distribution-free or general nonparametric models wherever possible. To address the challenges of eliciting and aggregating biased information from multiple sources, the analyses will develop new technical tools, including weighted Rademacher complexity, regret analysis from biased bandit feedback, and oracle inequalities for density estimation, that are likely to find application in other learning settings. The research resulting from this effort will highlight distinctive features of learning from multiple sources, including various questions associated with multiple sample sizes. More generally, the research develops principled approaches for integrating heterogeneous data sources in both batch and sequential learning settings and under a variety of inter-source dependence models.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project Outcomes

Journal articles (9)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

An exact solver for the Weston-Watkins SVM subproblem

DOI:
Published:
2021-02
Journal:
Impact factor:
0
Authors:
Yutong Wang; C. Scott
Corresponding authors:
Yutong Wang; C. Scott

VC dimension of partially quantized neural networks in the overparametrized regime

DOI:
Published:
2021-10
Journal:
ArXiv
Impact factor:
0
Authors:
Yutong Wang; C. Scott
Corresponding authors:
Yutong Wang; C. Scott

Consistent Estimation of Identifiable Nonparametric Mixture Models from Grouped Observations

DOI:
Published:
2020-06
Journal:
ArXiv
Impact factor:
0
Authors:
Alexander Ritchie; Robert A. Vandermeulen; C. Scott
Corresponding authors:
Alexander Ritchie; Robert A. Vandermeulen; C. Scott

Weston-Watkins Hinge Loss and Ordered Partitions

DOI:
Published:
2020-06
Journal:
ArXiv
Impact factor:
0
Authors:
Yutong Wang; C. Scott
Corresponding authors:
Yutong Wang; C. Scott

Learning from Label Proportions by Learning with Label Noise

DOI:
10.48550/arxiv.2203.02496
Published:
2022-03
Journal:
ArXiv
Impact factor:
0
Authors:
Jianxin Zhang; Yutong Wang; C. Scott
Corresponding authors:
Jianxin Zhang; Yutong Wang; C. Scott

Other publications by Clayton Scott

Multiclass Domain Generalization

DOI:
Published:
2017
Journal:
Impact factor:
0
Authors:
A. Deshmukh; Srinagesh Sharma; James W. Cutler; Clayton Scott
Corresponding author:
Clayton Scott

The Nested Structure of Cancer Symptoms

DOI:
Published:
2010
Journal:
Methods of Information in Medicine
Impact factor:
1.7
Authors:
S. Bhavnani; G. Bellala; Arunkumaar Ganesan; Rajeev Krishna; Paul R. Saxman; Clayton Scott; Maria J. Silveira; Charles W. Given
Corresponding author:
Charles W. Given