Collaborative Research: RI: Small: Foundations of Few-Round Active Learning

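Basic Information

Award Number:
2313130
Principal Investigator:
Ruoxi Jia
Amount:
$300,000
Awardee Institution:
Virginia Polytechnic Institute and State University
Institution Country:
United States
Award Type:
Standard Grant
Fiscal Year:
2023
Funding Country:
United States
Project Period:
2023-08-01 to 2026-07-31
Project Status:
Ongoing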

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=2313130&HistoricalAwards=false
Keywords:
Collaborative Research RI Small Foundations

Project Abstract

Supervised machine learning has found widespread application, often achieving state-of-the-art performance. However, these algorithms rely on labeled training instances, which can be challenging to acquire. Labels are typically produced by human annotators and cost time and money to obtain. Active Learning strives to minimize labeling costs by identifying the most informative instances for annotation. While Active Learning techniques have shown promise in producing high-performance models with fewer labels, their applications remain constrained by the need for multiple rounds of interaction with annotators, which can be time-consuming or infeasible. This project aims to advance Active Learning algorithms and the understanding of their fundamental capabilities in scenarios with limited interaction rounds. A broad spectrum of machine learning applications is expected to benefit from the results of this research through reduced time and cost of obtaining sufficient data for training accurate models. Additionally, this project engages underrepresented minority students through hands-on research and learning activities, develops course modules on resource-efficient machine learning, and disseminates its findings to industry and academia via an extensive online Active Learning tutorial.

This project will launch a comprehensive investigation of few-round active learning, where the learner can actively request feedback on specific data points within a limited number of rounds. To achieve this, the project will interleave two algorithmic tasks: robust data utility quantification and planning with limited adaptivity. First, the investigators will explore methods to measure the utility of unlabeled data, taking into account data size, underlying data characteristics, and downstream learning tasks. Subsequently, the team will develop algorithms that optimize the data utility metric while simultaneously improving the metric's quality over time in a few-round active learning setting. The project findings will establish principled approaches for addressing a novel exploration-exploitation dilemma specific to few-round active learning and provide a fundamental understanding of adaptivity's role in budgeted learning. Finally, the project will evaluate the proposed approaches across various high-impact machine learning applications, including autonomous driving, smart buildings, dialog systems, and biochemical engineering.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
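
The few-round setting described in the abstract can be illustrated with a toy sketch. This is not the project's method: the one-dimensional threshold task, the function name `few_round_threshold_search`, and the `oracle` callback are all assumptions made for illustration. The learner must choose every query in a round before seeing any of that round's labels, so the number of rounds limits how adaptive it can be.

```python
def few_round_threshold_search(oracle, rounds, batch_size):
    """Bracket an unknown threshold on [0, 1] using a fixed number of query rounds.

    `oracle(x)` is a hypothetical annotator returning 1 if x >= threshold, else 0.
    Each round sends `batch_size` queries at once (no adaptivity inside a round).
    Returns (lo, hi) such that lo < threshold <= hi, assuming 0 < threshold <= 1.
    """
    lo, hi = 0.0, 1.0
    for _ in range(rounds):
        step = (hi - lo) / (batch_size + 1)
        # Queries are evenly spaced inside the current uncertainty interval.
        batch = [lo + step * (i + 1) for i in range(batch_size)]
        labels = [oracle(x) for x in batch]
        new_lo, new_hi = lo, hi
        # Labels are monotone (0s then 1s); the first 1 marks the new upper end.
        for x, y in zip(batch, labels):
            if y == 0:
                new_lo = x
            else:
                new_hi = x
                break
        lo, hi = new_lo, new_hi
    return lo, hi


def make_oracle(threshold):
    """Build a noiseless labeling oracle for a given (hidden) threshold."""
    return lambda x: 1 if x >= threshold else 0


if __name__ == "__main__":
    oracle = make_oracle(0.37)
    # Same labeling budget (12 labels), different numbers of rounds.
    for rounds, batch_size in [(1, 12), (3, 4)]:
        lo, hi = few_round_threshold_search(oracle, rounds, batch_size)
        print(rounds, "round(s): interval width", hi - lo)
```

In this noiseless toy problem, each round shrinks the uncertainty interval by a factor of `batch_size + 1`, so spending the same 12 labels over three rounds yields a narrower interval than spending them all in one round. Real few-round active learning must additionally cope with noisy labels, high-dimensional data, and an imperfect utility measure, which this sketch does not model.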
