CAREER: Advancing Open-Ended Crowdsourcing: The Next Frontier in Crowdsourced Data Management
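Basic information
- Award number: 1940757
- Principal investigator:
- Amount: $413,400
- Host institution:
- Host institution country: United States
- Award type: Continuing Grant
- Fiscal year: 2019
- Funding country: United States
- Period: 2019-05-15 to 2024-03-31
- Status: Completed
- Source:
- Keywords:
Project abstract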
Machine learning on big data is finally having an impact on our daily lives, from small triumphs like Siri and Google Translate to much tougher emerging applications like driverless cars and computer-assisted medical image diagnosis. From mundane online fraud detection to the most sophisticated uses of computer vision, these applications share an insatiable appetite for massive labeled training data. The primary source of high-quality labels is crowdsourcing, and research to date on crowdsourcing has focused on the key problem of how to maximize the production of high-quality crowdsourced labels per dollar spent, for problems where workers must choose between just a few predefined labels. However, more open-ended labeling problems have grown to constitute almost half of crowdsourced tasks today, and open-ended tasks raise an entirely new set of research challenges for crowdsourced data management.
This activity addresses the key new research challenges in managing and optimizing open-ended crowdsourcing. Since open-ended crowdsourcing employs tasks with a large number of alternatives, humans struggle to select error-free ones. Additional challenges emerge in determining the open-ended task types appropriate for a specific problem, developing schemes to ascertain the right answer given open-ended worker responses, and inferring the hidden perspectives behind worker answers. The activity targets open-ended crowdsourcing problems that span nearly 90% of those used in practice today, with wide applicability in computer vision, natural language processing, and machine learning in general. The technical outcomes of the activity include the first foundational principles for open-ended crowdsourced data management, which in turn will expand the reach of machine learning into new and more challenging domains and more effective solutions in existing applications that impact our everyday lives.
The pedagogical outcomes of the activity include a course on human-in-the-loop data analytics, crowdsourcing education modules for school teachers, as well as a quantification and dissemination of how crowdsourcing is performed in practice, along with a benchmark to accelerate crowdsourcing research in the future.
Project outcomes
Journal papers (17)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Whither AutoML? Understanding the Role of Automation in Machine Learning Workflows
- DOI:
- Published: 2021
- Journal:
- Impact factor: 0
- Authors: Doris Xin, Eva Yiwei
- Corresponding author: Doris Xin, Eva Yiwei
An Exploratory User Study of Visual Causality Analysis
- DOI: 10.1111/cgf.13680
- Published: 2019-06
- Journal:
- Impact factor: 2.5
- Authors: Chi-Hsien Yen; Aditya G. Parameswaran; W. Fu
- Corresponding author: Chi-Hsien Yen; Aditya G. Parameswaran; W. Fu
NOAH: Interactive Spreadsheet Exploration with Dynamic Hierarchical Overviews.
- DOI: 10.14778/3447689.3447701
- Published: 2021
- Journal:
- Impact factor: 2.5
- Authors: Sajjadur Rahman, Mangesh Bendre
- Corresponding author: Sajjadur Rahman, Mangesh Bendre
From Sketching to Natural Language: Expressive Visual Querying for Accelerating Insight
- DOI:
- Published: 2021
- Journal:
- Impact factor: 1.1
- Authors: Siddiqui T; Wang Z; Karahalios K; Parameswaran A
- Corresponding author: Parameswaran A
CRUX: Adaptive Querying for Efficient Crowdsourced Data Extraction
- DOI: 10.1145/3357384.3357976
- Published: 2019
- Journal:
- Impact factor: 0
- Authors: Rekatsinas, Theodoros; Deshpande, Amol; Parameswaran, Aditya
- Corresponding author: Parameswaran, Aditya
Other publications by Aditya Parameswaran
OrpheusDB: bolt-on versioning for relational databases (extended version)
- DOI: 10.1007/s00778-019-00594-5
- Published: 2019-12-20
- Journal:
- Impact factor: 3.800
- Authors: Silu Huang; Liqi Xu; Jialin Liu; Aaron J. Elmore; Aditya Parameswaran
- Corresponding author: Aditya Parameswaran
Other grants by Aditya Parameswaran
FW-HTF-R: Human-Machine Teaming for Effective Data Work at Scale: Upskilling Defense Lawyers Working with Police and Court Process Data
- Award number: 2129008
- Fiscal year: 2021
- Amount: $413,400
- Award type: Standard Grant
AitF: Collaborative Research: Fast, Accurate, and Practical: Adaptive Sublinear Algorithms for Scalable Visualization
- Award number: 1940759
- Fiscal year: 2019
- Amount: $413,400
- Award type: Standard Grant
AitF: Collaborative Research: Fast, Accurate, and Practical: Adaptive Sublinear Algorithms for Scalable Visualization
- Award number: 1733878
- Fiscal year: 2017
- Amount: $413,400
- Award type: Standard Grant
CAREER: Advancing Open-Ended Crowdsourcing: The Next Frontier in Crowdsourced Data Management
- Award number: 1652750
- Fiscal year: 2017
- Amount: $413,400
- Award type: Continuing Grant
III: Medium: Collaborative Research: DataHub - A Collaborative Dataset Management Platform for Data Science
- Award number: 1513407
- Fiscal year: 2015
- Amount: $413,400
- Award type: Continuing Grant
Similar overseas grants
Pelican: Advancing the Open Science Data Federation Platform
- Award number: 2331480
- Fiscal year: 2023
- Amount: $413,400
- Award type: Continuing Grant
Integrative approaches to advancing retention, engagement, and graduation in STEM students across a multi-campus open-access institution
- Award number: 2221026
- Fiscal year: 2023
- Amount: $413,400
- Award type: Standard Grant
Collaborative Research: Advancing Thermodynamic Modeling of Open Magmatic Systems - Translithosphere Magma Chamber Simulator
- Award number: 2151039
- Fiscal year: 2022
- Amount: $413,400
- Award type: Standard Grant
POSE: Phase I: Tapis Advancing Collaborative Open Source (TACOS)
- Award number: 2229614
- Fiscal year: 2022
- Amount: $413,400
- Award type: Standard Grant
Collaborative Research: Advancing Thermodynamic Modeling of Open Magmatic Systems - Translithosphere Magma Chamber Simulator
- Award number: 2151038
- Fiscal year: 2022
- Amount: $413,400
- Award type: Standard Grant
Advancing the Open Geospatial Consortium (OGC) CDB Standard for 3D Synthetic Environment Simulation and Modelling
- Award number: 543688-2019
- Fiscal year: 2020
- Amount: $413,400
- Award type: Collaborative Research and Development Grants
Advancing the Open Geospatial Consortium (OGC) CDB Standard for 3D Synthetic Environment Simulation and Modelling
- Award number: 543688-2019
- Fiscal year: 2019
- Amount: $413,400
- Award type: Collaborative Research and Development Grants
RCN Build-a-Cell: An Open Community Considering & Advancing the Construction of Synthetic Cells
- Award number: 1901145
- Fiscal year: 2019
- Amount: $413,400
- Award type: Standard Grant
Frameworks: Software NSCI-Open OnDemand 2.0: Advancing Accessibility and Scalability for Computational Science through Leveraged Software Cyberinfrastructure
- Award number: 1835725
- Fiscal year: 2018
- Amount: $413,400
- Award type: Standard Grant
CAREER: Advancing Open-Ended Crowdsourcing: The Next Frontier in Crowdsourced Data Management
- Award number: 1652750
- Fiscal year: 2017
- Amount: $413,400
- Award type: Continuing Grant