Collaborative Research: III: MEDIUM: Responsible Design and Validation of Algorithmic Rankers
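Basic information
- Award number: 2312931
- Principal investigator:
- Amount: $400,000
- Host institution:
- Host institution country: United States
- Award type: Standard Grant
- Fiscal year: 2023
- Funding country: United States
- Project period: 2023-09-01 to 2027-08-31
- Project status: Ongoing
- Source:
- Keywords: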
Project abstract
Data-driven systems employ algorithms to aid human judgment in critical domains like hiring and employment, school and college admissions, credit and lending, and college ranking. Because of their impacts on individuals, population groups, institutions, and society at large, it is critical to incorporate fairness, accountability, and transparency considerations into the design, validation, and use of these systems. Current research in this area has mainly focused on classification and prediction tasks. However, scoring and ranking are also used widely, and raise many concerns that methods designed for classification cannot handle because classification labels are applied one item at a time, whereas ranking is explicitly designed to compare items. This project is focused on algorithmic score-based rankers that sort a set of candidates based on a “simple” scoring formula. Such rankers are widely used in critical domains because of the premise that they are easier to design, understand, and justify than complex learned models. Yet, even these seemingly simple and transparent rankers may produce counter-intuitive results, unfairly demote candidates that belong to disadvantaged groups, and be prone to manipulation due to sensitivity to slight changes in the input data or in the scoring formula. Addressing these issues is challenging due to the interplay between the data being ranked and the ranker, the complex structure within the data, and the need to balance multiple objectives.
This project considers the core technical challenges inherent in the responsible design and validation of algorithmic rankers, and pursues three synergistic aims. Aim 1 is to develop methods to quantify the impact of item attributes, and of specific engineering choices regarding attribute representation and pre-processing, on the ranked outcome (validation). This information is then used to guide the data scientist in selecting a scoring function that corresponds to their understanding of quality or appropriateness (design). Aim 2 is to develop methods to quantify the impact of data uncertainty, of slight changes in the scoring formula, or both, on the ranked outcome (validation). This information is then used to guide the data scientist in intervening on data acquisition and pre-processing to reduce uncertainty, and in selecting a scoring function that is sufficiently stable (design). Aim 3 is to develop methods to quantify lack of fairness in ranked outcomes, with respect to candidates from under-represented or historically disadvantaged groups, in view of multiple fairness objectives and potential intersectional discrimination (validation). This information is then used to identify feasible trade-offs and assist the data scientist in navigating these trade-offs to enact fairness-enhancing interventions (design).
Outcomes of this work will impact the practice of scoring and ranking in critical domains like educational program admissions, hiring, and college ranking. Insights from this work will enable technical interventions when appropriate, and also identify cases where they are insufficient, and where more data should be collected or an alternative screening process should be used. This project will also include teaching and mentoring, public education and outreach, and broadening participation of members of under-represented groups in computing.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Project outcomes
Journal articles (1)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Detection of Groups with Biased Representation in Ranking
- DOI: 10.1109/icde55515.2023.00168
- Year published: 2023
- Journal:
- Impact factor: 0
- Authors: Li, Jinyang; Moskovitch, Yuval; Jagadish, H. V.
- Corresponding author: Jagadish, H. V.
Other grants by Hosagrahar Jagadish
CIVIC-PG Track B: Understanding Native American Tribal Residents Needs through Better Data and Query Systems
- Award number: 2228275
- Fiscal year: 2022
- Award amount: $400,000
- Award type: Standard Grant
III: Medium: Collaborative Research: Fairness in Web Database Applications
- Award number: 2106176
- Fiscal year: 2021
- Award amount: $400,000
- Award type: Standard Grant
BD Hubs: Collaborative Proposal: Midwest: Midwest Big Data Hub: Building Communities to Harness the Data Revolution
- Award number: 1916425
- Fiscal year: 2019
- Award amount: $400,000
- Award type: Cooperative Agreement
Collaborative Research: Framework for Integrative Data Equity Systems
- Award number: 1934565
- Fiscal year: 2019
- Award amount: $400,000
- Award type: Continuing Grant
BIGDATA: F: Collaborative Research: Foundations of Responsible Data Management
- Award number: 1741022
- Fiscal year: 2017
- Award amount: $400,000
- Award type: Standard Grant
BIGDATA: Small: DA: Choosing a Needle in a Big Data Haystack
- Award number: 1250880
- Fiscal year: 2013
- Award amount: $400,000
- Award type: Standard Grant
III: Small: Usable Databases Through Organic Technology
- Award number: 1017296
- Fiscal year: 2010
- Award amount: $400,000
- Award type: Standard Grant
TC: Small: Collaborative Research: User-Centric Privacy Control for Collaborative Social Media
- Award number: 1017149
- Fiscal year: 2010
- Award amount: $400,000
- Award type: Standard Grant
TC: Small: Analysis and Privacy Tools for Enterprise Database Audit Logs
- Award number: 0915782
- Fiscal year: 2009
- Award amount: $400,000
- Award type: Continuing Grant
Principles for Scalable Dynamic Visual Analytics
- Award number: 0808824
- Fiscal year: 2008
- Award amount: $400,000
- Award type: Standard Grant
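Similar National Natural Science Foundation of China (NSFC) grants
Research on Quantum Field Theory without a Lagrangian Description
- Award number: 24ZR1403900
- Approval year: 2024
- Award amount: CNY 0
- Project type: Provincial or municipal project
Cell Research
- Award number: 31224802
- Approval year: 2012
- Award amount: CNY 240,000
- Project type: Special Fund Project
Cell Research
- Award number: 31024804
- Approval year: 2010
- Award amount: CNY 240,000
- Project type: Special Fund Project
Cell Research (细胞研究)
- Award number: 30824808
- Approval year: 2008
- Award amount: CNY 240,000
- Project type: Special Fund Project
Research on the Rapid Growth Mechanism of KDP Crystal
- Award number: 10774081
- Approval year: 2007
- Award amount: CNY 450,000
- Project type: General Program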
Similar overseas grants
Collaborative Research: Conference: DESC: Type III: Eco Edge - Advancing Sustainable Machine Learning at the Edge
- Award number: 2342498
- Fiscal year: 2024
- Award amount: $400,000
- Award type: Standard Grant
Collaborative Research: III: Small: High-Performance Scheduling for Modern Database Systems
- Award number: 2322973
- Fiscal year: 2024
- Award amount: $400,000
- Award type: Standard Grant
Collaborative Research: III: Small: High-Performance Scheduling for Modern Database Systems
- Award number: 2322974
- Fiscal year: 2024
- Award amount: $400,000
- Award type: Standard Grant
Collaborative Research: Conference: DESC: Type III: Eco Edge - Advancing Sustainable Machine Learning at the Edge
- Award number: 2342497
- Fiscal year: 2024
- Award amount: $400,000
- Award type: Standard Grant
III : Medium: Collaborative Research: From Open Data to Open Data Curation
- Award number: 2420691
- Fiscal year: 2024
- Award amount: $400,000
- Award type: Standard Grant
Collaborative Research: III: Small: A DREAM Proactive Conversational System
- Award number: 2336769
- Fiscal year: 2024
- Award amount: $400,000
- Award type: Standard Grant
Collaborative Research: III: Small: A DREAM Proactive Conversational System
- Award number: 2336768
- Fiscal year: 2024
- Award amount: $400,000
- Award type: Standard Grant
Collaborative Research: III: Medium: Designing AI Systems with Steerable Long-Term Dynamics
- Award number: 2312865
- Fiscal year: 2023
- Award amount: $400,000
- Award type: Standard Grant
Collaborative Research: III: MEDIUM: Responsible Design and Validation of Algorithmic Rankers
- Award number: 2312932
- Fiscal year: 2023
- Award amount: $400,000
- Award type: Standard Grant
Collaborative Research: III: Medium: Algorithms for scalable inference and phylodynamic analysis of tumor haplotypes using low-coverage single cell sequencing data
- Award number: 2415562
- Fiscal year: 2023
- Award amount: $400,000
- Award type: Standard Grant