CAREER: Causal Modeling for Data Quality and Bias Mitigation
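Basic Information
- Award number: 2340124
- Principal investigator:
- Amount: $600,000
- Host institution:
- Host institution country: United States
- Project category: Continuing Grant
- Fiscal year: 2024
- Funding country: United States
- Duration: 2024-07-01 to 2029-06-30
- Project status: Ongoing
- Source:
- Keywords: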
Project Abstract
This project presents a novel approach, inspired by database methodologies, to address the significant challenge of bias in algorithmic systems, particularly in sensitive domains such as credit scoring, medical diagnostics, predictive policing, and the criminal justice system. By recognizing that such biases often stem from the underlying data, the initiative redefines algorithmic bias as a data quality management issue. Emphasizing critical aspects of data quality management such as accuracy, completeness, and consistency, the project aims to develop methods that significantly enhance the trustworthiness and societal impact of these systems. Incorporating causal modeling with these essential data quality principles, it takes a strategic approach to identifying and addressing the root causes of algorithmic bias.

This effort not only marks a significant advancement in the field of data science but also contributes substantially to national and public welfare by advocating for decision-making processes that are fair, accurate, and reliable, thereby promoting national health, prosperity, and well-being in a comprehensive manner. This plan envisions a wide-ranging dissemination of its motivation, approach, and artifacts through a diverse array of interdisciplinary colloquia, seminars, and co-curricular learning opportunities.

This project addresses algorithmic bias through a fourfold approach:
1) Developing new, scalable algorithms for data repair, designed for repairing data concerning a special class of integrity constraints that can capture the statistical nuances of data used for training machine learning (ML) models.
2) Establishing a holistic data debiasing framework capable of addressing various data biases and quality issues.
3) Implementing methods to quantify uncertainty in algorithmic decision-making, particularly based on ML models, where the uncertainty stems from bias and data quality issues that cannot be fully recovered and removed due to incomplete information.
4) Lastly, the project focuses on developing methods for root-cause analysis to identify underlying issues and adaptive debiasing in dynamic data environments, incorporating proactive interventions in data processing pipelines for ongoing bias mitigation.

This multifaceted strategy aims to advance the fields of data quality management, data cleaning for ML, and responsible data science, significantly enhancing the reliability, fairness, and accuracy of data-driven decision-making systems.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Project Outputs
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other Publications by Babak Salimi
COMPARISON OF IMMORTALIZATION ASSAY AND POLYMERASE CHAIN REACTION DETECTION OF EPSTEIN-BARR VIRUS IN PEDIATRIC TRANSPLANT RECIPIENTS AND CONTROL SAMPLES
- DOI: 10.1080/pdp.21.4.433.443
- Year: 2002
- Journal:
- Impact factor: 0
- Authors: Babak Salimi; E. Alonso; R. Cohn; S. Mendley; B. Katz
- Corresponding author: B. Katz
Causal What-If and How-To Analysis Using HYPER
- DOI:
- Year: 2023
- Journal:
- Impact factor: 0
- Authors: Fangzhu Shen; Kayvon Heravi; Oscar Gomez; Sainyam Galhotra; Amir Gilad; Sudeepa Roy; Babak Salimi
- Corresponding author: Babak Salimi
First Workshop on Governance, Understanding and Integration of Data for Effective and Responsible AI (GUIDE-AI)
- DOI: 10.1145/3626246.3655019
- Year: 2024
- Journal:
- Impact factor: 0
- Authors: Abolfazl Asudeh; Sainyam Galhotra; Amir Gilad; Babak Salimi; Brit Youngmann
- Corresponding author: Brit Youngmann
Inflammatory Potential of Diet and Odds of Lung Cancer: A Case-Control Study
- DOI: 10.1080/01635581.2022.2036770
- Year: 2022
- Journal:
- Impact factor: 0
- Authors: A. Sadeghi; K. Parastouei; S. Seifi; A. Khosravi; Babak Salimi; H. Zahedi; O. Sadeghi; Hamid Rasekhi; M. Amini
- Corresponding author: M. Amini
Enforcing Conditional Independence for Fair Representation Learning and Causal Image Generation
- DOI:
- Year: 2024
- Journal:
- Impact factor: 0
- Authors: Jensen Hwa; Qingyu Zhao; Aditya Lahiri; Adnan Masood; Babak Salimi; E. Adeli
- Corresponding author: E. Adeli
Similar Overseas Grants
AI-Powered Uncovering of Mechanisms in Cancer Through Causal Discovery Analysis and Generative Modeling of Heterogeneous Data
- Award number: 10581180
- Fiscal year: 2023
- Amount: $600,000
- Project category:
Stress modeling of the human sperm sncRNA transcriptome and causal importance of dynamic miRNA in reproductive and developmental outcomes
- Award number: 10707015
- Fiscal year: 2022
- Amount: $600,000
- Project category:
Stress modeling of the human sperm sncRNA transcriptome and causal importance of dynamic miRNA in reproductive and developmental outcomes
- Award number: 10442142
- Fiscal year: 2022
- Amount: $600,000
- Project category:
Identifying low dose measurement error corrected effects of multiple pollutants using causal modeling
- Award number: 10634894
- Fiscal year: 2021
- Amount: $600,000
- Project category:
Identifying low dose measurement error corrected effects of multiple pollutants using causal modeling
- Award number: 10524732
- Fiscal year: 2021
- Amount: $600,000
- Project category:
Identifying low dose measurement error corrected effects of multiple pollutants using causal modeling
- Award number: 10332715
- Fiscal year: 2021
- Amount: $600,000
- Project category:
Identifying low dose measurement error corrected effects of multiple pollutants using causal modeling
- Award number: 10092293
- Fiscal year: 2021
- Amount: $600,000
- Project category:
III: Small: Counterfactually Fair Machine Learning through Causal Modeling
- Award number: 1910284
- Fiscal year: 2021
- Amount: $600,000
- Project category: Standard Grant
FAI: causal and semi-parametric inference for explanations of disparities and disparity-correcting modeling
- Award number: 2040804
- Fiscal year: 2021
- Amount: $600,000
- Project category: Standard Grant
Effects of Chronic Kidney Disease on Cardiovascular Disease and Dementia Among People with Diabetes: Causal Modeling with Machine Learning Approach
- Award number: 10059131
- Fiscal year: 2020
- Amount: $600,000
- Project category: