CAREER: Language Technologies Against the Language of Social Discrimination

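Basic Information

  • Award number:
    2142739
  • Principal investigator:
  • Amount:
    $550,400
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Continuing Grant
  • Fiscal year:
    2022
  • Funding country:
    United States
  • Project period:
    2022-09-01 to 2027-08-31
  • Project status:
    Ongoing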

Project Abstract

This award is funded in whole or in part under the American Rescue Plan Act of 2021 (Public Law 117-2).
The exponential growth of online social platforms provides an unprecedented source of equal opportunities for accessing expert and crowd wisdom, and for finding education, employment, and friendships. One root cause that can deeply impede these experiences is exposure to implicit social bias. The risk is high, since biases are pernicious and pervasive, and it is well established that language is a primary means through which stereotypes and prejudice are communicated and perpetuated. This project develops language technologies to detect and intervene in the language of social discrimination (sexist, racist, and homophobic microaggressions, condescension, objectification, dehumanizing metaphors, and the like), which can be unconscious and unintentional but causes prolonged personal and professional harm. The program opens up new research opportunities with implications for natural language processing, machine learning, data science, and computational social science. It develops new Web-scale algorithms to automatically detect implicit and disguised toxicity, as well as hate speech and abusive language online. Technologically, it develops new methods to surface and demote spurious patterns in deep-learning models, and new techniques to interpret those models, thereby opening new avenues to reliable and interpretable machine learning.
Successful completion of the program will pave the way for a paradigm shift in how civility in cyberspace is monitored, shielding vulnerable populations from discrimination and aggression and reducing the mental load of platform moderators. The project can therefore benefit and empower a large number of individuals who use social media or AI technologies built upon user-generated content, in particular members of disadvantaged groups who face discrimination based on gender, race, age, sexual orientation, or ethnicity. Finally, the educational curriculum developed by this program will equip future technologists with theoretical and practical tools for building ethical AI, and will substantially promote diversity, equity, and inclusion in STEM education, helping to foster a new, more diverse generation of researchers entering AI.
The overarching goal of this CAREER project is to develop lightly supervised, interpretable machine learning approaches, grounded in social psychology and causal reasoning, to detect implicit social bias in written discourse and narrative text. More specifically, the first phase of the project develops algorithms and models for identifying and explaining gendered microaggressions in short comments on social media, first in an unsupervised setting, then with active learning under limited supervision by trained annotators. It provides transformative solutions for making existing overparameterized black-box neural networks more robust and more interpretable. Since microaggressions are often implicit, it also develops approaches to generate explanations of the microaggression detector's decisions. In the second phase, the project addresses the challenging task of detecting biased framing of members of the LGBTQ community in narrative domains of digital media, and develops data analytic tools by operationalizing, across languages, well-established social psychology theories. The expected outcomes of this five-year program include new datasets, algorithms, and models that provide people-centered text analytics and that pinpoint and explain potentially biased framings across languages, data domains, and social contexts.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project Outcomes

Journal articles: 12
Monographs: 0
Research awards: 0
Conference papers: 0
Patents: 0
SSD-LM: Semi-autoregressive Simplex-based Diffusion Language Model for Text Generation and Modular Control
  • DOI:
    10.48550/arxiv.2210.17432
  • Published:
    2022-10
  • Journal:
  • Impact factor:
    0
  • Authors:
    Xiaochuang Han;Sachin Kumar;Yulia Tsvetkov
  • Corresponding authors:
    Xiaochuang Han;Sachin Kumar;Yulia Tsvetkov
Language Generation Models Can Cause Harm: So What Can We Do About It? An Actionable Survey
  • DOI:
    10.48550/arxiv.2210.07700
  • Published:
    2022-10
  • Journal:
  • Impact factor:
    0
  • Authors:
    Sachin Kumar;Vidhisha Balachandran;Lucille Njoo;Antonios Anastasopoulos;Yulia Tsvetkov
  • Corresponding authors:
    Sachin Kumar;Vidhisha Balachandran;Lucille Njoo;Antonios Anastasopoulos;Yulia Tsvetkov
Gendered Mental Health Stigma in Masked Language Models
Constrained Sampling from Language Models via Langevin Dynamics in Embedding Spaces
  • DOI:
    10.48550/arxiv.2205.12558
  • Published:
    2022
  • Journal:
  • Impact factor:
    0
  • Authors:
    Sachin Kumar;Biswajit Paria;Yulia Tsvetkov
  • Corresponding authors:
    Sachin Kumar;Biswajit Paria;Yulia Tsvetkov
On the Blind Spots of Model-Based Evaluation Metrics for Text Generation
  • DOI:
    10.48550/arxiv.2212.10020
  • Published:
    2022-12
  • Journal:
  • Impact factor:
    0
  • Authors:
    Tianxing He;Jingyu Zhang;Tianle Wang;Sachin Kumar;Kyunghyun Cho;James R. Glass;Yulia Tsvetkov
  • Corresponding authors:
    Tianxing He;Jingyu Zhang;Tianle Wang;Sachin Kumar;Kyunghyun Cho;James R. Glass;Yulia Tsvetkov

Other publications by Yulia Tsvetkov

Style Transfer Through Multilingual and Feedback-Based Back-Translation
  • DOI:
  • Published:
    2018
  • Journal:
  • Impact factor:
    0
  • Authors:
    Shrimai Prabhumoye;Yulia Tsvetkov;A. Black;R. Salakhutdinov
  • Corresponding author:
    R. Salakhutdinov
LTIatCMU at SemEval-2020 Task 11: Incorporating Multi-Level Features for Multi-Granular Propaganda Span Identification
  • DOI:
    10.18653/v1/2020.semeval-1.230
  • Published:
    2020
  • Journal:
  • Impact factor:
    0
  • Authors:
    Sopan Khosla;Rishabh Joshi;Ritam Dutt;A. Black;Yulia Tsvetkov
  • Corresponding author:
    Yulia Tsvetkov
RtGender: A Corpus for Studying Differential Responses to Gender
A Dynamic Strategy Coach for Effective Negotiation
Extraction of Multi-word Expressions from Small Parallel Corpora
  • DOI:
  • Published:
    2010
  • Journal:
  • Impact factor:
    0
  • Authors:
    Yulia Tsvetkov
  • Corresponding author:
    Yulia Tsvetkov

Other grants by Yulia Tsvetkov

NSF-BSF: Collaborative Research: RI: Small: Multilingual Language Generation via Understanding of Code Switching
  • Award number:
    2203097
  • Fiscal year:
    2021
  • Award amount:
    $550,400
  • Project type:
    Standard Grant
Collaborative Research: RI: Small: NL(V)P: Natural Language (Variety) Processing
  • Award number:
    2125201
  • Fiscal year:
    2021
  • Award amount:
    $550,400
  • Project type:
    Standard Grant
NSF-BSF: Collaborative Research: RI: Small: Multilingual Language Generation via Understanding of Code Switching
  • Award number:
    2007960
  • Fiscal year:
    2020
  • Award amount:
    $550,400
  • Project type:
    Standard Grant
NSF-BSF: RI: Small: Collaborative Research: Modeling Crosslinguistic Influences Between Language Varieties
  • Award number:
    1812327
  • Fiscal year:
    2018
  • Award amount:
    $550,400
  • Project type:
    Continuing Grant

Similar international grants

Doctoral Dissertation Research: Intersections of Labor, Language, and Value in the Production of Emerging Technologies
  • Award number:
    2343003
  • Fiscal year:
    2024
  • Award amount:
    $550,400
  • Project type:
    Standard Grant
SBIR Phase I: Sown To Grow - Measuring Growth in Trusting Relationships between Students and Educators with Natural Language Processing and Machine Learning Technologies
  • Award number:
    2322340
  • Fiscal year:
    2023
  • Award amount:
    $550,400
  • Project type:
    Standard Grant
Collaborative Research: EAGER: Developing and Optimizing Reflection-Informed STEM Learning and Instruction by Integrating Learning Technologies with Natural Language Processing
  • Award number:
    2329273
  • Fiscal year:
    2023
  • Award amount:
    $550,400
  • Project type:
    Standard Grant
Towards Globally Equitable Language Technologies (EQUATE)
  • Award number:
    EP/Y031350/1
  • Fiscal year:
    2023
  • Award amount:
    $550,400
  • Project type:
    Research Grant
Collaborative Research: EAGER: Developing and Optimizing Reflection-Informed STEM Learning and Instruction by Integrating Learning Technologies with Natural Language Processing
  • Award number:
    2329274
  • Fiscal year:
    2023
  • Award amount:
    $550,400
  • Project type:
    Standard Grant
EAGER: Building Language Technologies by Machine Reading Grammars
  • Award number:
    2327143
  • Fiscal year:
    2023
  • Award amount:
    $550,400
  • Project type:
    Standard Grant
CCRI: Planning-C: Facilitating Language Technologies for Crisis Response (LT4CR)
  • Award number:
    2234895
  • Fiscal year:
    2023
  • Award amount:
    $550,400
  • Project type:
    Standard Grant
Construction of Large SL (Sign Language) Corpus of Video on Web Using Also AI Technologies: Advances in SL Corpus Linguistics
  • Award number:
    23K17273
  • Fiscal year:
    2023
  • Award amount:
    $550,400
  • Project type:
    Grant-in-Aid for Challenging Research (Pioneering)
CAREER: Socially-Aware Language Technologies To Support People in Supporting Others for Better Online Communities
  • Award number:
    2144562
  • Fiscal year:
    2022
  • Award amount:
    $550,400
  • Project type:
    Continuing Grant
CAREER: Socially-Aware Language Technologies To Support People in Supporting Others for Better Online Communities
  • Award number:
    2247357
  • Fiscal year:
    2022
  • Award amount:
    $550,400
  • Project type:
    Continuing Grant