
Natural language processing for detecting toxic, abusive, and hateful language online


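Basic information

Grant number:
RGPIN-2022-04481
Principal investigator:
Taboada, Maite
Amount:
$46.6K
Host institution:
Simon Fraser University
Host institution country:
Canada
Program category:
Discovery Grants Program - Individual
Fiscal year:
2022
Funding country:
Canada
Period:
2022-01-01 to 2023-12-31
Project status:
Completed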

Source:
https://www.nserc-crsng.gc.ca/ase-oro/Details-Detailles_eng.asp?id=758103
Keywords:
Natural language processing detecting toxic

Project abstract

Digital technologies offer incredible power, from artificial intelligence and virtual assistants to social media and recommendation systems. Deploying such technologies in a manner beneficial to both individuals and society is a pressing challenge. In mainstream and social media, content providers welcome feedback; such feedback, however, may be "toxic": malicious, abusive, or offensive. Toxic comments and posts online are those that intend to cause harm. They may take the form of personal attacks, abuse, harassment, or threats, and may include profane, obscene, or derogatory language, with hate speech being the most extreme. In the last few years, I have closely studied online news comments and developed natural language processing (NLP) methods to analyze them. My long-term program of research develops robust methods for text classification in tasks such as sentiment analysis, misinformation detection, and content moderation. In the next few years, my SFU laboratory, the Discourse Processing Lab, will continue to study toxic language online and to develop methods and algorithms to detect toxicity automatically. Our work identifying constructive comments, those that contribute positively to an online discussion, has provided excellent insight into how to automatically classify non-constructive and toxic comments. Current approaches to detecting online toxicity are based either on general text characteristics (word length, text length, capitalization, and punctuation) or on lists of words likely to cause offense. Machine learning approaches (supervised, semi-supervised, or based on neural networks) rely on large annotated datasets, but many studies have shown that such approaches often fail because negativity in language may be wrapped in positive words, through metaphors and other figures of speech.
Research, including our own, has found that accurately identifying and filtering toxic content requires a multidisciplinary perspective, drawing on a deep understanding of linguistics and on current methods in NLP and machine learning. To address existing gaps in the automatic detection of toxic comments, in the next five years I plan to: (Objective 1) study how metaphors and other figures of speech well known since antiquity (euphemism, litotes, hyperbole, sarcasm) convey toxic language. I will then develop (Objective 2) a system to detect figures of speech automatically, which I will integrate into (Objective 3) a new content moderation platform. The results of this work will mobilize research among scholars interested in evaluative language and the role of media in public discourse, including linguists, computational linguists, and communication and media researchers. At a time when media organizations, social media platforms, and the public are concerned about online abuse, misinformation, and the role of digital technology in politics and society, this project is timely and will make an important contribution to public discourse.
