Exaggeration, cohesion, and fragmentation in on-line forums

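Basic information

  • Grant number:
    EP/T023333/1
  • Principal investigator:
  • Amount:
    $770,600
  • Host institution:
  • Host institution country:
    United Kingdom
  • Project category:
    Research Grant
  • Fiscal year:
    2020
  • Funding country:
    United Kingdom
  • Period:
    2020 to (no data)
  • Project status:
    Not concluded

Project abstract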

On-line forums can support the formation of social communities with shared interests and needs. They can also have a negative side if groups of users support each other in divisive attitudes or false beliefs. The social fragmentation resulting from these so-called echo-chamber effects has been identified as an engine behind the rise of violence and extremism, political gridlock, and decreases in social mobility. This project is motivated by the observation that echo-chamber effects involve a gradual shift from more moderate language to more extreme language. Further, damage repair is difficult once extreme social fragmentation has occurred. The ability to use patterns in on-line language for early detection of on-line social fragmentation would thus be a major breakthrough in supporting earlier, and more effective, intervention against harmful trends in on-line forums.
We have identified two major challenges in creating this capability. First, current NLP methods are poor at understanding expressions whose meaning is a degree on a scale, such as a scale defined on the dimensions of cost, quality, honesty, or performance. For example, "rather racist", "really racist", and "incredibly racist" express different degrees of disapproval, but such differences are not adequately captured by current algorithms. This limitation is central to our problem, because echo-chamber effects often involve incremental exaggerations of factual claims, emotions, or attitudes. The second challenge is that methods for using linguistic content in the analysis of social behaviour are limited. While much research has uncovered systematic associations between word choices and social groups, very little has addressed relationships between linguistic inferences and social trends. However, tracking the gradual shifts towards semantic extremes in echo-chamber effects requires making certain linguistic inferences. This is because inferring which underlying dimension of meaning is relevant in any specific case critically depends on information about who is talking and what they are talking about. For example, "Liverpool is far better" might relate to a scale of cultural excellence in a discussion amongst music fans, but to a scale of costs amongst people who are discussing housing. A fundamental advance in the methodology for combining linguistic and social information is thus needed to characterise echo-chamber effects on-line and make predictions about risks of future fragmentation.
The project is a new collaboration between an experimental and computational linguist (the PI) and an expert in machine learning and social network analysis (the Co-I). Its components integrate the expertise of both collaborators. Advanced text-mining and data analytics will be used to generate the materials for a large-scale and experimentally normed data set of scalar expressions, using archives of the popular on-line forum Reddit. No normed data set of this type exists, and it will provide the training and test materials needed to develop and evaluate new algorithms. Using a modular work plan, the project team will first develop and validate separate algorithms to assess and predict the meanings of scalar expressions, and the level of fragmentation in the social network of Reddit users. These components will then be integrated using advanced graph-based machine learning methods. The primary outcome of the project will be a software package that will facilitate the work of on-line moderators by flagging subreddits or threads that display early stages of echo-chamber effects. The normed data set will also be extremely valuable for improving NLP applications that require nontrivial semantic inference, such as sentiment analysis, chatbots, and question-answering systems.
More generally, the project serves as a demonstration of advanced methodology for processing linguistic meaning in relation to social relationships and human behaviour.

Project outputs

Journal articles (10)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
DagoBERT: Generating Derivational Morphology with a Pretrained Language Model
  • DOI:
    10.18653/v1/2020.emnlp-main.316
  • Publication date:
    2020-05
  • Journal:
  • Impact factor:
    0
  • Authors:
    Valentin Hofmann;J. Pierrehumbert;Hinrich Schütze
  • Corresponding authors:
    Valentin Hofmann;J. Pierrehumbert;Hinrich Schütze
Superbizarre Is Not Superb: Derivational Morphology Improves BERT’s Interpretation of Complex Words
  • DOI:
    10.18653/v1/2021.acl-long.279
  • Publication date:
    2021-01
  • Journal:
  • Impact factor:
    0
  • Authors:
    Valentin Hofmann;J. Pierrehumbert;Hinrich Schütze
  • Corresponding authors:
    Valentin Hofmann;J. Pierrehumbert;Hinrich Schütze
DRew: Dynamically Rewired Message Passing with Delay
  • DOI:
    10.48550/arxiv.2305.08018
  • Publication date:
    2023-05
  • Journal:
  • Impact factor:
    0
  • Authors:
    Benjamin Gutteridge;Xiaowen Dong;Michael M. Bronstein;Francesco Di Giovanni
  • Corresponding authors:
    Benjamin Gutteridge;Xiaowen Dong;Michael M. Bronstein;Francesco Di Giovanni
Predicting COVID-19 cases using Reddit posts and other online resources
  • DOI:
  • Publication date:
    2021
  • Journal:
  • Impact factor:
    0
  • Authors:
    Drinkall F
  • Corresponding author:
    Drinkall F
Modeling Ideological Salience and Framing in Polarized Online Groups with Graph Neural Networks and Structured Sparsity
  • DOI:
    10.18653/v1/2022.findings-naacl.41
  • Publication date:
    2021
  • Journal:
  • Impact factor:
    0
  • Authors:
    Valentin Hofmann;Xiaowen Dong;J. Pierrehumbert;Hinrich Schütze
  • Corresponding author:
    Hinrich Schütze

Other grants held by Janet Pierrehumbert

FAW: Experimental and Computational Studies of Word Phonology
  • Grant number:
    9022484
  • Fiscal year:
    1991
  • Funding amount:
    $770,600
  • Project category:
    Continuing Grant
US-Sweden Cooperative Science: Intonation and Voice Source Characteristics
  • Grant number:
    8712375
  • Fiscal year:
    1988
  • Funding amount:
    $770,600
  • Project category:
    Standard Grant
The Use of Intonation in Automatic Speech Understanding
  • Grant number:
    8012248
  • Fiscal year:
    1980
  • Funding amount:
    $770,600
  • Project category:
    Standard Grant

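Similar National Natural Science Foundation of China (NSFC) grants

Molecular mechanism of the interaction between the meiotic cohesion complex and HORMA proteins in nematodes
  • Grant number:
    31801137
  • Approval year:
    2018
  • Funding amount:
    CNY 250,000
  • Project category:
    Young Scientists Fund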

Similar overseas grants

Uncovering Mechanisms of Racial Inequalities in ADRD: Psychosocial Risk and Resilience Factors for White Matter Integrity
  • Grant number:
    10676358
  • Fiscal year:
    2024
  • Funding amount:
    $770,600
  • Project category:
Improving outcomes for stigmatised groups via social cohesion
  • Grant number:
    MR/Y019741/1
  • Fiscal year:
    2024
  • Funding amount:
    $770,600
  • Project category:
    Fellowship
Demographic and life course drivers of social cohesion
  • Grant number:
    DE240100232
  • Fiscal year:
    2024
  • Funding amount:
    $770,600
  • Project category:
    Discovery Early Career Researcher Award
Civilisationist Mobilisation, Digital Technologies and Social Cohesion
  • Grant number:
    DP230100257
  • Fiscal year:
    2023
  • Funding amount:
    $770,600
  • Project category:
    Discovery Projects
Making social cohesion ecocentric through Indigenous language and song
  • Grant number:
    FT230100651
  • Fiscal year:
    2023
  • Funding amount:
    $770,600
  • Project category:
    ARC Future Fellowships
Style and cohesion in the Septuagint's kaige tradition: Ecclesiastes and Lamentations
  • Grant number:
    2881782
  • Fiscal year:
    2023
  • Funding amount:
    $770,600
  • Project category:
    Studentship
Rapid measurement of novel harm reduction housing on HIV risk, treatment uptake, drug use and supply
  • Grant number:
    10701309
  • Fiscal year:
    2023
  • Funding amount:
    $770,600
  • Project category:
A Training Development Plan for HIV-associated Behavioural Medicine
  • Grant number:
    10688316
  • Fiscal year:
    2023
  • Funding amount:
    $770,600
  • Project category:
Vanderbilt FIRST - Elevating Excellence and Transforming Institutional Culture
  • Grant number:
    10664626
  • Fiscal year:
    2023
  • Funding amount:
    $770,600
  • Project category:
Mechanisms of mitotic regulation
  • Grant number:
    10798363
  • Fiscal year:
    2023
  • Funding amount:
    $770,600
  • Project category: