Identifying & Classifying Bias in Cultural Heritage Catalogues: Applying Natural Language Processing to University of Edinburgh Archival Descriptions

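Basic Information

  • Grant number:
    2356289
  • Principal investigator:
  • Amount:
    --
  • Host institution:
  • Host institution country:
    United Kingdom
  • Project category:
    Studentship
  • Fiscal year:
    2020
  • Funding country:
    United Kingdom
  • Duration:
    2020 to (no data)
  • Project status:
    Completed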

Project Abstract

The objective of this project is to develop a context-informed approach to bias detection, executed as a series of case studies beginning with the University of Edinburgh's Archive. Motivated by separate yet related strands of research in Natural Language Processing (NLP) and Cultural Heritage, the project identifies an opportunity to improve large-scale, automated bias detection. Taking a cross-disciplinary approach, the project applies NLP and data visualisation to archival descriptions. NLP approaches such as topic modelling and sentiment analysis will be used to analyse and classify the language of the Archive's descriptions. Because bias is context-dependent, data visualisation provides a suitable way to present the results of the NLP analysis. Interactive data visualisations will present the results in their associated geographic areas and time periods, enabling people to see the associations that Archive items have with different types of bias. The project will propose a visualisation framework for presenting bias in human language content, which, to the author's knowledge, has yet to be proposed. Rather than eliminate bias, the project seeks to identify and classify it, arguing that bias deserves a place in cultural heritage institutions.
Bias, though problematic when one-sided, is informative when presented transparently. Bias communicates the perspective of specific groups of people during specific time periods in history; recording historical biases informs understandings of societal evolution and of the various perspectives that have existed on a topic [1]. Identifying different types of bias helps researchers understand how representative their dataset is: the presence of more types of bias suggests a more representative dataset.
This project seeks to develop techniques for identifying and classifying bias that will bring value to cultural heritage institutions and the public they serve, making bias transparent in human language content anywhere from an archival description to a social media post.
The project seeks to develop bias-detecting technology, beginning with a case study of free-text, human-written archival descriptions. Cataloguers first wrote archival descriptions on paper in the 1930s and then in databases beginning in the 1970s. Explicitly, the language of archival descriptions reflects their historical contexts, using terms considered racist, sexist or otherwise inappropriately biased today. Implicitly, missing information in archival descriptions regarding certain groups of people reflects historical biases. These types of explicit and implicit bias can be found in textual data beyond cultural heritage catalogues, such as in newspapers and social media posts. As a result, while improving the transparency of the Archive's descriptions, the outcomes of this project could also inform research on returning representative search results [5], implementing fair algorithms [2], and identifying bias in social media [3, 4].
References
1. Holterhoff, K. (2017). "From Disclaimer to Critique: Race and the Digital Image Archivist." Digital Humanities Quarterly 11.3. URL: http://digitalhumanities.org:8081/dhq/vol/11/3/000324/000324.html
2. IEEE. (2016). Ethically Aligned Design: A Vision for Prioritizing Human Wellbeing with Artificial Intelligence and Autonomous Systems. Version 1. http://standards.ieee.org/develop/indconn/ec/autonomous%20systems.html 12.05.2018
3. Recasens, M., Danescu-Niculescu-Mizil, C., Jurafsky, D. (2013). "Linguistic Models for Analyzing and Detecting Biased Language." Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics, 1650-1659.

Project Outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other Literature

吉治仁志 他: "トランスジェニックマウスによるTIMP-1の線維化促進機序"最新医学. 55. 1781-1787 (2000)
Hitoshi Yoshiji et al.: "Mechanism of fibrosis promotion by TIMP-1 in transgenic mice." Saishin Igaku 55, 1781-1787 (2000).
  • DOI:
  • Publication date:
  • Journal:
  • Impact factor:
    0
  • Authors:
  • Corresponding author:
LiDAR Implementations for Autonomous Vehicle Applications
  • DOI:
  • Publication date:
    2021
  • Journal:
  • Impact factor:
    0
  • Authors:
  • Corresponding author:
生命分子工学・海洋生命工学研究室
Biomolecular Engineering and Marine Bioengineering Laboratory
  • DOI:
  • Publication date:
  • Journal:
  • Impact factor:
    0
  • Authors:
  • Corresponding author:
吉治仁志 他: "イラスト医学&サイエンスシリーズ血管の分子医学"羊土社(渋谷正史編). 125 (2000)
Hitoshi Yoshiji et al.: "Illustrated Medicine & Science Series: Molecular Medicine of Blood Vessels." Yodosha (ed. 渋谷正史), 125 (2000).
  • DOI:
  • Publication date:
  • Journal:
  • Impact factor:
    0
  • Authors:
  • Corresponding author:
Effect of manidipine hydrochloride, a calcium antagonist, on isoproterenol-induced left ventricular hypertrophy: "Yoshiyama, M., Takeuchi, K., Kim, S., Hanatani, A., Omura, T., Toda, I., Akioka, K., Teragaki, M., Iwao, H. and Yoshikawa, J." Jpn Circ J. 62(1). 47-52 (1998)
  • DOI:
  • Publication date:
  • Journal:
  • Impact factor:
    0
  • Authors:
  • Corresponding author:

Other Grants

An implantable biosensor microsystem for real-time measurement of circulating biomarkers
  • Grant number:
    2901954
  • Fiscal year:
    2028
  • Funding amount:
    --
  • Project category:
    Studentship
Exploiting the polysaccharide breakdown capacity of the human gut microbiome to develop environmentally sustainable dishwashing solutions
  • Grant number:
    2896097
  • Fiscal year:
    2027
  • Funding amount:
    --
  • Project category:
    Studentship
A Robot that Swims Through Granular Materials
  • Grant number:
    2780268
  • Fiscal year:
    2027
  • Funding amount:
    --
  • Project category:
    Studentship
Likelihood and impact of severe space weather events on the resilience of nuclear power and safeguards monitoring.
  • Grant number:
    2908918
  • Fiscal year:
    2027
  • Funding amount:
    --
  • Project category:
    Studentship
Proton, alpha and gamma irradiation assisted stress corrosion cracking: understanding the fuel-stainless steel interface
  • Grant number:
    2908693
  • Fiscal year:
    2027
  • Funding amount:
    --
  • Project category:
    Studentship
Field Assisted Sintering of Nuclear Fuel Simulants
  • Grant number:
    2908917
  • Fiscal year:
    2027
  • Funding amount:
    --
  • Project category:
    Studentship
Assessment of new fatigue capable titanium alloys for aerospace applications
  • Grant number:
    2879438
  • Fiscal year:
    2027
  • Funding amount:
    --
  • Project category:
    Studentship
CDT year 1 so TBC in Oct 2024
  • Grant number:
    2879865
  • Fiscal year:
    2027
  • Funding amount:
    --
  • Project category:
    Studentship
Developing a 3D printed skin model using a Dextran - Collagen hydrogel to analyse the cellular and epigenetic effects of interleukin-17 inhibitors in
  • Grant number:
    2890513
  • Fiscal year:
    2027
  • Funding amount:
    --
  • Project category:
    Studentship
Understanding the interplay between the gut microbiome, behavior and urbanisation in wild birds
  • Grant number:
    2876993
  • Fiscal year:
    2027
  • Funding amount:
    --
  • Project category:
    Studentship

Similar Overseas Grants

Constructing and Classifying Pre-Tannakian Categories
  • Grant number:
    2401515
  • Fiscal year:
    2024
  • Funding amount:
    --
  • Project category:
    Standard Grant
Classifying and Understanding Remedies in Comparative Labour Law
  • Grant number:
    EP/Y036875/1
  • Fiscal year:
    2024
  • Funding amount:
    --
  • Project category:
    Research Grant
Efficient and effective methods for classifying massive time series data
  • Grant number:
    DP240100048
  • Fiscal year:
    2024
  • Funding amount:
    --
  • Project category:
    Discovery Projects
Classifying and localising future cancerous lesions
  • Grant number:
    2895295
  • Fiscal year:
    2023
  • Funding amount:
    --
  • Project category:
    Studentship
Multivariate machine learning analysis for identifying neuro-anatomical biomarkers of anorexia and classifying anorexia subtypes using MR datasets.
  • Grant number:
    23K14813
  • Fiscal year:
    2023
  • Funding amount:
    --
  • Project category:
    Grant-in-Aid for Early-Career Scientists
Classifying spaces, proper actions and stable homotopy theory
  • Grant number:
    EP/X038424/1
  • Fiscal year:
    2023
  • Funding amount:
    --
  • Project category:
    Research Grant
A Study on Classifying the Morphology and Elucidating the Function of Quotations for Teaching Academic Writing
  • Grant number:
    23K00625
  • Fiscal year:
    2023
  • Funding amount:
    --
  • Project category:
    Grant-in-Aid for Scientific Research (C)
Sensing intracranial bioimpedance through anatomic windows for classifying stroke type
  • Grant number:
    10667998
  • Fiscal year:
    2023
  • Funding amount:
    --
  • Project category:
Development and Validation of an Equitable Computable Phenotype for Classifying Pediatric Sleep Deficiency in Electronic Health Records
  • Grant number:
    10724442
  • Fiscal year:
    2023
  • Funding amount:
    --
  • Project category:
Classifying 4-manifolds
  • Grant number:
    EP/T028335/2
  • Fiscal year:
    2022
  • Funding amount:
    --
  • Project category:
    Research Grant