Reading concordances in the 21st century (RC21)

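Basic information

  • Grant number:
    AH/X002047/1
  • Principal investigator:
  • Amount:
    $360,700
  • Host institution:
  • Host institution country:
    United Kingdom
  • Project type:
    Research Grant
  • Fiscal year:
    2023
  • Funding country:
    United Kingdom
  • Duration:
    2023 to (no data)
  • Project status:
    Ongoing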

Project abstract

In today's digital world, the amount of text communicated in electronic form is ever-increasing and there is a growing need for approaches and methods to extract meanings from texts at scale. Corpus linguists have long been studying digitised texts and have established that much of language is characterised by recurring patterns. So the word 'eye' can appear together with words like 'cream' and 'test', or words like 'closed' and 'fixed'. In corpus linguistics, such patterns are identified with the help of concordances, i.e. displays that show many occurrences of a word, phrase or construction across a range of contexts in a compact format. However, lacking a well-established and clear-cut methodology, the art of reading concordances has not yet realised its full potential. At the same time, there has been very little innovation in algorithms in the concordance software packages available to corpus linguists. This project proposes an innovative approach to reading concordances in the 21st century. Through the collaboration between the University of Birmingham and Friedrich-Alexander-Universität Erlangen-Nürnberg we combine strengths in theoretical work in corpus linguistics with expertise in computational algorithms in order to develop a systematic methodology for reading concordances. We will develop tool-independent strategies for reading concordances and we will develop corresponding algorithms for the semi-automatic analysis of concordance lines. We will specifically implement the software FlexiConc to support the corpus linguist researcher in organising and interpreting concordances. To develop and test our approach, we will conduct two case studies. The first case study will focus on body language in fiction compared to non-fiction texts. The second case study will focus on political argumentation in social media, formalising its findings as corpus queries that can be used for automatic argumentation mining. 
Both case studies include a comparative dimension between English and German. Hence, they broaden out approaches to concordance reading which have been very focused on the English language so far. Through these case studies, we will establish an approach that not only provides innovation in corpus linguistics, but also has wider implications for the analysis of textual data at scale, while still retaining a humanities perspective. We will develop FlexiConc as open-source software, so that other researchers can use it as an off-the-shelf tool or integrate it into existing concordance tools or their own software environment. Both FlexiConc and our tool-independent approach to concordance analysis will have relevance beyond corpus linguistics, providing innovative approaches and algorithms for disciplines such as digital humanities and computational social science. We will raise awareness of the new possibilities in a variety of forms, for instance, through a project blog where users of our software can share their experience, and with the help of an advisory board of leading international experts. We will run training sessions at summer schools and conferences and make educational materials available online.
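The concordance display described in the abstract can be illustrated with a minimal keyword-in-context sketch. This is not FlexiConc and says nothing about the project's actual algorithms; the function names (`concordance`, `sort_by_right`, `render`), the fixed character window, and the sample sentences are all assumptions made for illustration. Sorting by the first word to the right of the node is used here only as one familiar way of making recurring patterns such as 'eye cream' and 'eye test' line up.

```python
import re
from typing import List, Tuple

# A concordance line: (left context, node word as matched, right context).
Line = Tuple[str, str, str]


def concordance(text: str, node: str, width: int = 30) -> List[Line]:
    """Collect every whole-word occurrence of `node` with `width` characters
    of context on each side (hypothetical helper, case-insensitive)."""
    pattern = re.compile(r"\b" + re.escape(node) + r"\b", re.IGNORECASE)
    lines = []
    for match in pattern.finditer(text):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        lines.append((left, match.group(0), right))
    return lines


def sort_by_right(lines: List[Line]) -> List[Line]:
    """Order lines by the first word of the right context, so that
    occurrences with the same right-hand collocate sit next to each other."""
    def key(line: Line) -> str:
        words = line[2].split()
        return words[0].lower() if words else ""
    return sorted(lines, key=key)


def render(lines: List[Line], width: int = 30) -> str:
    """Format lines with the node word aligned in a central column."""
    return "\n".join(
        f"{left:>{width}} {node} {right:<{width}}" for left, node, right in lines
    )


# Invented sample text, loosely based on the abstract's 'eye' example.
sample = (
    "She bought an eye cream. He booked an eye test. "
    "Her eyes closed slowly. His eye fixed on the door."
)

if __name__ == "__main__":
    print(render(sort_by_right(concordance(sample, "eye"))))
```

Because the pattern matches whole words only, 'eyes' is not retrieved in this sketch; a real tool would typically query lemmas or corpus annotations rather than raw strings.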

Project outputs

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other publications by Michaela Mahlberg

Defining the Carceral Characteristics of the ‘Dickensian prison’: A Corpus Stylistics Analysis of Dickens’s Novels
  • DOI:
  • Publication year:
    2023
  • Journal:
  • Impact factor:
    0.3
  • Authors:
    E. March; D. Moran; Matt Houlbrook; Y. Jewkes; Michaela Mahlberg
  • Corresponding author:
    Michaela Mahlberg
Corpus Stylistics, Norms and Comparisons
Local textual functions of move in newspaper story patterns
  • DOI:
  • Publication year:
    2009
  • Journal:
  • Impact factor:
    0
  • Authors:
    Michaela Mahlberg
  • Corresponding author:
    Michaela Mahlberg
Exploring text-initial words, clusters and concgrams in a newspaper corpus
Dickens, the suspended quotation and the corpus
  • DOI:
    10.1177/0963947011432058
  • Publication year:
    2012
  • Journal:
  • Impact factor:
    0.7
  • Authors:
    Michaela Mahlberg; Catherine Smith
  • Corresponding author:
    Catherine Smith

Other grants held by Michaela Mahlberg

CLiC Dickens - characterisation in the representation of speech and body language from a corpus stylistic perspective.
  • Grant number:
    AH/P504634/1
  • Fiscal year:
    2017
  • Funding amount:
    $360,700
  • Project type:
    Research Grant
CLiC Dickens - characterisation in the representation of speech and body language from a corpus stylistic perspective.
  • Grant number:
    AH/K005146/1
  • Fiscal year:
    2013
  • Funding amount:
    $360,700
  • Project type:
    Research Grant

Similar overseas grants

Pedagogical Effects of Convergent Concordances: Semantic Restructuring of English Catenative Verb Constructions
  • Grant number:
    21720214
  • Fiscal year:
    2009
  • Funding amount:
    $360,700
  • Project type:
    Grant-in-Aid for Young Scientists (B)
Comprehensive Studies in Medieval English Language and Literature-IX
  • Grant number:
    06301052
  • Fiscal year:
    1994
  • Funding amount:
    $360,700
  • Project type:
    Grant-in-Aid for Co-operative Research (A)
Comprehensive Studies in Medieval English Language and Literature-VIII
  • Grant number:
    04301055
  • Fiscal year:
    1992
  • Funding amount:
    $360,700
  • Project type:
    Grant-in-Aid for Co-operative Research (A)
Concording Medieval English Metrical Romances by Computer
  • Grant number:
    04610280
  • Fiscal year:
    1992
  • Funding amount:
    $360,700
  • Project type:
    Grant-in-Aid for General Scientific Research (C)
Concording Medieval English Metrical Romances by Computer
  • Grant number:
    01510277
  • Fiscal year:
    1989
  • Funding amount:
    $360,700
  • Project type:
    Grant-in-Aid for General Scientific Research (C)
Comprehensive Studies in Medieval English Language and Literature--VII
  • Grant number:
    01301058
  • Fiscal year:
    1989
  • Funding amount:
    $360,700
  • Project type:
    Grant-in-Aid for Co-operative Research (A)
Mathematical Sciences: Finite Group Actions on Surfaces, Representations of Knot Groups, and Knot Concordances
  • Grant number:
    8521057
  • Fiscal year:
    1986
  • Funding amount:
    $360,700
  • Project type:
    Standard Grant
Crosslinguistic Concordances of Child Language
  • Grant number:
    8208142
  • Fiscal year:
    1982
  • Funding amount:
    $360,700
  • Project type:
    Standard Grant
Mathematical Sciences: Symmetries of Surfaces, Link Groups, And Knot Concordances
  • Grant number:
    8121727
  • Fiscal year:
    1982
  • Funding amount:
    $360,700
  • Project type:
    Standard Grant
Reading concordances in the 21st century (RC21)
  • Grant number:
    508235423
  • Fiscal year:
  • Funding amount:
    $360,700
  • Project type:
    Research Grants