Lexicography in Motion: A History of the Tibetan Verb

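Basic information

  • Grant number:
    AH/P004644/1
  • Principal investigator:
  • Amount:
    $1.0099 million
  • Host institution:
  • Host institution country:
    United Kingdom
  • Project type:
    Research Grant
  • Fiscal year:
    2017
  • Funding country:
    United Kingdom
  • Duration:
    2017 to (no data)
  • Project status:
    Completed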

Project abstract

At one point or another, most language users rely on dictionaries as an authoritative source of lexicographical information. The first recorded dictionaries date back to Sumerian times (3rd millennium BCE), when they were compiled in the course of the linguistic convergence ('Sprachbund') between Akkadian and Sumerian. Since then, dictionaries have played a key role in intercultural communication and advanced scientific research across languages and nation states. Modern-day lexicography still serves these goals, but its methods have changed beyond recognition. Card catalogues have given way to databases and digital resources that offer access to a much larger pool of linguistic data. Today, practically all lexicographers deploy text corpora and corpus querying tools, both to sharpen the empirical base of definitions and to provide contextual examples for the end user. We propose to take advantage of these developments to create a corpus-based diachronic lexicon of Tibetan verbs.

Verbs play a central role in most sentences. Knowledge of the meaning of a verb leads to the arguments it requires and to the semantic roles the arguments, in turn, assume. Our lexicon draws on these links. It will allow the user to infer the complete structure of a sentence, based primarily on the terminal verb and the type of accompanying arguments. Furthermore, it charts the morphological and semantic changes of the verbs from the earliest records of Tibetan in the 8th century CE to contemporary times. Each verb is tracked to its earliest occurrence in the Old Tibetan material within the corpus and then compared with its applications in Classical and Modern Tibetan. Some of the existing dictionaries contain sporadic diachronic information, but this is never analysed or juxtaposed with other data. We propose to identify, examine and contextualise the diachronic evidence in a systematic fashion in order to obtain a better grasp of the evolution of the Tibetan language overall.

Corpus resources and processing tools constitute indispensable components in modern lexicography. For Tibetan, some of these tools are now available. 'Tibetan in Digital Communication' produced a large corpus of Tibetan language material, with part-of-speech tagging, spanning Old, Classical and Modern Tibetan. For the lexicon, we mine its content by running a series of automated queries drawing on Natural Language Processing (NLP) software. At first, we create an internal workflow tool. This allows us to categorise, both systematically and comprehensively, all the Tibetan verbs within the corpus. The different forms of the verbs are then grouped together in discrete entries; we analyse in depth verbal stems that display semantic ambiguity, repeated change or morphological irregularity. In parallel, we identify and label the arguments connected with each verb. We use this data, individually and cumulatively, to generate the citations and definitions for the lexicon.

The verb lexicon will become an indispensable asset for students and scholars alike, working on any one of the many facets of Tibetan culture, past and present. Outside academia, through its modern component, the lexicon improves access to Tibet-related content in the political and economic sphere. Development aid, humanitarian assistance, medical provisions and educational support are best delivered in conversation with the recipients. These conversations must be conducted in Tibetan. Very few Tibetans are fluent in English and most do not wish to communicate in Chinese. The software we create also advances the creation of new digital tools for Tibetan speakers. The IT sector is reluctant to invest in the language of a people that holds little political or economic influence. Its speakers are excluded from the vast resources of the web. Key to such technologies is the availability of a Basic Language Resource Kit (BLARK). The lexicon and predicate software bring us one step closer to the completion of a BLARK for Tibetan.

Project outputs

Journal articles (10)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
NER for Tibetan and Mongolian Newspapers
  • DOI:
    10.33774/coe-2021-xhw9l
  • Publication year:
    2021
  • Journal:
  • Impact factor:
    0
  • Authors:
    Barnett R
  • Corresponding author:
    Barnett R
A CG3 Constraint Grammar to detect verb dependencies for the Classical Tibetan Language
  • DOI:
  • Publication year:
  • Journal:
  • Impact factor:
    0
  • Authors:
    Faggionato, C
  • Corresponding author:
    Faggionato, C
Dictionaries as collections of data stories: an alternative post-editing model for historical corpus lexicography
  • DOI:
  • Publication year:
    2021
  • Journal:
  • Impact factor:
    0
  • Authors:
    Ligeia Lugli
  • Corresponding author:
    Ligeia Lugli
Lexicography in Motion
  • DOI:
  • Publication year:
    2019
  • Journal:
  • Impact factor:
    0
  • Authors:
    Rode, Samyo
  • Corresponding author:
    Rode, Samyo
Smart lexicography for under-resourced languages
  • DOI:
  • Publication year:
    2019
  • Journal:
  • Impact factor:
    0
  • Authors:
    Ligeia Lugli
  • Corresponding author:
    Ligeia Lugli

Other grants held by Ulrich Pagel

Tibetan in Digital Communication: Corpus Linguistics and Lexicography
  • Grant number:
    AH/J00152X/1
  • Fiscal year:
    2012
  • Funding amount:
    $1.0099 million
  • Project type:
    Research Grant
Locating Culture, Religion and the Self: A Study of the Tantric Community in Rebkong (East Tibet)
  • Grant number:
    AH/F009216/1
  • Fiscal year:
    2008
  • Funding amount:
    $1.0099 million
  • Project type:
    Research Grant

Similar overseas grants

Evaluating the contribution of plate boundary forces and mantle flow on the late Cenozoic North American plate motion history with coupled global models of mantle and lithosphere dynamics
  • Grant number:
    437134941
  • Fiscal year:
    2019
  • Funding amount:
    $1.0099 million
  • Project type:
    Research Grants
Elucidation of onset factors and attempts at prevention and treatment by motion analysis of sports injuries and sports history surveys
  • Grant number:
    17K01750
  • Fiscal year:
    2017
  • Funding amount:
    $1.0099 million
  • Project type:
    Grant-in-Aid for Scientific Research (C)
Collaborative Research: High Precision 40Ar/39Ar Geochronology and Paleomagnetism to Determine the History and Consequences of Louisville Mantle Plume Motion
  • Grant number:
    1154675
  • Fiscal year:
    2012
  • Funding amount:
    $1.0099 million
  • Project type:
    Standard Grant
Collaborative Research: High Precision 40Ar/39Ar Geochronology and Paleomagnetism to Determine the History and Consequences of Louisville Mantle Plume Motion
  • Grant number:
    1154094
  • Fiscal year:
    2012
  • Funding amount:
    $1.0099 million
  • Project type:
    Standard Grant
Motion Control of Hyper Redundant Robots with Learning Control Scheme Based on Linear Combination of Error History
  • Grant number:
    18560243
  • Fiscal year:
    2006
  • Funding amount:
    $1.0099 million
  • Project type:
    Grant-in-Aid for Scientific Research (C)
Motion Pictures as Research Materials for Educational History
  • Grant number:
    02301040
  • Fiscal year:
    1990
  • Funding amount:
    $1.0099 million
  • Project type:
    Grant-in-Aid for Co-operative Research (A)
Fault Zone and Laterally Heterogeneous Structure: Effects on the Rupture History and Strong Ground Motion of the Loma Prieta Earthquake
  • Grant number:
    9011439
  • Fiscal year:
    1990
  • Funding amount:
    $1.0099 million
  • Project type:
    Standard Grant
RUI: Motion History and Tectonic Significance of Major Faults in Southeastern New England
  • Grant number:
    8708244
  • Fiscal year:
    1987
  • Funding amount:
    $1.0099 million
  • Project type:
    Standard Grant
Relative Motion of the Pacific, Farallon and North American Plates and the Tectonic History of Western North America
  • Grant number:
    7926346
  • Fiscal year:
    1980
  • Funding amount:
    $1.0099 million
  • Project type:
    Standard Grant
Relative Motion of the Pacific, Farallon and North America Plates and the Tectonic History of Western North America
  • Grant number:
    7713673
  • Fiscal year:
    1978
  • Funding amount:
    $1.0099 million
  • Project type:
    Continuing grant