Tibetan in Digital Communication: Corpus Linguistics and Lexicography
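Basic Information
- Grant number: AH/J00152X/1
- Amount: $571,600
- Host institution country: United Kingdom
- Project category: Research Grant
- Fiscal year: 2012
- Funding country: United Kingdom
- Period: 2012 to (end date not available)
- Project status: Completed
Project Abstract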
In age, breadth and diversity of genre, Tibetan literature is in every way comparable to English. The Tibetan alphabet was invented in 650 CE. The earliest currently available securely dateable document dates to ca. 763 CE. Literary production has continued from that time unabated until today. Yet, the lexicographical resources of Tibetan are very inadequate and vastly inferior to what is available to English speakers. In total, students of Tibetan can draw on about a dozen dictionaries, most for Classical Tibetan. The scope of these lexicons tends to be poorly defined, and none of them meets the standards of scientific lexicography. Moreover, there is not a single work that covers the earliest period of Tibetan literature, Old Tibetan (650-1000 CE). The corpus and tools we propose to create will serve as the first step to advance the compilation of a comprehensive historical Tibetan dictionary akin to the Oxford English Dictionary.
In order to achieve this, we propose to produce a large corpus of Tibetan texts spanning the language's entire history, drawn from Old, Classical and Modern Tibetan. In the past, scholars used laborious collections of slips organised and stored in vast filing cabinets in order to compile large dictionaries. Advances in computational linguistics mean that this work can now be achieved more thoroughly and effectively through the creation of annotated digital corpora. But our corpus, once carefully analysed and tagged, will not only pave the way for the compilation of Tibetan dictionaries of hitherto inconceivable calibre, but it will also prepare the ground for a wide range of other significant research initiatives. By mounting it on the Web, scholars from a wide range of disciplines (history, religion, literature, linguistics, etc.) working with Tibetan language materials will be able to search it and use its content for their own research. It is thus likely to become foundational to a vast array of research initiatives, benefiting many different constituencies in academia.
Outside academia, in the modern world of electronic communication, our corpus will lay the foundation for the creation of new digital technologies for Tibetan (text messaging, automated translation, etc.). The high investment required to develop language software leaves languages without commercial or political power isolated and poorly resourced. Digital communication technologies are built on basic language processing tools (e.g., word-segmentation programmes, part-of-speech taggers) of the very type we propose to create. Our work will reduce the cost to develop such technologies and thus attract commercial interest. Although Tibetan is spoken by more than two million people, it is barely represented in electronic media as a spoken language. We seek to remedy this by creating an electronic resource that will restore to Tibetans, irrespective of their residence or adopted nationality, the choice to use their language as they see fit in a world that is increasingly shaped by digital communication.
Project Outputs
Journal articles (8)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Tibetan part-of-speech conundrums: ma? and yun ri?
- Published: 2015
- Impact factor: 0
- Authors: Hill, N
- Corresponding author: Hill, N
Disambiguating Tibetan verb stems with matrix verbs in the indirect infinitive construction
- Published: 2015
- Impact factor: 0
- Authors: Garrett, E
- Corresponding author: Garrett, E
Constituent Order in the Tibetan Noun Phrase
- Published: 2015
- Impact factor: 0
- Authors: Garrett, E
- Corresponding author: Garrett, E
A Rule-based Part-of-speech Tagger for Classical Tibetan
- DOI: 10.5070/h913224023
- Published: 2014
- Impact factor: 0
- Authors: Garrett, E
- Corresponding author: Garrett, E
Other grants by Ulrich Pagel
Lexicography in Motion: A History of the Tibetan Verb
- Grant number: AH/P004644/1
- Fiscal year: 2017
- Funding amount: $571,600
- Project category: Research Grant
Locating Culture, Religion and the Self: A Study of the Tantric Community in Rebkong (East Tibet)
- Grant number: AH/F009216/1
- Fiscal year: 2008
- Funding amount: $571,600
- Project category: Research Grant
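Similar NSFC Grants
Ultrasensitive, high-resolution Digital-CRISPR technology for amplification-free multiplex nucleic acid detection
- Approval year: 2021
- Funding amount: 300,000 CNY
- Project category: Young Scientists Fund
Research on intelligent operation and maintenance methods for CNC machine tools based on Digital Twin
- Grant number: 51875323
- Approval year: 2018
- Funding amount: 600,000 CNY
- Project category: General Program
Research on non-invasive prenatal testing for deafness based on digital PCR (digital-PCR) technology
- Grant number: LQ19H040016
- Approval year: 2018
- Funding amount: 0 CNY
- Project category: Provincial/municipal-level project
Research on a new method for detecting and typing circulating tumor cells based on Digital LAMP technology
- Grant number: 81702102
- Approval year: 2017
- Funding amount: 200,000 CNY
- Project category: Young Scientists Fund
Construction of a surface-engineering-based digital PCR quantitative analysis system for exosomes and its translational medicine research
- Grant number: 81702959
- Approval year: 2017
- Funding amount: 100,000 CNY
- Project category: Young Scientists Fund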
Similar Overseas Grants
A systematic study of the role of digital communication in adolescent identity development
- Grant number: 23K12890
- Fiscal year: 2023
- Funding amount: $571,600
- Project category: Grant-in-Aid for Early-Career Scientists
Setting for Scientific Communication Utilizing Digital Technology to Support Preschool Teachers' Philosophy of Early Childhood Education in the New Normal Age
- Grant number: 23K02797
- Fiscal year: 2023
- Funding amount: $571,600
- Project category: Grant-in-Aid for Scientific Research (C)
Scalable Digital Communication Intervention to Support Older Adults and Care-partners Transitioning Home After Major Surgery
- Grant number: 10710766
- Fiscal year: 2023
- Funding amount: $571,600
CLARUS: Building clarity and preventing bias in digital forensic examination, interorganisational communication and interaction
- Grant number: 10084279
- Fiscal year: 2023
- Funding amount: $571,600
- Project category: EU-Funded
Clarus: Building clarity and preventing bias in digital forensic examination, interorganisational communication and interaction
- Grant number: 10079452
- Fiscal year: 2023
- Funding amount: $571,600
- Project category: EU-Funded
TASK ORDER TITLED "NHLBI USER-CENTERED RESEARCH, ANALYTICS AND DIGITAL COMMUNICATION SOFTWARE TOOLS MANAGEMENT"
- Grant number: 10974172
- Fiscal year: 2023
- Funding amount: $571,600
Design and Implementation of Digital Signal Processing Algorithms for Communication and Biomedical Applications
- Grant number: RGPIN-2017-06626
- Fiscal year: 2022
- Funding amount: $571,600
- Project category: Discovery Grants Program - Individual
Collaborative Research: HCC: Small: Science communication in the ecosystem of digital media platforms
- Grant number: 2133963
- Fiscal year: 2022
- Funding amount: $571,600
- Project category: Standard Grant
Development of risk communication on infectious diseases for Vietnamese migrants in Japan: tuberculosis response by digital health approach
- Grant number: 22K10482
- Fiscal year: 2022
- Funding amount: $571,600
- Project category: Grant-in-Aid for Scientific Research (C)
Extending Digital Survivorship Needs Assessment Planning Tools to Enhance Communication in the Head and Neck Cancer Survivor-Caregiver-Provider Triad
- Grant number: 10831265
- Fiscal year: 2022
- Funding amount: $571,600













