A Parsed Historical Corpus of Modern English

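Basic Information

  • Award number:
    0418061
  • Principal investigator:
  • Amount:
    --
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Standard Grant
  • Fiscal year:
    2004
  • Funding country:
    United States
  • Start and end dates:
    2004-09-01 to 2011-02-28
  • Project status:
    Completed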

Project Abstract

With National Science Foundation support, Dr. Anthony Kroch and Dr. Beatrice Santorini will create a one-million-word, syntactically annotated electronic corpus of Modern English texts and text samples, covering the years from 1700 to 1900 C.E. This corpus is the fourth in a series created over the past decade by researchers at Penn and at the University of York, England. The three existing corpora cover 900 years of the history of English (ca. 800 C.E. to 1700 C.E.) and comprise more than 4.5 million words of running text, tagged for part of speech and annotated for syntactic structure. The corpora support studies that combine grammatical analysis with the statistical tracking of changes across time, at a level of detail and precision never before possible. As a result, they have been used in the study of many changes that the English language has undergone over the centuries. However, the linguistic changes that distinguish Modern English from the language of earlier periods only go to completion in the 18th and 19th centuries, so that the new corpus is needed for temporally complete investigations of the rise of Modern English.
The new corpus will reinforce in several ways the growing influence of electronic corpora on the language sciences. First, the study of language change is being revolutionized, because corpora provide, in manageable form, the data needed for large-scale and precise investigations. Because the corpora are publicly available, common data resources, different analyses can be compared with a confidence and precision not previously possible.
Second, the corpora permit historical work to interact more effectively with other areas of linguistics. Thus, recent discussions of the long-assumed connection between language change and language acquisition have been stimulated by corpus-based studies of the time course of language change. In addition, applied mathematicians have begun to use the detailed data that the parsed historical corpora provide to test dynamical systems models of how language change diffuses through a population.
Third, in computational linguistics, the corpora are beginning to prove useful as test beds for automatic processors, especially syntactic parsers. Crucially, they include a wide range of genres and levels of syntactic complexity, unlike the present standard test bed, the Penn Treebank. Because its syntax differs very little from Present-Day English, the proposed new corpus of Modern English will be particularly useful in this regard.
Finally, parsed corpora are beginning to have an impact on undergraduate education in linguistics. They are used in undergraduate research courses, where learning to use them reinforces students' understanding of language structure while, at the same time, the corpora provide datasets for research projects. Because its language poses no linguistic barriers to understanding, the present corpus will be particularly useful for this purpose.

Project Outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other Grants by Anthony Kroch

Testing and improving methods for efficient annotation through the construction of a large parsed corpus
  • Award number:
    1147499
  • Fiscal year:
    2012
  • Award amount:
    --
  • Project type:
    Standard Grant
Diachronic Generative Syntax Conference XIII: Support for meeting and associated workshop
  • Award number:
    1104768
  • Fiscal year:
    2011
  • Award amount:
    --
  • Project type:
    Standard Grant
U.S.-Iceland Linguistics Workshop on Change and Variation in Icelandic Syntax
  • Award number:
    0639066
  • Fiscal year:
    2006
  • Award amount:
    --
  • Project type:
    Standard Grant
SGER: Enriching Parser Output for Treebank Construction
  • Award number:
    0527116
  • Fiscal year:
    2005
  • Award amount:
    --
  • Project type:
    Standard Grant
The Emergence of Modern English Syntax
  • Award number:
    9905488
  • Fiscal year:
    1999
  • Award amount:
    --
  • Project type:
    Standard Grant
The Historical Syntax of Middle English from a Comparative Perspective
  • Award number:
    9511368
  • Fiscal year:
    1996
  • Award amount:
    --
  • Project type:
    Continuing grant
Head/Complement Order in the History of the West Germanic Clause
  • Award number:
    8919701
  • Fiscal year:
    1990
  • Award amount:
    --
  • Project type:
    Standard Grant

Similar International Grants

Building a parsed historical corpus to investigate word-order change and variation
  • Award number:
    2314522
  • Fiscal year:
    2023
  • Award amount:
    --
  • Project type:
    Standard Grant
Study to generate a huge text corpus of Japanese in Edo-period and to recognize historical cursive
  • Award number:
    21K12008
  • Fiscal year:
    2021
  • Award amount:
    --
  • Project type:
    Grant-in-Aid for Scientific Research (C)
Syntactic and Semantic Information Annotation on the Corpus of Historical Japanese
  • Award number:
    17H00917
  • Fiscal year:
    2017
  • Award amount:
    --
  • Project type:
    Grant-in-Aid for Scientific Research (A)
A Historical study of word formation using a corpus with derivational and compound information
  • Award number:
    17K13471
  • Fiscal year:
    2017
  • Award amount:
    --
  • Project type:
    Grant-in-Aid for Young Scientists (B)
Historical Studies of Hiragana upon Integration of the Hiragana Grapheme Database and the 19th-Century Textbook Corpus of Hiragana Grapheme
  • Award number:
    17K13462
  • Fiscal year:
    2017
  • Award amount:
    --
  • Project type:
    Grant-in-Aid for Young Scientists (B)
A methodology for qualitative analysis of historical corpus data: New insights about language change from a micro-analytical approach
  • Award number:
    AH/N002911/1
  • Fiscal year:
    2016
  • Award amount:
    --
  • Project type:
    Fellowship
A Study of Compound Functional Expressions and Collocations in Medieval and Early Modern Japanese Based on the Corpus of Historical Japanese
  • Award number:
    16K16850
  • Fiscal year:
    2016
  • Award amount:
    --
  • Project type:
    Grant-in-Aid for Young Scientists (B)
Historical Pragmatic study of politeness in speech acts: Corpus-based approach
  • Award number:
    16K02780
  • Fiscal year:
    2016
  • Award amount:
    --
  • Project type:
    Grant-in-Aid for Scientific Research (C)
Refinement and utilization of the Corpus of Historical Japanese through multilayered extension
  • Award number:
    15H01883
  • Fiscal year:
    2015
  • Award amount:
    --
  • Project type:
    Grant-in-Aid for Scientific Research (A)
Construction and Analysis of Adjective's Syntactic and Semantic Annotation of the 'Corpus of Historical Japanese' Heian Period Series
  • Award number:
    15K16764
  • Fiscal year:
    2015
  • Award amount:
    --
  • Project type:
    Grant-in-Aid for Young Scientists (B)