ITR: American National Corpus: A Primary Resource for Linguistics Research
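Basic information
- Award number: 0218609
- Principal investigator:
- Amount: $285,100
- Host institution:
- Host institution country: United States
- Project type: Continuing Grant
- Fiscal year: 2002
- Funding country: United States
- Project period: 2002-09-01 to 2005-12-31
- Project status: Completed
- Source:
- Keywords: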
Project abstract
With National Science Foundation support, Drs. Nancy Ide and Randi Reppen will conduct a three-year project to extensively annotate a 10-million-word portion of the American National Corpus (ANC). The ANC consists of both spoken and written language from North America across a range of registers, such as planned speeches, conversations, fiction, and newspapers. This research project uses techniques from both computational linguistics and corpus linguistics to annotate the ANC for a range of grammatical and semantic characteristics. Specifically, the project seeks to accomplish three major objectives: 1) develop automatic tools for annotating various elements and structures in the corpus; 2) create a 'gold standard' portion of the ANC, consisting of 10 million words in which the markup, annotation, and parts of speech have been hand-validated; and 3) describe the conceptual and meaning relations among words in the ANC within the framework of the 'semantic web', thus greatly enhancing analysis and retrieval capabilities. The investigators are to carry out this research through a variety of software programs (many created specifically for this project), and through extensive human/computer interaction to hand-validate the computer-assigned labels.
This research project is important for several reasons. First, the resulting corpus will be the first publicly available tagged corpus of spoken and written American English. Second, because the annotation of the corpus will be hand-validated, the resulting product will approach 100% accuracy. With this carefully annotated 10-million-word corpus, language researchers will be able to address a number of structural and linguistic relationships across texts that previously could not be addressed. Since the corpus will be hand-validated, researchers can use this information to develop models for processing previously unseen texts. The ANC will be readily available to researchers via the web.
In addition to the annotated corpus, the project will make available to researchers a suite of tools designed to retrieve information from the corpus.
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Nancy Ide
Outline of a database model for electronic dictionaries
- DOI: 10.5555/3170967.3170995
- Published: 1991
- Journal:
- Impact factor: 0
- Authors: Nancy Ide; J. Véronis; J. Maitre
- Corresponding author: J. Maitre
A statistical measure of theme and structure
- DOI: 10.1007/bf02176632
- Published: 1989
- Journal:
- Impact factor: 0
- Authors: Nancy Ide
- Corresponding author: Nancy Ide
The Language Application Grid and Galaxy
- DOI:
- Published: 2016
- Journal:
- Impact factor: 0
- Authors: Nancy Ide; Keith Suderman; J. Pustejovsky; M. Verhagen; C. Cieri
- Corresponding author: C. Cieri
Community Standards for Linguistically-Annotated Resources
- DOI: 10.1007/978-94-024-0881-2_4
- Published: 2017
- Journal:
- Impact factor: 0
- Authors: Nancy Ide; N. Calzolari; Judith Eckle; D. Gibbon; Sebastian Hellmann; Ki Yong Lee; Joakim Nivre; Laurent Romary
- Corresponding author: Laurent Romary
Preface to the special issue: LREC 2012: state of the art in resource development and evaluation
- DOI: 10.1007/s10579-014-9289-9
- Published: 2014-11-22
- Journal:
- Impact factor: 1.800
- Authors: Nancy Ide; Nicoletta Calzolari
- Corresponding author: Nicoletta Calzolari
Other grants by Nancy Ide
EAGER: Collaborative Research: Mining Scientific Literature with the LAPPS Grid
- Award number: 1811123
- Fiscal year: 2018
- Funding amount: $285,100
- Project type: Standard Grant
SI2-SSI: The Language Application Grid: A Framework for Rapid Adaptation and Reuse
- Award number: 1147944
- Fiscal year: 2012
- Funding amount: $285,100
- Project type: Standard Grant
RUI: CRI: CI-ADDO-EN: Collaborative Research: MASC: A Community Resource For and By the People
- Award number: 1059312
- Fiscal year: 2011
- Funding amount: $285,100
- Project type: Standard Grant
INTEROP: Sustainable Interoperability for Language Technology
- Award number: 0753069
- Fiscal year: 2008
- Funding amount: $285,100
- Project type: Continuing Grant
CRI: CRD A Richly Annotated Resource for Language Processing and Linguistics Research
- Award number: 0708952
- Fiscal year: 2007
- Funding amount: $285,100
- Project type: Continuing Grant
Collaborative Research: CRI: An Open Linguistic Infrastructure for American English
- Award number: 0551601
- Fiscal year: 2006
- Funding amount: $285,100
- Project type: Standard Grant
CRI: An Open Linguistic Infrastructure for American English
- Award number: 0454130
- Fiscal year: 2005
- Funding amount: $285,100
- Project type: Standard Grant
XMELLT: Cross-lingual Multi-word Expression Lexicons for Language Technology
- Award number: 9982069
- Fiscal year: 2000
- Funding amount: $285,100
- Project type: Standard Grant
American National Corpus: Planning and Exploration Workshop
- Award number: 9978422
- Fiscal year: 1999
- Funding amount: $285,100
- Project type: Standard Grant
Workshop: Exploring US-Romanian Collaboration in Language Technology
- Award number: 9978601
- Fiscal year: 1999
- Funding amount: $285,100
- Project type: Standard Grant
Similar overseas grants
Conference: Grantsmanship Workshop at American Indian Science and Engineering Society 2023 National Conference
- Award number: 2334585
- Fiscal year: 2023
- Funding amount: $285,100
- Project type: Standard Grant
Collaborative Research: Travel: Geosciences United - A joint Technical Conference of the National Association of Black Geoscientists and the American Geophysical Union
- Award number: 2334206
- Fiscal year: 2023
- Funding amount: $285,100
- Project type: Standard Grant
Collaborative Research: Travel: Geosciences United - A joint Technical Conference of the National Association of Black Geoscientists and the American Geophysical Union
- Award number: 2334207
- Fiscal year: 2023
- Funding amount: $285,100
- Project type: Standard Grant
Representations of Waste People in the New World: American National Identity in the Age of the Nation-State and Beyond
- Award number: 22K00491
- Fiscal year: 2022
- Funding amount: $285,100
- Project type: Grant-in-Aid for Scientific Research (C)
The 2024 American National Election Studies (ANES)
- Award number: 2209438
- Fiscal year: 2022
- Funding amount: $285,100
- Project type: Continuing Grant
Student Travel Support to 3D Printing of Polymeric Composites & Hybrid Systems Symposium at American Chemical Society National Meeting; San Diego, California; March 20-24, 2022
- Award number: 2129185
- Fiscal year: 2021
- Funding amount: $285,100
- Project type: Standard Grant
SBP: The Arizona Youth Project: (Re)defining American Identity and National Belonging
- Award number: 1948197
- Fiscal year: 2020
- Funding amount: $285,100
- Project type: Standard Grant
RAPID: Exploring COVID and the Effects on U.S. Education: Evidence from a National Survey of American Households
- Award number: 2037179
- Fiscal year: 2020
- Funding amount: $285,100
- Project type: Standard Grant
Symposium on "Polymer Processing: Nanomanufacturing and Nanofabrication" at Spring American Chemical Society (ACS) National Meeting; Philadelphia, Pennsylvania; March 22-26, 2020
- Award number: 2002318
- Fiscal year: 2020
- Funding amount: $285,100
- Project type: Standard Grant
Changes in Weight and Physical Function for Older African American Women in National, Peer-Led, Community-Based Weight Loss Program
- Award number: 10557163
- Fiscal year: 2019
- Funding amount: $285,100
- Project type: