Collaborative Research: CRI: An Open Linguistic Infrastructure for American English
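Basic Information
- Grant number: 0551601
- Principal investigator:
- Amount: --
- Host institution:
- Host institution country: United States
- Project category: Standard Grant
- Fiscal year: 2006
- Funding country: United States
- Project period: 2006-03-01 to 2008-12-31
- Project status: Completed
- Source:
- Keywords:
Project Abstract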
This project, which supports research in computational linguistics, plans to enhance the American National Corpus (ANC) with an open linguistic infrastructure that will add multiple manual and automatic annotations to a portion of the ANC and provide free access to these annotations in a common XML data format via a project website. The following activities are envisioned:
- Incorporation of automatic annotations derived from freely available tools, mapped into the ANC XML format,
- Syntactic and named entity annotation of a 10-million-word gold standard corpus, with partial manual annotation,
- Hand-corrected automatic WordNet and FrameNet annotation for a portion of the gold standard corpus,
- Enhancement of automatic annotation performance via experimentation with machine learning techniques, and
- Development of a web interface for users to download the above annotations and to upload new annotations of the ANC.
This work, which describes methods for internal and external evaluation of the resources and tools developed, plans to create a diverse, richly and multiply annotated corpus of natural language, along with tools to access it. The full project would be the first large-scale effort of this kind, developing a 100-million-word ANC and providing a 10-million-word subset annotated for syntax, named entities, and semantic categories from WordNet (WN) and FrameNet (FN). The annotated data will be balanced across different genres of text. One activity of the planning award consists of harmonizing all three resources (ANC, WN, and FN) and maximally exploiting their respective strengths. The other involves the continued development of the ANC, which, with the addition of a wide range of linguistic annotations, will serve as a resource for language processing research and applications for the NLP community.
The planning project undertakes the following activities:
- Creation and annotation of WN senses and FN frames,
- Planning meetings,
- Further research into and experimentation with methods and software to enhance automatic annotation, and
- Outreach to the US computational linguistics community.
Broader Impact: Full completion of this work will further enhance the ANC by creating a comprehensive linguistic infrastructure for American English. The availability of a massive, richly annotated corpus of American English has impacts at many levels and across several areas, including computational linguistics and natural language processing, corpus linguistics, cross-linguistic studies, dialect studies, language acquisition, and materials development for both English language students and teacher training.
Project Outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Nancy Ide
The Language Application Grid and Galaxy
- DOI:
- Publication date: 2016
- Journal:
- Impact factor: 0
- Authors: Nancy Ide; Keith Suderman; J. Pustejovsky; M. Verhagen; C. Cieri
- Corresponding author: C. Cieri
A statistical measure of theme and structure
- DOI: 10.1007/bf02176632
- Publication date: 1989
- Journal:
- Impact factor: 0
- Authors: Nancy Ide
- Corresponding author: Nancy Ide
Outline of a database model for electronic dictionaries
- DOI: 10.5555/3170967.3170995
- Publication date: 1991
- Journal:
- Impact factor: 0
- Authors: Nancy Ide; J. Véronis; J. Maitre
- Corresponding author: J. Maitre
Community Standards for Linguistically-Annotated Resources
- DOI: 10.1007/978-94-024-0881-2_4
- Publication date: 2017
- Journal:
- Impact factor: 0
- Authors: Nancy Ide; N. Calzolari; Judith Eckle; D. Gibbon; Sebastian Hellmann; Ki Yong Lee; Joakim Nivre; Laurent Romary
- Corresponding author: Laurent Romary
Preface to the special issue: LREC 2012: state of the art in resource development and evaluation
- DOI: 10.1007/s10579-014-9289-9
- Publication date: 2014-11-22
- Journal:
- Impact factor: 1.800
- Authors: Nancy Ide; Nicoletta Calzolari
- Corresponding author: Nicoletta Calzolari
Other grants by Nancy Ide
EAGER: Collaborative Research: Mining Scientific Literature with the LAPPS Grid
- Grant number: 1811123
- Fiscal year: 2018
- Funding amount: --
- Project category: Standard Grant
SI2-SSI: The Language Application Grid: A Framework for Rapid Adaptation and Reuse
- Grant number: 1147944
- Fiscal year: 2012
- Funding amount: --
- Project category: Standard Grant
RUI: CRI: CI-ADDO-EN: Collaborative Research: MASC: A Community Resource For and By the People
- Grant number: 1059312
- Fiscal year: 2011
- Funding amount: --
- Project category: Standard Grant
INTEROP: Sustainable Interoperability for Language Technology
- Grant number: 0753069
- Fiscal year: 2008
- Funding amount: --
- Project category: Continuing Grant
CRI: CRD A Richly Annotated Resource for Language Processing and Linguistics Research
- Grant number: 0708952
- Fiscal year: 2007
- Funding amount: --
- Project category: Continuing Grant
CRI: An Open Linguistic Infrastructure for American English
- Grant number: 0454130
- Fiscal year: 2005
- Funding amount: --
- Project category: Standard Grant
ITR: American National Corpus: A Primary Resource for Linguistics Research
- Grant number: 0218609
- Fiscal year: 2002
- Funding amount: --
- Project category: Continuing Grant
XMELLT: Cross-lingual Multi-word Expression Lexicons for Language Technology
- Grant number: 9982069
- Fiscal year: 2000
- Funding amount: --
- Project category: Standard Grant
American National Corpus: Planning and Exploration Workshop
- Grant number: 9978422
- Fiscal year: 1999
- Funding amount: --
- Project category: Standard Grant
Workshop: Exploring US-Romanian Collaboration in Language Technology
- Grant number: 9978601
- Fiscal year: 1999
- Funding amount: --
- Project category: Standard Grant
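Similar NSFC grants
Research on Quantum Field Theory without a Lagrangian Description
- Grant number: 24ZR1403900
- Approval year: 2024
- Funding amount: CNY 0
- Project category: Provincial/municipal-level project
Cell Research
- Grant number: 31224802
- Approval year: 2012
- Funding amount: CNY 240,000
- Project category: Special Fund Project
Cell Research
- Grant number: 31024804
- Approval year: 2010
- Funding amount: CNY 240,000
- Project category: Special Fund Project
Cell Research
- Grant number: 30824808
- Approval year: 2008
- Funding amount: CNY 240,000
- Project category: Special Fund Project
Research on the Rapid Growth Mechanism of KDP Crystal
- Grant number: 10774081
- Approval year: 2007
- Funding amount: CNY 450,000
- Project category: General Program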
Similar overseas grants
CRI: CI-EN: Collaborative Research: mResearch: A platform for Reproducible and Extensible Mobile Sensor Big Data Research
- Grant number: 1822935
- Fiscal year: 2018
- Funding amount: --
- Project category: Standard Grant
CRI: CI-New: Collaborative Research: Extensible, Software Enabled Unmanned Aerial Vehicles
- Grant number: 1823230
- Fiscal year: 2018
- Funding amount: --
- Project category: Continuing Grant
CRI: CI-EN: Collaborative Research: OpenNetVM: A Software Platform Enabling Network Function Virtualization Research
- Grant number: 1823236
- Fiscal year: 2018
- Funding amount: --
- Project category: Standard Grant
CRI: CI-EN: Collaborative Research: An Experimental Infrastructure and a Database of Real Faults to Foster Reproducibility in Software Engineering Research
- Grant number: 1929215
- Fiscal year: 2018
- Funding amount: --
- Project category: Standard Grant
CRI: CI-SUSTAIN: Collaborative Research: Sustaining Lemur Project Resources for the Long-Term
- Grant number: 1822986
- Fiscal year: 2018
- Funding amount: --
- Project category: Standard Grant
CRI: CI-EN: Collaborative Research: An Experimental Infrastructure and a Database of Real Faults to Foster Reproducibility in Software Engineering Research
- Grant number: 1823172
- Fiscal year: 2018
- Funding amount: --
- Project category: Standard Grant
CRI: CI-New: Collaborative Research: NJR: A Normalized Java Resource
- Grant number: 1823227
- Fiscal year: 2018
- Funding amount: --
- Project category: Standard Grant
CRI: CI-EN: Collaborative Research: mResearch: A platform for Reproducible and Extensible Mobile Sensor Big Data Research
- Grant number: 1823221
- Fiscal year: 2018
- Funding amount: --
- Project category: Standard Grant
CRI: CI-SUSTAIN: Collaborative Research: CiteSeerX: Toward Sustainable Support of Scholarly Big Data
- Grant number: 1823288
- Fiscal year: 2018
- Funding amount: --
- Project category: Standard Grant
CRI: CI-SUSTAIN: Collaborative Research: CiteSeerX: Toward Sustainable Support of Scholarly Big Data
- Grant number: 1853919
- Fiscal year: 2018
- Funding amount: --
- Project category: Standard Grant