Collaborative Proposal: Using the Web as a Corpus for Empirical Linguistic Research

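Basic Information

  • Award number:
    0113641
  • Principal investigator:
  • Amount:
    $150,000
  • Host institution:
  • Host institution country:
    United States
  • Award type:
    Standard Grant
  • Fiscal year:
    2001
  • Funding country:
    United States
  • Duration:
    2001-09-01 to 2004-08-31
  • Project status:
    Completed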

Project Abstract

This project will develop tools that make it possible to retrieve naturally occurring sentences from the World Wide Web on the basis of lexical content and syntactic structure, providing linguists with an immediate, easily accessible source of raw linguistic data. The PIs will investigate specific linguistic hypotheses at the lexical semantics/syntax interface as an illustrative application of these tools. At a high level, the planned work constitutes an important step toward a new paradigm for linguistic research. Rather than relying entirely on introspective data generated by the linguist who is trying to (dis)prove a particular hypothesis, Web-enabled linguistics research will draw on the methodology and the tools developed by the PIs to supply naturally occurring data on which theories can rest. With regard to specific linguistic questions, the goal is to provide an explanation of the rules and constraints that govern three transitivity alternations (Middle, Unaccusative, Unspecified Object Deletion), and the PIs expect data made available by their tools to shed light on the "grey" area between competence and performance, that is, the linguistic behavior that seems to fall outside of rule-governed behavior. Although naturally occurring data are not accorded great emphasis in generative syntax, the use of text corpora has a tradition in the greater linguistic enterprise. An explosive new phenomenon in the world of naturally occurring text, the World Wide Web is an essentially untapped resource that embodies the rich and dynamic nature of language, presenting a data resource of unparalleled size and diversity.

Project Outputs

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other publications by Philip Resnik

A multi-modal approach for identifying schizophrenia using cross-modal attention
  • DOI:
    10.48550/arxiv.2309.15136
  • Published:
    2023
  • Journal:
  • Impact factor:
    0
  • Authors:
    Gowtham Premananth;Yashish M. Siriwardena;Philip Resnik;Carol Y. Espy
  • Corresponding author:
    Carol Y. Espy
Computationally Scalable and Clinically Sound: Laying the Groundwork to Use Machine Learning Techniques for Social Media and Language Data in Predicting Psychiatric Symptoms
  • DOI:
    10.1016/j.biopsych.2022.02.146
  • Published:
    2022-05-01
  • Journal:
  • Impact factor:
  • Authors:
    Deanna Kelly;Glen Coppersmith;John Dickerson;Carol Espy-Wilson;Hanna Michel;Philip Resnik
  • Corresponding author:
    Philip Resnik
Using Intrinsic and Extrinsic Metrics to Evaluate Accuracy and Facilitation in Computer-assisted Coding
  • DOI:
  • Published:
    2006
  • Journal:
  • Impact factor:
    0
  • Authors:
    Philip Resnik;Michael Niv;Michael Nossal;Gregory Schnitzer;Jean Stoner;Andrew Kapit;Richard Toren
  • Corresponding author:
    Richard Toren
A Psycholinguistics-Inspired Method to Counter IP Theft using Fake Documents
  • DOI:
  • Published:
    2024
  • Journal:
  • Impact factor:
    2.5
  • Authors:
    Natalia Denisenko;Youzhi Zhang;Chiara Pulice;Shohini Bhattasali;Sushil Jajodia;Philip Resnik;V. S. Subrahmanian
  • Corresponding author:
    V. S. Subrahmanian
Selection and information: a class-based approach to lexical relationships
  • DOI:
  • Published:
    1993-01
  • Journal:
  • Impact factor:
    0
  • Authors:
    Philip Resnik
  • Corresponding author:
    Philip Resnik

Other grants by Philip Resnik

RI: Small: Modeling Co-Decisions: A Computational Framework Using Language and Metadata
  • Award number:
    2008761
  • Fiscal year:
    2020
  • Award amount:
    $150,000
  • Award type:
    Standard Grant
RAPID: Advanced Topic Modeling Methods to Analyze Text Responses in COVID-19 Survey Data
  • Award number:
    2031736
  • Fiscal year:
    2020
  • Award amount:
    $150,000
  • Award type:
    Standard Grant
SoCS: Collaborative Research: Data Driven, Computational Models for Discovery and Analysis of Framing
  • Award number:
    1211153
  • Fiscal year:
    2012
  • Award amount:
    $150,000
  • Award type:
    Standard Grant
SGER: Exploiting Alternative Packagings of Source Meaning in Statistical Machine Translation
  • Award number:
    0838801
  • Fiscal year:
    2008
  • Award amount:
    $150,000
  • Award type:
    Continuing Grant
Workshop: Student Research in Computational Linguistics, at the ACL'2000 Conference
  • Award number:
    0097529
  • Fiscal year:
    2000
  • Award amount:
    $150,000
  • Award type:
    Standard Grant

Similar overseas grants

ECLIPSE/Collaborative Proposal: Studying Microwave-Plasma interactions at Solid Interfaces Using Microwave Microstrip Architectures
  • Award number:
    2206769
  • Fiscal year:
    2022
  • Award amount:
    $150,000
  • Award type:
    Standard Grant
ECLIPSE/Collaborative Proposal: Studying Microwave-Plasma Interactions at Solid Interfaces Using Microwave Microstrip Architectures
  • Award number:
    2206546
  • Fiscal year:
    2022
  • Award amount:
    $150,000
  • Award type:
    Standard Grant
Collaborative Proposal: Linking the topographic features of bio-inspired undulated cylinders to their force reduction properties using critical points
  • Award number:
    2037582
  • Fiscal year:
    2021
  • Award amount:
    $150,000
  • Award type:
    Standard Grant
Collaborative Proposal: MRA: Using NEON data to elucidate the ecological effects of global environmental change on phenology across time and space
  • Award number:
    2017463
  • Fiscal year:
    2021
  • Award amount:
    $150,000
  • Award type:
    Standard Grant
Collaborative Proposal: MRA: Using NEON data to elucidate the ecological effects of global environmental change on phenology across time and space
  • Award number:
    2017740
  • Fiscal year:
    2021
  • Award amount:
    $150,000
  • Award type:
    Standard Grant
Collaborative Proposal: Plastic Spiraling In River Networks (Plastic-SIReN): Determining the controls of watershed plastic fluxes using a field and modeling approach
  • Award number:
    2113333
  • Fiscal year:
    2021
  • Award amount:
    $150,000
  • Award type:
    Standard Grant
Collaborative Proposal: MRA: Using NEON data to elucidate the ecological effects of global environmental change on phenology across time and space
  • Award number:
    2017785
  • Fiscal year:
    2021
  • Award amount:
    $150,000
  • Award type:
    Standard Grant
Collaborative Proposal: Plastic Spiraling In River Networks (Plastic-SIReN): Determining the controls of watershed plastic fluxes using a field and modeling approach
  • Award number:
    2113338
  • Fiscal year:
    2021
  • Award amount:
    $150,000
  • Award type:
    Standard Grant
CRCNS Research Proposal: Collaborative Research: Evaluating Machine Learning Architectures Using a Massive Benchmark Dataset of Brain Responses to Natural Scenes
  • Award number:
    2138972
  • Fiscal year:
    2020
  • Award amount:
    $150,000
  • Award type:
    Standard Grant
Collaborative Proposal: WoU-MMA: Observations of Gravitational Wave Sources Using the Long Wavelength Array
  • Award number:
    2011731
  • Fiscal year:
    2020
  • Award amount:
    $150,000
  • Award type:
    Standard Grant