SGER: Using Text Coherence and Verbal Valence in Long-Distance N-grams

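Basic Information

Award number:
9704046
Principal investigator:
Daniel Jurafsky
Amount:
$50,000
Host institution:
University of Colorado at Boulder
Host institution country:
United States
Award type:
Standard Grant
Fiscal year:
1997
Funding country:
United States
Period:
1997-01-15 to 1997-12-31
Project status:
Completed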

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=9704046&HistoricalAwards=false
Keywords:
SGER Using Text Coherence Verbal

Project Abstract

Building better speech recognizers requires augmenting n-gram grammars with sophisticated yet probabilistic linguistic knowledge. This project is building probabilistic models of two important pieces of syntactic/semantic knowledge: verb-argument constraints and semantic text coherence. (1) Verbs place strong constraints on the syntax and semantics of their arguments. This project is computing probabilities for the different argument structures that can co-occur with different verbs, and using these probabilities to augment standard trigram language models. (2) Texts and discourses tend to be semantically coherent; in particular, the words that occur in a text tend to be semantically related to each other. This project is applying a model of word meaning called Latent Semantic Analysis (LSA) to automatic speech recognition (ASR) language models (LMs). In LSA, a word-similarity metric is defined by computing a large matrix of word co-occurrence probabilities, which are then smoothed via Singular Value Decomposition, resulting in a generalized measure of semantic word similarity. Trigram models can then increase the probability that similar words will occur near each other. Building these two stochastic models of linguistic knowledge, besides possible application in speech recognition LMs, word-sense disambiguation, or parsing, also helps bridge the gap between the structural models used in linguistics and the statistical models of speech engineering.
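
The LSA step described in the abstract (a co-occurrence matrix smoothed by Singular Value Decomposition, then used to compare words) can be illustrated with a minimal sketch. This is not the project's implementation: the toy counts, the function names `lsa_word_vectors` and `similarity`, the choice of cosine similarity, and the use of NumPy are all assumptions made for illustration.

```python
import numpy as np


def lsa_word_vectors(counts, k):
    """Return rank-k word vectors from a word-by-context count matrix.

    Rows are words, columns are contexts (for example, documents).
    """
    counts = np.asarray(counts, dtype=float)
    # Turn raw counts into co-occurrence probabilities.
    probs = counts / counts.sum()
    # Thin SVD: probs = u @ diag(s) @ vt, singular values in descending order.
    u, s, _vt = np.linalg.svd(probs, full_matrices=False)
    # Keep the k strongest dimensions; this truncation is the smoothing step.
    return u[:, :k] * s[:k]


def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0


# Hypothetical toy data: four words counted across four contexts.
words = ["doctor", "nurse", "guitar", "piano"]
counts = [
    [4, 3, 0, 1],
    [3, 4, 1, 0],
    [0, 1, 5, 2],
    [1, 0, 2, 5],
]
vectors = lsa_word_vectors(counts, k=2)


def similarity(w1, w2):
    """Cosine similarity between two words in the reduced LSA space."""
    return cosine(vectors[words.index(w1)], vectors[words.index(w2)])


print("doctor/nurse :", round(similarity("doctor", "nurse"), 3))
print("doctor/guitar:", round(similarity("doctor", "guitar"), 3))
```

In this toy setup, words that share contexts ("doctor" and "nurse") score higher than words that mostly do not ("doctor" and "guitar"). How such a score would be folded into a trigram model is not specified in the abstract, so that step is left out here.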

Project Outcomes

Journal articles (0)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Other publications by Daniel Jurafsky

ReFT: Representation Finetuning for Language Models

DOI:
Publication year:
2024
Journal:
arXiv.org
Impact factor:
0
Authors:
Zhengxuan Wu;Aryaman Arora;Zheng Wang;Atticus Geiger;Daniel Jurafsky;Christopher D. Manning;Christopher Potts
Corresponding author:
Christopher Potts