Exploiting Speech Understanding in Intelligent Interfaces
Basic Information
- Grant Number: 06044055
- Principal Investigator:
- Amount: $26,900
- Host Institution:
- Host Institution Country: Japan
- Category: Grant-in-Aid for International Scientific Research
- Fiscal Year: 1994
- Funding Country: Japan
- Period: 1994 to 1995
- Status: Completed
- Source:
- Keywords:
Project Summary
We are interested in the use of spoken language in human-computer interaction. The inspiration is the fact that, in human-human interaction, meaningful exchanges can take place even without accurate recognition of the words the other is saying; this is possible thanks to shared knowledge and complementary communication channels, especially gesture and prosody. We want to exploit this fact in man-machine interfaces. We are therefore doing three things:
1. Using simple speech recognition to augment graphical user interfaces, well integrated with other input modalities: keyboard, mouse, and touch screen.
2. Building systems able to engage in simple conversations, using mostly prosodic clues. To sketch our latest success: we conjectured that, for Japanese, it would be possible to decide when to produce back-channel utterances based on prosodic clues alone, without reference to meaning. We found that neither vowel lengthening, volume changes, nor energy level (to detect when the other speaker had finished) was by itself a good predictor of when to produce an aizuchi. The best predictor was a low pitch level. Specifically, upon detecting the end of a region of pitch less than 0.9 times the local median pitch that had continued for 150 ms, coming after at least 600 ms of speech, the system predicted an aizuchi 200 ms to 300 ms later, provided it had not done so within the preceding 1 second. We also built a real-time system based on this decision rule. A human stooge steered the conversation to a suitable topic and then switched on the system; after switch-on, the stooge's utterances and the system's outputs, mixed together, made up one side of the conversation. None of the 5 subjects realized that their conversation partner had become partially automated.
3. Building tools and collecting data to support 1 and 2.
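The pitch-based decision rule can be sketched as a simple frame-based detector. The sketch below is illustrative, not the project's implementation: the 10 ms frame size, the 2 s window for the local pitch median, the convention of marking unvoiced frames with 0, and the function name are all assumptions; only the numeric thresholds (0.9 times the local median, a 150 ms low-pitch region, 600 ms of prior speech, a 1 s refractory period, a response roughly 250 ms later) come from the text.

```python
# Illustrative sketch of the back-channel (aizuchi) decision rule.
# Frame size, median window, and names are assumptions; thresholds
# follow the rule described in the summary above.

from statistics import median

FRAME_MS = 10  # assumed pitch-analysis frame length


def backchannel_times(pitch, frame_ms=FRAME_MS,
                      median_window_ms=2000,
                      low_ratio=0.9,
                      low_region_ms=150,
                      min_speech_ms=600,
                      refractory_ms=1000,
                      delay_ms=250):
    """Return times (ms) at which the rule predicts a back-channel.

    `pitch` is a list of per-frame pitch values in Hz; 0 marks
    unvoiced or silent frames.
    """
    predictions = []
    low_run = 0                     # length (ms) of current low-pitch region
    speech_run = 0                  # length (ms) of continuous speech so far
    last_pred = -refractory_ms      # allows a prediction on the first region
    win = median_window_ms // frame_ms

    for i, f0 in enumerate(pitch):
        t = i * frame_ms
        # Local median pitch over the recent voiced frames.
        voiced = [p for p in pitch[max(0, i - win):i + 1] if p > 0]
        local_med = median(voiced) if voiced else 0

        if f0 > 0:
            speech_run += frame_ms
            if local_med and f0 < low_ratio * local_med:
                low_run += frame_ms   # still inside the low-pitch region
                continue

        # The low-pitch region has just ended (pitch rose or speech stopped).
        if (low_run >= low_region_ms
                and speech_run >= min_speech_ms
                and t - last_pred >= refractory_ms):
            predictions.append(t + delay_ms)   # respond 200-300 ms later
            last_pred = t
        low_run = 0
        if f0 == 0:
            speech_run = 0
    return predictions
```

For example, a second of speech at 120 Hz followed by a 200 ms dip to 90 Hz (below 0.9 times the 120 Hz local median) triggers exactly one prediction shortly after the dip ends.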
Project Outcomes
- Journal articles: 19
- Monographs: 0
- Research awards: 0
- Conference papers: 0
- Patents: 0
Tajchman, Gary, Dan Jurafsky, and Eric Fosler: "Learning Phonological Rule Probabilities from Speech Corpora with Exploratory Computational Phonology" In Proceedings of ACL-95. 9-15 (1995)
Gildea, Daniel and Daniel Jurafsky: "Learning Bias and Phonological-Rule Induction" Computational Linguistics. (1995)
Ward, Nigel: "Using Prosodic Clues to Decide When to Produce Back-Channel Utterances" ICSLP.
Jurafsky, Daniel: "A Probabilistic Model of Lexical and Syntactic Access and Disambiguation" Cognitive Science.
Nigel Ward: "An Approach to Tightly-Coupled Syntactic/Semantic Processing for Speech Understanding" Proceedings of the AAAI Workshop on the Integration of Natural Language and Speech Processing. 50-57 (1994)
Other Grants by WARD Nigel
An Interactive Reflex-Training System for Developing Foreign-Language Conversation Ability
- Grant Number: 12040209
- Fiscal Year: 2000
- Amount: $26,900
- Category: Grant-in-Aid for Scientific Research on Priority Areas (A)
Non-lexical Sounds: a New Interface Modality for Voice-based Information Delivery Systems
- Grant Number: 11680412
- Fiscal Year: 1999
- Amount: $26,900
- Category: Grant-in-Aid for Scientific Research (C)
Research on a Finger-Motion Training Support System for Machine Operation Using Real-Time Speech Understanding and Response
- Grant Number: 08750301
- Fiscal Year: 1996
- Amount: $26,900
- Category: Grant-in-Aid for Encouragement of Young Scientists (A)
Similar Overseas Grants
Peripheral and central contributions to auditory temporal processing deficits and speech understanding in older cochlear implantees
- Grant Number: 10444172
- Fiscal Year: 2022
- Amount: $26,900
- Category:
Effects of Non-Blast mTBI on Binaural Processing and Speech Understanding in Noise
- Grant Number: 10537947
- Fiscal Year: 2022
- Amount: $26,900
- Category:
Peripheral and central contributions to auditory temporal processing deficits and speech understanding in older cochlear implantees
- Grant Number: 10630111
- Fiscal Year: 2022
- Amount: $26,900
- Category:
Individual differences in brain networks supporting speech understanding in patients with cochlear implants
- Grant Number: 10366520
- Fiscal Year: 2021
- Amount: $26,900
- Category:
Individual differences in brain networks supporting speech understanding in patients with cochlear implants
- Grant Number: 10743568
- Fiscal Year: 2021
- Amount: $26,900
- Category:
End-to-End Model for Task-Independent Speech Understanding and Dialogue
- Grant Number: 20H00602
- Fiscal Year: 2020
- Amount: $26,900
- Category: Grant-in-Aid for Scientific Research (A)
Speech understanding ability and communication intervention for persons with age-related hearing loss and mild cognitive impairment or dementia
- Grant Number: 10437659
- Fiscal Year: 2018
- Amount: $26,900
- Category:
Speech understanding ability and communication intervention for persons with age-related hearing loss and mild cognitive impairment or dementia
- Grant Number: 10201560
- Fiscal Year: 2018
- Amount: $26,900
- Category:
Using Electrophysiology to Complement Speech Understanding-in-Noise Measures
- Grant Number: 9906072
- Fiscal Year: 2017
- Amount: $26,900
- Category:
Temporal processing and speech understanding in older cochlear implantees
- Grant Number: 9355563
- Fiscal Year: 2016
- Amount: $26,900
- Category: