Probabilistic Auditory Scene Analysis

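Basic information

Grant number:
EP/G050821/1
Principal investigator:
Richard Turner
Amount:
$295,700
Host institution:
University of Cambridge
Host institution country:
United Kingdom
Project category:
Fellowship
Fiscal year:
2010
Funding country:
United Kingdom
Duration:
2010 to (end date not available)
Project status:
Completed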

Source:
https://gtr.ukri.org/projects?ref=EP%2FG050821%2F1
Keywords:
Probabilistic Auditory Scene Analysis

Project abstract

Auditory environments are typically very complicated. For example, a cocktail party comprises many sources: the chinking of glasses, the chattering of the many guests, the sound of background music. Nevertheless, our auditory system can make sense of such a scene; it can work out how many acoustic sources there are and determine the individual contributions to the scene from each. Remarkably, it can do this using the information from a single microphone. A major goal of auditory neuroscience is to understand how the auditory system achieves this feat.

Broadly speaking, it is thought that there are three stages to auditory scene analysis. The first stage, which is well understood physiologically, is to convert the incoming sound into a time-frequency representation. This reveals the local energy in a frequency band at a particular time. In the second stage, psychophysical evidence suggests that primitive grouping principles are used to group local regions of spectral-temporal energy arising from a common source. By using simple stimuli (such as tones and noise), a long list of primitive grouping principles has been elucidated. For example, the principle of good continuation identifies smoothly varying features with a single source and abrupt changes as a signature of separate sources. In the final stage of auditory scene analysis, called schema-based grouping, higher level knowledge, like the structure of music or speech, is used to bind the groups of spectral-temporal energy into streams so that there is one stream for each source.

There are many outstanding questions with this framework. One important open question is the role that auditory cortex plays in auditory scene analysis, which is not well established. Another concerns the generality and completeness of the established list of primitive grouping rules. Although the principles successfully characterise the perception of simple sounds, it is unclear how successful and relevant the description is for natural sounds.
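As a rough illustration of the first stage described above, the following minimal sketch computes a time-frequency energy map with a windowed short-time Fourier transform. It is not the project's method: the function name `spectrogram`, the frame length, the hop size, and the test tone are all assumptions chosen for illustration.

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Local energy per (frame, frequency bin) from a Hann-windowed FFT.

    frame_len and hop are illustrative choices, not values from the project.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    # Squared magnitude of the one-sided spectrum gives energy per band.
    return np.abs(np.fft.rfft(frames, axis=1)) ** 2

# Hypothetical test input: one second of a 1 kHz tone sampled at 8 kHz.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 1000 * t)

energy = spectrogram(tone)
# Frequency bin with the most energy, averaged over time.
peak_bin = int(np.argmax(energy.mean(axis=0)))
peak_hz = peak_bin * sample_rate / 256
```

Each row of `energy` is a time frame and each column a frequency band, so a pure tone shows up as a single band that stays energetic across all frames.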
This project aims to resolve these questions through modelling work, psychophysics experiments and neural recording experiments. The new idea is to view the primitive grouping principles as arising from inference in a latent variable model of auditory scenes. A latent variable model is a description of how an auditory scene, like that encountered at a cocktail party, is composed of latent auditory sources, like the chinking glasses and chattering guests. It also includes a description of the statistics of these sources, like the fact that the chinking glasses tend to be isolated, high frequency events whilst the chattering is rather more constant and lower in frequency. The idea is that the brain is trying to infer these latent sources using prior knowledge of their statistics. New tools of probabilistic inference can make these intuitions concrete.

This new perspective, called probabilistic scene analysis, has two main advantages: one practical and one theoretical. The practical advantage is that a statistical characterisation of sounds can be used to produce stimuli with complicated, but controlled, structure for use in experiments. The theoretical benefit is that the list of primitive grouping rules, and the manner in which they trade off, are now derived from the statistics of sounds; heuristic implementation is no longer required. This enables us to predict the results of the experiments. In particular, the psychophysics experiments are aimed at resolving both how auditory grouping operates in synthetic auditory textures (e.g. rain, wind, water) and whether this is consistent with the probabilistic account. Furthermore, the neural recording experiments will investigate the role of auditory cortex in auditory scene analysis, and the hypothesis that it is representing high level statistics of sounds like slowly varying modulatory components.
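The idea of inferring a latent source from prior knowledge of source statistics can be sketched with a toy Bayesian calculation. This is only an illustration under stated assumptions, not the model used in the project: the two sources, their Gaussian log-frequency statistics, the prior probabilities, and the function name `posterior_over_sources` are all hypothetical.

```python
import numpy as np

# Hypothetical source statistics (illustrative numbers, not from the project):
# each source has a Gaussian over log-frequency and a prior probability.
SOURCES = {
    "glass_chink": {"mean_log_hz": np.log(4000.0), "std": 0.3, "prior": 0.1},
    "chatter": {"mean_log_hz": np.log(300.0), "std": 0.6, "prior": 0.9},
}

def posterior_over_sources(freq_hz):
    """Posterior probability of each latent source given one observed frequency."""
    log_f = np.log(freq_hz)
    # log prior + Gaussian log likelihood; the shared -0.5*log(2*pi) term is
    # dropped because it cancels when the posterior is normalised.
    log_joint = np.array([
        np.log(s["prior"])
        - np.log(s["std"])
        - 0.5 * ((log_f - s["mean_log_hz"]) / s["std"]) ** 2
        for s in SOURCES.values()
    ])
    log_joint -= log_joint.max()  # subtract the max for numerical stability
    probs = np.exp(log_joint)
    probs /= probs.sum()
    return dict(zip(SOURCES, probs))

high = posterior_over_sources(4000.0)  # a high-frequency event
low = posterior_over_sources(300.0)    # a low-frequency event
```

Even though the chatter source has the larger prior here, a 4 kHz event is attributed to the glass source because its likelihood under the chatter statistics is very small; this is the sense in which grouping can follow from source statistics instead of hand-written rules.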

Project outputs

Journal articles (10)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

On Sparse Variational Methods and the Kullback-Leibler Divergence between Stochastic Processes

DOI:
10.17863/cam.15597
Publication date:
2015-04
Journal:
Impact factor:
0
Authors:
A. G. Matthews; J. Hensman; Richard E. Turner; Zoubin Ghahramani
Corresponding authors:
A. G. Matthews; J. Hensman; Richard E. Turner; Zoubin Ghahramani

Deep Gaussian Processes for Regression using Approximate Expectation Propagation

DOI:
10.17863/cam.21348
Publication date:
2016-02
Journal:
Impact factor:
0
Authors:
T. Bui; D. Hernández-Lobato; José Miguel Hernández-Lobato; Yingzhen Li; Richard E. Turner
Corresponding authors:
T. Bui; D. Hernández-Lobato; José Miguel Hernández-Lobato; Yingzhen Li; Richard E. Turner

Rebuilding the limit order book: sequential Bayesian inference on hidden states

DOI:
10.1080/14697688.2013.851402
Publication date:
2013-11
Journal:
Quantitative Finance
Impact factor:
1.3
Authors:
H. Christensen; Richard E. Turner; Simon I. Hill; S. Godsill
Corresponding authors:
H. Christensen; Richard E. Turner; Simon I. Hill; S. Godsill

Neural Adaptive Sequential Monte Carlo

DOI:
10.48550/arxiv.1506.03338
Publication date:
2015
Journal:
Impact factor:
0
Authors:
Gu S
Corresponding authors:
Gu S

Training Deep Gaussian Processes using Stochastic Expectation Propagation and Probabilistic Backpropagation

DOI:
10.48550/arxiv.1511.03405
Publication date:
2015
Journal:
Impact factor:
0
Authors:
Bui T
Corresponding authors:
Bui T

Other publications by Richard Turner

Minority opinion: CT screening for lung cancer.

DOI:
10.1097/01.rti.0000189989.65271.79
Publication date:
2005
Journal:
Journal of Thoracic Imaging
Impact factor:
3.3
Authors:
C. Henschke; J. Austin; Nathaniel Berlin; T. Bauer; S. Giunta; Fred Gannis; M. Kalafer; S. Kopel; Albert Miller; H. Pass; H. Roberts; R. Shah; D. Shaham; Michael John Smith; S. Sone; Richard Turner; D. Yankelevitz; J. Zulueta
Corresponding author:
J. Zulueta