Integrating sound and context recognition for acoustic scene analysis
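Basic information
- Grant number: EP/R01891X/1
- Principal investigator:
- Amount: $124,700
- Host institution:
- Host institution country: United Kingdom
- Project category: Research Grant
- Fiscal year: 2018
- Funding country: United Kingdom
- Duration: 2018 to (no data)
- Project status: Completed
- Source:
- Keywords:
Project abstract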
The amount of audio data being generated has increased dramatically over the past decade, ranging from user-generated content and recordings in audiovisual archives to sensor data captured in urban, natural or domestic environments. The need to detect and identify sound events in environmental recordings (e.g. door knock, glass break) as well as to recognise the context of an audio recording (e.g. train station, meeting) has led to the emergence of a new field of research: acoustic scene analysis. Emerging applications of acoustic scene analysis include the development of sound recognition technologies for smart homes and smart cities, security/surveillance, audio retrieval and archiving, ambient assisted living, and automatic biodiversity assessment.
However, current sound recognition technologies cannot adapt to different environments or situations (e.g. sound identification in an office environment, assuming specific room properties, working hours, outdoor noise and weather conditions). If information about context is available, it is typically characterised by a single label for an entire audio stream, not taking into account complex and ever-changing environments, for example when recording using hand-held devices, where context can consist of multiple time-varying factors and can be characterised by more than a single label.
This project will address the aforementioned shortcomings by investigating and developing technologies for context-aware sound recognition. We assume that the context of an audio stream consists of several time-varying factors that can be viewed as a combination of different environments and situations; the ever-changing context in turn informs the types and properties of sounds to be recognised by the system. Methods for context and sound recognition will be investigated and developed, based on signal processing and machine learning theory.
The main contribution of the project will be an algorithmic framework that jointly recognises audio-based context and sound events, applied to complex audio streams with several sound sources and time-varying environments. The proposed software framework will be evaluated using complex audio streams recorded in urban and domestic environments, as well as using simulated audio data in order to carefully control contextual and sound properties and have the benefit of accurate annotations. In order to further promote the study of context-aware sound recognition systems, a public evaluation task will be organised in conjunction with the public challenge on Detection and Classification of Acoustic Scenes and Events (DCASE). Research carried out in this project targets a wide range of potential beneficiaries in the commercial and public sector for sound and audio-based context recognition technologies, as well as users and practitioners of such technologies. Beyond acoustic scene analysis, we believe this new approach will advance the broader fields of audio and acoustics, leading to the creation of context-aware systems for related fields, including music and speech technology and hearing aids.
Project outputs
Journal papers (10)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Onsets, Activity, and Events: A Multi-task Approach for Polyphonic Sound Event Modelling
- DOI: 10.33682/sm6r-8p49
- Published: 2019-10
- Journal:
- Impact factor: 0
- Authors: Arjun Pankajakshan; Helen L. Bear; Emmanouil Benetos
- Corresponding authors: Arjun Pankajakshan; Helen L. Bear; Emmanouil Benetos
Towards Joint Sound Scene and Polyphonic Sound Event Recognition
- DOI: 10.21437/interspeech.2019-2169
- Published: 2019
- Journal:
- Impact factor: 0
- Authors: Bear H
- Corresponding author: Bear H
City Classification from Multiple Real-World Sound Scenes
- DOI: 10.1109/waspaa.2019.8937271
- Published: 2019
- Journal:
- Impact factor: 0
- Authors: Bear H
- Corresponding author: Bear H
An extensible cluster-graph taxonomy for open set sound scene analysis
- DOI:
- Published: 2018-09
- Journal:
- Impact factor: 0
- Authors: Helen L. Bear; Emmanouil Benetos
- Corresponding authors: Helen L. Bear; Emmanouil Benetos
Polyphonic Sound Event and Sound Activity Detection: A Multi-Task Approach
- DOI: 10.1109/waspaa.2019.8937193
- Published: 2019
- Journal:
- Impact factor: 0
- Authors: Pankajakshan A
- Corresponding author: Pankajakshan A
Other publications by Emmanouil Benetos
Classification-based Note Tracking for Automatic Music Transcription
- DOI:
- Published: 2016
- Journal:
- Impact factor: 0
- Authors: J. J. Valero; Emmanouil Benetos; J. Iñesta
- Corresponding author: J. Iñesta
Automatic Music Transcription and Ethnomusicology: a User Study
- DOI:
- Published: 2019
- Journal:
- Impact factor: 0
- Authors: A. Holzapfel; Emmanouil Benetos
- Corresponding author: Emmanouil Benetos
Performance MIDI-to-score conversion by neural beat tracking
- DOI:
- Published: 2022
- Journal:
- Impact factor: 0
- Authors: Lele Liu; Qiuqiang Kong; Veronica Morfi; Emmanouil Benetos
- Corresponding author: Emmanouil Benetos
Sound event detection in synthetic audio: Analysis of the DCASE 2016 task results
- DOI: 10.1109/waspaa.2017.8169985
- Published: 2017
- Journal:
- Impact factor: 0
- Authors: G. Lafay; Emmanouil Benetos; M. Lagrange
- Corresponding author: M. Lagrange
Movie Analysis with Emphasis to Dialogue and Action Scene Detection
- DOI: 10.1007/978-0-387-76316-3_7
- Published: 2008
- Journal:
- Impact factor: 0
- Authors: Emmanouil Benetos; Spyridon Siatras; Constantine Kotropoulos; N. Nikolaidis; I. Pitas
- Corresponding author: I. Pitas
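Similar NSFC grants
Research on general methods for capturing and reproducing sound field spatial information
- Grant number: 11174087
- Approval year: 2011
- Funding amount: 700,000 yuan
- Project category: General Program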
Similar overseas grants
Understanding organizational culture and implementation of evidence-based resuscitation practices
- Grant number: 10591266
- Fiscal year: 2023
- Funding amount: $124,700
- Project category:
Place and Time Processing of Pitch in the Context of Cochlear Dysfunction
- Grant number: 10680120
- Fiscal year: 2023
- Funding amount: $124,700
- Project category:
Statistical Parametric Instrumental Sound Synthesis with Controllable Context of Performance
- Grant number: 22KJ2855
- Fiscal year: 2023
- Funding amount: $124,700
- Project category: Grant-in-Aid for JSPS Fellows
Perception of speech in context by listeners with healthy and impaired hearing
- Grant number: 10708981
- Fiscal year: 2022
- Funding amount: $124,700
- Project category:
Characterizing the behavioral expression of retrospective learning and memory of associative information by vmOFC->VTA neurons in the context of extinction-related behaviors
- Grant number: 10700484
- Fiscal year: 2022
- Funding amount: $124,700
- Project category:
The role of context in sleep-related memory reactivation in humans
- Grant number: 10729554
- Fiscal year: 2022
- Funding amount: $124,700
- Project category:
Perception of speech in context by listeners with healthy and impaired hearing
- Grant number: 10584131
- Fiscal year: 2022
- Funding amount: $124,700
- Project category:
Investigating orthography-phonology and orthography-semantics pathways with implications for compensatory mechanisms in reading disorder in the context of a randomized control trial
- Grant number: 10528583
- Fiscal year: 2021
- Funding amount: $124,700
- Project category:
Investigating orthography-phonology and orthography-semantics pathways with implications for compensatory mechanisms in reading disorder in the context of a randomized control trial
- Grant number: 10389790
- Fiscal year: 2021
- Funding amount: $124,700
- Project category:
Investigating orthography-phonology and orthography-semantics pathways with implications for compensatory mechanisms in reading disorder in the context of a randomized control trial
- Grant number: 10506327
- Fiscal year: 2021
- Funding amount: $124,700
- Project category:


















