Computational auditory scene analysis as causal inference

Basic information

Award number:
1921501
Principal investigator:
Joshua McDermott
Amount:
approximately $500,300
Host institution:
Massachusetts Institute of Technology
Host institution country:
United States
Award type:
Standard Grant
Fiscal year:
2019
Funding country:
United States
Start and end dates:
2019-09-01 to 2023-08-31
Project status:
Completed

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=1921501&HistoricalAwards=false
Keywords:
Computational auditory scene analysis causal

Project abstract

Just by listening, humans can infer many details about the world around them: what someone said, whether a window in their house is open or shut, or what their child dropped on the floor in the next room. These everyday (but essential) judgments usually require us to separate the distinct causes in the world that generate sound. We hear multiple people talking at once, but can attend to the one we are interested in. We can tell whether a sound was produced in a large or small room, or an empty or furnished apartment, but can also identify what the sound was. And if an object is dropped on a table, we can usually tell the object's approximate weight but also the material the table is made of, just by listening. These abilities are critical to our interactions with the world and will be critical to reproduce in machine hearing systems for robots, automobiles, and other technologies. Here the investigators propose to investigate human abilities to decompose sound into its constituent causes and to build machine systems that can replicate these abilities.

The proposed work will jointly pursue two goals. First, the investigators will build models of how sound is generated in the world. This aspect of the work will combine insights from physics and acoustics with empirical measurements of sound, focusing on how forces imparted to objects resonate within the object to yield sound, and on how the resulting sound is altered by reflections off of environmental surfaces on its way to a listener's ears. Second, the investigators will develop a computational framework to infer the most likely explanation of a sound in terms of the events in the world that could have generated it. This aspect of the work will leverage recent advances in artificial intelligence research that render such inferences newly tractable. The resulting machine hearing systems will be compared with human listeners in a series of experiments, with the goal of improving the models of sound generation and the inference algorithms in order to match human auditory abilities.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project outcomes

Journal papers (3)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Finding Fallen Objects Via Asynchronous Audio-Visual Integration

DOI:
10.1109/cvpr52688.2022.01027
Publication date:
2022-06
Journal:
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
Impact factor:
0
Authors:
Chuang Gan;Yi Gu;Siyuan Zhou;Jeremy Schwartz;S. Alter;James Traer;Dan Gutfreund;J. Tenenbaum;Josh H. McDermott;A. Torralba
Corresponding author:
Chuang Gan;Yi Gu;Siyuan Zhou;Jeremy Schwartz;S. Alter;James Traer;Dan Gutfreund;J. Tenenbaum;Josh H. McDermott;A. Torralba

Causal inference in environmental sound recognition

DOI:
10.1016/j.cognition.2021.104627
Publication date:
2021
Journal:
Cognition
Impact factor:
3.4
Authors:
Traer, James;Norman-Haignere, Sam V.;McDermott, Josh H.
Corresponding author:
McDermott, Josh H.

ThreeDWorld: A Platform for Interactive Multi-Modal Physical Simulation

DOI:
Publication date:
2020-07
Journal:
ArXiv
Impact factor:
0
Authors:
Chuang Gan;Jeremy Schwartz;S. Alter;Martin Schrimpf;James Traer;Julian De Freitas;J. Kubilius;Abhishek Bhandwaldar;Nick Haber;Megumi Sano;Kuno Kim;E. Wang;Damian Mrowca;Michael Lingelbach;Aidan Curtis;Kevin T. Feigelis;Daniel Bear;Dan Gutfreund;David Cox;J. DiCarlo;Josh H. McDermott;J. Tenenbaum;Daniel L. K. Yamins
Corresponding author:
Chuang Gan;Jeremy Schwartz;S. Alter;Martin Schrimpf;James Traer;Julian De Freitas;J. Kubilius;Abhishek Bhandwaldar;Nick Haber;Megumi Sano;Kuno Kim;E. Wang;Damian Mrowca;Michael Lingelbach;Aidan Curtis;Kevin T. Feigelis;Daniel Bear;Dan Gutfreund;David Cox;J. DiCarlo;Josh H. McDermott;J. Tenenbaum;Daniel L. K. Yamins