权益分类	功能权益	普通用户	{{item.name}}会员
{{category.name}}	{{benefitItem.name}}

Collaborative Research: NCS-FR: Beyond the ventral stream: Reverse engineering the neurocomputational basis of physical scene understanding in the primate brain

合作研究：NCS-FR：超越腹侧流：逆向工程灵长类大脑中物理场景理解的神经计算基础

基本信息

批准号：
2124136
负责人：
Nancy Kanwisher
金额：
$ 225万
依托单位：
Massachusetts Institute of Technology
依托单位国家：
美国
项目类别：
Standard Grant
财政年份：
2021
资助国家：
美国
起止时间：
2021-10-01 至 2024-09-30
项目状态：
已结题

来源：
https://www.nsf.gov/awardsearch/showAward?AWD_ID=2124136&HistoricalAwards=false
关键词：
Collaborative Research NCS FR Beyond

项目摘要

The last ten years have witnessed an astonishing revolution in AI, with deep neural networks suddenly approaching human-level performance on problems like recognizing objects in an image and words in an audio recording. But impressive as these feats are, they fall far short of human-like intelligence. The critical gap between current AI and human intelligence is that, beyond just classifying patterns of input, humans build mental models of the world. This project begins with the problem of physical scene understanding: how one extracts not just the identities and locations of objects in the visual world, but also the physical properties of those objects, their positions and velocities, their relationships to each other, the forces acting upon them, and the effects of forces that could be exerted on them. It is hypothesized that humans represent this information in a structured mental model of the physical world, and use that model to predict what will happen next, much as the physics engine in a video game generates physically plausible future states of virtual worlds. To test this idea, computational models of physical scene understanding will be built and tested for their ability to predict future states of the physical world in a variety of scenarios. Performance of these models will then be compared to humans and to more traditional deep network models, both in terms of their accuracy on each task, and their patterns of errors. Computational models that incorporate structured representations of the physical world will then be tested against standard convolutional neural networks in their ability to explain neural responses of the human brain (using fMRI) and the monkey brain (using direct neural recording). These computational models will provide the first explicit theories of how physical scene understanding might work in the human brain, at the same time advancing the ability of AI systems to solve the same problems. Because the ability to understand and predict the physical world is essential for planning any action, this work is expected to help advance many technologies that require such planning, from robotics to self-driving cars to brain-machine interfaces. Each of the participating labs will also expand their established track records of recruiting, training, and mentoring women and under-represented minorities at the undergraduate, graduate, and postdoctoral levels. Finally, the collaborating laboratories will continue and increase their involvement in the dissemination of science to the general public, via public talks, web sites, and outreach activities.Deep neural networks have revolutionized object recognition in computers as well as understanding of object recognition in the primate brain, but object recognition is just one aspect of vision, and the ventral stream is just one of many brain systems. Studying physical scene understanding is a step toward scaling this reverse-engineering approach up to the rest of the mind and brain. Predicting what will happen next and planning effective action requires understanding the physical basis and physical relationships in the visual world. Yet it is unknown how humans do this or how machines could. Both challenges are addressed in this project by the building of image computable, neurally mappable computational models of physical scene understanding and prediction (Thread I), and using these models as explicit hypotheses for how the brain might accomplish these tasks, which will then be tested with behavioral and neural data from humans (Thread II) and non-human primates (Thread III). This project aims to make a transformative leap in understanding: from small-scale, special-case models and isolated experimental tests to an integrated large-scale, general-purpose model of a major swathe of the primate brain, that functionally explains much of the immediate content of our perceptual experience in every scene that confronts us. The work will advance theory by developing the first image-computable models capable of human-level physical scene understanding and prediction. Beyond understanding of the mind and brain, this research is directly relevant to AI and robotics (which require physical scene understanding), and brain-machine interfaces (which require understanding of the relevant neural codes). For the broader research community, the project will a) develop public datasets, benchmark tasks, and challenges, b) host adversarial collaborations to address these challenges, and c) host interdisciplinary workshops linking research communities from psychology to AI to neuroscience to address the fundamental questions that span these fields.This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

在过去的十年里，见证了人工智能领域的一场惊人的革命，深度神经网络在识别图像中的对象和录音中的单词等问题上突然接近人类水平。尽管这些壮举令人印象深刻，但它们远远达不到人类的智商。当前人工智能和人类智能之间的关键差距是，除了对输入模式进行分类外，人类还建立了世界的心理模型。这个项目从物理场景理解的问题开始：一个人如何不仅提取视觉世界中对象的身份和位置，而且提取这些对象的物理属性、它们的位置和速度、它们彼此之间的关系、作用在它们身上的力，以及可能施加在它们身上的力的影响。假设人类在物理世界的结构化心理模型中表示这些信息，并使用该模型预测接下来会发生什么，就像视频游戏中的物理引擎生成虚拟世界在物理上看似合理的未来状态一样。为了测试这一想法，将建立物理场景理解的计算模型，并测试它们在各种情况下预测物理世界未来状态的能力。然后，这些模型的性能将与人类和更传统的深度网络模型进行比较，无论是在每项任务的准确性方面，还是在错误模式方面。然后，结合物理世界的结构化表示的计算模型将与标准卷积神经网络进行比较，以测试它们解释人脑(使用功能磁共振成像)和猴子大脑(使用直接神经记录)的神经反应的能力。这些计算模型将提供第一个关于物理场景理解如何在人脑中工作的明确理论，同时提高人工智能系统解决同样问题的能力。由于理解和预测物理世界的能力对于计划任何行动都是必不可少的，这项工作预计将有助于推动许多需要这种规划的技术，从机器人到自动驾驶汽车再到脑机接口。每个参与的实验室还将扩大其在本科生、研究生和博士后水平上招聘、培训和指导女性和代表性不足的少数民族的既定记录。最后，合作实验室将继续并通过公共演讲、网站和外展活动更多地参与向公众传播科学。深度神经网络使计算机中的物体识别以及对灵长类大脑中物体识别的理解发生了革命性的变化，但物体识别只是视觉的一个方面，腹侧流只是许多大脑系统中的一个。研究物理场景理解是将这种反向工程方法推广到大脑其余部分的一步。预测接下来会发生什么，并计划有效的行动，需要了解视觉世界中的物理基础和物理关系。然而，人类如何做到这一点，或者机器如何做到这一点，都是未知的。在这个项目中，这两个挑战都是通过建立可计算的、神经上可映射的物理场景理解和预测的计算模型(线程I)来解决的，并将这些模型用作大脑如何完成这些任务的明确假设，然后将使用来自人类(线程II)和非人类灵长类动物(线程III)的行为和神经数据进行测试。这个项目的目标是在理解上实现一次革命性的飞跃：从小规模的特殊情况模型和孤立的实验测试，到一个集成的大规模、通用的灵长类大脑主要部分的模型，它在功能上解释了我们在面临的每一个场景中感知经验的大部分直接内容。这项工作将通过开发第一个能够理解和预测人类水平的物理场景的图像可计算模型来推进理论。除了对头脑和大脑的理解外，这项研究还与人工智能和机器人(需要理解物理场景)以及脑机接口(需要理解相关的神经代码)直接相关。对于更广泛的研究界，该项目将a)开发公共数据集、基准任务和挑战，b)主办对抗性合作以应对这些挑战，以及c)主办跨学科研讨会，将从心理学到人工智能再到神经科学的研究社区联系起来，以解决跨越这些领域的基本问题。该奖项反映了NSF的法定使命，并通过使用基金会的智力优势和更广泛的影响审查标准进行评估，被认为值得支持。

项目成果

期刊论文数量（1）

专著数量（0）

科研奖励数量（0）

会议论文数量（0）

专利数量（0）

数据更新时间：{{ journalArticles.updateTime }}

DOI：
{{ item.doi }}
发表时间：
{{ item.publish_year }}
期刊：
{{ item.journal_name }}
影响因子：
{{ item.factor }}
作者：
{{ item.authors }}
通讯作者：
{{ item.author }}

数据更新时间：{{ journalArticles.updateTime }}

作者：
{{ item.author }}

数据更新时间：{{ monograph.updateTime }}

作者：
{{ item.author }}

数据更新时间：{{ sciAawards.updateTime }}

作者：
{{ item.author }}

数据更新时间：{{ conferencePapers.updateTime }}

作者：
{{ item.author }}

数据更新时间：{{ patent.updateTime }}

Nancy Kanwisher其他文献

Repetition blindness and illusory conjunctions: errors in binding visual types with visual tokens.

DOI：
10.1037//0096-1523.17.2.404
发表时间：
1991-05
期刊：
Journal of experimental psychology. Human perception and performance
影响因子：
0
作者：
Nancy Kanwisher
通讯作者：
Nancy Kanwisher

Privileged representational axes in biological and artificial neural networks

生物和人工神经网络中的特权表示轴

DOI：
发表时间：
2024
期刊：
bioRxiv
影响因子：
0
作者：
Meenakshi Khosla;Alex Williams;Josh McDermott;Nancy Kanwisher
通讯作者：
Nancy Kanwisher

MIT Open Access Articles An integrative computational architecture for object-driven cortex

麻省理工学院开放获取文章对象驱动皮层的综合计算架构

DOI：
发表时间：
期刊：
影响因子：
0
作者：
Ilker Yildirim;Jiajun Wu;Nancy Kanwisher;Joshua B. Tenenbaum
通讯作者：
Joshua B. Tenenbaum

Dissociating language and thought in large language models

在大型语言模型中分离语言与思维

DOI：
10.1016/j.tics.2024.01.011
发表时间：
2024-06-01
期刊：
TRENDS IN COGNITIVE SCIENCES
影响因子：
17.200
作者：
Kyle Mahowald;Anna A. Ivanova;Idan A. Blank;Nancy Kanwisher;Joshua B. Tenenbaum;Evelina Fedorenko
通讯作者：
Evelina Fedorenko