Collaborative Research: RI: Medium: MoDL: Occam's Razor in Deep and Physical Learning

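Basic Information

Award number:
2212519
Principal investigator:
Pratik Chaudhari
Amount:
$799,900
Host institution:
University of Pennsylvania
Host institution country:
United States
Project type:
Standard Grant
Fiscal year:
2022
Funding country:
United States
Start and end dates:
2022-10-01 to 2026-09-30
Project status:
Not yet concluded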

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=2212519&HistoricalAwards=false
Keywords:
Collaborative Research RI Medium MoDL

Project Abstract

Deep neural networks (DNNs) are machine learning models inspired by how neurons perform computations in the animal brain. Over the past decade these models have led to revolutions in many fields of science and engineering, from making predictions of the next word on the keyboard of a mobile phone to selecting between cosmological models that best explain the structure of the universe. Although computer scientists have gained expertise in building these systems, they do not currently understand why they work and when they can fail. The research agenda focuses on developing theoretical tools that will build such a needed understanding for DNNs, with the hope that these same tools will also shed light on how learning occurs in biological systems, e.g., networks of neurons in the brain. The intellectual goal of the project is to identify common themes in the ways artificial and biological systems learn. The educational and outreach goals include (a) developing curricula at the intersection of computer science, neuroscience, and mathematics, (b) organizing tutorials on artificial intelligence for high-school students in Philadelphia, and (c) mentoring young researchers in the LatinX mathematical research community.

Training a deep network reduces to a high-dimensional, large-scale, and non-convex optimization problem; curiously enough, simple algorithms like stochastic gradient descent are not just sufficient but also seemingly necessary for training DNNs. Accepted statistical wisdom suggests that the larger the model class, the more likely the learned model will overfit the training data. Yet, DNNs generalize extremely well to new data. This project seeks to unravel this apparent paradox: The central hypothesis is that DNNs succeed when the learning tasks exhibit a characteristic structure called “sloppiness.” For sloppy learning tasks, the Fisher Information Matrix of the learned network has eigenvalues that are distributed uniformly across a range that is exponentially large in the rank of the matrix. This project will investigate how this sloppy structure results in the training process exploring only a tiny subset of the function space, thereby yielding both rapid training and good generalization. It will characterize the shape of this tiny subset to understand why networks learn simple, low-dimensional functions for typical learning tasks. Connections will be made to biological and physical systems that learn through local learning rules and also exhibit such a sloppy structure (e.g., networks of neurons in the brain and elastic polymer networks such as proteins). The technical objective is to reveal universal principles of learning, namely a drive towards simplicity and low-dimensional internal representations exhibited by both DNNs and physical learning networks.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
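
As an illustration of the "sloppy" eigenvalue structure the abstract describes, the following is a minimal sketch on a toy problem, not the project's method or a DNN. It assumes a sum-of-exponentials model y(t) = sum_i exp(-rate_i * t) with unit-variance Gaussian noise, for which the Fisher Information Matrix is J^T J with J the Jacobian of the model outputs with respect to the rates. The decay rates, measurement times, and the function name `fisher_information` are all hypothetical choices made for this example.

```python
import numpy as np


def fisher_information(rates, times):
    """FIM of y(t) = sum_i exp(-rates[i] * t) under unit-variance Gaussian noise."""
    # Jacobian: J[k, i] = d y(times[k]) / d rates[i]
    #                   = -times[k] * exp(-rates[i] * times[k])
    jac = -times[:, None] * np.exp(-np.outer(times, rates))
    return jac.T @ jac, jac


rates = np.linspace(0.5, 2.0, 6)   # hypothetical decay rates
times = np.linspace(0.1, 5.0, 50)  # hypothetical measurement times
fim, jac = fisher_information(rates, times)

# Eigenvalues of J^T J are the squared singular values of J; going through
# the SVD is numerically more reliable for the smallest eigenvalues.
eigvals = np.linalg.svd(jac, compute_uv=False) ** 2  # sorted in descending order
log10_eigs = np.log10(np.maximum(eigvals, np.finfo(float).tiny))
decades = log10_eigs[0] - log10_eigs[-1]

print("log10 eigenvalues:", np.round(log10_eigs, 2))
print("decades spanned:", round(float(decades), 2))
```

If the printed log10 eigenvalues come out roughly evenly spaced over many decades, that is the pattern the abstract refers to: a few parameter combinations are tightly constrained by the data while most are barely constrained at all.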

Project Outcomes

Journal articles (0)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Other publications by Pratik Chaudhari

Design and Evaluation of Motion Planners for Quadrotors

DOI:
Published:
2023
Journal:
arXiv.org
Impact factor:
0
Authors:
Yifei Shao; Yuwei Wu; Laura Jarin; Pratik Chaudhari; Vijay Kumar
Corresponding author:
Vijay Kumar

Real-time Vehicle Count, Speed Estimation and Number Plate Detection using CCTV Footage

DOI:
10.1109/icccee55951.2023.10424558
Published:
2023
Journal:
2023 1st International Conference on Cognitive Computing and Engineering Education (ICCCEE)
Impact factor:
0
Authors:
P. S. Gaikwad; Pratik Chaudhari; Pragati Bhole; Vinayak Girhe; Aniruddh Karekar
Corresponding author:
Aniruddh Karekar

Generative models of MRI-derived neuroimaging features and associated dataset of 18,000 samples

DOI:
10.1038/s41597-024-04157-4
Published:
2024-12-05
Journal:
Scientific Data
Impact factor:
6.900
Authors:
Sai Spandana Chintapalli; Rongguang Wang; Zhijian Yang; Vasiliki Tassopoulou; Fanyang Yu; Vishnu Bashyam; Guray Erus; Pratik Chaudhari; Haochang Shou; Christos Davatzikos
Corresponding author:
Christos Davatzikos