
CRCNS: Reward and motivation in neural networks

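Basic information

Grant number:
10227072
Principal investigator:
ALEXEI KOULAKOV
Amount:
approx. $432,000
Host institution:
COLD SPRING HARBOR LABORATORY
Host institution country:
United States
Project category:
Fiscal year:
2019
Funding country:
United States
Project period:
2019-09-30 to 2024-07-31
Project status:
Completed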

Source:
https://reporter.nih.gov/project-details/10227072
Keywords:
Adaptive Behaviors; Animals; Behavior; Behavioral; Clinical; Complex; Computer Analysis; Computer Models; Computing Methodologies; Data; Drug Addiction; Electrophysiology (science); Functional disorder; Globus Pallidus; Glutamates; Goals; Human; Hunger; Image; Impairment; Individual; Instruction; Knowledge; Learning; Lesion; Link; Maintenance; Mathematics; Mental Depression; Methods; Molecular Genetics; Motivation; Mus; Neural Network Simulation; Neurons; Nutrient; Outcome; Pharmacology; Physiological; Play; Population; Principal Investigator; Psychological reinforcement; Punishment; Recurrence; Research; Rewards; Role; Signal Transduction; Structure; Testing; Thirst; Training; addiction; base; behavior test; computational neuroscience; expectation; flexibility; improved; in vivo; incentive salience; learning network; mental state; motivated behavior; motivational processes; neural network; neuromechanism; novel; optogenetics; programs; response; theories; tool

Project abstract

The overall goal of this project is to develop a reinforcement learning (RL) theory of motivation, understood here as motivational salience, and to test the conclusions of this theory using experimental observations obtained in the ventral pallidum (VP). Animals' actions depend on the shifting values of internal demands determined by physiological or behavioral conditions, such as thirst, hunger, addiction, or a specific nutrient deficiency. These need-based modulations of the perceived values of reinforcements (reward or punishment) are described by a mathematical variable called motivational salience or, simply, motivation. Including motivation adds a new level of complexity to RL theory and allows it to generate flexible ongoing behaviors. Here, we will investigate how motivation can be learned by neuronal networks to generate complex adaptive behaviors, and we will compare the conclusions of our theory with the VP circuits.

Previous studies indicate that the VP plays an important role in a variety of behaviors, potentially by influencing motivational salience. In vivo recordings suggest that VP neuron firing correlates with motivational states. Lesions and pharmacological and optogenetic manipulations in the VP cause profound changes in behaviors motivated by natural rewards or drugs of addiction. Dysfunction of this structure is linked to depression and drug addiction in humans.

Our theoretical results suggest that distinct classes of neurons in the VP should play essential roles in representing either positive or negative motivational states. We further hypothesize that local functional interactions within the VP are critical for generating the signals that guide motivated behaviors. Consistent with predictions of RL theory, in our preliminary studies we found that individual VP neurons could be classified as either positive or negative 'motivation neurons', as the activities of these neurons represented both expected values of outcomes and motivational states. When population activity is considered, representations of outcome expectation can be distinguished from representations of motivation, which fluctuate according to the animals' physiological states.

Based on the preliminary data, we devised an integrated approach, combining computational analysis and theory (Koulakov lab) with advanced molecular genetic tools, optogenetics, chemogenetics, electrophysiology, and imaging in behaving mice (Li lab), to test our hypotheses through the following Aims:

Aim 1. To develop methods for identifying motivation in the population activity of VP neurons. Here we will use novel behavioral and computational methods to disambiguate representations of motivation and outcome expectation in neuronal responses.

Aim 2. To develop a reinforcement learning theory of motivation and to test its predictions using responses of VP neurons. Here we will develop the Q-learning theory of motivation and compare networks trained using this theory to responses of VP neurons.

Aim 3. To identify the circuit basis of representations of motivation in VP neuronal populations. We will identify the network structure in Q-learning networks with motivation, and test predictions using opto- and chemogenetic manipulations in the VP.

RELEVANCE: The neural mechanisms of motivated behaviors remain unclear. In the proposed research program, we will determine the precise circuit mechanisms and computations by which neurons in the ventral pallidum participate in modulating motivated behaviors. Findings from this project will have important clinical implications, as impairments in motivational processes are core features of depression and drug addiction.
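
As a purely illustrative aid, the idea of Q-learning with a motivation variable can be sketched in a few lines of Python. This is not the project's model: the two-level motivation state, the multiplicative scaling of reward by motivation, the effort cost, the transition rule, and all names (`perceived_reward`, `step`, `train`) are assumptions chosen to keep the example small. The agent's state is its motivation level, and the same outcome (water) is worth more when the agent is thirsty.

```python
import random

# Hypothetical toy task (an assumption, not the project's model):
# state m is the motivation level (0 = sated, 1 = thirsty);
# actions are 0 = rest, 1 = drink.
ACTIONS = (0, 1)
EFFORT_COST = 0.2   # assumed cost of the drinking action
WATER_VALUE = 1.0   # assumed nominal value of water


def perceived_reward(m, action):
    """Assumed form: motivation m scales the nominal value of the outcome."""
    if action == 1:
        return m * WATER_VALUE - EFFORT_COST
    return 0.0


def step(m, action, rng):
    """Return (reward, next motivation level) for one time step."""
    r = perceived_reward(m, action)
    if action == 1:
        next_m = 0  # drinking removes thirst
    else:
        # While resting, existing thirst persists; otherwise it
        # appears with probability 0.5.
        next_m = 1 if (m == 1 or rng.random() < 0.5) else 0
    return r, next_m


def train(steps=20000, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning over (motivation, action) pairs."""
    rng = random.Random(seed)
    q = {(m, a): 0.0 for m in (0, 1) for a in ACTIONS}
    m = 0
    for _ in range(steps):
        # Epsilon-greedy action selection.
        if rng.random() < epsilon:
            a = rng.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda act: q[(m, act)])
        r, next_m = step(m, a, rng)
        # Standard Q-learning update toward the bootstrapped target.
        target = r + gamma * max(q[(next_m, b)] for b in ACTIONS)
        q[(m, a)] += alpha * (target - q[(m, a)])
        m = next_m
    return q


if __name__ == "__main__":
    for key, value in sorted(train().items()):
        print(key, round(value, 3))
```

Under these assumptions, the learned values favor drinking when thirsty and resting when sated, which is one simple way a motivation variable can make the value of a fixed outcome depend on internal state.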
