Learning poly-synaptic paths with traveling waves

Yoshiki Ito; Taro Toyoizumi

doi:10.1371/journal.pcbi.1008700

Abstract

Traveling waves are commonly observed across the brain. While previous studies have suggested a role for traveling waves in learning, the mechanism remains unclear. We adopted a computational approach to investigate the effect of traveling waves on synaptic plasticity. Our results indicate that traveling waves facilitate the learning of poly-synaptic network paths when combined with a reward-dependent local synaptic plasticity rule. We also demonstrate that traveling waves expedite finding the shortest paths and learning nonlinear input/output mappings, such as the exclusive or (XOR) function.

Author summary

There are approximately 10¹¹ neurons with 10¹⁴ connections in the human brain. Information transmission among neurons in this large network is considered crucial for our behavior. To achieve this, multiple synaptic connections along a poly-synaptic network path must be adjusted coherently during learning. Because the previously proposed reward-dependent synaptic plasticity rule requires coactivation of presynaptic and postsynaptic neurons, learning can fail if a subset of neurons along a distant network path is inactive at the beginning of learning. We suggest that traveling waves that are initiated at an information source can mitigate this problem. We performed computer simulations of spiking neural networks with reward-dependent local synaptic plasticity rules and traveling waves. Our results show that this combination facilitates the learning and refinement of synaptic network paths. We argue that these features are a general biological strategy for maintaining and optimizing our brain function. Our research provides new insights into how complex neural networks in the brain form during learning and memory consolidation.

Citation: Ito Y, Toyoizumi T (2021) Learning poly-synaptic paths with traveling waves. PLoS Comput Biol 17(2): e1008700. https://doi.org/10.1371/journal.pcbi.1008700

Editor: Brent Doiron, University of Pittsburgh, UNITED STATES

Received: January 31, 2020; Accepted: January 11, 2021; Published: February 9, 2021

Copyright: © 2021 Ito, Toyoizumi. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: All relevant data are within the manuscript and its Supporting Information files. The source code is available from https://github.com/toppo365/traveling_wave.git.

Funding: Brain/MINDS from Japan Agency for Medical Research and Development [AMED] under Grant Number JP21dm020700 (T.T.) https://www.amed.go.jp/en/index.html Japan Society for the Promotion of Science [JSPS] KAKENHI Grant Number JP18H05432 (T.T.). https://www.jsps.go.jp/english/index.html The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.

Introduction

Waves of neural activity in the brain play an essential role in recognition and learning [1]. Among them, traveling waves are observed at different spatial scales in many brain regions by different recording methods, such as electroencephalogram (EEG) [2–4], voltage-sensitive dyes (VSDs) [5,6], and local field potentials (LFP) [7,8]. Traveling waves are typically observed under mild anesthesia [7,9], during sleep [10], or in idle states [11].

Cortical traveling waves consist of the upstate and downstate of neurons and propagate these phases coherently [12–15]. The upstate is defined by relatively large membrane potential fluctuations with a high firing rate, while the downstate is a phase of small fluctuations with few spikes [16]. The propagation of this up/down state is estimated to be slower than axonal signal transmission, and the activity spreads both as subthreshold and suprathreshold responses [17]. Lubenov et al. [18] suggested that these traveling waves spread along anatomical structures rather than spatial distance.

The role of traveling waves has been unclear. One hypothesis is that traveling waves mediate lateral propagation of signals within the cortex [7,19]. Rubino et al. [20] suggested that the waves mediate information transfer to distant neurons during movement preparation and execution. Another hypothesis is that slow oscillations during sleep contribute to memory consolidation [21,22]. Notably, while these works suggest the significance of traveling waves for learning, specific mechanisms of how traveling waves improve learning are yet to be uncovered. We conducted computer simulations of neural network models to study this.

To explore this mechanism, we modeled synaptic plasticity. Synaptic weight between a pair of neurons changes according to presynaptic and postsynaptic neural activity and a reward signal [23–25]. Reward-modulated spike-timing-dependent plasticity (STDP) strengthens synapses that contribute to eliciting a spike in the presence of a reward signal [26,27]. While this learning rule tends to increase the probability of reproducing a spike sequence that leads to a reward, it cannot efficiently associate spiking activity among indirectly connected neurons. Signal transmission between indirectly connected neurons is crucial for task performance [28] because most neurons in the brain are connected indirectly [29].

We hypothesized that a critical role of traveling waves is to propagate neural activity between distant and indirectly connected neurons. Consistently, Lubenov et al. [18] reported that theta waves in the hippocampus assist signal transmission across areas, such as the amygdala, hypothalamus, and medial prefrontal cortex, and this is also suggested in humans [30,31]. Together with the standard reward-independent STDP [32], traveling waves could gradually create a repertoire of paths spreading from a wave-initiating site. Once such a repertoire is prepared, neurons are coherently activated along the paths so that reward-modulated STDP could select a subset of these paths to perform a task. We simulate computational models of reward-modulated STDP to study if traveling waves enhance learning.

Results

To test our hypothesis, we used relatively small excitatory spiking neural networks (N ~ 100) with a global inhibitory signal and a global dopaminergic signal. Fig 1A explains the scheme of our setting. For the spiking neuron model, we adopted the leaky integrate-and-fire neuron. The dynamics of the membrane potential v_i of neuron i are described by Eq (1), where v₀ = -70 mV is the resting potential, h_i is the synaptic input from surrounding excitatory neurons to neuron i, an external-input term represents the stimulation delivered to neuron i, h^inh is an inhibitory feedback signal that controls the overall firing rate of the network, computed as the running average of spikes from all neurons (see Material and Methods), and τ = 10 ms is the membrane time constant. The synaptic input h_i is updated according to dh_i/dt = −h_i/τ_h + h₀ ∑_j S_ij f_j(t − t_d), with synaptic time constant τ_h = 5 ms, scaling constant h₀ = 60 mV, excitatory synaptic weight S_ij from neuron j to neuron i, spike-train f_j of neuron j as a sum of delta functions peaking at neuron j’s spike timing, and synaptic transmission delay t_d = 2 ms. The neuron emits a spike when v_i reaches a spiking threshold of -54 mV and is then reset to -60 mV. In addition, each neuron receives uncorrelated white Gaussian noise ξ_i. The noise level is controlled by a time-dependent standard deviation σ_i(t), modulated by traveling waves as described below. A subset of neurons (stimulated neurons) receives external input, and the other neurons receive no external input (0 mV). The stimulated neurons receive input pulses at 200 Hz that force them to spike during the first 250 ms of each learning trial (see below for each task setup).
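The neuron model described above can be illustrated with a minimal NumPy sketch. It assumes one common additive form of a leaky integrate-and-fire equation, τ dv/dt = v₀ − v + h + h_ext − h_inh plus a noise term scaled by σ, integrated with an Euler-Maruyama step. The exact form of Eq (1), the noise scaling, the step size dt, and the names lif_step and h_ext are assumptions of this sketch and are not taken from the article; only the numerical constants come from the text.

```python
import numpy as np

V_REST = -70.0    # resting potential v0 (mV), from the text
V_THRESH = -54.0  # spiking threshold (mV), from the text
V_RESET = -60.0   # post-spike reset value (mV), from the text
TAU = 10.0        # membrane time constant (ms), from the text


def lif_step(v, h, h_ext, h_inh, sigma, rng, dt=0.1):
    """One Euler-Maruyama step of an assumed additive LIF equation.

    All inputs are in mV; dt is in ms (hypothetical step size).
    Returns the updated potentials and a boolean spike mask.
    """
    v = np.asarray(v, dtype=float)
    drift = (V_REST - v + h + h_ext - h_inh) / TAU
    # Assumed noise scaling; the article's exact scaling is not known here.
    noise = sigma * np.sqrt(dt / TAU) * rng.standard_normal(v.shape)
    v_new = v + drift * dt + noise
    spiked = v_new >= V_THRESH
    # Neurons that crossed the threshold are reset.
    v_new = np.where(spiked, V_RESET, v_new)
    return v_new, spiked
```

With the noise switched off (sigma = 0), a neuron at rest with no input stays at rest, while a strongly driven neuron just below threshold crosses it and is reset.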

Fig 1. Schematic explanation of the modified reward-modulated STDP rule.

(A) Overview of the whole network. (B) The STDP learning window. (C) The mechanism of synaptic plasticity. The synaptic weight changes as a product of the eligibility trace c(t) and the dopaminergic signal D(t). (D) The upstate propagation from a presynaptic neuron to a postsynaptic neuron.

https://doi.org/10.1371/journal.pcbi.1008700.g001

As a synaptic plasticity rule (Fig 1B and 1C), we used a modified version of the reward-modulated STDP [26]. In the conventional model, synaptic plasticity does not occur in the absence of reward or punishment. However, recent research suggests that the dopaminergic signal has two different timescales: tonic and phasic [33]. Therefore, we prepared two corresponding variables: the tonic variable D_t, which represents the baseline dopamine level, and the phasic variable D_p, which represents the dopaminergic signal driven by a reward or punishment. We assume that D_t signaling induces reward-independent STDP and that D_p signaling induces reward-dependent STDP. The amount of reward or punishment declines exponentially after the stimulation offset with a decay time-constant of 200 ms. Both dopaminergic components are assumed to be modulated by the novelty [34] of the task. Toward the end of the simulations, both D_t and D_p slowly declined to terminate learning and fix the network (see Material and Methods). Note that the dopaminergic signals D_t and D_p are global variables common to all synapses. The synaptic weight S_ij (0 ≤ S_ij ≤ S_max) from neuron j to i is adjusted when c_ij > 0 or D_p > 0 according to Eq (2), where S_max = 0.24 is the maximum synaptic weight, τ_s = 1 ms is a time unit, and c_ij (−S_max/2 ≤ c_ij ≤ S_max/2) is the so-called STDP eligibility trace [26], which accumulates the effects of plasticity events with time-constant τ_c = 1000 ms as described by Eq (3), where f_i is the spike-train of neuron i and its running average is taken with a time constant τ_STDP. The increment of c_ij follows a typical asymmetric STDP window [35] with amplitude γ = 0.0009 and time-constant τ_STDP = 30 ms (Fig 1B). The trace c_ij instantaneously increases if there is a pre-before-post event, instantaneously decreases if there is a post-before-pre event, and otherwise decays exponentially with the time-constant τ_c. The upper and lower bounds of c_ij limit the speed of synaptic change.
We assumed no changes in the synaptic weight when c_ij < 0 and D_p < 0.
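The plasticity rule described above can be sketched for a single synapse as follows. This is a hedged illustration, not the article's implementation: it assumes that the tonic and phasic dopamine signals simply add, that the weight change is the product of the eligibility trace and that summed signal (as the Fig 1C caption suggests), and that the running averages of the pre- and postsynaptic spike trains are maintained by the caller. The function names, the helper clip, and the step size dt are hypothetical.

```python
S_MAX = 0.24      # maximum synaptic weight, from the text
TAU_S = 1.0       # time unit (ms), from the text
TAU_C = 1000.0    # eligibility-trace time constant (ms), from the text
GAMMA = 0.0009    # STDP amplitude, from the text


def clip(x, lo, hi):
    """Restrict x to the closed interval [lo, hi]."""
    return max(lo, min(hi, x))


def update_trace(c, pre_avg, post_avg, pre_spike, post_spike, dt=1.0):
    """Decay the eligibility trace, then apply STDP jumps.

    pre_avg / post_avg are running averages of the pre- and postsynaptic
    spike trains (time constant tau_STDP), supplied by the caller.
    """
    c -= c * dt / TAU_C
    if post_spike:          # pre-before-post event: the trace increases
        c += GAMMA * pre_avg
    if pre_spike:           # post-before-pre event: the trace decreases
        c -= GAMMA * post_avg
    return clip(c, -S_MAX / 2, S_MAX / 2)


def update_weight(s, c, d_tonic, d_phasic, dt=1.0):
    """Change the weight by (trace x dopamine); no change if c<0 and D_p<0."""
    if c < 0 and d_phasic < 0:
        return s
    # Assumption: tonic and phasic dopamine contributions simply add.
    s += c * (d_tonic + d_phasic) * dt / TAU_S
    return clip(s, 0.0, S_MAX)
```

The explicit early return mirrors the stated exception: without it, a negative trace multiplied by a negative phasic signal would potentiate the synapse.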

For the wave, we used a simple custom-made propagation rule. The upstate is defined as a high noise level state (σ_i(t) ~ 6 mV), while the downstate is a low noise level state (σ_i(t) ~ 3 mV). These noise levels roughly reproduce the experimentally observed firing rate of 5 Hz in the upstate and 0 Hz in the downstate [36]. The initial upstate spreads from the externally stimulated neurons in each trial. Then, the upstate propagates from these neurons to the peripheral neurons. The noise level is determined by σ_i(t) = α_i∙ψ_i + 3 mV with influx coefficient α_i (see Material and Methods) and local field ψ_i, representing the average activity of a non-modeled neuron mass around the modeled neuron i. To control the noise level, we constrained the range of σ_i(t) between 3 and 6 mV and the range of ψ_i between -1 mV and 100 mV. ψ_i is updated (Fig 1D) by Eq (4), where τ_w = 200 ms is the time constant of waves, δt = 20 ms is a propagation delay, θ = 0.001 is a threshold for wave propagation, and the expressions j_→i and j_i→ represent the sets of indices j that have connections incoming to and outgoing from neuron i, respectively. [x]₊ is the rectified linear function that takes x for positive x and 0 otherwise. g_i(t) describes the time-dependent drive for the local field ψ_i by the external input. For stimulated neuron i, g_i(t) integrates the external input from the stimulation-onset time t_on while time t is in the stimulation interval; its expression uses the modulo function, mod. Thus, g_i(t) discontinuously increases by η every 5 ms but is constant within this interval. We assume that g_i(t) = -5 mV after the stimulation interval. For non-stimulated neurons, g_i(t) = 0 mV always holds. The gain factor η takes a task-dependent value as described in Material and Methods. Altogether, the local field around stimulated neurons rapidly increases at the beginning of each learning trial and then diffuses as a wave to the local field of connected neurons. By the end of the learning trial of duration 3.0 s, ψ_i for all neurons decays close to zero. Neurons are placed on a two-dimensional square sheet. A rigid boundary condition is used so that the waves collapse at the edges of the sheet. To highlight the role of traveling waves, we also simulate models without waves. A constant noise level, σ_i, is used in these models. The value of σ_i is chosen so that the overall firing rate is the same as that of the corresponding model with waves. We define the conventional model as the model without the tonic dopamine signal and traveling waves.
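Two pieces of the wave rule above are fully specified in words and can be sketched directly: the mapping from the local field ψ_i to the noise level σ_i, and the staircase drive g_i(t) for a stimulated neuron. The sketch below is illustrative only. The staircase is written with a floor function and is assumed to start at 0 at stimulation onset and to be 0 before onset; the article's own expression (which uses mod) is not reproduced here. The function names and the values passed for α and η are hypothetical.

```python
import math

SIGMA_MIN, SIGMA_MAX = 3.0, 6.0   # noise-level bounds (mV), from the text
PSI_MIN, PSI_MAX = -1.0, 100.0    # local-field bounds (mV), from the text
PULSE_PERIOD = 5.0                # ms; 200 Hz input pulses, from the text
STIM_DURATION = 250.0             # stimulation interval (ms), from the text
G_AFTER = -5.0                    # drive after stimulation (mV), from the text


def noise_level(alpha, psi):
    """sigma = alpha * psi + 3 mV, with psi and sigma kept in their ranges."""
    psi = min(max(psi, PSI_MIN), PSI_MAX)
    sigma = alpha * psi + SIGMA_MIN
    return min(max(sigma, SIGMA_MIN), SIGMA_MAX)


def drive(t, t_on, eta, stimulated=True):
    """Staircase drive g_i(t): steps up by eta every 5 ms during stimulation."""
    if not stimulated:
        return 0.0
    elapsed = t - t_on
    if elapsed < 0:
        return 0.0              # assumption: no drive before onset
    if elapsed >= STIM_DURATION:
        return G_AFTER
    # Assumption: the staircase starts at 0 and adds eta per completed period.
    return eta * math.floor(elapsed / PULSE_PERIOD)
```

For example, with a hypothetical gain η = 2 mV, the drive 12 ms after onset has completed two 5-ms periods and equals 4 mV.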

Below, we conducted three tasks to illustrate our points. In Task 1, we demonstrate that the combination of reward-dependent STDP and traveling waves can selectively reinforce reward-related paths. In Task 2, we show that traveling waves can empower reward-dependent STDP to reinforce initially weak shortcut paths. In Task 3, we show that the reward-dependent STDP and waves can be exploited to learn the XOR function.

Task 1: Selectively reinforcing poly-synaptic paths

First, we demonstrate that the combination of traveling waves and the STDP rule can strengthen a specific path from a stimulated neuron to a target neuron. This task is especially important in large-scale networks such as the brain because most neurons are indirectly connected. A local STDP rule alone does not efficiently solve this task because coherent activation of distant neurons is rare before learning. Wave signals compensate for this deficiency and facilitate the learning of poly-synaptic paths. This effect turns out to be evident, especially in the presence of the tonic dopaminergic signal D_t, which is not included in the conventional reward-modulated STDP rule. The D_t signal induces a reward-independent STDP that works synergistically with traveling waves to prepare a repertoire of paths starting from the stimulated neuron (see below).

Fig 2 shows the setting and results of this task. Fig 2A Left shows the initial network setting of this task. The central neuron S with coordinates (600 μm, 600 μm) is stimulated by external input. This task aims to strengthen the path from S to target neuron T positioned at the bottom (600 μm, 100 μm). We also prepared three false-target neurons F at the left (100 μm, 600 μm), right (1100 μm, 600 μm), and top (600 μm, 1100 μm), respectively. Synaptic connections from S are all outward, while the synaptic connections to T and F are all inward. Other neurons are randomly and unidirectionally connected to adjacent neurons within 200√2 μm with a probability of 0.5. If a neuron is isolated by chance, we repeat the procedure until it gets connected. We used this recurrently connected neural network to model a two-dimensional cortical sheet. The task we consider is information routing in a cortical sheet required for some animal tasks, such as learning an appropriate action in response to a stimulus by preparing a path from visual neurons to motor neurons [37]. The central neuron was stimulated during the first 250 ms of each trial. This causes a traveling wave to build up there and spread to the surrounding neurons gradually. A reward or punishment signal is provided (see Material and Methods for details) if the summed spike-count from the target or non-target neurons reaches a threshold level of 5 in each trial. If the target neuron spikes more than the other three false-target neurons during and after the stimulation, the reward signal D_p (> 0) is provided to the whole network. Meanwhile, if any of the false-target neurons spikes more than the target neuron, the punishment signal D_p (< 0) is provided. We repeated this 3.0-s trial 80 times.
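The reward decision for one Task 1 trial can be sketched as a small function. This is an interpretation under stated assumptions rather than the article's code: the spike-count threshold is applied to the sum over the target and all false-target neurons, "spikes more than the other three" is read as exceeding the most active false-target neuron, and a tie is assumed to give no signal. The function name and its return convention (+1, -1, 0) are hypothetical; the magnitude and time course of D_p are not modeled.

```python
SPIKE_COUNT_THRESHOLD = 5  # summed spike-count threshold, from the text


def reward_sign(target_count, false_target_counts):
    """Return +1 (reward), -1 (punishment), or 0 (no signal) for one trial."""
    total = target_count + sum(false_target_counts)
    if total < SPIKE_COUNT_THRESHOLD:
        return 0                      # too little output activity: no signal
    strongest_false = max(false_target_counts)
    if target_count > strongest_false:
        return 1                      # target wins: reward, D_p > 0
    if strongest_false > target_count:
        return -1                     # a false target wins: punishment, D_p < 0
    return 0                          # assumption: a tie gives no signal
```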

Fig 2. Upstate propagation improves the reinforcement task of poly-synaptic paths.

(A) An example of a successful trial. The initial synaptic weights are represented in color (Left). All neurons are aligned in a grid with 100 μm spacing, and the adjacent neurons within 200√2 μm are randomly connected with a probability of 0.5. Synaptic connections from stimulated neuron S are all outward, while the synaptic connections to target neuron T and false-target neurons F are all inward. The path from S to T is selectively strengthened at the end of the learning (Middle). The difference between the initial synaptic weight and the final synaptic weight (Right). (B) A successful example of this task. The firing rate of the target neuron selectively increases. (C) The averaged synaptic weight difference from the initial condition to the 10th, 25th, and 40th trials is plotted. The averaged synaptic weights are calculated, including the direction of synaptic weights (so that the opposite direction has a minus sign). (D) The averaged synaptic weight difference between the initial trial and the last trial (the 80th trial) is plotted. (E) The success rate of each condition (50 simulations averaged). The shaded area indicates the standard error of the mean. A combination of wave and tonic dopamine signal D_t (red line) shows the best task performance, while the conventional model (black line) fails to complete this task.

https://doi.org/10.1371/journal.pcbi.1008700.g002

In a successful case, the paths from the stimulated neuron at the center to the target neuron at the bottom are selectively strengthened (Fig 2A). Fig 2B shows a successful example of the firing rate of the target (red) and the false-target neurons (black). The firing rate of the target neuron was selectively increased. Fig 2C indicates that the correct paths are gradually strengthened. In the last trial, the combination of waves and the D_t signal successfully establishes a path from the stimulated neuron to the target neuron (Fig 2D). The success rate of each condition is indicated in Fig 2E. Our full model shows the best task performance, while the conventional model (without waves and the D_t signal) fails in this task. For this task to be completed, the D_t signal is critical because the input signal from the stimulated neuron does not reach the target neuron in the initial setting (S1 Fig). Hence, a reward or punishment signal is too unreliable to train the network at the beginning. In contrast, reward-independent STDP, induced by the D_t signal, gradually establishes radially symmetric outbound paths spreading from the stimulated neuron (S2 Fig). Traveling waves speed up this process by enhancing radially spreading neural activity, but they are not effective in the absence of the D_t signal because they drive noisy neural activity (S1 Fig). Once radially symmetric candidate paths are formed (S2 Fig), reward-modulated STDP can select paths toward the target neuron based on reward and punishment signals (Fig 2).

Task 2: Finding a shortcut

The combination of the wave signal and STDP rule can also help find the shortest paths from the stimulated neuron to a target. Generally, finding short paths is vital for fast and reliable computation—transmission through detour paths is slow and fragile because successful transmission depends on multiple neurons’ states, which are unreliable in nature. Finding an initially weak shortcut path might be difficult without traveling waves because the neurons along the shortcut path would seldom be activated coherently. Wave propagation can significantly increase this probability and accelerate the learning process.

Fig 3 shows the setting and results of this task. Similar to Task 1, we placed a stimulated neuron and a target neuron. The stimulated neuron S is located upper-left at (100 μm, 500 μm), and the target neuron T is located bottom-left at (100 μm, 100 μm) (Fig 3A). Neurons within 100√5 μm are randomly and unidirectionally connected with a probability of 0.5. If a neuron is isolated by chance, we repeat the procedure until it gets connected. The synaptic weights of a detour path are initially set three times as strong as the other synapses. The stimulated neuron receives external input at the beginning of each trial for 250 ms. Initially, the signal is only transferred through the detour path, which takes more than 160 ms to reach the target neuron. Meanwhile, it takes less than 100 ms when the signal is transferred through the shortcut paths after learning. This setting could reflect inter-regional signal transmission, for example, where the shorter paths represent direct signal transmission, and the detour paths represent the signal transmission via several relay stations.

Fig 3. Wave propagation helps find a shortcut.

(A) An example of a successful trial. Each panel represents the initial synaptic weight (Left), the last synaptic weight (Middle) and the difference between them (Right). Initially, a strong detour path from the stimulated neuron S on the upper-left at (100 μm, 500 μm) to the target neuron T on the bottom-left at (100 μm, 100 μm) is prepared. Neurons within 100√5 μm are randomly connected with a probability of 0.5. Synaptic connections from S are all outward, while synaptic connections to T are all inward. At the end of the trial, the shortcut paths are strengthened while the detour paths are preserved. (B) The averaged synaptic weight difference from the initial trial to the 10th, 25th, and 40th trials is plotted. The averaged synaptic weights are calculated for each neuron, including the direction of synaptic connection. (C) The averaged synaptic weight difference between the initial trial and the last trial (the 60th trial) is plotted. (D) The amount of reward signal is plotted. The wave conditions can successfully escape a local solution state and reach a better solution for this task. The error bar indicates the standard error of the mean. (E) The latency index takes the latency of the first spike in the target neuron after the stimulus onset if it is below 300 ms and takes 300 ms if the latency is above 300 ms. The condition with waves and tonic dopaminergic signal D_t (red) shows the best performance, while the conventional model (black) fails. The error bar indicates the standard error of the mean.

https://doi.org/10.1371/journal.pcbi.1008700.g003

In a successful case, shorter paths are strengthened while the detour paths are moderately strengthened (Fig 3A). This network change occurs with a continuous reinforcement of shortcut paths (Fig 3B). The wave condition successfully establishes shortcut paths, while the no-wave condition cannot strengthen them (Fig 3C). The D_t signal enhances the role of waves by further strengthening the shortcut paths by reward-independent STDP but is not effective on its own because synapses along the shortcut paths are initially too weak to induce spiking activity in the absence of waves. Fig 3D shows the overall performance of this task. Note that the amount of reward declines with the latency of activating the target neuron (see Material and Methods). The wave condition with the D_t signal outperforms the conventional model. Fig 3E represents the averaged latency index for obtaining a reward after trial onset. The latency index is equal to the latency of the first spike in the target neuron after the stimulus onset but saturates for latency above 300 ms to be insensitive to outliers. The latency index decreases faster in wave conditions than in no-wave conditions. While the effect of D_t on task performance is evident in these networks of recurrently connected neurons, the effect is less prominent in feedforward networks (S1 Text). This result shows that D_t-induced reward-independent STDP is especially important in selectively strengthening outbound paths from the stimulated neuron.
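The latency index used in Fig 3E is simple enough to state as code. The sketch below follows the definition in the text (first-spike latency of the target neuron after stimulus onset, saturating at 300 ms). Treating a trial with no target spike as saturated is an assumption of this sketch, as are the function name and the convention that spike times are given in ms.

```python
LATENCY_CAP = 300.0  # ms, saturation value from the text


def latency_index(spike_times, stimulus_onset=0.0):
    """First-spike latency of the target neuron, capped at 300 ms."""
    latencies = [t - stimulus_onset for t in spike_times if t >= stimulus_onset]
    if not latencies:
        return LATENCY_CAP  # assumption: no spike counts as saturated
    return min(min(latencies), LATENCY_CAP)
```

Capping the value keeps the trial average from being dominated by rare trials in which the target neuron fires very late or not at all.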

Task 3: Learning a nonlinear function

In this task, we demonstrate that our model is useful for a more practical setting. Here, we show that the XOR function can be learned in our model as well. Nonlinear functions such as the XOR function are essential for complex calculation, but how to realize them efficiently with the reward-modulated STDP rule remains to be seen. We propose that our model has an advantage in this task because some nonlinear functions can be created by finding appropriate poly-synaptic paths. Among the various kinds of nonlinear functions, we chose the XOR function because of its simplicity and universality of logic gates [38]. It is widely known that implementing an XOR function requires a hidden layer in a feedforward neural network. Therefore, this task is difficult for the STDP rule because indirect paths should be learned. Our model can alleviate this difficulty and facilitate the learning process.

Fig 4 shows the setting and results of this experiment. In this task, we used four stimulated neurons located at the bottom, namely 0_a (15 μm, 0 μm), 1_a (45 μm, 0 μm), 0_b (75 μm, 0 μm), and 1_b (105 μm, 0 μm) (Fig 4A). In the middle line at Y = 100 μm, 120 neurons were aligned. In the initial setting, these middle layer neurons receive a strong projection (S_ij = 0.2) from the nearest stimulated neuron and a weak projection (S_ij = 0.1) from another randomly selected stimulated neuron. Two target neurons are positioned at the top, namely, F (30 μm, 200 μm) and T (90 μm, 200 μm). Each middle layer neuron has a strong projection (S_ij = 0.2) to one of them. During this task, four different stimuli are provided, where one of the pairs of stimulated neurons 0_a0_b, 0_a1_b, 1_a0_b, or 1_a1_b receives external input. At the beginning of each trial, the corresponding neurons were stimulated for 250 ms. The target neuron for each of the four stimuli was F, T, T, and F, respectively. If the corresponding target neuron fires more than the other neuron, the reward signal D_p (> 0) is provided. Otherwise, the punishment signal D_p (< 0) is provided. The reason for initially having weak inputs from the stimulated neurons to the middle layer neurons is to expedite learning. If these connections are strong enough, the task can be solved simply by learning the output-layer synapses. We set these synapses weak enough so that the task performance remains near the chance level when only the output-layer synapses are learned. We use a feedforward network in this task, which may be implemented, for example, in three information-processing layers (e.g., layer 4 to layer 2–3 to layer 5) in a cortical column [11,39–41].
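The stimulus-to-target mapping and the sign of the dopaminergic signal in the XOR task can be sketched as a lookup table plus a comparison. The mapping and the rule (reward if the designated target fires more than the other output neuron, punishment otherwise) follow the text. The names XOR_TARGETS and xor_reward_sign and the +1/-1 return convention are hypothetical, and the magnitude of D_p is not modeled.

```python
# Stimulus pairs and their rewarded output neuron, as listed in the text.
XOR_TARGETS = {
    ("0_a", "0_b"): "F",
    ("0_a", "1_b"): "T",
    ("1_a", "0_b"): "T",
    ("1_a", "1_b"): "F",
}


def xor_reward_sign(stimulus, count_t, count_f):
    """Return +1 if the designated target fired more than the other, else -1."""
    target = XOR_TARGETS[stimulus]
    if target == "T":
        target_count, other_count = count_t, count_f
    else:
        target_count, other_count = count_f, count_t
    # "Otherwise" in the text includes ties, so a tie is punished here.
    return 1 if target_count > other_count else -1
```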

Fig 4. Wave propagation is useful for learning a nonlinear function.

(A) A successful example of synaptic weight change. The initial condition (Left), the last condition (Middle), and the difference (Right). Each middle layer neuron receives a strong synaptic weight and a weak synaptic weight from stimulated neurons and sends an output to T or F. The success rate is initially at the chance level. (B) The averaged synaptic weight difference between each middle layer neuron projecting to T and F is plotted at the 5th, 10th, and 15th trials. Each path strength is calculated as a product of the averaged synaptic weight from stimulated neurons to middle layer neurons and middle layer neurons to a target neuron. Percentage changes in averaged synaptic weight are shown in color. (C) The same plots as (B) at the last trial (the 25th trial). Wave condition (top column) successfully learns the correct paths, while no-wave condition (lower column) fails. (D) The success rate of the XOR task. Wave conditions (red and yellow) show better results than no-wave conditions (green and black).

https://doi.org/10.1371/journal.pcbi.1008700.g004

Fig 4A shows the synaptic weight change in a successful case. The relevant connections are selectively strengthened or weakened. Each synaptic path strength is calculated in Fig 4B. Correct paths are gradually strengthened through the trial. In the last trial (the 25th trial), the wave condition successfully established the correct paths, while the no-wave condition failed (Fig 4C). Fig 4D shows the task performance for each condition. The wave conditions (red and yellow) perform better than the no-wave conditions (green and black) because, similar to Task 2, the weak connections can only be strengthened with the support of traveling waves. However, the contribution of D_t is small in this task because the signal transmission from the stimulated neurons to the target neurons is easily achieved from the beginning in the presence of waves due to the disynaptic feedforward structure.

Discussion

We have demonstrated that the combination of traveling waves and tonic dopaminergic signals enhances selective reinforcement of poly-synaptic paths. Further, we showed that this combination is also helpful for learning a shortcut and a nonlinear function. The advantage of traveling waves in sending signals across distant neurons is effectively utilized in the tasks we explored. Thus, we argue that a possible role of traveling waves in the brain is to aid local learning rules, such as the reward-modulated STDP, to efficiently learn poly-synaptic paths by inducing coherent activity in the neurons along them.

The advantage of the proposed model over the conventional model is twofold. First, the combination of traveling waves and the tonic dopaminergic signal helps to prepare paths starting from stimulated neurons. In our model, a tonic dopaminergic signal permits reward-independent STDP. In its presence, traveling waves efficiently create a repertoire of poly-synaptic paths spreading from the wave-initiation sites. Second, once a repertoire of paths from the stimulated neurons is prepared, a reward-dependent phasic dopaminergic signal can reinforce its subset. These features are consistent with the biological evidence of recent studies. Beeler et al. [42] showed that tonic and phasic dopamine have different roles; tonic dopamine modulates the degree of learning and its expression, while phasic dopamine is the main source of reinforcement learning. In addition, Schultz [43] suggests that the continuous emission of tonic dopaminergic signals controls the motivation for exploration, while the discrete phasic dopaminergic signal induces event-related synaptic plasticity. Our model is also testable by examining the relationship between traveling waves and learning in a specific environment, such as by selective blockade or enhancement of either the tonic or phasic component of the dopaminergic signal.

Our model suggests a mechanism of memory consolidation during slow-wave sleep. Some experiments have observed traveling waves across the entire brain during slow-wave sleep [9,10] and showed their importance in memory consolidation [21,22]. Importantly, dopaminergic neurons emit tonic signals during slow-wave sleep [44]. These studies indicate that the combination of traveling waves and tonic dopaminergic signals may consolidate memory. Our results agree with this view, supporting the idea that the coherent activation of neurons caused by traveling waves can prepare poly-synaptic paths for more rapid and reliable signal transmission (cf. Fig 3). Further studies on the role of traveling waves and dopaminergic signals in the efficacy of poly-synaptic paths during slow-wave sleep will likely elucidate the mechanism of memory consolidation.

One limitation of our model is the separation of dynamics between neural activity and wave propagation. In our model, wave propagation is modeled by the local field without specific relation to the membrane potential of neurons. While this approach is reasonable in our study that involves only a small number of neurons, the local field must be defined by the average activity of many neurons in reality [45]. Thus, future large-scale simulations could model the relationship between traveling waves and the membrane potential of neurons in an explicit manner. Further, the current model only involves global inhibition, but different classes of inhibitory neurons contribute to up- and down-states in distinct ways [46]. More subtle features of traveling waves might arise from such detailed modeling. Despite these limitations, our simple model revealed a synergy of traveling waves and dopaminergic signals to efficiently learn the directionality of information flow and distant neural network paths in a reinforcement task. This mechanism would be progressively more important for animals with a larger brain because distant and indirect paths are more dominant. Our study underscores the importance of coherent neural activity in the form of waves for coherent learning beyond pairs of neurons.

Material and methods

Simulation environment

We conducted all simulations using the Brian2 simulator (https://brian2.readthedocs.io/en/stable/), an open-source Python library for simulating spiking neurons [47]. Post-simulation analysis was performed with custom-made Python code. The source code is provided in S1 File.

Networks

The network of excitatory neurons is defined separately for each task (see Figs 2A, 3A and 4A). As described in the Results section, all excitatory neurons receive, for simplicity, a global inhibitory feedback signal and a global dopaminergic signal (Fig 1A). The full model is shown in Fig 5.

Fig 5. The whole system of our model.

Excitatory neurons are locally connected via synapses. An inhibitory feedback signal controls the firing rate of excitatory neurons, a global dopaminergic signal modulates the synaptic weights, and a wavefield created by the activities of other neurons controls the activity level of each excitatory neuron. The external input and the reward function are provided externally. The += operator means that the right-hand side is added to the left-hand side when an event happens (with a delay for D_p). We use the same parameters (except the three parameters summarized in Table 1) to learn three qualitatively different tasks in different network architectures, which underscores the robustness of the learning rule and the role of traveling waves.

https://doi.org/10.1371/journal.pcbi.1008700.g005

Inhibitory feedback

The inhibitory feedback signal h^inh controls the firing rate of excitatory neurons. The dynamics of h^inh are described by (5)

Table 1. The task-dependent variables are summarized.

https://doi.org/10.1371/journal.pcbi.1008700.t001

where τ_inh = 5 ms is the inhibitory time constant, f_i is the spike train (i.e., the sum of delta functions at the spike times) of neuron i, t_h = 1 ms is the transmission delay, and β is the inhibitory feedback strength. The values of β, summarized in Table 1, depend on the task because of differences in the number of neurons and the network structure.
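As a rough illustration of how such a feedback signal behaves, the following sketch integrates h^inh on a fixed time grid. It assumes that h^inh decays exponentially with τ_inh and jumps by β when a spike arrives after the delay t_h. This jump size and the function name `simulate_h_inh` are assumptions made for illustration only; the exact form of Eq (5) may differ, and this is not the Brian2 code in S1 File.

```python
import math

TAU_INH_MS = 5.0   # inhibitory time constant (from the text)
T_H_MS = 1.0       # transmission delay (from the text)


def simulate_h_inh(spike_times_ms, beta, t_end_ms, dt_ms=0.1):
    """Return h_inh sampled every dt_ms from 0 to t_end_ms.

    ASSUMPTION: h_inh decays exponentially with TAU_INH_MS and is
    incremented by beta at T_H_MS after each spike of any neuron.
    """
    n_steps = int(round(t_end_ms / dt_ms))
    # Count how many delayed spikes arrive at each grid step.
    arrivals = {}
    for t_spike in spike_times_ms:
        step = int(round((t_spike + T_H_MS) / dt_ms))
        arrivals[step] = arrivals.get(step, 0) + 1
    decay = math.exp(-dt_ms / TAU_INH_MS)  # exact decay over one step
    h = 0.0
    trace = []
    for step in range(n_steps + 1):
        h += beta * arrivals.get(step, 0)  # delayed spike input
        trace.append(h)
        h *= decay
    return trace
```

For example, `simulate_h_inh([2.0], beta=1.0, t_end_ms=10.0)` stays at zero until 3 ms, jumps to 1 at 3 ms, and has decayed to about 1/e of that value 5 ms later.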

The inhibitory feedback strength β roughly correlates with the number of neurons N. The initial amplitude of the dopamine signal, d_p, is chosen to give the best result for each task. The wave amplitude constant η is chosen depending on the network structure: recurrent networks need a relatively larger value than feedforward networks (see Supporting Information for solving Task 2 in feedforward networks). Note that, among the several parameters in the model, β, d_p, and η are chosen as representative parameters that control the basic ingredients of the model: the global inhibitory signal, the dopamine signal, and traveling waves, respectively.

Dopaminergic signals

The tonic dopaminergic signal D_t and the phasic dopaminergic signal D_p are essential components of our simulations. The D_t signal is expressed as (6) with the tonic dopamine constant d_t = 0.003 and the novelty function Novelty(t) (explained below). This setting is fixed across all tasks. We set D_t = 0 for the conventional model.

The D_p signal (-0.3 ≤ D_p ≤ 0.3) is adjusted depending on the performance in each task. D_p depends on three variables: the reward R, the amplitude function Γ_R for the reward, and the novelty variable Novelty (Fig 5). D_p decays exponentially according to (7) except when a target/false-target neuron spikes. When a target/false-target neuron spikes at time t, D_p instantaneously jumps at time t+t_p according to (8) where the decay constant is τ_p = 200 ms and the transmission delay is t_p = 100 ms. Note that the += operator indicates that the right-hand side is added to the left-hand-side variable upon a spiking event (with delay t_p). We recorded the spike counts of the target and false-target neurons in the vectors n_true and n_false, respectively, in each trial (these vectors are reset to zero at the end of each trial). The raw reward R is a function of n_true and n_false. For Task 1, we set R = 0 when these neurons are not very active, namely, when the total spike count of the one target and three false-target neurons is less than 5. This adds robustness to the simulation results. Once the total spike count reaches 5, R = 1.0 when the target spike count is the greatest among the four neurons and R = -0.5 when it is not. Therefore, (9) where I[∙] is the indicator function that takes 1 if the argument is true and 0 otherwise. By max(n_false) and sum(n_false) we mean the maximum and the sum of the spike counts of the three false-target neurons, respectively.
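The verbal definition of the Task 1 reward can be written as a small function. This is a minimal sketch, not the code in S1 File: the name `reward_task1` is hypothetical, and it assumes that "the greatest" means strictly greater than every false-target count, so a tie counts as a failure.

```python
def reward_task1(n_true, n_false, min_total=5):
    """Raw reward R for Task 1 from per-trial spike counts.

    n_true  -- spike count of the target neuron
    n_false -- spike counts of the three false-target neurons
    ASSUMPTION: a tie between the target and a false target gives -0.5.
    """
    if n_true + sum(n_false) < min_total:
        return 0.0   # neurons not active enough: no reward signal
    if n_true > max(n_false):
        return 1.0   # target neuron fired the most
    return -0.5      # a false-target neuron fired at least as much
```

The threshold on the total spike count is checked first, so a quiet trial yields R = 0 regardless of which neuron fired most.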

For Task 2, there is one target neuron and no false-target neuron. Therefore, we used (10) In this task, no punishment (R < 0) is given.

For Task 3, we again considered one target neuron and one false-target neuron. R = 1 when the target neuron fires at least 5 more spikes than the false-target neuron; R = -1 when the false-target neuron fires at least 5 more spikes than the target neuron; and R = 0 otherwise. Namely, (11) We set this margin of 5 spikes to induce a clear difference in the number of spikes between the target and false-target neurons.
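The Task 3 rule is a symmetric threshold on the difference in spike counts, which the following sketch makes explicit. The function name `reward_task3` is hypothetical, and the sketch assumes the margin is inclusive (a difference of exactly 5 spikes already counts); the source wording leaves this boundary case open.

```python
def reward_task3(n_true, n_false, margin=5):
    """Raw reward R for Task 3 from the two per-trial spike counts.

    ASSUMPTION: the margin is inclusive, i.e. a difference of exactly
    `margin` spikes is enough to trigger reward or punishment.
    """
    diff = n_true - n_false
    if diff >= margin:
        return 1.0    # target clearly more active
    if diff <= -margin:
        return -1.0   # false target clearly more active
    return 0.0        # difference too small to call
```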

Next, we introduce the reward-amplitude function Γ_R. The amount of reward begins to take a non-zero value after the stimulus onset time t_on, stays fixed until the stimulus offset time t_off, and then decays exponentially. Namely, (12) with a dopamine decay constant τ_d = 200 ms and the initial amplitude d_p, which is set depending on the task (see Table 1).
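The time course of Γ_R described above is piecewise: zero before onset, constant during the stimulus, and exponentially decaying afterwards. A minimal sketch follows; the function name, the treatment of the exact instants t_on and t_off, and the numerical values used in any example call are assumptions, since the actual d_p values are those of Table 1.

```python
import math

TAU_D_MS = 200.0  # dopamine decay constant (from the text)


def reward_amplitude(t_ms, t_on_ms, t_off_ms, d_p):
    """Reward amplitude Gamma_R at time t_ms.

    ASSUMPTION: the plateau includes both endpoints t_on and t_off.
    """
    if t_ms < t_on_ms:
        return 0.0   # before stimulus onset
    if t_ms <= t_off_ms:
        return d_p   # fixed amplitude during the stimulus
    # exponential decay after stimulus offset
    return d_p * math.exp(-(t_ms - t_off_ms) / TAU_D_MS)
```

With this form, the amplitude has fallen to d_p/e at 200 ms after the stimulus offset.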

Finally, we assume that dopamine release increases with novelty [48] and that novelty is high when the prediction error is high. We simply assume that Novelty (0 ≤ Novelty ≤ 1) decreases by 0.2 at the end of a correct trial and increases by 0.2 at the end of an incorrect trial. Here, we introduce task-dependent criteria for correct and incorrect trials. In Tasks 1 and 3, a trial is correct if R > 0 and incorrect if R ≤ 0 at the end of the trial. In Task 2, the criterion is the latency of signal transmission from the stimulated neuron to the target neuron: a latency of less than 100 ms is defined as a success.
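The novelty update amounts to a clipped step of ±0.2 per trial, with a task-dependent definition of a correct trial. The sketch below illustrates this; the function names are hypothetical, and treating a trial in which the signal never reaches the target (`latency_ms=None`) as a failure is an assumption.

```python
def is_correct(task, reward=None, latency_ms=None):
    """Task-dependent success criterion at the end of a trial.

    ASSUMPTION: in Task 2, a missing latency (signal never arrived)
    counts as an incorrect trial.
    """
    if task == 2:
        return latency_ms is not None and latency_ms < 100.0
    return reward > 0  # Tasks 1 and 3


def update_novelty(novelty, correct, step=0.2):
    """Decrease Novelty after a correct trial, increase it otherwise,
    and keep the result within [0, 1]."""
    novelty += -step if correct else step
    return min(1.0, max(0.0, novelty))
```

Starting from Novelty = 1, five consecutive correct trials bring the value down to (approximately, in floating point) zero, after which further correct trials leave it unchanged.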

Local field and influx coefficients

For the wave, we used a simple custom-made propagation rule. The upstate is defined as a high-noise state (σ_i(t) ~6 mV), while the downstate is a low-noise state (σ_i(t) ~3 mV). The noise level is determined by σ_i(t) = α_i∙ψ_i + 3 mV with an influx coefficient α_i and a local field ψ_i. The local field is updated as explained in the Results by (13) As an initial condition, we choose ψ_i = 0 for all i, which corresponds to the downstate. We assume that the upstate is induced by external stimuli (e.g., [49]). The influx coefficient α_i quantifies the sensitivity of neuron i’s noise level to ψ_i and is defined by (14) where t_on is again the trial onset. The coefficient α_i counts the number of neighboring local fields that have influenced ψ_i in each trial up to time t. The hyperbolic tangent function is introduced to implement a saturation effect. In the conventional setting, σ_i(t) is set to a constant value adjusted to give the same firing rate as in the wave condition.
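The relation between the local field and the noise level can be checked with a one-line function. This sketch only encodes σ_i = α_i∙ψ_i + 3 mV; it does not implement the update rules (13) and (14), and it assumes that the product α_i∙ψ_i is expressed in mV. The function name is hypothetical.

```python
BASELINE_MV = 3.0  # downstate noise level (from the text)


def noise_level_mv(alpha, psi):
    """Noise level sigma_i in mV for influx coefficient alpha and
    local field psi (ASSUMPTION: alpha * psi is in mV)."""
    return alpha * psi + BASELINE_MV
```

With ψ_i = 0 the function returns the downstate value of 3 mV for any α_i, and an upstate level of about 6 mV corresponds to α_i∙ψ_i of about 3 mV.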

Supporting information

S1 Fig. The difference of the signal-driven spikes and traveling-wave-driven spikes of the target neuron in Task 1.

An example of the membrane potential before learning (Top) and the membrane potential of the same neuron after learning (Middle). The red line indicates spike timing. Before learning, the signal from the stimulated neuron does not reach the target neuron and the target neuron does not fire. In contrast, after learning, the external input reaches the target neuron, and the firing rate increases during the stimulus period. Meanwhile, the rate of spontaneous spikes driven by the traveling wave does not change with learning. The noise level (black line) is changed by a traveling wave of upstate (Bottom). During the stimulus period at the onset of a trial, external input (green bar) is provided to the stimulated neuron.

https://doi.org/10.1371/journal.pcbi.1008700.s001

(TIFF)

S2 Fig. Task performance without the D_p signal in Tasks 1 and 2.

(A) The contribution of reward-independent STDP is shown for Task 1 by setting D_p = 0. The average synaptic weights are computed over 40 simulations, and their differences (from the initial trial to the 15th trial) are plotted with the D_t signaling and traveling waves (Left) and with the D_t signaling alone (Right). Outbound synaptic weights near the stimulated neuron are strengthened by the D_t signaling alone but more strongly with waves. In this task, the initial synaptic weights are set rather strong. Hence, the stimulated neuron can propagate its activity to neighboring neurons from the beginning, and poly-synaptic paths toward the target and false-target neurons are gradually extended by reward-independent STDP. This happens even without waves but more efficiently with waves, which contribute to the outbound spreading of neural activity. (B) The contribution of reward-independent STDP is shown for Task 2 by setting D_p = 0. The differences of the averaged weights (from the initial trial to the 40th trial) are plotted with the D_t signaling and traveling waves (Left) and with the D_t signaling alone (Right). The detour path is efficiently strengthened in both cases because it is strong enough to propagate neural activity from the beginning. However, the shortcut path is strengthened only with waves because it is too weak to propagate neural activity at the beginning. Hence, the shortcut path can propagate neural activity only with waves, and this propagation is required to gradually strengthen the path by reward-independent STDP.

https://doi.org/10.1371/journal.pcbi.1008700.s002

(TIFF)

S1 Text. Task 2 with feedforward network.

https://doi.org/10.1371/journal.pcbi.1008700.s003

(PDF)

S1 File. The source code used for simulations.

https://doi.org/10.1371/journal.pcbi.1008700.s004

(GZ)

References

1. Klimesch W. Memory processes, brain oscillations and EEG synchronization. Int J Psychophysiol. 1996;24(1–2): 61–100. pmid:8978436.
2. Burkitt GR, Silberstein RB, Cadusch PJ, Wood AW. The steady-state visually evoked potential and travelling waves. Clin Neurophysiol. 2000;111(2): 246–258. pmid:10680559.
3. Nunez PL, Srinivasan R. Electric fields of the brain: The neurophysics of EEG. Oxford University Press; 2006.
4. Srinivasan R, Bibi FA, Nunez PL. Steady-state visual evoked potentials: distributed local sources and wave-like dynamics are sensitive to flicker frequency. Brain Topogr. 2006;18(3): 167–187. pmid:16544207.
5. Grinvald A, Lieke EE, Frostig RD, Hildesheim R. Cortical point-spread function and long-range lateral interactions revealed by real-time optical imaging of macaque monkey primary visual cortex. J Neurosci. 1994;14(5 Pt 1): 2545–2568. pmid:8182427.
6. Slovin H, Arieli A, Hildesheim R, Grinvald A. Long-term voltage-sensitive dye imaging reveals cortical dynamics in behaving monkeys. J Neurophysiol. 2002;88(6): 3421–3438. pmid:12466458.
7. Nauhaus I, Busse L, Carandini M, Ringach DL. Stimulus contrast modulates functional connectivity in visual cortex. Nat Neurosci. 2009;12: 70–76. pmid:19029885.
8. Nauhaus I, Busse L, Ringach DL, Carandini M. Robustness of traveling waves in ongoing activity of visual cortex. J Neurosci. 2012;32(9): 3088–3094. pmid:22378881.
9. Mohajerani MH, McVea DA, Fingas M, Murphy TH. Mirrored bilateral slow-wave cortical activity within local circuits revealed by fast bihemispheric voltage-sensitive dye imaging in anesthetized and awake mice. J Neurosci. 2010;30(10): 3745–3751. pmid:20220008.
10. Massimini M, Huber R, Ferrarelli F, Hill S, Tononi G. The sleep slow oscillation as a traveling wave. J Neurosci. 2004;24(31): 6862–6870. pmid:15295020.
11. Sakata S, Harris KD. Laminar structure of spontaneous and sensory-evoked population activity in auditory cortex. Neuron. 2009;64(3): 404–418. pmid:19914188.
12. Harris KD, Thiele A. Cortical state and attention. Nat Rev Neurosci. 2011;12: 509–523. pmid:21829219.
13. Petersen CC, Grinvald A, Sakmann B. Spatiotemporal dynamics of sensory responses in layer 2/3 of rat barrel cortex measured in vivo by voltage-sensitive dye imaging combined with whole-cell voltage recordings and neuron reconstructions. J Neurosci. 2003;23(4): 1298–1309. pmid:12598618.
14. Steriade M, McCormick DA, Sejnowski TJ. Thalamocortical oscillations in the sleeping and aroused brain. Science. 1993;262(5134): 679–685. pmid:8235588.
15. Krull EM, Sakata S, Toyoizumi T. Theta oscillations alternate with high amplitude neocortical population within synchronized states. Front Neurosci. 2019;13(316): 1–16. pmid:31037053.
16. Lee BR, Mu P, Saal DB, Ulibarri C, Dong Y. Homeostatic recovery of downstate–upstate cycling in nucleus accumbens neurons. Neurosci Lett. 2008;434(3): 282–288. pmid:18329805.
17. Sato TK, Nauhaus I, Carandini M. Traveling waves in visual cortex. Neuron. 2012;75(2): 218–229. pmid:22841308.
18. Lubenov EV, Siapas AG. Hippocampal theta oscillations are travelling waves. Nature. 2009;459: 534–539. pmid:19489117.
19. Bringuier V, Chavane F, Glaeser L, Frégnac Y. Horizontal propagation of visual activity in the synaptic integration field of area 17 neurons. Science. 1999;283(5402): 695–699. pmid:9924031.
20. Rubino D, Robbins KA, Hatsopoulos NG. Propagating waves mediate information transfer in the motor cortex. Nat Neurosci. 2006;9: 1549–1557. pmid:17115042.
21. Rasch B, Büchel C, Gais S, Born J. Odor cues during slow-wave sleep prompt declarative memory consolidation. Science. 2007;315(5817): 1426–1429. pmid:17347444.
22. Miyamoto D, Hirai D, Murayama M. The roles of cortical slow waves in synaptic plasticity and memory consolidation. Front Neural Circuits. 2017;11: 92. pmid:29213231.
23. Calabresi P, Picconi B, Tozzi A, Filippo MD. Dopamine-mediated regulation of corticostriatal synaptic plasticity. Trends Neurosci. 2007;30(5): 211–219. pmid:17367873.
24. Frémaux N, Gerstner W. Neuromodulated spike-timing-dependent plasticity, and theory of three-factor learning rules. Front Neural Circuits. 2016;9(85): 1–19. pmid:26834568.
25. Kuśmierz Ł, Isomura T, Toyoizumi T. Learning with three factors: modulating Hebbian plasticity with errors. Curr Opin Neurobiol. 2017;46: 170–177. pmid:28918313.
26. Izhikevich EM. Solving the distal reward problem through linkage of STDP and dopamine signaling. Cereb Cortex. 2007;17(10): 2443–2452. pmid:17220510.
27. Klampfl S, Maass W. Emergence of dynamic memory traces in cortical microcircuit models through STDP. J Neurosci. 2013;33(28): 11515–11529. pmid:23843522.
28. Orsborn AL, Pesaran B. Parsing learning in networks using brain–machine interfaces. Curr Opin Neurobiol. 2017;46: 76–83. pmid:28843838.
29. Bassett DS, Bullmore ET. Small-world brain networks revisited. Neuroscientist. 2017;23(5): 499–516. pmid:27655008.
30. Zhang H, Watrous AJ, Patel A, Jacobs J. Theta and alpha oscillations are traveling waves in the human neocortex. Neuron. 2018;98(6): 1269–1281.e4. pmid:29887341.
31. Zhang H, Jacobs J. Traveling theta waves in the human hippocampus. J Neurosci. 2015;35(36): 12477–12487. pmid:26354915.
32. Dan Y, Poo MM. Spike timing-dependent plasticity: from synapse to perception. Physiol Rev. 2006;86: 1033–1048. pmid:16816145.
33. Floresco SB, West AR, Ash B, Moore H, Grace AA. Afferent modulation of dopamine neuron firing differentially regulates tonic and phasic dopamine transmission. Nat Neurosci. 2003;6: 968–973. pmid:12897785.
34. Li S, Cullen WK, Anwyl R, Rowan MJ. Dopamine-dependent facilitation of LTP induction in hippocampal CA1 by exposure to spatial novelty. Nat Neurosci. 2003;6: 526–531. pmid:12704392.
35. Bi GQ, Poo MM. Synaptic modifications in cultured hippocampal neurons: dependence on spike timing, synaptic strength, and postsynaptic cell type. J Neurosci. 1998;18(24): 10464–10472. pmid:9852584.
36. Harris KD, Bartho P, Chadderton P, Curto C, Rocha J, Hollender L, et al. How do neurons work together? Lessons from auditory cortex. Hear Res. 2010;271(1–2): 37–53. pmid:20603208.
37. Matsumoto K, Suzuki W, Tanaka K. Neuronal correlates of goal-based motor selection in the prefrontal cortex. Science. 2003;301(5630): 229–232. pmid:12855813.
38. Yang J, Yang W, Wu W. A novel spiking perceptron that can solve XOR problem. Neural Network World. 2011;1(11): 45–50.
39. Alexandre F, Guyot F, Haton JP, Burnod Y. The cortical column: A new processing unit for multilayered networks. Neural Netw. 1991;4(1): 15–25.
40. Lefort S, Tomm C, Floyd JC, Petersen CC. The excitatory neuronal network of the C2 barrel column in mouse primary somatosensory cortex. Neuron. 2009;61(2): 301–316. pmid:19186171.
41. Beeler JA, Daw N, Frazier CRM, Zhuang X. Tonic dopamine modulates exploitation of reward learning. Front Behav Neurosci. 2010;4: 170. pmid:21120145.
42. Westerberg JA, Cox MA, Dougherty K, Maier A. V1 microcircuit dynamics: altered signal propagation suggests intracortical origins for adaptation in response to visual repetition. J Neurophysiol. 2019;121: 1938–1952. pmid:30917065.
43. Schultz W. Behavioral dopamine signals. Trends Neurosci. 2007;30(5): 203–210. pmid:17400301.
44. Monti JM, Monti D. The involvement of dopamine in the modulation of sleep and waking. Sleep Med Rev. 2007;11(2): 113–133. pmid:17275369.
45. Muller L, Chavane F, Reynolds J, Sejnowski TJ. Cortical travelling waves: mechanisms and computational principles. Nat Rev Neurosci. 2018;19: 255–268. pmid:29563572.
46. Tahvildari B, Wolfel M, Duque A, McCormick DA. Selective functional interactions between excitatory and inhibitory cortical neurons and differential contribution to persistent activity of the slow oscillation. J Neurosci. 2012;32(35): 12165–12179. pmid:22933799.
47. Stimberg M, Brette R, Goodman DFM. Brian 2, an intuitive and efficient neural simulator. eLife. 2019;8. pmid:31429824.
48. Feenstra MGP, Botterblom MHA, Uum JFMV. Novelty-induced increase in dopamine release in the rat prefrontal cortex in vivo: inhibition by diazepam. Neurosci Lett. 1995;189(2): 81–84. pmid:7609924.
49. Tan AYY, Chen Y, Scholl B, Seidemann E, Priebe NJ. Sensory stimulation shifts visual cortex from synchronous to asynchronous states. Nature. 2014;509: 226–229. pmid:24695217.

[ref1] 1. Klimesch W. Memory processes, brain oscillations and EEG synchronization. Int J Psychophysiol. 1996;24(1–2): 61–100. pmid:8978436.
View Article
PubMed/NCBI
Google Scholar

[2] View Article

[3] PubMed/NCBI

[4] Google Scholar

[ref2] 2. Burkitt GR, Silberstein RB, Cadusch PJ, Wood AW. The steady-state visually evoked potential and travelling waves. Clin Neurophysiol. 2000;111(2): 246–258. pmid:10680559.
View Article
PubMed/NCBI
Google Scholar

[6] View Article

[7] PubMed/NCBI

[8] Google Scholar

[ref3] 3. Nunez PL, Srinivasan R. Electric fields of the brain: The neurophysics of EEG. Oxford University Press; 2006.

[ref4] 4. Srinivasan R, Bibi FA, Nunez PL. Steady-state visual evoked potentials: distributed local sources and wave-like dynamics are sensitive to flicker frequency. Brain Topogr. 2006;18(3): 167–187. pmid:16544207.
View Article
PubMed/NCBI
Google Scholar

[11] View Article

[12] PubMed/NCBI

[13] Google Scholar

[ref5] 5. Grinvald A, Lieke EE, Frostig RD, Hildesheim R. Cortical point-spread function and long-range lateral interactions revealed by real-time optical imaging of macaque monkey primary visual cortex. J Neurosci. 1994;14(5 Pt 1): 2545–2568. pmid:8182427.
View Article
PubMed/NCBI
Google Scholar

[15] View Article

[16] PubMed/NCBI

[17] Google Scholar

[ref6] 6. Slovin H, Arieli A, Hildesheim R, Grinvald A. Long-term voltage-sensitive dye imaging reveals cortical dynamics in behaving monkeys. J Neurophysiol. 2002;88(6): 3421–3438. pmid:12466458.
View Article
PubMed/NCBI
Google Scholar

[19] View Article

[20] PubMed/NCBI

[21] Google Scholar

[ref7] 7. Nauhaus I, Busse L, Carandini M, Ringach DL. Stimulus contrast modulates functional connectivity in visual cortex. Nat Neurosci. 2009;12: 70–76. pmid:19029885.
View Article
PubMed/NCBI
Google Scholar

[23] View Article

[24] PubMed/NCBI

[25] Google Scholar

[ref8] 8. Nauhaus I, Busse L, Ringach DL, Carandini M. Robustness of traveling waves in ongoing activity of visual cortex. J Neurosci. 2012;32(9): 3088–3094. pmid:22378881.
View Article
PubMed/NCBI
Google Scholar

[27] View Article

[28] PubMed/NCBI

[29] Google Scholar

[ref9] 9. Mohajerani MH, McVea DA, Fingas M, Murphy TH. Mirrored bilateral slow-wave cortical activity within local circuits revealed by fast bihemispheric voltage-sensitive dye imaging in anesthetized and awake mice. J Neurosci. 2010;30(10): 3745–3751. pmid:20220008.
View Article
PubMed/NCBI
Google Scholar

[31] View Article

[32] PubMed/NCBI

[33] Google Scholar

[ref10] 10. Massimini M, Huber R, Ferrarelli F, Hill S, Tononi G. The sleep slow oscillation as a traveling wave. J Neurosci. 2004;24(31): 6862–6870. pmid:15295020.
View Article
PubMed/NCBI
Google Scholar

[35] View Article

[36] PubMed/NCBI

[37] Google Scholar

[ref11] 11. Sakata S, Harris KD. Laminar structure of spontaneous and sensory-evoked population activity in auditory cortex. Neuron. 2009;64(3): 404–418. pmid:19914188.
View Article
PubMed/NCBI
Google Scholar

[39] View Article

[40] PubMed/NCBI

[41] Google Scholar

[ref12] 12. Harris KD, Thiele A. Cortical state and attention. Nat Rev Neurosci. 2011;12: 509–523. pmid:21829219.
View Article
PubMed/NCBI
Google Scholar

[43] View Article

[44] PubMed/NCBI

[45] Google Scholar

[ref13] 13. Petersen CC, Grinvald A, Sakmann B. Spatiotemporal dynamics of sensory responses in layer 2/3 of rat barrel cortex measured in vivo by voltage-sensitive dye imaging combined with whole-cell voltage recordings and neuron reconstructions. J Neurosci. 2003;23(4): 1298–1309. pmid:12598618.
View Article
PubMed/NCBI
Google Scholar

[47] View Article

[48] PubMed/NCBI

[49] Google Scholar

[ref14] 14. Steriade M, McCormick DA, Sejnowski TJ. Thalamocortical oscillations in the sleeping and aroused brain. Science. 1993;262(5134): 679–685. pmid:8235588.
View Article
PubMed/NCBI
Google Scholar

[51] View Article

[52] PubMed/NCBI

[53] Google Scholar

[ref15] 15. Krull EM, Sakata S, Toyoizumi T. Theta oscillations alternate with high amplitude neocortical population within synchronized states. Front Neurosci. 2019. 13(316): 1–16. pmid:31037053.
View Article
PubMed/NCBI
Google Scholar

[55] View Article

[56] PubMed/NCBI

[57] Google Scholar

[ref16] 16. Lee BR, Mu P, Saal DB, Ulibarri C, Dong Y. Homeostatic recovery of downstate–upstate cycling in nucleus accumbens neurons. Neurosci Lett. 2008;434(3): 282–288. pmid:18329805.
View Article
PubMed/NCBI
Google Scholar

[59] View Article

[60] PubMed/NCBI

[61] Google Scholar

[ref17] 17. Sato TK, Nauhaus I, Carandini M. Traveling waves in visual cortex. Neuron. 2012;75(2): 218–229. pmid:22841308.
View Article
PubMed/NCBI
Google Scholar

[63] View Article

[64] PubMed/NCBI

[65] Google Scholar

[ref18] 18. Lubenov EV, Siapas AG. Hippocampal theta oscillations are travelling waves. Nature. 2009;459: 534–539. pmid:19489117.
View Article
PubMed/NCBI
Google Scholar

[67] View Article

[68] PubMed/NCBI

[69] Google Scholar

[ref19] 19. Bringuier V, Chavane F, Glaeser L, Frégnac Y. Horizontal propagation of visual activity in the synaptic integration field of area 17 neurons. Science. 1999;283(5402): 695–699. pmid:9924031.
View Article
PubMed/NCBI
Google Scholar

[71] View Article

[72] PubMed/NCBI

[73] Google Scholar

[ref20] 20. Rubino D, Robbins KA, Hatsopoulos NG. Propagating waves mediate information transfer in the motor cortex. Nat Neurosci. 2006;9: 1549–1557. pmid:17115042.
View Article
PubMed/NCBI
Google Scholar

[75] View Article

[76] PubMed/NCBI

[77] Google Scholar

[ref21] 21. Rasch B, Büchel C, Gais S, Born J. Odor cues during slow-wave sleep prompt declarative memory consolidation. Science. 2007;315(5817): 1426–1429. pmid:17347444.
View Article
PubMed/NCBI
Google Scholar

[79] View Article

[80] PubMed/NCBI

[81] Google Scholar

[ref22] 22. Miyamoto D, Hirai D, Murayama M. The roles of cortical slow waves in synaptic plasticity and memory consolidation. Front Neural Circuits. 2017;Volume 11: 92. pmid:29213231.
View Article
PubMed/NCBI
Google Scholar

[83] View Article

[84] PubMed/NCBI

[85] Google Scholar

[ref23] 23. Calabresi P, Picconi B, Tozzi A, Filippo MD. Dopamine-mediated regulation of corticostriatal synaptic plasticity. Trends Neurosci. 2007;30(5): 211–219. pmid:17367873.
View Article
PubMed/NCBI
Google Scholar

[87] View Article

[88] PubMed/NCBI

[89] Google Scholar

[ref24] 24. Frémaux N, Gerstner W. Neuromodulated spike-timing-dependent plasticity, and theory of three-factor learning rules. Front Neural Circuits. 2016. 9(85): 1–19. pmid:26834568.
View Article
PubMed/NCBI
Google Scholar

[91] View Article

[92] PubMed/NCBI

[93] Google Scholar

[ref25] 25. Kuśmierz Ł, Isomura T, Toyoizumi T. Learning with three factors: modulating Hebbian plasticity with errors. Curr Opin Neurobiol. 2017;46: 170–177. pmid:28918313.
View Article
PubMed/NCBI
Google Scholar

[95] View Article

[96] PubMed/NCBI

[97] Google Scholar

[ref26] 26. Izhikevich EM. Solving the distal reward problem through linkage of STDP and dopamine signaling. Cereb Cortex. 2007;17(10): 2443–2452. pmid:17220510.
View Article
PubMed/NCBI
Google Scholar

[99] View Article

[100] PubMed/NCBI

[101] Google Scholar

[ref27] 27. Klampfl S, Maass W. Emergence of dynamic memory traces in cortical microcircuit models through STDP. J Neurosci. 2013;33 (28): 11515–11529. pmid:23843522.
View Article
PubMed/NCBI
Google Scholar

[103] View Article

[104] PubMed/NCBI

[105] Google Scholar

[ref28] 28. Orsborn AL, Pesaran B. Parsing learning in networks using brain–machine interfaces. Curr Opin Neurobiol. 2017;46: 76–83. pmid:28843838.
View Article
PubMed/NCBI
Google Scholar

[107] View Article

[108] PubMed/NCBI

[109] Google Scholar

[ref29] 29. Bassett DS, Bullmore ET. Small-world brain networks revisited. Neuroscientist. 2017;23(5): 499–516. pmid:27655008.
View Article
PubMed/NCBI
Google Scholar

[111] View Article

[112] PubMed/NCBI

[113] Google Scholar

[ref30] 30. Zhang H, Watrous AJ, Patel A, Jacobs J. Theta and alpha oscillations are traveling waves in the human neocortex. Neuron. 2018;98(6): 1269–1281.e4. pmid:29887341.
View Article
PubMed/NCBI
Google Scholar

[115] View Article

[116] PubMed/NCBI

[117] Google Scholar

[ref31] 31. Zhang H, Jacobs J. Traveling theta waves in the human hippocampus. J Neurosci. 2015;35(36): 12477–12487. pmid:26354915.
View Article
PubMed/NCBI
Google Scholar

[119] View Article

[120] PubMed/NCBI

[121] Google Scholar

[ref32] 32. Dan Y, Poo MM. Spike timing-dependent plasticity: from synapse to perception. Physiol Rev. 2006;86: 1033–1048. pmid:16816145.
View Article
PubMed/NCBI
Google Scholar

[123] View Article

[124] PubMed/NCBI

[125] Google Scholar

[ref33] 33. Floresco SB, West AR, Ash B, Moore H, Grace AA. Afferent modulation of dopamine neuron firing differentially regulates tonic and phasic dopamine transmission. Nat Neurosci. 2003;6: 968–973. pmid:12897785.
View Article
PubMed/NCBI
Google Scholar

[127] View Article

[128] PubMed/NCBI

[129] Google Scholar

[ref34] 34. Li S, Cullen WK, Anwyl R, Rowan MJ. Dopamine-dependent facilitation of LTP induction in hippocampal CA1 by exposure to spatial novelty. Nat Neurosci. 2003;6: 526–531. pmid:12704392.
View Article
PubMed/NCBI
Google Scholar

[131] View Article

[132] PubMed/NCBI

[133] Google Scholar

[ref35] 35. Bi GQ, Poo MM. Synaptic modifications in cultured hippocampal neurons: dependence on spike timing, synaptic strength, and postsynaptic cell type. J Neurosci. 1998;18(24): 10464–10472. pmid:9852584.
View Article
PubMed/NCBI
Google Scholar

[135] View Article

[136] PubMed/NCBI

[137] Google Scholar

[ref36] 36. Harris KD, Bartho P, Chadderton P, Curto C, Rocha J, Hollender L, et al. How do neurons work together? Lessons from auditory cortex. Hear Res. 2010;271(1–2): 37–53. pmid:20603208.
View Article
PubMed/NCBI
Google Scholar

[139] View Article

[140] PubMed/NCBI

[141] Google Scholar

[ref37] 37. Matsumoto K, Suzuki W, Tanaka K. Neuronal correlates of goal-based motor selection in the prefrontal cortex. Science. 2003;301(5630): 229–232. pmid:12855813.
View Article
PubMed/NCBI
Google Scholar

[143] View Article

[144] PubMed/NCBI

[145] Google Scholar

[ref38] 38. Yang J, Yang W, Wu W. A novel spiking perceptron that can solve XOR problem. Neural Network World. 2011;1(11): 45–50.
View Article
Google Scholar

[147] View Article

[148] Google Scholar

[ref39] 39. Alexandre F, Guyot F, Haton JP, Burnod Y. The cortical column: A new processing unit for multilayered networks. Neural Netw. 1991;4(1): 15–25.
View Article
Google Scholar

[150] View Article

[151] Google Scholar

[ref40] 40. Lefort S, Tomm C, Floyd JC, Petersen CC. The excitatory neuronal network of the C2 barrel column in mouse primary somatosensory cortex. Neuron. 2009;61(2): 301–316. pmid:19186171.
View Article
PubMed/NCBI
Google Scholar

[153] View Article

[154] PubMed/NCBI

[155] Google Scholar

[ref41] 41. Beeler JA, Daw N, Frazier CRM, Zhuang X. Tonic dopamine modulates exploitation of reward learning. Front Behav Neurosci. 2010;Volume 4: 170. pmid:21120145.
View Article
PubMed/NCBI
Google Scholar

[157] View Article

[158] PubMed/NCBI

[159] Google Scholar

[ref42] 42. Westerberg JA, Cox MA, Dougherty K, Maier A. V1 microcircuit dynamics: altered signal propagation suggests intracortical origins for adaptation in response to visual repetition. J Neurophysiol. 2019;121: 1938–1952. pmid:30917065.
View Article
PubMed/NCBI
Google Scholar

[161] View Article

[162] PubMed/NCBI

[163] Google Scholar

[ref43] 43. Schultz W. Behavioral dopamine signals. Trends Neurosci. 2007;30(5): 203–210. pmid:17400301.

[ref44] 44. Monti JM, Monti D. The involvement of dopamine in the modulation of sleep and waking. Sleep Med Rev. 2007;11(2): 113–133. pmid:17275369.

[ref45] 45. Muller L, Chavane F, Reynolds J, Sejnowski TJ. Cortical travelling waves: mechanisms and computational principles. Nat Rev Neurosci. 2018;19: 255–268. pmid:29563572.

[ref46] 46. Tahvildari B, Wolfel M, Duque A, McCormick DA. Selective functional interactions between excitatory and inhibitory cortical neurons and differential contribution to persistent activity of the slow oscillation. J Neurosci. 2012;32(35): 12165–12179. pmid:22933799.

[ref47] 47. Stimberg M, Brette R, Goodman DFM. Brian 2, an intuitive and efficient neural simulator. eLife. 2019;8. pmid:31429824.

[ref48] 48. Feenstra MGP, Botterblom MHA, Uum JFMV. Novelty-induced increase in dopamine release in the rat prefrontal cortex in vivo: inhibition by diazepam. Neurosci Lett. 1995;189(2): 81–84. pmid:7609924.

[ref49] 49. Tan AYY, Chen Y, Scholl B, Seidemann E, Priebe NJ. Sensory stimulation shifts visual cortex from synchronous to asynchronous states. Nature. 2014;509: 226–229. pmid:24695217.
