We investigated the effects of 6 months' oral treatment with L-dihydroxy-phenylalanine (L-DOPA)/carbidopa on the remaining dopaminergic neurones of the substantia nigra pars compacta (SNC) and the ventral tegmental area (VTA) of rats with moderate or severe 6-hydroxydopamine (6-OHDA)-induced lesions and sham-operated animals. Using a radioimmunohistochemical method we counted tyrosine hydroxylase (TH)-radioimmunoreactive cells in the SNC and the VTA in emulsion-coated sections and measured the remaining surface area of both structures on autoradiograms. The sole difference observed was a significant increase of the remaining surface area of TH radioimmunolabelling in the SNC of moderately lesioned rats treated with L-DOPA/carbidopa compared with the untreated animals, while the rest of the parameters recorded, in both structures and groups of animals, were unchanged. This suggests that, in vivo, this treatment is not toxic either to healthy dopaminergic neurones of the ventral mesencephalon or to those surviving after a 6-OHDA lesion.
Summary The automatic initiation of actions can be highly functional. But occasionally these actions cannot be withheld and are released at inappropriate times, impulsively. Striatal activity has been shown to participate in the timing of action sequence initiation and it has been linked to impulsivity. Using a self-initiated task, we trained adult male rats to withhold a rewarded action sequence until a waiting time interval has elapsed. By analyzing neuronal activity we show that the striatal response preceding the initiation of the learned sequence is strongly modulated by the time subjects wait before eliciting the sequence. Interestingly, the modulation is steeper in adolescent rats, which show a strong prevalence of impulsive responses compared to adults. We hypothesize this anticipatory striatal activity reflects the animals’ subjective reward expectation, based on the elapsed waiting time, while the steeper waiting modulation in adolescence reflects age-related differences in temporal discounting, internal urgency states or explore-exploit balance.
Abstract The automatic initiation of actions can be highly functional. But occasionally these actions cannot be withheld and are released at inappropriate times, impulsively. Striatal activity has been shown to participate in the timing of action sequence initiation and it has been linked to impulsivity. Using a self-initiated task, we trained adult male rats to withhold a rewarded action sequence until a waiting time interval has elapsed. By analyzing neuronal activity we show that the striatal response preceding the initiation of the learned sequence is strongly modulated by the time subjects wait before eliciting the sequence. Interestingly, the modulation is steeper in adolescent rats, which show a strong prevalence of impulsive responses compared to adults. We hypothesize this anticipatory striatal activity reflects the animals' subjective reward expectation, based on the elapsed waiting time, while the steeper waiting modulation in adolescence reflects age-related differences in temporal discounting, internal urgency states, or explore–exploit balance. Editor's evaluation This article investigates an important topic related to the initiation signals of actions sequences detected in the dorsal striatum. The data presented convincingly support the idea that these signals distinguish between the premature versus the timely release of actions. The experiments are well-organized and substantially advance the field. https://doi.org/10.7554/eLife.74929.sa0 Introduction The striatum is involved in the acquisition and execution of action sequences (Costa, 2011; Graybiel, 1998; Hikosaka et al., 1999). 
It has also been linked to temporal information processing (Bakhurin et al., 2017; Emmons et al., 2017; Gouvêa et al., 2015; Matell et al., 2003; Mello et al., 2015) and to the use of temporal information for the action initiation timing (Kunimatsu et al., 2018; Thura and Cisek, 2017; Yau et al., 2020). Sometimes, well-learned action sequences need to be initiated at precise times to obtain the desired outcome. However, these actions may be difficult to withhold when triggering cues are present, and also difficult to stop once they have been initiated (Dalley and Robbins, 2017; Gillan et al., 2016a; Graybiel, 2008; Knowlton and Patterson, 2016; Robbins and Costa, 2017). Moreover, in neuropsychiatric conditions involving malfunctioning of cortico-basal ganglia circuits, such as attention deficit hyperactivity disorder, Tourette syndrome, obsessive–compulsive disorder, and drug addiction (Dalley and Robbins, 2017; Gillan et al., 2016a; Singer, 2016) action sequences might be started at inappropriate contexts and timings. However, how the striatum contributes to action sequence initiation timing remains poorly understood. Interestingly, impulsivity has been identified as a vulnerability factor for compulsive drug seeking habits (Belin and Everitt, 2008). Recent studies link impulsivity to automaticity in behavior (Ersche et al., 2019; Gillan et al., 2016b; Hogarth et al., 2012) and a preponderance of habitual over goal-directed behavioral control (Everitt et al., 2008; Voon et al., 2015). An influential theory in the field postulates a dual control system for behavior, where dorsolateral striatal (DLS) circuits support habitual stimulus–response control whereas dorsomedial striatal (DMS) circuits mediate cognitive-based deliberative control (Balleine and Dickinson, 1998; Daw et al., 2005; Graybiel, 2008; Yin and Knowlton, 2006). 
During action sequence learning, neuronal activity in the DLS rapidly evolves to mark the initiation and termination of the acquired sequence (Jin and Costa, 2010; Jog et al., 1999), possibly contributing to its release as a behavioral unit or chunk (Graybiel, 2008). In contrast, the DMS encodes reward expectancy, reward prediction errors, and trial outcomes even after extensive training (Kim et al., 2009; Kubota et al., 2009; Rueda-Orozco and Robbe, 2015; Samejima et al., 2005; Thorn et al., 2010; Vandaele et al., 2019). All these features likely contribute to the regulation of explore–exploit (Barnes et al., 2005), cost–benefit (Floresco et al., 2008; Schultz, 2015), and speed–accuracy (Thura and Cisek, 2017) tradeoffs during decision making. A bias toward exploration and risk-taking (Addicott et al., 2017), low tolerance for delayed rewards (Dalley and Robbins, 2017; Monterosso and Ainslie, 1999; Wittmann and Paulus, 2008) and elevated internal urgency states (Carland et al., 2019), may all contribute to impulsivity traits. However, how neuronal activity in the dorsal striatum informs about enhanced premature responding in conditions of high impulsivity, remains unknown. Here, we studied striatal activity during a self-paced task where rats have to withhold a rewarded action sequence until a waiting interval has elapsed (Zold and Hussain Shuler, 2015). Prematurely initiated sequences were penalized by re-initiating the waiting interval. Despite the negative effect this timer re-initialization may have, the animals showed premature responding even after extensive training. Moreover, they often failed to interrupt the execution of the action sequence although there was sensory evidence of its untimely initiation. Thus, the task allowed comparing striatal activity during behaviorally indistinguishable prematurely and timely executed learned action sequences. 
We found a peak of striatal activity preceding trial initiation that was modulated by the time waited before responding. In addition, this modulation grew at a faster rate in adolescent rats. This likely reflects a steeper increase in reward expectancy during waiting that could underlie their more impulsive behavior compared to adult rats. Results Rats learn to make timely action sequences to obtain a water reward Water-deprived rats were trained to obtain water from a lick tube located within a nose-poke by emitting a sequence of eight licks following a visual cue (Figure 1A, B; task modified from Zold and Hussain Shuler, 2015). Animals self-initiated the trials by entering into the nose-poke. In those trials initiated at least 2.5 s after the end of the previous trial (timely trials), a 100-ms duration visual cue (two symmetrical green LEDs located at the nose-poke sides) reported that there was a 0.5 probability of receiving a water reward. In contrast, prematurely initiated trials (<2.5 s waiting time) were penalized by re-initiating the timer and had no visual cue associated with them. Spike discharges and local field potentials (LFPs) were recorded from the dorsal striatum using custom-made tetrodes (Vandecasteele et al., 2012) (representative localization Figure 1C, for detailed localization see Figure 1—figure supplement 1). Figure 1. Rats become skilled in the task. (A) Behavioral chamber with the nose-poke. Animals' entries/exits from the nose-poke and licks are detected with infra-red (IR) beams. A 100-ms visual cue is presented through a pair of green LEDs placed on the side of the nose-poke to indicate a timely entry. (B) Schematic representation of the different types of trials. Timely trials require a minimum waiting time of 2.5 s and premature trials are those in which the minimum waiting time is not met. After that, trials are classified by whether animals performed an 8-lick sequence or not. 
(C) Top: Representative diagram of the electrodes' positioning, aimed at the dorsal striatum. Bottom: histological section (AP = 0.24 cm from bregma) with electrode traces. (D) Percentage of the different trial types per session, timely trials with an 8-lick sequence (T×8L) or not (T<8L) and premature trials with an 8-lick sequence (P×8L) or not (P<8L). (E) Reward rates for early and late training stages. Trial duration (F), latency to the first lick during correct trials (G), and time to complete the 8-lick sequence (H) for the two types of correct trials (rewarded and unrewarded), at each training stage. (E–H) Data are expressed as mean ± standard error of the mean (SEM), n = 15 early and 15 late sessions, n = 5 rats performing the standard 2.5 s waiting time task, ***p<0.001; **p<0.01. (I) Raster plot of the licks from 50 trials at a late training session of one of the adult rats. Colored dashes show the first 8 licks of the trial (timely rewarded trials: blue, timely unrewarded trials: lilac, premature trials: red). Adult rats learned to make timely nose-poke entries followed by an 8-lick sequence (Figure 1D). This was evidenced by a twofold increase in the reward rate in sessions late in training (after three consecutive sessions with >70% correct trials) compared to early in training (Figure 1E, p = 6.10 × 10−5, Wilcoxon matched pairs test, n pairs = 15). Performance in 8-lick trials became faster with training (Figure 1F–H): trial duration (learning stage: F(1, 14) = 19.85, p = 5.43 × 10−4, trial: F(1, 14) = 19.91, p = 5.37 × 10−4), latency to the first lick (learning stage: F(1, 14) = 12.59, p = 3.21 × 10−4), and time to complete the 8-lick sequence (learning stage: F(1, 14) = 18.33, p = 7.60 × 10−4) diminished with training for both rewarded and unrewarded timely trials (significant effect of learning stage, non-significant interaction, two-way repeated measures ANOVAs). 
A representative raster plot showing licking bouts during a whole session is shown in Figure 1A, I, and a representative video recording of behavior displayed during premature and timely trials is shown in Video 1. Video 1. Behavior in a late training session of an adolescent rat. Premature nose-poke entries delayed the opportunity to get the reward as evidenced by a negative correlation between the relative number of premature trials and reward rate across sessions (Figure 2A, R2 = 0.090, F(1, 60) = 5.96, p = 0.018, n = 62 sessions from 5 rats). Premature trials diminished with training from >30% of all trials at the beginning of training to ~15% at the end of training. However, premature trials followed by an 8-lick sequence rose from 40% to 70% of all premature trials with training, paralleling the relative increase of 8-lick sequences observed in timely trials (Figure 2B, F(1, 14) = 32.87, p = 5.20 × 10−5 for learning stage and F(1, 14) = 121.74, p = 2.73 × 10−8 for trial timing, no interaction, two-way RM ANOVA). These data suggest that behavior during premature trials was modified by learning, just like in timely trials. Further supporting this presumption, time to complete the 8-lick sequence (F(1, 14) = 15.63, p = 0.001), latency to the first lick of the sequence (F(1, 14) = 14.30, p = 0.002) and variation coefficient of the inter-lick intervals (an index of the regularity of such intervals; F(1, 14) = 6.78, p = 0.021), diminished with training both for timely and premature trials (Figure 2C–E; significant effect of learning stage, no effect of trial type, no interaction, two-way RM ANOVA). 
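The variation coefficient of the inter-lick intervals mentioned above can be illustrated with a minimal sketch. It assumes the index is the standard deviation of the intervals divided by their mean; the use of the sample standard deviation, the function name `inter_lick_cv`, and the lick timestamps are all hypothetical choices made for illustration and are not taken from the study.

```python
import statistics

def inter_lick_cv(lick_times):
    """Coefficient of variation (SD / mean) of the inter-lick intervals.

    Assumption: sample standard deviation; the text does not say which
    estimator was used.
    """
    intervals = [b - a for a, b in zip(lick_times, lick_times[1:])]
    if len(intervals) < 2:
        raise ValueError("need at least three licks to estimate variability")
    return statistics.stdev(intervals) / statistics.mean(intervals)

# Hypothetical lick timestamps (seconds from port entry) for two 8-lick sequences.
regular = [0.00, 0.15, 0.30, 0.45, 0.60, 0.75, 0.90, 1.05]
irregular = [0.00, 0.10, 0.35, 0.45, 0.80, 0.90, 1.20, 1.30]

cv_regular = inter_lick_cv(regular)      # close to 0: evenly spaced licks
cv_irregular = inter_lick_cv(irregular)  # larger: uneven spacing
```

A lower value indicates more regular licking, which is the direction of change reported with training.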
Remarkably, even though premature trials were detrimental to reward rate (Figure 2A), there was a significant positive correlation between the percentage of premature trials followed by an 8-lick sequence and reward rate (Figure 2F, R2 = 0.478, F(1, 60) = 54.98, p = 1.03 × 10−9, n = 62 sessions from 5 rats). Figure 2. Training does not suppress premature initiations of the learned behavioral response. Data correspond to five adult animals performing the standard 2.5 s waiting time task. (A) Correlation between all prematurely initiated trials and reward rate (Y = −0.023*X + 1.921, slope significantly different from zero p = 0.018, n = 62 sessions from a total of 5 rats). The proportion of trials (B), time to complete the 8-lick sequence (C), latency to first lick (D), and coefficient of variation (E) of the 8-lick sequence inter-lick intervals for prematurely and timely initiated trials, at early and late learning stages. (B–E) Data are expressed as mean ± standard error of the mean (SEM), n = 15 early and 15 late sessions, ***p<0.001; **p<0.01; *p<0.05. (F) Correlation between percentage of prematurely initiated trials followed by an 8-lick sequence and reward rate (Y = 0.024*X − 0.0408, slope significantly different from zero, p = 1.03 × 10−9, n = 62 sessions from a total of 5 rats). (G–I) Normalized frequency distributions of the trial initiation times (waiting time), separated for trials with (gray solid line) and without (dotted line) 8-lick sequences and for early (G) and late (H) training stages (n = 15 sessions). Insets: percentage of the trials with (gray solid line) and without (dotted line) 8-lick sequences for each bin, zoomed around the criterion time. (I) Rats were trained with a 5-s criterion time period (blue) and afterwards were switched to a 2.5-s criterion time for the following two sessions (48 hr after the last 5-s waiting time session, lilac). 
Red dashed lines: criterion time; bin size: 100 ms; reference for normalization: bin with highest value = 1. To characterize the timing of trial initiations in adult rats, plots showing the frequency distribution of all trial initiation times were built (Figure 2G, H). The probability of a trial including an 8-lick sequence sharply increased at the end of the 2.5 s waiting interval, peaked immediately after its finalization, and then diminished gradually. Similar results were observed in a separate group of rats trained with a longer waiting interval (Figure 2—figure supplement 1). Finally, rats trained with the long waiting interval (5 s) quickly learned to adjust trial initiations to a shorter waiting interval (2.5 s) (Figure 2I), suggesting that premature trials with 8-lick sequences served to adapt behavior to changes in the waiting time requirements of the task that otherwise would have gone unnoticed by the rats. An additional group of adult rats was trained with a modified version of the task that required initiating trials not before 2.5 s and no later than 5 s after exiting the port in the previous trial. These animals showed similar behavior with a faster decrease of trial initiations after the 2.5 s waiting time (see below; Figure 2—figure supplement 2). Altogether, the data show that adult rats improved their reward rate by waiting the least possible time between trials. Notably, premature execution of the learned behavioral response was (relatively) more common late in training than early in training, suggesting that with training behavior became less sensitive to the absence of the reward-predictive visual cue. 
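As an illustration of the trial taxonomy and of the normalized waiting-time distributions described in this section, the following is a minimal sketch and not the authors' analysis code. It takes the 2.5-s criterion, the 8-lick requirement, the 100-ms bin size, and the peak-bin normalization from the text; the function names, the millisecond integer representation, and the example trials are assumptions introduced here.

```python
from collections import Counter

CRITERION_MS = 2500  # minimum waiting time (2.5 s), from the task description
SEQUENCE_LICKS = 8   # licks required to complete the learned sequence
BIN_MS = 100         # bin size used for the waiting-time distributions

def classify_trial(waiting_ms, n_licks):
    """Label a trial by initiation timing and by sequence completion."""
    timing = "timely" if waiting_ms >= CRITERION_MS else "premature"
    sequence = "8L" if n_licks >= SEQUENCE_LICKS else "<8L"
    return timing, sequence

def normalized_histogram(waiting_ms_list, bin_ms=BIN_MS):
    """Histogram of waiting times, scaled so the fullest bin equals 1."""
    counts = Counter(w // bin_ms for w in waiting_ms_list)
    if not counts:
        return {}
    peak = max(counts.values())
    # Keys are the left edge of each bin, in milliseconds.
    return {b * bin_ms: c / peak for b, c in sorted(counts.items())}

# Hypothetical trials: (waiting time in ms, number of licks emitted).
trials = [(1200, 3), (2400, 8), (2550, 8), (2580, 8), (2610, 8), (3900, 2)]
labels = [classify_trial(w, n) for w, n in trials]
hist = normalized_histogram([w for w, _ in trials])
```

Working in integer milliseconds avoids floating-point edge cases when a waiting time falls exactly on a bin boundary such as the 2.5-s criterion.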
Task-sensitive striatal activity concentrates at the boundaries of the learned behavioral response To determine if striatal activity marks the boundaries of the learned action sequence in our task (Jin and Costa, 2010; Jog et al., 1999), dorsal striatum activity recorded from adult rats was analyzed by aligning the activity to port entry and port exit (Figure 3A–C). Visual inspection of neuronal raster plots and peri-event time histograms (PETHs) showed strong modulations of striatal activity preceding port entry and/or at the time of port exit, during timely trials (Figure 3A, B). Overall, ~50% of the recorded units (473 out of 867) showed higher activity (>1 SD over baseline) preceding port entry ('anticipatory activity'; 19%, Figure 3D), at the time of port exit (18%, Figure 3E) or both before port entry and at port exit (12%, Figure 3F). On average, these neurons showed lower than baseline firing rates during the execution of the learned action sequence (Figure 3D–F). There were also neurons (n = 108, 12% of all recorded neurons) showing higher activity when the animal was inside the port than during the waiting period or at the initiation and finalization of the action sequence (Figure 3G). Finally, 82 neurons classified as non-task responsive were tonically active during the waiting period regardless of the waited time (Figure 3—figure supplement 1A). All main types of task-related activity emerged early during training (Figure 3—figure supplement 1B). Figure 3. Striatal activity marks transitions between behavioral states of the task. Data come from five adult animals performing the standard 2.5-s waiting time task and two adult animals performing the 2.5–5 s cutoff version of the task. (A) Representative raster plots (above) and peri-event time histogram (PETH) (below) of striatal units showing firing rate modulations related to port entry and/or port exit. 
L1: first lick, L8: eighth lick, Last L: last lick. (B) Individual PETHs of striatal neurons during correct trials, aligned to port entry (left) or port exit (right). Below: average PETH (solid line) and standard error of the mean (SEM) (shaded area) for all recorded neurons. Red dashed line: port entry, orange dashed line: port exit, vertical green line: LED on. (C) Proportion of neurons showing task-related firing rate modulations. PETH for all individual neurons and population average PETH aligned to port entry (left panels) and to port exit (right panels) for (D) striatal neurons showing only port entry-related activity, (E) striatal neurons showing only port exit-related activity, (F) striatal neurons showing activity modulations at both port entry and port exit, and (G) striatal neurons showing higher activity while animals are inside the port. In (D–G), data are the mean (solid lines) and SEM (shaded area). Colored bars over the x-axis show the interval used to detect firing rate modulations (red: entry, orange: exit, light-blue: inside port). Thus, although striatal activity was continuously modulated during the present task, modulations at the boundaries of the behavioral response accounted for about 50% of all task-related activity and more than 30% of the recorded neurons showed a peak of activity anticipating trial initiation. Waiting time modulates anticipatory activity Striatal activity that anticipates a learned behavioral response could specifically mark the initiation of a previously rewarded action sequence (Jin and Costa, 2010; Jog et al., 1999; Martiros et al., 2018). Likewise, it could relate to additional factors, such as reward anticipation or the vigor and value of an upcoming action (Lauwereyns et al., 2002; Samejima et al., 2005; Wang et al., 2013). 
Moreover, it has been proposed that changes in striatal activity preceding the initiation of a prepotent action may predict premature responding (Buckholtz et al., 2010; Donnelly et al., 2014; Wu et al., 2018). Because in the present task the learned action sequence is often prematurely released, we asked whether the observed anticipatory activity could specifically predict its release and whether it additionally encodes its timing. When all port entry responsive neurons were considered (i.e., port entry only plus port entry/port exit neurons), the average firing rate modulation anticipating trial initiation was higher for timely than for premature trials, irrespective of whether the upcoming action included the 8-lick sequence (Figure 4A, B, F(1, 330) = 30.84, p = 7.01 × 10−8, significant main effect of trial initiation timing, no effect of action sequence-structure F(1, 330) = 0.089, p = 0.765, no interaction, F(1, 330) = 9.64 × 10−5, p = 0.992, two-way RM ANOVA). On average, this activity began 1 s before and peaked 0.5 s before the animal crossed the infrared beam located at port entry, both in premature and timely trials (Figure 4A). Since this anticipatory activity closely preceded approaching movements toward the nose-poke, we analyzed accelerometer recordings of head movements available from two adult rats. The accelerometer recordings did not show differences in movement initiation time between premature and timely 8-lick trials (0.357 ± 0.027 and 0.382 ± 0.023 s before port entry, mean and standard error of the mean (SEM), for premature and timely trials, respectively, t(8) = 0.636, p = 0.543, paired t-test; Figure 4C). Furthermore, the 8-lick sequences emitted during premature and timely trials lasted the same and had the same latency and inter-lick interval regularity (Figure 2C–E), suggesting similar action vigor during premature and timely 8-lick trials. 
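The comparison of anticipatory activity between premature and timely trials can be sketched as a peri-event time histogram (PETH) aligned to port entry, followed by an average over a pre-entry window. This is a toy illustration under stated assumptions, not the study's pipeline: the −640 to −80 ms window is the one given for Figure 4B, whereas the 80-ms bin width, the function names `peth` and `mean_rate`, and all spike and event times are hypothetical.

```python
def peth(spikes_ms, events_ms, window_ms=(-960, 480), bin_ms=80):
    """Average firing rate (spikes/s) per bin around the given events."""
    if not events_ms:
        raise ValueError("at least one event is required")
    start, stop = window_ms
    counts = [0] * ((stop - start) // bin_ms)
    for event in events_ms:
        for spike in spikes_ms:
            rel = spike - event
            if start <= rel < stop:
                counts[(rel - start) // bin_ms] += 1
    # Convert summed counts to a rate: divide by trials and bin duration (s).
    scale = len(events_ms) * bin_ms / 1000.0
    return [c / scale for c in counts]

def mean_rate(rates, window_ms, bin_ms, lo_ms, hi_ms):
    """Mean rate over the bins spanning [lo_ms, hi_ms) relative to the event."""
    first = (lo_ms - window_ms[0]) // bin_ms
    last = (hi_ms - window_ms[0]) // bin_ms
    segment = rates[first:last]
    return sum(segment) / len(segment)

# Hypothetical data for one unit: spike times and port-entry times, in ms.
spikes = [4400, 4500, 4600, 4700, 9500, 14450, 14550, 14650]
timely_entries = [5000, 15000]
premature_entries = [10000]

WINDOW = (-960, 480)
timely_mean = mean_rate(peth(spikes, timely_entries), WINDOW, 80, -640, -80)
premature_mean = mean_rate(peth(spikes, premature_entries), WINDOW, 80, -640, -80)
```

In this toy unit the pre-entry rate is higher before timely entries than before the premature one, which is the direction of the effect reported for the population of port entry responsive neurons.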
Thus, although head acceleration data may not include differences in movements of other body parts, our data support that in this task, the firing rate modulation preceding trial initiation discriminates between premature and timely trials and does not predict the speed, regularity, structure, value, or vigor of the subsequently released action sequence. Figure 4. Prematurely initiated trials are preceded by low striatal anticipatory activity. Data come from five adult animals performing the standard 2.5-s waiting time task except for (H–J), where they correspond to three rats trained in the 2.5- to 5-s cutoff task. (A) Average peri-event time histogram (PETH) of entry-related neurons, during premature and timely trials, for trials with or without an 8-lick sequence (diagram on top shows the types of trials analyzed). (B) Mean striatal activity during −640 to −80 ms before port entry for all PETH shown in (A), ***p<0.001. (C) Accelerometer recordings of head movements around port entry time for premature and timely trials. On the left, data are the mean (solid lines) and 95% CI (shaded area) of the accelerometer y-axis. Vertical lines represent the mean (solid) and 95% CI (dashed) of movement initiation for timely and premature trials. On the right, time from port entry to movement initiation for timely and premature trials. Data were obtained from a total of nine training sessions of two animals. (D) Average PETH of striatal neurons showing entry-related activity sorted by trial waiting time duration. The color code for the intervals is shown on the right. (E) Mean normalized firing rate for each of the waiting time segments. The 2.5-s criterion time is shown with a blue dashed line. (F) Average PETH of the same striatal neurons segmented according to next trial waiting time. (G) Average PETH of striatal neurons showing activation at port exit segmented as in (D). 
(H) Schematic representation of the different trial types in a modified version of the task incorporating a 5-s cutoff time. Timely trials: waiting time between 2.5 and 5 s, Premature trials: waiting time <2.5 s, Late trials: waiting time >5 s. (I) Proportion of trials followed by 8-lick sequences for each type of trial, n = 21 late sessions, ***p<0.001. (J) Average PETH of striatal neurons showing activity preceding port entry according to the waiting time duration. The color code for the intervals is shown on the right. (K) Mean normalized firing rate for each of the waiting time groups for the 2.5-s (left) and 2.5- to 5-s (right) waiting times tasks, relative to the mean firing rate of timely trials, ***p<0.001; *p<0.05. To further investigate this anticipatory activity, we plotted its amplitude at increasing waiting times and observed that it increased with a steep slope as the time waited reached the learned waiting interval and then plateaued (Figure 4D, E). Similar results were obtained in rats trained with a longer waiting interval. Because of the longer waiting time, behavior becomes less organized during the first seconds after port exit in the 5-s task; however, the modulation of activity is still observed in the bins that are close to port entry (Figure 4—figure supplement 1A). In contrast, there was no modulation of this neuronal activity by time waited in the following trial (Figure 4F), nor of port exit-related activity by the previous waiting time (Figure 4G). The steep slope of the curve at the criterion waiting time suggested that the neuronal activity does not linearly report elapsed time but is rather related to changes in reward anticipation as waiting progressed. To explore this possibility, we analyzed striatal activity of rats trained with a modified version of the task requiring initiating trials not before 2.5 s and no later than 5 s after exiting the port in the previous trial (Figure 2—figure supplement 2). 
Waiting less than 2.5 s (premature trials) or more than 5 s (late trials) was penalized by resetting the waiting timer (Figure 4H). We reasoned that if this activity provides a reward anticipation signal for the upcoming action, it should decrease after 5 s of waiting in the modified version of the task, considering that trials with >5-s waiting time are not rewarded. As in the standard version of the task, the animals learned to wait the least possible time between trials, and also noticed the effect of the cutoff time on reward probability, as evidenced by a reduced number of late trials with training (Figure 4I, q(20) = 10.64, p = 1.10 × 10−9 versus timely trials, Tukey post hoc test after significant one-way RM ANOVA [F(1.70, 34.05) = 33.07, p = 1.81 × 10−6]; Figure 2—figure supplement 2). The firing rate modulation preceding trial initiation increased with a steep slope at 2.5 s, but instead of plateauing, it decreased after surpassing the 5-s cutoff time (Figure 4J). A comparison of the effects of waiting time on anticipatory activity in the tasks with and without the 5-s cutoff yielded a significant interaction in a two-way RM ANOVA (Figure 4K, significant effect of task, F(1, 334) = 4.54, p = 0.034; trials, F(1.96, 656.34) = 67.73, p = 1.01 × 10−15; interaction F(2, 668) = 29.27, p = 6.53 × 10−13, two-way RM ANOVA). In summary, the marked modulation of striatal activity preceding trial initiation probably reflects subjective changes in reward anticipation as waiting progressed. Trial initiation timing modulates striatal activity at predicted outcome time Striatal activity can be modulated by reward-predictive sensory cues (Schultz, 2015). 
In the present task, a small population of neurons whose activity was modulated at the time of the visual cue (n = 27, 3% of all recorded neurons) showed lower activity during premature trials, when the visual cue was not presented (q(26) = 6.64, p = 2.15 × 10−4 versus timely trials, Tukey post hoc test after significant one-way RM ANOVA [F(1.75, 45.49) = 17.80, p = 5.00 × 10−6]; Figure 5—figure supplement 1). This modulation by trial initiation timing was similar to that observed in port entry neurons and may represent the same kind of wait time-based reward anticipation activity, extending until sensory feedback discloses whether reward can be obtained. In contrast, neurons showing increased activity during licking did not show any modulation by trial initiation timing (Figure 5A, B). Furthermore, when we studied the neuronal activity around the individual licks, we found largely overlapping populations of positively modulated neurons with broad activity peaks encompassing many inter-lick intervals (Figure 5B). The lick-related modulation lasted longer during rewarded trials, where licking also persisted for longer, than in timely unrewarded (>2.5 s of waiting) and premature 8-lick trials (<2.5 s of waiting). However, it showed a similar amplitude irrespective of reward delivery or omission (Figure 5B, activity centered at the eighth lick), suggesting that these neurons were modulated by licking and not by the delivery of the reward itself. Figure 5. Reward-responsive neurons discriminate prematurely from timely initiated trials. Data come from five adult animals performing the standard 2.5-s waiting time task and two adult animals performing the 2.5- to 5-s cutoff version of the task. (A) Schematic representation of the different types of trials analyzed. 
(B) Mean (solid lines) and standard error of the mean (SEM) of the average peri-event time histogram (PETH) of striatal units showing firing rate modulations related to licking activity (centered to the first, third, and eighth lick). PETH for all individual neurons and population average PETH of striatal units, aligned to the eighth lick (time 0 s), showing positive firing rate modulations during reward delivery (reward-responsive neurons) (C), reward omission during timely trials (no reward-responsive neurons) (D), or expected reward omission during 8-lick premature trials (expected no reward-responsive neurons) (E). (F) Mean normalized firing rate for the different trial conditions (timely rewarded, timely unrewarded, premature) for the three types of neurons depicted in (C–E). From top to bottom: reward-responsive neurons, no-reward-responsive neurons, and expected no reward-responsive neurons, ***p< 0.001; **p<0.001, *p<0.05. (G) Discrimination index (DI = |normalized firing rate in timely unrewarded trials − normalized firing rate in premature trials|) for each of the groups of neurons shown in (C–E), ***p< 0.001; **p<0.001. (H) PETH of the licking activity for the three trial conditions, centered at the eighth lick, with its corresponding average at the bottom. Neuronal activity in the dorsal and ventral striatum is also modulated by the trial outcome and distinguishes between rewarded and non-rewarded trials in probabilistic tasks (Atallah et al., 2014; Histed et al., 2009; Nonomura et al., 2018; Shin et al., 2018; Yamada et al., 2011). In the present task, there are two types of non-rewarded, yet correctly performed trials: (1) the premature 8-lick trials, w
How do animals adopt a given behavioral strategy to solve a recurrent problem when several effective strategies are available to reach the goal? Here we provide evidence that striatal cholinergic interneurons (SCINs) modulate their activity when mice must select between different strategies with similar goal-reaching effectiveness. Using a cell type-specific transgenic murine system, we show that adult SCIN ablation impairs strategy selection in navigational tasks where a goal can be independently achieved by adopting an allocentric or egocentric strategy. SCIN-depleted mice learn to achieve the goal in these tasks, regardless of their appetitive or aversive nature, in a similar way to controls. However, unlike control mice, they cannot shift away from their initially adopted strategies as training progresses. Our results indicate that SCINs are required for shaping the probability function used for strategy selection as experience accumulates throughout training. Thus, SCINs may be critical for the resolution of cognitive conflicts emerging when several strategies compete for behavioral control while adapting to environmental demands. Our findings may increase our understanding of the emergence of perseverative/compulsive traits in neuropsychiatric disorders with a reported SCIN reduction, such as Tourette and Williams syndromes.
SIGNIFICANCE STATEMENT Selecting the best suited strategy to solve a problem is vital. Accordingly, available strategies must be compared across multiple dimensions, such as goal attainment effectiveness, cost–benefit trade-off, and cognitive load. The striatum is involved in strategy selection when strategies clearly diverge in their goal attainment capacity; however, its role when several strategies can be used for goal reaching, which makes selection dependent on additional strategy dimensions, remains poorly understood. Here, we show that striatal cholinergic interneurons can signal strategy competition. Furthermore, they are required to adopt a given strategy whenever strategies with similar goal attainment capacity compete for behavioral control. Our study suggests that striatal cholinergic dysfunction may result in anomalous resolution of problems whenever complex cognitive valuations are required.
Striatal cholinergic interneurons show tonic spiking activity in the intact and sliced brain, which stems from intrinsic mechanisms. For this reason, they are also known as "tonically active neurons" (TANs). Another hallmark of TAN electrophysiology is a pause response to appetitive and aversive events and to environmental cues that have predicted these events during learning. Notably, the pause response is lost after the degeneration of dopaminergic neurons in animal models of Parkinson's disease. Moreover, Parkinson's disease patients are in a hypercholinergic state and find some clinical benefit in anticholinergic drugs. Current theories propose that excitatory thalamic inputs conveying information about salient sensory stimuli trigger an intrinsic hyperpolarizing response in the striatal cholinergic interneurons. Moreover, it has been postulated that the loss of the pause response in Parkinson's disease is related to a diminution of I(sAHP), a slow outward current that mediates an afterhyperpolarization following a train of action potentials. Here we report that I(sAHP) induces a marked spike-frequency adaptation in adult rat striatal cholinergic interneurons, causing an abrupt end of firing during sustained excitation. Chronic loss of dopaminergic neurons markedly reduces I(sAHP) and spike-frequency adaptation in cholinergic interneurons, allowing them to fire continuously and at higher rates during sustained excitation. These findings provide a plausible explanation for the hypercholinergic state in Parkinson's disease. Moreover, a reduction of I(sAHP) may alter synchronization of cholinergic interneurons with afferent inputs, thus contributing to the loss of the pause response in Parkinson's disease.
Striatal cholinergic interneurons (SCIN) exhibit pause responses conveying information about rewarding events, but the mechanisms underlying them remain elusive. Thalamic inputs induce a pause mediated by intrinsic mechanisms and regulated by dopamine D2 receptors (D2R), though the underlying membrane currents are unknown. Moreover, the role of D5 receptors (D5R) has not been addressed so far. We show that glutamate released by thalamic inputs in the dorsolateral striatum induces a burst in SCIN, followed by the activation of a Kv1-dependent delayed rectifier current responsible for the pause. Endogenous dopamine promotes the pause through D2R stimulation, while pharmacological stimulation of D5R suppresses it. Remarkably, the pause response is absent in parkinsonian mice rendered dyskinetic by chronic L-DOPA treatment but can be reinstated acutely by the D5R inverse agonist clozapine. Blocking the Kv1 current eliminates the pause reinstated by the D5R inverse agonist. In conclusion, the pause response is mediated by delayed rectifier Kv1 channels, which are tonically blocked in dyskinetic mice by a mechanism depending on D5R ligand-independent activity. Targeting these alterations may have therapeutic value in Parkinson’s disease.