Major depressive disorder is prevalent and impairing. Parsing neurocomputational substrates of reinforcement learning in individuals with depression may facilitate a mechanistic understanding of the disorder and suggest new cognitive therapeutic targets. The objective was to determine associations among computational model-derived reinforcement learning parameters, depression symptoms, and symptom changes after treatment. In this mixed cross-sectional-cohort study, individuals performed reward and loss variants of a probabilistic learning task during functional magnetic resonance imaging at baseline and follow-up. A volunteer sample with and without a depression diagnosis was recruited from the community. Participants were assessed from July 2011 to February 2017, and data were analyzed from May 2017 to May 2021. Computational model-based analyses of participants' choices assessed a priori hypotheses about associations between components of reward-based and loss-based learning with depression symptoms. Changes in both learning parameters and symptoms were then assessed in a subset of participants who received cognitive behavioral therapy (CBT). Of 101 included adults, 69 (68.3%) were female, and the mean (SD) age was 34.4 (11.2) years. A total of 69 participants with a depression diagnosis and 32 participants without a depression diagnosis were included at baseline; 48 participants (28 with depression who received CBT and 20 without depression) were included at follow-up (mean [SD] of 115.1 [15.6] days). Computational model-based analyses of behavioral choices and neural data identified associations of learning with symptoms during reward learning and loss learning, respectively.
During reward learning only, anhedonia (and not negative affect or arousal) was associated with model-derived learning parameters (learning rate: posterior mean regression β = -0.14; 95% credible interval [CrI], -0.12 to -0.03; outcome sensitivity: posterior mean regression β = 0.18; 95% CrI, 0.02 to 0.37) and neural learning signals (moderation of association between striatal prediction error and expected value signals: t(97) = -2.10; P = .04). During loss learning only, negative affect (and not anhedonia or arousal) was associated with learning parameters (outcome shift: posterior mean regression β = -0.11; 95% CrI, -0.20 to -0.01) and disrupted neural encoding of learning signals (association with subgenual anterior cingulate prediction error signals: r = -0.28; P = .005). Symptom improvement following CBT was associated with normalization of learning parameters that were disrupted at baseline (reward learning rate: posterior mean regression β = 0.15; 90% CrI, 0.001 to 0.41; loss outcome shift: posterior mean regression β = 0.42; 90% CrI, 0.09 to 0.77). In this study, the mapping of reinforcement learning components to symptoms of major depression revealed mechanistic features associated with these symptoms and points to possible learning-based therapeutic processes and targets.
Decision-making in the presence of other competitive intelligent agents is fundamental for social and economic behavior. Such decisions require agents to behave strategically, where in addition to learning about the rewards and punishments available in the environment, they also need to anticipate and respond to actions of others competing for the same rewards. However, whereas we know much about strategic learning at both theoretical and behavioral levels, we know relatively little about the underlying neural mechanisms. Here, we show using a multi-strategy competitive learning paradigm that strategic choices can be characterized by extending the reinforcement learning (RL) framework to incorporate agents’ beliefs about the actions of their opponents. Furthermore, using this characterization to generate putative internal values, we used model-based functional magnetic resonance imaging to investigate neural computations underlying strategic learning. We found that the distinct notions of prediction errors derived from our computational model are processed in a partially overlapping but distinct set of brain regions. Specifically, we found that the RL prediction error was correlated with activity in the ventral striatum. In contrast, activity in the ventral striatum, as well as the rostral anterior cingulate (rACC), was correlated with a previously uncharacterized belief-based prediction error. Furthermore, activity in rACC reflected individual differences in degree of engagement in belief learning. These results suggest a model of strategic behavior where learning arises from interaction of dissociable reinforcement and belief-based inputs.
Abstract Recent decades have seen unprecedented growth in our understanding of the brain basis of economic decision-making. In particular, research has uncovered not only the location of brain regions where certain processes take place, but also the nature of the (economically meaningful) latent variables and how they relate to behavior. This transition from understanding where economic decisions are made to understanding how they are made in the brain is integral to identifying a relationship between neural processes and models of economic behavior. Progress, however, has been uneven. Neuroeconomic studies of individual decision-making, such as those involving time preferences or attitudes toward risk, have the advantage of building on decades of neuroscientific studies of animal behavior. Most of these findings are based on quantitative and computational approaches that lend themselves readily to economic experimentation. In contrast, our understanding of the neural systems underlying social behavior is far less specific. Much of the current challenge stems from the empirical shortcomings of the behavioral predictions of standard game theory, which are largely equilibrium-based. Using our own study as an example, we show how it is possible to search directly for the latent variables implied by current models of strategic learning and to attempt to localize them in the brain. More specifically, we show that the neural systems underlying strategic learning build on those involved in trial-and-error learning, but also include additional computations that capture belief-based learning.
Finally, we discuss how our approach can be extended to address fundamental problems in economics. JEL classification: C92, D83.
Abstract Disproportionate reactions to unexpected stimuli in the environment are a cardinal symptom of posttraumatic stress disorder (PTSD). Here, we test whether these heightened responses are associated with disruptions in distinct components of reinforcement learning. Specifically, using functional neuroimaging, a loss-learning task, and a computational model-based approach, we assessed the mechanistic hypothesis that overreactions to stimuli in PTSD arise from anomalous gating of attention during learning (i.e., associability). Behavioral choices of combat-deployed veterans with and without PTSD were fit to a reinforcement learning model, generating trial-by-trial prediction errors (signaling unexpected outcomes) and associability values (signaling attention allocation to the unexpected outcomes). Neural substrates of associability value and behavioral parameter estimates of associability updating, but not prediction error, increased with PTSD during loss learning. Moreover, the interaction of PTSD severity with neural markers of associability value predicted behavioral choices. These results indicate that increased attention-based learning may underlie aspects of PTSD and suggest potential neuromechanistic treatment targets. https://doi.org/10.7554/eLife.30150.001 eLife digest Posttraumatic stress disorder, or PTSD for short, is a serious psychiatric disorder that sometimes occurs after someone has experienced a dangerous or threatening event. People with PTSD are prone to overreact to unexpected reminders of these events, and are often hypervigilant for danger. Why these symptoms occur is not yet clear, but it is thought that people with PTSD may have learning problems that lead them to overestimate the likelihood of danger.
Advanced tools from computer science and mathematics have helped scientists to study how the brain learns. These tools may now provide more insight into how diseases like PTSD disrupt learning. Scientists use computer models of learning to test how humans make choices and react to their outcomes. These models build on the idea that humans make choices based on what they predict an outcome will be, and then learn when they update their expectations based on the accuracy of their predictions. Now, Brown et al. show that people with PTSD have an increased learning response to surprising events – these are defined in this study as outcomes that are inconsistent with participants’ predictions. In the experiments, 74 combat veterans who had experienced trauma in Iraq or Afghanistan underwent a type of brain scanning procedure, while they played a gambling-like game. Some participants had PTSD, others did not. Both groups learned to make choices that minimized the loss of money. However, learning in veterans with PTSD was strongly influenced by how much attention they paid to surprising outcomes. Moreover, the brain areas that help to process attention to surprise were highly active in people with PTSD. Brown et al. added a third group of participants with depression to the study to verify that the learning changes were PTSD-specific. This depression-only group did not have differences in attention to surprise. Many treatments for PTSD focus on exposing individuals to feared situations and trauma memories, so that individuals can learn that these situations are no longer dangerous. Computational modeling and neuroimaging may help scientists pinpoint the sources of learning deficits, such as increased attention to surprising outcomes. Identifying the different possible causes of learning problems may lead to new or more precise learning-based treatments for PTSD and other learning-related conditions. 
Understanding how learning-related brain changes develop may also help find ways to prevent and better diagnose PTSD and other psychiatric disorders. https://doi.org/10.7554/eLife.30150.002 Introduction Posttraumatic stress disorder (PTSD) is debilitating and characterized by excessive behavioral, psychological, and physiological responses to unexpected stimuli (Pitman et al., 2012). In particular, clinical and empirical observations have documented the negative impact of salient cues on neural and behavioral functioning in PTSD, including heightened orienting to unexpected events, impaired extinction of learned fear, and unstable attention biases toward perceived threatening stimuli (Aupperle et al., 2012; Bar-Haim et al., 2007; Blair et al., 2013; Morey et al., 2009; Naim et al., 2015). Together, these behavioral alterations in response to unexpected stimuli, uncontrollable reminders of trauma, and other negative, threatening, and trauma-related events point to PTSD as a disorder of disrupted learning from reminders of negative events; however, the specific components of anomalous learning in PTSD remain unknown. As an initial step toward addressing this issue, we adopt a computational psychiatry approach (Montague et al., 2012; Wang and Krystal, 2014; Maia and Frank, 2011), using quantitative specification of neural and behavioral learning processes to investigate the neurocomputational substrates of PTSD. Computational model-based approaches to learning provide a mechanistic framework for understanding the detrimental impact of unexpected negative stimuli and reminders of negative events in PTSD. Error-guided models of reinforcement learning (RL) have robustly shown that unexpected outcomes (i.e., value ‘prediction errors’) drive learning by directly updating the value of the cues associated with those outcomes (Rescorla and Wagner, 1972; Sutton and Barto, 1998). 
A related family of hybrid reinforcement learning models combines prediction-error based learning with a dynamically changing attention modulation variable (i.e., a cue’s associability value) that scales with the magnitude of prediction errors previously associated with a particular cue. In these models, trial-by-trial associability values associated with particular cues gate the learning of subsequent outcomes associated with these cues (Li et al., 2011; Pearce and Hall, 1980; Le Pelley, 2004). Thus, these models contain separate parameters for error-based learning rate and associability updating that together govern how strongly current and past prediction errors, respectively, affect learning. While commonly used tasks that assess cue-salience (Todd et al., 2015) and attention to threat (Naim et al., 2015; Vythilingam et al., 2007) test important components of aversive processing in PTSD, they are more limited for explaining how changes in attention to negative stimuli may affect subsequent behavioral choices, as reinforcement learning algorithms allow in the context of value-based learning tasks. More generally, computational model-based approaches allow fitting of models of neural function to behavioral choices and imaging data and thus facilitate the separation of mechanistic processes (e.g., responses to associability value, associability updating, learning rate, prediction error responses) related to attention and learning from negative and positive events. In the hybrid RL framework, PTSD, and symptoms of hypervigilance in particular, may reflect disproportionate attentional processes that drive maladaptive, heightened responses to stimuli with a history of unexpected outcomes. The neural substrates of reinforcement learning suggest further compelling links between associability-modulated learning and PTSD, as the brain networks involved in both overlap. 
In particular, work in humans and rodents has identified roles for the ventral striatum, anterior cingulate, and amygdala in encoding prediction error (Rangel et al., 2008; Pagnoni et al., 2002; Garrison et al., 2013) and for the amygdala and insula in encoding associability values (Li et al., 2011; Roesch et al., 2012). In PTSD, affective stimuli consistently elicit altered neural activation in a network that prominently also includes the amygdala, insula, and prefrontal regions (Hayes et al., 2012; Etkin and Wager, 2007). Of translational relevance, attention, learning, and the amygdala have all been behavioral and neural targets of promising new therapies for PTSD (Badura-Brack et al., 2015; Craske et al., 2014; Langevin et al., 2016); elucidating the component neurobehavioral mechanisms associated with learning in PTSD may refine these targets. In brief, extant data indicate that neural and behavioral distinctions between solely prediction error-based and associability-modulated learning contribute uniquely to neural and behavioral correlates of learning and may clarify processes underlying the heightened sensitivity to unexpected stimuli in PTSD. To assess this possibility, we implemented a probabilistic learning task during functional magnetic resonance imaging (fMRI) in combat-deployed military veterans with and without posttraumatic stress disorder. We posited that if attention-modulated learning plays a role in PTSD, hybrid RL models incorporating associability should predict participants’ choices better than learning models without associability. Furthermore, PTSD severity, particularly symptoms related to hyperarousal (Lissek and van Meurs, 2015), should correlate preferentially both with enhanced associability updating and with increased activity in neural structures encoding associability values (i.e., amygdala and insula). 
Critically, these relationships would not be expected between PTSD and solely error-based learning rate or error-related activity in neural structures encoding value and prediction error (i.e., ventral striatum, ventromedial prefrontal cortex). Results Participant characteristics and model-agnostic behavioral performance Combat-deployed military veterans (N = 74) completed a probabilistic learning task in the loss and gain domains while undergoing fMRI scanning (Figure 1; full task description in Materials and methods). All veterans had served at least one tour in Iraq or Afghanistan since 2001, had experienced Criterion A deployment-related trauma, and were recruited from a larger study by our group examining biomarkers of mood and anxiety disorders. To be considered in the present analyses, participants were further required to demonstrate behavioral engagement on the relevant portions of the probabilistic learning task (N = 68 veterans; see Materials and methods for exclusion breakdowns, full inclusion/exclusion criteria, and exclusion details for sub-analyses). Veterans were assessed for PTSD using the Clinician Administered PTSD Scale (CAPS [Blake et al., 1995]) and for other psychiatric disorders with the Structured Clinical Interview for DSM-IV (SCID [First et al., 1996]). Participants exhibited a range of PTSD symptoms, with 39 veterans meeting DSM-IV (American Psychiatric Association, 2000) criteria for PTSD (see Supplementary file 1, Table 1A for clinical and demographic information); the other 29 veterans had previous deployment-related trauma exposure as assessed by the CAPS interview but did not meet criteria for PTSD, resulting in 39 participants with PTSD and 29 participants without PTSD for primary analyses. Figure 1. Reinforcement learning model. Schematic description of reinforcement learning model.
Participants choose between two stimuli, view the monetary outcome of the choice, and learn over time which is the ‘better’ option. A reinforcement learning model incorporating associability-modulated learning rate is illustrated here. Expected value is updated based on the static learning rate, the current trial’s prediction error, and the associability value from the previous outcome associated with the stimulus. Meanwhile, associability value is updated based on the static associability weight and the absolute value of the current trial’s prediction error. Green lines indicate the effect of prediction error through learning rate on expected value, while purple lines indicate the effect of prediction error through associability weight on associability value on the current trial, and then on learning rate on the subsequent trial. Thus, through trial-by-trial modulation of learning rates, associability values act as an attentional gate on learning. The task involved blocks consisting of all loss or all gain trials; loss trials are shown here as the model with associability fit well in this condition only. https://doi.org/10.7554/eLife.30150.003 In both loss and gain domains, veterans showed robust learning, as evidenced by increasing likelihood of choosing the ‘better’ option from chance on the first trial to near 80% correct after ~15 trials (Figure 2a and Figure 2—figure supplement 1). An adaptive design titrated the task for participants to achieve sufficient learning (i.e., block length was adjusted once participants achieved learning; see Materials and methods for details); reflecting this, performance accuracy (% better choice) did not differ between participants with and without PTSD in the gain or loss domains (gain: t(41) = −1.29, p>0.1; loss: t(41) = 0.459, p>0.1). Figure 2. Behavioral performance and relationship to parameter estimates. (a) Loss performance.
Performance was quantified as proportion of choices that were the ‘better’ option. Over time, participants show learning (running average over five trials, averaged over all blocks; mean ± SE). Behavior is separated by diagnostic group, with veteran control (No PTSD) participants’ behavior marked by a solid line and the behavior of veterans with PTSD marked by a dotted line. (b) Plot of trial-by-trial associability value and prediction error for loss learning blocks, averaged across sets of five trials and across participants. Values are derived from the associability reinforcement learning model described in Figure 1 using each participant’s individually estimated parameter values and behavioral choices. As expected, a gradual reduction in average associability value and reduction of prediction error towards zero across trials is observed. As the initial expected value of each stimulus was set at 0, prediction errors are initially negative. During learning, prediction errors become distributed around, and converge toward, 0, indicating learning. Similarly, the initial associability value is set at one and decreases as outcomes become better predicted. Note that the probabilistic outcomes of this task ensure that outcomes are not completely predicted; this feature maintains variation in prediction error and retains the influence of associability value. As the task progresses, new blocks requiring new learning are more likely to occur, leading to an overall increase in associability value and prediction errors further from 0. Gain learning blocks did not fit the associability model well; figures related to gain learning are shown in Figure 2—figure supplement 1. 
https://doi.org/10.7554/eLife.30150.004 Behavioral model-fitting and relationship of model parameters with PTSD As an initial step toward evaluating the role of associability in learning in PTSD, participants’ choices were fit to a prediction error-based reinforcement learning (RL) model with and without a dynamic associability value (κ)-modulated learning rate on the prediction error (δ), as in Li et al. (2011). In this model, the associability value κ of a chosen stimulus changes on a trial-by-trial basis based on a combination of the magnitude of previous prediction errors and a static associability weight η, a parameter which varies by participant and indicates the extent to which the magnitude of recent prediction errors updates trial-by-trial associability values (see Materials and methods for full model specifications and Figure 2b for a trial-by-trial plot of the time course and relationship between prediction error and associability values). We verified via simulation that associability weight (η) did not directly affect performance (Figure 3—figure supplement 1a), allowing us to dissociate the effects of associability updating from general performance deficits. Consistent with our hypothesis, including associability in the RL model significantly improved model fit for the majority of participants and did so during loss learning only (Figure 3a; protected exceedance probability of model with versus without associability: 0% in gain, 100% in loss). The role of associability in loss, but not gain, learning is consistent with prior data showing heightened orienting and attentional biases toward negative information in PTSD (Li et al., 2011; Boll et al., 2013); thus, our subsequent analyses focused on learning variables in the loss domain (see Appendix 2, Supplementary Results for supplemental data related to gain learning and model-agnostic support for the presence of associability-modulated updating during loss learning only). 
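The associability-modulated update described above can be sketched in a few lines of Python. This is only an illustration: it assumes the Pearce-Hall hybrid form used by Li et al. (2011), outcomes scaled so that the unsigned prediction error stays at or below 1, and the starting values given in the Figure 2 caption (expected value 0, associability 1). The function name `hybrid_rl_trace` and its arguments are hypothetical names chosen for this example; the authors' exact specification is the one in Materials and methods.

```python
from typing import List, Sequence, Tuple


def hybrid_rl_trace(
    choices: Sequence[int],
    outcomes: Sequence[float],
    alpha: float,
    eta: float,
    n_stimuli: int = 2,
) -> Tuple[List[float], List[float], List[float], List[float]]:
    """Trial-by-trial prediction errors and associability values.

    alpha: static (unmodulated) learning rate.
    eta:   static associability weight.
    Assumed update rules (hypothetical sketch, Pearce-Hall hybrid form):
        delta = outcome - V[chosen]
        V[chosen]     <- V[chosen] + alpha * kappa[chosen] * delta
        kappa[chosen] <- (1 - eta) * kappa[chosen] + eta * |delta|
    """
    value = [0.0] * n_stimuli   # expected values start at 0
    kappa = [1.0] * n_stimuli   # associability values start at 1
    deltas: List[float] = []
    kappas: List[float] = []
    for c, r in zip(choices, outcomes):
        delta = r - value[c]            # prediction error on this trial
        deltas.append(delta)
        kappas.append(kappa[c])         # associability that gates this update
        value[c] += alpha * kappa[c] * delta
        # The updated associability affects learning on the NEXT trial
        # on which this stimulus is chosen.
        kappa[c] = (1.0 - eta) * kappa[c] + eta * abs(delta)
    return deltas, kappas, value, kappa


# Two consecutive losses on stimulus 0, outcomes scaled to [-1, 0].
deltas, kappas, value, kappa = hybrid_rl_trace(
    choices=[0, 0], outcomes=[-1.0, -1.0], alpha=0.5, eta=0.5
)
```

Under these assumed update rules, setting `eta` to 0 leaves every associability value at its starting value of 1, so the sketch reduces to a plain prediction-error model with learning rate `alpha`.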
Figure 3. Model fit and relationship to behavior. (a) Protected exceedance probability and model frequency calculated by Bayesian Model Selection for the model with versus without associability for loss and gain trials, showing an improvement in model fit when adding associability during loss learning only. (b) Average regression beta values per subject between trial-by-trial associability value (κ) and reaction time (controlling for expected value and trial number) show a positive relationship between model-estimated associability value and choice latencies. Dots are individual subjects’ beta values. (c) Prediction of switching behavior is improved when previous trial’s prediction error (δ) is modulated by associability value (κ; negative change in AIC for logistic regression predicting switching; δ and δ * κ model, each compared against basic outcome-only model; see Supplementary Materials and methods). Error bars represent subject-level standard errors based on a leave-one-out standard error estimation. (d) Associability signaling independent of PTSD and covariates, displayed at p<0.05 FDR corrected. First level regression of trial-by-trial associability values on neural activity at time of outcome. Second level (shown) of constant term of regression. Additional model confirmation analyses (relationship between associability weight parameter and performance, relationship between actual and predicted choices, and model parameter recovery) are in Figure 3—figure supplement 1. https://doi.org/10.7554/eLife.30150.006 The role of associability in loss learning was further corroborated by robust effects of associability values on behavior during the task. First, choices predicted by the associability RL model showed high correspondence with participants’ actual choices (correlation of bins of predicted vs. 
observed choices: r = 0.997, p<0.001; Figure 3—figure supplement 1b) and did not differ between participants with and without PTSD (t-test of subjects’ log likelihoods and PTSD diagnosis: t(66) = −0.33, p>0.1). Reaction times were significantly and positively correlated with trial-by-trial associability values (κ), reflecting greater decision latency as the associability value of the chosen option increased (Figure 3b; average per-subject regression beta value of reaction time predicting associability, controlling for expected value of chosen option and trial number: .232; t-test assessing difference from 0: t(68) = 8.28, p<0.001). Next, we regressed trial-by-trial estimates of prediction error (δ) and associability-modulated prediction error (κ*δ), respectively, computed from participants’ individually estimated parameter values, against their trial-by-trial switching behavior (switch or no switch, a measure of responsivity to outcomes). Associability-modulated prediction error significantly predicted switching choices above prediction error alone (χ²(1) = 323.0, p<0.001; Figure 3c). In addition, associability weight and other model parameters were recoverable through simulation (Figure 3—figure supplement 1c), and associability value showed a robust neural effect independent of PTSD diagnosis (Figure 3d; Supplementary file 1, Table 1B). These data together indicate that the associability RL model fit participants’ behavior and neural activity well and support the incorporation of associability in the RL model of loss learning. If associability-modulated learning plays a role in PTSD, individuals with PTSD ought to have greater associability weights (η; see Materials and methods for model specifications). 
To test this possibility, participants’ individually estimated associability weights, reflecting the degree to which associability values are updated based on recent unsigned prediction errors during loss learning for each participant, and unmodulated learning rates (α) were compared between participants with and without a PTSD diagnosis (see Appendix 1: Supplementary Methods for individual parameter estimation details). Associability weights were increased in participants with PTSD (t(62) = 4.01, p<0.001) while unmodulated learning rate did not differ between groups (t(62) = 0.63, p>0.1; Figure 4a), supporting the hypothesis of a greater emphasis on modulation of loss learning by attention in PTSD. Figure 4. Behavioral and neural substrates of associability are increased with PTSD. (a) Loss associability weight increases with PTSD while unmodulated learning rate does not (individual estimates for learning parameters: circles indicate control participants [mean: solid line] and X’s indicate participants with PTSD [mean: dotted line]; insets display regression beta values for PTSD diagnosis variable from linear regression predicting learning parameters; error bars represent SEM). (b) Neural effect of PTSD diagnosis on trial-by-trial associability value activation (cluster-level FDR p<0.05 whole brain corrected with a cluster forming threshold of p<0.001; t value of PTSD diagnosis on parametric modulator of associability value at outcome event). Dashed circles indicate amygdala and insula regions. Prediction error ROIs from an independent cohort used to test for prediction error differences in PTSD are shown in Figure 4—figure supplement 1. https://doi.org/10.7554/eLife.30150.008 Given the high co-occurrence of depression in PTSD (O'Donnell et al., 2004), we also tested the specificity of increased associability weight to PTSD versus depression. 
Specifically, we enrolled a separate cohort of gender-, estimated IQ-, and smoking status-matched participants with a current diagnosis of major depressive disorder (MDD; N = 20; see Materials and methods for MDD participant details), but not PTSD. These MDD-only participants performed the same learning task, and we also fit the behavior of these participants to the RL model with associability. Compared to MDD-only participants, the participants with PTSD had significantly higher associability weights (t(57) = 7.25, p<0.001) but similar unmodulated learning rates (t(57) = 1.44, p>0.1). Therefore, despite the high comorbidity between PTSD and depression (O'Donnell et al., 2004), increased associability-modulated learning appears specific to PTSD and not to mood-related psychopathology. Neural substrates of associative learning in PTSD and relationship to behavioral choices To investigate the instantiation of neural substrates of associability- and prediction error- based learning in PTSD, we first regressed trial-by-trial estimates of associability value (κ) and prediction error (δ), respectively, during loss learning against subjects’ neural responses to the outcome event (per [Li et al., 2011; Esber et al., 2012; Roesch et al., 2010a]; see Materials and methods for design matrix specifications). Corroborating the behavioral findings of greater associability value updating in participants with PTSD, neural encoding of associability values showed a significant relationship with PTSD at an FDR-corrected whole brain significance level in a network of regions including bilateral amygdala and insula, hypothesized areas of relevance for PTSD and associability-based learning (Pitman et al., 2012; Li et al., 2011) (Figure 4b; Supplementary file 1, Table 1C; see Figure 3d for associability-related signaling across participants after accounting for PTSD and covariates). 
To further test the localization of the increased associability-related activation in PTSD to our a priori hypothesized areas of amygdala and insula, we extracted beta values from these anatomical regions of interest (see Materials and methods for ROI definition). PTSD was significantly related to associability-related activation in these areas (amygdala: t(37) = 2.45, p<0.05; insula: t(37) = 3.87, p<0.001). Replacing the binary PTSD diagnosis with a dimensional measure of PTSD symptom severity (total CAPS score) across all veteran participants resulted in a similar pattern of effects (Figure 5a; Supplementary file 1, Table 1D). PTSD was not related to neural responses to prediction error at this whole brain level (Figure 5—figure supplement 1a). Follow up analyses covaried for presence of psychotropic medication, a positive screen for mild traumatic brain injury, or smoking status, and additionally tested effects within a subgroup of veterans with and without PTSD who were free from psychotropic medication and matched on estimated IQ; none of these covariates was significantly related to neural or behavioral results involving associability, and in the matched subgroup, PTSD remained related to behavioral and neural encoding of associability value (see Appendix 2, Supplemental Results for details). Figure 5. PTSD symptom clusters of hyperarousal and avoidance/numbing have greatest neural and behavioral interaction with associability value. (a) Relationship of total PTSD symptoms (CAPS score) and PTSD symptom clusters to neural encoding of loss associability value (FDR p<0.05; displayed at p<0.005 uncorrected to allow equivalent thresholding across images). (b) Switching behavior is explained by interaction of PTSD symptom clusters and neural associability value activation for total CAPS, avoidance/numbing and hyperarousal. 
Bars indicate improvement in model fit when adding an interaction term of symptom cluster severity to a mixed effects logistic regression model predicting switching behavior by an interaction of previous outcome with neural activity (likelihood ratio test of model fit improvement; χ2 > 10.60 indicates significant improvement at p<0.005 uncorrected for multiple comparisons [αBonferroni = .006]). Relationship of total PTSD symptoms and symptom clusters with prediction error; associability value and prediction error analyses with vmPFC and insula ROIs; and jackknife error distributions are shown in Figure 5—figure supplement 1. Interaction of amygdala and insula ROIs with previous outcomes and PTSD is illustrated in Figure 5—figure supplement 2.
https://doi.org/10.7554/eLife.30150.010

The lack of whole-brain relationship between neural prediction error signals and PTSD could be due to our conservative multiple comparison correction. To assess this possibility, we further investigated the relationship between prediction error activation and PTSD within prediction error-related regions of interest (ROIs) derived from a trauma-unexposed reference cohort, separate from the veteran cohort (described in Materials and methods). Neural responses in these ROIs (including ventral striatum and vmPFC) also did not show significant effects of PTSD for prediction error, even in neural regions strongly related to prediction error signaling (see Materials and methods for details; reference group prediction error activation is shown in Figure 4—figure supplement 1). Independent of PTSD, participants showed significant prediction error activation in striatum (left striatum: t42 = 3.56, p<0.001; right striatum: t42 = 3.02, p<0.005), further supporting intact PE-related signal that is unaffected by PTSD.
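The likelihood ratio threshold quoted in the Figure 5 caption can be illustrated with a short worked sketch. The degrees of freedom of the test are not stated in this section; the value 10.60 coincides with the critical value of a chi-square distribution with 2 degrees of freedom at p = 0.005, so the sketch assumes 2 degrees of freedom purely for illustration. The function names and the example log-likelihoods are hypothetical.

```python
import math


def chi2_sf_df2(stat):
    """Upper-tail probability of a chi-square variable with 2 degrees of
    freedom. For 2 degrees of freedom this has the closed form exp(-x / 2)."""
    return math.exp(-stat / 2.0)


def chi2_critical_df2(alpha):
    """Critical value x such that P(chi-square with 2 df > x) = alpha."""
    return -2.0 * math.log(alpha)


def likelihood_ratio_test(loglik_reduced, loglik_full):
    """Compare nested models: the statistic is twice the gain in
    log-likelihood of the full model over the reduced model.
    Assumes the full model adds 2 parameters (an illustrative assumption)."""
    stat = 2.0 * (loglik_full - loglik_reduced)
    return stat, chi2_sf_df2(stat)


if __name__ == "__main__":
    # Critical value at p = 0.005 under the 2-degrees-of-freedom assumption.
    print(round(chi2_critical_df2(0.005), 2))

    # Hypothetical log-likelihoods for a model without and with the
    # symptom-cluster interaction term.
    stat, p = likelihood_ratio_test(loglik_reduced=-1250.0, loglik_full=-1243.0)
    print(stat, p)
```

If the actual test used a different number of degrees of freedom, the closed-form expressions above would not apply and a general chi-square routine would be needed instead.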
To assess which PTSD symptom clusters are more associated with increased neural signaling of associability value, we examined the symptom clusters of re-experiencing, avoidance/numbing, and hyperarousal (American Psychiatric Association, 2000). Specifically, we tested the degree to which symptom severity in each cluster was related to neural correlates of trial-by-trial associability and prediction error, respectively. The hyperarousal and avoidance/numbing symptom clusters showed the most extensive neural responses corresponding with associability (Figure 5a; Supplementary file 1, Tables 2E and 2F), with little relationship with re-experiencing symptoms (Figure 5a; Supplementary file 1, Table 2G).

Finally, if neural encoding of associability is relevant for real-world behavioral disruptions in PTSD, the combination of PTSD and neural responsivity to associability value ought to be related to participants’ likelihood of adjusting behavioral choices based on past experiences. As described above, PTSD and the hyperarousal and avoidance/numbing symptom clusters showed a strong relationship with neural activation to associability, suggesting that the interaction of PTSD and neural substrates of associability should predict switching behavior over and above neural activation to associability value alone. We therefore carried out mixed effects logistic regressi
Abstract

Humans possess a remarkable ability to understand what is and is not being said by conversational partners. An important class of models hypothesizes that listeners decode the intended meaning of an utterance by assuming that speakers speak cooperatively, simulating the speaker’s rational choice process, and inverting this process to recover the speaker’s most probable meaning. We investigated whether and how rational simulations of speakers are represented in the listener’s brain while subjects participated in a referential communication game during fMRI. In three experiments, we show that the listener’s ventromedial prefrontal cortex encodes the probabilistic inference of what a cooperative speaker should say given a communicative goal and context. The listener’s striatum responds to the amount of update on the intended meaning, consistent with inverting a simulated mental model. These findings suggest a neural generative mechanism, subserved by frontal-striatal circuits, that underlies our ability to understand communicative and, more generally, social actions.