Linear reinforcement learning in planning, grid fields, and cognitive control.

2021 
It is thought that the brain’s judicious reuse of previous computation underlies our ability to plan flexibly, but also that inappropriate reuse gives rise to inflexibilities like habits and compulsion. Yet we lack a complete, realistic account of either. Building on control engineering, here we introduce a model for decision making in the brain that reuses a temporally abstracted map of future events to enable biologically realistic, flexible choice at the expense of specific, quantifiable biases. It replaces the classic nonlinear, model-based optimization with a linear approximation that softly maximizes around (and is weakly biased toward) a default policy. This solution demonstrates connections between seemingly disparate phenomena across behavioral neuroscience, notably flexible replanning with biases and cognitive control. It also provides insight into how the brain can represent maps of long-distance contingencies stably and componentially, as in entorhinal response fields, and exploit them to guide choice even under changing goals.

Models of decision making have so far been unable to account for how humans’ choices can be flexible yet efficient. Here the authors present a linear reinforcement learning model which explains both flexibility and rare limitations, such as habits, as arising from efficient approximate computation.
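The idea of replacing nonlinear optimization with a linear computation around a default policy can be illustrated with a minimal sketch. It assumes the standard linearly solvable MDP form from the control literature, in which a value satisfies v(s) = r(s) + λ log Σ P(s'|s) exp(v(s')/λ), so that z = exp(v/λ) solves a linear system. The abstract does not state these equations; the function name `linear_rl_values`, the parameter `lam`, and the five-state chain are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def linear_rl_values(P, r, terminal, lam=1.0):
    """Sketch of a linearly solvable MDP solve (assumed form, see lead-in).

    P: (n, n) transition matrix under the default policy.
    r: (n,) per-state rewards.
    terminal: (n,) boolean mask of terminal (goal/absorbing) states.
    lam: weight on the cost of deviating from the default policy.
    Returns (v, pi, z): values, softly optimal policy, and z = exp(v / lam).
    """
    P = np.asarray(P, dtype=float)
    r = np.asarray(r, dtype=float)
    T = np.asarray(terminal, dtype=bool)
    N = ~T

    z = np.empty(len(r))
    z[T] = np.exp(r[T] / lam)  # terminal states: value equals reward

    # Nonterminal states: (diag(exp(-r_N / lam)) - P_NN) z_N = P_NT z_T
    A = np.diag(np.exp(-r[N] / lam)) - P[np.ix_(N, N)]
    b = P[np.ix_(N, T)] @ z[T]
    z[N] = np.linalg.solve(A, b)

    v = lam * np.log(z)

    # Softly optimal policy: default policy reweighted by z of the successor
    pi = P * z[None, :]
    pi /= pi.sum(axis=1, keepdims=True)
    return v, pi, z

# Hypothetical toy problem: a 5-state chain with absorbing ends.
# The default policy is an unbiased random walk; each step costs 1.
P = np.array([
    [1.0, 0.0, 0.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0, 0.0],
    [0.0, 0.5, 0.0, 0.5, 0.0],
    [0.0, 0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 0.0, 0.0, 1.0],
])
r = np.array([0.0, -1.0, -1.0, -1.0, 5.0])
terminal = np.array([True, False, False, False, True])

v, pi, z = linear_rl_values(P, r, terminal, lam=1.0)
print("values:", np.round(v, 3))
print("policy from middle state:", np.round(pi[2], 3))
```

In this sketch, changing the terminal rewards only changes the right-hand side `b`, so the same matrix `A` (and its inverse) could be reused across goals. The resulting policy leans toward the rewarded end of the chain while staying tied to the random-walk default, and a larger `lam` pulls it closer to that default.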