Neural-Linear Architectures for Sequential Decision Making

2019 
Making optimal decisions while learning in dynamic environments such as Markov Decision Processes and multi-armed bandits often requires accurate uncertainty estimates. However, learning directly from raw, high-dimensional inputs such as vision and natural language is typically done with deep neural networks, for which such accurate estimates are not available. Neural-linear algorithms address this challenge by running linear algorithms (for which accurate uncertainty estimates exist) on top of a non-linear (deep) representation that is learned directly from the raw input. Such architectures have recently been explored and shown to outperform both deep and linear state-of-the-art algorithms. A practical challenge in this approach is that the linear algorithm assumes the representation is fixed over time, while the deep-learning-based representation changes as optimization proceeds. In this talk, I will review recent neural-linear algorithms and discuss an algorithmic approach for handling representations that change over time. In particular, I will present a linear fitted Q-iteration algorithm that refines the weights of the last layer of a deep Q-network and improves its performance in the Arcade Learning Environment; a neural-linear Thompson sampling algorithm for contextual bandits and deep reinforcement learning; and an action-elimination algorithm for text-based games that eliminates actions based on a linear upper confidence bound.
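To make the last-layer refinement concrete, here is a minimal sketch of the linear fitted-Q step: holding the deep features fixed, the final layer's weights are refit by ridge regression toward bootstrapped Q-targets. All names (`refit_last_layer`, the batch layout, the per-action regression) are illustrative assumptions, not the exact method from the talk.

```python
import numpy as np

def refit_last_layer(phi, actions, rewards, phi_next, dones, w,
                     gamma=0.99, reg=1.0):
    """Refit the last linear layer of a Q-network by ridge regression
    on fitted-Q targets, keeping the deep features fixed.

    phi:      (N, d) features of visited states s (hypothetical layout)
    actions:  (N,)   actions taken in those states
    rewards:  (N,)   observed rewards
    phi_next: (N, d) features of successor states s'
    dones:    (N,)   episode-termination flags (1.0 at terminals)
    w:        (d, n_actions) current last-layer weights
    Returns new (d, n_actions) weights.
    """
    # Bootstrapped targets: r + gamma * max_a Q(s', a), zeroed at terminals
    q_next = (phi_next @ w).max(axis=1)
    y = rewards + gamma * q_next * (1.0 - dones)

    d = phi.shape[1]
    w_new = w.copy()
    for a in range(w.shape[1]):
        mask = actions == a
        if not mask.any():
            continue  # no data for this action in the batch
        X, t = phi[mask], y[mask]
        # Regularized least squares for action a's weight column
        w_new[:, a] = np.linalg.solve(X.T @ X + reg * np.eye(d), X.T @ t)
    return w_new
```

In practice such a refit would alternate with ordinary gradient-based training of the deep network that produces `phi`, so the linear solve periodically corrects the last layer against the current (slowly changing) representation.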
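The neural-linear Thompson sampling idea can likewise be sketched in a few lines: a Bayesian linear model is maintained over the last-layer features, and an arm is chosen by sampling weights from the posterior. This is a minimal sketch of one common variant (independent per-arm linear models with a Sherman–Morrison posterior update); the function names and the noise parameter `sigma2` are assumptions for illustration.

```python
import numpy as np

def neural_linear_ts_choose(features, A_inv, b, sigma2=0.25):
    """Pick an arm by Thompson sampling over Bayesian linear models
    fit on (fixed) deep features.

    features: (n_arms, d) last-layer representation of each arm's context
    A_inv:    list of (d, d) inverse precision matrices, one per arm
    b:        list of (d,) reward-weighted feature sums, one per arm
    """
    sampled = []
    for a, x in enumerate(features):
        mu = A_inv[a] @ b[a]                        # posterior mean
        cov = sigma2 * A_inv[a]                     # posterior covariance
        w = np.random.multivariate_normal(mu, cov)  # sample weight vector
        sampled.append(w @ x)                       # sampled reward estimate
    return int(np.argmax(sampled))

def neural_linear_ts_update(A_inv, b, a, x, r):
    """Rank-1 (Sherman-Morrison) update of arm a's posterior after
    observing reward r for feature vector x."""
    Ax = A_inv[a] @ x
    A_inv[a] = A_inv[a] - np.outer(Ax, Ax) / (1.0 + x @ Ax)
    b[a] = b[a] + r * x
```

Because the linear posterior assumes the features are fixed, a changing deep representation invalidates the accumulated statistics, which is exactly the challenge the talk's algorithmic approach addresses.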