Uncertainty-driven exploration in the basal ganglia
Yuhao Wang, Armin Lak, Sanjay Manohar, Rafal Bogacz, University of Oxford, United Kingdom
Session:
Posters 3B Poster
Presentation Time:
Sat, 26 Aug, 13:00 - 15:00 United Kingdom Time
Abstract:
When facing an unfamiliar environment, animals need to explore different actions and update their knowledge of the resulting rewards, but they also need to put the updated knowledge to use as quickly as possible. Optimal reinforcement learning (RL) strategies should therefore assess the uncertainties of these action–reward associations and use them to inform decision making. We propose a mechanism whereby the direct and indirect striatal pathways act together to estimate both the mean and the variance of reward distributions, and mesolimbic dopaminergic neurons provide transient novelty signals, so that the basal ganglia (BG) facilitate effective uncertainty-driven exploration. We used electrophysiological recording data to verify the BG model, and fitted exploration strategies derived from the model to data from behavioural experiments. We also compared the directed exploration strategy from the BG model with variants of the upper confidence bound (UCB) strategy in simulation. The BG strategy performed better than the classic algorithms in simulation, and fitting the model to behavioural data gave results qualitatively similar to those of a previous study that used a more idealised model with fewer implementation-level details. Overall, our results suggest that transient dopamine levels in the BG that encode novelty can contribute to an uncertainty representation which drives exploration in RL.
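
The general idea of uncertainty-driven (directed) exploration described above can be illustrated with a minimal sketch. This is not the BG model itself, whose equations are not given in this abstract. It is a generic bandit learner that tracks a running mean and a running spread of reward for each action and adds a decaying novelty bonus. All names (`UncertaintyAgent`, `run_bandit`), the learning rate, the bonus weights, and the exact form of the bonus terms are assumptions made for illustration only.

```python
import math
import random


class UncertaintyAgent:
    """Toy learner: per-action reward mean, reward spread, and visit count.

    Hypothetical illustration only; not the model proposed in the abstract.
    """

    def __init__(self, n_actions, lr=0.1, spread_weight=1.0, novelty_weight=1.0):
        self.lr = lr
        self.spread_weight = spread_weight
        self.novelty_weight = novelty_weight
        self.mean = [0.0] * n_actions    # running estimate of mean reward
        self.spread = [1.0] * n_actions  # running mean absolute prediction error
        self.counts = [0] * n_actions    # how often each action was taken

    def score(self, action):
        # Value plus an uncertainty bonus plus a transient novelty bonus.
        # Both bonuses shrink as the action is sampled more often (assumed form).
        n = self.counts[action]
        uncertainty = self.spread_weight * self.spread[action] / math.sqrt(1 + n)
        novelty = self.novelty_weight / (1 + n)
        return self.mean[action] + uncertainty + novelty

    def choose(self):
        # Greedy choice on the bonus-augmented score (ties go to the lowest index).
        return max(range(len(self.mean)), key=self.score)

    def update(self, action, reward):
        # Prediction error drives both the mean and the spread estimates.
        delta = reward - self.mean[action]
        self.mean[action] += self.lr * delta
        self.spread[action] += self.lr * (abs(delta) - self.spread[action])
        self.counts[action] += 1


def run_bandit(reward_means=(0.2, 0.8), reward_sd=0.1, n_trials=500, seed=0):
    """Run the toy agent on a Gaussian bandit and return the trained agent."""
    rng = random.Random(seed)
    agent = UncertaintyAgent(len(reward_means))
    for _ in range(n_trials):
        action = agent.choose()
        reward = rng.gauss(reward_means[action], reward_sd)
        agent.update(action, reward)
    return agent


if __name__ == "__main__":
    trained = run_bandit()
    print("choice counts:", trained.counts)
    print("estimated means:", [round(m, 2) for m in trained.mean])
```

In this sketch every action starts with a large bonus, so each is tried early on. As the bonuses decay, choices concentrate on the action with the higher estimated mean, which is the qualitative behaviour that upper confidence bound style strategies aim for.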