P-1A.20

Assessing the optimality of reinforcement learning biases using evolutionary simulations

Stefano Palminteri, École Normale Supérieure - Institut National de la Santé et Recherche Médicale, France

Session:
Posters 1A Poster

Track:
Cognitive science

Location:
North Schools

Presentation Time:
Thu, 24 Aug, 17:00 - 19:00 United Kingdom Time

Abstract:
The tendency to repeat past choices more often than expected from the history of outcomes has been observed repeatedly in reinforcement learning experiments. It can be explained by at least two computational processes: asymmetric update and (gradual) choice perseveration. A recent meta-analysis suggests that both processes play a role in human reinforcement learning. However, while their descriptive power seems to be well investigated, they have not been compared regarding their normative potential (i.e., the relative advantages of these processes in terms of performance). In this study, we address this gap by running simulations using a new variant of an evolutionary algorithm to simulate reinforcement learning agents in a variety of scenarios. Our results show that asymmetric update (in the form of a positivity bias) is evolutionarily stable and optimal in many situations, while the emergence of gradual perseveration is less systematic and robust. Overall, our results illustrate that the positivity bias presents a broader statistical advantage compared to gradual perseveration.
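
The two processes named in the abstract can be illustrated with a minimal sketch. The abstract gives no equations, so everything below is an assumption: a common textbook formulation of asymmetric update (separate learning rates `alpha_pos` and `alpha_neg` for positive and negative prediction errors) and of gradual perseveration (a decaying choice trace weighted by `phi`), on a hypothetical two-armed bandit. The `evolve` function is a generic truncation-selection loop with Gaussian mutation, not the new evolutionary variant the study refers to, and all names and default values are illustrative.

```python
import math
import random


def run_agent(alpha_pos, alpha_neg, phi, tau=0.3, beta=5.0,
              p_reward=(0.3, 0.7), n_trials=200, seed=0):
    """Simulate one agent on a two-armed bandit; return its mean reward."""
    rng = random.Random(seed)
    q = [0.5, 0.5]   # value estimates for the two arms
    c = [0.0, 0.0]   # choice traces (assumed form of gradual perseveration)
    total = 0.0
    for _ in range(n_trials):
        # Logistic choice rule: value difference plus perseveration bonus.
        dv = beta * (q[1] - q[0]) + phi * (c[1] - c[0])
        a = 1 if rng.random() < 1.0 / (1.0 + math.exp(-dv)) else 0
        r = 1.0 if rng.random() < p_reward[a] else 0.0
        # Asymmetric update: the learning rate depends on the sign of the
        # prediction error (alpha_pos > alpha_neg is a positivity bias).
        delta = r - q[a]
        q[a] += (alpha_pos if delta > 0 else alpha_neg) * delta
        # Traces move toward 1 for the chosen arm and toward 0 otherwise.
        for i in range(2):
            c[i] += tau * ((1.0 if i == a else 0.0) - c[i])
        total += r
    return total / n_trials


def evolve(n_agents=20, n_generations=10, sigma=0.05, seed=0):
    """Generic truncation selection with Gaussian mutation (illustrative only).

    Each genome is a tuple (alpha_pos, alpha_neg, phi). n_agents should be
    even so that the population size stays constant across generations.
    """
    rng = random.Random(seed)
    pop = [(rng.random(), rng.random(), rng.uniform(-1.0, 1.0))
           for _ in range(n_agents)]
    for _ in range(n_generations):
        # Fitness is the mean reward of one simulated run per genome.
        scored = sorted(
            pop,
            key=lambda g: run_agent(*g, seed=rng.randrange(10**6)),
            reverse=True,
        )
        parents = scored[: n_agents // 2]
        pop = []
        for ap, an, ph in parents:
            for _ in range(2):  # two mutated offspring per surviving parent
                pop.append((
                    min(1.0, max(0.0, ap + rng.gauss(0, sigma))),
                    min(1.0, max(0.0, an + rng.gauss(0, sigma))),
                    ph + rng.gauss(0, sigma),
                ))
    return pop
```

Under these assumptions, calling `evolve()` and inspecting whether surviving genomes tend toward `alpha_pos > alpha_neg` or toward nonzero `phi` mirrors, in a very reduced form, the kind of comparison the abstract describes. A single noisy run per genome is a crude fitness estimate, so any pattern from this sketch should not be read as reproducing the reported results.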

License:
This work is licensed under a Creative Commons Attribution 3.0 Unported License.
DOI:
10.32470/CCN.2023.1032-0
Publication:
2023 Conference on Cognitive Computational Neuroscience
Session P-1A
P-1A.1: Rapid Learning Without Catastrophic Forgetting in Multiple Morris Water Mazes
Raymond Wang, Jaedong Hwang, Akhilan Boopathy, Ila Fiete, Massachusetts Institute of Technology, United States
P-1A.2: Revealing the high-dimensional latent structure in visual cortical representations
Raj Magesh Gauthaman, Brice Ménard, Michael Bonner, Johns Hopkins University, United States
P-1A.3: TorchLens: A Python package for extracting and visualizing hidden activations of PyTorch models
JohnMark Taylor, Nikolaus Kriegeskorte, Columbia University, United States
P-1A.4: Predicting Object Similarity and Grasping Behavior from Deep-CNN Layers: Is One Visual Hierarchy Enough?
Aida Mirebrahimi Tafreshi, Western University, Canada; Aryan Zoroufi, Massachusetts Institute of Technology, United States; Leslie Ungerleider, Chris Baker, Maryam Vaziri-Pashkam, National Institute of Mental Health, United States
P-1A.5: vrGazeCore: A toolbox for virtual reality eye-tracking analysis
Deepasri Prasad, Amanda J. Haskins, Thomas L. Botch, Dartmouth College, United States; Jeff Mentch, Massachusetts Institute of Technology, United States; Caroline E. Robertson, Dartmouth College, United States
P-1A.6: Dorsomedial frontopolar cortex determines whether social information influences decision making in macaques
Ali Mahmoodi, Caroline Harbison, Alessandro Bongioanni, Andrew Emberton, Lea Roumazeilles, Jerome Sallet, Nima Khalighinejad, Matthew Rushworth, University of Oxford, United Kingdom
P-1A.7: Static and dynamic decision bound adjustments during continuous monitoring for sensory targets
Harvey McCone, Richard Halpin, Trinity College Dublin, Ireland; Anna Geuzebroek, Simon Kelly, University College Dublin, Ireland; Redmond O'Connell, Trinity College Dublin, Ireland
P-1A.8: Novelty Drives Human Exploration Even When It Is Suboptimal
Alireza Modirshanechi, He A. Xu, Wei-Hsiang Lin, Michael H. Herzog, Wulfram Gerstner, EPFL, Switzerland
P-1A.9: Meta-cognitive Efficiency in Learned Value-based Choice
Sara Ershadmanesh, Ali Gholamzadeh, Max Planck Institute for Biological Cybernetics, Germany; Kobe Desender, Ghent University, Belgium; Peter Dayan, Max Planck Institute for Biological Cybernetics, Germany
P-1A.10: Excitatory-inhibitory cortical feedback enables hierarchical credit assignment
Will Greedy, Heng Wei Zhu, Joseph Pemberton, Kevin Nejad, Jack Mellor, Rui Ponte Costa, University of Bristol, United Kingdom
P-1A.11: Constructing and deconstructing bias: modeling privilege and mentorship in agent-based simulations
Andria Smith, Max Planck Institute for Intelligent Systems, Stuttgart, Germany; Simon Heuschkel, University of Tübingen, Germany; Ksenia Keplinger, Max Planck Institute for Intelligent Systems, Stuttgart, Germany; Charley Wu, University of Tübingen, Germany
P-1A.12: Modeling Infant Object Perception as Program Induction
Jan-Philipp Fränken, Stanford University, United States; Neil Bramley, Christopher Lucas, The University of Edinburgh, United Kingdom; Steven Piantadosi, University of California, Berkeley, United States
P-1A.13: Neural network modeling reveals diverse human exploration behaviors via state space analysis
Hua-Dong Xiong, University of Arizona, United States; Li Ji-An, University of California, San Diego, United States; Marcelo Mattar, New York University, United States; Robert Wilson, University of Arizona, United States
P-1A.14: Locus coeruleus-related insula activation supports implicit learning
Martin J. Dahl, Max Planck Institute for Human Development, Germany; Tiantian Li, Max Planck School of Cognition, Germany; Matthew R. Nassar, Brown University, United States; Mara Mather, University of Southern California, United States; Markus Werkle-Bergner, Max Planck Institute for Human Development, Germany
P-1A.15: Compositional Learning of a Numerical Reasoning Task in Artificial Neural Networks
Mia Whitefield, Sophie Arana, Christopher Summerfield, University of Oxford, United Kingdom
P-1A.16: Learning the cognitive control structure in neural networks through alternating learning and inference
Ali Hummos, Massachusetts Institute of Technology, United States; Matthew Nassar, Brown University, United States; Guangyu Robert Yang, Massachusetts Institute of Technology, United States
P-1A.17: Explainable Deep Learning for Arm Classification During Deep Brain Stimulation - Towards Digital Biomarkers for Closed-Loop Stimulation
Mathias Ramm Haugland, Anastasia Borovykh, Yen Tai, Shlomi Haar, Imperial College London, United Kingdom
P-1A.18: Simulating Broca’s and Wernicke’s aphasia with a plastic attractor
Lin Sun, University College London, United Kingdom; Sanjay G. Manohar, University of Oxford, United Kingdom
P-1A.19: ROI-to-ROI fMRI Brain Functional Connectivity Analysis of Flickering Light Stimulation for Entraining Gamma Waves
Kassymzhomart Kunanbayev, Jeongwon Lee, Dae-Shik Kim, KAIST, Korea (South)
P-1A.20: Assessing the optimality of reinforcement learning biases using evolutionary simulations
Stefano Palminteri, École Normale Supérieure - Institut National de la Santé et Recherche Médicale, France
P-1A.21: The Cognitive Mechanisms of Credit Assignment, and Learning about Control
Lisa Spiering, Hailey Trier, Jill O’Reilly, University of Oxford, United Kingdom; Nils Kolling, Inserm, France; Matthew Rushworth, University of Oxford, United Kingdom; Jacqueline Scholl, Inserm, France
P-1A.22: The Dynamic and Structured Nature of Learning and Memory
Hanqi Zhou, Álvaro Tejero-Cantero, Charley M. Wu, University of Tübingen, Germany
P-1A.23: Opening Computational Neuroscience to a Wider Audience: Virtual Escape Room for Kids
Isabelle Hoxha, CIAMS, Université Paris-Saclay, France; Noga Mudrik, The Johns Hopkins University, United States; Anne E. Urai, Leiden University, Netherlands; Dante Kienigiel, Instituto Tecnológico de Buenos Aires, Argentina; Jeremy Forest, Cornell University, United States; Mohamed Abdelhack, Krembil Centre for Neuroinformatics, Canada; Megan Peters, University of California Irvine, United States; Nick Halper, Neuromatch, United States; Ru-Yuan Zhang, Xinquan Lu, Shanghai Jiao Tong University, China; John S Butler, TU Dublin, Ireland
P-1A.24: Functional Connectivity: Continuous-Time Latent Factor Models for Neural Spike Trains
Meixi Chen, Martin Lysy, University of Waterloo, Canada; David Moorman, University of Massachusetts Amherst, United States; Reza Ramezan, University of Waterloo, Canada
P-1A.25: Learning from language and experience
Mark Ho, Todd Gureckis, New York University, United States
P-1A.26: Gamma-band sensory stimulation enhances episodic memory retrieval
Benjamin J. Griffiths, University of Birmingham, United Kingdom; Daniel Weinert, Ludwig-Maximilians-Universität, Germany; Ole Jensen, University of Birmingham, United Kingdom; Tobias Staudigl, Ludwig-Maximilians-Universität, Germany
P-1A.27: A General Method for Testing Bayesian Models with Neural Data
Gabor Lengyel, Sabyasachi Shivkumar, Ralf Haefner, University of Rochester, United States
P-1A.28: Differences in temporal adaptation across the human visual hierarchy are explained by delayed divisive normalization
Amber Brands, University of Amsterdam, Netherlands; Sasha Devore, Orrin Devinsky, Werner Doyle, Adeen Flinker, New York University School of Medicine, United States; Jonathan Winawer, New York University, United States; Iris Groen, University of Amsterdam, Netherlands
P-1A.29: Boosting noradrenaline improves economic rationality by enhancing attention in healthy humans
Hui-Kuan Chung, Philippe Tobler, University of Zurich, Switzerland
P-1A.30: Dynamic range adaptation in vast decision spaces: experimental evidence in humans
Nicolas Yax, Stefano Palminteri, Ecole Normale Superieure, France
P-1A.31: Spatiotemporal neural dynamics of working memory manipulation and reactivation
Jiaqi Li, Peking University, China; Ling Liu, Beijing Language and Culture University, China; Huan Luo, Peking University, China
P-1A.32: Maturation of Visual Pathways to the Amygdala and Their Role in Salient Information Processing Across Early Adolescence
Arshiya Sangchooli, Elise Rowe, University of Melbourne, Australia; Robert Smith, The Florey Institute of Neuroscience and Mental Health, Australia; Marta Garrido, University of Melbourne, Australia
P-1A.33: The neural dynamics of auditory word recognition and integration
Jon Gauthier, Roger Levy, Massachusetts Institute of Technology, United States
P-1A.34: Rapid Processing of Observed Touch Through a Social Perceptual Pathway: an EEG-fMRI fusion study
Haemy Lee Masson, Durham University, United Kingdom; Leyla Isik, Johns Hopkins University, United States
P-1A.35: Toward a More Neurally Plausible Neural Network Model of Latent Cause Inference
Qihong Lu, Princeton University, United States; Tan Nguyen, Washington University in St. Louis, United States; Uri Hasson, Thomas Griffiths, Princeton University, United States; Jeffrey Zacks, Washington University in St. Louis, United States; Samuel Gershman, Harvard University, United States; Kenneth Norman, Princeton University, United States
P-1A.36: Generalization of Covariance Structure in Human and Neural Network
Zilu Liang, Miriam Klein-Flugge, Christopher Summerfield, University of Oxford, United Kingdom
P-1A.37: Exploring a Basis Set of Intrinsic Functions Underlying Neural Computation by Symbolically Programming Recurrent Neural Networks
Daniel Calbick, Ilker Yildirim, Yale University, United States; Jason Kim, Cornell University, United States
P-1A.38: Studying Continuous Structural Neuroplasticity During Motor Learning Using Diffusion MRI
Naama Friedman, Inbar Paretz, Ido Tavor, Tel Aviv University, Israel
P-1A.39: Inferring the existence of objects from their physical interactions
Pat Little, Todd Gureckis, New York University, United States
P-1A.40: Optimising Recurrent Neural Networks for System-Level Communication Results in Low-Entropy Structural Robustness
Cornelia Sheeran, Duncan Astle, Jascha Achterberg, Danyal Akarca, University of Cambridge, United Kingdom
P-1A.41: Regularised neural networks mimic human insight
Anika Löwe, Max Planck Institute for Human Development, Germany; Léo Touzo, Ecole Normale Supérieure, France; Paul Muhle-Karbe, University of Birmingham, United Kingdom; Andrew Saxe, University College London, United Kingdom; Christopher Summerfield, University of Oxford, United Kingdom; Nicolas Schuck, Max Planck Institute for Human Development, Germany
P-1A.42: Synergizing Anatomy and Function: A Goal-driven Model of Frontoparietal Dexterous Object Manipulation
Tonio Weidler, Rainer Goebel, Mario Senden, Maastricht University, Netherlands
P-1A.43: Variational inference for continuous time causal learning
Victor J. Btesh, University College London, United Kingdom; Neil R. Bramley, University of Edinburgh, United Kingdom; J.-Philipp Fränken, Stanford University, United States; Maarten Speekenbrink, David A. Lagnado, University College London, United Kingdom
P-1A.44: A Bayesian model for online meta-planning
Ionatan Kuperwajs, Mark Ho, Wei Ji Ma, New York University, United States
P-1A.45: Perceptual Reality Monitoring as Higher-Order Inference on Sensory Precision
Nadine Dijkstra, Stephen Fleming, University College London, United Kingdom
P-1A.46: The effect of estimation time window length on overlap correction in EEG data
René S. Skukies, Benedikt V. Ehinger, University of Stuttgart, Germany
P-1A.47: Canonical dimensions of vision
Zirui Chen, Michael Bonner, Johns Hopkins University, United States
P-1A.48: Do better models of fMRI visual response better predict mental imagery responses?
Ghislain St-Yves, Jesse Breedlove, Kendrick Kay, Thomas Naselaris, University of Minnesota, United States
P-1A.49: Learning the value of control with Deep RL
Kai Sandbrink, Christopher Summerfield, University of Oxford, United Kingdom
P-1A.50: Functional relevance alters the neural geometry of novel instructed actions
Carlos González-García, University of Granada, Spain; Silvia Formica, Humboldt Universität zu Berlin, Germany; Ana F. Palenciano, University of Granada, Spain; Marcel Brass, Humboldt Universität zu Berlin, Germany
P-1A.51: Nonsalient shapes capture attention automatically and without awareness after perceptual learning
Yulong Ding, South China Normal University, China
P-1A.52: Pushing the Limits of Learning from Limited Data
Maya Malaviya, Ilia Sucholutsky, Thomas L. Griffiths, Princeton University, United States
P-1A.53: Control Limited Perceptual Decision Making
Juan R. Castiñeiras de Saa, Alfonso Renart, Champalimaud Foundation, Portugal
P-1A.54: Neural feature dimension maps track priority during visual search
Daniel Thayer, Thomas Sprague, University of California, Santa Barbara, United States
P-1A.55: Computational Modeling of Traveling Waves Using MEG-EEG in Human
Laetitia Grabot, Garance Merholz, Université Paris Cité, CNRS, France; Jonathan Winawer, David J. Heeger, Department of Psychology, New York University, United States; Laura Dugué, Université Paris Cité, CNRS, France
P-1A.56: Modeling Brain Responses to Video Stimuli Using Multimodal Video Transformers
Dota Tianai Dong, Max Planck Institute for Psycholinguistics, Netherlands; Mariya Toneva, Max Planck Institute for Software Systems, Germany
P-1A.57: Computational Tracking of Parkinsonian Motor Fluctuations in a Real-World Setting: a case study
Ainara Carpio Chicote, Julian Jeyasingh-Jacob, Subati Abulikemu, Shlomi Haar, Imperial College London, United Kingdom
P-1A.59: Differences in dynamic reconfiguration of whole-brain connectivity are related to individual differences in working memory task performance
Maren Wehrheim, Goethe University Frankfurt, Germany; Joshua Faskowitz, National Institute of Mental Health, United States; Christian Fiebach, Goethe University Frankfurt, Germany
P-1A.60: Fluctuations in Risk Attitudes Arise Systematically from Varying Noise in Bayesian Magnitude Perception
Gilles de Hollander, Marcus Grueschow, Christian Ruff, University of Zurich, Switzerland