What aspects of NLP models and brain datasets affect brain-NLP alignment?
Subba Reddy Oota, Inria Bordeaux, France; Mariya Toneva, MPI for Software Systems, Germany
Session:
Posters 2B Poster
Presentation Time:
Fri, 25 Aug, 13:00 - 15:00 United Kingdom Time
Abstract:
Recent brain encoding studies highlight the potential for natural language processing models to improve our understanding of language processing in the brain. At the same time, naturalistic fMRI datasets are becoming increasingly available and open further avenues for understanding the alignment between brains and models. However, given the multitude of available models and datasets, it can be difficult to know which aspects of each are important to consider. In this work, we present a systematic study of brain alignment across five naturalistic fMRI datasets, two stimulus modalities (reading vs. listening), and different Transformer text and speech models. We find that all text-based language models are significantly better at predicting brain responses than all speech models for both modalities. Further, bidirectional language models better predict fMRI responses and generalize across datasets and modalities.
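For readers unfamiliar with brain encoding, the following is a minimal sketch of the general idea, not the authors' pipeline. The abstract does not specify the regression method or the evaluation metric; the sketch assumes ridge regression from model features to voxel responses, scored by per-voxel Pearson correlation on held-out data, which is a common choice in encoding studies. All data here are synthetic, and the names `fit_ridge` and `voxelwise_pearson` are hypothetical helpers introduced only for illustration.

```python
import numpy as np


def fit_ridge(X, Y, alpha=1.0):
    """Closed-form ridge regression: W = (X^T X + alpha * I)^-1 X^T Y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ Y)


def voxelwise_pearson(Y_true, Y_pred):
    """Pearson correlation between true and predicted responses, per voxel."""
    yt = Y_true - Y_true.mean(axis=0)
    yp = Y_pred - Y_pred.mean(axis=0)
    denom = np.linalg.norm(yt, axis=0) * np.linalg.norm(yp, axis=0)
    return (yt * yp).sum(axis=0) / denom


# Synthetic stand-ins (assumption): rows are time points, X columns are
# model features, Y columns are voxels. Real studies would use features
# extracted from a language or speech model and recorded fMRI responses.
rng = np.random.default_rng(0)
n_train, n_test, n_features, n_voxels = 200, 50, 16, 5
W_true = rng.normal(size=(n_features, n_voxels))
X_train = rng.normal(size=(n_train, n_features))
X_test = rng.normal(size=(n_test, n_features))
Y_train = X_train @ W_true + 0.5 * rng.normal(size=(n_train, n_voxels))
Y_test = X_test @ W_true + 0.5 * rng.normal(size=(n_test, n_voxels))

# Fit on training data, then score predictions on held-out data.
W = fit_ridge(X_train, Y_train, alpha=1.0)
scores = voxelwise_pearson(Y_test, X_test @ W)
print(scores.shape)
```

Under this setup, comparing models amounts to swapping the feature matrix (for example, text-model versus speech-model features) and comparing the resulting held-out scores. In practice the regularization strength `alpha` would be chosen by cross-validation rather than fixed.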