CT-2.4

Humans and 3D neural field models make similar 3D shape judgements

Thomas O'Connell, MIT, United States; Tyler Bonnen, Stanford University, United States; Yoni Friedman, Ayush Tewari, Josh Tenenbaum, Vincent Sitzmann, Nancy Kanwisher, MIT, United States

Session:
Contributed Talks 2 Lecture

Track:
Cognitive science

Location:
South Schools / East Schools

Presentation Time:
Sat, 26 Aug, 17:15 - 17:30 United Kingdom Time

Abstract:
Human visual perception captures the 3D shape of objects. While convolutional neural networks (CNNs) resemble human visual processing in some respects, they fail to explain human shape perception. A new deep learning approach, 3D neural fields (3D-NFs), has driven remarkable recent progress in 3D graphics and computer vision. 3D-NFs encode the geometry of objects in a continuous, coordinate-based representation. Here, we investigate whether humans and 3D-NFs make similar trial-level 3D shape judgements on match-to-sample tasks with rendered stimuli. In Experiment 1, 3D-NF behavior is more similar to human behavior than is the behavior of standard CNNs trained on ImageNet, regardless of whether lure objects are (a) from a different category than the target, (b) from the same category as the target, or (c) selected so that their 3D-NF is as similar as possible to the target's. In Experiment 2, to accentuate the differences that separate CNNs from both humans and 3D-NFs, five difficulty conditions are defined based on the performance of 25 ImageNet CNNs. Again, we find that 3D-NF and human behavior are well aligned, with both showing high accuracy even on trials where CNNs fail. Overall, 3D-NFs and humans show similar patterns of 3D shape judgements, suggesting that 3D-NFs are a promising framework for investigating human 3D shape perception.

Manuscript:
License:
This work is licensed under a Creative Commons Attribution 3.0 Unported License.
DOI:
10.32470/CCN.2023.1603-0
Publication:
2023 Conference on Cognitive Computational Neuroscience
Session CT-2
CT-2.1: Mental Imagery: Weak Vision or Compressed Vision?
Tiasha Saha Roy, Jesse Breedlove, Ghislain St-Yves, Kendrick Kay, Thomas Naselaris, University of Minnesota, United States
CT-2.2: Leveraging Artificial Neural Networks to Enhance Diagnostic Efficiency in Autism Spectrum Disorder: A Study on Facial Emotion Recognition
Kushin Mukherjee, University of Wisconsin-Madison, United States; Na Yeon Kim, California Institute of Technology, United States; Shirin Taghian Alamooti, York University, Canada; Ralph Adolphs, California Institute of Technology, United States; Kohitij Kar, York University, Canada
CT-2.3: Dropout as a tool for understanding information distribution in human and machine visual systems
Jacob S. Prince, Harvard University, United States; Gabriel Fajardo, Boston College, United States; George A. Alvarez, Talia Konkle, Harvard University, United States
CT-2.4: Humans and 3D neural field models make similar 3D shape judgements
Thomas O'Connell, MIT, United States; Tyler Bonnen, Stanford University, United States; Yoni Friedman, Ayush Tewari, Josh Tenenbaum, Vincent Sitzmann, Nancy Kanwisher, MIT, United States
CT-2.5: Humans and CNNs see differently: Action affordances are represented in scene-selective visual cortex but not CNNs
Clemens G. Bartnik, Iris I.A. Groen, University of Amsterdam, Netherlands
CT-2.6: Beyond Geometry: Comparing the Temporal Structure of Computation in Neural Circuits with Dynamic Mode Representational Similarity Analysis
Mitchell Ostrow, Adam Eisen, Leo Kozachkov, Ila Fiete, Massachusetts Institute of Technology, United States