Humans and 3D neural field models make similar 3D shape judgements
Thomas O'Connell, MIT, United States; Tyler Bonnen, Stanford University, United States; Yoni Friedman, Ayush Tewari, Josh Tenenbaum, Vincent Sitzmann, Nancy Kanwisher, MIT, United States
Session:
Contributed Talks 2 Lecture
Location:
South Schools / East Schools
Presentation Time:
Sat, 26 Aug, 17:15 - 17:30 United Kingdom Time
Abstract:
Human visual perception captures the 3D shape of objects. While convolutional neural networks (CNNs) resemble some aspects of human visual processing, they fail to explain human shape perception. A new deep learning approach, 3D neural fields (3D-NFs), has driven remarkable recent progress in 3D graphics and computer vision. 3D-NFs encode the geometry of objects in a continuous, coordinate-based representation. Here, we investigate whether humans and 3D-NFs make similar trial-level 3D shape judgements on match-to-sample tasks with rendered stimuli. In Experiment 1, 3D-NF behavior is more similar to human behavior than is the behavior of standard CNNs trained on ImageNet, regardless of whether lure objects are (a) from a different category than the target, (b) from the same category as the target, or (c) selected to have a 3D-NF as similar as possible to that of the target. In Experiment 2, to accentuate the differences that separate humans and 3D-NFs from CNNs, five difficulty conditions were defined based on the performance of 25 ImageNet CNNs. Again, we find that 3D-NF and human behavior are well aligned, with both showing high accuracy even on trials where CNNs fail. Overall, 3D-NFs and humans show similar patterns of 3D shape judgements, suggesting that 3D-NFs are a promising framework for investigating human 3D shape perception.
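As a rough illustration of how a model can be scored on a match-to-sample trial and then compared with humans trial by trial, here is a minimal Python sketch. It is not the authors' pipeline. The abstract does not say how stimuli are embedded, how similarity is measured, or which agreement statistic is used, so the cosine similarity, the Pearson correlation, the function names, and the toy vectors below are all assumptions made for illustration.

```python
import numpy as np


def cosine_similarity(a, b):
    """Cosine similarity between two 1-D embedding vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def match_to_sample_choice(sample, choices):
    """Return the index of the choice most similar to the sample.

    Assumption: the model "responds" by picking the candidate whose
    embedding has the highest cosine similarity to the sample embedding.
    """
    sims = [cosine_similarity(sample, c) for c in choices]
    return int(np.argmax(sims))


def trial_level_correlation(model_correct, human_accuracy):
    """Pearson correlation between per-trial model correctness (0 or 1)
    and per-trial mean human accuracy (between 0 and 1).

    Assumption: this is one possible trial-level agreement measure,
    not necessarily the one used in the study.
    """
    return float(np.corrcoef(model_correct, human_accuracy)[0, 1])


# Toy trial with made-up embeddings (hypothetical values).
sample = np.array([1.0, 0.0, 0.2])   # embedding of the sample image
target = np.array([0.9, 0.1, 0.3])   # same object, different viewpoint
lure = np.array([0.0, 1.0, 0.1])     # different object
choice = match_to_sample_choice(sample, [target, lure])  # index 0 is the target
```

In this sketch a trial counts as correct when `choice` is the index of the target. Collecting that 0/1 outcome across trials and passing it to `trial_level_correlation` together with the mean human accuracy for the same trials gives a single agreement score per model.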