Processing global and local features in convolutional neural network (CNN) and primate visual systems

2018 
In the human visual system, visible objects are recognized by features, which can be classified into local features based on an object's simple components (e.g., line segments, angles, color) and global features based on the object as a whole (e.g., connectivity, number of holes). Over the past half century, anatomical, physiological, behavioral and computational studies of visual systems have led to a generally accepted model of vision, which starts with processing local features in the early stages of the visual pathways and then integrates them into global features in the later stages. However, this popular local-to-global model has been challenged by a set of experiments showing that the visual systems of humans, non-human primates and honey bees are more sensitive to global features than to local features. These “global-first” findings have motivated new paradigms and approaches for understanding human vision and building new vision models. In this study, we began a new series of experiments examining how two representative pre-trained Convolutional Neural Networks (CNNs), AlexNet and VGG-19, process local and global features. The CNNs were trained to classify geometric shapes into two categories based on local features (e.g., triangle, square and circle) or a global feature (e.g., having a hole). In contrast to the biological visual systems, the CNNs were more effective at classifying images based on local features than on the global feature. We further showed that adding distractors greatly lowered the performance of the CNNs, again unlike the biological visual systems. Ongoing studies will extend these analyses to other geometric invariants and to the internal representations of the CNNs. The overarching goal is to use the powerful CNNs as a tool to gain insights into biological visual systems, including those of humans and non-human primates.
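To make the experimental setup concrete, the following is a minimal sketch (not the authors' code, whose framework and hyperparameters are not given in the abstract) of how one might fine-tune a pre-trained AlexNet to classify shapes by the global "has a hole" feature. The dataset directory layout, training budget, and learning rate are illustrative assumptions.

```python
import torch
import torch.nn as nn
from torchvision import datasets, models, transforms

# Load AlexNet pre-trained on ImageNet and replace its final 1000-way
# layer with a 2-way classifier (e.g., "hole" vs. "no_hole").
model = models.alexnet(pretrained=True)
model.classifier[6] = nn.Linear(model.classifier[6].in_features, 2)

# Standard ImageNet preprocessing, applied to the synthetic shape images.
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

# Hypothetical layout: shapes/train/hole/*.png, shapes/train/no_hole/*.png
train_set = datasets.ImageFolder("shapes/train", transform=preprocess)
loader = torch.utils.data.DataLoader(train_set, batch_size=32, shuffle=True)

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)

model.train()
for epoch in range(5):  # small, illustrative training budget
    for images, labels in loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
```

The same substitution works for the other network studied: torchvision's VGG-19 (models.vgg19(pretrained=True)) also exposes its final layer as classifier[6], and a local-feature task (e.g., triangle vs. square) only changes the class folders, not the code.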