Efficient visual coding: From retina to V2

H Shan, G Cottrell - arXiv preprint arXiv:1312.6077, 2013 - arxiv.org
The human visual system has a hierarchical structure consisting of layers of processing, such as the retina, V1, V2, etc. Understanding the functional roles of these visual processing layers would help to integrate psychophysiological and neurophysiological models into a consistent theory of human vision, and would also provide insights for computer vision research. One classical theory of the early visual pathway hypothesizes that it serves to capture the statistical structure of visual inputs by efficiently coding the visual information in its outputs. Until recently, most computational models following this theory focused on explaining the receptive field properties of only one or two visual layers. Recent work on deep networks has addressed this limitation; however, there is still the retinal layer to consider. Here we build on a previously described hierarchical model, Recursive ICA (RICA) [1], which starts with PCA, followed by a layer of sparse coding or ICA, followed by a component-wise nonlinearity derived from considerations of the variable distributions expected by ICA; this process is then repeated. In this work, we improve on that model by using a new version of sparse PCA (sPCA), which results in biologically plausible receptive fields for both the sPCA and the ICA/sparse coding layers. When applied to natural image patches, our model learns visual features exhibiting the receptive field properties of retinal ganglion cells/lateral geniculate nucleus (LGN) cells, V1 simple cells, V1 complex cells, and V2 cells. Our work provides predictions for experimental neuroscience studies. For example, our results suggest that a previous neurophysiological study improperly discarded some of their recorded neurons; we predict that the discarded neurons capture the shape contours of objects.
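The layered pipeline the abstract describes (whitening, then ICA/sparse coding, then a component-wise nonlinearity, repeated) can be sketched as below. This is a minimal illustrative reconstruction, not the authors' code: `pca_whiten` uses plain PCA rather than the paper's sparse PCA, `fastica` is a bare-bones symmetric FastICA, and `rica_nonlinearity` is a simple stand-in for the distribution-derived transform described in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def pca_whiten(X, n_components):
    """PCA-whiten data X (samples x features). The paper uses a sparse
    PCA variant here; plain PCA is used as a simplification."""
    Xc = X - X.mean(axis=0)
    cov = Xc.T @ Xc / len(Xc)
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1][:n_components]
    W = eigvecs[:, order] / np.sqrt(eigvals[order] + 1e-8)
    return Xc @ W

def fastica(Z, n_iter=200):
    """Minimal symmetric FastICA (tanh contrast) on whitened Z.
    Returns an orthogonal unmixing matrix W (components as rows)."""
    n = Z.shape[1]
    W = rng.standard_normal((n, n))
    for _ in range(n_iter):
        # symmetric decorrelation keeps the components orthogonal
        U, _, Vt = np.linalg.svd(W)
        W = U @ Vt
        G = np.tanh(Z @ W.T)
        W = (G.T @ Z) / len(Z) - np.diag((1 - G**2).mean(axis=0)) @ W
    U, _, Vt = np.linalg.svd(W)
    return U @ Vt

def rica_nonlinearity(S):
    """Component-wise nonlinearity. The paper derives this from the
    variable distributions expected by ICA; a magnitude + log transform
    is used here purely as an illustrative placeholder."""
    return np.log1p(np.abs(S))

def rica_layer(X, n_components):
    """One layer of the recursive pipeline: whiten, unmix, rectify."""
    Z = pca_whiten(X, n_components)
    S = Z @ fastica(Z).T
    return rica_nonlinearity(S)

# Stacking two layers, as in the recursive scheme:
X = rng.standard_normal((500, 16))   # stand-in for image-patch data
layer1 = rica_layer(X, 8)
layer2 = rica_layer(layer1, 4)
```

In the paper the input to the first layer would be whitened natural image patches, and the learned filters at successive layers are compared against LGN, V1, and V2 receptive field properties.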