When using synthetic imagery it is essential that it is fit for purpose. Imagery can be rendered at different levels of quality, depending on the application. For example, in a real-time system rendering speed is a critical parameter, whereas in assessing the effectiveness of a camouflage system physical accuracy is likely to be more important. A method to quantify the accuracy of the imagery, for particular applications, is therefore necessary. A range of metrics based on wavelets, higher-order statistics, and a human vision model has been developed to assess the fidelity of synthetic imagery. These metrics have been used to analyze synthetic imagery rendered at different levels of fidelity and to compare the synthetic imagery with real-world imagery. Some of the metrics can be used to compare two spatially correlated images, whereas others can be used to assess particular characteristics of the image such as clutter level. The metrics, together with first-order statistics, have been incorporated into a toolbox called FIRE (Fidelity Investigations and Reporting Environment). This paper describes the metrics used and the results of the analyses undertaken.
In almost every study of the linearity of spatiotemporal summation in simple cells of the cat's visual cortex, there have been systematic mismatches between the experimental observations and the predictions of the linear theory. These mismatches have generally been explained by supposing that the initial spatiotemporal summation stage is strictly linear, but that the following output stage of the simple cell is subject to some contrast-dependent nonlinearity. Two main models of the output nonlinearity have been proposed: the threshold model (e.g. Tolhurst & Dean, 1987) and the contrast-normalization model (e.g. Heeger, 1992a, b). In this paper, the two models are fitted rigorously to a variety of previously published neurophysiological data, in order to determine whether one model is a better explanation of the data. We reexamine data on the interaction between two bar stimuli presented in different parts of the receptive field; on the relationship between the receptive-field map and the inverse Fourier transform of the spatial-frequency tuning curve; on the dependence of response amplitude and phase on the spatial phase of stationary gratings; on the relationships between the responses to moving and modulated gratings; and on the suppressive action of gratings moving in a neuron's nonpreferred direction. In many situations, the predictions of the two models are similar, but the contrast-normalization model usually fits the data slightly better than the threshold model, and it is easier to apply the equations of the normalization model. More importantly, the normalization model is naturally able to account very well for the details and subtlety of the results in experiments where the total contrast energy of the stimuli changes; some of these phenomena are completely beyond the scope of the threshold model.
Rigorous application of the models' equations has revealed some situations where neither model fits quite well enough, and we must suppose, therefore, that there are some subtle nonlinearities still to be characterized.
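To illustrate why a change in total contrast energy separates the two kinds of output nonlinearity, the following minimal Python sketch contrasts a simple threshold rule with a divisive normalization rule. The functional forms are deliberately simplified, and every name and parameter value (`threshold`, `gain`, `sigma`, `r_max`, the stimulus contrasts) is a hypothetical choice for illustration; these are not the equations or fitted values used in the paper.

```python
def threshold_response(linear_drive, threshold=0.2, gain=1.0):
    """Threshold sketch: output stays at zero until the linear drive
    exceeds a fixed threshold, then grows linearly (assumed form)."""
    return gain * max(0.0, linear_drive - threshold)


def normalized_response(linear_drive, contrast_energy, sigma=0.1, r_max=1.0):
    """Normalization sketch: half-squared linear drive divided by the total
    contrast energy plus a semisaturation constant (assumed form)."""
    half_squared = max(0.0, linear_drive) ** 2
    return r_max * half_squared / (sigma ** 2 + contrast_energy)


# A preferred stimulus of contrast c, shown alone and then together with a
# mask that is assumed to add contrast energy but no linear drive.
c = 0.5
mask_contrast = 0.5

alone_thr = threshold_response(c)
masked_thr = threshold_response(c)  # linear drive unchanged by the mask

alone_norm = normalized_response(c, c ** 2)
masked_norm = normalized_response(c, c ** 2 + mask_contrast ** 2)

print("threshold model:     alone", alone_thr, "masked", masked_thr)
print("normalization model: alone", alone_norm, "masked", masked_norm)
```

In this toy setting the threshold rule gives the same output with and without the mask, whereas the normalization rule is suppressed by the added contrast energy, which is the qualitative distinction drawn above.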
Individual V1 neurons respond dynamically over only limited ranges of stimulus contrasts, yet we can discriminate contrasts over a wide range. Different V1 neurons cover different parts of the contrast range, and the information they provide must be pooled somehow. We describe a probabilistic pooling model that shows that populations of neurons with contrast responses like those in cat and monkey V1 would most accurately code contrasts in the range actually found in natural scenes. The pooling equation is similar to Bayes's equation; however, explicit inclusion of prior probabilities in the inference increases coding accuracy only slightly.
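The idea of pooling a population of narrow-range neurons can be sketched as follows. This is only an illustration under stated assumptions, not the model of the paper: each neuron is given a Naka-Rushton contrast-response function, spike counts are assumed to be independent and Poisson distributed, and contrast is decoded by summing log-likelihoods across neurons, optionally adding a log prior. All names and values (`naka_rushton`, `c50s`, `r_max`, `n`, the candidate grid) are hypothetical.

```python
import math


def naka_rushton(c, c50, r_max=30.0, n=2.0):
    """Assumed contrast-response function: saturating, with semisaturation c50."""
    return r_max * c ** n / (c ** n + c50 ** n)


def log_likelihood(counts, c, c50s):
    """Sum of independent Poisson log-likelihoods over the population."""
    total = 0.0
    for k, c50 in zip(counts, c50s):
        rate = naka_rushton(c, c50) + 1e-9  # small offset avoids log(0)
        total += k * math.log(rate) - rate - math.lgamma(k + 1)
    return total


def decode_contrast(counts, c50s, candidates, log_prior=None):
    """Return the candidate contrast with the highest pooled (posterior) score."""
    best_c, best_score = None, -math.inf
    for i, c in enumerate(candidates):
        score = log_likelihood(counts, c, c50s)
        if log_prior is not None:
            score += log_prior[i]  # explicit prior, as in Bayes's equation
        if score > best_score:
            best_c, best_score = c, score
    return best_c


# Five neurons whose dynamic ranges tile different parts of the contrast axis.
c50s = [0.05, 0.1, 0.2, 0.4, 0.8]
candidates = [0.01 * i for i in range(1, 101)]

true_contrast = 0.2
counts = [round(naka_rushton(true_contrast, c50)) for c50 in c50s]
decoded = decode_contrast(counts, c50s, candidates)
print("spike counts:", counts, "decoded contrast:", decoded)
```

Passing a flat `log_prior` leaves the decoded value unchanged; a non-flat prior shifts it only when the pooled likelihood is broad.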
The albino visual cortex receives input from the ipsilateral visual field. To investigate how the visual cortex of human albinos organizes this abnormal input, we applied retinotopic mapping fMRI procedures. Two subjects with albinism and only small nystagmus and two control subjects underwent T2* MRI scanning of the occipital lobe during visual stimulation. In separate experiments we monocularly stimulated the nasal and temporal retina with phase-encoded visual stimuli (Engel et al. 1997). BOLD responses were projected onto the flattened representation of T1-weighted images, Fourier analysed and correlated with the stimulus fundamental frequency. Retinotopic mapping yielded phase maps that allowed the identification of V1 and of the dorsal and ventral representations of V2 and V3 in both control and albino subjects. In the controls, V1 comprised a representation of the contralateral visual field, whereas in the albino subjects it comprised a representation of both the contralateral and the ipsilateral visual field. The normal contralateral and the abnormal ipsilateral representations are, at the resolution of fMRI, arranged as an overlay. We obtained evidence for a similar arrangement in other early visual areas. Our results indicate that, in the albinos tested, there has been no reordering of the geniculostriate projection of the kind reported in other species. Furthermore, the complete suppression of the abnormal input to the cortex that has been documented in cat and ferret appears to be absent. We conclude that, in the human albino, representations of mirror-symmetric positions in the visual field occupy neighbouring regions of the cortex. This feature may have behavioural significance for tasks performed in regions of the visual field where fibres project aberrantly. Engel, S.A., Glover, G.H. & Wandell, B.A. (1997) Retinotopic organization in human visual cortex and the spatial precision of functional MRI. Cereb. Cortex, 7, 181–92.
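The Fourier step in a phase-encoded analysis can be sketched for a single voxel as below. This is a generic illustration under stated assumptions, not the pipeline used in the study: the function name `phase_encoded_fit`, the coherence-like measure (amplitude at the stimulus frequency relative to the root-sum-square amplitude over all non-DC frequencies) and the synthetic time series are all hypothetical choices.

```python
import numpy as np


def phase_encoded_fit(timeseries, n_cycles):
    """Return (coherence, delay) for one voxel time series.

    n_cycles is the number of stimulus cycles in the run, i.e. the index of
    the stimulus fundamental frequency in the discrete Fourier transform.
    """
    ts = np.asarray(timeseries, dtype=float)
    ts = ts - ts.mean()                      # remove the DC component
    spectrum = np.fft.rfft(ts)
    amplitude = np.abs(spectrum)
    # Amplitude at the fundamental relative to all non-DC amplitudes.
    coherence = amplitude[n_cycles] / np.sqrt(np.sum(amplitude[1:] ** 2))
    # Response delay in radians; cos(w*t - d) has Fourier phase -d.
    delay = (-np.angle(spectrum[n_cycles])) % (2 * np.pi)
    return coherence, delay


# Synthetic voxel: 128 time points, 8 stimulus cycles, response delayed by 0.7 rad.
n_points, n_cycles, true_delay = 128, 8, 0.7
t = np.arange(n_points)
voxel = 100.0 + np.cos(2 * np.pi * n_cycles * t / n_points - true_delay)

coh, delay = phase_encoded_fit(voxel, n_cycles)
print("coherence:", coh, "delay (rad):", delay)
```

The delay at each voxel is what a phase map displays; voxels with a low coherence value would normally be excluded before the map is interpreted.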
We have developed a protocol for testing experimentally the hypothesis that the human visual system is optimised for making visual discriminations amongst natural scenes. Visual stimuli were made by gradual blending of the Fourier spectra of digitised photographs of natural scenes. The statistics of the stimuli were made unnatural to varying degrees by changing the overall slopes of the amplitude spectra of the stimuli. Thresholds were measured for discriminating small amounts of spectral blending at different spectral slopes. We found that thresholds were lowest when the spectral slope was natural; thresholds were increased when the slopes were either shallower or steeper than natural. A number of spurious cues were considered, such as differences in mean luminance or overall spectral power or contrast between test and reference stimuli. Control experiments were performed to remove such spurious cues, and the discrimination thresholds were still lowest for stimuli that were most natural. Thus, these experiments do provide experimental support for the idea that human vision and the human visual system are optimised for processing natural visual information.
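One way to change the overall slope of an image's amplitude spectrum is sketched below. It is an illustration under stated assumptions rather than the procedure used in the experiments: the function name `change_spectral_slope` and the parameter `delta_alpha` are hypothetical, the amplitude at each spatial frequency f is simply multiplied by f raised to the power of minus `delta_alpha` while the phase spectrum and mean luminance are left untouched, and no attempt is made to equalise contrast or total spectral power.

```python
import numpy as np


def change_spectral_slope(image, delta_alpha):
    """Steepen (delta_alpha > 0) or flatten (delta_alpha < 0) the amplitude
    spectrum of a greyscale image, keeping phases and mean luminance."""
    img = np.asarray(image, dtype=float)
    mean = img.mean()
    spectrum = np.fft.fft2(img - mean)

    # Radial spatial frequency of every Fourier coefficient (cycles/pixel).
    fy = np.fft.fftfreq(img.shape[0])[:, None]
    fx = np.fft.fftfreq(img.shape[1])[None, :]
    radius = np.hypot(fx, fy)
    radius[0, 0] = 1.0  # placeholder at DC; that coefficient is ~0 after mean removal

    filtered = spectrum * radius ** (-delta_alpha)
    # The filter is real and symmetric, so the result is real up to rounding.
    return np.real(np.fft.ifft2(filtered)) + mean


# Stand-in for a digitised photograph: a random 32 x 32 luminance array.
rng = np.random.default_rng(0)
photo = rng.random((32, 32))

steeper = change_spectral_slope(photo, 0.4)
restored = change_spectral_slope(steeper, -0.4)
print("mean luminance before/after:", photo.mean(), steeper.mean())
```

Because such a filter alters overall contrast and spectral power, cues of that kind would need to be controlled separately, as the control experiments described above did.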