Here, we systematically explore the size and spacing requirements for identifying a letter among other letters. We measure acuity for flanked and unflanked letters, centrally and peripherally, in normals and amblyopes. We find that acuity, overlap masking, and crowding each demand a minimum size or spacing for readable text. Just measuring flanked and unflanked acuity is enough for our proposed model to predict the observer's threshold size and spacing for letters at any eccentricity. We also find that amblyopia in adults retains the character of the childhood condition that caused it. Amblyopia is a developmental neural deficit that can occur as a result of either strabismus or anisometropia in childhood. Peripheral viewing during childhood due to strabismus results in amblyopia that is crowding limited, like peripheral vision. Optical blur of one eye during childhood due to anisometropia without strabismus results in amblyopia that is acuity limited, like blurred vision. Furthermore, we find that the spacing:acuity ratio of flanked and unflanked acuity can distinguish strabismic amblyopia from purely anisometropic amblyopia in nearly perfect agreement with lack of stereopsis. A scatter diagram of threshold spacing versus acuity, one point per patient, for several diagnostic groups, reveals the diagnostic power of flanked acuity testing. These results and two demonstrations indicate that the sensitivity of visual screening tests can be improved by using flankers that are more tightly spaced and letter-like. Finally, in concert with Strappini, Pelli, Di Pace, and Martelli (submitted), we jointly report a double dissociation between acuity and crowding. Two clinical conditions, anisometropic amblyopia and apperceptive agnosia, each selectively impair either acuity A or the spacing:acuity ratio S/A, not both. Furthermore, when we specifically estimate crowding, we find a double dissociation between acuity and crowding. Models of human object recognition will need to accommodate this newly discovered independence of acuity and crowding.
The VideoToolbox is a free collection of two hundred C subroutines for Macintosh computers that calibrates and controls the computer-display interface to create accurately specified visual stimuli. High-level platform-independent languages like MATLAB are best for creating the numbers that describe the desired images. Low-level, computer-specific VideoToolbox routines control the hardware that transforms those numbers into a movie. Transcending the particular computer and language, we discuss the nature of the computer-display interface, and how to calibrate and control it.
Lunn and Banks (1986), studying the reasons for “visual fatigue” among users of video text displays, found that reading from a video display caused a tenfold elevation of the contrast threshold for a sinusoidal grating with the same spatial frequency as the lines of text. They suggested that this may contribute to “visual fatigue,” possibly by affecting accommodation, but they did not explain why printed text would not have the same effect.
Can you compare the beauty of the Mona Lisa to Starry Night? Would your beauty ratings of single images predict your rating of their relative beauty? Twenty-five participants were tested with 14 OASIS images and 6 self-selected images. There were 2 tasks. In the relative task, each participant saw all possible two-image pairs twice, chose which image was more beautiful and rated by how much on a 1–9 scale. In the absolute task, they saw all 20 images randomly presented one by one 4 times and rated how much beauty they felt from each, 1–9. We find that the participants made consistent absolute and relative beauty judgments (absolute: test-retest r = 0.98, σ² = 0.29; relative: test-retest r = 0.84, σ² = 1.37). We used absolute beauty ratings to predict relative beauty ratings by subtracting one image’s absolute beauty rating from the other’s. This simple model precisely predicts mean beauty difference ratings (r = 0.79) and 80% of the choices. Thus, the mean beauty difference ratings are predicted by mean absolute beauty ratings. But the variance in our data is 2.4 times as large as predicted by our model, suggesting a noisy comparison process.
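The subtraction model described above can be sketched in a few lines. This is only an illustration: the image names and ratings are hypothetical, and the function `predict_pair` is not taken from the study. It assumes that the predicted choice is simply the image with the higher absolute rating.

```python
from itertools import combinations

def predict_pair(abs_ratings, a, b):
    """Return (predicted beauty difference a minus b, predicted choice).

    Assumption: the chosen image is the one with the higher mean
    absolute rating; a tie yields no predicted choice (None).
    """
    diff = abs_ratings[a] - abs_ratings[b]
    if diff > 0:
        choice = a
    elif diff < 0:
        choice = b
    else:
        choice = None
    return diff, choice

# Hypothetical mean absolute beauty ratings on the 1-9 scale.
abs_ratings = {"img_a": 7.5, "img_b": 4.0, "img_c": 6.0}

# One prediction for every possible two-image pair.
predictions = {
    (a, b): predict_pair(abs_ratings, a, b)
    for a, b in combinations(abs_ratings, 2)
}

for pair, (diff, choice) in predictions.items():
    print(pair, diff, choice)
```

Comparing such predicted differences with the observed relative ratings, pair by pair, is what would yield a correlation and a percentage of correctly predicted choices.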
Many authors distinguish "first-order" object recognition from "second-order" tasks that are poorly suited to template matching and seem to demand other kinds of perceptual computation. "Second-order" tasks include detecting symmetry, Glass patterns, modulations of white noise, and coarse patterns composed of small balanced elements. Past treatments have suggested various special computations, particular to each task, that observers might make. We take a more general approach. We suppose a complete set of receptive fields (like those of V1 cells) and ask how many receptive fields are required to perform as well as human observers. This is like defining efficiency as the fraction of available information (e.g. dots or area) that would be required by an ideal observer, but applied to receptive fields rather than to components (e.g. dots) of the stimulus. With mild assumptions about the receptive fields, this reveals a dichotomy between "first-order" ordinary identification tasks that require on the order of ten receptive fields and "second-order" tasks that require thousands or millions. The necessary cortical wiring is greatly affected by the hundred-or-more-fold increase in the number of receptive fields used. Meeting abstract presented at VSS 2012
Crowding, the inability to recognize objects in clutter, severely limits object recognition and reading. In crowding, a simple target (e.g. a letter) that is recognizable alone cannot be recognized when surrounded by clutter that is closer than the psychophysical crowding distance (in deg). Prior work shows that crowding distance scales linearly with target eccentricity and varies with the direction of crowding: crowding distance is approximately double for flankers placed radially rather than tangentially. Multiplying the psychophysical crowding distance by the cortical magnification factor yields the cortical crowding distance (mm of cortex). In V1, radial cortical crowding distance is a fixed number of mm and conserved across eccentricity, but not across orientation (Pelli, 2008). Since crowding distance in V1 is conserved radially across eccentricity, we imagined that there might be some downstream area, more involved in crowding, where the crowding distance is isotropic, conserved across both eccentricity and orientation. METHOD: We measured psychophysical crowding distances on 4 observers at eccentricities of ±2.5°, ±5°, and ±10°, radially and tangentially, for letter targets on the horizontal meridian. Results confirmed the well-known dependence on eccentricity and orientation. Using anatomical and functional MRI, we also measured each observer's retinotopic maps, and quantified tangential and radial cortical magnification in areas V1-hV4. RESULTS & CONCLUSION: We find that all four areas conserve cortical crowding distance across eccentricity, but only hV4 conserves crowding distance across both eccentricity and orientation. After averaging measurements across observers (n=4), we find that the hV4 crowding distance is 3.0±0.2 mm (mean±rms error across orientation and eccentricity). Across both dimensions, conservation fails in V1-V3, with rms error exceeding 0.7 mm. The conservation of crowding distance in hV4 suggests that it mediates the receptive field of crowding, i.e. the integration of features to recognize a simple object. Meeting abstract presented at VSS 2018
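The conservation test can be illustrated with a minimal sketch. All numbers below are hypothetical and chosen only to show the arithmetic; they are not the measured values. The sketch also assumes that "rms error" means the root-mean-square deviation from the mean across sites, which the abstract does not spell out.

```python
import math

def cortical_crowding_distance(crowding_deg, magnification_mm_per_deg):
    """Cortical crowding distance (mm) = crowding distance (deg) x magnification (mm/deg)."""
    return crowding_deg * magnification_mm_per_deg

def mean_and_rms(values):
    """Mean and RMS deviation from that mean (assumed meaning of 'rms error')."""
    mean = sum(values) / len(values)
    rms = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return mean, rms

# Hypothetical (crowding distance in deg, magnification in mm/deg) pairs.
# Map A: magnification compensates for both eccentricity and orientation.
map_a = [(0.75, 4.0), (1.5, 2.0), (3.0, 1.0), (0.375, 8.0)]
# Map B: the same radial and tangential crowding distances at one site,
# but with isotropic magnification, so the product differs by orientation.
map_b = [(0.75, 4.0), (0.375, 4.0)]

for name, sites in (("A", map_a), ("B", map_b)):
    distances = [cortical_crowding_distance(d, m) for d, m in sites]
    mean, rms = mean_and_rms(distances)
    print(f"map {name}: mean = {mean:.2f} mm, rms = {rms:.2f} mm")
```

In this toy example, map A is perfectly conserved (zero RMS deviation), whereas map B is not, because its magnification does not offset the radial/tangential difference in crowding distance.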
Crowding, the unwanted perceptual merging of adjacent stimuli, is well studied and easily measured, but its physiological basis is contentious. We explore its link to physiology by combining fMRI retinotopy of cortical area hV4 and psychophysical measurements of crowding in the same observers. Crowding distance (i.e. critical spacing) was measured radially and tangentially at eight equally spaced sites at 5° eccentricity, and ±2.5° and ±10° on the horizontal midline. fMRI mapped the retinotopy of area hV4 in each hemisphere of the 5 observers. From the map we read out cortical magnification, radially and tangentially, at the 12 sites tested psychophysically. We also estimated the area of hV4 in mm². Combining fMRI with psychophysics, last year we reported conservation of a roughly 1.8 mm crowding distance on the surface of hV4 (the product of cortical magnification in mm/deg and crowding distance in deg) across eccentricity and orientation, in data averaged across observers (Zhou et al. 2018 VSS). Crowding distances were less well preserved in the V1–V3 maps. Conservation of the hV4 crowding distance across individual observers would mean a fixed product of visual crowding distance and cortical magnification, which implies a negative correlation between log crowding distance and log magnification. Separate linear mixed-effects models of log crowding area and log cortical magnification each account for about 98% of the variance. Log areal hV4 cortical magnification shows a trend toward a negative correlation with log areal crowding across 10 hemispheres (r=−0.53, p=0.11); log hV4 surface area and log areal crowding show a similar negative correlation (r=−0.55, p=0.1). The trend toward larger crowding distances in observers with less surface area in hV4 is consistent with the possibility that crowding distances, though highly variable when measured in the visual field, are approximately conserved on the surface of the hV4 map.
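The step from a fixed product to a negative log-log correlation can be checked numerically. If crowding distance d (deg) times magnification M (mm/deg) equals a constant c, then log d = log c − log M, so perfect conservation gives a correlation of −1. The sketch below uses the roughly 1.8 mm value quoted above together with hypothetical magnification values, and it works with linear distances rather than the areal measures used in the reported correlations.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

CONSERVED_MM = 1.8  # roughly the cortical crowding distance quoted above

# Hypothetical cortical magnifications (mm/deg), one per hemisphere.
magnification = [0.6, 0.9, 1.2, 1.8]

# Under perfect conservation, crowding distance is the constant divided
# by magnification.
crowding_deg = [CONSERVED_MM / m for m in magnification]

log_m = [math.log10(m) for m in magnification]
log_d = [math.log10(d) for d in crowding_deg]

r = pearson_r(log_m, log_d)
print(f"r = {r:.3f}")
```

Measurement noise and genuine departures from conservation would both pull the observed correlation toward zero, so a value well short of −1 is not by itself evidence against conservation.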
We report a connection between effects of crowding and noise. In the periphery, it is impossible to identify a target in clutter, unless the clutter is at least a critical spacing away. The area enclosed by the critical spacing is the "combining field". Measuring thresholds in various levels of full-field white noise, we decompose threshold contrasts into efficiency and equivalent input noise (Pelli & Farell, 1999). Efficiency is the fraction of the contrast energy used by the human observer that would be needed by an optimal algorithm in the same amount of noise. Equivalent input noise is the amount of display noise needed to account for the human threshold, given the measured efficiency. We presented a 0.5, 2, or 8 deg letter at an eccentricity of 0-32 deg on a full-field background of white noise, at one of several noise contrasts. Measured threshold contrasts were decomposed into efficiency and neural noise. We find that efficiency is independent of eccentricity (0 to 32 deg) for all letter sizes within the acuity limit. For letters larger than the combining field, neural noise is proportional to letter area and independent of eccentricity. For letters smaller than the combining field, the neural noise corresponds to the combining field area, which is independent of letter size and grows with eccentricity. The foveal finding of equivalent noise proportional to area is consistent with scale-invariant letter recognition. The break of that proportionality in the periphery occurs when letter size equals combining field size, suggesting that there is a neural channel with that area, and perhaps there are similar channels with larger (but not smaller) areas. Meeting abstract presented at VSS 2016
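The decomposition into efficiency and equivalent input noise can be sketched under one common formulation, which is assumed here rather than stated in the abstract: threshold energy E grows linearly with noise level N, so that equivalent input noise is N_eq = N·E0/(E − E0) and high-noise efficiency is E_ideal/(E − E0), where E0 is the threshold energy without display noise. The function name and all numbers are hypothetical.

```python
def decompose_threshold(e_zero, e_noise, noise_level, e_ideal):
    """Split thresholds into equivalent input noise and efficiency.

    Assumes threshold energy is linear in noise level:
        E(N) = E0 + slope * N
    e_zero      : human threshold energy with no display noise (E0)
    e_noise     : human threshold energy at display noise `noise_level` (E)
    noise_level : display noise power spectral density (N)
    e_ideal     : ideal-observer threshold energy at the same noise level
    """
    if e_noise <= e_zero:
        raise ValueError("noise must raise the threshold for this analysis")
    slope = (e_noise - e_zero) / noise_level
    # The fitted line crosses E = 0 at N = -N_eq.
    equivalent_noise = e_zero / slope
    # Efficiency compares the ideal threshold with the noise-induced
    # part of the human threshold.
    efficiency = e_ideal / (e_noise - e_zero)
    return equivalent_noise, efficiency

# Hypothetical values in arbitrary but consistent units.
n_eq, eta = decompose_threshold(e_zero=2.0, e_noise=10.0,
                                noise_level=4.0, e_ideal=1.0)
print(f"equivalent input noise = {n_eq}, efficiency = {eta}")
```

With more than two noise levels, a straight-line fit of threshold energy against noise level would replace the two-point slope used here.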