Coordination of brain activity, in the form of modulation of feedforward activity by stored information and expectations, occurs across domains of perception and cognition. A reliable and compelling example of this is size contrast in vision. This paper builds on a prior study to show that in healthy humans, the spread of activation in striate and extrastriate visual cortex during a context-modulated size perception task is dependent on the perceived size of the target image, not on the physical size of the retinal image. These data provide further evidence that early regions in visual cortex are modulated by top-down influences, and provide a framework for investigating visual context processing in psychiatric disorders where reduced sensitivity to visual contextual effects has been demonstrated in behavioral tasks.
While neuroscience has made significant advances towards understanding the functions and firing properties of individual neurons and brain regions, the topic of their coordinated activity has received far less attention. Nevertheless, dynamic coordination of neural activity (i.e., modulation of signals based on current context) is essential for successful adaptation (Phillips, Clark, & Silverstein, 2015). This concept is now well established in vision, where the influence of visual stimuli outside of the classical receptive field on primary visual cortex neurons has been convincingly demonstrated (Freeman, Sagi, & Driver, 2001; Zhang & von der Heydt, 2010; Zhu & Rozell, 2013). This evidence has transformed the traditional view that V1 provides only a topographical representation of retinotopic information (Tootell et al., 1998). Less work has been done on the neural effects of higher cognitive influences on perception, although many behavioral studies have demonstrated the effects of emotional state (Damaraju, Huang, Barrett, & Pessoa, 2009; Hu, Padmala, & Pessoa, 2013; Keil et al., 2010), focus of attention (Fang, Boyaci, Kersten, & Murray, 2008; Gilbert, Ito, Kapadia, & Westheimer, 2000), stored information (learning and memory) (Doherty, Campbell, Tsuji, & Phillips, 2010; Doherty, Tsuji, & Phillips, 2008; Li, Piech, & Gilbert, 2008; Papathomas & Bono, 2004), expectations (Silverstein et al., 2006; Silverstein & Keane, 2009; Silverstein et al., 1996; Vetter & Newen, 2014), and social context (Kret & de Gelder, 2010) on visual processing.
In an fMRI study using a size illusion task, Murray and colleagues demonstrated that the spatial distribution of activity in V1 reflected the perceived size of an image, rather than the retinal size (Murray, Boyaci, & Kersten, 2006). More recently, similar findings have been reported using variants of the Ebbinghaus and Ponzo illusions by Fang et al. (2008), Schwarzkopf et al. (2011), and Schwarzkopf & Rees (2013). In the present study, we aimed to confirm this basic effect and further explore contextual effects in visual cortex. We replicated the timing and duration of the trial presentation used by Murray et al. (2006). However, to more closely assay real-world experience, we presented realistic photographic images rather than the drawn geometric shapes used in that study. The aforementioned studies used target stimuli of basic geometric shapes to maximally excite neurons in V1; these were not shapes that individuals would typically observe outside of the laboratory [we note that while Schwarzkopf et al. (2011) used a photograph of a tunnel for background stimuli, the targets were simple black and white circle sketches]. While simple object processing-related activity in early visual cortex can inform our understanding of more abstract and complex object processing (for review, see Wilson & Wilkinson, 2015), natural image viewing activates additional brain regions. For example, categorical image processing activates inferior temporal cortex (Carlson, Ritchie, Kriegeskorte, Durvasula, & Ma, 2014; Ritchie, Tovar, & Carlson, 2015) and object recognition activates multiple points along the ventral processing stream and into higher cortical areas (Reddy & Kanwisher, 2006). This difference may be partly explained by additional feedback interactions involved in natural object processing, relative to simple shape processing.
In the present study, identically sized target photographic images were presented in spatial contexts such that they were judged to be closer to, or further from, the observer. The dependent variable was the spread of activity across V1 through V5, and whether that spread tracked the perceived, rather than the retinal, size of the target image.
Subjects were recruited from the Rutgers-Newark Psychology Department subject pool and received course credit for participation (subject demographics are shown in Table 1).
| Total Participants | 16 |
|---|---|
| Age (Years) | 21.3 ± 3.4 |
| Sex (# Female) | 10 |
| Race (#) | |
| White | 0 |
| Black | 1 |
| Hispanic | 6 |
| Asian | 6 |
| Other | 3 |
| Education (Years) | 14.1 ± 2.9 |
| Shipley-2 Score | 100.2 ± 10.0 |
The tunnel background was identical for all four images. The viewing distance, including the distance from the eyes to the mirror, was 990 mm.
Stimulus presentation and response collection were done using PsychoPy (Peirce, 2007, 2008). The images were presented serially, each for 10 seconds. Prior to each stimulus, a white fixation cross was shown for 10 seconds in the position where the dog or cat would appear, so the duration of any given trial (fixation plus stimulus) was 20 seconds, as in the study by Murray et al. (2006). To ensure that subjects attended to the stimuli, they were asked to press one response button when they saw a “cat” and another when they saw a “dog”.
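The fixed trial timing described above can be sketched as a simple schedule builder. This is an illustrative reconstruction in plain Python, not the study's actual PsychoPy script; the condition names and the number of repeats are assumptions for demonstration only.

```python
# Sketch of the trial timing described above: each trial is a 10 s
# fixation cross followed by a 10 s stimulus, for 20 s in total,
# matching the timing of Murray et al. (2006).
# Condition names and trial counts are illustrative assumptions.

FIXATION_S = 10.0
STIMULUS_S = 10.0
TRIAL_S = FIXATION_S + STIMULUS_S  # 20 s per trial

def build_schedule(conditions, repeats):
    """Return (onset_seconds, phase, condition) tuples for one run."""
    schedule = []
    t = 0.0
    for _ in range(repeats):
        for cond in conditions:
            schedule.append((t, "fixation", cond))
            schedule.append((t + FIXATION_S, "stimulus", cond))
            t += TRIAL_S
    return schedule

conditions = ["near_cat", "near_dog", "far_cat", "far_dog"]
schedule = build_schedule(conditions, repeats=2)
run_length = len(conditions) * 2 * TRIAL_S  # total run duration in seconds
```

In an actual PsychoPy session, each `(onset, phase, condition)` entry would correspond to drawing either the fixation cross or the stimulus image at the scheduled time.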
Just prior to the scanning session, before entering the scanner, a behavioral measure of the size illusion effect was obtained for each subject. The task consisted of the same “target” cat and dog stimuli presented against the tunnel background. However, for each trial, a “probe” cat or dog (i.e., the same stimuli used in the fMRI experiment) was presented to the right of the stimulus image, without any surrounding context. Subjects used the arrow keys to adjust the size of the probe cat or dog until they believed it was the same size as the target object, at which point they pressed the spacebar for the next stimulus. Subjects were not timed on this task. Each condition was presented 8 times in a sequence randomized for each subject, and on each trial the probe (cat or dog) was randomly scaled so that it started either larger or smaller than the target object, with an average starting probe-to-target ratio of 1.0. For analysis, we used the final ratio of probe object size to target object size.
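The dependent measure from this matching task reduces to a simple ratio computation. The sketch below illustrates it with made-up final probe sizes (not study data); the pixel values and trial counts are hypothetical.

```python
# Illustrative computation of the matching-task measure: the mean final
# probe-to-target size ratio per condition, and the per-subject
# contextual influence score ('far' ratio minus 'near' ratio).
# All values below are fabricated examples, not study data.

def mean_ratio(final_probe_sizes, target_size):
    """Average final probe size relative to the fixed target size."""
    return sum(s / target_size for s in final_probe_sizes) / len(final_probe_sizes)

target_px = 100.0
far_trials = [104.0, 106.0, 105.0, 103.0]    # hypothetical final probe sizes
near_trials = [101.0, 100.0, 102.0, 101.0]

far_ratio = mean_ratio(far_trials, target_px)    # > 1 means over-adjustment
near_ratio = mean_ratio(near_trials, target_px)
context_effect = far_ratio - near_ratio          # per-subject contextual influence
```

A ratio above 1.0 for the “far” condition corresponds to the over-adjustment reported in the Results, and `context_effect` corresponds to the far-minus-near score used there.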
Subjects viewed the stimuli and made responses while being scanned in a Siemens 3T Trio magnet. Subjects were scanned in a prone position, and a 12-channel Siemens head coil was used. Foam cushioning was used to stabilize head position and minimize head movement. An MRI-compatible button box was used to record responses. The stimuli were back-projected onto a screen covering the rear bore of the magnet, and a mirror attached to the head coil allowed subjects to view the stimuli in the normal orientation. The presentation of stimuli and recording of responses were done using PsychoPy. Scanning was synchronized with stimulus presentation through a trigger pulse sent to the PsychoPy software.
Prior to the experimental trial sequence, a T1-weighted axial anatomical scan (TR = 2000 ms, TE = 4.38 ms, 204 × 256 matrix, FOV = 22 cm, slice thickness 2 mm, 0 mm gap, 80 slices) was obtained for each subject and used to register the functional imaging data for that subject during analysis. Functional imaging was done using an echo planar gradient echo imaging sequence and axial orientation, and data were obtained using the following parameters: TR = 2000 ms, TE = 30 ms, 64 × 64 matrix, FOV = 22 cm, slice thickness 4 mm, 0 mm gap, 32 slices.
Preparation of the BOLD data and statistical analyses were performed using FSL (FMRIB’s Software Library, http://www.fmrib.ox.ac.uk/fsl). BOLD data from each subject were skull-stripped using BET, FSL’s brain extraction tool (Smith, 2002), and then registered first to that subject’s anatomical data and then to the 2 mm MNI template space using FLIRT, FSL’s registration tool (Jenkinson, Bannister, Brady, & Smith, 2002). A General Linear Model (GLM) analysis using FSL’s modelling tool, FEAT (Woolrich, Ripley, Brady, & Smith, 2001), was performed first for each subject, and then across subjects in a second-level analysis.
For each analysis of an individual subject’s data, the skull-stripped BOLD data for that subject were motion corrected (FSL’s MCFLIRT tool) and prewhitened (FSL’s FILM). A high-pass filter (100 seconds) and 5 mm spatial smoothing were applied. For the group analysis, a mask comprising visual areas V1 through V5 was constructed using the Juelich histological atlas, and the GLM analysis was restricted to voxels included in that mask. A mixed-effects model (FSL’s FLAME) was applied using a corrected cluster threshold of p = .001. Group-level analyses were based on four contrasts used during the subject-level analyses: far (cat + dog), near (cat + dog), far > near, and near > far.
To determine whether our image manipulation was effective, we compared the final adjusted ratio of probe image size to target image size when the target image was presented at the bottom left of the screen (i.e., when it was perceived as “near”) with the ratio when it was presented in the middle of the screen (i.e., when it was perceived as “far”). Using a one-sample t-test, with 1 as the reference point, we observed that subjects significantly over-adjusted the size of the “far” image by close to 5% [t(15) = 4.611, p < .001, M = 1.046, SD = .04]. In contrast, subjects were largely accurate when adjusting the object to match the “near” target [t(15) = 1.356, p = .195, M = 1.014, SD = .04]. When the average estimated size of the “far” object was compared to that of the “near” object by means of a paired t-test, the difference was also significant [t(15) = 3.51, p < .01, M = .032, SD = .04]. This difference remained significant whether the analysis was restricted to dog images [t(15) = 2.88, p < .05] or cat images [t(15) = 2.74, p < .05]. To obtain an overall measure of contextual influence, we subtracted the “near” target ratio from the “far” target ratio for each subject. This value was not correlated with age (r = -.354, p = .18), years of education (r = -.338, p = .22), or Shipley-2 Vocabulary score (r = -.413, p = .13). In addition, there were no differences on this score between males and females [t(14) = .52, p = .61] or racial groups [F(2,12) = .353, p = .79].
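The behavioral comparisons above can be run with standard routines; the following is a minimal sketch using SciPy on fabricated ratio data for 16 hypothetical subjects (the means and spread are loosely modeled on the reported values, but none of these numbers are the study's data).

```python
import numpy as np
from scipy import stats

# Fabricated per-subject mean probe-to-target ratios for 16 hypothetical
# subjects, one value per condition (NOT the study data).
rng = np.random.default_rng(0)
far = 1.046 + rng.normal(0.0, 0.04, size=16)   # 'far' targets over-adjusted ~5%
near = 1.014 + rng.normal(0.0, 0.04, size=16)  # 'near' targets close to veridical

# One-sample t-tests against a veridical ratio of 1.0
t_far, p_far = stats.ttest_1samp(far, popmean=1.0)
t_near, p_near = stats.ttest_1samp(near, popmean=1.0)

# Paired within-subject comparison of 'far' vs. 'near' ratios
t_pair, p_pair = stats.ttest_rel(far, near)

# Per-subject contextual influence score, as in the text
context_effect = far - near
```

The `context_effect` vector is what would then be correlated with demographic variables (age, education, Shipley-2 score) as reported above.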
Subjects’ proportion of correct responses (M = .988, SD = .029) indicated that all subjects were attending to the stimuli in the scanner. To assess the effect of the size illusion in visual cortex regions (V1 through V5), voxel activation generated by the “far” stimuli (cat and dog) was contrasted with that generated by the “near” stimuli.
We observed that equally sized images of real-world objects can differentially affect the extent and distribution of activation in early visual cortex as a function of their spatial context, as illustrated by the voxels activated in V1-V3 by the “far” target images but not the “near” target images. This extends previous findings (Fang et al., 2008; Murray et al., 2006) by showing for the first time that same-sized target stimuli not specifically created to maximize neural firing in V1 can nevertheless induce differential activity in primary visual cortex in relation to spatial and textural context. Given increasing awareness of the non-replicability of research findings (Open Science Collaboration, 2015), we believe our study strengthens the evidence for this brain basis of a basic visual function. This is especially relevant because it increases confidence in the potential utility of this measure for studies of clinical populations (see below).
Our study engaged participants in an active behavioral task while in the scanner, requiring them to engage attentional resources to make a decision based on semantic categorization. This differs from previous studies of contextual effects on visual perception that used passive viewing as a means of preventing higher-order functions (such as attention) from interacting with activation of early visual cortex (Murray et al., 2006; Pooresmaeili, Arrighi, Biagi, & Morrone, 2013). However, other studies using active participation have shown size constancy effects similar to those of Murray et al. (2006) (Fang et al., 2008; Schwarzkopf & Rees, 2013; Schwarzkopf, Song, & Rees, 2011). Our study thus confirms that these effects in early visual cortex can be obtained during conditions of active attention.
Interestingly, in our behavioral task, we did not find that subjects underestimated the size of the “near” object (in the bottom left); indeed, subjects were largely accurate in adjusting the size of the probe when the target appeared closer. This could be due to the greater amount of context surrounding the “far” images (which appeared closer to the center of the screen, and were surrounded by context to an approximately equal degree on all sides) compared to the “near” images (which appeared close to the bottom, and so were less surrounded by context in the lower portion of the images). However, it is also possible that distortions of perceived visual angle are simply greater for objects that are perceived as being further from the observer (Arnold, Birt, & Wallis, 2008). Consistent with this, Merriman et al. (2010) observed that adults use size estimation strategies at far, but not near, distances.
We believe that, for several reasons, our results could be used to guide investigations of the effects of regional brain volumes on perception. First, individual differences across a wide range of human behaviors correlate with individual differences in brain anatomy (for review, see Kanai & Rees, 2011). Second, it has been shown that individual variability in the surface area of V1 is a reliable indicator of the strength of size context illusions (Schwarzkopf et al., 2011), with larger surface area equating to decreased lateral inhibition between neurons (Kaas, 2000) and a consequent decrease in contextual illusion strength. These effects may extend to other visual stimulus properties, including contrast, luminance, and orientation (Song, Schwarzkopf, & Rees, 2013). Electrophysiological data complement these findings, showing that V1 cell receptive fields change position depending on monocular depth cues (Ni, Murray, & Horwitz, 2014). The extent of this change, however, is limited by physical anatomy, as described above. This literature collectively shows that visual context processing varies as a result of differences in the extent of lateral inhibitory connections, and that these variations can be reliably measured at anatomical and functional levels.
There are several notable limitations to this study. One difference between this study and others (Murray et al., 2006; Schwarzkopf et al., 2011) that examined contextual processing in early visual cortex is that we did not perform retinotopic mapping (i.e., specifically defining V1 in every subject), but rather used the Juelich histological atlas to define visual regions of interest. As a result, it is possible that some of the activity we detected in extrastriate regions was due to imprecision in the measurement of V1 in our grouped data. It must be noted, however, that the use of V1 localizers has been criticized (as being potentially biased and even unnecessary), especially in cases where significant feedback from higher-level cognition to V1 is expected (Friston, Rotshtein, Geng, Sterzer, & Henson, 2006). In addition, because extrastriate regions process progressively larger areas of space with increasing distance from V1 (i.e., V2, V3, V4, etc.) (Angelucci et al., 2002), the effects we observed in extrastriate regions may reflect real differences in perceived stimulus size, information about which is fed back to V1, rather than being due to anatomical overlap of occipital regions across subjects. Murray et al. (2006) only examined ROIs within V1 and so did not present data on this issue, although they did acknowledge that some of their V1 effects could be due to feedback from higher-order visual regions. A further limitation is that the number of trials per stimulus condition was limited; although the design had been used before (Murray et al., 2006; Song et al., 2013), additional trials for each of our conditions may have allowed for greater sensitivity to contextual effects.
In summary, we have shown that activation of early visual cortex is dependent on the perceived size of objects as opposed to their actual size. This has significant implications not only for perceptual science, but also for the treatment of mental health disorders. It has been shown that schizophrenia patients tend to demonstrate more veridical perception when viewing stimuli that normally create visual illusions and that involve contextual integration (reviewed in Silverstein & Keane, 2011), including illusions involving size contrast (Silverstein et al., 2013) and depth perception (Keane, Silverstein, Wang, & Papathomas, 2013). This reduced sensitivity to visual context is especially pronounced in acutely psychotic patients, but lessens with recovery, as illusion effects tend to normalize with symptom remission (Silverstein et al., 2013; Uhlhaas, Phillips, & Silverstein, 2005). Because these behavioral impairments are linked to the severity of disorganized (e.g., fragmented speech) and psychotic (e.g., hallucinations, delusions) symptoms respectively, it will be worthwhile to explore the extent to which reduced contextual effects in visual cortex serve as a biomarker of the connectivity alterations thought to subserve these symptoms (Corlett, Frith, & Fletcher, 2009; Corlett, Honey, Krystal, & Fletcher, 2011; Phillips & Silverstein, 2003; Uhlhaas, Phillips, Mitchell, & Silverstein, 2006). Using the current paradigm could help reveal whether these size constancy abnormalities in schizophrenia have a functional basis in primary visual cortex. From there, it may be possible to examine whether V1 volume or shape is related to activation level during size context processing, or whether these differences result from other mechanisms known to alter receptive field size, such as attention (Fang et al., 2008) and perceptual load (de Haas, Schwarzkopf, Anderson, & Rees, 2014).
This research was supported by the Biomedical Science Education Postdoctoral Training Program (NIH Grant No. 5K12GM093854); a Rutgers University grant supported RUBIC pilot work.
All authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Feigenson, K. A., Hanson, C., Papathomas, T., & Silverstein, S. M. (2015). A Functional MRI Index of Spatial Context Effects in Vision. Psychology, 6, 2145-2154. doi: 10.4236/psych.2015.616211