Two Forms of Spatial Imagery: Neuroimaging Evidence

--Spatial imagery may be useful in such tasks as graph interpretation, geometry problem-solving

A major finding in cognitive neuroscience is that mental functions generally are not unitary and undifferentiated. Rather, they are carried out by a host of distinct representations and processes. This is true not only at coarse levels of analysis (for example, when one considers faculties such as language, memory, or perception) but also at fine levels of analysis. For example, at least two distinct types of spatial relations representations can be used to specify location, one that specifies categories (e.g., "left of") and one that specifies metric (precise distance) information (e.g., Kosslyn, 2006). In this article we consider the possibility that spatial imagery can be decomposed into at least two distinct types of processes. On the one hand, such imagery may serve to specify location; on the other, it may be involved in indicating orientation changes or mentally simulating such changes.
Many researchers have studied changes in orientation of objects in visual mental images, usually under the rubric of "mental rotation." Neuroimaging has proven a useful tool in such investigations, allowing researchers to compare and contrast mental rotation with other abilities. In a recent review and meta-analysis of mental rotation, Zacks (2008) found consistent activation in parietal cortex, with some extension into superior portions of occipital cortex. Zacks (2008) proposed that these regions are likely candidates for implementing the transformation-specific processes involved in mental rotation. Moreover, Zacks and Michelon (2005) proposed that spatial transformations rely initially (and essentially) on memory for spatial locations of objects that are encoded relative to one of three reference frames (object-centric, egocentric, or allocentric), and that spatial transformations per se are executed by a small subset of the common brain regions underlying the spatial imagery system in general.
Previous neuroimaging studies have found that remembering spatial locations of objects also activates portions of parietal cortex. Moscovitch, Kapur, Köhler, and Houle (1995) reported that Brodmann's Area (BA) 40 in inferior parietal cortex was activated when object location was retrieved (to a greater degree than when object identity was retrieved). Such findings are consistent with results from studying patients with brain damage. For example, van Asselen et al. (2006) found that stroke patients with damage to parietal cortex (and dorsolateral prefrontal cortex) were impaired relative to controls in remembering the locations of objects. However, no previous study has directly compared visualizing location with visualizing transformations of orientation.
If spatial transformations rely fundamentally on mapping different objects in space relative to a reference frame, then all of the brain areas evoked by spatial transformations should also be evoked by imagery for spatial location. However, if these processes are distinct, then at least some distinct brain areas should be activated in each case when the two conditions are compared directly.

METHOD

Participants
Sixteen participants (8 male, 8 female), who were undergraduates, graduate students, or professionals, took part in the study (mean age 23, age range 18-30). Seven additional participants were tested but were excluded because of failure to complete at least one of the tasks to criterion, or because MRI scanner or computer equipment problems prohibited them from completing the session. All participants were tested according to applicable guidelines and regulations governing the use of human participants in research, and the experimental protocol was approved by the Harvard University Faculty of Arts and Sciences Committee on the Use of Human Subjects and the Partners Human Research Committee (governing research at Massachusetts General Hospital).

Materials
Stimuli included 35 alphanumeric characters created from a variety of standard fonts modified to fit the circles in which they would eventually appear. The characters were made to appear standard and prototypical in order to maximize clarity and facilitate learning, and consisted of five numerals (1, 2, 3, 7, 9), 12 lowercase letters (a, b, d, f, h, i, j, m, n, q, r, t), and 18 uppercase letters (A, C, D, E, F, G, H, I, J, L, R, S, T, U, V, W, X, Y). For the familiarization phase, the characters were placed within a circle with a tick mark on top (see Figure 1 for examples). For the experimental trials, two sets of five characters each were presented at different locations within a rectangle. A circle of the same size as those that surrounded the characters (as first studied by participants) was presented on each trial. Each circle was divided into three equally sized sections: one outlined with a bold line, one with a dashed line, and the third with a neutral line (see Figure 1). On each trial, a script character (in Apple Chancery font, with minor modifications for some letters to improve clarity) appeared under the circle, to cue participants as to which block character they should visualize in order to perform the task. Each task included 40 trials, presented in two blocks of 20 trials each over the course of the study. Participants were tested on four tasks, two of which are the focus of this report and are described in detail below.

Procedure
Familiarization phase (outside the scanner). During familiarization, participants studied the characters that would subsequently be used in the study.
Each character appeared, one by one, at the center of the screen, within a circle with a tick mark at the top. Each character appeared for 4 sec and then disappeared. The character was followed by a blank circle (with a tick mark), and participants were instructed to visualize the character they had just studied as accurately as possible, and then to press a button, which made the character they studied reappear. They were instructed to compare their image to the shape of the actual character and to correct any inaccuracies. Once they had done so, they pressed a button, and the next character appeared. Once participants had completed the familiarization phase, they were asked if they had any questions and were told that they would have the opportunity to study some of the characters again, and would be asked to practice the tasks that they would be performing inside the scanner.
Once participants were ready, they completed a second familiarization phase where they studied the characters that would be used in the tasks that they were about to practice. This time, the participants studied the characters in a self-paced manner, for as little or as much time as needed.
Practice (outside the scanner). Once participants had studied the characters, they were given instructions for the first task; the order of task practice was counterbalanced over participants. For each task, participants were instructed that a rectangular box containing some of the characters from the familiarization phase would appear on the screen, and that they should remember the box to perform the task that followed. For the spatial location memory task, they were told that they should pay attention to and remember the location of the characters within the box. For the spatial transformation task, they were told that they should pay attention to and remember the shape of the characters (i.e., "what the characters look like").
For both tasks, a rectangular box then appeared at the center of the screen. The box contained five alphanumeric characters, each placed within a circle with a tick mark indicating the top of the circle. The characters were placed in different locations within the box (see Figure 1 for examples of the study boxes). Participants studied the box with the characters for 30 sec. After the box disappeared, participants were told that the practice trials were about to begin.
Spatial location task. On each trial, participants were shown a trisected circle stimulus (as previously described; see Figure 1 for a sample trial used when participants were inside the scanner). A smaller script version of the character was shown below the trisected circle. The script character cued the participants to mentally place the circle in the location where the character had appeared in the study box (for both tasks, a single study box was used for training; two boxes were used in the actual trials inside the scanner). The task was to decide whether the bold or dashed segment of the circle would be closer to the center of the display if the circle appeared in its original location. Thus, participants did not need to visualize the characters in order to perform the task; the only information required to perform the task was the location where the circle associated with the characters originally appeared.
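The location judgment can be made concrete with a small geometric sketch. The operationalization below is our assumption, not a procedure stated in the text: each 120-degree segment is summarized by the centroid of its sector, and the segment whose centroid lies nearer the center of the display is taken as the correct answer. The function names, coordinate system, and angle convention are hypothetical.

```python
import math


def sector_centroid(cx, cy, radius, mid_angle_deg, span_deg=120.0):
    """Centroid of a circular sector centered on (cx, cy).

    For a sector with half-angle a (radians), the centroid lies at
    distance 2 * r * sin(a) / (3 * a) from the circle's center,
    along the sector's bisector.
    """
    half = math.radians(span_deg) / 2.0
    dist = 2.0 * radius * math.sin(half) / (3.0 * half)
    angle = math.radians(mid_angle_deg)
    return (cx + dist * math.cos(angle), cy + dist * math.sin(angle))


def closer_segment(circle_xy, radius, bold_mid_deg, dashed_mid_deg,
                   center_xy=(0.0, 0.0)):
    """Return 'bold' or 'dashed', whichever segment centroid is nearer
    to the display center (ties go to 'dashed' in this sketch)."""
    cx, cy = circle_xy
    bold = sector_centroid(cx, cy, radius, bold_mid_deg)
    dashed = sector_centroid(cx, cy, radius, dashed_mid_deg)
    d_bold = math.dist(bold, center_xy)
    d_dashed = math.dist(dashed, center_xy)
    return "bold" if d_bold < d_dashed else "dashed"


# Hypothetical trial: the circle was studied to the right of the display
# center, and the bold segment faces back toward the center.
answer = closer_segment((10.0, 0.0), 1.0, bold_mid_deg=180.0,
                        dashed_mid_deg=60.0)
```

The sketch uses only the remembered location of the circle, which matches the point made above that the characters themselves need not be visualized to perform this task.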
Spatial transformation task. For the spatial transformation task (see Figure 1 for a sample trial used when participants were inside the scanner), on each trial, a trisected circle appeared. Below the circle was a character in script font. The character cued participants to visualize the corresponding block character they had studied during familiarization. A tick mark was positioned on the contour of the circle (not at the top, but rather at another location on the circle's circumference; see Figure 1). Participants were instructed to mentally rotate the visualized character until its top was aligned directly under the tick mark. After rotating, participants judged whether more of the character would be in the bold or dashed section of the circle; the segments were arranged so that this judgment itself was easy (i.e., the rate-limiting step was mental rotation itself).
Experimental trials (inside the scanner). The experimental trials had the same format as the practice trials, except: 1) none of the characters used in practice trials were used in experimental trials; 2) the computer did not beep if participants made an error; 3) the stimulus for each trial remained on the screen for a varying interval (as explained below); 4) each task was administered twice and each session of each task comprised 20 trials; 5) participants were given two new study boxes (presented in sequence for 30 sec each) at the beginning of each scan (i.e., for every 20 trials). The order of the tasks was counterbalanced. Each task was performed once before either task was repeated. The interstimulus interval (ISI) varied from 6 to 14 sec in one-sec increments. ISIs were programmed according to a pseudorandom schedule and were varied to allow deconvolution of the hemodynamic response, and also so that participants were required to remain vigilant.
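A jittered schedule of this kind can be sketched as follows. This is an illustration only, not the presentation software used in the study: the function names, the seed, and the uniform sampling of intervals are assumptions, and the ISI is treated here as the onset-to-onset interval because each stimulus stayed on the screen until the next one appeared.

```python
import random


def make_isi_schedule(n_trials=20, isi_min=6, isi_max=14, seed=0):
    """Draw one ISI per trial, in whole seconds, from isi_min..isi_max.

    A seeded generator makes the schedule pseudorandom but reproducible.
    Uniform sampling is an assumption of this sketch.
    """
    rng = random.Random(seed)
    return [rng.randint(isi_min, isi_max) for _ in range(n_trials)]


def stimulus_onsets(isis, first_onset=0):
    """Convert ISIs to stimulus onset times (sec from the start of a run)."""
    onsets = [first_onset]
    for isi in isis[:-1]:
        onsets.append(onsets[-1] + isi)
    return onsets


isis = make_isi_schedule()
onsets = stimulus_onsets(isis)
```

Varying the interval in this way keeps stimulus onsets from falling at a fixed phase of the hemodynamic response, which is what allows the response to individual events to be estimated.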
Images were transformed to be made compatible with the Statistical Parametric Mapping program (SPM2, Wellcome Trust Centre for Neuroimaging, London, UK). Preprocessing included slice time correction, motion correction, and spatial normalization to Montreal Neurological Institute (MNI) coordinates. To maximize the spatial resolution of the results, data were not spatially smoothed. To model the hemodynamic response related to the processing of interest, events were modeled using the canonical hemodynamic basis function within SPM. Events were entered as vectors starting at the onset of each stimulus and ending at each participant's response. Only correct responses were considered (incorrect responses and trials where participants did not provide a response were not considered in the analyses). We used a random effect analysis to identify group activity associated with each contrast of interest (using a one-tailed t-test).
We contrasted each of the two tasks with the other and also compared each task to baseline, which was defined as the interval between the point when a participant responded on a given trial and the presentation of the next stimulus. Because each stimulus remained on the screen after the participant responded (until a new stimulus was presented), the baseline condition had the same visual stimulation as each of the task conditions, but without the task-specific processing associated with spatial location memory or spatial transformation.
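The event and baseline definitions can be expressed as a short sketch. It is not the SPM2 model specification itself; the function name and data layout are hypothetical, and we assume (the text does not say) that a baseline interval is taken after every trial with a response, whether or not that response was correct.

```python
def task_and_baseline_intervals(onsets, rts, correct):
    """Build (start, end) intervals in seconds for one run.

    onsets  : stimulus onset times
    rts     : response times relative to onset (None = no response)
    correct : True if the response on that trial was correct

    Task events run from stimulus onset to the response and are kept
    only for correct trials. Baseline runs from the response to the
    onset of the next stimulus.
    """
    task, baseline = [], []
    for i, (onset, rt, ok) in enumerate(zip(onsets, rts, correct)):
        if rt is None:
            continue  # no response: nothing to model for this trial
        response_time = onset + rt
        if ok:
            task.append((onset, response_time))
        if i + 1 < len(onsets):
            baseline.append((response_time, onsets[i + 1]))
    return task, baseline


# Hypothetical run: three trials, the second answered incorrectly,
# the third with no response.
task, baseline = task_and_baseline_intervals(
    onsets=[0, 10, 22], rts=[2, 3, None], correct=[True, False, False]
)
```

In an actual analysis each task interval would then be convolved with the canonical hemodynamic response function before model estimation.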
Corrections for multiple comparisons. To correct for multiple comparisons, we conducted a Monte Carlo simulation using custom software written in MATLAB (The Mathworks, Natick, MA; Slotnick, 2008a). Because clusters of larger sizes are increasingly improbable, it is possible to determine the probability of a given spatial extent of activity (or larger) and then enforce an extent threshold to yield the desired type I error rate. Three-dimensional spatial autocorrelation (full-width-half-maximum, FWHM) of the random effect contrast images was estimated to be 7.5 mm using custom software written in MATLAB to model smoothness in the data (Slotnick, 2008b). Based on 1,000 simulations it was determined that for an individual voxel threshold of p < .001, a cluster extent threshold of 15 contiguous voxels was necessary to correct for multiple comparisons to p < .05. Thus, only clusters of activation meeting or exceeding that size were considered as significantly activated. (For further details regarding cluster extent threshold correction for multiple comparisons, see Slotnick, 2008a; Slotnick & Schacter, 2006.)
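The logic of a cluster extent threshold can be illustrated with a simplified simulation. This sketch is not the MATLAB software cited above, and it makes several assumptions of its own: the volume is a small hypothetical grid, voxels are independent (the spatial autocorrelation step at the estimated FWHM is omitted), clusters are defined by face adjacency, and the voxel-wise probability in the example is raised so that clusters occur often enough to see. Because smoothness is ignored, the threshold it returns is not comparable to the 15 voxels reported here.

```python
import random
from collections import deque

# Face-adjacent (6-connected) neighbors in a 3-D grid.
NEIGHBORS = ((1, 0, 0), (-1, 0, 0), (0, 1, 0),
             (0, -1, 0), (0, 0, 1), (0, 0, -1))


def max_cluster_size(active):
    """Size of the largest connected cluster in a set of (x, y, z) voxels."""
    seen = set()
    best = 0
    for start in active:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            x, y, z = queue.popleft()
            size += 1
            for dx, dy, dz in NEIGHBORS:
                nb = (x + dx, y + dy, z + dz)
                if nb in active and nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        best = max(best, size)
    return best


def cluster_extent_threshold(shape, voxel_p, alpha, n_sims, seed=0):
    """Smallest extent k such that the simulated probability of seeing
    any cluster of k or more voxels under the null is below alpha."""
    rng = random.Random(seed)
    nx, ny, nz = shape
    maxima = []
    for _ in range(n_sims):
        # Each voxel is "active" by chance with probability voxel_p.
        active = {(x, y, z)
                  for x in range(nx) for y in range(ny) for z in range(nz)
                  if rng.random() < voxel_p}
        maxima.append(max_cluster_size(active))
    k = 1
    while sum(m >= k for m in maxima) / n_sims >= alpha:
        k += 1
    return k


# Hypothetical parameters chosen for speed, not those of the study.
threshold = cluster_extent_threshold((10, 10, 10), voxel_p=0.01,
                                     alpha=0.05, n_sims=200, seed=1)
```

The final loop is the part that corresponds to the reasoning in the text: larger chance clusters are rarer, so the extent is raised until clusters that large appear in fewer than the desired proportion of simulated null volumes.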

Behavioral results
We analyzed response time (RT) and error data using analysis of variance (ANOVA) with a 2 (task; spatial location versus transformation) x 2 (replicate; first versus second) x 2 (gender; female versus male) design. The only effect to emerge from the RT analysis was a main effect of task. As shown in Table 1, RTs were longer for the transformation task than for the location task, t(15), p_rep = .96. No significant effects were found in the analysis of errors.

fMRI results
We contrasted each of the tasks against the other, to compare directly the activation associated specifically with spatial location memory and spatial transformation.
Spatial location versus transformation. In the comparison of spatial location versus transformation (Table 2 and Figure 2, activation in blue), a peak of activation was found in the vicinity of the occipito-parietal sulcus, near the precuneus and posterior cingulate cortex. This region was also activated more during the task than during the baseline, indicating that the differences observed in this area between the two tasks were clearly a result of increases in activation in the spatial location task rather than deactivations in the transformation task. Activations in this region were bilateral. We also found greater activation in the location task than the transformation task in the medial lingual gyrus (BA 18), although this difference was in fact a result of deactivation in the transformation task compared to baseline, rather than an increase in the location task. Other portions of the lingual gyrus and the cuneus were also activated in the location task compared to the transformation task, although the differences between the two main conditions and the baseline were subthreshold, making these activations more difficult to interpret. (All of these regions had non-significant positive Z-scores for the location task and negative Z-scores for the transformation task, compared to baseline, which suggests a trend that may have proven significant with greater power.)
Transformation versus spatial location. When we examined activation in the transformation task versus that in the spatial location task (see Table 3 and Figure 2, activation in yellow), we found peak activation bilaterally in the superior parietal lobule (BA 7), and in the postcentral gyrus (BA 2/5/7). In addition, we found activation in the right inferior parietal cortex (BA 40). Some portions of the left superior parietal cortex and right inferior parietal cortex were not significantly activated more than during the baseline, and one region of the right inferior parietal cortex was also activated more in the spatial location task than in the baseline, which provides evidence that spatial transformation may rely partly on areas responsible for mapping spatial location. In contrast to the reverse comparison, we did not find activation in the precuneus, posterior cingulate, or at the parietal/occipital junction. Instead, we documented activation in parietal regions near the junction of the superior and inferior lobules and extending into the postcentral gyrus. These areas have been more classically associated with mental rotation (see, for example, Cohen et al., 1996; Kosslyn, DiGirolamo, Thompson, & Alpert, 1998; Kosslyn, Thompson, Wraga, & Alpert, 2001) and are generally more superior and lateral than those associated with the spatial location task. Although we found no significant activation within 5 mm of midline, activations tended to be closer to the medial surface in the location task than in the transformation task. See Figure 3 for a medial view through a peak region of activation in the location task (identified by contrasting spatial location > spatial transformation). This region, near the right occipito-parietal junction, was activated bilaterally and to a greater degree in the location task than in either the transformation task or the baseline. As such, our data suggest that it plays a specific role in the memory for object location.

DISCUSSION
Our results document a clear dissociation between spatial imagery that relies on transformational processes and spatial imagery that relies on memory for location. This result is important because it demonstrates that spatial imagery ability, like mental imagery more generally, is not a unitary function. This finding allows us to refine the conclusions of Kozhevnikov, Kosslyn and Shephard (2005), who demonstrated that visualizers should be divided into two types: those who prefer object imagery (i.e., imagery for shapes) and those who prefer spatial imagery. Kozhevnikov et al. (2005) showed that spatial imagers tend to be concentrated in certain professions and to interpret graphic representations differently from object imagers. However, Kozhevnikov et al. (2005) sorted object and spatial imagers according to reported preference (rather than ability) and focused on mental manipulations and transformations. Perhaps more important, they treated spatial imagery as a single capacity. Our results indicate that spatial imagery should be divided into more fine-grained capacities.
The theory of Kosslyn (1994) posits that information about the locations and orientations of objects is organized into a single map, which is implemented primarily in the right parietal lobe. Consistent with this claim, we found more activation for the location task, relative to the transformation one, in cortex near the occipito-parietal junction, with stronger activation in the right hemisphere. Moreover, the activation we found in the precuneus/posterior cingulate area for this comparison may reflect this region's role in directing information to processes that operate on this map. In contrast, the theory posits that spatial transformations occur when a process operating on this map in turn modifies the mapping function from inferotemporal areas (where visual memories are activated) to more posterior cortex; changing the mapping function results in changes in the location or orientation of the object in the image. Our findings suggest that portions of the parietal lobe near the junction of the superior and inferior lobules may play a crucial role here. Previous results (Cohen et al., 1996; Kosslyn et al., 1998) have also documented activation of motor and premotor regions during mental rotation (depending at least in part on the strategy being used). These findings are consistent with activation in postcentral gyrus observed here.
In interpreting our results, it is important to note that not all of the activation found to be greater in one task than the other indicates that the activated regions implement the functions used to perform the task. In particular, portions of the lingual gyrus in Brodmann's Areas 18 and 19 were clearly more activated in the location task than in the transformation task, but this was because of deactivation during the transformation task relative to the baseline (the activation in this region did not change in the location task relative to the baseline; see Table 2). Thus, we must caution that such differences in activation cannot be ascribed to the region's playing a role in the type of spatial imagery that underlies memory for location. Rather, it is possible that the transformation task (which our behavioral data indicate is more cognitively demanding) might require greater attention, and thus could require inhibiting regions where activation might interfere with task-specific processing. In this sense, a static picture of fMRI results is inadequate to represent the dynamic, shifting nature of brain activations and deactivations. In addition to active inhibition of regions that may interfere with accomplishing a task, resources such as blood flow and blood volume might be redistributed away from less useful regions toward essential ones. Given that spatial and shape-based imagery rely on different general processes (e.g., Kozhevnikov, Kosslyn & Shephard, 2005) and the difficulty of the transformation task, resources may shift from object-based ventral stream visual areas (e.g., BA 18) toward dorsal regions critical for spatial transformations.
The precuneus/posterior cingulate region, more activated in the location task than the transformation task, has also been associated with the "Default Mode Network" (see Mason, Norton, Van Horn, Wegner, Grafton, & Macrae, 2007). However, it is unlikely that the activation in this region reflects "default" brain activation during a less demanding task: this region was more strongly activated bilaterally in the location task than in the baseline, which suggests that the activation was a result of the region's playing an active role in task-specific processing for location memory. Activation in other regions of the cuneus and lingual gyrus, which was statistically unchanged from baseline, may reflect non-task-related processing (or stimulus-independent thought; Mason et al., 2007); however, the lack of statistical difference from baseline may also reflect a lack of power.
Although we cannot interpret all the differences between areas identified in the direct comparison of the two conditions, the data reveal a clear dissociation between two types of spatial imagery, with a small, distinct set of areas specific to each. These results support the claim that some processes map spatial locations and others transform spatial relations representations (cf. Zacks & Michelon, 2005).
If we affirm that spatial imagery ability (rather than simply "imagery ability") is useful in learning geometry or anatomy, in navigating an environment, or in learning surgical techniques, then it is also important to know the particular combination of spatial imagery abilities that comes into play in each circumstance. This is essential in investigating the parameters that define the training and transfer of skill in a particular domain, and ultimately, in designing training programs to fit a specific set of skills (cf. Wright et al., 2008). Progress has been made in identifying subcomponents of mental imagery (see Kosslyn, Thompson & Ganis, 2006), including those functions that underlie processing shape and those that underlie processing spatial relations (see, for example, Farah, Hammond, Levine, & Calvanio, 1988; Kozhevnikov et al., 2002; Kosslyn, Ganis, & Thompson, 2001). The present study was designed to further our understanding of spatial mental imagery by decomposing this construct into two component parts. Our results support the view that at least two different, broad types of spatial imagery exist.

Authors' note
This material is based upon work supported by the National Science Foundation under grant REC-0411725. We thank Rebecca Wright for assistance in testing the participants. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.

the main contrast between the two tasks, spatial location versus spatial transformation (L-T) and for each task when it is contrasted directly with the baseline, spatial location versus baseline (L-B) and spatial transformation versus baseline (T-B). Z-scores are provided for each peak focus of activation identified within a cluster. Significant Z-score effects are denoted with an asterisk (*).

Table 3. Results of contrasting the spatial transformation task versus the spatial location task. Locations of the foci (peak voxels) of activation are given in Montreal Neurological Institute (MNI) coordinates. Numbers in parentheses below the activated region label represent the corresponding Brodmann Area (BA), and to which of the three clusters (Cl.), as identified by SPM, each locus of activation belongs. Cluster sizes were: Cluster 1 = 23 voxels; cluster 2 = 25 voxels; cluster 3 = 20 voxels. For each activation, Z-scores are provided for the main contrast between the two tasks, spatial transformation versus spatial location (T-L) and for each task when it is contrasted directly with the baseline, spatial transformation versus baseline (T-B) and spatial location versus baseline (L-B). Z-scores are provided for each peak focus of activation identified within a cluster. Significant Z-score effects are denoted with an asterisk (*).

Figure legends.
Figure 1. Example of a trial in the spatial location task and spatial transformation task. In both tasks, participants first studied the characters in two boxes that were each presented sequentially for 30 seconds (top). For the spatial location task, on each trial, participants saw a trisected circle with a vertical tick mark at the top and a script character beneath the circle (middle, left). The script character cued them to remember the location of the corresponding block character that they had studied. They thus visualized the trisected circle in the location within the box where the appropriate block character had appeared (bottom, left). After having mentally placed the trisected circle in the appropriate location within the box, participants then decided whether the bold or dashed section of the circle was closer to the center point of the box. Here, the correct response would have been "dashed". For the transformation task, participants saw a trisected circle with a tick mark displaced from the vertical and a script character beneath the circle (middle, right). The script character served to cue the block character to be visualized. Participants visualized the character in the circle, and rotated it to align with the tick mark, in order to decide which section of the circle (bold or dashed) would contain more of the character when it was rotated (bottom, right). Here, the correct response would have been "bold".

Figure 2. Differential activity associated with the spatial location task (blue) and the spatial transformation task (yellow) when the two tasks were contrasted with each other (at the top, anterior, to the left, and posterior, to the right, views; in the middle, right, to the left, and left, to the right, lateral views; at the bottom, inferior, to the left, and superior, to the right, views). Note that some apparent activations do not reflect increases versus the baseline and may be due to deactivations in the opposite task (see text and Tables 2 and 3 for details).

Table 2), associated with the spatial location task versus the transformation task (activation in this region was also elevated relative to the baseline). More inferiorly, activation in lingual gyrus can be seen (although the activity in this region was not significantly elevated relative to the baseline).