Visual Mental Imagery Activates Topographically Organized Visual Cortex: PET Investigations

Cerebral blood flow was measured using positron emission tomography (PET) in three experiments while subjects performed mental imagery or analogous perceptual tasks. In Experiment 1, the subjects either visualized letters in grids and decided whether an X mark would have fallen on each letter if it were actually in the grid, or they saw letters in grids and decided whether an X mark fell on each letter. A region identified as part of area 17 by the Talairach and Tournoux (1988) atlas, in addition to other areas involved in vision, was activated more in the mental imagery task than in the perception task. In Experiment 2, the identical stimuli were presented in imagery and baseline conditions, but subjects were asked to form images only in the imagery condition; the portion of area 17 that was more active in the imagery condition of Experiment 1 was also more activated in imagery than in the baseline condition, as was part of area 18. Subjects also were tested with degraded perceptual stimuli, which caused visual cortex to be activated to the same degree in imagery and perception. In both Experiments 1 and 2, however, imagery selectively activated the extreme anterior part of what was identified as area 17, which is inconsistent with the relatively small size of the imaged stimuli. These results, then, suggest that imagery may have activated another region just anterior to area 17. In Experiment 3, subjects were instructed to close their eyes and evaluate visual mental images of upper case letters that were formed at a small size or large size. The small mental images engendered more activation in the posterior portion of visual cortex, and the large mental images engendered more activation in anterior portions of visual cortex. This finding is strong evidence that imagery activates topographically mapped cortex. The activated regions were also consistent with their being localized in area 17. 
Finally, additional results were consistent with the existence of two types of imagery, one that rests on allocating attention to form a pattern and one that rests on activating stored visual memories.


INTRODUCTION
Visual perception occurs while a stimulus is being viewed, which leads to the creation of modality-specific internal representations; in contrast, visual mental imagery occurs when such representations are present but the object is no longer being viewed. Visual mental images correspond to short-term memory representations that lead to the experience of "seeing with the mind's eye." Unlike afterimages, the modality-specific representations that underlie imagery are relatively prolonged.
Although imagery has played a central role in theorizing about the mind since the time of Aristotle (e.g., see Tye, 1991), its nature and properties have been surrounded by controversy. Indeed, during the Behaviorist era, its very existence was questioned (e.g., see Watson, 1913), and more recently its status as a distinct kind of mental representation has been vigorously debated (Anderson & Bower, 1973; Kosslyn, 1980; Kosslyn & Pomerantz, 1977; Pylyshyn, 1973, 1981). The recent debate has focused on the question of whether visual mental images depict information, or whether they represent information propositionally (in which case the depictive properties of imagery that are evident to introspection play no functional role in information processing). In this article we report three positron emission tomography (PET) experiments in which we investigate whether visual mental imagery activates topographically mapped areas of visual cortex. Such a finding would be good evidence that visual imagery is a special kind of visual short-term memory representation, one that involves depictive mental representations.
It has long been known that area 17 (also called primary visual cortex) is topographically organized in humans (e.g., see Fox, Mintun, Raichle, Miezin, Allman, & Van Essen, 1986; Holmes, 1918 Valois, 1982). Indeed, neuroanatomical studies of nonhuman primates have revealed that about half of the 32 distinct cortical areas known to be involved in vision are topographically organized (see Felleman & Van Essen, 1991; Van Essen, 1985). All of the relatively "low-level" (i.e., early in the processing sequence) areas of cortex are topographically organized. Furthermore, virtually every area involved in vision (not solely the low-level areas) that has an afferent connection to another area also receives an efferent connection from that area, and the forward and backward projections are of comparable size (e.g., Van Essen, 1985). These features of the anatomy imply that a great deal of information flows backward in the system, from "higher-level" areas to the "lower-level," topographically organized areas. Indeed, Douglas and Rockland (1992) have found direct connections from area TE (in the anterior inferior temporal lobe) all the way back to area 17 (see also Rockland, Saleem, & Tanaka, 1992). Such direct cortico-cortical connections from the higher-level areas to the lower-level areas are consistent with the hypothesis that visual mental images are formed by using stored information to reconstruct spatial patterns in topographically organized cortical areas. Similar ideas have been popular at least since the late nineteenth century (e.g., see James, 1890).
Consistent with this hypothesis, numerous researchers have shown that parts of the brain used in vision are also involved in visual mental imagery (for reviews, see Farah, 1988; Kosslyn, in press). Two types of evidence support this inference. First, patients with brain damage sometimes show deficits in imagery that parallel their deficits in perception (see Farah, 1988; Kosslyn & Koenig, 1992). For example, patients with unilateral visual neglect ignore objects on the same side of space in both perception and mental imagery (e.g., Bisiach & Luzzatti, 1978; Bisiach, Luzzatti, & Perani, 1979). Second, researchers have used a number of techniques to measure brain activity while subjects are performing tasks using visual mental imagery and have found activation in brain areas used in visual perception. For example, Farah, Peronnet, Gonon, and Girard (1988) used evoked potentials to study visual imagery, and found activity on the posterior scalp; Goldenberg, Podreka, Uhl, Steiner, Willmes, Suess, and Deecke (1989) used single photon emission computed tomography (SPECT) and found activation in the occipital lobe; and Roland and Friberg (1985) used xenon-133 regional cerebral blood flow (rCBF) measurements to show that the posterior part of the brain is active during visual mental imagery. However, all previous research on brain activation during imagery relied on methods that have much poorer spatial resolution than PET. PET allows us to investigate the possibility that visual mental imagery involves specific cortical areas that are known to be topographically organized. In addition, PET allows us to test specific predictions about the mechanisms that form visual mental images, as will be described shortly.

EXPERIMENT 1
This experiment was based on a task originally devised by Podgorny and Shepard (1978). They asked subjects either to view a letter in a grid (perception) or to imagine its presence in an empty grid (imagery). Following this, one or more dots were presented, and the subjects were to indicate whether the dot(s) fell on or off the letter (in the perception condition) or whether they would have fallen on or off the letter were it in the grid (in the imagery condition). Podgorny and Shepard showed that the time subjects took to respond varied with such factors as the number and location of the dots. Furthermore, these factors affected response times in the same way in the imagery and perception conditions, suggesting that some of the same processes are used to carry out the two tasks (see Sternberg, 1969). Thus, we felt justified in assuming that a similar perception task could serve as a baseline for the analogous imagery task; by subtracting the blood flow evoked during the perception task, we could remove the contribution of the processes that also were used in the imagery task (such as those required to register the probe, to determine whether it was on or off a pattern, to generate a response, and to execute that response). The fact that the imagery task requires more time overall than the perception task suggests that imagery is more difficult than perception under these conditions; thus, the subtraction should reveal which areas, if any, are involved in imagery but not perception, as well as which shared areas had to work particularly hard during imagery.
We were able to formulate predictions about which brain areas should be particularly active during imagery by using a variant of the Podgorny and Shepard task. Kosslyn, Cave, Provost, and Von Gierke (1988) used this task to study how visual mental images are generated; "image generation" is the process whereby the short-term memory image representation is formed, which requires activating information stored in long-term memory. Kosslyn et al. (1988) presented the probe before the subjects were able to finish visualizing the letter, and thereby could study the process of constructing the image by examining which portions of the shape appeared in the image before others (as indicated by faster responses to probes in those parts of the image). Kosslyn et al. (1988) varied the locations of the probe marks, and found that subjects visualize letters a segment at a time, in roughly the order in which they would draw them on paper. Based on patterns of dissociations following brain damage in other imagery tasks (e.g., see Farah, 1984; Kosslyn & Koenig, 1992) and on computational analyses (Kosslyn, 1980; Kosslyn, in press), we assume that the task has two major components: Subjects first activate stored information to create an image of part of an object, and then "inspect" these imagined patterns for specific properties (using the same processes as in perception). Thus, by comparing the blood flow evoked by this version of the task with that evoked by the perception baseline task, we can focus on specific imagery processes, namely those that are involved in generating visual mental images; the perception baseline task allows us to remove the blood flow that reflects the inspection of the pattern, leaving the flow that underlies image generation.
On the basis of previous theories (see Kosslyn, 1980, 1987, 1991, in press), we expected the following areas to be selectively activated during image generation: topographically mapped areas of visual cortex (corresponding to the "visual buffer"); occipital-temporal junction or middle temporal gyrus structures (corresponding to the "literal files" of Kosslyn, 1980 1990). These last attentional structures are involved in the "engage" operation; we expect them to be selectively activated during imagery because one does not need to engage attention until finding the target during a perceptual task, but must engage attention for each segment when building the image up from scratch. Finally, we also expect inferior parietal lobe activation when subjects encode the spatial relation of the X and the figure, as well as the individual spatial relations among segments of the figure (see Kosslyn, 1987); but such activity should occur in both tasks, and hence should not be in evidence after the subtraction.
Two groups of men were tested. Subjects in one group received imagery trials. They began by studying a set of uppercase and lowercase block letters; the opposite-case letter in a script font was centered beneath each letter. After studying these letters, the subjects participated in the task itself: They saw grids that contained only an X mark (formed by connecting the diagonal corners of a grid cell) and had a script cue beneath them, as illustrated in Figure 1. They were asked to decide whether the block letter that corresponded to the cue would cover the X, were the letter in the grid as it had previously appeared. This task required the subjects to visualize the letter in the grid. Half of the X marks would have been on the figure ("yes" trials) and half would have been adjacent to the figure ("no" trials). Moreover, within each type of trial, half of the X marks were placed on or near a segment that typically is drawn early in the sequence ("early" trials), and half were placed on or near a segment that typically is drawn late in the sequence ("late" trials). The subjects pressed one pedal with their right foot to indicate "yes" and another pedal with their left foot to indicate "no," and were urged to respond as quickly and accurately as possible. Following this, the imagery group received the perception baseline task. This task was like the imagery task, except that the block letters were actually present in the grids (drawn in light gray), as illustrated in Figure 1. The subjects were to decide as quickly and accurately as possible whether the X mark was on or off the letter. Thus, although the subjects did not need to visualize the letter, they still had to encode the X mark, make the judgment, and produce the response. The imagery and perception tasks used identical sequences of letters and probe positions; because the tasks involved at least 250 trials, it was unlikely that subjects could have remembered the sequences of responses (or even realized that the order was the same).
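The crossing of response type ("yes" or "no") with probe position ("early" or "late") can be sketched as a balanced trial list. This is only an illustration of the design described above: the function name, the fixed seed, and the shuffling procedure are assumptions, and the study's actual randomization is not specified here.

```python
import random
from itertools import product

def build_trials(n_trials, seed=0):
    """Return a shuffled list of (response, position) cells, equally represented.

    Hypothetical helper: it only illustrates a balanced 2 x 2 design.
    """
    if n_trials <= 0 or n_trials % 4 != 0:
        raise ValueError("n_trials must be a positive multiple of 4")
    # The four design cells: yes/early, yes/late, no/early, no/late.
    cells = list(product(("yes", "no"), ("early", "late")))
    trials = cells * (n_trials // 4)
    # A seeded generator keeps the order reproducible, so the same sequence
    # can be reused for the imagery and perception tasks.
    random.Random(seed).shuffle(trials)
    return trials

trials = build_trials(8)
for response, position in trials:
    print(response, position)
```

Reusing one seed for both tasks mirrors the use of identical sequences of letters and probe positions in the imagery and perception conditions.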
The other half of the subjects were tested in a sensory-motor control task, which allowed us to examine the effects of simply observing an empty grid and responding; we wanted to ensure that any differences between the imagery and perception tasks were not merely a consequence of encoding higher spatial frequencies or scanning more high-contrast lines when empty grids were present. The sensory-motor control trials were identical to the imagery trials except that the cue beneath the grid was eliminated and the X mark was removed after a variable amount of time. The delay was imposed to ensure that the subjects in the sensory-motor control task viewed the grid as long as the subjects in the imagery task, and a range of delays was used to ensure that they actively monitored the display without falling into a rhythm of responding after a fixed interval. The subjects looked at the grid until the X mark was removed, at which point they pressed a pedal and the stimulus was removed (the pedal to be pressed alternated from trial to trial). Following this task, these subjects performed the perception baseline task exactly as described above.
Subjects were given the instructions for both tasks (the imagery and perception baseline or the sensory-motor control and perception baseline) before entering the scanner, and the instructions for the perception baseline task were repeated prior to that task. The perception baseline task was administered approximately 10 min after the conclusion of the first task.

Results
We analyzed the response times from the imagery and perception tasks in order to confirm that the subjects were in fact generating visual mental images, and then considered the PET results themselves.

Behavioral results
Replicating previous results (Kosslyn et al., 1988), subjects who performed the imagery task required less time to evaluate the "early" probes than the "late" probes (with means of 3294 and 3635 msec, respectively), t(6) = 3.75, p < 0.01; in contrast, there was no difference in the error rates for the two kinds of probes (19.2 versus 22%), p > 0.25. Given that Kosslyn et al. (1988) found that the effect of probe location only occurred when imagery was used, these findings provide evidence that the subjects did follow the instructions and used imagery in this task; the effect of probe position appears to represent the dynamics of the image construction process itself. As has been found in previous studies (e.g., Kosslyn, 1988; Kosslyn et al., 1988), there was no effect of probe location in the perceptual baseline task for either group, t < 1 for the early/late comparison in each case (with mean response times of 814 and 774 msec for the "early" probes for the two groups, and 844 and 732 msec for the "late" probes for the two groups; the corresponding error rates were 2.2, 2.0, 2.7, and 3.9%). These findings gave us confidence that the blood flow patterns evoked during the imagery task do in fact reflect imagery processing.
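A within-subject comparison of "early" and "late" probes of this kind can be sketched as a paired t statistic. The sketch assumes a paired test (consistent with 6 degrees of freedom for seven subjects, although the exact test is not named above); the function name is hypothetical and the response times are made-up values, not data from the study.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(first, second):
    """Return (t, degrees of freedom) for paired samples, testing second - first."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [b - a for a, b in zip(first, second)]
    spread = stdev(diffs)  # sample standard deviation of the differences
    if spread == 0:
        raise ValueError("differences have zero variance; t is undefined")
    n = len(diffs)
    return mean(diffs) / (spread / sqrt(n)), n - 1

# Hypothetical per-subject mean response times in msec (illustration only).
early = [3100, 3350, 3200, 3400, 3250, 3300, 3450]
late = [3500, 3600, 3550, 3800, 3500, 3700, 3800]
t_value, df = paired_t(early, late)
print(f"t({df}) = {t_value:.2f}")
```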

PET Results
Data from the imagery and sensory-motor control groups were prepared for statistical analysis by defining circular, 20-mm-diameter regions of interest (ROIs) on transformed, interpolated cerebral blood flow (CBF) maps (see the Method section, at the end of the article). The ROIs were located over 15 areas; as noted above, some of these regions were hypothesized to be used in imagery and perception (see Kosslyn, 1991), and we took most seriously the analyses of these regions. However, we also analyzed other regions that appeared to be the sites of large amounts of activation. These ROIs were localized using a computerized version of the Talairach brain atlas (Talairach, Szikla, Tournoux, Prossalentis, Bordas-Ferrer, Covello, Iacob, & Mempel, 1967), which superimposed atlas structures on the transformed mean CBF and difference images (for details, see the Method section). Statistical analyses were performed using the SAS system, Version 5. The dependent variable was the percent relative difference (PRD) in CBF, computed as 200 × [(CBF/mean₁) − (CBF/mean₂)] / (mean₁ + mean₂), where subscript 1 refers to the imagery measurement for the imagery group and the sensory-motor control measurement for the control group, and subscript 2 refers to the perception baseline measurement for both groups. Data were classified according to subject, group, task, ROI, and "Z" distance above or below the anterior commissure-posterior commissure (AC-PC) line.
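As a minimal sketch of a percent relative difference, the function below computes the generic symmetric form 200 × (a − b) / (a + b) for two CBF values that are assumed to have been normalized by their scan means already. That normalization step, the function name, and the sample values are assumptions for illustration, not the study's exact computation.

```python
def percent_relative_difference(cbf_task, cbf_baseline):
    """Symmetric percent difference between two normalized CBF values."""
    total = cbf_task + cbf_baseline
    if total == 0:
        raise ValueError("the two measurements must not sum to zero")
    # Dividing by the average of the two values (total / 2) and scaling
    # to percent gives the factor of 200.
    return 200.0 * (cbf_task - cbf_baseline) / total

# Hypothetical normalized values for one ROI (not data from the study).
print(percent_relative_difference(1.04, 1.00))
```

A positive value indicates more flow in the task condition (subscript 1) than in the perception baseline (subscript 2).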
Most ROIs had an axial extent encompassing several 4-mm-thick interpolated planes in stereotactic space. Accordingly, the number of PRD values was reduced to one per ROI by averaging each ROI over its Z-distance range, creating a data set with 14 (subjects) × 15 ROIs; however, one subject had a missing region (due to the way the camera was positioned). These data were analyzed with a linear model of the form PRD = TASK + SUBJECT(TASK) + ROI + TASK × ROI, using the GLM procedure of SAS, where SUBJECT(TASK) indicates subjects nested within task and TASK × ROI indicates an interaction term. The null hypothesis is PRD = 0 for all ROIs. The overall F(43, 179) = 3.59, p = 0.0001, indicated that there were significant differences among the ROIs. The pooled mean PRD was 2.66%, with MSE = 6.24. All main effects were significant at the p = 0.01 level or better, but the interaction of TASK and ROI was not, p = 0.22. A Kolmogorov D test showed that the residuals were distributed normally.
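The reduction to one PRD value per ROI can be sketched as a group-and-average step. The record layout (subject, ROI, Z distance, PRD), the function name, and the sample values are hypothetical; the original analysis was carried out in SAS.

```python
from collections import defaultdict
from statistics import mean

def average_over_planes(records):
    """Collapse (subject, roi, z_mm, prd) records to one mean PRD per (subject, roi)."""
    grouped = defaultdict(list)
    for subject, roi, _z_mm, prd in records:
        grouped[(subject, roi)].append(prd)
    # Each ROI is averaged over every plane (Z distance) on which it appears.
    return {key: mean(values) for key, values in grouped.items()}

# Hypothetical records: one ROI seen on two planes, another on one plane.
records = [
    ("s1", "VC", 4, 2.0),
    ("s1", "VC", 8, 4.0),
    ("s1", "cuneus", 8, 1.0),
]
print(average_over_planes(records))
```

A subject with a missing region simply contributes no key for that ROI, which leaves the data set one cell short of the full subjects-by-ROIs table.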
We next analyzed the data from the two groups separately. The effect of region was highly significant, F(15, 89) = 2.62, p = 0.0026, for the imagery group, but was not significant, F < 1, for the sensory-motor control group. The hypothesis that specific brain regions are relatively more active during the imagery task than during the perception task was tested by computing the least square means and their associated p values; this statistic allowed us to determine whether a given value was greater than what would be expected by chance.
Figures 2 and 3 illustrate the critical PET findings for the imagery and perception tasks. The most interesting finding is that an area identified as area 17 was more active during imagery than during perception, p = 0.0005. The area of activation was not a point, but rather an elongated volume whose second moments indicate a parallelepiped with the following coordinates (in mm): -8.4 < X < +1.5, -75.45 < Y < -53.2, and -0.2 < Z < 16.2. The X coordinates are relative to the midline along the horizontal axis; the Y coordinates are relative to the anterior commissure (AC) along the anterior-posterior axis; and the Z coordinates are relative to the AC-PC line along the vertical axis. Although the region of activation is rather large, the coronal plate illustrating regions ) illustrates the location of the centroid of activation: at 8 mm above the AC-PC line, the activation is exactly on the infolding of area 17. However, we must note that the activated region is close to the boundary between areas 17, 18, and 30. This region was no more activated during the sensory-motor task than during the perception baseline task, p = 0.839. Thus, it is not merely observing grids with X marks that produces the additional activation we found in the imagery task relative to the perception task; one must also be forming a visual mental image in the grid.
To better understand the locus of activation in visual cortex, we manually outlined the activated region within this structure to create a volume of interest. The geometric centroid of the activated region was located to the left of midline, with coordinates X = -3.5 mm, Y = -64.3 mm, and Z = 8.0 mm, and its extent along the three dimensions was estimated (from its second moments) to be X = 9.7 mm, Y = 22.3 mm, and Z = 16.5 mm. To discover whether this apparent bias toward the left side was significant, we created a homologous volume that was oriented to the right of the midline. We examined the results for each subject, comparing the left-right difference in the amount of imagery activation (after the perception activation was subtracted) in each volume, and found that there was indeed more activation on the left side (p < 0.01). This post hoc analysis provides evidence that in this task imagery selectively activates visual cortex in the left cerebral hemisphere. This finding also converges with behavioral results, which have shown that subjects can perform this imagery task faster when the stimuli are presented in the right visual field, and hence are seen initially in the left hemisphere (Kosslyn, 1988; Kosslyn, Maljkovic, Hamilton, Horwitz, & Thompson, 1993).
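The relation between the centroid, the extents, and the limits of the activated volume can be checked with a short calculation. The sketch assumes that each extent is a full width centered on the centroid and that Y is negative (posterior to the AC); under those assumptions it reproduces the Y and Z limits reported for the parallelepiped to within rounding. The helper name is hypothetical.

```python
def bounds(centroid_mm, extent_mm):
    """Return (lower, upper) limits for a full-width extent centered on the centroid."""
    half = extent_mm / 2.0
    return (centroid_mm - half, centroid_mm + half)

# Centroid and extents in mm; the sign of Y is an assumption (see lead-in).
centroid = {"X": -3.5, "Y": -64.3, "Z": 8.0}
extent = {"X": 9.7, "Y": 22.3, "Z": 16.5}

box = {axis: bounds(centroid[axis], extent[axis]) for axis in centroid}
for axis, (lower, upper) in box.items():
    print(f"{lower:.2f} < {axis} < {upper:.2f}")
```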
A number of other areas were also activated in the imagery task, as illustrated in Figure 3 and summarized in Table 1. Notably, we found that the anterior cingulate and the left pulvinar were selectively active during imagery, relative to perception. The thalamus was visualized on four slices in the stereotactic coordinate system (Z = -4, 0, 4, and 8 mm from the AC-PC line); we identified the pulvinar as the posterior portions of the inferior aspect (Z = -4 and 0 mm), and considered the regions to the left and right of the midline separately. We also found that a region of dorsolateral prefrontal cortex was activated in both the left and right hemispheres. We did not, however, find activation of the occipital-temporal junction areas, p > 0.17 in each case. Finally, we found activation in the posterior cingulate and the cuneus, results that were not expected and cannot be explained easily at this time.
In contrast, only one area showed significantly more activation in the sensory-motor control task than in the perceptual control task, namely the cuneus, p = 0.0017, which is not easily explained. Of the areas we had reason to examine, no others approached significance in the analysis of the sensory-motor data, p > 0.17 in all cases.

Discussion
The most striking aspect of our results is that we found greater activation of visual cortex during image generation than during perception. Furthermore, we found such activation only when subjects formed images; when they viewed the same grid stimuli but did not form images, visual cortex was not activated more than during the perception task. However, although the Talairach atlas identified the centroid as falling within the confines of area 17, the activation was at the extreme anterior portion of this region. Although the atlas shows area 17 extending as far anterior as -60 mm (relative to the AC), the centroid was close to the boundaries of areas 17, 18, and 30. The visual field is mapped along the calcarine fissure, with the fovea at the posterior end and regions subtending about 85° reaching the anterior end. Thus, the purported location of the centroid in area 17 is unexpected given the relatively small visual angle of the stimuli, and casts doubt on the inference that area 17 was activated. Nevertheless, in the monkey all portions of visual cortex that abut area 17 are also topographically mapped, and hence it is likely that whatever area was activated by imagery, it is topographically organized (we will consider this issue in more detail in Experiment 3).
We label this area "VC" (for "visual cortex") in Figure 2 to express our agnosticism about the identity of the area, and wish to emphasize that we present labels merely for expository convenience: the critical aspect of the results is the coordinates of the activated regions, not the names they are presently assigned.
In addition, the PET images indicated that imagery and perception activate similar areas, as we expected if the tasks did in fact involve common processes; however, some of these processes had to work harder during image generation than during perception or image inspection. The fact that not all of the areas were more active during image generation than during perception suggests that the results do not reflect an overall effect of difficulty per se; increased difficulty would presumably raise the overall level of activation, but would not selectively affect some structures relative to others.
But why did imagery result in more activation of visual cortex than did perception? The increased activation during imagery may have occurred because visual cortex receives high-resolution information during "bottom up" perception, enabling its edge-detection and region-organizing processes to operate relatively effectively. In contrast, during imagery one must recreate the pattern from remembered information that is typically incomplete (cf. Neisser, 1967), and so the processes used to reconstruct the local geometry of objects in visual cortex may be forced to work harder. We investigated this idea in Experiment 2.
We can also understand the fact that the anterior cingulate and the left pulvinar were selectively active during imagery. These areas appear to be involved in attention (see LaBerge & Buchsbaum, 1990; Posner, 1988; Posner & Petersen, 1990), and our imagery task requires "tagging" specific cells of the grid to form an image (Podgorny & Shepard, 1978, also conceived of this image generation process as involving selective attention). We expected these areas to be activated as part of the image generation process, given that people must "look" at the location where each additional part belongs (see Kosslyn, 1980). It is worth noting, however, that the greater activation of the anterior cingulate probably does not indicate that it was more active in response selection or decision making in general, although it could be more active only when prefrontal areas are involved in processing (cf. Corbetta, Miezin, Dobmeyer, Shulman, & Petersen, 1991); not only did the imagery and perception tasks involve the same responses, but the subjects actually made more of these responses in the perception task (because they required less time per trial, and the trials were self-paced). Even if the subjects had made the same number of responses in both tasks, we would not have expected greater activation during imagery if the anterior cingulate plays a critical role in response preparation or selection in general.
In addition, we were intrigued by our finding of imagery activation in areas that are near what the Talairach atlas identifies as "Broca's area," as well as the fact that activation was also present in the homologous right hemisphere regions. Because both hemispheres were activated, the subjects were probably not merely subvocalizing to themselves during the task. Instead, these regions may be involved in programming sequences of operations, such as those required to build up an image a segment at a time (see Kosslyn, 1987, 1988, 1991). Activation was also present in the nearby area 46; the homologous area in monkeys has been shown to serve as a spatial memory (see Goldman-Rakic, 1987), which would provide critical information for constructing the image a segment at a time. These findings are consistent with Shepard's (1987) idea that linguistic processing in Broca's area evolved out of earlier visual-spatial processes that were implemented in this region.
Finally, we were initially surprised that imagery did not selectively activate the occipital-temporal junction area or the middle temporal gyrus, given the results of Haxby et al. (1991) and Sergent et al. (1992). We expected such activation because these regions may constitute the human analog of the inferior temporal lobe in monkeys, which is involved in storing visual memories (cf. Ungerleider & Mishkin, 1982). However, as noted above, Podgorny and Shepard (1978) conceived of their task as requiring attention, not activation of visual memories. And based on observations of selective deficits in patients with different types of brain damage, Levine, Warach, and Farah (1985) argued that there are two types of imagery: one that involves activating visual memories, and one that involves allocating attention. For example, if one stares at a tile floor and "sees" patterns by attending to different combinations of tiles, one is using attention-based imagery. This sort of imagery involves "marking" regions of a spatially organized structure (the "visual buffer") that have been attended to, as opposed to activating stored memories of patterns. A stored description of the locations of segments, not a visual memory, can be used to direct attention to each of the appropriate locations (see Kosslyn, 1987, 1991).
The two kinds of imagery may involve the same mechanisms, except that during attentional imagery attention is allocated selectively and temporal-lobe-based visual memories are not activated (for additional discussion of this distinction, see Kosslyn & Shin, in press). Our grid-and-X task involves this sort of attentional imagery, and hence in retrospect it is not surprising that visual memory areas in the temporal lobe were not activated in this task. This hypothesis might also explain why we did not find greater activation in parietal regions during imagery.
Posner and his colleagues (e.g., Posner & Petersen, 1990) posit that the parietal lobes are involved in (among other functions) disengaging attention from a region prior to shifting one's attention to a new region. This process would not occur in attention-based imagery: One would leave one's attention engaged in each successive area, building up the segments of the imaged pattern. In contrast, the superior parietal lobes should be active when one uses visual-memory-based imagery; in this case, one visualizes an object or part and then must disengage to shift to the location at which another object or part should be imaged. We investigate these possibilities in the following experiment.

EXPERIMENT 2
Perhaps the most striking finding of Experiment 1 is that imagery induced more activation in visual cortex than did perception. This may have occurred because images are not stored as photographs, but rather must be actively reconstructed. Such a reconstruction process may also take place in perception itself under some circumstances. For example, Lowe (1987a,b) found that when confronted with noisy visual input, it was useful for his computer vision program to generate template-like images and match them top-down against fragmentary input. Lowe's program essentially used mental imagery to help encode noisy inputs during perception. Kosslyn (in press) argues that human imagery is used in a similar way during perception.
We investigated this hypothesis by modifying the perception stimuli used in Experiment 1: We now first presented a script cue and then presented a perceptually degraded upper case version of the cued letter and a degraded X mark. Although the subjects were told to compare the visible letter and X mark, we hypothesized that in this task they would use imagery to complete the noisy input. If so, then topographically organized visual cortex should be activated to a similar degree in imagery and perception.
We did not want to be in the position of predicting a null finding, however, and thus included another condition to demonstrate that visual cortex was activated in the imagery task. In this new baseline task the subjects viewed exactly the same grid stimuli used in the imagery task, but did not form images in them.
Finally, we designed this experiment so that we could explore the hypothesis that regions of the temporal lobe and the inferior parietal lobe were not activated in Experiment 1 because those subjects used "attention-based imagery." In this experiment, all test stimuli now were presented for only 200 msec. We hypothesized that removing the stimulus quickly would make it difficult for the subject to allocate attention systematically over the grid. Thus, we conjectured that the subjects would recall what the pattern actually looked like, using visual-memory-based imagery, instead of recalling a description of which rows and columns had been filled and then directing attention to those regions. In addition, we were concerned that the imagery results from Experiment 1 could have arisen because the subjects moved their eyes differently in the imagery and sensory-motor control tasks; perhaps the activation in visual cortex was caused because the subjects swept their eyes more regularly across the grid lines in the imagery task. The present experimental design eliminates that possibility.
In short, our second experiment is like the first but with three changes: First, the perceptual stimuli were degraded by flipping bits randomly within the grid (as described in the Method section at the end of the article). Thus, the figure was more difficult to distinguish from the background. We also removed pixels from the X probe mark, making it more difficult to detect. An example of the stimuli is presented in Figure 4. In contrast, as is evident in Figure 4, the imagery stimuli did not have a noisy background. Second, the fixation point (an asterisk) was now replaced by the script cue (which was visible for 300 msec). Thus, subjects did not have to locate the cue under the grid and shift their attention up to form the image. The cue was presented in both the imagery and perception tasks. And third, all grid stimuli now were presented for only 200 msec, thereby preventing the subjects from moving their eyes over the stimuli. Finally, in addition to an imagery and perception task, we included a new baseline task. In this set of trials, the subjects received the same trial sequence and stimuli that were used in the imagery task; they fixated, saw a script cue, and then saw an empty grid with an X mark for 200 msec. The task was simply to press a pedal when the X mark appeared. This task was administered first, and all of the subjects who received it reported later that they did not form images.
Thus, by subtracting the activation engendered in this baseline task from that engendered in the imagery task, we could assess the contribution of imagery per se (but not above and beyond that engendered in perception). Our predictions were as follows: If subjects use imagery in the new version of the perception task, we should now eliminate the difference between the imagery and perception tasks in the amount of activation in visual cortex, but we should still find greater activation of this structure during imagery compared to the baseline task. In addition, because the stimuli were presented briefly, we hypothesized that subjects would not be able to attend to the locations of segments of the letter very easily, and instead would form visual-memory-based images. If so, then we expected temporal lobe structures (used to store and activate visual memories) and parietal lobe structures (used to disengage attention after each segment is placed) to be activated selectively in the imagery task. As in Experiment 1, we expected dorsolateral prefrontal areas to be activated, if they in fact play a role in arranging parts in an image.

Behavioral Results
The response times and error rates were analyzed in separate analyses of variance; only the time to make correct responses was analyzed, and response times greater than 2.5 times the mean of the appropriate cell were treated as outliers and discarded prior to analysis. We found that the subjects did not require less time to evaluate the "early" probes than the "late" probes in the imagery task (with means of 1161 and 1139 msec for "early" and "late" probes, respectively), F < 1. Nor did we find such an effect in the perception task (with means of 668 and 682 msec), F < 1, but the subjects required generally more time in the imagery task than in the perception task, F(1, 12) = 9.23, p = 0.01. However, they did make fewer errors for "early" probes than for the "late" probes in the imagery task (11.8 versus 17.2%), F(1, 12) = 9.89, p < 0.01, and in the perception task (10.6 versus 14.7%), F(1, 12) = 5.93, p < 0.05. Our manipulation of the letters not only made the perception task more difficult (indeed, there was no significant difference in error rates between the two tasks, F < 1), but also produced an effect of probe position; this is the first time we have found such an effect in a perceptual version of this task (see Kosslyn et al., 1988; Kosslyn, 1988), which is consistent with the hypothesis that by degrading the stimuli we succeeded in leading the subjects to use imagery in this version of the perception task.
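
The outlier rule stated above (discard response times greater than 2.5 times the mean of the appropriate cell) can be illustrated with a minimal sketch. The function name, the sample values, and the choice to compute the cell mean once from the untrimmed data (rather than iteratively) are assumptions made for illustration; the text does not specify these details.

```python
from statistics import mean


def trim_outliers(rts_msec, factor=2.5):
    """Drop response times greater than `factor` times the cell mean.

    Assumption: the mean is computed once, over all correct-response
    times in the cell, before any trimming.
    """
    cutoff = factor * mean(rts_msec)
    return [rt for rt in rts_msec if rt <= cutoff]


# Hypothetical correct-response times (msec) for one cell of the design.
cell = [640, 655, 700, 690, 3900]
kept = trim_outliers(cell)
print(kept)
```

With these made-up values the cell mean is 1317 msec, so the cutoff is 3292.5 msec and only the 3900 msec trial is discarded.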

PET Results
We performed three sets of analyses on the PET data: we compared imagery and perception by subtracting blood flow in the perception task from that in the imagery task; we examined imagery per se by subtracting blood flow in the baseline task from that in the imagery task; and, finally, we examined the perception task per se by subtracting blood flow in the baseline task from that in the perception task. Selected aspects of the blood flow results are presented in Figure 2.
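
The logic of the three subtractions can be sketched as follows. This is only an illustration of the subtraction step on made-up numbers, not the statistical parametric mapping procedure described in the Method section; the function name and the four-voxel "images" are hypothetical.

```python
def subtract(task, control):
    """Voxelwise difference in blood flow: task minus control."""
    if len(task) != len(control):
        raise ValueError("images must have the same number of voxels")
    return [t - c for t, c in zip(task, control)]


# Hypothetical normalized blood-flow values at four voxels per condition.
imagery = [52.0, 48.5, 50.0, 55.0]
perception = [51.0, 49.0, 50.0, 53.5]
baseline = [50.0, 48.0, 50.0, 52.0]

contrasts = {
    "imagery - perception": subtract(imagery, perception),
    "imagery - baseline": subtract(imagery, baseline),
    "perception - baseline": subtract(perception, baseline),
}

for name, diff in contrasts.items():
    print(name, diff)
```

Note that the first contrast equals the difference between the other two, so a voxel can be active in both imagery and perception relative to baseline and still show no imagery-perception difference.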
Imagery-Perception Analysis. These results are presented in Table 2, which includes significance levels, and Figures 2 and 5. As expected, in contrast to the results from Experiment 1, we now failed to find more activation in imagery than in perception in the region identified as area 17 by the Talairach atlas, p > 0.13. However, we did find more activation in imagery than in perception in a number of areas. First, three areas that were active may be involved in the storage of, or activation of, visual memories. We found more activation in the left middle temporal gyrus. Sergent et al. (1992) found the middle temporal gyrus to be activated during object recognition and face recognition; similarly, in recently completed work at the Massachusetts General Hospital we have found this area to be active when people decide whether spoken words name accompanying pictures (Kosslyn, Alpert, Thompson, Chabris, Rauch & Anderson, 1993). In addition, we found more activation in imagery than in perception in the left inferior temporal gyrus; this area may have a direct correspondence to area IT of the monkey brain, which is known to be involved in encoding object properties (e.g., see Ungerleider & Mishkin, 1982). We also found more activation in the left area 19, a visual association area. Consistent with the hypothesis that we are now observing the effects of visual-memory-based imagery, as opposed to attention-based imagery, we did not find more activation in the thalamus, p > 0.1, in contrast to Experiment 1.
As in Experiment 1, we found greater activation in imagery than in perception in several areas that may be involved in looking up stored visual information to construct an image. Specifically, dorsolateral prefrontal cortex in both hemispheres (including area 46) was activated, as was the anterior cingulate gyrus.
As predicted, we now found activation in another set of areas that may be involved in the process of arranging segments of an image into a composite. Specifically, we found more activation in the frontal eye fields, the left inferior parietal area, left superior parietal lobe, the right superior parietal lobe, and the right angular gyrus.
We also found activation in two areas that are known to be involved in motor processing, the left precentral gyrus and the right supplementary motor area. These results were not predicted, but were significant even with the stringent post hoc statistical parametric mapping (SPM) technique used here (see the Method section at the end).
Finally, for a number of areas, the difference in activation between imagery and perception would have been considered to be significant according to our SPM analysis if our theory had predicted these differences. We list these results for the use of other investigators, who may have different theories than ours. These areas (followed by X, Y, Z coordinates, ordered from posterior-to-anterior) included the left-hemisphere fusiform gyrus (-46.54, -46.29, -16.00), insula (-29.96, 20.01,

Imagery-Baseline Analysis. We next examined which areas were activated by imagery per se, not whether it activated these structures more than did the perception task. To do so, we examined the activation that remained when the blood flow evoked in the baseline task was subtracted from that evoked in the imagery task.
These results are presented in Figures 2 and 6 and Table 3. As is evident, we did find that the region identified as area 17 in the Talairach atlas was more activated during imagery than during the baseline task. The coordinates of this area are remarkably similar to those of the part of visual cortex that was activated by imagery in Experiment 1. However, this part of area 17 is very near the border of areas 17, 18, and 30, and we cannot localize the activation with confidence to one of these areas. We also found that another region of cortex, area 18, was highly activated during imagery, relative to the baseline task. Indeed, this area was so activated that it was significant even with the stringent SPM post hoc test. This area is almost certainly topographically mapped in humans.
As in the previous analysis, we found activation in areas that may be involved in the storage of, or activation of, visual memories. We found more activation in the left and right middle temporal gyri, in the left inferior temporal gyrus, and in visual association area 19 in the right hemisphere. Unlike the previous analysis, we did find activation in the right thalamus, but not the pulvinar per se. This activation may reflect image inspection processes, which were not eliminated by this subtraction (but were eliminated by subtracting the perception data from the imagery data, as explained when we introduced the rationale for that subtraction). These findings contrast with those from Experiment 1, which is consistent with our hypothesis that attention-based imagery was used there but not here.
We also found greater activation in imagery than in the baseline task in dorsolateral prefrontal cortex in both hemispheres (including area 46), which may be involved in looking up stored visual information to construct an image. However, we did not find more activation in the anterior cingulate gyrus in imagery compared to the baseline task.
We also found another set of activated areas that may be involved in the process of arranging segments of an image into a composite. Specifically, we found more activation in the frontal eye fields, the left inferior parietal area, the left and right superior parietal lobe, and the left and right angular gyri (but the left was only marginally significant, p < 0.06). The left precentral gyrus (with coordinates -49.09, -2.94, 36.00) was also marginally significant by the post hoc test (p < 0.08). We also found activation in the precuneus, left fusiform, and right hippocampal areas, none of which was predicted by our theory. Finally, if one had hypothesized such activation, two other regions would have been considered to show greater activation in the imagery task; we report these findings in case other investigators have made such predictions: The areas were Broca's area (-42.71, 4.71, 24.00) and the insula (-31.24, 13.64, 8.00).

Perception-Baseline Analysis
We also compared the pattern of blood flow in the perception task with that in the baseline task. The results of this analysis are presented in Figures 2 and 6 and Table 4. As is evident, many of the same areas that were active during imagery were also active during perception. We found greater activation in the region identified as on the border of areas 17 and 18 in the Talairach atlas, which presumably reflects the process of encoding the patterns. This area was about 1 cm anterior to the part of visual cortex that was activated by imagery; moreover it was in the right hemisphere, whereas the area that was activated by imagery was in the left hemisphere. We also found bilateral activation of visual association area 19, but did not find activation of the middle temporal or inferior temporal gyri. This is of interest because the subjects did not need to identify the pattern in order to perform this task; rather, they needed only to segregate figure from ground, encode the X, and detect whether the X was on or off the figure. In addition, although the thalamus was active (with the centroid now being slightly to the left of midline), the centroid of activation was not in the pulvinar per se.
We also found greater activation in perception than in the baseline task in dorsolateral prefrontal cortex in both hemispheres (including area 46), which may be involved in looking up stored visual information to construct an image. And we again found another set of activated areas that may be involved in the process of shifting attention to arrange the segments of the figure. Specifically, we found activation in the left and right inferior parietal regions and the right superior parietal lobe, and there was a trend towards more activation in the frontal eye fields (p < 0.07).
We also found activation in the left and right fusiform gyri as well as the right caudate nucleus, right precentral gyrus and right postcentral gyrus. We had no grounds for predicting activation in any of these areas. Finally, if one had hypothesized such activation, several other regions would have been considered to show greater activation in the perception task: the left-hemisphere supplementary motor cortex (-32.

Discussion
In part because we included an additional within-subjects condition, this experiment produced a wealth of findings; we will focus here on three sets of results that have particular theoretical significance. Perhaps of most interest, we found more activation in a region identified as area 17 by the Talairach atlas during imagery than during the baseline task, even when exactly the same stimuli were present in both tasks. Indeed, the coordinates of this area are very similar to those found in Experiment 1: in that experiment, the centroid was approximately 65 mm posterior to the AC, whereas in this experiment it was approximately 63 mm posterior to the AC. Again, this centroid is probably too far anterior to reflect activation in area 17. Although the Talairach and Tournoux (1988) atlas shows area 17 extending as far as 60 mm posterior to the AC, the area of activation is close to the border of areas 17, 18, and 30. The similarity of this result from the two experiments is noteworthy because different subjects participated and the procedures differed in many respects. Because the stimuli were presented for only 200 msec in Experiment 2, these imagery results cannot be ascribed to the way subjects scanned over the stimuli. The only substantive difference from Experiment 1 in the activation of this part of visual cortex is that the centroid was to the right of center in Experiment 2. We also found that imagery activated area 18, which is almost certainly a topographically mapped region of cortex in humans. The centroid here was slightly to the left of the midline.
Second, we now found activation in areas that may be involved in the storage or recall of visual memories. When we compared activation in the imagery task to that in the baseline task, we found more activation in the left and right middle temporal gyri, the left inferior temporal gyrus, and the right area 19. Similarly, when we compared blood flow in the imagery task to that in the perception task, we found more activation in the left middle temporal gyrus, left inferior temporal gyrus, and left area 19. These findings are consistent with Douglas and Rockland's (1992) report that there are direct connections from area TE (the anterior part of IT) to retinotopically mapped areas (area 17, in particular). These areas were not more active in imagery than in perception in Experiment 1, and are consistent with the claim that the present version of the task induced visual-memory-based imagery. In addition, we now found more activation in the imagery task than in the perception or baseline tasks in the superior parietal regions, which may reflect the role of this area in the disengage process (see Corbetta et al., in press). We also found activation in the left inferior parietal lobe during imagery, which may reflect its role in encoding the spatial relations among segments during the construction process (see Kosslyn, 1987). Recall that we did not find parietal activation in Experiment 1, which we suggested might reflect continued engagement of attention. Also consistent with the putative difference in the two types of imagery, we did not find selective activation in the pulvinar in the present experiment.
Third, we now did not find more activation in visual cortex during the imagery task than during the perception task. In addition, in the perception task, like the imagery task, we found that the subjects made more errors when evaluating probes on segments that would be imaged late in the sequence of segments. This result is not found in the perceptual task when stimuli are intact, but is found in the imagery version of the task (e.g., see Kosslyn et al., 1988). These findings suggest that we did induce the subjects to use imagery at least some of the time in the perception task. Indeed, some of the same areas were active in both the perception and imagery tasks relative to the baseline task, specifically those that we hypothesize are used to look up stored information and shift attention to the location where a segment should be visualized.
But the imagery and perception tasks did not produce identical patterns of activation, relative to the baseline task. Even the areas that were activated in common were typically more activated during imagery. This is not surprising, however, if imagery was used only partially in the perception task. But a number of areas that were active in imagery (compared to the baseline) were not active in perception (compared to the baseline): these areas were, specifically, area 18, the left middle temporal gyrus, left inferior temporal gyrus, right middle temporal gyrus, right superior parietal lobe, right hippocampus and the precuneus area. On the other hand, left area 19, right fusiform cortex, the right caudate nucleus, right precentral gyrus, and right postcentral gyrus all were activated in perception (compared to the baseline) but not activated in imagery (compared to the baseline).
One of the most interesting aspects of these findings is that the middle temporal lobe and inferior temporal lobe were not selectively activated during our perception task. These findings may suggest either that subjects used visual-memory-based imagery during the imagery task but completed the patterns using attentional imagery during the perception task, or that visual memories are activated by the joint operation of several areas, not all of which were necessary when the object was partially visible. The fact that left area 19 was activated during perception but not imagery (relative to the baseline task) may suggest that this region of the brain stores a type of visual information that is specifically useful for completing a pattern but which is not used as much when the pattern is generated from scratch. It is possible that this information is also used to guide motor processing, but we will not speculate about that here. These alternatives can be distinguished empirically in future research.
Finally, although we again found greater activation of the anterior cingulate gyrus in imagery compared to perception, we did not find that this structure was more active in imagery than in the baseline condition. This result may indicate that the anterior cingulate is involved in a kind of "anticipatory priming." In the baseline task, the subjects needed to respond as soon as the stimulus appeared, and they may have prepared to encode the stimulus as soon as the cue appeared. It is possible that the process that primes one to "expect to see something" is the same as that used to generate mental images (for an extended discussion of this idea, see Kosslyn, in press). Presumably, less such anticipatory priming was used in perception because one had to encode the stimulus more fully prior to determining the correct response.

EXPERIMENT 3
The results of the previous experiments suggest that imagery activates topographically organized portions of visual cortex, but they do not definitively demonstrate that this is the case. In this experiment we adapted the logic of Fox et al. (1986) to an imagery experiment. Rather than varying the visual angle subtended by a perceived object, we varied the "visual angle" subtended by imagined patterns. If the posterior, medial region of cortex that was activated by imagery in Experiments 1 and 2 is topographically organized, then we expect different portions of it to be more strongly activated when objects are visualized at different sizes.
In this experiment the subjects closed their eyes and listened to names of letters of the alphabet along with cue words. Each cue specified one of four judgments (e.g., whether the uppercase letter has only straight lines or any curved lines). The subjects visualized the letters at the smallest possible "visible" size during one block of trials and at the largest possible "nonoverflowing" size during another block of trials. In both cases, the letters were to be centered in front of them and maintained at the appropriate size until the cue word was presented.
We again were concerned about the appropriate baseline condition. Petersen (personal communication, 1992) found that listening to the names of letters when one's eyes were closed reduced blood flow in the occipital lobe. If so, then subtracting blood flow in this task from that in an imagery task could mislead us into inferring increases in blood flow in the imagery task. We decided that for the question we wanted to answer here, an ideal control task was the task itself: If the subjects form images of tiny letters directly in front of them, much spatial variation would occur in a small, foveal region of a topographically mapped area. In contrast, if the subjects formed images of large letters, this foveal region would contain little spatial variation (only part of one segment of a larger letter, at most). Hence we expected greater metabolic activity within the foveal region for small letters than large ones. In contrast, large letters should extend into more peripheral regions. Hence we expected greater metabolic activity farther from the foveal representation when large images were formed.
This reasoning led us to subtract the blood flow in the large image condition from blood flow in the small image condition, which would reveal regions that were more activated by small images. Similarly, we subtracted the blood flow in the small image condition from that in the large image condition, which revealed regions that were more activated by large images. These subtractions would not only reveal variations in activity along topographically mapped areas, but also would indicate which other areas were more activated during the two conditions. Kosslyn (1975) found that subjects require more time to evaluate objects imaged at small sizes (presumably because of some kind of "grain effect," perhaps due to spatial summation). Thus, if subjects follow the instructions, they should require more time to evaluate properties of letters imaged at a smaller size than at a larger (but not overflowing) size.

Behavioral Results
We began by analyzing the response times and error rates. Although the overall means were consistent with Kosslyn's (1975) previous findings, with the subjects requiring more time for smaller images (2040 and 2157 msec for large and small images, respectively), when all 16 subjects were considered this difference was not significant, F(1, 15) = 1.48, p > 0.2. Similarly, although there were slightly more errors for small images, this too was not a significant difference (13.0 and 15.1% errors, for large and small images, respectively), F < 1.
We did not feel justified in assuming that differences in blood flow reflect the effects of image size unless we had the corresponding behavioral index that the subjects did in fact use images at the different sizes. On examining the data, it was clear that although the majority of subjects produced the expected difference in times, several did not. We thus ordered the subjects in terms of the difference in response time for large-small trials, from largest positive difference to largest negative difference. When the three subjects who had a response time difference of at least 100 msec in the unexpected direction were eliminated, we now found that the remaining 13 subjects required more time to evaluate the letters imaged at the smaller sizes (with means of 1878 and 2100 msec for large and small images), F(1, 12) = 5.35, p < 0.04. (This test must be regarded only as a heuristic, however, given that the subjects were no longer randomly sampled.) The error rates for these subjects were 14.2% for large images and 15.7% for small images, F < 1, which shows that there was no speed-accuracy tradeoff. Thus, we felt justified in examining the patterns of blood flow for these subjects only.
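
The subject-selection rule described above can be sketched as follows, assuming that the "unexpected direction" means a longer mean response time for large images than for small ones. The function name, the data layout, and the three sample subjects are hypothetical and serve only to illustrate the criterion.

```python
def select_subjects(mean_rts, threshold_msec=100):
    """Keep subjects unless their large-image RT exceeds their
    small-image RT by at least `threshold_msec`.

    mean_rts maps a subject id to (large_rt, small_rt) in msec.
    """
    kept = {}
    for subject, (large_rt, small_rt) in mean_rts.items():
        # Positive values are in the unexpected direction
        # (slower for large images than for small images).
        unexpected = large_rt - small_rt
        if unexpected < threshold_msec:
            kept[subject] = (large_rt, small_rt)
    return kept


# Hypothetical mean response times (msec): (large images, small images).
sample = {
    "s1": (1900, 2100),  # expected direction: kept
    "s2": (2050, 2000),  # unexpected, but under 100 msec: kept
    "s3": (2300, 2150),  # unexpected by 150 msec: excluded
}
print(sorted(select_subjects(sample)))
```

Because the retained subjects are selected on the dependent measure itself, any test on their response times is a heuristic check, as noted in the text.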

PET Results
The results are illustrated in Figures 2 and 7, and are summarized in Table 5. First, when we examined selective activation when large images were formed (i.e., the pattern of blood flow from small images was subtracted from the pattern of blood flow from large images), we found greater activity in a region of area 17 (as identified by the Talairach atlas) that was about 69 mm posterior to the AC. In contrast, when we examined selective activation when small images were formed (i.e., the pattern of blood flow from large images was subtracted from the pattern of blood flow from small images), we found greater activity in a region of area 17 (as identified by the Talairach atlas) that was about 88 mm posterior to the AC. Both areas of activation were to the right of midline, which is consistent with Sergent's (1989) finding that subjects perform this sort of task faster when the cues are presented to the left visual field.
When we examined the regions that were specifically activated by large images, we found two other areas were over the SPM threshold, both in the left hemisphere: the superior temporal gyrus and the middle temporal gyrus.
If images formed at a large size are more detailed, as the evidence suggests (see Kosslyn, 1980), then these results may speak to the additional processing that was involved in forming more sharply defined letters. Note that it is unlikely that more parts (segments) were added to larger images; not only do letters have relatively few parts, but we found no additional activation in the other areas that we inferred to be involved in generating multipart images. In addition, if one had specifically made these predictions, other areas would have been significantly more active with the large images; as in the previous experiments, we list these results for the use of other investigators, who may have different theories than ours. These areas (followed by X, Y, Z coordinates) were area 18 (1.91, -79.44, 24.00) along the midline; left-hemisphere posterior cingulate (-7.01, -53.94, 16.00), the caudate (-10.84, 2.16, 8.00), and dorsolateral prefrontal cortex (-28.69, 50.61, -4.00); and right hemisphere area 19 (7.01, -83.26, 24.00), postcentral gyrus (31.24, -34.81, 64.00), amygdala (19.76, -4.21, -16.00), and inferior frontal cortex (35.06, 14.91, -12.00).

Figure 7. ... right cerebral hemispheres, seen from lateral and medial views. The triangles represent the loci of significant increases in activity when images were formed at a large size instead of a small size, and the circles represent the loci of significant increases in activity when images were formed at a small size instead of a large size. The tick marks on the axes specify 20 mm increments relative to the anterior commissure.

Table 5 notes. (a) Activation specific to large mental images was examined by subtracting activation from small mental images, and activation specific to small mental images was examined by subtracting activation from large mental images. Coordinates (in mm, relative to the anterior commissure) and p values. Regions are presented from posterior to anterior. Midline regions lie in the sagittal aspect of the brain; some lateralized regions are on the lateral surface, and some are medial, as illustrated in Figure 7. Seen from the rear of the head, the X coordinate is horizontal (with positive values to the right), the Y coordinate is in depth (with positive values anterior to the anterior commissure), and the Z coordinate is vertical (with positive values superior to the anterior commissure).
In addition, two other areas were specifically activated by small images, relative to large ones: the left-hemisphere precentral gyrus and the right-hemisphere dorsolateral prefrontal area. We have no theory about why these areas were selectively active here. Finally, the following areas would have been significant if one had specifically predicted them to be: left-hemisphere superior parietal (-
Finally, when we examined the data from all 16 subjects, we found substantially the same results as those reported above. Critically, we again found more activation for the smaller images in the posterior part of area 17 and more activation for the larger images in the anterior part of area 17. However, the effect was not as large as that reported above (particularly in the anterior region), which is as we would expect if the additional three subjects we included did not in fact perform the task.

Discussion
We now have strong evidence that visual mental images rely on topographically organized regions of visual cortex; these results are particularly noteworthy because the subjects' eyes were closed throughout the duration of the task. Specifically, we found that letters visualized at a relatively small visual angle activated a very posterior portion of visual cortex more than letters visualized at a relatively large visual angle. We expected greater metabolic activity within a small region for the small images because more complex spatial variations had to be represented in that region. In contrast, we found that letters visualized at a relatively large visual angle activated an anterior region of visual cortex more than did letters visualized at a small visual angle. This result follows if the large image extended into portions of the visual area that represent more peripheral parts of the field.
The activation we found in this experiment appears to be about where one would expect if area 17 were in fact activated. Fox et al. (1986) examined activation caused by alternating checkerboard stimuli that subtended means of 0.07, 3.5, and 10.5° of visual angle; these centroids were (translated into Talairach & Tournoux, 1988 coordinates courtesy of S. Petersen) -83.7, -77.7, and -74.18 mm posterior to the AC, respectively. In the present experiment, letters visualized at the smallest legible size activated a region 88 mm posterior to the AC; this suggests that subjects could form an image at a fraction of a degree of visual angle, which is consistent with the size at which one can read a letter when a page is held at arm's length. In addition, when subjects were asked to form images of the letters as large as possible while still remaining "visible," we found selective activation 69 mm posterior to the AC. Kosslyn (1978) describes psychophysical tasks that allow one to measure the visual angle subtended by imaged objects when they just begin to "overflow" the image. Although the angle varied depending on the precise procedure, for line drawings of animals it was as small as about 12.5°, and for featureless solid rectangular shapes, 16.5°; however, the angle was approximately 20.5° for featureless solid rectangular shapes with slightly different instructions and procedure. If we assume that the visual field is arranged along calcarine cortex according (approximately) to a logarithmic scale (e.g., see Schwartz, 1980; Van Essen, Newsome, & Maunsell, 1984), then 69 mm is roughly within the region that Fox et al.'s finding would lead us to expect the large images to activate.
However, we must note that the activation at the posterior part of area 17 may have arisen from foveal activation of other areas that meet area 17 at this point. In the monkey, at least four areas (V1, V2, V3, and V4) all meet at the occipital pole and all represent the fovea at that location (see Van Essen, 1985). It is clear that the responses of neurons in at least one of these areas, V4, can be modulated by nonvisual input (e.g., see Haenny, Maunsell, & Schiller, 1988; Moran & Desimone, 1985). Thus, foveal activation engendered by tiny images could reflect activity in the human homologues to any one (or combination) of these areas. The small images selectively activated the portion of visual cortex that represents the fovea, presumably because more visual complexity was packed into that region than occurred with the larger images. In contrast, when we subtracted the activation engendered by small images from that engendered by large images, we found a focus of activation at the anterior part of area 17; the activated region appears to be about where it should be if area 17 was in fact activated. But, given the precision of measurement, we cannot rule out the possibility that another topographically organized area that is near area 17 was in fact activated. We must be cautious about making strong claims about localization, given the fact that the images were averaged over subjects and that there are great individual differences in the size and position of area 17 (e.g., see Rademacher, Caviness, Steinmetz, & Galaburda, 1993).
Finally, we found that right-hemisphere visual cortex was selectively activated during imagery. We suggested earlier that right-hemisphere activation of visual cortex may reflect visual-memory-based imagery. It is possible that subjects used this sort of imagery because the alternative, attention-based imagery, is more difficult to use when one's eyes are closed. Attention-based imagery depends on allocating attention to specific regions of space, which may be easier when a visual structure is provided by the environment rather than having to be supplied in imagery. If the visual framework is also imaged, subjects may have difficulty maintaining it long enough to allocate attention over it.

GENERAL DISCUSSION
Our results suggest that visual mental imagery activates at least two, and possibly three, areas of visual cortex that are known to be, or almost certainly are, topographically organized in the human brain. One area is identified as area 17 in the Talairach atlas; this area is most likely to have actually been activated in Experiment 3. In addition, at least some types of imagery activate area 18 (see Experiment 2). Finally, there is an extrastriate area that lies just anterior to area 17 that is also activated during imagery (Experiments 1 and 2); this area is probably topographically organized. The finding that even one topographically organized part of cortex is activated during visual mental imagery suggests that image representations are spatially organized; amodal linguistic representations would not be supported by topographically mapped cortex.
The imagery used in Experiments 1 and 2 apparently did not activate the same part of visual cortex that was activated during Experiment 3. One possible explanation for this disparity is that the perceptual input when one's eyes are open is so strong that it overwhelms any activation from imagery in area 17 proper, and hence we did not detect such activity in the first two experiments. Another possibility is that there may be another visual area anterior to area 17 that is only activated when one melds images with perceived stimuli. Alternatively, this more anterior area may be activated during imagery only when one must override perceptual input from the eyes.
We must consider a potential problem with the present findings. One could argue that the greater activation of visual cortex in imagery versus perception we found in Experiment 1, and the greater activation of visual cortex in imagery versus the baseline we found in Experiment 2, occurred because this area is activated whenever subjects must perform a difficult visual task. We could respond in two ways. First, even if this possibility were correct, the results are important because there was no actual visual pattern in the grid during the imagery task: the "visual" difficulty arises in part due to the contribution of imagery. Thus, even if visual cortex is activated during visual mental imagery simply because the task is visually difficult (perhaps because visual imagery interferes with visual encoding), the fact that images can substitute for percepts in defining "visual difficulty" is of interest. Second, it is not clear what "difficulty" corresponds to in a processing sense. "Difficulty" can be defined only relative to the specific processing capabilities of a system; what is difficult for human beings might not be for a different species. If "difficulty" means that the pattern cannot be easily encoded bottom-up, then in difficult perceptual tasks imagery may be used during encoding, as apparently occurred in Experiment 2.
Imagery can arise not only because information is activated in long-term memory to form a spatial image, but also because input can be retained for a prolonged period (in which case imagery serves as a kind of on-line short-term memory). If this view is correct, then "difficult" visual tasks result in greater activation in visual cortex because imagery is used; if so, then such tasks should also cause greater activation of the middle temporal gyrus or the pulvinar (if visual-memory-based or attention-based imagery is used, respectively). In any event, it is clear that imagery is not confined to retention of on-line input. The subjects in Experiment 3 had their eyes closed during the entire task; there was no actual visual input at all.
Our findings have several broader implications. It is well known that the majority of visual areas in the macaque monkey have reciprocal connections to other visual areas, receiving information from the areas to which they send information (Felleman & Van Essen, 1991; Van Essen, 1985). The fact that stored information can evoke visual patterns in relatively "low level" visual areas during imagery suggests that the reciprocal connections principle applies to the human brain as well (for convergent data and some similar ideas, see Damasio, Damasio, Tranel, & Brandt, 1990; Hebb, 1968). In addition, the finding that imagery affects relatively low-level visual areas raises the possibility that "top-down" processing may play a greater role in perception than is currently assumed. The results from Experiment 2 suggest that visual mental imagery may be used in perception itself when one is faced with fragmentary input.
In addition, we were intrigued by our finding that the imagery used in Experiment 1 selectively activated left-hemisphere visual cortex whereas the imagery used in Experiments 2 and 3 selectively activated right-hemisphere visual cortex. Kosslyn (1988) and Kosslyn et al. (1993) found that subjects could perform the grids image-generation task faster if the stimuli were lateralized to the right visual field, so that the left hemisphere received the information initially. This finding is consistent with Farah's (1984) original analysis of image generation deficits following brain damage, and with Kosslyn, Holtzman, Gazzaniga, and Farah's (1985) finding that the left hemispheres of two split-brain patients were better than their right hemispheres at forming images of multipart objects (including letters). However, Kosslyn (1987, 1988), Kosslyn et al. (1993), and Sergent (1989, 1990) all report that under some circumstances the right hemisphere can generate images better than the left. For example, Sergent (1989) found that normal subjects could evaluate the relative heights of lower case letters better if the cue was shown in the left visual field. It is possible that this disparity reflects differences in the type of imagery used. When attention-based imagery is used, the left hemisphere may be superior; but when visual-memory-based imagery is used, the right hemisphere may have an edge. Our results suggest that PET will be a valuable tool in understanding the apparent contradictions that now lace the literature on this topic.
We also found that other areas used in visual perception are used in image generation. Indeed, many of these areas were as predicted if image generation involves an iterative process of activating and positioning individual parts of imaged objects. The present results implicate structures that may correspond to specific component processes used in visual imagery, and are consistent with the idea that there are at least two distinct ways to form visual mental images. But these findings just skim the surface. There may be many types of mental images, and many ways of using them (see Kosslyn

Materials
Sixteen uppercase (A, B, C, E, F, G, H, J, L, O, P, R, S, T, U, Y) and 16 lowercase (a, b, d, e, f, g, h, i, j, l, n, p, q, t, v, y) letters were used as stimuli. For the perception trials, the letters were drawn in light gray in 4 × 5 grids; these grids subtended 2.6 (horizontal) by 3.4 (vertical) degrees of visual angle from the subject's viewpoint. For the imagery trials, the light gray block letter was eliminated, and centered beneath each grid was the opposite case letter in a script font. The X mark probe remained in the same place for the corresponding imagery and perception trials. The sensory-motor control trials were identical except that the lowercase cue was removed.
A separate group of 20 subjects drew the letters in grids while the order in which they added segments was surreptitiously recorded. Half of the "yes" and half of the "no" trials fell on or adjacent to a segment that typically was drawn during the initial phases of printing the letter, and were called "early" trials; the other half were on or adjacent to a segment that typically was drawn during the final phases of printing the letter, and were called "late" trials.
The experiment was conducted with a Macintosh Plus computer with a Polaroid CP-50 screen filter (to reduce glare), using the MacLab 1.6 program (Costin, 1988), which presented the stimuli and recorded subjects' responses. The computer rested on a gantry approximately 45 cm from the subject's eyes; the screen was tilted down to face the subject.

Task Procedure
The subjects in the imagery group first studied the 16 uppercase and 16 lowercase letters. When they were familiar with the letters' appearance, the imagery trials began. A grid appeared with an X mark and a lowercase cue beneath. The subject decided whether the corresponding uppercase block letter would have covered the X mark if it were present in the grid. If the letter would have covered the X, the subject pressed a pedal with his right foot; if the letter would not have covered the X, he pressed another pedal with his left foot (we used a foot response because the activation due to responding would be high in the motor strip, out of the way of areas of interest). As soon as the subject responded, the stimulus was removed. After a delay of 500 msec the next trial began. The subjects began by performing 32 practice trials with digits instead of letters to familiarize them with the procedure and ensure that they understood the task. Any errors were discussed with the experimenter, and the instructions were reviewed if necessary. The actual test trials began 2 min before scanning and continued for 18-22 min. The stimuli were ordered randomly, except that the same type of trial (early/late) or response (yes/no) could not occur more than three times in succession and the same letter could not occur twice within a sequence of three trials. Subjects first completed 128 trials with uppercase letters, and then received lowercase letter trials until the end of the session.
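The ordering constraints just described can be illustrated with a short sketch. It is not the MacLab procedure actually used; it is a minimal greedy-with-restarts routine, and the trial dictionary layout, the function names, and the 16 × 2 × 2 trial list are assumptions introduced only for illustration.

```python
import random

def violates(seq, cand):
    """Return True if appending cand to seq would break an ordering constraint."""
    # The same letter may not occur twice within any run of three trials.
    if any(t["letter"] == cand["letter"] for t in seq[-2:]):
        return True
    # The same probe type or response may not occur more than three times in a row.
    for key in ("probe", "response"):
        if len(seq) >= 3 and all(t[key] == cand[key] for t in seq[-3:]):
            return True
    return False

def constrained_order(trials, rng, max_restarts=10000):
    """Build a random order greedily; restart with a new shuffle on a dead end."""
    for _ in range(max_restarts):
        pool = trials[:]
        rng.shuffle(pool)
        seq = []
        while pool:
            ok = [i for i, t in enumerate(pool) if not violates(seq, t)]
            if not ok:
                break  # no admissible trial left: abandon this attempt
            seq.append(pool.pop(rng.choice(ok)))
        if not pool:
            return seq
    raise RuntimeError("no valid ordering found")

def make_trials(letters):
    """Hypothetical design: every letter with each probe type and response."""
    return [{"letter": l, "probe": p, "response": r}
            for l in letters
            for p in ("early", "late")
            for r in ("yes", "no")]

order = constrained_order(make_trials("ABCEFGHJLOPRSTUY"), random.Random(0))
```

Restarting on a dead end, rather than backtracking, keeps the routine short; for a list of this size a valid order is normally found within a few attempts.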
In the sensory-motor trials, the computer was programmed to have a mean delay of 3172 msec before the X mark was removed, with durations varying according to a Gaussian distribution with sigma = 2376 msec. When the time to respond was added on, the subjects in the imagery group viewed the grids for a mean of 3478 msec per trial, compared to 3480 msec per trial in the sensory-motor control group, a nonsignificant difference, t < 1. Thus, subjects in the imagery and sensory-motor control groups viewed the grids for the same amounts of time.
The perception baseline task was performed second, after the imagery or sensory-motor task (depending on the group), because we did not want subjects to overlearn the appearance of the letters in the grids; we wanted the imagery task to be as challenging as possible in order to evoke the largest changes in blood flow.

PET Procedure
Following informed consent, a small catheter was placed in the radial artery of the right arm for blood sampling. A custom-molded plastic foam head holder was constructed and the subject positioned in the gantry such that tomographic slices could be obtained parallel to the canthomeatal (CM) line. Subjects also received a CT scan (GE 8800) using the same head holder, which maintained a reference geometry and permitted registration with the PET data. PET scanning was performed with a Scanditronix PC-384 tomograph (Litton, Bergstrom, Eriksson, Bohm, & Blomquist, 1984) during continuous inhalation of 15O, which efficiently labels H2O. Subjects breathed 15O-CO2 for 20 min, during which time one of the tasks was performed. PET scanning began 8 min after the start of inhalation and was carried out in three separate measures of 4 min each. During each task, the scanning bed was indexed 9.5 mm between each measurement, yielding nine contiguous PET slices. Arterial blood samples (approximately 0.8 ml) were drawn every 60 sec from the drip of a continuous withdrawal pump (2 ml/min) beginning 7 min after the start of inhalation.

PET Data Analysis
PET data were corrected for the effects of random coincidences, dead time, scattered radiation, detector nonuniformity, and photon attenuation, and were calibrated by comparison with measurements using a well scintillation counter. Tissue concentration images were reconstructed using a conventional filtered back-projection algorithm to an in-plane spatial resolution of 8 mm full width at half maximum (FWHM). The blood concentration history and tissue concentration images were used to compute CBF maps with a modification of the equilibrium method, as described by Senda, Buxton, Alpert, Correia, Mackay, Weise, and Ackerman (1988).
To prepare the data for activation analysis, the CBF data for each subject were normalized over the three scan positions in each task, to account for run-to-run variations in global mean CBF. The global mean flow for a task was determined as an area-weighted average over whole brain ROIs drawn on each of the nine slices. Percent relative difference images were computed, pixel by pixel, from CBF images, paired by task, and normalized by the grand mean flow over both tasks. A stereotactic coordinate transformation was determined so that each subject's scan could be represented in the reference system of Talairach et al. (1967). The method used to perform the stereotactic transformation was an extension of the St. Louis method (Fox, Perlmutter, & Raichle, 1985), with the X-ray plane film replaced by a CT scout view obtained in registration with the PET scan. An operator identified the locations of critical landmarks by positioning the cursor on the screen, and used the coordinates of the glabella, inion, limbus sphenoidale (LS), and sulcus transversus (ST) to estimate the AC-PC line (Fox et  Then, the directions of the stereotactic axes were computed. The coordinates of the midsagittal plane and the maximum dimensions of the brain were determined from the PET scan data. The CBF images and relative difference images then were rescaled, interpolated, and transformed to a standard stereotactic coordinate system (Talairach et al., 1967). Slices were computed in the system every 4 mm beginning 8 mm below the AC-PC line. The slices were horizontally oriented in the stereotactic coordinate system (i.e., were parallel to the AC-PC line and perpendicular to the midsagittal plane). For visual analysis, the transformed CBF and relative difference images were summed, slice by slice, over the seven subjects in each group to provide mean CBF and mean relative difference images for each slice in each group.
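The normalization steps at the start of this analysis can be sketched as follows. This is an illustration under stated assumptions, not the software used in the study: the function names are hypothetical, images are represented as plain nested lists, and the grand mean is simply passed in as a number.

```python
def area_weighted_mean(slice_means, slice_areas):
    """Global mean flow: per-slice ROI means weighted by ROI area."""
    total_area = sum(slice_areas)
    return sum(m * a for m, a in zip(slice_means, slice_areas)) / total_area

def percent_relative_difference(task_img, control_img, grand_mean):
    """Pixel-by-pixel difference of two CBF images, as a percentage of the
    grand mean flow over both tasks (assumed to be computed beforehand)."""
    return [[100.0 * (a - b) / grand_mean for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(task_img, control_img)]

# Toy values, chosen only to make the arithmetic easy to follow.
global_flow = area_weighted_mean([50.0, 60.0], [1.0, 3.0])
diff_img = percent_relative_difference([[55.0, 50.0]], [[50.0, 50.0]], 50.0)
```

With these toy values, a pixel that rises from 50 to 55 against a grand mean of 50 appears as a 10% relative difference, and an unchanged pixel as 0%.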

Subjects
Thirteen men volunteered to participate as paid subjects (mean age 20 years 8 months, with a range of 18 years 7 months to 23 years 2 months). All but two were attending college in the Boston area at the time of testing. Twelve subjects were right-handed and one was left-handed. All subjects reported having good vision and none reported that they were taking medication. None of the subjects was aware of the experimental hypotheses at the time of testing.

Materials
The 16 uppercase letter stimuli used in Experiment 1 were used as stimuli here. For the perception trials, the letters were drawn in light gray, with a density of approximately 82.5 pixels per cm². The grid background (the cells not filled in by the uppercase block letter) was filled in a lighter gray, with a density of about 45% of that of the block letters (or about 37 pixels per cm²). Thus, the contrast between the letter and background was reduced from the original set of stimuli. In addition, the X probe now consisted of dashed lines one pixel wide; each dash was 1 pixel long, and dashes were separated by a space equivalent to 2 pixels. Each script cue was removed from beneath the test stimulus, and instead was individually centered on another screen (which was now presented prior to the test stimulus).
For the imagery and baseline trials, the gray block letter was eliminated from the grid, as was the light gray background, leaving only the dashed X probe within the grid. As with the perception stimuli, each script cue was removed from beneath the test stimulus, and instead was individually centered on another screen (which was now presented prior to the test stimulus).

Task Procedure
Improvements in our technique allowed us to administer more than two tasks per subject, and so 5 subjects (mean age 20 years 8 months; age range 19 years 9 months to 23 years 2 months) performed the baseline, imagery, and perception tasks. In the baseline task, they viewed the same stimuli as in the imagery task, but prior to studying the appearance of the block letters within the grids. The subjects were instructed simply to view the stimuli "without trying to make sense of them or make any connections between them"; they were told to respond by pressing a pedal as quickly as they could when the grid appeared; they were to alternate which foot they used to make the response from trial to trial. These subjects were interviewed after the experiment and asked whether they made any connections between the script letters and the grids and whether they visualized anything within the grids during the baseline task; all subjects reported having followed the instructions not to think about or make associations between the stimuli, and none reported visualizing a pattern within the grid. These subjects completed an average of 61 trials for the baseline task (with a range of 47-71), an average of 62 trials in the perception task (range 57-64), and an average of 58 trials in the imagery task (range 46-65). The subjects required a mean of 350 msec to respond to the appearance of the grid in the baseline task.
Due to time constraints, an additional eight subjects performed only the imagery and perception tasks. As before, the imagery trials were always administered before the perception trials. Prior to the imagery trials, the subjects first studied four block numbers (1, 3, 4, 7), each within a grid, as well as the corresponding script cues. As soon as subjects reported knowing which cells of the grid were filled by each of the block numbers and knew which script cues went with which numbers, a series of practice trials was administered. Subjects completed 16 practice trials, seeing each of the four numbers twice with a "yes" probe and twice with a "no" probe. Once the practice trials were complete, the subjects studied the 16 block letters as well as 16 corresponding script letters that would serve as cues. The imagery trials were administered as soon as the subjects reported being familiar with these block letters and their corresponding cues. The trials began 30 sec before scanning and continued for an additional 2 min. On each trial the following events occurred: a centered asterisk appeared for 500 msec; the screen was blank for 100 msec; the script cue appeared for 300 msec, followed by another blank screen for 100 msec; a grid, empty except for the dashed X probe, appeared for 200 msec. At this point, subjects decided whether the X would cover the block letter if it were actually present in the grid. An asterisk appeared in the center of the screen 100 msec after the subject responded, and a new trial began. The stimuli were ordered randomly, except that the same type of trial (early/late probe position) or response (yes/no) could not occur more than 3 times in succession and the same letter could not occur twice within a sequence of 3 trials. Both the imagery and perception tasks of the experiment contained 128 trials, of which subjects completed on average 57 in the imagery task (with a range of 32-101) and 62 in the perception task (with a range of 44-89).
The perception trials were identical to the imagery ones except that the perceptual stimuli were used. All other details of the method and procedure were the same as those from the corresponding task of Experiment 1. As in the original experiment, the perception task was always performed second in order to ensure that the subjects would not overlearn the letters' appearance and the task would remain challenging.

PET Procedure
The PET procedure was similar to that of Experiment 1, but we used a new PET scanner and an improved method of data analysis. The PET machine was a GE Scanditronix PC4096 15 slice whole-body tomograph, which was used in its stationary mode (see Kops et al., 1990). The camera produced contiguous slices that were 6.5 mm apart (center-to-center; the axial field was equal to 97.5 mm); the axial resolution was 6.0 mm FWHM. The images were reconstructed using a measured attenuation correction and a Hanning-weighted reconstruction filter, which was set so that there was an 8.0 mm in-plane spatial resolution (FWHM). When images were reconstructed, we also corrected for effects of random coincidences, scattered radiation, and counting losses that result from dead time in the camera electronics. The PET machine was in a suite built specifically for this purpose, and the same conditions were used for all testing: the lights were dimmed and there was no conversation or intrusive external noise.
Each subject was fitted with a thermoplastic custom molded face mask (TRUE SCAN, Annapolis, MD). The subject's head was aligned in the scanner relative to the CM line, using fixed horizontal and vertical lasers that were positioned relative to the slice positions of the scanner. Once the mask was mounted so that the head was stabilized, nasal cannulae were hooked to a radiolabeled gas inflow, and an overlying face mask was hooked to a vacuum.
Before scanning began, several transmission measurements were made with an orbiting rod source over the axial extent covering the cerebral hemispheres and cerebellum. Twenty measurements were made on each PET run; the first three measurements were each made over 10 sec, and the following 17 measurements were each made over 5 sec. The PET protocol was as follows: (1) the camera acquisition program was started (which measured residual background from the previous study); (2) 15 sec later stimulus presentation began; (3) 15 sec later, the 15O-CO2 gas administration began; and (4) 60 sec later scanning ended, and the gas administration was stopped.
The images of relative blood flow were based on scans 4-16, which were summed after reconstruction. The terminal count rates were between 100,000 and 200,000 events per sec. Work in our laboratory has demonstrated, using radial artery cannulation, that integrated counts over periods up to 90 sec are a linear function over the flow range of 0 to 130 ml/min/100 g. Thus, we did not need an arterial line to ensure that data can be characterized in units of flow relative to the whole brain.

PET Data Analysis
We first pooled each slice of the scan data across all behavioral conditions, and then identified the coordinates of midline structures across all slices. A least squares procedure was applied to these coordinates to estimate the parameters of the midsagittal plane. We next resliced the images parasagittally at 5.1 mm intervals. The brain surface of a 10.2-mm parasagittal slice was outlined by hand at the 50% threshold level (nominal). If necessary, data that were missing from the surfaces of the parasagittal emission slices were filled in from more complete sagittal transmission images.
We next transformed the PET data to Talairach coordinates by deforming the 10 mm sagittal planes specified in the Talairach atlas until we obtained the best match (with "best" being defined in a least-squares sense). This procedure allowed us to estimate the locations of several key structures, namely the frontal pole, occipital pole, vertex, AC, PC, and the tilt angle. Using the locations of the midsagittal plane and the key structures just noted, we were able to compute the piece-wise linear transformation to Talairach coordinates. We evaluated the quality of this transformation both by examining the standard errors of the parameters and by visually comparing the manually drawn brain surface to the atlas contour. We also projected a computerized version of the 1967 Talairach atlas onto the transformed data, which allowed us to confirm that the transformed image conformed to the outlines of structures and key features in the atlas.
We specified the mean concentration in each slice for each run as an area-weighted sum, which was adjusted to a nominal value of 50 ml/min/100 g. The images were then scaled and smoothed with a two-dimensional Gaussian filter (20 mm wide, FWHM). We pooled the images over subjects for each testing condition, and a baseline image was subtracted from a test image; images were subtracted within subjects. The results were images of the mean differences, standard deviations, and a t-value for each pixel.
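The within-subject subtraction and the per-pixel statistics can be sketched as a paired t computation. This is a minimal illustration, assuming a standard paired t statistic (mean difference divided by its standard error); the function name and the flat-list image layout are hypothetical, and scaling and smoothing are taken to have been applied already.

```python
import math
from statistics import mean, stdev

def paired_t_map(test_imgs, base_imgs):
    """test_imgs and base_imgs: one flat list of pixel values per subject.
    Returns per-pixel mean difference, standard deviation, and t-value."""
    n_subjects = len(test_imgs)
    n_pixels = len(test_imgs[0])
    means, sds, ts = [], [], []
    for p in range(n_pixels):
        # Subtract baseline from test within each subject.
        diffs = [test_imgs[s][p] - base_imgs[s][p] for s in range(n_subjects)]
        m, sd = mean(diffs), stdev(diffs)
        means.append(m)
        sds.append(sd)
        # t is undefined when the differences do not vary across subjects.
        ts.append(m / (sd / math.sqrt(n_subjects)) if sd > 0 else float("nan"))
    return means, sds, ts

# Three hypothetical subjects, two pixels each.
test = [[51.0, 50.0], [52.0, 50.0], [53.0, 50.0]]
base = [[50.0, 50.0], [50.0, 50.0], [50.0, 50.0]]
means, sds, ts = paired_t_map(test, base)
```

In this toy case the first pixel has differences of 1, 2, and 3 (mean 2, standard deviation 1), giving t = 2√3, while the second pixel never changes and so has no defined t-value.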
The PET data analysis was different from that of Experiment 1. The t statistic image was then submitted to a statistical parametric mapping (SPM) analysis, as developed by Friston, Frith, Liddle, and Frackowiak (1991). This procedure produced images of standardized normal deviates, which are called "omnibus subtraction images." To account for multiple comparisons and the spatial correlations in the image, we used the SPM technique to compute significance thresholds for specific hypotheses. This calculation depended on the image smoothness and the number of pixels in each hypothesized region. We measured image smoothness in the way recommended by Friston et al. (1991) and found it to be 14.2 mm. We adjusted the threshold for statistical significance to account for multiple comparisons and the smoothness of subtraction images (using Friston et al.'s formula); the problem of multiple comparisons occurs whenever there is more than one resolution element in a region of interest (ROI). The significance value was adjusted by dividing it by the number of pixels in the ROI. When we had predictions, one-tailed tests were used; when we did not have predictions, two-tailed tests were used.
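The division of the significance value by the number of pixels in an ROI can be illustrated with a simple Bonferroni-style sketch. It deliberately omits the smoothness term of Friston et al.'s formula, so it is not the calculation used here; the function name, the 0.05 starting level, and the pixel counts are assumptions.

```python
from statistics import NormalDist

def z_threshold(alpha, n_pixels, one_tailed=True):
    """Critical z for a significance level divided by the ROI pixel count."""
    adjusted = alpha / n_pixels
    # A two-tailed test splits the adjusted level across both tails.
    tail = adjusted if one_tailed else adjusted / 2.0
    return NormalDist().inv_cdf(1.0 - tail)

# Hypothetical ROIs of 10 and 100 pixels at a nominal level of 0.05.
z_small_roi = z_threshold(0.05, 10)
z_large_roi = z_threshold(0.05, 100)
z_unpredicted = z_threshold(0.05, 10, one_tailed=False)
```

The sketch makes the trade-off visible: a larger ROI, or a two-tailed test for an unpredicted region, raises the z-value a pixel must exceed.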
We performed two different types of analyses of the data. First, ROIs were defined to test specific a priori hypotheses. Second, the omnibus subtraction images were examined to observe post hoc significance in regions that were not hypothesized to be relevant; the significance levels for these post hoc analyses were adjusted based on the number of comparisons (see Friston et al., 1991). One set of ROIs was added after the initial analysis, however, due to an oversight: We initially misclassified the region of the parietal lobe that has been identified with the "disengage" operation (see Corbetta et al., in press). Because we predicted that the disengage operation should be used when one generates visual mental images, we felt justified in placing an ROI over the superior parietal lobe after it was identified with this operation. In fact, this area was sufficiently active to be detected by a post hoc test in all but one case.

Subjects
Sixteen men volunteered to participate as paid subjects. The subjects' mean age was 22 years, 10 months (range 18;9-32;11); for the 13 "best" subjects, the mean age was 22 years, 0 months (range 18;9-32;11). Most of the subjects were college or medical school students, and were recruited from several universities in the Boston area. All subjects were right-handed and reported good eyesight; none reported that they were in poor health or taking medication. None of the subjects was aware of the experimental hypotheses at the time of testing.

Materials
The stimuli used in this experiment were tones and words recorded using a MacRecorder sound digitizing device, the SoundEdit program, and a Macintosh computer. The stimuli were presented using a customized version of the MacLab program (Costin, 1988). The subjects heard the stimuli from two AWA speakers, which were attached to the speaker port of a Macintosh Plus computer.
In order to encourage the subjects to visualize the letters, and to check that they had done so at the appropriate size, the subjects made one of four possible judgments about each letter. These judgments were (1) whether the letter is composed solely of straight (including diagonal) lines or whether it includes one or more curved lines, (2) whether the letter is left/right symmetrical, (3) whether the letter has at least one straight (i.e., vertical or horizontal) line down the full length of one of its four sides, and (4) whether the letter has exactly two terminators (end points that do not meet any other lines). Four sets of cue words were devised to signal each type of judgment; the words were: (1) "all straight," (2) "symmetrical," (3) "straight side," and (4) "two terminators." The names of all 26 letters of the alphabet were presented, and each cue appeared equally often. The correct response for each type of judgment was equally often "yes" and "no," and no more than three trials in a row could feature the same judgment or response type. Twenty letters were used for each type of judgment, and a given type of judgment was performed no more than one time for a given letter; a total of 80 trials were prepared.
Two sets of 16 practice trials also were prepared, which differed only in the order of the stimuli. The practice trials included the names of numbers followed by the same four judgment cues used in the experimental trials. Each number from 0 through 9 was heard at least once during the practice trials. As in the actual test trials, the practice trials were balanced to include equal numbers of each of the four judgments and equal numbers of both responses for each. Two versions of the experiment were prepared, with the order of trials being the only difference between them. The 80 trials of the first version were divided in half, then each half was reversed, thus producing a second version.
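The construction of the second version can be sketched in a few lines of Python. This is one reading of the description, assuming that "each half was reversed" means the order of trials was reversed within each half while the halves kept their positions; the function name is hypothetical.

```python
def make_second_version(trials):
    """Reverse the trial order within each half of the list (assumed reading)."""
    half = len(trials) // 2
    first_half, second_half = trials[:half], trials[half:]
    return first_half[::-1] + second_half[::-1]


# Example with trial indices standing in for the 80 trials.
version_1 = list(range(80))
version_2 = make_second_version(version_1)
```

Under this reading, reversal preserves run lengths within each half, but the trials that meet at the boundary between halves change, so the limit of three like trials in a row would need to be rechecked there.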

Task Procedure
The subjects were first given written instructions that explained the four tasks; these judgments were illustrated using examples of random figures that did and did not correspond to the criterion for each judgment. The cue words for each of the tasks were provided and subjects were instructed to learn them. The instructions strongly emphasized the importance of keeping the eyes closed during the experiment, maintaining the mental image of the appropriate letter until hearing one of the four cues, responding as quickly and accurately as possible after "looking" at the imaged letter, and imagining the letter at the appropriate size for that set of trials (they were told that this aspect of the experiment would be discussed shortly). The subjects were asked to paraphrase the instructions to ensure that they understood how to perform each of the judgments as well as the other critical features of the experiment. The experimenter corrected the subject's understanding of the instructions if necessary, and repeated a verbal summary of the most important points.
The subjects were then positioned on the scanner bed and again were reminded of the important features of the experiment. Before the first set of practice trials, the subjects were told the size at which letters should be imaged during the set of trials. They either were to visualize the letter at a very large size or a very small size, depending on the counterbalancing group to which they belonged; half the subjects were instructed to imagine small letters for the first scan and large letters for the second scan, and vice versa for the other half of the subjects. Prior to the "small" imagery block of trials, the subjects were instructed to visualize the letter "as small as possible" in the center of their field of view, while still being able to distinguish all of its parts. Prior to the "large" imagery block, the subjects were instructed to visualize the letters so that "they fill up the entire field of view," while still being entirely visible in the image.
The subjects began with the practice trials, and were told that they would be performing the appropriate tasks only on numbers (no letters). The beginning of each trial was signaled by a 500-msec tone, followed by a 500-msec delay, after which subjects heard the name of a number. The subjects were to visualize (and maintain the image of) the appropriate digit at the indicated size. Four seconds later, a cue indicating one of the four judgments was presented. The subjects were then to evaluate the number for that property; if it was present they were to press the pedal under their right foot, and if it was not present they were to press the pedal under their left foot. They were to respond as quickly and accurately as possible. Five hundred msec after the subject responded, a tone marked the beginning of the next trial.
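The timing of a single trial can be laid out as a short Python sketch. It is illustrative only and is not the MacLab implementation; the names are hypothetical, and it assumes that the 4-sec interval runs from the onset of the spoken name and that response time is measured from cue onset, neither of which the text specifies.

```python
TONE_MS = 500           # warning tone
DELAY_MS = 500          # silent delay after the tone
NAME_TO_CUE_MS = 4000   # assumed to run from name onset to cue onset
POST_RESPONSE_MS = 500  # interval between the response and the next tone


def trial_timeline(start_ms, response_time_ms):
    """Return event times (msec) for one trial that begins at start_ms."""
    name_onset = start_ms + TONE_MS + DELAY_MS
    cue_onset = name_onset + NAME_TO_CUE_MS
    response = cue_onset + response_time_ms
    return {
        "tone_onset": start_ms,
        "name_onset": name_onset,
        "cue_onset": cue_onset,
        "response": response,
        "next_trial": response + POST_RESPONSE_MS,
    }


def pedal_for(property_present):
    """Right pedal if the probed property is present, left pedal otherwise."""
    return "right" if property_present else "left"
```

Because each trial ends 500 msec after the response, trial duration under these assumptions depends on the subject's response time, so the number of trials completed in a block of fixed length would vary across subjects.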
The test trials had exactly the same format as the practice trials; the only difference was that the names of letters (and not numbers) were presented. Subjects began the first block of trials 30 sec before beginning to inhale ¹⁵O and continued for a total of approximately 2 min before they were instructed to stop. After approximately 5 min, subjects received instructions about the size at which they were to visualize the letters in the following block of trials, and then received the practice trials a second time (in a different order from the first set). The subjects were asked to form the images at the size not used during the first block of trials, and were interviewed after the practice trials to discover whether they believed they had done so. The subjects began the second block of trials 30 sec before the second inhalation, and again continued performing the task for about 2 min before being instructed to stop.
After both sets of trials were completed, the subjects were asked about the strategies they used to perform the tasks and asked whether they were able to maintain the images at the appropriate size for each block. All subjects reported having followed the instructions and having held the images at the correct size for at least half of the trials in each block. The subjects were told the purpose and hypotheses of the experiment after this debriefing interview.

PET Procedure and Data Analysis
The PET procedure and data analyses were identical to those used in Experiment 2. The ROIs in area 17 (as specified in the Talairach atlas) were selected on the right side because of Sergent's (1989) earlier findings with a task that was similar to ours, and were positioned in the far posterior quarter of the region and in the most anterior quarter of the region.

Figure 1. Stimuli used in Experiment 1. Subjects either saw a letter in a grid (perception task), visualized the letter (imagery task), or waited for the X mark to be removed (sensory-motor task).

Figure 2. Transverse slices of averaged blood flow data from the three experiments, as seen from the bottom looking up. The lower part of each slice is the back of the brain, and the left side represents the right hemisphere and the right side represents the left hemisphere. The height above the ACPC line is identified for each slice or set of slices. The top left panel presents key results from Experiment 1, with results from the imagery group on the left and from the sensory-motor control group on the right. IMG, imagery; SMC, sensory-motor control; PER, perception; VC, visual cortex. The bottom center panel presents key results from Experiment 2; the top row is imagery-perception, the middle row is imagery-baseline, and the bottom row is perception-baseline. The slices in each column are the same height above the ACPC line, as indicated. MT, middle temporal gyrus; DPF, dorsolateral prefrontal; THAL, thalamus; VC, visual cortex; HP, hippocampus; 19, area 19; AN, angular gyrus; 18, area 18; CAUD, caudate; IP, inferior parietal; SP, superior parietal. The top right panel presents key results from Experiment 3, when activation engendered by large images is subtracted from that engendered by small images and vice versa. POST 17, posterior area 17; ANT 17, anterior area 17.

Figure 3. Results from Experiment 1. The left and right cerebral hemispheres, seen from lateral and medial views. The triangles represent the loci of significant increases in activity in the imagery task relative to the perception task; the circles represent the loci of significant increases in activity in the sensory-motor control task relative to the perception task. The tick marks on the axes specify 20 mm increments relative to the anterior commissure.

Figure 4. Stimuli and trial sequences used in Experiment 2.
ᵃ Regions are presented from posterior to anterior. Midline regions lie in the sagittal aspect of the brain; some lateralized regions are on the lateral surface, and some are medial, as illustrated in Figure 5. Seen from the rear of the head, the X coordinate is horizontal (with positive values to the right), the Y coordinate is in depth (with positive values anterior to the anterior commissure), and the Z coordinate is vertical (with positive values superior to the anterior commissure).

Figure 5. Results from Experiment 2. The left and right cerebral hemispheres, seen from lateral and medial views. The triangles represent the loci of significant increases in activity in the imagery task relative to the perception task. The tick marks on the axes specify 20 mm increments relative to the anterior commissure.

Figure 7. The results from Experiment 3. The left and
, which may be the locus of visual memories in humans; see Haxby, Grady, Horwitz, Ungerleider, Mishkin, Carson, Herscovitch, Shapiro, & Rapoport, 1991; Sergent, Ohta, & Macdonald, 1992); dorsolateral prefrontal lobe structures (corresponding to the process that arranges parts into an image); superior parietal lobe structures (e.g., area 7, which are involved in shifting attention to the locations of successive segments as they are placed in the image, see Corbetta, Miezin, Shulman, & Petersen, 1993; Haxby et al., 1991); and the pulvinar and anterior cingulate (which apparently are involved in fixating visual attention, as required to note where parts belong on the emerging structure; see LaBerge & Buchsbaum, 1990; Posner & Petersen,

Table 1. Coordinates (in mm, Relative to the Anterior Commissure) and p Values for Regions in Which There Was More Activation during Imagery Than Perception in Experiment 1ᵃ
ᵃ Regions are presented from posterior to anterior. Midline regions lie in the sagittal aspect of the brain; some left hemisphere regions are on the lateral surface, and some are medial, as illustrated in Figure 3. Seen from the rear of the head, the X coordinate is horizontal (with positive values to the right), the Y coordinate is in depth (with positive values anterior to the anterior commissure), and the Z coordinate is vertical (with positive values superior to the anterior commissure).

Table 2. Coordinates (in mm, Relative to the Anterior Commissure) and p Values for Regions in Which There Was More Activation during Imagery Than Perception in Experiment 2ᵃ

Table 3. Coordinates (in mm, Relative to the Anterior Commissure) and p Values for Regions in Which There Was More Activation during Imagery Than in the Baseline Task in Experiment 2ᵃ

Table 4. Coordinates (in mm, Relative to the Anterior Commissure) and p Values for Regions in Which There Was More Activation during Perception Than in the Baseline Task in Experiment 2ᵃ
ᵃ Regions are presented from posterior
, in press; Kosslyn & Shin, 1993). This project has led us to formulate several hypotheses, which now must be explicitly addressed. PET research is ideally suited to help discover the nature of component processes used in imagery and the circumstances in which they are used.

Two groups of seven right-handed men volunteered to participate as paid subjects. The subjects in the imagery group had a mean age of 24.0 years, with a range of 18.2 to 33.3; the subjects in the sensory-motor control group had a mean age of 25.7 years, with a range of 19.3 to 36.4. None of the subjects was taking medication, and all reported being in good health. All subjects were unaware of the purposes and predictions of the experiment at the time of testing. No subject was tested in more than one experiment reported in this article.