Prefrontal Activity and Diagnostic Monitoring of Memory Retrieval: fMRI of the Criterial Recollection Task

According to the distinctiveness heuristic, subjects rely more on detailed recollections (and less on familiarity) when memory is tested for pictures relative to words, leading to reduced false recognition. If so, then neural regions that have been implicated in effortful postretrieval monitoring (e.g., dorsolateral prefrontal cortex) might be recruited less heavily when trying to remember pictures. We tested this prediction with the criterial recollection task. Subjects studied black words, paired with either the same word in red font or a corresponding colored picture. Red words were repeated at study to equate recognition hits for red words and pictures. During fMRI scanning, alternating red word memory tests and picture memory tests were given, using only white words as test stimuli (say yes only if you recollect a corresponding red word or picture, respectively). These tests were designed so that subjects had to rely on memory for the criterial information. Replicating prior behavioral work, we found enhanced rejection of lures on the picture test compared to the red word test, indicating that subjects had used a distinctiveness heuristic. Critically, dorsolateral prefrontal activity was reduced when rejecting familiar lures on the picture test, relative to the red word test. These findings indicate that reducing false recognition via the distinctiveness heuristic is not heavily dependent on frontally mediated postretrieval monitoring processes.


INTRODUCTION
The prefrontal cortex (PFC) plays an important role in episodic memory retrieval. Patients with damage to the PFC often show elevated false recognition to new events (e.g., Verfaellie, Rapcsak, Keane, & Alexander, 2004;Schacter, Curran, Gallucio, Millberg, & Bates, 1996), and in some cases, fabricate fanciful stories about their personal past (i.e., confabulation; see Burgess & Shallice, 1996;Moscovitch, 1995). These findings suggest that prefrontal regions are critical for consciously controlled monitoring processes that regulate memory accuracy, such as searching for specific sorts of information and resulting decision processes. A growing body of neuroimaging evidence is consistent with these ideas. For instance, numerous studies have found retrieval-related activity in prefrontal regions during source memory tasks, which require the recollection and monitoring of specific types of to-be-remembered information (e.g., Dobbins, Rice, Wagner, & Schacter, 2003;Cansino, Maquet, Dolan, & Rugg, 2002;Dobbins, Foley, Schacter, & Wagner, 2002;Ranganath, Johnson, & D'Esposito, 2000;Nolde, Johnson, & D'Esposito, 1998).
It has been argued that the dorsolateral prefrontal cortex (DLPFC; near Brodmann's area [BA] 9/46) plays a critical role in postretrieval monitoring, or the additional search and decision processes necessary when only partial information or a feeling of familiarity is retrieved from a test cue (e.g., Dobbins et al., 2002; Burgess & Shallice, 1996; for review, see Rugg, 2004). For instance, studies have found greater activity in the DLPFC when subjects have to reject familiar lures on a recognition memory test (e.g., Achim & Lepage, 2005; Rugg, Henson, & Robb, 2003; McDermott, Jones, Petersen, Lageman, & Roediger, 2000), during incorrect or difficult source judgments (e.g., Cansino et al., 2002), or during low-confidence recognition judgments (e.g., Henson, Rugg, Shallice, & Dolan, 2000). All of these studies reported activity in the right DLPFC, but several studies also have reported bilateral activation of the DLPFC under conditions where the recognition decision was more "effortful," in that it should have demanded additional retrieval monitoring (e.g., Achim & Lepage, 2005; Cansino et al., 2002; Cabeza, Rao, Wagner, Mayer, & Schacter, 2001; Henson, Rugg, Shallice, & Dolan, 2000; McDermott et al., 2000). Different episodic retrieval functions have been attributed to the right and left PFC (for various views, see Dobbins, Simons, & Schacter, 2004; Mitchell, Johnson, Raye, & Greene, 2004; Cabeza, Locantore, & Anderson, 2003), but for now we simply note that bilateral DLPFC effects have been found in many situations where subjects must carefully monitor memory retrieval.
Of course, not all types of memory monitoring are the same. Behavioral studies of source memory have indicated that different types of retrieval monitoring are engaged depending on the types of information that are encoded and retrieved, the demands of the retrieval task, and the resulting decision processes (e.g., Johnson, Hashtroudi, & Lindsay, 1993). Studies investigating false recognition, or the incorrect acceptance of lures on a memory test, have provided insights into these different types of monitoring. One type of monitoring, dubbed "recall-to-reject," involves the use of recollected information to overcome familiarity-based errors (see Yonelinas, 2002, for a review). This form of recollection-based editing has been demonstrated in exclusion tasks, in which the recollection of an item from one source precludes or disqualifies it from having occurred in the target source. Imaging studies have shown that this type of monitoring activates prefrontal regions, including the DLPFC (e.g., Achim & Lepage, 2005; Rugg et al., 2003; McDermott et al., 2000).
In contrast to a recall-to-reject process, in which the successful recollection of disqualifying information allows subjects to reject the lure, the present study focuses on more diagnostic recollection-based monitoring processes. Diagnostic monitoring occurs when recollection fails to conform to one's expectations of what they should recollect if the event had been studied in the to-be-remembered context (i.e., ''I didn't study this, because I'd remember if I had.''). The distinctiveness heuristic proposed by Schacter and colleagues is an example of this type of monitoring. This process was used to explain the finding that false recognition of new words was lower after studying a list of pictures compared to studying a list of words (e.g., Schacter, Israel, & Racine, 1999;Israel & Schacter, 1997). Although the lures were assumed to be equally familiar in the two cases, subjects expected more distinctive memories when tested for pictures, and thus, were better able to avoid false recognition of nonstudied lures (which would not elicit distinctive recollections).
Little is known about the neural substrates of the distinctiveness heuristic, and the few existing studies are inconsistent.  investigated this issue using the repetition-lag task, in which nonstudied words are repeated during the recognition test (so as to induce familiarity-based false alarms [FAs]). They found that patients with damage to the frontal lobes were unable to reduce false recognition of repeated lures following picture study, relative to word study, even though controls showed the typical pattern (picture study < word study). These results led  to conclude that the distinctiveness heuristic is dependent on the frontal lobes, and in particular, the DLPFC (near BA 9 and BA 46). Budson, Droller, et al. (2005) reached a different conclusion in an ERP study of the same task. After studying words, waveforms for targets and lures were most differentiated in a relatively late interval (1000-2000 msec), which was thought to reflect frontally mediated monitoring processes. In contrast, after studying pictures, waveforms tended to be differentiated in a relatively early interval (550-1000 msec), which was thought to reflect processes involved in the search for picture recollections. The authors argued that subjects had used a different retrieval orientation following the study of pictures than words (cf. Herron & Rugg, 2003), and that relying on more distinctive recollections (i.e., pictures) decreased the need to engage frontally mediated monitoring processes. Unlike , these results suggest that this type of monitoring does not depend on frontal mechanisms (or at least, not as much as responding under less distinctive conditions).
Consistent with this interpretation, several studies have shown that healthy older adults are just as likely as younger adults to use the distinctiveness heuristic (e.g., Dodson & Schacter, 2002; Schacter et al., 1999), even though older adults are often impaired in source memory tasks (ostensibly due to reduced frontal lobe functioning; e.g., Henkel, Johnson, & De Leonardis, 1998).
One potential reason for these mixed results is that factors other than the distinctiveness heuristic can influence the repetition-lag task. As discussed by Dodson and Schacter (2002) and others (see Jennings & Jacoby, 1997), subjects can reduce false recognition to repeated lures on this task by recollecting that they had earlier been presented on the test (as opposed to the study phase). This recall-to-reject strategy represents a disqualifying monitoring process, which is thought to be qualitatively different from the diagnostic monitoring processes involved in the distinctiveness heuristic (see Gallo, 2004). Thus, the frontal lobes might have been implicated in this task due to the use (or attempted use) of a recall-to-reject strategy (i.e., trying to recall whether the item had earlier been presented as a test word), as opposed to the use of a distinctiveness heuristic (i.e., trying to recall whether the item had been studied as a picture).
To avoid these complications in the present study, we investigated the distinctiveness heuristic using the criterial recollection task (Gallo, Weiss, & Schacter, 2004). In brief, subjects studied red words or colored pictures of objects (along with their verbal labels) and were tested using words as memory cues. (To avoid presentation format confounds, test words were presented in a different font and color than the words used at study.) In different test blocks, subjects were given three different test instructions. Under standard instructions, they were to respond ''yes'' to any test word that corresponded to a studied item, regardless of the study format. Under red word instructions, they responded ''yes'' only if they could recollect a red word, and vice versa under picture instructions. Importantly, some test items had been studied as both red words and pictures, so that the recollection of one format (a red word) did not necessarily preclude an item from also having been presented in the other format (a picture). As a result, subjects could not use a recall-to-reject strategy to reject studied lures, but instead they had to carefully query their memory for the to-be-recollected format on the red word test and picture test (i.e., the criterial recollection tests).
Using this task, Gallo et al. (2004) found that false recognition was lower on the picture test than on the red word test, indicating the use of a distinctiveness heuristic. This effect was found for to-be-excluded studied lures (i.e., those studied in the noncriterial format) as well as for new lures that were never studied. Further, these effects were found regardless of whether pictures were more familiar than red words, or vice versa (familiarity was manipulated by repeating red words at study, and was measured using hits on the standard recognition test, subjective measures, and response-time manipulations). The finding that manipulations of familiarity did not influence false recognition was taken as strong evidence that subjects had used a recollection-based monitoring process, such as the distinctiveness heuristic, to reduce false recognition on the picture test.
The current task was modeled after Gallo et al. (2004, Experiment 2), with minor presentation modifications for fMRI. Because we found that the familiarity of the stimuli was unrelated to the use of a distinctiveness heuristic in our prior study, we equated recognition hits for the two types of stimuli for the fMRI task (by repeating red words at study). The critical question was whether use of the distinctiveness heuristic would activate the DLPFC. The neuropsychological work concerning patients with frontal lobe lesions described above, as well as neuroimaging studies of other types of recollection-based monitoring processes (e.g., Achim & Lepage, 2005; Rugg et al., 2003; McDermott et al., 2000), suggests that DLPFC regions might be critical for use of the distinctiveness heuristic. However, as suggested by Budson, Droller, et al. (2005), having subjects orient retrieval towards more distinctive recollections might reduce the need to engage frontally mediated monitoring processes, relative to a less distinctive retrieval orientation. Consistent with this prediction, Gallo et al. found that correct rejections on the basis of the distinctiveness heuristic were relatively quick (as indexed by response latencies) and easy (as indexed by postexperiment questionnaires), suggesting that effortful postretrieval monitoring processes were not necessarily involved.

Behavioral Results
Recognition data are summarized in Table 1 and replicated the major results of Gallo et al. (2004). Performance on the standard test was quite good. As expected, recognition of test words that were studied in both formats (both-hits, mean = 0.67) was greater than recognition of test words that were studied only as red words (word-hits, 0.51) or only as pictures (picture-hits, 0.52), and all hit rates were greater than FAs to new lures (new-FAs, 0.17), all ps < .001. By design, word-hits were equivalent to picture-hits. On the criterial recollection tests, the results indicated that subjects were responding primarily on the basis of recollection of the to-be-remembered format. On the red word test, word-hits (0.51) were greater than FAs to items studied only as pictures [picture-FAs, 0.40, t(15) = 2.44, p < .05], and on the picture test, picture-hits (0.46) were greater than FAs to items studied only as red words [word-FAs, 0.14, t(15) = 8.17, p < .001]. Familiarity differences alone cannot explain this crossover pattern of responding (red words > pictures on the red word test, but pictures > red words on the picture test). Instead, as instructed, subjects primarily responded on the basis of red word recollections on the red word test and picture recollections on the picture test.
Although subjects had relied on recollection, their memory was not perfect, and source confusions influenced performance. On the red word test, picture-FAs (0.40) were greater than new-FAs [0.17, t(15) = 7.86, p < .001], indicating that presentation in the noncriterial format increased false recognition. Also, both-hits (0.63) were greater than word-hits [0.51, t(15) = 3.32, p < .01], demonstrating that presentation in the noncriterial format boosted hit rates to items studied in both formats. Similar effects were found on the picture test, but importantly, the influences of source confusions were smaller than those found on the red word test. Word-FAs (0.14) were greater than new-FAs [0.09, t(15) = 3.07, p < .01], but this effect was smaller than that found on the red word test [F(1,15) = 23.10, p < .001].
Similarly, both-hits (0.51) were numerically greater than picture-hits (0.46), but unlike the red word test, this effect was not significant on the picture test [t(15) = 1.58, p = .13]. The finding that source confusions were lower on the picture test than on the red word test indicates that subjects had used a distinctiveness heuristic to avoid such errors. Another way to investigate the distinctiveness heuristic is to directly compare false recognition between the two criterial recollection tests. These comparisons indicated that false recognition was lower on the picture test than on the red word test, again suggesting that subjects had used a distinctiveness heuristic to reduce false recognition. First, word-FAs on the picture test (0.14) were lower than picture-FAs on the red word test (0.40), t(15) = 8.82, p < .001. Second, new-FAs on the picture test (0.09) were lower than new-FAs on the red word test (0.17), t(15) = 3.47, p < .01. We also compared hit rates between the two criterial recollection tests. Both-hits were lower on the picture test (0.51) than on the red word test (0.63), t(15) = 2.88, p < .05, consistent with the aforementioned idea that presentation in the inappropriate source was less likely to be confused with criterial recollection on the picture test. Finally, word-hits (WT) did not differ from picture-hits (PT) [means = 0.51 and 0.46, t(15) = 1.12, p = .29], replicating the equivalent hit rates obtained on the standard test.
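The source-confusion effects described above can be recomputed from the reported group means. The following is a minimal sketch for illustration only: the dictionary and function names are hypothetical, and the differences are taken between group means, whereas the reported t and F statistics were computed on per-subject data that are not reproduced here.

```python
# Mean proportions of "yes" responses on the criterial tests, as reported in the text.
RED_WORD_TEST = {"word_hits": 0.51, "both_hits": 0.63, "picture_fas": 0.40, "new_fas": 0.17}
PICTURE_TEST = {"picture_hits": 0.46, "both_hits": 0.51, "word_fas": 0.14, "new_fas": 0.09}


def source_confusion_effects(test, criterial_hits, noncriterial_fas):
    """Return (FA inflation, hit boost) attributable to the noncriterial format.

    FA inflation = FAs to items studied only in the noncriterial format minus new-FAs.
    Hit boost    = both-hits minus hits to items studied only in the criterial format.
    """
    fa_inflation = test[noncriterial_fas] - test["new_fas"]
    hit_boost = test["both_hits"] - test[criterial_hits]
    # Round to two decimals, the precision at which the means were reported.
    return round(fa_inflation, 2), round(hit_boost, 2)


wt = source_confusion_effects(RED_WORD_TEST, "word_hits", "picture_fas")
pt = source_confusion_effects(PICTURE_TEST, "picture_hits", "word_fas")
print("red word test (FA inflation, hit boost):", wt)
print("picture test  (FA inflation, hit boost):", pt)
```

Both difference scores are smaller on the picture test than on the red word test, which is the pattern attributed above to the distinctiveness heuristic.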

Latencies
Response latencies for correct responses (hits and correct rejections) also are provided in Table 1. Due to technical error, latencies were only recorded for 14 of the 16 subjects. The main point to take from these latencies is that subjects were faster to accept targets and to reject lures on the picture test than on the red word test. This effect was in the appropriate direction for all comparisons, but was significant only for criterial hits [Word-hit (WT) > Picture-hit (PT), t(13) = 3.83, p < .01] and noncriterial correct rejections [Picture-FA (WT) > Word-FA (PT), t(13) = 2.32, p < .05]. As in Gallo et al. (2004), these findings suggest that relying on distinctive recollections not only reduced errors on the picture test (relative to the red word test), but also led to quicker hits and correct rejections. We will consider later the relation of these latency differences to our imaging results.
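The latency comparisons are within-subject, so each reported t has n - 1 degrees of freedom (13 for the 14 subjects with recorded latencies, 15 for the full sample of 16). A minimal sketch of such a paired comparison follows; the function name is hypothetical and the example values are toy numbers for illustration, not data from this study.

```python
import math
from statistics import mean, stdev


def paired_t(a, b):
    """Paired-samples t statistic and degrees of freedom for two matched samples."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    # Standard error of the mean difference (sample SD with n - 1 in the denominator).
    # Note: this raises ZeroDivisionError if all differences are identical.
    se = stdev(diffs) / math.sqrt(n)
    return mean(diffs) / se, n - 1


# Toy values only (hypothetical), chosen so the arithmetic is easy to check by hand.
t, df = paired_t([1.0, 2.0, 3.0], [0.0, 0.0, 0.0])
print(f"t({df}) = {t:.2f}")
```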

Imaging Results
Imaging data for the criterial recollection tests are most relevant to the monitoring hypothesis, because on these tests subjects had to search memory for the to-be-remembered information (as opposed to recognition on the standard test, which could have been accomplished via familiarity alone). For this reason, we report only analyses of correct responses on the criterial tests (i.e., correct rejections of lures or correct recognition of targets). Incorrect responses (e.g., false recognition) could reflect either a lack of monitoring or an erroneous monitoring attempt, and thus, are more ambiguous. Also ambiguous are responses to items studied in both formats. By design, these items could have elicited recollection of one or the other presentation formats. We therefore present imaging results only for correct responses to items studied as red words (hits on the red word test and rejections on the picture test), and pictures (hits on the picture test and rejections on the red word test). Activity for hits reflected (in part) the recollection of the to-be-remembered format, and activity for correct rejections reflected the monitoring of this retrieved information under the different test orientations. Correct rejections of new items also were included as a type of baseline activity (i.e., the rejection of relatively unfamiliar items on either test).

Simple Contrasts
To examine neural activity when rejecting familiar lures on the red word test (WT), we first contrasted correct rejections of items that had been studied only as pictures [Picture CRs (WT)] to correct rejections of items that had never been studied [New CRs (WT)]. Activation was considered reliable if the region included at least 5 resampled voxels, at a threshold of p = .001, uncorrected. All regions that were more active when rejecting familiar lures on the red word test are presented in Table 2. These regions were all located in the frontal cortex, including bilateral dorsolateral prefrontal areas (BA 9 and BA 46). These results replicate those of other imaging studies (discussed previously), in that the rejection of familiar lures, at least under these less distinctive conditions, recruited regions that are thought to be involved in postretrieval monitoring processes (e.g., Achim & Lepage, 2005;Rugg et al., 2003;Dobbins et al., 2002;McDermott et al., 2000).
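The reliability criterion above (at least 5 voxels at p = .001, uncorrected) can be illustrated with a small cluster-extent filter. This is a sketch under stated assumptions rather than the SPM99 procedure used here: it assumes that the 5 voxels must be contiguous, that contiguity means sharing a face (6-connectivity), and that the threshold is applied as p < .001. The function name and the dictionary-based map format are hypothetical.

```python
from collections import deque

# Assumption: voxels belong to the same cluster if they share a face (6-connectivity).
NEIGHBOURS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def surviving_clusters(p_map, p_thresh=0.001, min_voxels=5):
    """Return clusters (sets of voxel indices) that pass height and extent thresholds.

    p_map maps (x, y, z) voxel indices to uncorrected p values for one contrast.
    """
    supra = {voxel for voxel, p in p_map.items() if p < p_thresh}
    seen = set()
    clusters = []
    for seed in supra:
        if seed in seen:
            continue
        # Breadth-first flood fill over suprathreshold neighbours.
        cluster = {seed}
        seen.add(seed)
        queue = deque([seed])
        while queue:
            x, y, z = queue.popleft()
            for dx, dy, dz in NEIGHBOURS:
                nb = (x + dx, y + dy, z + dz)
                if nb in supra and nb not in seen:
                    seen.add(nb)
                    cluster.add(nb)
                    queue.append(nb)
        if len(cluster) >= min_voxels:
            clusters.append(cluster)
    return clusters


# Toy map: a run of 5 suprathreshold voxels, a run of 3, and one subthreshold voxel.
toy = {(i, 0, 0): 0.0005 for i in range(5)}
toy.update({(10 + i, 0, 0): 0.0005 for i in range(3)})
toy[(5, 0, 0)] = 0.01
clusters = surviving_clusters(toy)
print(len(clusters), "cluster(s) survive")
```

In the toy map only the run of 5 voxels survives; the run of 3 fails the extent criterion and the isolated voxel fails the height threshold.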
The analogous contrast was performed for the picture test (PT): correct rejections of items studied only as words [Word CRs (PT)] versus correct rejections of items that had never been studied [New CRs (PT)]. This contrast failed to reveal any regions that were more active when rejecting familiar lures (vs. unfamiliar lures) on the picture test, even when the threshold was lowered to p = .01, uncorrected. Unlike the results of the red word test, prefrontal regions thought to be involved in postretrieval monitoring were not more active when rejecting familiar lures on the picture test. The failure to find an effect on the picture test is not due to insufficient power: subjects were more likely to correctly reject lures on the picture test, so more observations contributed to this analysis than to the analysis of the red word test.
To directly compare the two tests, we contrasted activity for the rejection of familiar lures on the red word test (Picture CRs) to the rejection of familiar lures on the picture test (Word CRs). This contrast revealed only two regions that were more active on the red word test (at p = .001): right inferior frontal gyrus (BA 9: 56, 13, 32) and left midfrontal gyrus (BA 8: −48, 13, 35). The reverse contrast revealed no analogous regions that were more active for the picture test. Also, no regions were differentially activated for the correct rejection of new items across the two tests, indicating that these prefrontal activations were not due to general testing differences, but were specific to the monitoring of familiar lures. Collectively, these results suggest that prefrontal regions (including the DLPFC) were less likely to be activated when subjects rejected lures via the distinctiveness heuristic (i.e., on the picture test).

Correct Rejection Conjunctions
One difficulty with the aforementioned analyses was that, by design, the familiar lures had different histories on the two tests (i.e., they corresponded to red words on the picture test, and pictures on the red word test). Thus, differential activation of prefrontal regions across tests could have been due to reduced retrieval monitoring on the picture test, or to different types of recollection elicited by the two different types of studied lures (i.e., noncriterial recollection). Given prior research on the recollection of pictures from verbal cues (e.g., Vaidya, Zhao, Desmond, & Gabrieli, 2002;Wheeler et al., 2000), we find it unlikely that picture recollections (on the red word test) would have elicited more prefrontal activity than red word recollections (on the picture test) in our task. Nevertheless, to circumvent these item-history confounds, we conducted several conjunction analyses using the masking function in SPM99. These analyses are more conservative, in that they only show common regions that are active across several separate contrasts, and they also allow for the controlling of various item-history confounds (as discussed below).
To identify regions that were specifically recruited when rejecting familiar items on the red word test, we examined the common regions that were active in each of the following three contrasts: (1) Picture CRs (WT) > New CRs (WT), (2) Picture CRs (WT) > Word CRs (PT), (3) Picture CRs (WT) > Picture Hits (PT). Because this analysis is more conservative, we set the threshold for each individual contrast to p = .01, uncorrected. Results from the first and second contrasts have already been presented above. The first contrast holds test and response constant (i.e., correct rejections on the red word test), while varying item history (familiar vs. unfamiliar lures). The second contrast holds response constant (correct rejections of familiar lures), while varying the test and item history (rejection of pictures on the red word test vs. rejection of red words on the picture test). The third contrast holds item history constant (test words corresponding to studied pictures), while varying the response and test (rejection of pictures on the red word test vs. acceptance of pictures on the picture test). These three contrasts were selected because, collectively, they control for type of test, response, and item history. Any common activations across all three contrasts should be due to those monitoring processes that are involved when rejecting a familiar lure on the red word test, independent of these other factors.
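The logic of this conjunction can be sketched as the intersection of thresholded contrast maps. The example below is illustrative only and is not the SPM99 masking function: it assumes each contrast is available as a mapping from voxel indices to uncorrected p values, the function name is hypothetical, and the three toy maps stand in for contrasts (1) to (3).

```python
def conjunction(*contrast_p_maps, p_thresh=0.01):
    """Return the set of voxels with p < p_thresh in every supplied contrast map.

    Each map is a dict from (x, y, z) voxel indices to uncorrected p values.
    A voxel missing from any map is treated as not passing that contrast.
    """
    if not contrast_p_maps:
        raise ValueError("at least one contrast map is required")
    masks = [{voxel for voxel, p in m.items() if p < p_thresh} for m in contrast_p_maps]
    return set.intersection(*masks)


# Toy p maps (hypothetical values) standing in for the three contrasts.
contrast_1 = {(0, 0, 0): 0.001, (1, 0, 0): 0.005, (2, 0, 0): 0.200}
contrast_2 = {(0, 0, 0): 0.009, (1, 0, 0): 0.020, (2, 0, 0): 0.001}
contrast_3 = {(0, 0, 0): 0.0001, (1, 0, 0): 0.001}
common = conjunction(contrast_1, contrast_2, contrast_3)
print(sorted(common))
```

Because a voxel must pass every contrast, the joint criterion is stricter than any single contrast at p = .01. The joint false-positive rate is not simply the product of the individual thresholds, however, because the three contrasts share a condition (Picture CRs on the red word test) and are therefore not independent.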
Results from this conjunction analysis are depicted in Figure 1. As was the case with the individual contrasts, all active regions were located in the frontal cortex. These regions included the right DLPFC (BA 9: 56, 16, 32), the left DLPFC (BA 46: −48, 27, 21), a more posterior midfrontal region (BA 8: −48, 11, 35), and a more anterior prefrontal region (BA 10/46: −42, 46, −5). This analysis bolsters the conclusion that retrieval monitoring on the red word test (i.e., the correct rejection of familiar lures) activated several prefrontal regions, including the bilateral DLPFC. The analogous conjunction analysis for the rejection of red words on the picture test revealed no common regions of activation, even with a more liberal threshold for the individual contrasts (p = .05, uncorrected). Again, prefrontal activation was less likely when subjects were using the distinctiveness heuristic to reject familiar lures on the picture test.

Correct Acceptance Conjunctions
Up to this point, we have restricted analyses to those regions that were more active when rejecting familiar lures. Of course, postretrieval monitoring processes also could occur when subjects encounter familiar targets, and subsequently, need to search memory for the to-be-recollected material (cf. Rugg et al., 2003). To explore this issue for hits on the red word test, we conducted a conjunction analysis to identify the regions that were active across each of the following contrasts: (1) Word Hits (WT) > New CRs (WT), (2) Word Hits (WT) > Picture Hits (PT), (3) Word Hits (WT) > Word CRs (PT). The first contrast should reflect regions that are active due to retrieval success and/or postretrieval monitoring processes (i.e., a typical old/new contrast). The second contrast holds response constant (correct acceptance of studied targets), while varying the test and item history (acceptance of words on the red word test vs. acceptance of pictures on the picture test). The third contrast holds item history constant (test words corresponding to studied red words), while varying the response and test (acceptance of red words on the red word test vs. rejection of red words on the picture test). As with the correct rejection conjunction, this analysis should identify those regions that are selectively active whenever subjects monitor retrieval for targets on the red word test, controlling for item histories, response differences, and test differences across the individual contrasts. Unlike the correct rejection conjunctions, though, this conjunction also is potentially sensitive to retrieval success effects (i.e., activations due to successful recollection of red words on the red word test).
Results from this conjunction are presented in Figure 2 (see Table 3 for all coordinates). The first point to notice is that many of the prefrontal regions that were active when correctly rejecting familiar lures on the red word test (i.e., pictures) also were active when correctly accepting targets on the red word test (i.e., red words). These include right prefrontal regions (e.g., 56, 16, 27, BA 44/9; and 51, 11, 41, BA 8), left prefrontal regions (e.g., −51, 19, 32, BA 9), and more anterior prefrontal regions (e.g., −48, 41, 6, BA 46). Further, a direct comparison between this conjunction and the conjunction for correct rejections (WT) revealed that many prefrontal regions were common across the two analyses, including the bilateral DLPFC (BA 9). These results provide additional evidence that retrieval monitoring on the red word test activated the bilateral DLPFC. Finally, note that there were several other active regions in the word-hits conjunction, most notably including the cingulate gyrus (9, 5, 38, BA 24) and the left inferior parietal cortex (−48, −32, 54, BA 40). Activations in the left parietal cortex are often sensitive to old > new effects in recognition experiments, at least for words, and have been argued to reflect the subjective experience of "oldness" or retrieval success (see Buckner & Wheeler, 2001, for review).
To examine whether similar regions were implicated in correct identification of targets on the picture test, we performed the analogous conjunction analysis for the picture test: (1) Picture Hits (PT) > New CRs (PT), (2) Picture Hits (PT) > Word Hits (WT), (3) Picture Hits (PT) > Picture CRs (WT). No common regions were found with this conjunction analysis using the same threshold as in the previous conjunctions, so a more liberal threshold was used (the significance of each contrast was set to p = .05, uncorrected). This conjunction revealed only two common regions of activity, spanning the left parahippocampal and fusiform gyrus (−33, −41, −8, BA 36, 13 voxels; and −27, −49, 8, BA 19, 19 voxels). Left fusiform activation previously has been found during the perceptual processing of the same picture stimuli used here (e.g., Garoff, Slotnick, & Schacter, 2005), and also in experiments, like the current one, where subjects were recollecting perceptual details of studied pictures from word cues at test (e.g., Vaidya et al., 2002; Wheeler, Petersen, & Buckner, 2000; see also Kahn, Davachi, & Wagner, 2004). Thus, the current findings might reflect the recollection of distinctive visual information on the picture test, in response to the verbal test cue, although this conclusion is only tentative because these activations were found using a more liberal threshold. More important is the fact that, even with this more liberal threshold, prefrontal activations were not found while subjects were responding on the picture test.

Time Course of Prefrontal Activity
Collectively, the conjunction analyses indicate that hits and correct rejections were more likely to elicit prefrontal activity on the red word test than on the picture test. To further illustrate this point, the time course of activation in a representative region in the right DLPFC (56, 13, 33, BA 9) is presented in Figure 3. As can be seen from the figure, activity was greater on the red word test than on the picture test, regardless of whether one compared test words corresponding to red words (left panel) or those corresponding to pictures (right panel).
In sum, regardless of the item history (red words or pictures), or the response (hit or correct rejection), dorsolateral prefrontal activity was greater on the red word test than on the picture test. This is not to say that prefrontal regions were not active when rejecting items on the picture test, as simple contrasts revealed that activity was greater in several regions compared to fixation trials, including the bilateral inferior frontal cortex (BA 47/11) and a more dorsal region near the left BA 9. Instead, when the response to studied items was contrasted across tests, DLPFC regions were less likely to be engaged when subjects could base their decisions on the recollection (or not) of distinctive information.

DISCUSSION
Using the criterial recollection task, we found that subjects reduced false recognition when tested for pictures relative to red words, even though the same retrieval cues were used at test (white words). Based on prior work with this task, these findings suggest that subjects had used a distinctiveness heuristic on the picture test. By expecting more distinctive recollections, subjects were better able to avoid source confusions or familiarity-based errors (e.g., "This item probably wasn't presented as a picture, because I'd remember it if it had been."). This difference in retrieval expectations was accompanied by differences in neural activity. When subjects were deciding whether studied items had been presented in the less distinctive source (i.e., the red word test), we found activation in several regions that have been linked to episodic memory retrieval (see Buckner & Wheeler, 2001; Rugg, 2004). These regions included the DLPFC, a region thought to be involved in postretrieval monitoring. Critically, DLPFC regions were not as active when memory was tested for more distinctive recollections (i.e., the picture test), suggesting that the distinctiveness heuristic reduces the need to engage frontally mediated retrieval monitoring processes. Instead, on the picture test, we found activity in the vicinity of the left fusiform/parahippocampal gyrus, a region that has previously been implicated in visual object processing (e.g., Garoff et al., 2005). Because the picture stimuli were not presented at test in our task, the activity that we observed might have reflected reactivation of detailed perceptual memories.
These results, considered along with other research, further elucidate our understanding of the neural correlates of the distinctiveness heuristic. Consistent with the ERP findings of Budson, Droller, et al. (2005), our results indicate that the need to engage in frontally mediated retrieval-monitoring processes is reduced when subjects monitor retrieval for more distinctive recollections. The notion that the distinctiveness heuristic does not tax frontally mediated monitoring processes might explain why healthy older adults can use this process to avoid false recognition as effectively as younger adults (e.g., Dodson & Schacter, 2002; Schacter et al., 1999). Older adults can be impaired in their ability to recollect source-specifying information, which can result in impairments in source memory tasks (e.g., Simons, Dodson, Bell, & Schacter, 2004; Henkel et al., 1998) as well as tasks that require a disqualifying recall-to-reject strategy (e.g., Jacoby, 1999). Older adult impairments in these tasks have been attributed, in part, to frontal dysfunction (e.g., Glisky, 2001), consistent with imaging evidence that these recall-to-reject processes are associated with frontal activation (e.g., Achim & Lepage, 2005; Rugg et al., 2003; McDermott et al., 2000). To the extent that the distinctiveness heuristic does not depend on these same frontally mediated processes, as suggested by the current results, one might not expect it to be impaired by healthy aging.
Figure 3. Time courses of activation (relative to fixation activity) on the red word test and picture test in a representative region in the right DLPFC (56, 13, 33, BA 9). The left panel shows activity for test words corresponding to red words at study; the right panel shows activity for test words corresponding to pictures at study. Bars represent standard error of the mean.
Our results also are compatible with neuroimaging studies in which high levels of DLPFC activity have been observed during false recognition (cf. Cabeza et al., 2001; Schacter, Buckner, Koutstaal, Dale, & Rosen, 1997; Schacter, Reiman, et al., 1996). In these studies, subjects studied lists of semantically associated words or perceptually similar pictures, and later exhibited robust levels of false recognition to related lures. Because distinguishing between studied items and related lures was quite difficult in these experiments, investigators have typically argued that prefrontal activity during false recognition reflects the need for evaluation or monitoring of the strong sense of familiarity produced by related lure items. These ideas fit well with our finding that reduced dorsolateral frontal activity was associated with the use of a distinctiveness heuristic, which eliminated the need for more elaborate monitoring processes in otherwise difficult retrieval conditions.

Response Latency Effects
In addition to reduced errors on the picture test, we found that responses were faster than on the red word test. In fact, all of the contrasts that revealed greater prefrontal activation on the red word test than on the picture test also showed significant response latency differences. We take these latency differences to reflect the use of additional postretrieval monitoring processes on the red word test, such as the setting of familiarity-based response criteria or the search for additional recollective information. Other imaging studies that have reported DLPFC activation under more "effortful" retrieval conditions, or conditions that might be thought to require postretrieval monitoring processes, also have reported analogous latency differences (e.g., Cansino et al., 2002; Cabeza et al., 2001; Henson, Rugg, Shallice, & Dolan, 2000; McDermott et al., 2000). In our study, the use of a distinctiveness heuristic on the picture test apparently minimized the need to engage in additional postretrieval monitoring processes to make memory decisions. Thus, although the distinctiveness heuristic can be characterized as a recollection-based monitoring process, it is one that is relatively fast acting and requires minimal frontally based resources. When subjects expect to retrieve more distinctive information, it is easier to decide that nonstudied events fail to elicit such recollections. These latency findings raise an alternative explanation of our findings, one that is more specific to neuroimaging methodology. Because the red word and picture conditions differed in response latency, the resulting difference in activity may have been due to longer processing time, as opposed to additional postretrieval monitoring processes. We do not believe that a time-on-task explanation of the current results is appropriate.
First, the main prefrontal regions in question (DLPFC, BA 9/46) were consistently activated in analyses thought to be sensitive to monitoring processes, including both of the red word test conjunction analyses, whereas patterns of activation in other regions (e.g., BA 8; BA 40) were less consistent across analyses. This pattern argues against a general time-on-task account of all activity. Second, on the red word test, there were no latency differences for the old > new contrast [red word hits (1458 msec) vs. new CRs (1410 msec), t(13) < 1], even though significant bilateral DLPFC activity was obtained with this contrast. Thus, although slower latencies were generally indicative of additional monitoring processes, such latency differences were not necessary to obtain DLPFC activations in this experiment (see Achim & Lepage, 2005, for analogous results). Finally, at the theoretical level, it is unclear what alternative cognitive processes would have caused latency differences for all of the contrasts that we considered, if not different search and decision processes associated with the to-be-recollected information on each test.

Laterality Effects
Another interpretative issue is whether the DLPFC activations found here reflect retrieval monitoring, in general, or only those search and decision processes that are specific to the recollection of verbal stimuli. This issue is difficult to resolve with present methodology, because monitoring differences were elicited by manipulating the to-be-recollected information (pictures or red words). The fact that the DLPFC activations obtained here were bilateral, instead of left-lateralized, provides some indication that these effects were not based purely on the retrieval monitoring of verbal information, as does the fact that verbal labels were presented for all stimuli (both pictures and red words) at both study and test. Further, Johnson, Raye, Mitchell, Greene, and Anderson (2003) have found that right DLPFC activity (near BA 9) can be elicited by either word or picture processing in a short-term retrieval task, and Wheeler and Buckner (2003) have provided evidence that activity in a variety of prefrontal regions tracked conditions requiring controlled recollective search for both picture and sound information (paired with verbal labels). Both of these results point to some level of generality across the types of to-be-recollected information in DLPFC regions. However, some material-specific differences were found in these studies (e.g., different regions of right DLPFC for the retrieval of different types of stimuli). Additional studies that target different types of stimulus distinctiveness are necessary to address whether PFC regions involved in monitoring processes can be subdivided on the basis of the types of to-be-recollected information.
As discussed earlier, several other studies have found bilateral DLPFC activation under conditions where the memory decision was difficult, and thus required additional postretrieval monitoring processes (e.g., Cansino et al., 2002). In some of these studies, different retrieval functions were attributed to left and right prefrontal regions. Although precise theoretical distinctions vary, a recurring theme is that left prefrontal regions are more active when specific or detailed information is retrieved and/or monitored, whereas right prefrontal regions are more active when vaguely recollected or only familiar information is retrieved/monitored (e.g., Dobbins, Simons, et al., 2004; Mitchell et al., 2004; Wheeler & Buckner, 2004; Dobbins, Rice, et al., 2003; Kensinger, Clarke, & Corkin, 2003; Ranganath et al., 2000; Henson, Rugg, Shallice, Josephs, & Dolan, 1999). The present results do not directly speak to this dichotomy. Although picture recollections were more distinctive than red word recollections, on each of the criterial recollection tests there was behavioral evidence that subjects (1) had monitored memory for detailed recollective information (red words or pictures) and (2) were influenced by familiarity effects on false recognition. Given these two considerations, the additional monitoring processes engaged on the red word test (relative to the picture test) may have involved the processing of both specific (red word memories) and vague (familiarity) information. Our results do, however, speak to a functional distinction between retrieval success and retrieval monitoring (see Rugg, 2004). The fact that we found differences in bilateral DLPFC activation in contrasts where item history (and hence potential recollective content) was held constant argues against a retrieval success account of these effects, and instead favors a retrieval monitoring explanation (e.g., Rugg et al., 2003).

Preretrieval versus Postretrieval Monitoring
One last theoretical issue to consider is the distinction between preretrieval orientation and postretrieval monitoring. Budson, Droller, et al. (2005) and Herron and Rugg (2003) found differences in activity for new (nonstudied) items across two testing conditions (picture study versus word study). In these studies, which used ERPs, such effects were interpreted as a global shift in preretrieval orientation that had a similar effect on all item types. In contrast, we failed to find differences in neural activity for new items across testing conditions, which is inconsistent with a preretrieval orientation interpretation. Other than differences in imaging techniques, there were two important task differences between the present study and these others. In these other studies, the same visual words used at study were repeated at test in the word condition, and a recall-to-reject exclusion strategy was possible in each condition. Either of these factors might have influenced the retrieval orientation or strategies used by subjects across the two conditions. These difficulties were avoided in the present study, and differences in neural activity between testing conditions were obtained only when highly familiar (studied) items were compared. This pattern of activity is more consistent with a postretrieval monitoring interpretation, in which some degree of memory retrieval was required before monitoring processes were engaged. We did find that false recognition of new lures was lower on the picture test, suggesting that a small percentage of these lures was sufficiently familiar to elicit postretrieval monitoring processes, but these trials were infrequent and not detected in our imaging analyses.
This is not to say that retrieval orientation did not differ across testing conditions in our study, or that other measures of orientation could not reveal differences in prefrontal activity in the present task (e.g., state/item designs, Velanova et al., 2003; Donaldson, Petersen, Ollinger, & Buckner, 2001). Rather, our point is that the pattern of DLPFC differences that we observed is more consistent with a postretrieval monitoring interpretation. This conclusion also is consistent with behavioral evidence that a preretrieval orientation is not necessary for the use of a distinctiveness heuristic. Using source memory tests, in which subjects had to simultaneously choose between the different formats (e.g., "both," "red word," "picture," or "new"), and thus, were unlikely to use a preretrieval orientation for only one of the formats, Gallo et al. (2004) found evidence for the use of the distinctiveness heuristic that was similar to that obtained using criterial recollection tests. These results indicate that it is the difference in recollective expectations for words and pictures, as opposed to a preorientation towards one type of information, that is critical for the use of the distinctiveness heuristic. In line with the current imaging results, we propose that these recollective expectations play an important role when only familiarity (or noncriterial recollection) is retrieved. If the to-be-recollected information is less distinctive (e.g., words), then additional frontally mediated monitoring processes are initiated to help make the memory decision. If the sought-after information is more distinctive (e.g., pictures), then an initial recollection failure leads to immediate rejection.
In conclusion, we note that although we have used picture presentation to manipulate distinctiveness, diagnostic monitoring processes that take advantage of recollective expectations should generalize across different manipulations of distinctiveness. For instance, in the behavioral literature, source monitoring errors are less likely to occur for sources involving more elaborate cognitive operations (Johnson, Raye, Foley, & Foley, 1981), and false recognition is less likely to occur for more emotional events (e.g., Kensinger & Corkin, 2004). To the degree that these manipulations engender distinctive recollections, the need to recruit postretrieval monitoring processes for memory decisions (and corresponding activity in the DLPFC) should be reduced. To test this prediction, additional studies are needed that compare neural activity across retrieval conditions that differ only in the distinctiveness of the to-be-recollected events, while controlling for the potentially confounding effects of retrieval success.

Subjects
Twenty right-handed English-fluent volunteers, recruited from the student community, participated for $50. Data from four subjects were unusable (two due to insufficient behavioral performance, and two due to equipment failure), so data from 16 subjects (mean age = 19.8, range 18-23, 12 women) were included in the final analyses. In addition, 17 undergraduates participated in behavioral pilot testing for $10 or course credit. All subjects gave informed consent using methods approved by the appropriate human subjects committees at Massachusetts General Hospital and Harvard University.

Materials
Stimuli were 360 colored pictures of recognizable objects (e.g., dragon, telescope), cropped on white backgrounds, and corresponding verbal labels presented in large red font. To ensure proper identification of pictures, all study stimuli (pictures or red words) were preceded by the corresponding verbal label presented in smaller black font. Each black word was presented for 250 msec, immediately followed by the same word in larger red letters (1 sec) or the corresponding picture (1.5 sec), which were separated from the next black word by a 400-msec interstimulus interval. Pictures were studied for a longer duration to ensure distinctive encoding of their features. At test, a 6-sec prompt indicated the instructions for the upcoming test block (i.e., standard, red word, or picture). On each test, verbal labels were used as recognition memory cues. Labels were presented in white uppercase font on a black background, along with the appropriate test prompt as a reminder (i.e., "studied?" for the standard test, "red word?" for the red word test, and "picture?" for the picture test). Twelve counterbalancing conditions were created to rotate the stimuli, across subjects, through the studied conditions (studied as red word, picture, both, or neither) and test conditions (standard test, red word test, or picture test). After the initial counterbalancing was met, the remaining subjects were arbitrarily assigned to counterbalancing conditions (no condition occurred more than twice). All stimuli were back-projected onto a screen in the scanner bore, and participants viewed them through an angled mirror that was attached to the head coil.
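One way to picture the counterbalancing is as a rotation of stimulus sets through the 12 cells formed by crossing the four study conditions with the three test conditions. The Python sketch below is only an illustration under assumptions that are not stated in the text: that the 360 stimuli were split into 12 fixed sets of 30, and that a simple cyclic rotation was used. The names `STIMULI` and `assign_cells` are hypothetical.

```python
from itertools import product

STUDY_CONDITIONS = ["red word", "picture", "both", "new"]
TEST_CONDITIONS = ["standard", "red word test", "picture test"]

# Crossing 4 study conditions with 3 test conditions gives 12 cells.
CELLS = list(product(STUDY_CONDITIONS, TEST_CONDITIONS))

# Placeholder stimulus labels (hypothetical); the real labels were object names.
STIMULI = [f"item{i:03d}" for i in range(360)]
SET_SIZE = len(STIMULI) // len(CELLS)  # 30 stimuli per set


def assign_cells(version):
    """Map each stimulus to a (study, test) cell for one counterbalancing version.

    Assumes a cyclic rotation: stimulus set k is placed in cell (k + version) mod 12.
    """
    assignment = {}
    for index, stimulus in enumerate(STIMULI):
        set_index = index // SET_SIZE
        assignment[stimulus] = CELLS[(set_index + version) % len(CELLS)]
    return assignment


# One assignment per counterbalancing version (0 through 11).
versions = [assign_cells(v) for v in range(len(CELLS))]
```

Under this rotation, every stimulus appears in each study-by-test cell exactly once across the 12 versions, and every version contains 30 items per cell, which matches the 30 items per item type and test reported in the task procedures.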

Task Procedures
All subjects first completed a practice version of the experiment (approximately 10 min), outside the scanner, to ensure that all of the study and test procedures were understood. None of the 24 practice stimuli was used in the main experiment. The study phase of the main experiment occurred during structural scans (approximately 25 min), and was immediately followed by the recognition tests. During study, subjects were instructed to remember 270 words and pictures for the upcoming tests. One-third of the study stimuli (90) were presented as red words, one-third were presented as pictures, and one-third were presented as both red words and pictures (both items). To equate recognition memory for red words and pictures, each red word was repeated three times (for both items and for items presented as red words only) and each picture was presented once. Repetitions of red words, as well as the presentation of red words and pictures for both items, were distributed throughout the study phase. The presentation order of all study stimuli was randomly mixed, with the exception that an equal number of stimuli from the beginning, middle, and end of the study phase were subsequently tested in each of the three test runs.
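As a rough check on the study design, the number of study presentations and their summed duration can be worked out from the values given in the Materials and in this section. The Python sketch below assumes that every presentation, including each red word repetition, was preceded by its 250-msec black label and followed by the 400-msec interstimulus interval; that assumption, and all variable names, are illustrative only.

```python
# Durations in milliseconds, taken from the Materials description.
LABEL_MS = 250       # black verbal label
RED_WORD_MS = 1000   # red word
PICTURE_MS = 1500    # picture
ISI_MS = 400         # interstimulus interval

ITEMS_PER_TYPE = 90          # red word only, picture only, and "both" items
RED_WORD_REPETITIONS = 3
PICTURE_REPETITIONS = 1

# Red-word-only items and "both" items each contribute red word presentations;
# picture-only items and "both" items each contribute picture presentations.
red_word_presentations = 2 * ITEMS_PER_TYPE * RED_WORD_REPETITIONS
picture_presentations = 2 * ITEMS_PER_TYPE * PICTURE_REPETITIONS
total_presentations = red_word_presentations + picture_presentations

total_ms = (
    red_word_presentations * (LABEL_MS + RED_WORD_MS + ISI_MS)
    + picture_presentations * (LABEL_MS + PICTURE_MS + ISI_MS)
)
total_minutes = total_ms / 60000

print(total_presentations, "presentations,", round(total_minutes, 1), "min")
```

Under these assumptions there are 720 study presentations (540 red words and 180 pictures) lasting about 21.3 min in total, somewhat less than the approximately 25 min of structural scanning during which the study phase took place.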
Functional scans were acquired during the test phase, which was divided into three test runs (10 min 12 sec per run). Each run was divided into three test blocks, corresponding to each of the three types of test, with each block separated by 21 sec of fixation. The order of the test blocks was varied across runs and counterbalanced across subjects. During each test block, subjects saw 10 test words corresponding to each type of studied item (red word, picture, both, or new). Each test word was presented for 3 sec, and test words were separated by a central fixation cross of jittered duration (3, 6, or 9 sec, mean SOA = 3.83 sec). The order of item types and fixation durations was mixed by a sequencing program designed to maximize the MR signal (e.g., Dale, 1999). In total, across all three test runs, there were 30 items of each type (red word, picture, both, new) in each of the three testing conditions (standard, red word, and picture test).
At test, subjects responded with the index ("yes") and middle ("no") fingers of their right hand while the test word was on the screen. On the standard test, they were to press "yes" for any test word that corresponded to a studied stimulus, regardless of the format in which the stimulus was studied (i.e., red word, picture, both items). They were to press "no" for those test words that did not correspond to a studied stimulus (i.e., new items). On the red word test, they were to press "yes" if they remembered studying a corresponding red word (i.e., red word and both items), and "no" if not, regardless of whether they remembered a corresponding picture (i.e., picture and new items). On the picture test, they were to press "yes" if they remembered studying a corresponding picture (i.e., picture and both items), and "no" if not, regardless of whether they remembered a corresponding red word (i.e., red word and new items). It was made clear to subjects that, because some items were studied in both formats, the recollection of one format (e.g., a picture) did not preclude presentation in the other format (e.g., a red word). Thus, whether they could remember a picture was irrelevant for the red word test, and vice versa for the picture test. Instead, on these criterial recollection tests, they were to focus only on whether they could recollect the to-be-remembered format.
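The response rules for the three tests can be summarized as a small lookup. The following Python sketch restates the instructions above and scores a trial in standard recognition terms; the function names `correct_response` and `score_trial` are hypothetical and are not part of the original procedure.

```python
# Study formats for which "yes" is the correct response on each test.
YES_FORMATS = {
    "standard": {"red word", "picture", "both"},
    "red word": {"red word", "both"},
    "picture": {"picture", "both"},
}

STUDY_FORMATS = {"red word", "picture", "both", "new"}


def correct_response(test_type, study_format):
    """Return the correct response ("yes" or "no") for a test word."""
    if study_format not in STUDY_FORMATS:
        raise ValueError(f"unknown study format: {study_format}")
    return "yes" if study_format in YES_FORMATS[test_type] else "no"


def score_trial(test_type, study_format, response):
    """Classify a response as a hit, miss, false alarm, or correct rejection."""
    target = correct_response(test_type, study_format) == "yes"
    if response == "yes":
        return "hit" if target else "false alarm"
    return "miss" if target else "correct rejection"


# A word studied only as a red word is a lure on the picture test.
print(correct_response("picture", "red word"))    # no
print(score_trial("picture", "red word", "no"))   # correct rejection
```

On the two criterial recollection tests, items studied in the noncriterial format are familiar lures, so a "no" response to them counts as a correct rejection even though the item was studied.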

Image Acquisition and Data Analysis
Images were acquired on a 3-T Siemens Allegra head-only MRI scanner. Detailed anatomic data were acquired using a multiplanar rapidly acquired gradient-echo (MP-RAGE) sequence. Functional images were acquired using a T2*-weighted echo-planar imaging (EPI) sequence (TR = 3000 msec, TE = 30 msec, FOV = 200 mm, flip angle = 90°). To allow whole-brain coverage, 21 axial-oblique slices (5 mm thickness, 1 mm skip between slices), aligned along the anterior commissure/posterior commissure line, were acquired in an interleaved fashion. All preprocessing and data analysis were conducted within SPM99 (Wellcome Department of Cognitive Neurology). Standard preprocessing was performed on the functional data, including slice-timing correction, rigid body motion correction, normalization to the Montreal Neurological Institute template (resampling at 3 mm³ voxels), and spatial smoothing (using an 8-mm full-width half-maximum isotropic Gaussian kernel).
For each participant, and on a voxel-by-voxel basis, an event-related analysis was first conducted in which all instances of a particular event type were modeled through convolution with a canonical hemodynamic response function. Event types reflected a combination of the retrieval test condition (red word test or picture test) and the participant's memory performance (correct rejection, hit). All participants had at least 10 instances of each event type included in the analyses. Effects for each event type were estimated using a subject-specific, fixed-effects model. These data were then entered into a second-order, random-effects analysis. Voxel coordinates are reported in Talairach and Tournoux (1988) coordinates and reflect the most significant voxel within the cluster of activation. Event-related time courses were extracted from active clusters by creating regions of interest (ROI) as 8-mm spheres using the ROI toolbox implemented in SPM99.
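The modeling step described above, in which event onsets are convolved with a canonical hemodynamic response function, can be sketched in a few lines of Python. This is a minimal illustration, not the SPM99 implementation: the double-gamma shape (a response with gamma shape 6 minus an undershoot with gamma shape 16 scaled by 1/6), the 1-sec sampling grid, and the example onsets are all assumptions of this sketch, and the function names are hypothetical.

```python
import math


def gamma_pdf(t, shape):
    """Gamma density with unit scale; defined as 0 for t <= 0."""
    if t <= 0:
        return 0.0
    return t ** (shape - 1) * math.exp(-t) / math.gamma(shape)


def canonical_hrf(dt=1.0, length=32.0):
    """Double-gamma HRF sampled every dt seconds (assumed parameters)."""
    n = int(length / dt) + 1
    return [gamma_pdf(i * dt, 6) - gamma_pdf(i * dt, 16) / 6 for i in range(n)]


def event_regressor(onsets, n_samples, dt=1.0):
    """Convolve unit impulses at the event onsets with the HRF."""
    hrf = canonical_hrf(dt)
    regressor = [0.0] * n_samples
    for onset in onsets:
        start = int(round(onset / dt))
        for j, h in enumerate(hrf):
            if 0 <= start + j < n_samples:
                regressor[start + j] += h
    return regressor


hrf = canonical_hrf()
onsets = [10.0, 40.0]  # hypothetical onsets (in seconds) for one event type
regressor = event_regressor(onsets, n_samples=80)

# Resample to the scan times; TR = 3 sec in the acquisition described above.
regressor_at_tr = regressor[::3]
```

One such regressor would be built per event type (for example, hits and correct rejections on each test), and the set of regressors would then be fit to each voxel's time series to estimate the effect of each event type.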