Orthographic and Phonological Effects in the Picture–word Interference Paradigm: Evidence From a Logographic Language

One important finding with the picture–word interference paradigm is that picture-naming performance is facilitated by the presentation of a distractor (e.g., CAP) formally related to the picture name (e.g., “cat”). In two picture-naming experiments we investigated the nature of this form facilitation effect with Mandarin Chinese, separating the effects of phonology and orthography. Significant facilitation effects were observed both when distractors were only orthographically and when they were only phonologically related to the targets. The orthographic effect was overall stronger than the phonological effect. These findings suggest that the classic form facilitation effect in picture–word interference is a mixed effect with multiple loci: it cannot be attributed merely to the nonlexical activation of the target phonological segments from the visual input of the distractor. It seems instead that orthographically only related distractors facilitate the lexical selection process of picture naming, and phonologically only related distractors facilitate the retrieval of target phonological segments.

Bi et al.: Picture-word interference

The picture-word interference paradigm, a variant of the Stroop task (1935), has been widely used in psycholinguistic research, especially in the field of spoken word production (Glaser & Düngelhoff, 1984; Glaser & Glaser, 1989). In this paradigm, participants are required to name pictures that have distractor words superimposed upon them. Two kinds of picture-distractor relationships have been found to affect picture-naming performance. When the distractor word (e.g., DOG) belongs to the same semantic category as the picture (e.g., "cat"¹), it takes longer to name the picture than when the distractor word is unrelated to the target (e.g., PEN). This has been called the semantic interference effect. When the distractor (CAP) is related to the picture name ("cat") by phonological properties, the picture is named more quickly than when it is accompanied by an unrelated distractor. This is commonly referred to as the phonological facilitation effect. The dominant interpretation of these two effects is that they reflect different processing levels of picture naming. The semantic interference effect is the result of competition at the lexical selection stage, and the phonological facilitation effect arises from the priming of the target phonological nodes by the distractor (e.g., Meyer & Schriefers, 1991; Posnansky & Rayner, 1978; Schriefers, Meyer, & Levelt, 1990; Starreveld & La Heij, 1995). Based on such assumptions, the paradigm has been used to develop various theories of lexical access, concerning both the organization and the dynamics of speech production.
However, the interpretations of both the semantic and the phonological effects are still controversial. For example, there is disagreement about the locus of the semantic interference effect: it has been argued that this effect does not reflect competition at the stage of lexical selection but interference at the stage of response selection (Costa, Mahon, Savova, & Caramazza, 2003; Finkbeiner & Caramazza, 2006; Janssen, Schirm, Mahon, & Caramazza, 2008; Mahon, Costa, Peterson, Vargas, & Caramazza, 2007; Miozzo & Caramazza, 2003). As for the phonological facilitation effect reported in the literature, it remains controversial whether this effect is the result of the priming of target phonological segments or of facilitation at earlier stages of target production (e.g., Damian & Martin, 1999; Roelofs, Meyer, & Levelt, 1996; Starreveld, 2000; Starreveld & La Heij, 1995). More fundamentally, there has always been a crucial confound in the study of the phonological effect in picture-word interference. Because almost all studies were conducted in alphabetic languages with medium to high grapheme-phoneme correspondence, the "phonological" distractor is also similar to the target word in visual form (consider CAP and "cat"). It is therefore unclear whether the facilitation effect produced by such distractors should be attributed to the phonological relatedness or to the orthographic relatedness between target and distractor. Although Lupker (1982) conducted experiments to examine the contribution of orthographic versus phonological relatedness of the distractor to the target, his study has not received much attention, and the theoretical implications of the results have not been considered in depth (but see Roelofs et al., 1996). Instead, researchers have focused on the phonological aspect of the relationship and assumed that the facilitation is an output effect resulting from the priming of the target phonological nodes (e.g., Schriefers et al., 1990; Starreveld & La Heij, 1995).²
Does the confounding of orthographic and phonological relatedness matter in the interpretation of the mechanism responsible for the observed facilitation effects? Detailed analyses of how an orthographically and/or phonologically related distractor may affect the target-naming process are presented below, using the pairs "cat"/KEY and "cat"/CELL as examples of phonological and orthographic relatedness, respectively.
Two kinds of processes need to be considered to determine how a distractor word may affect picture naming: the word perception process and the picture-naming process. It is widely accepted that the picture-naming process involves at least the following stages: concept activation, lexical selection, and phonological encoding. This generic model will be used as a guide in our current discussion. Although the received view of lexical access is that the lexical layer is further divided into a lemma layer, which specifies the syntactic properties of a word, and a lexeme layer, which specifies the syntactically determined morphemes (e.g., Bock & Levelt, 1994; Dell, 1986; Garrett, 1980; Levelt, 1989; Roelofs, 1992, 1997), this view has been contested (Caramazza, 1997; Caramazza & Miozzo, 1997; Caramazza, Costa, Miozzo, & Bi, 2001). There is also an unresolved controversy about the dynamics of the access process: whether activation flows between layers in a discrete (e.g., Levelt, Roelofs, & Meyer, 1999), cascading (e.g., Caramazza, 1997), or interactive fashion (e.g., Dell, 1986). The consequences of adopting a distinction between a lemma and a lexeme level, and of adopting feedback connections, for the interpretation of the effects of distractors on picture naming will be discussed in the General Discussion.
On the word perception side, a written word is assumed to activate its orthographic, semantic, and phonological representations. The details of the activation flow among these representations have received much attention but remain controversial. Many models have been proposed, including logogen models (Morton, 1969), serial search and verification models (Forster, 1976), interactive activation (McClelland & Rumelhart, 1981), fuzzy logic models (Massaro & Cohen, 1991), and so on. Here, the one assumption we are committed to is that of spreading activation. It is assumed that the lexical orthographic representation of the distractor word is always activated by the visual input. From the orthographic representation, the lexical phonological representation receives activation either through its semantic representation (e.g., Hillis & Caramazza, 1995) or by direct lexical mapping between orthographic and phonological representations (e.g., Bub, Cancelliere, & Kertesz, 1985). The semantic representation also receives activation, either directly from the lexical orthographic representation (e.g., Coltheart, 1978) or indirectly via phonology (e.g., Lukatela & Turvey, 1994a, 1994b). The phonological segments that compose the word are activated both by the lexical phonological representation and through nonlexical grapheme-phoneme conversion (GPC) from the visual input. Although there is much debate in the literature on reading concerning the detailed timing and routes of activation among these representations, the central issue here is how the picture-naming process might be affected by a visual word input, namely, where and how the contact(s) between word- and picture-based processes occur.
Consider a distractor word that is phonologically, but not visually, related to the target (KEY for "cat"). There are at least two ways in which the presentation of KEY may affect the naming of a picture of a "cat." One is that, upon seeing the word distractor, its lexical phonological representation /ki:/ is activated, either directly (Route P, Figure 1) or via its semantic representation. This lexical phonological representation, in turn, sends activation to its phonological segments (/k/, /i:/), parts of which (e.g., /k/) may be shared by the target. The other is that the distractor can prime the target phonological segments through the GPC process. For instance, the grapheme "k" in KEY activates the phoneme /k/ through the GPC process, leading to the facilitation of phonological encoding of the target (Route G, Figure 1).
Figure 1. How a phonological distractor word (e.g., KEY) and an orthographic distractor (e.g., CELL) affect production ("cat"). Note: The connection between the corresponding items in the orthographic lexicon and the phonological lexicon could either be direct or via the conceptual system (see text). A direct line is drawn for the sake of simplicity. Orthographic lexical item CELL, once activated by the visual input, also activates its own semantic representation and phonological lexical representation. Only the activation that influences the target ("cat") is depicted. GPC, grapheme-phoneme conversion.
Now consider an orthographically related word that does not share any phonology with the target, for example, the distractor CELL for the picture of "cat." Upon seeing the visual input CELL, the orthographic representations of all visual neighbors (CELL, CEILING, CALL, CAR, etc.) should be activated, including the target CAT (McClelland & Rumelhart, 1981). The phonological lexical nodes of the activated orthographic representations are then activated either directly through orthographic to phonological lexical connections or through the semantic system (lexical orthographic to semantic to lexical phonological). Therefore, the target phonological lexical node "cat" is primed by the presentation of a visually similar distractor CELL (Route O, Figure 1). In some cases, the orthographic distractors can prime target phonological segments through the GPC process; in the present example, the grapheme "c" in CELL might activate the phoneme /k/, which is also part of the target's phonological content (Route G, Figure 1).
The point of this analysis is that there are multiple potential routes that could be responsible for the facilitation effect observed with phonologically and orthographically related distractors (e.g., CAP for "cat"). An important study that shed light on the contributions of these possible routes was conducted by Lupker (1982). In two picture-word experiments conducted in English, Lupker tried to distinguish the contributions of the orthographic and phonological relatedness of the distractor to the processing of the target. He found that both phonologically (but not orthographically) related distractors and orthographically (but not phonologically) related distractors facilitated picture naming. In Experiment 1, each picture (e.g., "bear") was paired with a distractor with similar orthography and different phonology (e.g., YEAR), or an unrelated distractor (e.g., WORK), or nonwords with different degrees of orthographic similarity (e.g., XXXT, XXR, or DFRP). In Experiment 2, the picture targets (e.g., "plane") were paired with distractors of similar phonology and different spelling (e.g., BRAIN), distractors sharing both sound and spelling (e.g., CANE), unrelated words, and nonwords with orthographic and phonological properties parallel to the word conditions. Lupker's (1982) major findings were the following: (a) phonological distractors facilitated picture naming (by 23 ms in Experiment 2); (b) orthographically similar distractors facilitated picture naming (by 56 ms in Experiment 1); and (c) when both orthography and phonology were shared between distractor and picture name, the magnitude of the facilitation effect (55 ms in Experiment 2) was similar to that found when only orthography was shared (56 ms in Experiment 1). As discussed above, an orthographic distractor ("bear"-YEAR) may affect either the target lexical node or its phonological segments through GPC processes. The fact that facilitation from these kinds of distractors was indeed observed suggests that either or both of these processing layers are affected in the picture-word naming task. Similarly, the observed facilitation effect from phonologically related distractors ("plane"-BRAIN) suggests that the phonological content of the target is primed.
However, we need to view these results with caution. Because it is difficult to disentangle orthography and phonology in English, the orthographically related pairs sometimes also share phonological properties and the phonologically related pairs sometimes share orthographic properties. In addition, the item sets in the experiments were rather small: 12 pictures in Experiment 1 and 9 pictures in Experiment 2. Because item analyses were not performed in the study, it is not clear how reliable the results are over different items. Furthermore, the observation that the magnitude of the orthographic facilitation effect is similar to that of the orthographic plus phonological effect, and is larger than that of the phonological facilitation effect, is based on "eyeball" comparisons across two experiments with different stimuli and participants. Thus, it is not obvious that these differences in the magnitude of effects are interpretable.
The primary goal of Weekes et al.'s (2002) study was to investigate the locus of the semantic interference effect by comparing it with the orthographic/phonological effects. Like Lupker (1982), they found that both orthographically only and phonologically only related distractors produced significant effects. However, contrary to Lupker's study, the magnitudes of the two effects were comparable, and the phonological and the orthographic effects were additive. Zhou et al. (2003) reported that the orthographic effect was larger than the phonological effect, although no direct statistical comparison was carried out between these two conditions. The interaction between these two types of effects was not assessed. There are certain methodological limitations in these studies, however. Weekes et al. (2002) did not control the visual complexity of distractors in different conditions nor the degree of orthographic/phonological similarity across conditions (B. S. Weekes, personal communication, 2008). Zhou et al. (2003) constructed the unrelated condition by pairing the orthographic distractors with targets. Hence, the three types of distractors (phonologically, orthographically, and semantically related distractors) were presented in the experiment an unequal number of times: orthographically related distractors were presented twice as often as the other distractors. This makes a direct comparison of the phonological and orthographic distractors problematic, because the fact that orthographic distractors led to less interference than phonological distractors in picture naming may be an effect of distractor repetition and not a difference between phonology and orthography. This difference in distractor repetition might temporarily change the activation level of distractor words. In addition, given that lexical activation levels modulate the word interference effect in the picture-word naming paradigm (see the distractor word frequency effect; Miozzo & Caramazza, 2003), the orthographic and the phonological effects might be contaminated by a possible distractor word repetition effect.
More critically for the theoretical interpretations of the orthographic/phonological effects, in these two studies the authors did not explicitly manipulate or control for the potential GPC factor. Although Chinese has highly opaque symbol-sound correspondences, it has been proposed that GPC or a GPC-like mechanism is not completely absent in Chinese. Over 80% of modern Chinese characters are so-called "compound characters" composed of a "semantic radical" and a "phonetic radical" (Perfetti & Tan, 1998; Zhu, 1988). The semantic radical relates to the meaning, typically the semantic category of the character. The phonetic radical, which is usually also a Chinese character by itself, provides cues to the pronunciation of the whole character, although these cues are often unreliable. The position of the phonetic radical is not fixed either. Studies have shown that in reading such compound characters, the phonological properties of the phonetic radicals are automatically activated and influence reading performance (e.g., Bi, Han, Shu, & Weekes, 2007; Hue, 1992; Law & Wang, 2005; Lee et al., 2004; Lee, Tsai, Su, Tzeng, & Huang, 2005; Peng, Yang, & Chen, 1994; Seidenberg, 1985; Shu & Zhang, 1987; Weekes & Chen, 1999; Yin & Butterworth, 1992; Zhou & Marslen-Wilson, 1999b; but see Coltheart, Rastle, Perry, Langdon, & Ziegler, 2001). It remains controversial whether such a GPC-like procedure, which is often referred to as a "sublexical" mechanism, is nonlexical or lexical in nature (see Zhou & Marslen-Wilson, 1999b). For our current purposes what is relevant is whether orthographic and/or phonological distractors might affect target production without going through their corresponding lexical representations. In other words, the point is whether procedures similar to Route G in Figure 1 can be applied in the Chinese experiments. If so, this would influence the interpretation of the effects in the following ways. If the orthographic distractors are compound characters and the phonetic radical is phonologically similar to the target name, the orthographic effect could have resulted from sublexical processes (Route G, Figure 1) alone, or from the combination of the sublexical and the "lexical" routes (Route O). Similarly, if phonological distractors are compound characters and the phonetic radicals in these characters are phonologically similar to the target name, then the phonological effect could have resulted from the sublexical route (Route G, Figure 1) alone or from the combination of the sublexical (G) and the "lexical phonology" routes (Route P). In light of these considerations, it is important to control for such potential GPC-like sublexical origins of the phonological/orthographic effects even in experiments using Chinese, and this is what we did here.
In this article, we report two experiments using Mandarin Chinese to further investigate the mechanisms responsible for the putative phonological/orthographic facilitation effect. We paired each target picture with an unrelated distractor or a distractor that is related to the picture's name only orthographically, only phonologically, or both orthographically and phonologically. Critically, the distractors were constructed so as to minimize the application of GPC-like mechanisms. The rationale is as follows: if the phonological facilitation effect is merely because of the activation of the phonological segments through GPC processes (Route G, Figure 1), we should not find facilitation effects in any of the related conditions, because the nonlexical symbol-sound correspondence is not applicable for our Chinese stimuli; the target phonemic segments do not receive direct activation via GPC from the visual input of the written distractor. If we do observe a facilitation effect from the phonologically only related distractors, it could only arise through the "lexical phonology route" (Route P). This is because the phonologically only related distractors (e.g., <kettle>, /hu2/) share with the target items (e.g., <fox>, /hu2-li0/) nothing but phonological properties. Furthermore, there is no visual information in these distractors (<kettle>) that could provide cues for the target sound (/hu2/) through any nonlexical GPC process. A possible explanation for such an outcome is that the visual form activates the lexical phonological representation (<kettle>, /hu2/), which, in turn, activates the phonetic segments shared with the target (<fox>, /hu2-li0/). If we observe a facilitation effect from a distractor that is related to the target only orthographically (e.g., <quack>, /gua1/) and if the contribution of the GPC process is ruled out, we would have to attribute the cause of such an effect to processes internal to the "lexical route" (Route O).
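The rationale can be restated as a small lookup of which routes in Figure 1 remain open for each distractor type. The following is only an illustrative sketch: the function names, the boolean `gpc_cue` flag, and the dictionary of conditions are hypothetical conveniences introduced here, not part of the original materials or of any published model implementation.

```python
from typing import Dict, Set


def available_routes(orth_related: bool, phon_related: bool, gpc_cue: bool) -> Set[str]:
    """Routes (labels from Figure 1) through which a distractor could help naming.

    gpc_cue is a hypothetical flag: True when the written distractor offers
    sublexical (GPC-like) cues to the target's sound.
    """
    routes: Set[str] = set()
    if orth_related:
        routes.add("O")  # lexical route via visually similar orthographic entries
    if phon_related:
        routes.add("P")  # lexical phonology route via shared segments
    if gpc_cue:
        routes.add("G")  # nonlexical grapheme-phoneme conversion
    return routes


def gpc_only_account_predicts_facilitation(routes: Set[str]) -> bool:
    """Under a GPC-only account, facilitation requires Route G."""
    return "G" in routes


# Assumption taken from the text: GPC-like cues are ruled out for all Chinese stimuli.
CHINESE_CONDITIONS: Dict[str, Set[str]] = {
    "O+P+": available_routes(True, True, gpc_cue=False),
    "O+P-": available_routes(True, False, gpc_cue=False),
    "O-P+": available_routes(False, True, gpc_cue=False),
    "O-P-": available_routes(False, False, gpc_cue=False),
}
```

Under these assumptions a GPC-only account predicts no facilitation in any related condition, so any facilitation in O−P+ would have to come from Route P and any facilitation in O+P− from Route O.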
Note: O+P+, a word that was both orthographically and phonologically related to the picture name; O−P+, a word that was phonologically related but orthographically dissimilar to the picture name; O+P−, a word that was orthographically related but phonologically dissimilar to the picture name; and O−P−, a word that was neither orthographically nor phonologically related to the picture name.
In the orthographically related conditions (O+P+ and O+P−), each distractor word and the first syllable of its corresponding target picture name shared at least half the visual components (or radicals). For instance, according to "Hanzi Xinxi Zidian" (Chinese Characters Information Dictionary; Li & Liu, 1988), the character <soft> is composed of three visual components: one on the left, one on the top right, and one on the bottom right. The character <cotton> shares the two right-hand components with <soft>. In the phonologically related conditions (O+P+ and O−P+), each distractor word and the first syllable of its corresponding target picture name had the same vowel and consonant. In most cases the pair also had the same tone. None of the distractors was semantically related to the target picture. Care was taken to ensure that the distractor words in the three related conditions (O+P+, O+P−, O−P+) did not contain phonetic radicals that were homophonic to the name of the target picture. For example, for the picture of a pond, the first syllable of the picture name is pronounced /chi2/, whereas the phonetic radical in the O+P+ and O+P− distractors is pronounced /ye3/ and the phonetic radical in the O−P+ distractor is pronounced /si4/.
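The two selection criteria can be made concrete with a short sketch. Everything here is an assumption for illustration: characters are represented by lists of placeholder component labels (not real radicals), syllables are written in pinyin with a trailing tone digit, and "at least half" is read as half of the components of the character with more components, which the text does not specify.

```python
from typing import Sequence


def strip_tone(syllable: str) -> str:
    """Remove a trailing tone digit, e.g. 'hu2' -> 'hu'."""
    return syllable.rstrip("0123456789")


def orthographically_related(target_parts: Sequence[str],
                             distractor_parts: Sequence[str]) -> bool:
    """True if the characters share at least half of their visual components.

    Assumption: 'half' is measured against the character with more components.
    """
    shared = len(set(target_parts) & set(distractor_parts))
    larger = max(len(set(target_parts)), len(set(distractor_parts)))
    return 2 * shared >= larger


def phonologically_related(target_name: str, distractor_syllable: str) -> bool:
    """True if the distractor matches the first syllable of the target name,
    ignoring tone (same consonant and vowel)."""
    first_syllable = target_name.split("-")[0]
    return strip_tone(first_syllable) == strip_tone(distractor_syllable)


def condition_label(target_parts: Sequence[str], distractor_parts: Sequence[str],
                    target_name: str, distractor_syllable: str) -> str:
    """Combine both checks into a label such as 'O+P-' (ASCII minus in code)."""
    o = "+" if orthographically_related(target_parts, distractor_parts) else "-"
    p = "+" if phonologically_related(target_name, distractor_syllable) else "-"
    return f"O{o}P{p}"
```

A real stimulus check would additionally need the semantic-relatedness and phonetic-radical constraints described above, which this sketch leaves out.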
Thirty pictures from the same picture corpus were selected as fillers and warmup items. About one-third had one-syllable names and the rest had bisyllabic names. Each of the pictures was paired with four unrelated distractors. All the distractors were monosyllabic.
The four types of word distractors for all of the 49 pictures were assigned to four blocks according to the Latin-square method so that each picture appeared in each block once, and there were about an equal number of picture-distractor pairs from each condition in each block. Different pseudorandom orders were generated for each block such that no more than two successive trials were of the same experimental condition and successive pictures were not semantically or phonologically related. For half of the participants, the order of trials in each block was reversed. Each participant saw all four blocks. The presentation order of the four blocks to the participants was assigned according to the Latin-square method.
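One way to realize such a Latin-square assignment is a simple rotation of conditions across blocks. This is a minimal sketch under stated assumptions: the rotation scheme, the function names, and the use of integer picture indices are illustrative choices, and the pseudorandom trial ordering with its constraints is not implemented.

```python
from typing import List, Tuple

CONDITIONS = ["O+P+", "O+P-", "O-P+", "O-P-"]


def latin_square_blocks(n_pictures: int,
                        conditions: List[str] = CONDITIONS) -> List[List[Tuple[int, str]]]:
    """Assign each picture one condition per block by rotating through conditions.

    Every picture appears once in every block and meets every condition
    exactly once across blocks.
    """
    k = len(conditions)
    blocks = []
    for b in range(k):
        block = [(pic, conditions[(pic + b) % k]) for pic in range(n_pictures)]
        blocks.append(block)
    return blocks


def block_order(participant: int, n_blocks: int = 4) -> List[int]:
    """Rotate the block presentation order across participants (Latin square)."""
    return [(participant + i) % n_blocks for i in range(n_blocks)]


# Example with 19 target pictures, the number of targets implied by the item analyses.
blocks = latin_square_blocks(19)
```

With 19 pictures and four conditions the per-block counts cannot be exactly equal, which matches the "about an equal number" wording in the text.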
All of the pictures were about 8 × 8 cm black line drawings on a white background. The distractors were randomly presented at nine locations around the center of the picture, in 24-point Song Black font.
Procedure and apparatus. Stimulus presentation and reaction time (RT) recording were controlled by the dual screen version of DMDX (Forster & Forster, 2003). There was a familiarization session, a practice session, and the experiment proper. In the familiarization session the pictures were presented one by one for 700 ms, each followed by its name printed in the center of the screen. The participants were instructed to use the given names to name the pictures in the experiment. In the practice session, each picture was paired with an unrelated character that was not shown in any of the experimental conditions. On each trial of the experimental and practice sessions the following events occurred: a fixation point "+" appeared in the center of the screen for 700 ms and was then replaced by the picture with the distractor superimposed, which was presented for 700 ms; 2 s after the oral response, or 4 s in case of no response, the next trial started with the fixation point. Errors were recorded manually by the experimenter. The entire experiment lasted about 40 min.

Results and discussion
The data of one participant were discarded because of too many voice-key failures (over 50%). Naming latencies were discarded from the analyses whenever any of the following occurred: (a) a picture was named incorrectly, (b) a dysfluency occurred or an utterance was repaired, (c) the RT deviated from the participant's mean by more than 3 SD, or (d) the voice key failed to trigger. In total 5.3% of the trials (3.0% errors, 1.3% outliers, and 0.9% voice-key failures) were excluded from the analysis. Mean RTs, standard deviations, and error rates of the four conditions are listed in Table 2. The overall error rates were considered to be too low to be statistically analyzed.
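The exclusion criteria can be expressed as a per-participant filter. The sketch below is illustrative only: the trial dictionary keys (`rt`, `correct`, `fluent`) are hypothetical, a missing RT stands in for a voice-key failure, and computing the mean and SD over the participant's otherwise valid trials is an assumption, since the text does not say which trials enter those statistics.

```python
from statistics import mean, stdev
from typing import Dict, List, Optional, Union

Trial = Dict[str, Union[Optional[float], bool]]


def trim_participant(trials: List[Trial], sd_cutoff: float = 3.0) -> List[Trial]:
    """Keep only trials that pass criteria (a) to (d) for one participant."""
    # (a) incorrect names, (b) dysfluencies or repairs, (d) voice-key failures
    valid = [t for t in trials
             if t["rt"] is not None and t["correct"] and t["fluent"]]
    if len(valid) < 2:
        return valid  # an SD cannot be computed from fewer than two trials
    m = mean(t["rt"] for t in valid)
    s = stdev(t["rt"] for t in valid)
    # (c) RTs more than sd_cutoff SDs away from the participant's mean
    return [t for t in valid if abs(t["rt"] - m) <= sd_cutoff * s]
```

Note that with very few trials no value can lie more than 3 SD from the mean, so the outlier criterion only has an effect when a participant contributes a reasonable number of valid trials.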
A 2 (O+ vs. O−) × 2 (P+ vs. P−) analysis of variance was carried out. Subjects and items were treated as random variables to generate F1 and F2 analyses, respectively. The main effect of phonological similarity was significant: F1 (1, 23) = 6.7, mean square error (MSE) = 452, p < .05; F2 (1, 18) = 5.8, MSE = 560, p < .05. The main effect of orthographic similarity was also significant: F1 (1, 23) = 42.9, MSE = 1056, p < .0001; F2 (1, 18) = 57.9, MSE = 714, p < .0001. There was also a significant interaction between these two factors.
The results replicated those of Lupker (1982), showing that orthographic relatedness and phonological relatedness affect picture naming even when nonlexical GPC processes are ruled out. However, there seems to be one puzzling result in both our experiment and Lupker's (1982) English experiments: the magnitude of the facilitation effect produced by a distractor that was both orthographically and phonologically related to the target (O+P+) was similar to that of the effect produced by an orthographic-only related distractor (O+P−).
No additional effect of phonological relatedness was observed when orthography was shared between the distractor and the target. Although the interpretation we offered in the introductory section predicts that the two effects could interact to some degree because they share at least one component (priming at the level of phonological segments), the models presented in Figure 1 cannot provide a straightforward explanation of why there is no additional priming of the phonological segments on top of the orthographic effect. We speculated that the absence of such an additional phonological effect might be because of a floor effect. There is limited room for speeding up the target naming process, and the facilitation effect produced by orthographic relatedness is so large that any other effect would be less visible. Experiment 2 was then conducted to replicate the results of Experiment 1 with a new set of stimuli, and especially to examine whether the absence of a difference between the O+P+ and O+P− conditions was reliable.
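For readers who want to reproduce the logic of the by-participants (F1) tests in the 2 × 2 design, each one-degree-of-freedom effect in a fully within-participant design can be computed from a per-participant contrast score, with F equal to the squared one-sample t statistic on those scores. The sketch below is not the authors' analysis script; the function name and data layout are assumptions, and it returns only F values and degrees of freedom, not the MSE values reported above.

```python
from statistics import mean, variance
from typing import Callable, Dict, List, Tuple

CellMeans = Dict[str, float]  # one participant's mean RT per condition

# Contrast definitions; condition keys use an ASCII minus.
CONTRASTS: Dict[str, Callable[[CellMeans], float]] = {
    "orthography": lambda m: (m["O+P+"] + m["O+P-"]) - (m["O-P+"] + m["O-P-"]),
    "phonology": lambda m: (m["O+P+"] + m["O-P+"]) - (m["O+P-"] + m["O-P-"]),
    "interaction": lambda m: (m["O+P+"] - m["O+P-"]) - (m["O-P+"] - m["O-P-"]),
}


def f1_effects(cell_means: List[CellMeans]) -> Tuple[Dict[str, float], Tuple[int, int]]:
    """Return F values for both main effects and the interaction, plus (df1, df2).

    Requires at least two participants and non-constant contrast scores.
    """
    n = len(cell_means)
    f_values = {}
    for name, contrast in CONTRASTS.items():
        scores = [contrast(m) for m in cell_means]
        # F(1, n - 1) = n * mean^2 / sample variance = t^2
        f_values[name] = n * mean(scores) ** 2 / variance(scores)
    return f_values, (1, n - 1)
```

The by-items (F2) analysis follows the same pattern with items rather than participants as the unit, since each item also appeared in all four conditions.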

Method
Participants. Twenty native speakers of Mandarin Chinese at Beijing Normal University served as paid participants.
Materials. Ten pictures from Experiment 1 and 9 new pictures were selected as targets. Each was paired with distractors that were not used in Experiment 1 (see Table 1). Twenty-five new filler pictures were also included, each paired with four unrelated distractors. All other aspects of stimulus construction were identical to those of Experiment 1.
Procedure and apparatus. These were identical to those of Experiment 1.

Results and discussion
Following the criteria used in Experiment 1, 3.6% of the trials (2.0% erroneous responses, 1.5% outliers) were excluded from the analysis. Mean RTs and error rates of the four conditions are listed in Table 2. Because the error rates in this experiment were very low, only RT analyses were carried out.
The main effect of phonological similarity was significant.
The results of Experiment 2 replicated the major findings of Experiment 1: both orthographic and phonological distractors facilitated picture naming when the contribution of GPC processes was ruled out. Furthermore, a small but significant difference between the response latencies of the O+P+ and the O+P− conditions was observed, whereas there was no difference in terms of error rates. In other words, the lack of difference between these two conditions in Experiment 1 was not replicated. However, the overall RTs in this experiment were shorter than those in Experiment 1, which makes a simple "floor effect" argument implausible. Rather, it could be that this difference (O+P+ vs. O+P−) is too weak to be obtained reliably. We acknowledge that we do not have a specific explanation; whether the difference can be observed might depend on certain characteristics of particular item sets or subject groups.

GENERAL DISCUSSION
In two picture-word interference experiments we investigated the nature of the classic phonological facilitation effect, separating the effects of phonology and orthography by taking advantage of the logographic nature of Mandarin Chinese. Critically, any potential contribution from the GPC procedure was ruled out. We repeatedly observed that phonological-only related distractors significantly facilitated target picture naming. Orthographic-only related distractors produced an even larger facilitation effect on target naming. The additional phonological effect on top of an orthographic effect was observed in Experiment 2 but not in Experiment 1, suggesting that such an additional phonological effect is small and unreliable. Below we will focus on the first two reliable results. We will first compare these results with other relevant studies in the literature, then we will analyze the possible mechanisms of the orthographic and the phonological effects within different speech production frameworks, and finally, we will address the theoretical implications of the results in the broader context of speech production research.
Our major findings replicated those of Lupker (1982) and Zhou et al. (2003) while avoiding their methodological limitations. Weekes et al. (2002) also found that both orthographic-only and phonological-only distractors produced significant facilitation effects on picture naming, but in their study the magnitudes of these two effects were comparable. It is possible that the orthographic distractors in Weekes et al.'s study did not have as high a degree of visual similarity to the targets as ours (and Zhou et al.'s, 2003). If this were the case, then although in all these studies the degree of similarity was higher for the phonological manipulation (near 100%) than for the orthographic manipulation, the difference was even larger in Weekes et al., making the direct comparison of the magnitudes of the two effects in their study less meaningful. Unfortunately, the original stimuli used in Weekes et al. (2002) are not available for further analyses.
One important contribution of our experiments beyond these previous studies (Lupker, 1982; Weekes et al., 2002; Zhou et al., 2003) is that only in our study was the contribution of GPC processes clearly ruled out. In our experiments the facilitation effects were robust even when the contribution from the nonlexical GPC or sublexical processes was ruled out. We can conclude more confidently from these results that the orthographic/phonological facilitation effects obtained in the literature cannot be attributed merely to the activation of target phonetic segments by a written distractor via GPC processes (Route G, Figure 1). For the orthographic-only related distractors, the facilitation effect can only result from the "lexical route" (Route O). This conclusion is based on the assumption that the written distractors cannot activate the target phonetic segments via the nonlexical GPC route (Route G) or directly from the distractors' lexical phonological representation (Route P) because they are not phonologically related to the targets. The written form of the distractor (e.g., <quack>, /gua1/) activates the orthographic representation of <quack> and also other visually similar orthographic representations, including that of the picture target. The activation would spread to the semantic representation, the lexical phonological representation, and the phonetic segments of the target, resulting in faster selection/retrieval of the target item at these stages. Note that this "phonological" lexical node in the generic model roughly corresponds to the lexical node in the Independent Network model (Caramazza, 1997) and the word-form layer in the model proposed by Starreveld and colleagues (e.g., Starreveld, 2000; Starreveld & La Heij, 1995, 1996).
The discussion above considered the mechanisms of the form facilitation effects within the generic framework of word production. We mentioned in the introductory section that there are ongoing debates regarding both the overall architecture of the lexical system and the dynamics of lexical access. The two most important notions that differ from our generic model are the following: (a) the lexical layer is further divided into a lemma level and a lexeme level, and (b) there are feedback connections in the system. The consequences of these two assumptions will be discussed in turn.
We use WEAVER++, proposed by Levelt and colleagues (1999), as an example of models with two lexical layers. In WEAVER++, after a lexical concept node is selected, it sends activation to the corresponding lemma and the lemmas of related concepts, which specify the syntactic properties of words. Upon the selection of the most active lemma, the activation flows to its corresponding lexeme. This is followed by the encoding of the phonological segments associated with a particular lexeme. Activation travels across different layers in a discrete manner. There are two possible routes in WEAVER++ for a phonological-only distractor to affect target naming. First, the phonological lexeme of the distractor is activated by the visual input. If the distractor is a homophone of the target, this lexeme is also the target lexeme. When the target and distractor are not homophonous (e.g., " " (/hu2/, fox)" and (/hun2/, spirit)), it is not obvious whether the lexeme of "/hu2/" could be activated by (/hun2/) aside from the shared phonological segments (e.g., /h/). Second, target phonological segments can also be primed by the distractor through GPC processes when applicable. For an orthographic-only related distractor, Roelofs et al. (1996) argue that at the very least it primes the target lemma. Upon seeing the visual input, the orthographic representations (orthographic lexemes) of all visual neighbors should be activated, including that of the target. Once the target orthographic lexeme is activated, the activation will spread to the target lemma, which is shared by the perception and speech production networks. Because of the assumption of WEAVER++ that "the distractor word affects the corresponding morpheme (lexeme) node in the production network," the target orthographic lexeme, activated by an orthographic distractor, will also prime the target phonological lexeme directly. In other words, the target lemma and phonological lexeme are both primed by the presentation of an orthographic distractor. Furthermore, in some cases, nonlexical GPC may also lead to the priming of target phonological segments by orthographic distractors (e.g., consider the distractor PLANE for the target "brain"). In summary, in WEAVER++, the orthographic effect could be attributed to the priming of the target lemma and lexeme, and also to the priming of the phonological segments via GPC processes; the phonological effect could be attributed to the priming of the target lexeme and the target phonological segments.
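The summary of loci in WEAVER++ can be restated as a small lookup table. The following is a minimal sketch, not part of the original study: the names `PRIMED_LOCI`, `shared_loci`, and `unique_loci` and the string labels are hypothetical, and the mapping only transcribes the loci listed in the text.

```python
# Illustrative only: loci at which each distractor type could prime the target
# in WEAVER++, transcribed from the summary in the text. Labels are hypothetical.
PRIMED_LOCI = {
    # Segments are primed by orthographic distractors only via GPC, when applicable.
    "orthographic": {"lemma", "lexeme", "phonological_segments"},
    "phonological": {"lexeme", "phonological_segments"},
}


def shared_loci(a, b):
    """Loci that both distractor types could prime."""
    return PRIMED_LOCI[a] & PRIMED_LOCI[b]


def unique_loci(a, b):
    """Loci that distractor type `a` could prime but type `b` could not."""
    return PRIMED_LOCI[a] - PRIMED_LOCI[b]
```

Under this reading, the two effects overlap at the lexeme and segment levels, and only the orthographic effect reaches the lemma level.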
Another important consideration about the nature of the lexical access process concerns assumptions about the mechanism of activation flow: some theories assume bidirectional connections between processing levels (e.g., Dell, 1986; Rapp & Goldrick, 2000). These theories assume that a later processing level of representation can affect an earlier level through feedback connections between layers. Within such a framework, it is hard to attribute the phonological and the orthographic effects to a specific locus (or loci). For instance, even if a phonological distractor starts to affect target naming at the phonological segments level, there is nothing in the model to prevent activation from spreading back to the lexical layer and affecting lexical node selection. At best, once all parameters of a model are specified, such as the strength of the feedback connection, it might be possible to deduce through computational simulations what percentage of the observed effect is the result of facilitation at a given stage. Starreveld (2000) has produced an especially useful summary of the various accounts that have been offered in the literature for the orthographic/phonological effect in picture-word interference, including the phonological-segment view (Meyer & Schriefers, 1991), the "lemma-activation" view (Roelofs et al., 1996), and the "word-form" view (Starreveld, 2000; Starreveld & La Heij, 1995). Starreveld and La Heij (1996; Starreveld, 2000) have shown that the orthographic/phonological effect is obtained over a wide range of stimulus onset asynchronies and that it interacts with the semantic effect (Starreveld & La Heij, 1995; also see Damian & Martin, 1999). They have gone on to propose that the orthographic/phonological effect should be localized at the word-form layer. However, by showing a "pure" phonological effect (Route P, Figure 1) and a "pure" orthographic effect (Route O), we have direct empirical evidence that the classical orthographic/phonological effect reported in the literature is a mixture of effects arising at multiple stages, including the lexical layer(s) (both lemma and lexeme) and the phonological segments.
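The point made earlier about feedback connections can be illustrated with a toy simulation. This is a minimal sketch under stated assumptions, not an implementation of WEAVER++ or of any published interactive model: the two-node network, the function names `simulate` and `distractor_gain`, and every parameter value are hypothetical and chosen only for illustration.

```python
def simulate(feedback, segment_boost, steps=20, forward=0.5, decay=0.5, picture_input=1.0):
    """Toy two-node network: one target lexical node and one target segment node.

    All parameter values are arbitrary assumptions chosen for illustration.
    """
    lexical, segment = 0.0, 0.0
    for _ in range(steps):
        # The lexical node receives picture input plus optional feedback from the segment node.
        new_lexical = (1 - decay) * lexical + picture_input + feedback * segment
        # The segment node receives feedforward input plus the distractor's priming.
        new_segment = (1 - decay) * segment + forward * lexical + segment_boost
        lexical, segment = new_lexical, new_segment
    return lexical, segment


def distractor_gain(feedback, segment_boost=0.3):
    """Activation gained at each level when a phonological distractor primes the segments."""
    with_lex, with_seg = simulate(feedback, segment_boost)
    base_lex, base_seg = simulate(feedback, 0.0)
    return with_lex - base_lex, with_seg - base_seg


# Feedforward-only network versus a network with a (hypothetical) feedback weight.
lexical_gain_ff, segment_gain_ff = distractor_gain(feedback=0.0)
lexical_gain_fb, segment_gain_fb = distractor_gain(feedback=0.2)
```

In this toy setup the distractor changes lexical activation only when the feedback weight is nonzero, which is the sense in which feedback makes it hard to assign the effect to a single locus.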
Before we discuss the broader theoretical implications of these findings, we must consider whether, and to what extent, the results and conclusions about picture-word interference based on the Mandarin Chinese experiments can be generalized to alphabetic languages. There is neither empirical evidence nor theoretical grounds for holding that the speech production (picture naming) process in Mandarin is different from that in alphabetic languages (e.g., English). It might be more reasonable to ask whether the word perception process for the distractors might differ in these two kinds of language systems, resulting in possible differences in the mechanism of picture-word interference. There is a substantial literature investigating how Mandarin words are recognized and whether this process differs from that in alphabetic languages (mostly English). Just as in English, there are debates on the detailed timing and routes of activation between the orthographic, the phonological, and the semantic representations, such as whether phonology mediates semantic access in visual word recognition (e.g., Perfetti & Tan, 1998; Perfetti & Zhang, 1995; Tan, Hoosain, & Peng, 1995; Zhou & Marslen-Wilson, 1999a). There is much evidence that phonological information is obligatorily activated in reading Chinese (Mandarin) (e.g., Perfetti & Tan, 1998; Zhou & Marslen-Wilson, 1999a; Zhou et al., 1999), and that semantic activation is driven by both orthographic and phonological information. To be conservative about the universality of language processing, it could be argued that in English, phonology seems to play a primary role in semantic access, whereas in Chinese, orthographic information might be more important (Zhou & Marslen-Wilson, 1999a). If such were the case, we would expect that the orthographic effect in picture-word naming would be stronger in Chinese and the phonological effect would be stronger in English. Similarly, one could also imagine that the effect through GPC is more dominant in alphabetic languages than in Chinese. However, these possibilities are not supported by the results of Lupker (1982), and they should not affect the ways in which such effects occur either.
Another difference between English and Mandarin word recognition is the nature of sublexical phonological activation (GPC). It has been found that in Chinese, in the process of perceiving a character, both the semantic and the phonological properties of the phonetic radical are activated (e.g., Zhou & Marslen-Wilson, 1999b), whereas in English, only the phonology but not the semantic content of the sublexical units is activated (e.g., the phonology of "own" and not its meaning is activated when the word "shown" is presented). However, again this point should not influence the interpretation of the orthographic and the phonological effects in our Chinese picture-word naming experiments. In all of the related distractor conditions in our experiments, only the formal properties of the phonetic radicals differed systematically, and the meanings of the phonetic radicals were all unrelated to the target. Therefore, we propose that our results do not speak only to the mechanisms of picture-word naming in Mandarin, but to this process more generally.
Our findings that both pure phonological distractors and pure orthographic distractors produce robust facilitation have important theoretical implications. We mentioned in the Introduction that it is a commonly held notion that the phonological facilitation effect in the picture-word interference paradigm is an output effect, at the level of retrieval of the target's phonological nodes. This assumption has led researchers to make inferences about the phonological encoding stage of production by manipulating phonological relatedness in this experimental paradigm. Our results, together with Lupker's (1982) study, suggest that this classical assumption about the phonological effect may be inaccurate, or at least must be considered with extreme caution. Because previous studies have used words that are both phonologically and orthographically related to the target, the facilitation effects described in the literature are most likely a mixture of effects occurring at multiple stages, including the lexical selection stage and the phonological output stage. Theoretical arguments that were based on conventional assumptions about the phonological facilitation effect need to be reevaluated (e.g., Costa & Caramazza, 2002; Starreveld & La Heij, 1995). It is important to keep in mind that without teasing apart orthographic and phonological relatedness, any facilitation effect observed is a mixture of effects from different sources and at different stages of processing. With this new understanding it will be possible to use the picture-word naming task to systematically investigate the interaction between phonological and orthographic processes in word recognition and speech production.

Table 1. Sample stimuli and mean values of frequency and visual complexity in Experiments 1 and 2

Table 2. Mean reaction time (RT), standard deviation (SD), and error rates (ERR) in Experiments 1 and 2
Note: O+P+, a word that was both orthographically and phonologically related to the picture name; O-P+, a word that was phonologically related but orthographically dissimilar to the picture name; O+P-, a word that was orthographically related but phonologically dissimilar to the picture name; and O-P-, a word that was neither orthographically nor phonologically related to the picture name. *p < .05. **p < .01.