Children's Norm Enforcement in Their Interactions With Peers

This study investigates how children negotiate social norms with peers. In Study 1, 48 pairs of 3- and 5-year-olds (N = 96) and in Study 2, 48 pairs of 5- and 7-year-olds (N = 96) were presented with sorting tasks with conflicting instructions (one child by color, the other by shape) or identical instructions. Three-year-olds differed from older children: They were less selective for the contexts in which they enforced norms, and they (as well as the older children to a lesser extent) used grammatical constructions objectifying the norms (“It works like this” rather than “You must do it like this”). These results suggested that children's understanding of social norms becomes more flexible during the preschool years.

NORM ENFORCEMENT 5 children, older children justify exclusion more if there is a threat to group functioning, although all age groups rate straightforward exclusion of a child from a group as unfair.
In addition, some recent experimental studies have documented that from early on, young children enforce social norms on third parties. A series of studies investigated children's interventions when a puppet committed a mistake within pretense and other rule-governed games (Rakoczy, 2008; Rakoczy, Brosche, Warneken, & Tomasello, 2009; Rakoczy, Warneken, & Tomasello, 2008; Wyman, Rakoczy, & Tomasello, 2009). The converging evidence suggests that both 2- and 3-year-olds intervene and protest against the mistakes of the puppet. Three-year-olds particularly display normative protests (e.g. teaching, critiquing) using normative language such as "No! It does not go like this!" (Rakoczy et al., 2008, p. 877), whereas 2-year-olds' protests are more descriptive and imperative, such as "No! Not in this hole!" (Rakoczy et al., 2008, p. 877). While enforcing rules on third parties, 3-year-olds show some understanding of the context-specificity of the rules, and evaluate the same action as a norm violation in certain contexts but not in others. For instance, Wyman et al. (2009) created two pretend play contexts in which the same object, a yellow stick, was used as a toothbrush in one and as a carrot in the other. Three-year-olds normatively protested when the puppet used the contextually inappropriate pretend identity.
One limitation of both experimental and interview methods is that they rely on non-interactive measures, analyzing children's one-shot responses to norm violations by a puppet or by hypothetical characters who do not respond back or challenge children's beliefs about norms. Some observational studies have analyzed how children enforce norms within their spontaneous peer conflicts, which are a significant part of peer interactions (Corsaro, 1994; Kyratzis, 2004). Studies have documented that in their peer conflicts, children refer to norms of property entitlements or possessions, turn-taking, or aggression (Dunn & Munn, 1987; Hay, 1984; Hay & Ross, 1982; Ingram & Bering, 2012; Much & Schweder, 1978; Shantz, 1987; Smetana, 1984). In facing such conflicts, children are responsive to their peers' protests and attempt to resolve the conflict (Hartup, French, Laursen, Johnston, & Ogawa, 1993; Hartup, Laursen, Stewart, & Eastenson, 1988; Killen & Turiel, 1991; Shantz, 1987; Hay & Ross, 1982). Nucci & Nucci (1982a) demonstrated that school-aged children respond to peers' moral transgressions (e.g. throwing sand at someone) by explaining the effect of the act on the victim's rights and welfare, such as "You got it in my eyes. It hurts like hell" (p. 1339), and to conventional transgressions (e.g. spitting on the grass) by stating the normative rule, such as "You're not supposed to spit" (p. 1339).
In the studies on children's naturalistic peer interactions, children's response types to norm violations were mostly categorized into functional categories related to the content of the message such as "injury or loss statement", but how these normative expressions were formulated was not analyzed linguistically. For instance, the way children mark agency in their [normative] formulations might reveal their perspective on an event (Berman & Slobin, 1994; Budwig, 1995; Kyratzis, 2009).
When a child uses a transitive construction (an animate agent is acting on an object), such as "You must put the pen here", they report the act from a speaker-subjective point of view: how things are seen by the speaker or what seems right for the speaker. When they use inanimate subjects, as in "The pen goes here", they report the same act from a more factual point of view: how things are seen by everyone or what seems true for everyone (Berman & Slobin, 1994). Although both statements are normative, in the latter case, known as "middle constructions", the same action is more objectified and the role of the speaker or the causal agent is downgraded when inanimate subjects are used (Budwig, Stein, & O'Brien, 2001; Kemmer, 2003).
Although observational methods take into account the interactive nature of normative peer conflicts, they still have limitations. Investigators do not have control over the peer conflict in terms of which child knows what about the norm, the type of norm violations (or the type of normative conflict), the number of parties involved in the conflict, and so forth. Most importantly, it is very difficult to draw conclusions about developmental changes in the understanding and use of norms because the contexts of peer conflicts are often not comparable at different ages. One cannot compare how long it takes children to resolve a normative conflict or what kinds of linguistic strategies they deploy in their norm enforcement if the sampled normative conflicts are not analogous. One major gap in the literature, therefore, is the analysis of children's normative disputes with peers in highly interactive, but still experimentally controlled settings.
In the current study, therefore, we experimentally created two contexts in which children had to interactively negotiate the rules of a game in dyadic peer conversations. In these two contexts, children of different ages (3, 5, and 7) interacted with one another in the same context and had equal footing such that they all learned the same game with the same rules from the same adult sources. We elicited peer conflicts to observe possible developmental changes in how normative disputes are resolved and what linguistic constructions (e.g. marking agency) children deploy. We used a methodology similar to that of Hartup et al. (1993) and taught children conflicting rules of a sorting game comparable to the Dimensional Change Card Sort (DCCS) used by Zelazo, Frye, and Rapus (1996). In Study 1, pairs of 3- and 5-year-olds were presented with simple sorting games. In Study 2, pairs of 5- and 7-year-olds were presented with complex sorting games. In both studies, in the incompatible context, one child was taught to sort the items by shape, and the other by color, such that the placement of the first item by a child would be a rule violation for his/her partner. In the other context, the compatible context, both children were taught to sort the items by both color and shape such that the placement of the first item by a child would not necessarily be a rule violation, but the appropriateness of that action would depend on which game the children decide to play and would require children to take the perspective of their peer partners. We first analyzed whether children distinguished these two contexts by looking at the presence of normative conflicts. We hypothesized that normative conflicts would take place in the incompatible context more than in the compatible context.
The difference would be greater in the older age groups because they would be better at discriminating the contexts where an action is wrong from those contexts where an action is not necessarily wrong. Next, we analyzed how easily the children agreed on a rule by looking at how long it took them to agree and predicted that they would agree sooner in the compatible context than in the incompatible context. Older children would agree on a rule sooner than younger children because they would be more flexible in seeing alternative rules as acceptable.
And finally, we explored how children marked agency as a discursive strategy in their normative expressions to see whether they objectified the rules or they used more person-oriented statements.
These age groups were selected because the literature shows that by 3 years of age children are able to follow and detect violations of game rules (Kalish, Weismann, & Bernstein, 2000; Rakoczy et al., 2008; Zelazo et al., 1996); by 5 years of age they are able to take the perspective of their interactive partners (Wellman, Cross, & Watson, 2001); and in middle childhood (age 7 and 8) children show some further flexibility in their reasoning on moral norms and social conventions (Conry-Murray & Turiel, 2010; Horn, Killen, & Stangor, 1999; Themier, Killen, & Stangor, 2001).
Within the 5-year-olds, 13 dyads participated in the compatible context, and 11 dyads in the incompatible context. The dyads were formed on the basis of the recommendations of their teacher and were composed of children who were friends.
All children were monolingual German speakers, except for two 3-year-old and two 5-year-old children who were bilingual. The second languages of these bilingual children were Arabic, Russian, English, and French.

Materials
There were two sorting games to control for the possible difficulty or preference for one of the sorting dimensions (color and shape).
Board game: Children were instructed to sort the 14 magnets that came out of a dispenser according to animal shape (a hedgehog or a bird) and/or color (yellow or green). There were 4 types of magnets: a yellow hedgehog, a green hedgehog, a yellow bird, and a green bird. Children were asked to place these tokens on a 56.5 x 34 cm magnetic board on which there were two rows. At the head of one row there was a picture of a yellow hedgehog, and at the head of the other row there was a picture of a green bird, signaling one level of each of the two sorting dimensions.
Tube game: Children were instructed to sort the 14 wooden tokens that came out of a dispenser according to geometric shape (ball or cube) and/or color (blue or red). There were 4 types of wooden tokens: a blue ball, a red ball, a blue cube, and a red cube. Children were asked to place these tokens into two 6 x 29.7 cm plexiglass tubes on a platform. In front of one tube there was a red cube, and in front of the other tube there was a blue ball, signaling one level of each of the two sorting dimensions.

Procedure
The study took place in a quiet room of preschools in a mid-size German city. Each dyad played both games and the order of the games was counterbalanced.
There were two sessions for each game: individual teaching phase and testing phase.
The entire session lasted around 45 minutes and all sessions were videotaped.
In the individual teaching phase of the compatible context, both sets of rules (the shape rule and the color rule) were presented to each child and the presentation order of the rules was counterbalanced. We will describe the procedure in which the board game was played first and the color rule was introduced before the shape rule.
The first child, Child A, went through the individual teaching phase, while the other child, Child B, was outside with the second experimenter. The experimenter told Child A, "We will play two games. First we will play the color game" and instructed Child A to sort the 10 tokens by color within the board game ("All the green ones go here and the yellow ones go here. The green one never goes here, because they all go there and the yellow ones go there"). The first 2 tokens were demonstrated by the experimenter; the next 2 tokens were sorted by the experimenter and the child together; and the remaining 6 tokens were sorted by the child alone. If the child made a mistake, the experimenter reminded him/her of the rule. Children did not have problems understanding the rules. Then the experimenter introduced the second set of rules within the board game to Child A, "We just played the color game. Now we will play the animal game" and repeated the same instructions for sorting the same tokens by animal shape. After Child A left the room, Child B entered. The same instructions were delivered but the order of the rules was reversed such that Child B sorted the magnets by animal first and by color next. In other words, the last rule each child learnt was different. This constituted the first teaching phase. In the first testing phase, both children were in the room together. The experimenter told them "Now you will play together" and left the room. The children played together and sorted the 14 tokens. Then the same procedure was repeated for the tube game. This time, Child B went through the teaching phase first and was taught to sort the wooden tokens by color first and by shape next. Then, Child A was taught to sort the wooden tokens by shape first and by color next. Thus, if a child learnt the color rule first and the shape rule next in the first game, he/she learnt the shape rule first and the color rule next in the second game and vice versa.
In the incompatible context, the individual teaching phases were identical to the compatible context, except that only one set of rules was presented to each child.
We will again describe the procedure in which the board game was played first. The experimenter told Child A, with Child B out of the room, "We will play the board game" and instructed the child to sort the tokens by color. Then, with Child A out of the room, the experimenter told Child B, "We will play the board game" as well, but instructed the child to sort the tokens by shape. In the first testing phase, the experimenter told both children "Now you will play the board game together" and left the room. The children sorted the 14 tokens that came out of the dispenser. Then the same procedure was repeated for the tube game. This time, Child B went through the teaching phase first and was instructed to sort the tokens by color; and then Child A was taught to sort the tokens by shape. In other words, if a child learnt the color rule in the first game, he/she learnt the shape rule in the second game and vice versa. In the second testing phase, the experimenter told the children "Now you will play the tube game together". The children sorted the 14 tokens that came out of the dispenser.
The children were not told anything about what instructions their peer received. The order of the 10 tokens in the teaching phase and of the 14 tokens in the testing phase was the same across dyads. One 5-year-old in the compatible condition refused to play the tube game, so that session was dropped from the analyses. Thus, the dataset has 95 sessions from 48 dyads.

Coding
All the sessions were transcribed using the transcription conventions of CLAN (MacWhinney, 2000). Each line in a transcript depicted an utterance with a maximum of one clause. "Utterance" was defined as a clause (complex sentences were divided into several lines), a conversational turn, or a group of words separated from one another by a pause of 2 seconds. Exact repetitions within a conversational turn were excluded from the analyses. We first identified the point at which the children reached agreement, and then extracted the on-task protest utterances until agreement. Finally, we coded each on-task protest utterance for normativity, agency, and transitivity. A second coder recoded 22% of the transcripts for all the variables.
The agreement between the two coders is reported at the end of the respective sections: 1) Agreement on a rule, 2) Extraction of on-task utterances, 3) Extraction of protest utterances, and 4) Normativity, agency, and transitivity.

Agreement on a rule. The first coder marked the resolution in the transcript.
Agreement was operationalized as any kind of verbal agreement (e.g. Ja Ente zu Ente 'Yes ducks with ducks') and/or 2 consecutive compatible moves in the game (nonverbal agreement). In about 10% of the sessions, children did not reach agreement or they agreed to disagree, each playing by their own rule or moving to off-task games. In these cases, the agreement was marked at the point when children stopped protesting to one another. If children never reached agreement and continued to negotiate the rules until they were out of tokens to sort, then the agreement was marked at the end of the session. For the reliability on resolution, a second coder marked all sessions for when the children reached agreement. We first determined the positions in the sequences of utterances at which the two coders indicated that the children reached agreement. In the next step, we calculated the absolute difference between these positions per episode and then averaged these values across episodes.
We used this value as the test statistic in the following permutation test that we ran to test its significance (Adams & Anthony, 1996;Manly, 1997). For this, we randomly allocated the decision about agreement of one of the two coders along the sequence of utterances, separately for each episode, and then determined the test statistic again.
We permuted the data 1,000 times (including the original data as one permutation) and determined the p-value as the proportion of permutations revealing a test statistic at least as small as that of the original data such that the test statistic, which was 0.90, can be treated as equivalent to the κ value (test statistic: 0.90, expected = 14.79, p < .01).
Extraction of on-task utterances. All the utterances until agreement were first coded for being on-task. On-task utterances were defined as all utterances related to the rules of the game. Due to our linguistic analyses, 2 types of on-task utterances were differentiated: clause-level (Hier kommen alle Igel hin. 'All hedgehogs go here') and non-clause-level on-task utterances (Murmeln! 'Marbles!'). Off-task utterances were defined as utterances that were not about the game or its rules (Das ist eine Kamera 'That is a camera'). Reliability was assessed across 5 levels: clause-level on-task utterances, non-clause-level on-task utterances, off-task utterances, exact repetitions, and unintelligible utterances. The agreement was κ = .84.
Normativity, agency, and transitivity. Each clause-level protest utterance was coded for the following 3 dimensions: (1) normativity, (2) agency, and (3) transitivity.
Normativity. The coding of the normativity in this paper was mostly based on the coding scheme of Rakoczy et al. (2008), which defined normative protests as utterances that involve protest, critique, and teaching with the use of normative vocabulary. More specifically, an utterance was coded as normative, if: All other protest utterances were coded as nonnormative and included utterances that had: (a) imperatives such as Gib mir den Vogel 'Give me the bird' (b) nondeontic modals möchten 'would like to', wollen 'want to' such as Ich will Vogel machen. 'I want to do the birds' (c) descriptive protests such as Ich habe das anders gespielt 'I played it differently', Es ist aber grün 'But it is green'.
We then categorized each session for the presence of normative conflict. If each child in a dyad displayed at least one normative protest, that episode was considered an episode with normative conflict and all others were coded as nonnormative. The agreement on the normativity of each clause-level protest utterance was κ = .94.
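The session-level criterion above can be illustrated with a minimal sketch. The function name, the child labels "A" and "B", and the example sessions are hypothetical and serve only as illustration; they are not part of the original coding materials.

```python
def has_normative_conflict(protests, children=("A", "B")):
    """Return True if every child in the dyad produced at least one normative protest.

    protests: iterable of (child_id, is_normative) pairs for one session
              (hypothetical data layout, assumed for this sketch).
    """
    # Collect the children who produced at least one normative protest.
    normative_speakers = {child for child, is_normative in protests if is_normative}
    # The session counts as a normative conflict only if both children appear.
    return all(child in normative_speakers for child in children)


# Hypothetical sessions: in the first, both children protest normatively;
# in the second, only Child A does.
session_1 = [("A", True), ("B", False), ("B", True)]
session_2 = [("A", True), ("A", True), ("B", False)]

result_1 = has_normative_conflict(session_1)
result_2 = has_normative_conflict(session_2)
```

Under this criterion, a session in which only one child uses normative language is coded as nonnormative, however many normative protests that child produces.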
Agency. The subject of each protest utterance was coded as inanimate if it referred to an object or an action and animate if it referred to a person (including the generic person Man 'One'). The agreement on agency was κ = 1.00.
Transitivity. There were 5 categories. The utterances were coded as: (a) intransitive, if there was only one participant (the grammatical subject) such as Ich gehe 'I go' or one implied participant Komm '(You) Come', as well as the formulaic expression Es geht nicht so 'It doesn't go like that'; (b) middle, if the utterances were intransitives with inanimate subjects and highly kinetic verbs ('come', 'go') that involve some motion and direction; these included the verbs kommen 'come', gehen 'go' (when they co-occurred with noun phrases within non-formulaic utterances), gehören 'belong' and modals with locatives Es muss hier hin 'It must (go) in here'; (c) transitive, if there were two participants (the grammatical subject and the grammatical object) such as Ich brauche die Rote 'I need the red', Doch, so hat der gesagt 'Yes, he did say that', or two implied participants Frag '(You) ask (that/him)'; (d) sein 'to be', if there were copular constructions with the verb sein 'to be' such as Das ist ein Igel 'That is a hedgehog'; (e) other, if the utterances were incomplete or ambiguous in terms of whether they were transitive or not such as Du kannst 'You can'.
The agreement on transitivity was κ = .91.
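The permutation test used for the reliability of the agreement point (see "Agreement on a rule") can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation (the analyses were run in R): the function names, the per-episode utterance counts, and the coder positions are hypothetical, and "randomly allocating" one coder's decision is assumed to mean drawing a position uniformly along the episode's sequence of utterances.

```python
import random


def mean_abs_difference(coder1, coder2):
    """Test statistic: absolute difference in marked positions, averaged over episodes."""
    return sum(abs(a - b) for a, b in zip(coder1, coder2)) / len(coder1)


def permutation_test(coder1, coder2, n_utterances, n_perm=1000, seed=0):
    """Return (observed statistic, p-value).

    coder1, coder2: utterance positions at which each coder marked agreement.
    n_utterances:   number of utterances in each episode (hypothetical input).
    """
    rng = random.Random(seed)
    observed = mean_abs_difference(coder1, coder2)
    at_least_as_small = 1  # the original data count as one permutation
    for _ in range(n_perm - 1):
        # Assumption: reallocate one coder's decision uniformly within each episode.
        shuffled = [rng.randint(1, n) for n in n_utterances]
        if mean_abs_difference(coder1, shuffled) <= observed:
            at_least_as_small += 1
    return observed, at_least_as_small / n_perm


# Hypothetical data for six episodes.
coder1 = [12, 30, 5, 44, 18, 9]
coder2 = [12, 31, 5, 43, 20, 9]
n_utterances = [40, 60, 25, 80, 50, 30]

observed, p_value = permutation_test(coder1, coder2, n_utterances)
```

A small p-value here indicates that the two coders' agreement points lie closer together than would be expected if one coder had marked positions at random.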

Results
There were 4 analyses: 1) presence of normative conflict in each session, 2) number of utterances it took children to reach agreement, 3) marking of agency within normative protests, and 4) the co-occurrence patterns of agency, normativity and transitivity. All the statistical analyses were run using R (version 2.15.1; R Development Core Team, 2012). There were no gender differences in any of the measures so gender was not included in the models for the sake of parsimony.

Presence of Normative Conflict
We first tested whether an episode had a normative conflict, using a Generalized Linear Mixed Model (GLMM; Baayen, 2008) fit by the Laplace approximation. The analyses were run using the function lmer of the statistics package lme4 (Bates, Maechler, & Bolker, 2012) in R. To test the significance of the full model, we compared its fit with that of a null model using a likelihood ratio test.
The models were fit with binomial error structure and logit link function. The response variable was a binary measure, session with a normative conflict vs. session with a nonnormative conflict. The full model improved the fit as compared to the null model (χ² = 29.59, df = 3, p < .001). The significant interaction between age group and context suggested that both age groups had more normative conflicts in the incompatible context than in the compatible context, but this difference was greater for the 5-year-olds (B = 2.79, SE = 1.36, z = 2.06, p = .04; see the top panel in Figure 1). Namely, 3-year-olds had normative conflicts in the compatible context more often than 5-year-olds did.

Number of utterances until Agreement
To test how long it took the children to agree on a rule in each session, we used a repeated measures analysis of variance (ANOVA) with one within subjects factor (game with two levels: board and tube) and three between subjects: 1) Age group with 2 levels: 3 vs. 5, 2) Context with 2 levels: compatible vs. incompatible, and 3) Order of game with 2 levels: board game first vs. tube game first. The response variable was the total number of on-task utterances (both at the clause level and nonclause level and regardless of whether they were protest utterance or not) by both children in the dyad until they reached agreement. Because there were some episodes in which children had nonverbal agreements, and the number of utterances until agreement was zero, we log transformed the response variable after adding the constant 1. Through this log transformation, the assumptions of normal distribution and homogeneity of residuals were fulfilled by visual inspection of a qq-plot and residuals plotted against fitted values.
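The transformation described above can be sketched in a few lines. The natural logarithm is an assumption here, since the base is not stated in the text, and the function name is hypothetical.

```python
import math


def log_transform(n_utterances):
    """Log transform after adding the constant 1, so that zero counts stay defined."""
    # Assumption: natural logarithm; the base is not specified in the text.
    return math.log(n_utterances + 1)


# A session with a purely nonverbal agreement has zero utterances until agreement.
nonverbal_session = log_transform(0)
verbal_session = log_transform(9)
```

Adding the constant matters because the logarithm of zero is undefined; with the constant, sessions resolved nonverbally map to a transformed value of zero.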
The repeated measures ANOVA was carried out with 47 dyads because one dyad (in the 5-year-old group in the compatible context) did not complete the tube game. The significant main effect of context suggested that it took both 3- and 5-year-olds significantly longer to reach agreement in the incompatible context than in the compatible context (F(1, 42) = 52.51, p < .001, ηp² = 0.56; see the left panel in Figure 2). The main effect of age group showed a trend suggesting that it took 3-year-olds somewhat longer to reach agreement than 5-year-olds (F(1, 42) = 3.23, p = .08, ηp² = 0.07). The interaction between context and age group was not significant.
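As a worked check, the partial eta squared values reported above can be recovered from the F statistics and their degrees of freedom using the standard identity ηp² = (F × df_effect) / (F × df_effect + df_error). The function name below is hypothetical; the input values are those reported in the text.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared computed from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Main effect of context: F(1, 42) = 52.51
eta_context = partial_eta_squared(52.51, 1, 42)
# Main effect of age group: F(1, 42) = 3.23
eta_age = partial_eta_squared(3.23, 1, 42)

print(round(eta_context, 2), round(eta_age, 2))
```

Both values round to the effect sizes given in the text (0.56 and 0.07).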

Marking of Agency within Normative Protests
We compared the co-occurrence patterns of normative protests with animate vs. inanimate subjects across the age groups and the two contexts, using GLMM fit by the Laplace approximation. The response variable was the proportion of normative protests to all protests (normative and nonnormative). To test the significance of the full model, we compared its fit with that of a null model using a likelihood ratio test.
The models were fit with binomial error structure and logit link function. The full model included all the predictors of interest, that is age group (

Agency and Transitivity
In order to see which type of constructions these animate and inanimate subjects co-occur in, all the 644 protest utterances were coded for their transitivity, which had 5 levels: utterances with copular (constructions with sein 'be'), transitives, middles, intransitives, and other.
In their normative protests both 3- and 5-year-olds used predominantly middle constructions with inanimate subjects, such as Nein, die rote Murmel kommt hier rein 'No the red marble goes in here', 80% and 66% of the time respectively (see the top panel of Table 1). Three-year-olds almost exclusively used middles in their normative protests whereas 5-year-olds seemed a bit more discursively flexible and used other constructions in their normative protests such as intransitives with inanimate subjects (Das stimmt nicht 'That is not correct'; 14%) and transitives with animate subjects (Du musst die blauen rein 'You have to put the blue ones in'; 13%). In

Discussion
The results suggested that 3-year-olds were less selective than 5-year-olds about the contexts in which they enforced norms. That is, 3-year-olds had normative conflicts in the compatible context as well. Second, when facing conflict about the rules of the game, 3-year-olds insisted on their rule longer and were more reluctant to consider the other rule as a plausible alternative, as compared to 5-year-olds. Finally, in their normative protests, 3-year-olds used more limited discursive strategies, almost exclusively inanimate middle constructions, objectifying the norms. We will discuss these findings in detail in the General Discussion.
In order to have a more complete developmental story, we conducted a second study in which we analyzed the negotiations of 5- and 7-year-old children because in middle childhood children show some further flexibility in their reasoning on moral norms and social conventions (Conry-Murray & Turiel, 2010; Horn, Killen, & Stangor, 1999; Themier, Killen, & Stangor, 2001). In Study 2, we presented 5- and 7-year-olds with the same sorting games. Although task difficulty was not an issue in Study 1, to make the tasks more interesting for 5- and 7-year-olds, each game had a more complicated version, in which there were 4 levels to each of the 2 sorting dimensions. The second language of one child was Russian, and the second language of the other child was not specified by his parent.

Materials
Each game had a more complicated version.
Board game: There were 4 animal shapes (a hedgehog, a bird, a bunny, a frog), 4 colors (yellow, green, orange, brown) and, thus, 16 different types of magnets combining each animal shape with each color. Children were asked to place these tokens on a 56.5 x 34 cm magnetic board on which there were four rows. At the head of each row, there was a picture of a yellow hedgehog, a green bird, an orange bunny, or a brown frog, signaling one level of each of the two sorting dimensions.
Tube game: There were 4 wooden shapes (a ball, a cube, a triangle, a star), 4 colors (blue, red, white, pink) and, thus, 16 different types of wooden tokens combining each shape with each color. Children were asked to place these tokens into four 6 x 29.7 cm plexiglass tubes on a platform. In front of each tube there was a red cube, a blue ball, a white triangle, or a pink star, signaling one level of each of the two sorting dimensions.
In both games, children sorted 12 tokens in the teaching phase and 16 tokens in the testing phase. The order of the 12 tokens in the teaching phase and of the 16 tokens in the testing phase was the same across dyads.

The procedure, the counterbalancing, and the coding were exactly the same as in Study 1. The kappa values reported in Study 1 for the inter-rater reliability on the coding apply to Study 2 (half of the reliability files were from Study 2). The instructions were slightly different. In the teaching phases of both games, the experimenter told each child, for instance for the color rule in the board game, "All the yellows must go in this row; all the greens must go in this row; all the browns must go in this row; and all the oranges must go in this row" and delivered the instructions the same way for the shape rule in both games. The use of the modal verb müssen 'must' in the instructions could have been interpreted as encouraging normative language. However, the frequency of the modal müssen 'must' within on-task utterances did not change between Study 1 and Study 2. The models (fit with Poisson error structure) included the control and random predictors, as well as age as a covariate. Adding the study type (1 vs. 2) to the model did not improve the fit, χ² = 1.64, df = 1, p = .19.

Results
The same 4 analyses were carried out: 1) presence of normative conflict in each session, 2) number of utterances it took children to reach agreement, 3) marking of agency within normative protests, and 4) the co-occurrence patterns of agency, normativity and transitivity.

Presence of Normative Conflict
We ran a GLMM model in which the response variable was a binary measure, session with a normative conflict vs. session with a nonnormative conflict  Figure 1).

Number of utterances until Agreement
We used a repeated measures analysis of variance (ANOVA) with one within subjects factor (game with two levels: board and tube) and three between subjects factors: 1) Age group with 2 levels: 5 vs. 7, 2) Context with 2 levels: compatible vs. incompatible, and 3) Order of game with 2 levels: board game first vs. tube game first. The response variable was the total number of on-task utterances (both at the clause level and non-clause level and regardless of whether they were protest utterances or not) by both children in the dyad until they reached agreement. We log transformed the response variable. Through this log transformation, the assumptions of normal distribution and homogeneity of residuals were fulfilled by visual inspection of a qq-plot and residuals plotted against fitted values.
The repeated measures ANOVA revealed only a significant effect of context, suggesting that it took both 5- and 7-year-olds significantly longer to reach agreement in the incompatible context than in the compatible context (F(1, 43) = 49.28, p < .001, ηp² = .53; see the right panel in Figure 2).

Marking of Agency within Normative Protests
To compare the co-occurrence patterns of normative protests with animate vs. inanimate subjects across the age groups and the two contexts, we ran GLMM in which the response variable was the proportion of normative protests to all protests (normative and nonnormative). The full model included all the predictors of interest, that is age group (5 vs. 7), context (compatible vs. incompatible), agency (animate vs. inanimate), and the interaction between agency and age group, as well as the control

Agency and Transitivity
To see which types of constructions these animate and inanimate subjects co-occur in, all 461 protest utterances were coded for transitivity, which had five levels: copular (constructions with sein 'be'), transitive, middle, intransitive, and other. Five- and 7-year-olds did not differ from one another and showed the same general trend as the 5-year-olds in Study 1. Namely, they predominantly used inanimate middles (61% for 5-year-olds, 57% for 7-year-olds) and some animate transitives (23% for 5-year-olds, 21% for 7-year-olds) in their normative protests (see the bottom panel of Table 1). In their nonnormative protests, they used mostly transitive constructions with animate subjects and some copular constructions with inanimate subjects.
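The co-occurrence percentages above are shares of each agency and transitivity pairing within one protest type. A minimal sketch of that tabulation follows, assuming each utterance is stored as a (protest type, agency, transitivity) triple. The rows are hypothetical toy data, not the study's 461 coded utterances, and `cooccurrence_shares` is an illustrative helper name.

```python
from collections import Counter

# Hypothetical coded utterances: (protest_type, agency, transitivity).
# These toy rows are for illustration only and are NOT the study's data.
coded = [
    ("normative", "inanimate", "middle"),
    ("normative", "inanimate", "middle"),
    ("normative", "animate", "transitive"),
    ("normative", "inanimate", "copular"),
    ("nonnormative", "animate", "transitive"),
    ("nonnormative", "inanimate", "copular"),
]


def cooccurrence_shares(rows, protest_type):
    """Return the share of each (agency, transitivity) pair within one protest type."""
    pairs = [(agency, trans) for kind, agency, trans in rows if kind == protest_type]
    counts = Counter(pairs)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: n / total for pair, n in counts.items()}


shares = cooccurrence_shares(coded, "normative")
for (agency, trans), share in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{agency} {trans}: {share:.0%}")
```

Computing shares separately for normative and nonnormative protests mirrors how the two protest types are reported separately in the text.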

Discussion
The results of Study 2 suggested that 5-year-olds and 7-year-olds did not differ on any of the measures, and the patterns were similar to those of the 5-year-olds in Study 1. This could be a ceiling effect due to our tasks, which did not require an abstract level of reasoning. In dealing with more abstract concepts like stereotypes, children's socio-moral reasoning shows progression after the preschool years (Conry-Murray & Turiel, 2010; Horn et al., 1999; Theimer et al., 2001). For instance, in evaluating two candidates, "a boy and a girl equally good at ballet", for a ballet club, school-aged children mostly sided with the candidate who goes against the stereotype, giving reasons like "Boys don't get a chance to take ballet" (Killen & Stangor, 2001, p. 183), as opposed to preschoolers, who sided with the gender-appropriate candidate in comparable contexts (Theimer et al., 2001).
acceptable. In fact, they continue to argue over the rule for 66 more lines and do not eventually agree on a rule. This pattern was also evident in the quantitative analyses.

General Discussion
Compared with older children, 3-year-olds took longer to reach agreement, and 80% of their protest utterances consisted of the same type of inanimate middle construction, which they used repeatedly in their negotiations.
In contrast to Example 1, the following example shows how 5-year-old boys arrive at a resolution more quickly. In Example 2, two 5-year-olds, Roger and Rafael, are playing the tube game (Study 1). Rafael was instructed to sort the items by shape and Roger by color. Rafael places the first token in the tube according to its shape.
beliefs about the rules. Thus, 3-year-olds almost exclusively took the objective and factual view of the event and portrayed the rules as "unalterable facts". This was also reflected in the relatively greater number of utterances it took them to reach agreement, their longer persistence on one rule, and their reluctance to see the alternative rule as possible, as in Example 1 above.
One explanation for why all age groups used inanimate subjects in their normative protests could be the instructions delivered by the adult experimenter, who used inanimate subjects (Alle grünen kommen hier hin 'All the green ones go here'). Children could have been repeating how the adult had presented the rules. However, despite receiving the same instructions, 3-year-olds still used normative expressions with inanimate subjects more than older children did.
Another finding was that 3-year-olds did not distinguish the two contexts as well as the older children did and enforced norms in the compatible context, where there were technically no norm violations. In the compatible context, the children were presented with both rules in each game and, due to our counterbalancing, the last rule each child learned was different. The appropriateness of an action was contingent upon the joint decision about which rule the children would play the game by. Since there were two possible rules, children had to take the perspective of their peer and monitor which rule their partner was orienting to. The limited perspective-taking skills of 3-year-olds might explain why they enforced norms in the compatible context (Perner, Brandl, & Garnham, 2003; Perner, Stummer, Sprung, & Doherty, 2002). That is, when one child placed the first token without verbally marking which rule he/she had in mind, the second child had to inhibit the rule he/she had in mind and adjust to his/her partner's rule. In these cases, some 3-year-olds did not adjust to the rules of their peers, but put forward their own rule and treated the other's actions as norm violations. The fact that they could not inhibit the rule they had in mind could also result from the limited executive functioning abilities of 3-year-olds (Carlson & Moses, 2001; Frye, Zelazo, & Palfai, 1995). Older children were better at adjusting to the rule of their peer. In fact, older children often disambiguated the rules from the beginning by asking their peer which set of rules to commit to, such as Machen wir Farbenspiel oder Tierespiel? 'Do we play the color game or the animal game?', or by explicitly signaling which set of rules they were orienting to, such as Zuerst das Formenspiel 'First the shape game'.
Overall, the findings are informative about children's understanding of conventional norms regarding game rules. The results corroborate other studies showing that by 3 years of age, children are aware of such norms and respond to violations of them by referring to the rule (Nucci & Nucci, 1982a, 1982b; Nucci & Turiel, 1978; Rakoczy, 2008; Rakoczy et al., 2009; Smetana, 1981; Turiel, 1983; Wyman et al., 2009). However, this initial normative understanding seems to be less discriminating and less flexible. At age 3, children seem to be rather dogmatic about game rules and view them as less alterable. Throughout the preschool years, normative understanding seems to become more fine-tuned (Kalish, 2005; Rakoczy, 2008; Smetana, 2006; Turiel, 2002). Children become more flexible in evaluating the right way and the wrong way to play the game, and are better able to divorce the current interactive context (a peer wanting to play the tube game and sort the items by shape) from the broader general context (the rule of the tube game is to sort the items by color). They are quicker to adjust to possible alternative rules, even if this contradicts their own knowledge. So what complements children's normative understanding during these years seems to be the appreciation of the arbitrariness and the negotiability of conventional norms within rule-governed games, as envisaged by Piaget (1932/1965).
In summary, when facing a conflict or ambiguity about game rules, children bring normative order to their peer interactions. How flexibly they arrive at a resolution and the linguistic strategies they deploy in their negotiations are informative about their normative understanding. Our results suggested that, although they know much about norms even in the preschool years, young children's actual use of social norms to structure their interactions with others, especially peers, continues to develop and becomes more flexible into the school years.