How Scheduling Can Bias Quality Assessment: Evidence from Food Safety Inspections

Accuracy and consistency are critical for inspections to be an effective, fair, and useful tool for assessing risks, quality, and suppliers — and for making decisions based on those assessments. We examine how inspector schedules could introduce bias that erodes inspection quality by altering inspector stringency. Our analysis of thousands of food safety inspections reveals that inspectors are affected by the inspection outcomes at their prior inspected establishment (outcome effects), citing more violations after they inspect establishments that exhibited worse compliance levels or trends. Moreover, consistent with negativity bias, the effect is stronger after observing compliance deterioration than improvement. Inspection results are also affected by when the inspection occurs within an inspector’s day (daily schedule effects): inspectors cite fewer violations after spending more time conducting inspections throughout the day, and when inspections risk prolonging their typical workday. Overall, our findings suggest that currently unreported violations would be cited if the outcome effects — which increase scrutiny — were triggered more often and if the daily schedule effects — which erode scrutiny — were reduced. For example, our estimates indicate that if outcome effects were doubled and daily schedule effects were fully mitigated, 11% more violations would be detected, enabling remedial actions that could substantially reduce foodborne illnesses and hospitalizations. Understanding and addressing these inspection biases can help managers and policymakers improve not only food safety but also process quality, environmental practices, occupational safety, working conditions, and infrastructure.

managerial decisions, including how to allocate quality improvement resources, which suppliers to source from, and how to penalize noncompliance. Inaccurate assessments can prevent managers, workers, customers, and neighbors from making well-informed decisions based on the risks imposed by an establishment's operations. Moreover, inspections that miss what they could have caught can undermine the inspection regime's ability to deter intentional noncompliance. In this study, we theorize and find evidence of several sources of bias that lead to inaccurate inspections. We also propose solutions, including alternative inspection scheduling regimes, that can improve inspection accuracy without increasing inspection costs.
Several studies have revealed various sources of inspection inaccuracy, yet little is known about inspector bias. We consider an unexplored type of bias that results from an operational decision: scheduling. Building on work from the behavioral sciences, we hypothesize how the sequence of inspections might affect the number of violations cited. Specifically, inspector stringency on a particular inspection may be influenced by (a) the outcomes of the inspector's prior inspection (prior inspection outcome effects, or, simply, outcome effects) and (b) its position within the inspector's day (daily schedule effects).
We study the influence of scheduling on inspection accuracy in the context of local health department food safety inspections of restaurants and other food-handling establishments. While these inspections need to accurately assess compliance in order to protect consumer health, the number of violations cited in these reports is a function of both the facility's actual hygiene and the inspector's stringency in detecting and recording violations. Because citing violations requires supporting documentation, inspector bias takes the form of underreporting the violations that are actually present rather than reporting nonexistent ones. Using data on thousands of inspections, we find strong evidence that inspectors' schedules affect the number of violations cited.
We hypothesize three ways in which an inspector's experience at one inspection affects the number of violations cited at his or her next inspection (outcome effects). Throughout this paper, we refer to an inspector's preceding inspection as his or her "prior" inspection and an establishment's preceding inspection as its "previous" inspection. (Figure 1 illustrates this distinction and depicts the relationships we hypothesize.) First, we hypothesize that an inspector's stringency will be influenced by the number of violations at his or her prior inspection. Those violations will affect the inspector's emotions and perceptions about the general compliance of the community of inspected establishments (via the salience of those recent inspection results), in turn altering his or her expectations and attitudes when inspecting the next establishment. This leads us to predict that having just conducted an inspection that cites more violations will lead the inspector to also cite more violations in the next establishment he or she inspects.
Electronic copy available at: https://ssrn.com/abstract=2953142
As predicted, we find that each additional violation cited in the inspector's prior inspection (of a different establishment) increases by 1.7% the number of violations he or she cites at the next establishment.
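To make the magnitude of this estimate concrete, the following minimal sketch converts the 1.7% figure into expected violation counts. It assumes that the effect compounds multiplicatively across additional prior violations, as it would under a log-linear count model; that functional form is an assumption made here for illustration, and the baseline of 3.0 violations is purely hypothetical.

```python
# Illustrative arithmetic only. The 1.7% figure comes from the text;
# the multiplicative compounding and the baseline count are assumptions.
PER_VIOLATION_EFFECT = 0.017


def expected_violations(baseline: float, extra_prior_violations: int) -> float:
    """Scale a baseline count by the outcome effect, compounding once per
    additional violation cited at the inspector's prior inspection."""
    return baseline * (1.0 + PER_VIOLATION_EFFECT) ** extra_prior_violations


if __name__ == "__main__":
    baseline = 3.0  # hypothetical expected count at the focal establishment
    for extra in (0, 1, 5):
        print(extra, round(expected_violations(baseline, extra), 3))
```

Under these assumptions, the effect of a single additional prior violation is small, but it accumulates when the prior establishment had many more violations than usual.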
Second, we hypothesize that trends matter, too: discovering more compliance deterioration (or less improvement) at one inspection affects inspectors' emotions and perceptions in ways that lead them to cite more violations at the next establishment. Supporting this hypothesis, we find that inspectors cite 2.1% more (fewer) violations after having inspected another establishment whose violation trend worsened (improved) by one standard deviation.
Third, we hypothesize, based on negativity bias, that this trend effect will be stronger following an inspection that found deterioration than following one that found improvement. Indeed, we find empirical evidence that the trend effect is asymmetric, occurring when compliance at the inspector's prior establishment deteriorates but not when it improves.
We then hypothesize two daily schedule effects. We first theorize that, over the course of a day, inspecting causes fatigue that erodes inspectors' stringency and leads them to cite fewer violations. We find empirical evidence to support this, observing that each subsequent hour an inspector conducts inspections during the day yields 3.7% fewer citations per inspection. 1 Second, we hypothesize that inspections that risk prolonging an inspector's workday will be conducted less stringently, which will lead ...
... across inspectors' propensity to report violations has identified the importance of their tenure, training, gender, and former exposure to the establishment (Macher, Mayo, and Nickerson 2011, Ball, Siemsen, and Shah 2017). Inspector accuracy among third-party inspection firms has been shown to be influenced by whether the establishment or its buyer hires the inspection firm and pays for the inspection (Ronen 2010, Duflo et al. 2013, Short and ...), the level of competition among inspection firms (Bennett et al. 2013), and whether the inspecting firm has cross-selling opportunities (Koh, Rajgopal, and Srinivasan 2013, Pierce and ...). In contrast to these demographic aspects of individual inspectors and structural dimensions of the relationship between the inspection firm and the inspected establishment, we explore a very different potential source of inspection bias: where the inspection falls within an inspector's schedule.

Scheduling and Task Performance
Our study also relates to research that has examined how work schedules affect task performance. This literature has, for example, proposed optimal scheduling of workforces (e.g., Green, Savin, and Savva 2013) and of periodic tasks such as machine inspections (e.g., Lee and Rosenblatt 1987). Studies of the sequencing of individual workers' tasks have shown that scheduling similar tasks consecutively to increase task repetition can improve performance by reducing delays incurred from switching tasks (e.g., Gino 2012, Ibanez et al. 2017) and that healthcare workers work more quickly later in a service episode (Deo and Jain 2015). We extend this work by focusing on the effects of work schedules on task quality in a setting that purports to provide inspections that are of consistent quality as the basis for a fair and objective monitoring regime.
A few studies have examined the relationship between work schedule and task quality. Dai et al. (2015) found that healthcare workers become less compliant with handwashing rules over the course of their shift. That study focused on adherence to a secondary task that was largely unobservable to others, where noncompliance was common, and where fatigue might lead workers to shift their attention from this secondary task toward their primary tasks. In contrast, our study focuses on a primary task, the outcome of which (violations cited) is explicitly observable to others and the visibility of which could deter variation. Moreover, whereas Dai et al. (2015) measured adherence dichotomously, we use a more nuanced scalar measure. Another study examined the decisions of eight judges and found that they were more likely to deny parole as they issued more judgments since their last break (whether overnight or midday), suggesting that repeated decisions might have caused mental depletion (Danziger, Levav, and Avnaim-Pesso 2011). Whereas judges became harsher as they made more decisions throughout the day, inspectors might behave differently, given that for an inspector, greater harshness (manifested as stringency) requires more work.
Finally, two studies examined how workers adjust their decisions based on their prior decisions. A study of MBA application assessments found that the higher the cumulative average of the scores an interviewer had given to applicants at a given moment on a given day, the lower he or she scored subsequent applicants that day, suggesting that decision-makers adjust their scores to maintain a consistent daily acceptance rate (Simonsohn and Gino 2013). Another study found that judges, loan reviewers, and baseball umpires were more likely to make "accept" decisions immediately after a "reject" decision (and vice versa), a form of decision bias (Chen, Moskowitz, and Shue 2016). Whereas these two studies find that subsequent decisions typically oppose prior ones, inspectors do not have explicit or self-imposed quotas or targets and, as we explain below, their emotions and perceptions may be affected by their prior tasks in ways that encourage subsequent decisions to be similar to prior ones. Additionally, we go beyond what prior work has considered by proposing that the magnitude of the effects from prior task outcomes will be asymmetric and will depend on whether the prior outcome was positive or negative.

Theory and Hypotheses
Quality assurance audits and inspections have detailed procedures to be followed in pursuit of accuracy. Yet, in practice, behavioral biases may influence an inspector's stringency. Whereas inspections are typically assumed to yield the same results no matter when they occur on the inspector's schedule, we hypothesize that inspection results will indeed be influenced by the type of experience inspectors have at their immediately prior inspection (of a different establishment), which we refer to as prior inspection outcome effects, and by when an inspection occurs during an inspector's daily schedule, which we refer to as daily schedule effects.

Prior Inspection Outcome Effects on Quality Assessment
3.1.1. Violation level at the inspector's prior inspected establishment. We theorize that inspectors will be influenced by the results of prior inspections. One such outcome effect is driven by whether the establishment an inspector just visited had many or few violations. There are two reasons why inspecting an establishment with many violations can imbue inspectors with a negative attitude that leads them to inspect more diligently at their next inspection, whereas inspecting a more compliant establishment can lead them to be less stringent in their subsequent inspection. First, an inspector's prior inspection can affect him or her emotionally. When more violations are cited at that prior establishment, its personnel are more likely to be dissatisfied and resentful, which can lead to hostile interactions with inspectors that can erode their goodwill and thus heighten their stringency during the next inspection. Merely observing such dissatisfaction and resentfulness can similarly affect inspectors via emotional contagion (Barsade 2002). Conversely, finding fewer violations at the prior inspection is more likely to bolster an inspector's goodwill at the next inspection. Second, the experience at the inspector's prior inspection can shape his or her perceptions of the overall behavior of establishments, which can influence his or her stringency at the subsequent inspection. Recently experiencing an event (such as compliance) increases its salience and results in more rapid recall. An inspector may therefore use the results of that inspection to update his or her estimate of typical compliance levels, relying on the availability heuristic (Tversky and Kahneman 1974) and seeking evidence at his or her next inspected establishment that supports these expectations, consistent with confirmation bias (Nickerson 1998). This becomes a self-fulfilling prophecy, where experiencing poor (good) compliance at the prior establishment leads inspectors to heighten (reduce) scrutiny at their next inspection and therefore detect more (fewer) violations. 2 We therefore hypothesize:
Hypothesis 1: The more (fewer) violations an inspector cites at one establishment, the more (fewer) violations he or she will cite at the next establishment.
3.1.2. Violation trend at the inspector's prior inspected establishment. An inspector's behavior is shaped not only by the prior establishment's level of compliance, but also by its change in compliance relative to its previous inspection. This second type of outcome effect also results from how the prior inspection affects the inspector's emotions and perceptions.
The inspector's emotional response (through emotional contagion and interactions) at his or her prior establishment will depend on the trend there because the expectations of the establishment's personnel will be based on its previous inspection; they will be pleased or displeased according to whether their violation count has decreased or increased. After visiting an establishment with greater improvement, we predict the inspector will exhibit a more positive temperament and approach his or her next inspection with greater empathy and less stringency.
An inspector's perceptions, too, may be biased by the change in violations at the prior establishment. Many inspectors view inspections as a cooperative endeavor with the regulated entity to help improve business operations and safeguard stakeholders (e.g., May and Wood 2003, Pautz 2009, Pautz 2010). Improved compliance may therefore be attributed to management taking the rules and regulations seriously (that is, cooperating), whereas worsened compliance may be attributed to management ignoring or deliberately flouting the rules (definitely not cooperating). Improved compliance therefore confirms a cooperative relationship that includes inspectors bestowing leniency (Hawkins 1983), which we assert can increase an inspector's faith that the overall community of inspected establishments is cooperating and thus lead him or her to be less stringent in the next inspection. Similarly, worsened compliance can lead an inspector to believe that the establishments are "defecting" from their commitment to compliance (Ayres and Braithwaite 1992), which we assert can lead the inspector to update beliefs about the overall community of inspected establishments as not being cooperative, triggering him or her to be more stringent in the next inspection. We therefore hypothesize:
Hypothesis 2: The more an establishment's compliance has deteriorated (improved), the more (fewer) violations an inspector will record at the next establishment.
2 Other types of decisions might exhibit the opposite bias, whereby successive decisions are negatively autocorrelated, akin to the law of small numbers, the gambler's fallacy, sequential contrast effects, or quotas (Chen, Moskowitz, and Shue 2016). These effects are likely weak in the case of inspections. First, the law of small numbers and the gambler's fallacy, in which the decision-maker underestimates the likelihood of sequential streaks occurring by chance, are less likely to apply to inspectors. Instead, inspectors can be expected to predict a high likelihood that the establishments they sequentially inspect will exhibit similar compliance (that is, sequential streaks) because they share external factors that affect their compliance, including their competition, regulatory knowledge, and requirements about whether they must disclose their inspection results (such as restaurants in Los Angeles, New York, and Boston being required to post restaurant grade cards). Second, sequential contrast effects, in which the decision-maker's perception of the quality of the current establishment is negatively biased by the quality of the previous one, are ameliorated because inspectors are extensively trained to evaluate quality based on what they observe and thus have well-defined evaluation criteria that reduce the influence of prior inspections as temporary reference points. Moreover, each inspection takes significant time and often involves additional time traveling across inspected entities, so decisions are farther apart than sequential instantaneous decisions that may lead to unconscious contrasts of establishments. Third, quotas for the number of positive (or negative) decisions (in terms of violations cited or overall assessments of an establishment) would imply that fewer positive (or negative) decisions could be made after a prior positive (or negative) decision. Though the immediate prior decision would not directly matter, the cumulative prior decisions could. However, inspectors typically lack quotas or targets.

3.1.3. Violation trend at the inspector's prior inspected establishment: Asymmetric effects of deterioration versus improvement. According to the principle of negativity bias, negative events are generally more salient and dominant than positive events (Rozin and Royzman 2001). Negative events instigate greater information processing to search for meaning and justification, which in turn strengthens the memory and tends to spur stronger and more enduring effects in many psychological dimensions (Baumeister et al. 2001).
Negativity bias can affect the impact of the prior inspection's violation trend on the inspector's emotions and perceptions. First, negativity bias implies that for the inspected establishment's staff, the negative emotional effect of a drop in compliance will be stronger than the positive emotional effect of an improvement. This would result in stronger conveyance to inspectors of negative emotions associated with a drop in compliance and weaker conveyance of positive emotions associated with improvement. An inspector will then absorb more negative emotions after the negative finding than positive emotions after the positive finding. Moreover, as argued by Barsade (2002), mood contagion might be more likely for unpleasant emotions because of higher attention and automatic mimicry. These asymmetries in the extent to which declining versus improving conditions affect inspectors' emotions will lead, in turn, to asymmetric effects on the strength of the resulting positive or negative outcome effects.
Second, the salience of negative outcomes may have a stronger effect on inspectors' perceptions of how the establishments they monitor generally think about compliance, which can shape their stringency in a subsequent inspection. This is due to the status-quo bias: with the status quo acting as the reference point, negative changes are perceived as larger than positive changes of the same magnitude (Samuelson and Zeckhauser 1988, Kahneman 2003). We therefore hypothesize:
Hypothesis 3: Observing deteriorated conditions at an establishment will increase the inspector's stringency at the next establishment to a greater extent than observing improved conditions will reduce his or her stringency.
To summarize, Figure 1 depicts the relationships we theorize in our first three hypotheses.

Figure 1. Prior Inspection Outcome Effects
This diagram represents the history of Inspector i (downward arrow) and of two establishments, p and e (left to right). The shaded box represents the focal inspection, in which Inspector i inspects Establishment e. We refer to an inspector's preceding inspection as his or her "prior" inspection and an establishment's preceding inspection as its "previous" inspection. In this diagram, Inspector i inspects Establishment p and then Establishment e, the focal inspection. H1 refers to the influence of the former on the latter. H2 and H3 refer to how the focal inspection is influenced by Establishment p's change in compliance compared to its previous inspection (H2) and propose that the effect is stronger when the outcome is negative (H3). Prior research has focused, in contrast, on the relationship between an establishment's previous and focal inspections (depicted by the dashed arrow), such as an establishment's improvements as it undergoes successive inspections, the lag between inspections, inspectors' familiarity with an establishment from having inspected it before, and other factors related to the focal establishment's inspection history (e.g., Ko, Mendeloff, and Gray 2010, Macher, Mayo, and Nickerson 2011, Toffel, Short, and Ouellet 2015, Ball, Siemsen, and Shah 2017, Mani and Muthulingam 2018).
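The distinction between an inspector's "prior" inspection and an establishment's "previous" inspection can be sketched in code. The example below is a minimal illustration, not the procedure used in the study: the Inspection record, its field names, and the link_inspections helper are all hypothetical, and the sketch does not enforce the requirement that the prior inspection be of a different establishment.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Inspection:
    """Hypothetical record layout, used only for illustration."""
    inspector: str
    establishment: str
    start: datetime
    violations: int
    prior: Optional["Inspection"] = None     # inspector's preceding inspection
    previous: Optional["Inspection"] = None  # establishment's preceding inspection


def link_inspections(inspections):
    """Walk inspections in chronological order and attach to each one the
    same inspector's preceding inspection ("prior") and the same
    establishment's preceding inspection ("previous")."""
    last_by_inspector = {}
    last_by_establishment = {}
    for insp in sorted(inspections, key=lambda i: i.start):
        insp.prior = last_by_inspector.get(insp.inspector)
        insp.previous = last_by_establishment.get(insp.establishment)
        last_by_inspector[insp.inspector] = insp
        last_by_establishment[insp.establishment] = insp
    return inspections


if __name__ == "__main__":
    # Establishment p was last inspected by another inspector, j.
    p_old = Inspection("j", "p", datetime(2015, 1, 5, 9, 0), 2)
    p_new = Inspection("i", "p", datetime(2015, 6, 1, 9, 0), 5)
    focal = Inspection("i", "e", datetime(2015, 6, 1, 11, 0), 4)
    link_inspections([focal, p_new, p_old])
    # H1 uses the level at the prior inspection; H2 and H3 use its trend.
    print(focal.prior.violations)
    print(focal.prior.violations - focal.prior.previous.violations)
```

In this sketch, the focal inspection's H1 input is the violation count at the inspector's prior inspection, while the H2 and H3 input is that prior establishment's change relative to its own previous inspection.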

Daily Schedule Effects on Quality Assessment
3.2.1. Inspector fatigue. Inspectors are influenced not only by the results of prior inspections, but also by the sequencing of inspections within the day. Their work typically consists of a sequence of evaluative tasks that include physical tasks (such as manually examining the dimensions of a part or the temperature of a freezer) and mental tasks (such as interviewing an employee or determining whether or not a set of observations is within acceptable standards). As these tasks are executed, physical and mental fatigue will increase (Brachet, David, and Drechsler 2012). Furthermore, experimental evidence indicates that mental fatigue itself increases physical fatigue (Wright et al. 2007, Marcora, Staiano, and Manning 2009).
Over the course of a day, inspectors' physical and mental fatigue will reduce their physical and cognitive effort. This undermines stringency, which requires physical and cognitive efforts such as moving throughout the facility, interviewing personnel, waiting to observe work, executing procedures such as taking measurements, and conducting unpleasant tasks (such as observing storage practices in a walk-in freezer). Once an attribute is observed, inspectors need to recall and interpret the relevant standards to decide whether there is a violation and, if so, to document it. Each step must be executed according to rules that increase the complexity even of tasks that might appear simple to the untrained eye. Moreover, mental effort is required to make decisions against the status quo; as inspectors grow more tired during the day, they may become more willing to accept the status quo (Muraven and Baumeister 2000, Danziger, Levav, and Avnaim-Pesso 2011), which, in the context of inspections, can take the form of passing inspection items. Finally, mental effort is required to withstand the social confrontations that can erupt when a finding of noncompliance is disputed by those working at the establishment, who may genuinely disagree and for whom, in any case, much may be at stake in terms of reputation and sales. Citing violations can also provoke threats of appeals and lawsuits. Anticipating such responses, inspectors who are growing fatigued may exert less effort and seek to avoid confrontation, both of which increase leniency. For all these reasons, we hypothesize:
Hypothesis 4: Inspectors will cite fewer violations as they spend more time conducting inspections throughout the day.

3.2.2. Potential shift prolonging. In many settings, workers have discretion over their pace, which can lead them to prolong tasks to fill the time available (Hasija, Pinker, and Shumsky 2010) and to conduct work more quickly when facing higher workloads (KC and Terwiesch 2012, Berry Jaeker and Tucker 2017). Beyond these workload-related factors, we propose that inspectors will inspect less stringently when they expect to work later than usual (that is, beyond when they typically end work for the day). We hypothesize that inspectors' reluctance to suspend an inspection once under way (which would require them to bear the travel cost again the next day to finish the inspection), combined with a desire to finish at their typical time, will create pressure to speed up and inspect less thoroughly. As workers approach their typical end-of-shift time, accomplishing whatever remaining work cannot be postponed can become increasingly pressing as their perceived opportunity cost of time increases. The desire to speed up in these circumstances can result in increased reliance on using workarounds and cutting corners (Oliva and Sterman 2001), which in turn can reduce the quality of the work performed.
Because properly conducting inspections requires carefully evaluating a series of individual elements to identify whether each is in or out of compliance, omitting or expediting tasks to avoid prolonging the shift will result in a less comprehensive inspection with fewer violations detected and cited. We therefore hypothesize:
Hypothesis 5: Inspectors will cite fewer violations at inspections when they are at risk of working beyond the typical end of their shift.

Empirical Context: Food Safety Inspections
Our hypotheses are ideally tested in an empirical context in which inspectors work individually, which avoids the challenge of discerning individuals' behaviors from those of co-inspectors. Food safety inspections conducted by local health departments fulfill this criterion because environmental health officers are individually responsible for the inspection of restaurants, grocery stores, and other food-handling establishments to protect consumers by monitoring compliance and educating kitchen managers in their assigned geographical area. Moreover, food safety inspections, commonly known as restaurant health inspections despite their broader scope, are designed to minimize foodborne illness; noncompliance can jeopardize consumer health. The quality of these assessments, and their ability to safeguard public health, depends on the accuracy of inspectors.
Foodborne disease in the United States is estimated to cause 48 million illnesses resulting in 128,000 hospitalizations and 3,000 deaths each year, imposing billions of dollars of medical costs and costs associated with reduced productivity and with pain and suffering (..., Minor et al. 2015, Glaeser et al. 2016).
Because inspectors need evidence to justify citing violations (and thus can cite violations only if they are truly present), studies of inspection bias (e.g., Bennett et al. 2013, Duflo et al. 2013) are based on the assumption that deviations from the true number of violations are due only to underdetection and that bias does not lead inspectors to cite nonexistent violations. This assumption was validated in our interviews with inspectors and underlies our empirical approach. Moreover, because violations are defined by regulations grounded in science-based guidance for protecting consumers, each violation item is relevant.
We purchased data from Hazel Analytics, a company that gathers food safety inspection data from several local governments across the United States, processes the information to create electronic datasets, and sells these datasets to researchers and to companies (such as restaurant chains) interested in monitoring their licensees. These datasets include information about the inspected establishment (name, identification number, address, city, state, ZIP code), the inspector, the inspection type, the date, the times when the inspection began and ended, the violations recorded, and, where available, the inspector's comments on those violations. To learn more about the setting, we observed and interviewed inspectors, environmental health department managers, store operators, and directors of food safety and quality assurance at retail companies. Our interviews with managers and inspectors at health inspection departments represented in our dataset indicate that inspectors have limited discretion over scheduling. Each inspector is responsible for inspecting all establishments within an assigned geographic territory. Inspectors rotate to different territories every two to three years. They are instructed to schedule their inspections by prioritizing establishments based on their due dates, which are computed for routine (and routine-education) inspections based on previous inspection dates and the required inspection frequency for an establishment type (based on the riskiness of its operations). Particular events (such as a consumer complaint or the need to verify that a severe violation has been rectified) may trigger more immediate due dates; we control for these in our models via the inspection-type dummies described below. To minimize travel time, inspectors are instructed to group inspections with similar due dates by geographic proximity.
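The due-date prioritization described above can be illustrated with a minimal sketch. The exact formula that health departments use is not given here, so the sketch assumes a due date equal to the previous inspection date plus an interval of 365 days divided by the required number of inspections per year; the function names, the tuple layout, and the example frequencies are hypothetical. Grouping by geographic proximity is not modeled.

```python
from datetime import date, timedelta


def due_date(previous_inspection: date, inspections_per_year: int) -> date:
    """Assumed rule: next routine inspection is due one interval after the
    establishment's previous inspection."""
    interval_days = 365 // inspections_per_year
    return previous_inspection + timedelta(days=interval_days)


def prioritize(establishments):
    """Order establishments by due date, earliest first.

    establishments: list of (name, previous_inspection_date,
    inspections_per_year) tuples, a hypothetical layout."""
    return sorted(establishments, key=lambda e: due_date(e[1], e[2]))


if __name__ == "__main__":
    queue = prioritize([
        ("lower-risk store", date(2015, 1, 1), 1),
        ("higher-risk kitchen", date(2015, 3, 1), 2),
    ])
    for name, previous, frequency in queue:
        print(name, due_date(previous, frequency))
```

Under this assumed rule, a higher-risk establishment with a more recent previous inspection can still come due before a lower-risk one, which is why the order of an inspector's visits is largely determined by factors other than the inspector's preferences.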
Although inspectors also carry out many administrative duties (such as reviewing records, answering emails, and attending department meetings at the office), most of their work is inspections and the associated travel. As they prepare to conduct inspections, inspectors review the establishments' most recent inspections. Traveling between their office and establishments to inspect often accounts for a substantial portion of the day because of the geographical dispersion in the areas covered by our data. Inspectors are discouraged from working overtime.
When inspectors arrive at an establishment, they ask to speak to the person in charge and encourage this person to accompany them during their visit. During the inspection, they inspect the establishment (e.g., taking temperatures), observe workers' behaviors (e.g., whether and how they use gloves and wash their hands), and ask many questions to understand the processes (e.g., receiving or the employee health policy). As they walk through the establishment, the inspectors point out the violations they find, explain the public health rationale, and ask the personnel to correct them straightaway when possible. Though any immediately corrected violations are still marked as violations on the inspection form, this approach ensures that (a) the violations are corrected as soon as possible to improve food safety and (b) the personnel learn how to be compliant. Because of (a) the immediate corrections, (b) the instruction about regulations and how to improve the processes in the future, and (c) the incentive for compliance resulting from effective monitoring and enforcement, the citing of each violation (or the failure to cite it) has a real impact on public health. Thus, reducing the underreporting of violations resulting from the effects we identify would improve actual compliance and health outcomes.

Dependent and independent variables.
We measure violations as the number of violations cited in each inspection, a typical approach used by others (e.g., Helland 1998, Stafford 2003, Langpap and Shimshack 2010). We create two indicator variables to distinguish whether the inspector's prior establishment had improved, deteriorated, or not substantially changed its violation count compared to its previous inspection.

Prior inspected establishment's violations
We classify an establishment's violation trend as improved saliently (or deteriorated saliently) if its current inspection yielded at least two fewer (more) violations than its previous inspection. We measure an inspector's schedule-induced fatigue at a given inspection as time inspecting earlier today-computed as the cumulative number of hours (with minute precision) inspectors spent onsite in their prior inspections that day before the focal inspection-to better account for the fact that some inspections take longer than others and that longer (

Control variables.
We measure inspector experience as the number of inspections the inspector had conducted (at any establishment) since the beginning of our sample period by the time he or she began the focal inspection.
We create an indicator variable, returning inspector, coded 1 when the inspector of the focal inspection had inspected it before and 0 otherwise.
We create an indicator variable, lunch period, coded 1 when the inspection began between 11:00 am and 3:59 pm, the period that tends to be especially busy for kitchen operations.
We create a series of indicator variables to control for whether the inspection is the establishment's nth inspection (2nd through 10th or more), each of which indicates whether an inspection is the establishment's second, third (and so on) inspection in our sample period; first is the omitted category.
We create a series of inspection-type dummies to indicate whether the inspection was (a) routine, (b) routine-education, (c) related to permitting, (d) due to a complaint, (e) an illness investigation, or (f) a follow-up. Routine inspections are conducted to periodically monitor establishments; routine-education inspections are similar to routine inspections but also include an educational presentation to train establishment staff. These two types make up 79% of the inspections in our estimation sample. Permit inspections are conducted when establishments change ownership or undergo construction, upgrades, or remodeling. Complaint inspections are triggered by the local health department receiving a complaint.
Because Camden does not classify particular inspections as triggered by complaints but does record complaint dates and the inspectors assigned to investigate them, complaint risk inspections refers to all inspections those inspectors conducted the day-and the day after-they were assigned to investigate a complaint. Illness investigation inspections are those conducted to investigate a possible foodborne illness (food poisoning). A follow-up inspection (or re-inspection) is conducted to verify that violations found in a preceding inspection have been corrected and is therefore of limited scope. Other inspections includes visits to confirm an establishment's deactivation/closure and inspections of mobile establishments, vending machines, and temporary events such as outdoor festivals; this is the omitted category in our empirical specifications.

Empirical Specification
We test our hypotheses by estimating (via quasi-maximum likelihood) a conditional fixed-effects Poisson model that predicts the number of violations cited in an inspection. Our specification exploits the double panel structure of the data, where an inspection could be viewed as both the nth inspection of establishment e and the jth inspection of inspector i in our sample.

Our independent variables include (1) the inspector's prior inspected establishment's violations,
that is, the number of violations that inspector i cited at his or her most recent inspection of any other establishment; (2) the prior inspected establishment's violation trend or, in some specifications, the two variables that indicate particular ranges of that variable: prior inspected establishment saliently improved and prior inspected establishment saliently deteriorated; (3) inspector i's time inspecting earlier today; and (4) an indicator for whether the inspection was potentially shift-prolonging.
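The construction of two of these variables can be sketched in code. The following is a minimal illustration, not the authors' actual data pipeline: the function names and the record layout (dictionaries with `inspector`, `start`, and `end` keys) are assumptions made for this example. It applies the two-violation threshold that defines the salient-trend indicators and accumulates onsite hours within each inspector-day to obtain time inspecting earlier today.

```python
from collections import defaultdict
from datetime import datetime


def trend_indicators(current_violations, previous_violations):
    """Return (saliently_improved, saliently_deteriorated) as 0/1 flags.

    A change of at least two violations in either direction counts as salient;
    a change of -1, 0, or +1 is the baseline (both flags are 0).
    """
    change = current_violations - previous_violations
    return int(change <= -2), int(change >= 2)


def time_inspecting_earlier_today(inspections):
    """Cumulative onsite hours in the inspector's prior inspections that day.

    `inspections` is a list of dicts with hypothetical keys 'inspector',
    'start', and 'end' (datetimes). The result is aligned with the input order.
    """
    hours_so_far = defaultdict(float)
    result = [0.0] * len(inspections)
    # Process each inspector's inspections chronologically.
    order = sorted(
        range(len(inspections)),
        key=lambda k: (inspections[k]["inspector"], inspections[k]["start"]),
    )
    for k in order:
        row = inspections[k]
        day_key = (row["inspector"], row["start"].date())
        result[k] = hours_so_far[day_key]
        onsite_hours = (row["end"] - row["start"]).total_seconds() / 3600.0
        hours_so_far[day_key] += onsite_hours
    return result


# Illustrative records (made up for this example, not from the dataset).
example = [
    {"inspector": "A", "start": datetime(2015, 3, 2, 9, 0), "end": datetime(2015, 3, 2, 10, 30)},
    {"inspector": "A", "start": datetime(2015, 3, 2, 11, 0), "end": datetime(2015, 3, 2, 11, 45)},
    {"inspector": "A", "start": datetime(2015, 3, 2, 13, 0), "end": datetime(2015, 3, 2, 14, 0)},
    {"inspector": "A", "start": datetime(2015, 3, 3, 9, 0), "end": datetime(2015, 3, 3, 10, 0)},
]
print(time_inspecting_earlier_today(example))
print(trend_indicators(5, 2), trend_indicators(1, 4), trend_indicators(3, 2))
```

Travel time between establishments is deliberately excluded here, because the measure counts only the time spent onsite; the counter resets at the start of each inspector-day.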
The model also includes several controls. First, we control for inspector experience (Macher, Mayo, and Nickerson 2011, Short, Toffel, and Hugill 2016). We control for returning inspector because inspectors who return to an establishment they had inspected before tend to behave differently than inspectors who are there for the first time (Short, Toffel, and Hugill 2016, Ball, Siemsen, and Shah 2017).
We include lunch period to control for the possibility that an establishment's cleanliness might vary over the course of a day and because prior research indicates that many individual behaviors are affected by time of day (Linder et al. 2014, Dai et al. 2015). We also include two sets of fixed effects to denote the month and the year of the inspection. We include a series of fixed effects to control for the establishment's nth inspection (2nd through 10th or more) because research has shown that, in other settings, establishments improve compliance over subsequent inspections (Ko, Mendeloff, and Gray 2010, Toffel, Short, and Ouellet 2015).
Because different types of inspections might mechanically result in different numbers of violations (e.g., due to different scopes), the model includes inspection type dummies.
Finally, we include fixed effects for every inspector-establishment combination. These inspector-establishment dyads control for all time-invariant inspector characteristics (such as gender, formal education, and other factors that might affect his or her average stringency) and all time-invariant establishment characteristics (such as cuisine type and neighborhood). Thus, our specification identifies changes in the number of violations that a particular inspector cited when inspecting a given establishment on different occasions. Including inspector-establishment fixed effects also avoids concerns that our results are driven by spatial correlation; specifically, the concern that proximate establishments that inspectors tend to visit sequentially might exhibit similar violation counts because they share neighborhood characteristics that might affect the supply of and demand for compliance. Including fixed effects for inspector-establishment dyads is more conservative than including separate sets of fixed effects for inspectors and for establishments; a robustness test that includes these separate sets of fixed effects yields similar results.

Identification
We took several steps to ensure that our empirical approach tests our hypothesized relationships, controlling for or ruling out alternative plausible explanations. For example, the positive correlation between the number of violations that inspectors cite at a focal establishment and at their prior establishment could result not only from the mechanism represented in H1 but also if inspectors clustered on their schedules the establishments they expected to yield many (or few) violations. Our inspector interviewees revealed that they in fact tended to cluster inspections of establishments near each other in order to minimize travel time. While violations might be spatially correlated due to demographic clustering, our inclusion of fixed effects for inspector-establishment dyads controls for such time-invariant establishment characteristics.
We test our hypothesis that inspector fatigue reduces inspector stringency (H4) by looking for evidence that fewer violations are cited at inspections conducted later in an inspector's daily schedule.
But that could have two other explanations. First, daily trends in customer visits, staffing levels, and staff cleaning efforts could result in better hygiene conditions later in the day. Our inspector interviews indicated, however, that many violations reflect longer-term problems whose propensity does not change throughout the day (e.g., sinks functioning improperly) and that hygiene conditions often get worse (not better) as establishments serve more customers, which would bias against our hypothesized effect. Our specifications nonetheless include fixed effects for time of day to control for potential variation in establishments' cleanliness at different time periods of the day. Second, inspectors might intentionally schedule "dirtier" establishments (those with historically more violations and thereby expected to have more violations) earlier in their daily schedule, leaving "cleaner" establishments for later in their schedule. However, two supplemental analyses yielded no evidence for this. A simple correlation analysis reveals that an establishment's previous inspection violation count is not significantly related to when in an inspector's daily schedule its focal inspection is conducted (Pearson's χ² = 285, p = 0.93). Moreover, Poisson regression results enable us to rule out that inspectors intentionally sequenced, to any meaningful degree, their day's inspections based on establishments' previous violations. Specifically, a Poisson regression that predicts how long an inspector has already been conducting inspections that day before he or she begins to inspect the focal establishment (time inspecting earlier today) based on the focal establishment's previous violation count (and including inspector-day fixed effects as controls) indicates that more violations in a previous inspection predict that an establishment's subsequent inspection will be scheduled slightly later in the inspector's shift (β = 0.016, S.E. = 0.006, with standard errors clustered by inspector-day), which would be a bias against our hypothesized effect.
Finally, we test our hypothesis that an inspection being potentially shift-prolonging reduces inspector stringency (H5) by assessing whether shift-prolonging inspections yield fewer violations.
However, shift-prolonging inspections might also yield fewer violations if, as an inspector's normal shift end-time approaches, he or she intentionally chooses to inspect establishments anticipated to yield fewer violations in order to minimize how late he or she will need to work, presuming "cleaner" establishments can be inspected more quickly. Two supplemental analyses, however, rule that out. First, establishments with a potentially shift-prolonging inspection averaged 3.1 violations in their previous inspection, significantly more than the average of 2.3 violations in the previous inspection of establishments whose inspections were not potentially shift-prolonging (Pearson's χ² = 243, p < 0.01). Second, a logistic regression indicates that the probability of an establishment's inspection being potentially shift-prolonging slightly increases if its previous inspection yielded more violations. Specifically, regressing a dummy indicating whether an establishment's inspection is potentially shift-prolonging on the violation count from its previous inspection and inspector-day fixed effects yields a significant positive coefficient on the violation count (β = 0.104, S.E. = 0.013, clustered by inspector-day). Both results would bias against our hypothesized effect.

Model results.
We estimate the count model using fixed-effects Poisson regression and report standard errors clustered by establishment (Table 2). Our results are robust to several alternatives: clustering standard errors by inspector, estimating the model with negative binomial regression with conditional fixed effects, and estimating the model using ordinary least squares regression predicting log violations. Multicollinearity is not a serious concern, given that variance inflation factors (VIFs) are less than 1.68 for all hypothesized variables and less than 6.01 for all variables except three of the inspection-type indicators. Because our specifications control for a variety of factors that affect the number of violations cited, we interpret coefficients on the hypothesized variables as evidence of bias, as done in prior studies (e.g., Chen, Moskowitz, and Shue 2016). Because deviations from the true number of violations are assumed to result only from underdetection (as described above), we interpret negative coefficients to indicate the extent of underdetection, whereas positive coefficients indicate the extent to which underdetection is avoided. We interpret effect sizes based on incidence rate ratios (IRRs).
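As a brief illustration of how IRRs relate to Poisson coefficients, the sketch below exponentiates three coefficients reported in this paper (for returning inspector, prior inspected establishment saliently deteriorated, and potential extent of shift-prolonging) and converts each to a percentage change in expected violations. The helper names are ours and are introduced only for this example.

```python
import math


def irr(coefficient):
    """Incidence rate ratio implied by a Poisson regression coefficient."""
    return math.exp(coefficient)


def percent_change(coefficient):
    """Percentage change in the expected count for a one-unit increase."""
    return (math.exp(coefficient) - 1.0) * 100.0


# Coefficients as reported in the text; labels are shorthand for this example.
reported = {
    "returning inspector": -0.114,
    "prior establishment saliently deteriorated": 0.079,
    "potential extent of shift-prolonging (per hour)": -0.068,
}

for name, beta in reported.items():
    print(f"{name}: IRR = {irr(beta):.3f}, change = {percent_change(beta):+.1f}%")
```

An IRR below 1 corresponds to fewer cited violations and an IRR above 1 to more; for example, an IRR of 0.892 corresponds to roughly 11% fewer violations.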
We test Hypotheses 1, 2, 4, and 5 using Model 1. We begin by interpreting the coefficients on our control variables. The estimated coefficient on inspector experience is positive and statistically significant, suggesting that, all else constant, the number of violations cited per inspection increases as the inspector conducts inspections over time, albeit by a small amount on an inspection-by-inspection level.
The negative and statistically significant coefficient on returning inspector (β = -0.114, p < 0.01, IRR = 0.892) indicates that inspectors who return to an establishment cite 11% fewer violations than inspectors who had not inspected that establishment before, which is consistent with prior studies. Considering time-of-day effects, we note that, on average, inspections conducted during the lunch period cite 5% fewer
To test H3, Model 2 replaces prior inspected establishment's violation trend with the indicator variables prior inspected establishment saliently improved and prior inspected establishment saliently deteriorated. The baseline condition occurs when the prior inspected establishment had no more than one violation more or less than it had in its previous inspection. Compared to this baseline condition, we find that inspectors cite more violations after their prior inspected establishment exhibited salient deterioration (β = 0.079, p < 0.01, IRR = 1.082). The IRR indicates that, on average, an inspector who has just inspected an establishment with salient deterioration will report 8% more violations in the focal inspection. However, we find no evidence that observing salient improvement in the prior inspected establishment has any effect on the number of violations cited in the focal inspection. A Wald test indicates that these effects significantly differ (Wald χ² = 4.74, p < 0.05), which supports H3: the spillover effect on the focal inspection of having observed salient deterioration in the prior inspected establishment is statistically significantly stronger than the spillover effect of having observed salient improvement.
Model 1 also supports both of our hypothesized daily schedule effects. Thus, these behavioral effects have real implications because they affect citation rates of actual violations.
If inspectors' detection rates were improved so that they cited the violations that are currently going unreported due to the scheduling biases we identify, establishments could improve their food safety practices in two ways. First, they can improve compliance immediately because many violations can be instantly rectified. Second, they can improve future compliance because citations not only motivate establishments to improve the processes that generated them but also more broadly motivate compliance to prevent other violations, which is the deterrent intent of monitoring and enforcement. Thus, citations prompt behavioral responses that improve compliance, which in turn prevents foodborne health incidents.
Prior research that reveals decision biases tends to focus on quantifying their magnitudes.
Improving the accuracy of inspectors' citations of violations is in itself a very important outcome, one that organizations and governments care deeply about. We go beyond that typical approach by also estimating the real-world consequences. Our efforts to translate our primary findings (how scheduling affects the citations of violations) into their broader societal impacts (health consequences) would be equivalent to, for example, Chen, Moskowitz, and Shue (2016)

Robustness Tests
We conduct several analyses to confirm the robustness of our findings. Our primary results are based on a conservative approach that includes fixed effects for inspector-establishment dyads. We find similar results whether we instead include establishment fixed effects or separate sets of fixed effects for inspectors and for establishments (estimating the latter with Poisson regression led to convergence problems that led us to instead use OLS regression to predict log (violations+1)) or if we include the leave-out-means instead of individual fixed effects (Chen, Moskowitz, and Shue 2016). Our results are robust to omitting the establishment's nth inspection (2nd through 10th or more) fixed effects or replacing them with a continuous measure of the establishment's inspection sequence. All of our results also hold when we remove the inspection-type indicators and instead control for whether the inspection was routine. Our results are also robust to suppressing any of the other fixed effects.
To assess whether unusually busy days, which might make inspectors especially fatigued, might be driving our schedule-induced fatigue (H4) results, we re-estimated our models on the subsample of inspector-days with no more than six inspections (the 99th percentile). Our hypothesized results are robust to this subsample test.
Our results regarding the effects of schedule-induced fatigue (H4) hold even when we measure it using any of the following four alternative approaches rather than the time spent conducting prior inspections on the day of the focal inspection. In our first alternative, we calculate the number of prior inspections today, coded 0 for an inspector's first inspection of the day, 1 for the second, and so on.
Though this does not account for the fact that some inspections take longer than others and that longer This incorporates the time inspecting earlier today (that is, the time spent actually conducting inspections) and the time that elapsed between those inspections, which could also add to fatigue. Our last two alternative approaches accommodate the concern that fatigue might have increased the duration of the predicted durations derived from an ordinary least squares regression model, with a log-transformed outcome variable and including the covariates from the corresponding main specification.
The evidence supporting our hypotheses is also robust to including, as additional controls in our primary models, indicator variables denoting the day of the week the inspection occurred. Our results are robust to substituting our control for lunch period with three time-of-day periods to designate when the inspection began: breakfast period (midnight-10:59 am), lunch period (
Our main specification controls for inspectors' experience at the focal establishment via inspector experience and returning inspector, which are complemented by the establishment's inspection history (i.e., indicators for whether the inspection is the establishment's nth inspection-2nd through 10th or more) and the indicators for the inspector-establishment dyad. Our results are robust to replacing inspector's complacency instead of learning (Ball, Siemsen, and Shah 2017). Our results provide evidence of this link, with site-specific experience leading to inspectors citing fewer violations.
Finally, our results are robust to controlling for weekly workload or monthly workload, measured as the number of inspections the inspector conducted the week or the month of the focal inspection. As an aside, the estimated coefficient on workload is not significant, suggesting that despite the prevalence of workload effects in other settings, inspectors in our sample are resilient to them. This shows that inspection outcomes are difficult to influence and makes our identified effects even more impressive (Prentice and Miller 1992).

Extensions
Our primary results test the effects on inspector scrutiny of conducting inspections that risk prolonging the inspector's shift (H5). We also investigate whether the extent to which the shift might be prolonged matters (Appendix C in the online supplement). To measure the extent to which an inspection might reasonably be anticipated to extend beyond the inspector's typical end-of-shift time, we calculate potential extent of shift-prolonging (in fractions of hours) by subtracting the "inspector's typical end-of-shift time" from the "inspection's anticipated end time" (both defined in Section 4.2.1), using the difference when it is positive and otherwise coding this variable as 0. For both metrics, we use timestamps at the minute level but convert them to hours (and fractions of hours). Including this continuous variable instead of the binary potentially shift-prolonging lets us examine how the magnitude of shift-prolonging affects inspector scrutiny. As reported in the online supplement (Columns 3-4 of Table C1), the negative, statistically significant coefficient on potential extent of shift-prolonging (β = -0.068, p < 0.01, IRR = 0.934) indicates that inspectors cite 6.6% fewer violations for each hour that the inspection risked going past their typical end-of-shift time. We also pursued a more flexible approach by creating a series of dummy variables denoting the following ranges of potential extent of shift-prolonging: (a) up to 0.5 hours (the omitted baseline category), (b) above 0.5 hours and up to 1 hour, (c) above 1 hour and up to 1.5 hours, and (d) above 1.5 hours (Columns 5-6 of Table C1). These specifications yield the same inferences as our main models.
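The calculation of potential extent of shift-prolonging can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are ours, the two timestamps are taken as given (their construction is defined in Section 4.2.1), and we assume that an extent of zero falls into range (a) of the flexible specification, which the description above does not settle.

```python
from datetime import datetime


def potential_extent_of_shift_prolonging(anticipated_end, typical_end_of_shift):
    """Hours by which the anticipated end exceeds the typical end of shift.

    Both arguments are datetimes at minute precision. The difference is used
    when it is positive; otherwise the variable is coded as 0.
    """
    diff_hours = (anticipated_end - typical_end_of_shift).total_seconds() / 3600.0
    return max(0.0, diff_hours)


def extent_range(extent_hours):
    """Map an extent in hours to the ranges (a) through (d).

    Assumption for this sketch: an extent of 0 is grouped with range (a).
    """
    if extent_hours <= 0.5:
        return "a"
    if extent_hours <= 1.0:
        return "b"
    if extent_hours <= 1.5:
        return "c"
    return "d"


# Illustrative timestamps (made up for this example).
typical_end = datetime(2015, 3, 2, 16, 30)
late = potential_extent_of_shift_prolonging(datetime(2015, 3, 2, 17, 15), typical_end)
early = potential_extent_of_shift_prolonging(datetime(2015, 3, 2, 16, 0), typical_end)
print(late, extent_range(late))
print(early, extent_range(early))
```

Flooring the difference at zero means that inspections expected to finish before the typical end of shift are all treated alike, regardless of how much slack remains.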
We conduct additional analysis to examine the persistence of some of our outcome effects (Appendix D in the online supplement). To explore whether these outcome effects persist beyond the next inspection, we added two variables to our models: the inspector's penultimate inspected establishment's violations (that is, two establishments ago) and then also the inspector's antepenultimate inspected establishment's violations (three establishments ago). The significant positive coefficients on both indicate that the number of violations cited is significantly affected not only by the violations at the inspector's immediately preceding inspection but also by each of the two inspections before that; the declining magnitudes of these coefficients indicate that the effect dissipates (Columns 1-2 of Table D1 in the online supplement). Second, we assessed whether the outcome effects attenuated if an inspector's successive inspections occur across different days, rather than on the same day. We replaced prior  Table D1). These results reduce the likelihood that the inspector's mood or other temporary factors are driving the outcome effects.
We explore other ways that breaks could affect inspector scrutiny (Appendix E in the online supplement). Our primary results that tested H4 indicate that inspectors exhibit more scrutiny following an overnight break. This finding is similar in some ways to Danziger, Levav, and Avnaim-Pesso's (2011) finding that parole judges' decisions were affected by overnight breaks. But while those judges' decisions were similarly affected after they took each of two food breaks, we find no evidence that the length of an inspector's break between inspections during the day affects scrutiny. We reach this conclusion by estimating a model that includes two variables measuring the length of two types of breaks: (a) overnight break length, which measures the amount of time elapsed between the end of the inspector's prior inspection on a preceding day and the start of the focal inspection that is the inspector's first inspection of a day, and (b) within-day break length, which measures the amount of time elapsed between the end of the inspector's prior inspection on the same day as the focal inspection and the start of the focal inspection. Both of these variables are measured in tens of hours and are top-coded at their 99th percentiles to avoid outliers influencing results. Adding these two variables to our primary specification (Column 1 of Table E1 in the online supplement) yields a positive significant coefficient on overnight break length, which indicates that break length matters for the first inspection of the day (specifically, longer overnight breaks increase inspectors' violation detection rate), but a nonsignificant coefficient on within-day break length, which provides no evidence that break length matters for breaks within the day.
We also find no evidence of within-day break length affecting inspectors when we add to our primary specification a set of dummy variables for breaks of different length (a break of less than 1 hour, a break of 1-2 hours, a break of 2-3 hours, and a break longer than 3 hours, with overnight breaks as the omitted category).
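The scaling and top-coding of the break-length variables can be illustrated with a short sketch. The percentile convention used here (nearest rank) and the function names are assumptions for this example; the exact percentile method used in our estimation is not specified in the text above, and other conventions would give slightly different caps.

```python
def percentile_nearest_rank(values, pct):
    """Nearest-rank percentile for an integer pct between 1 and 100."""
    ordered = sorted(values)
    # Ceiling of pct * n / 100, computed with integers to avoid float error.
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]


def break_length_tens_of_hours(break_hours, pct=99):
    """Top-code break lengths at the given percentile, then rescale.

    `break_hours` holds break lengths in hours; the result is in tens of hours.
    """
    cap = percentile_nearest_rank(break_hours, pct)
    return [min(hours, cap) / 10.0 for hours in break_hours]


# Illustrative data: 99 ordinary breaks plus one extreme outlier (made up).
breaks = [float(h) for h in range(1, 100)] + [500.0]
scaled = break_length_tens_of_hours(breaks)
print(scaled[0], scaled[-2], scaled[-1])
```

In this example the 500-hour outlier is capped at the 99th-percentile value before rescaling, so a single extreme gap (for instance, after a long leave) cannot dominate the estimated coefficient.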
We highlight a few other ways in which our study and that of Danziger, Levav, and Avnaim-Pesso (2011) resemble and differ from each other. First, the directions of the effects: overnight breaks in our context lead inspectors to restore their "harshness" by overlooking fewer violations, whereas breaks in the Danziger, Levav, and Avnaim-Pesso (2011) study lead judges to reduce their "harshness" with greater tendency to grant parole. The apparent disparity resolves, however, if breaks are viewed as leading both types of decision-makers to reduce status-quo bias associated with fatigue. Specifically, fatigue leads inspectors to increasingly overlook violations that require effort to discover, document, and defend and thus to accept the status quo of not citing a violation and allowing the establishment to continue operating as it currently does; similarly, fatigue leads judges to be "more likely to accept the default, status quo outcome: deny a prisoner's request" (Danziger, Levav, and Avnaim-Pesso, 2011: 6889). In both studies, breaks serve to counter this status-quo bias resulting from fatigue. Second, while judges' and inspectors' decisions are both affected by overnight breaks, only judges appear to be affected by breaks within the day.⁶
⁶ There are several reasons why midday breaks might have influenced parole judges in the context of Danziger, Levav, and Avnaim-Pesso (2011) but not the inspectors we study. First, judges hear cases in their courtroom continuously except when they take breaks, whereas inspectors travel to different establishments to conduct successive inspections, resulting in breaks between every inspection. Second, judges' breaks between cases might bestow more mental relief than inspectors' breaks between inspections, because inspectors need to be mentally engaged while driving to their next inspection.
Third, breaks within the day might not meaningfully restore physical fatigue, and inspectors (but not judges) exert a lot of physical energy throughout the day (e.g., examining kitchens, inspecting freezers, traveling). Fourth, the judges' midday breaks always included food, whereas (presumably) not all of the inspectors' breaks did. Fifth, the decisions of parole judges and food safety inspectors require different amounts of documentation to justify and face different appeals processes. Sixth, cultural differences might play a role; for example, the differences between judges and inspectors and between Israel (the setting of Danziger, Levav, and Avnaim-Pesso) and the United States (our setting). Future research should explore how different types of breaks affect various types of decisions.
We assess whether the outcome effects on inspector scrutiny are affected by how much time had lapsed since the inspector's prior inspection by adding to our primary model overnight break length and within-day break length and their interactions with the two outcome-effect variables, prior inspected establishment's violations and prior inspected establishment's violation trend. None of the coefficients on these four interaction terms is statistically significant, yielding no evidence that break time (whether overnight or within-day) attenuates either of these outcome effects. (Results are reported in Column 2 of Table E1.) We also assess whether our hypothesized daily schedule effect regarding inspector fatigue (H4) is moderated by break length. That main result indicated that inspectors cited fewer violations after having inspected for more time on that day. While that provides evidence that overnight breaks enhance inspector scrutiny, we explore whether the within-day decline in inspector scrutiny was affected by within-day break length. We add to our primary specification break length (the amount of time that had lapsed since the inspector's prior inspection, measured in tens of hours and top-coded at its 99th percentile to avoid outliers influencing results) and its interaction with time inspecting earlier today (our H4 measure). The coefficient on the interaction term is not statistically significant, yielding no evidence that break length during the day affects the rate at which inspector scrutiny declines throughout their day. (Results are reported in Column 3 of Table E1.)
We also investigate the extent to which our hypothesized effects influence the citing of two types of violations: (a) critical violations, which are related to food preparation practices and employee behaviors that more directly contribute to foodborne illness or injury, and (b) noncritical violations, which are overall sanitation and preventative measures to protect foods, such as proper use of gloves, that are less risky but also important for public health. We find that the four schedule effects (prior
We also examine whether our hypothesized effects influence other aspects of inspections that might be linked to scrutiny (see Appendix G in the online supplement). We find that inspectors conduct inspections more quickly as they progress through their shifts: inspection duration decreases by 3.1% for each additional hour of inspection earlier in the day. Moreover, the inspector's citation pace (violation citations per hour, a measure of productivity in this setting, representing the net of the effects on violations and inspection duration) decreases by 1.9% for each additional hour of prior inspections that day. Potentially shift-prolonging inspections are conducted 4.3% more quickly, but citation pace remains largely unaffected; thus, our main finding that potentially shift-prolonging inspections result in fewer violations is likely due to inspectors' desire to avoid working late, rather than to fatigue eroding their citation pace. We find little to no evidence of outcome effects on inspection duration and conclude that our main outcome-effect findings (that more violations and worsening trends at an inspector's prior establishment increase the inspector's citations at his or her next inspection) mostly result from inspectors increasing their citation pace rather than from spending more time onsite.
Finally, we examine whether our hypothesized effects are associated with documentation effort.
We find no evidence that average violation comment length (in characters or words) is influenced by the time inspecting earlier today, an inspection being potentially shift-prolonging, or the prior inspected establishment's violations (results not reported). However, we do find that an increase in the prior inspected establishment's violation trend is associated with a decrease in the focal inspection's comment length (in characters and words). That is, on average, an inspector documents the focal inspection with shorter comments when the prior establishment exhibited worsening violation trends. Thus, a potential mechanism by which such trends might increase citation pace at the focal inspection (that is, improve inspectors' productivity in citing violations) is by shifting some effort from documentation to detection.

Discussion
We find strong evidence that inspectors' evaluations are affected by their experience at the prior establishment they inspected. We also find that inspectors' scrutiny is influenced by their daily schedules: as inspectors conduct inspections throughout their workday, their scrutiny is eroded by increasing fatigue and by the perceived time pressure to finish before the typical end of their shift. The effect magnitudes that we identify, ranging from 1.3% to 8.2% individually and 11% overall, are large compared to decision bias among professionals in other field settings (such as the 0.5% effect size regarding decision bias exhibited by judges, 0.9% by baseball umpires, and 2.1% to 6.9% by social auditors) and compared to experimental results yielding biases of 0 to 8 percentage points by loan review officers (Chen, Moskowitz, and Shue 2016, Short, Toffel, and.

Contributions
This study contributes to three literature streams. First, it is among the first to bring an operational lens to the literature on monitoring and assessment of standards adherence. In particular, we identify important scheduling effects on the scrutiny and thus the accuracy of those who monitor establishments' adherence to standards. We contribute to this literature's focus on improving monitoring schemes' effectiveness by analyzing how inspection outcomes are affected by outcomes of prior inspections at other establishments and by inspectors' daily schedules.
Second, by identifying spillover effects between inspections, our findings contribute to a related literature on the spillover effects of regulatory sanctions (e.g., Cohen 2000, Shimshack and Ward 2005).
While that literature focuses on how an inspection agency's monitoring and enforcement affect its reputation for stringency, which has a spillover influence on other establishments' compliance, our study focuses on how inspectors' experiences at one establishment have spillover effects on their scrutiny at others. Ours is thus the first study of which we are aware that identifies spillover effects on inspector stringency associated not only with the outcomes of the immediately preceding inspection, but also with how many prior inspections an inspector had already conducted that day and with the inspector's apparent desire to avoid working late. We contribute to the nascent literature on the accuracy of inspections (specifically, of regulatory regimes and third-party monitoring of labor conditions in supply chains) that has largely focused on inspector bias due to economic conflicts of interest, team composition, and site-specific experience (e.g., Duflo et al. 2013, Ball, Siemsen, and Shah 2017). To our knowledge, our study is the first to bring the operational lens of scheduling to this literature by showing how work schedules can drive inaccuracies.
Third, we contribute to the literature on the performance implications of scheduling and task sequencing. By examining actual decisions with important consequences for consumers, we contribute to the recent attempts to explore high-stakes decision-making in field settings (e.g., Chen, Moskowitz, and Shue 2016). The idiosyncrasies of quality-evaluation decisions result in biases that are different from those for other types of decisions. In contrast to a prior study that finds that judges, loan reviewers, and baseball umpires are more likely to make an "accept" decision following a "reject" decision (and vice versa) (Chen, Moskowitz, and Shue 2016), we find the opposite relationship among inspectors' assessments of subsequent establishments.
This disparity could be due to inspectors being less susceptible to the causes of the negative autocorrelation found in those other settings. First, the law of small numbers and the gambler's fallacy, whereby the decision-maker underestimates the likelihood of sequential streaks occurring by chance, are ameliorated because inspectors know the establishments they sequentially inspect often share external factors such as competition that can explain their exhibiting similar compliance (that is, sequential streaks). Second, sequential contrast effects, whereby the decision-maker's perception of the quality of the current establishment is negatively biased by the quality of the previous one, might be mitigated by inspectors' targeted training to evaluate quality consistently based on what they observe and evidence they can document, and also by the longer time between inspections (including traveling from their prior inspected establishment). Third, quotas or targets (in terms of violations cited or overall assessments of an establishment) that could lead an individual's decisions to be influenced by his or her recent decisions are rare in the context of inspections. Further research is needed to identify circumstances under which decisions are similar or opposite to prior decisions. Further, we find that this effect was asymmetric: prior negative outcomes are much more influential than prior positive ones. To the best of our knowledge, our findings provide the first evidence of how sequential decision-making is influenced by negativity bias, whereby negative events are generally more salient and dominant than positive events (Rozin and Royzman 2001).
In contrast to prior research that finds judges becoming more stringent as they make more decisions since their last break (whether overnight or midday) (Danziger, Levav, and Avnaim-Pesso 2011), we find that inspectors become less stringent. One way to resolve this apparent contradiction is to consider that in both studies decisions made over the course of the day tend toward yielding the same result that would occur in the absence of a decision being made, a form of status-quo bias. In particular, judges tend toward denying parole, which results in the same situation that would have occurred in the absence of a hearing; inspectors tend to avoid citing violations, the same situation that would have occurred in the absence of an inspection. Differing relationships between effort and stringency might also contribute to the opposing effects observed between judges and inspectors. Although judges can exert stringency by denying parole without justification and thus with little effort, for inspectors to exhibit stringency, they must find proof of violations, which requires physical and mental effort to interact with establishment staff. Fatigue associated with additional inspections can thus impede violation detection.
Finally, while both parole decisions and inspections result in mental fatigue, inspections also trigger physical fatigue; it is plausible that mental and physical fatigue affect stringency differently.
Our daily schedule effects findings also complement the literature that has found that increased worker fatigue after long hours led to accidents among nuclear and industrial plant operators, airline pilots, truck drivers, and hospital workers (Dinges 1995, Landrigan et al. 2004). In response to such findings, industry standards and regulations have capped the number of consecutive work hours in some of these professions; our results indicate that such policies might also improve inspection accuracy. We contribute to this debate by providing evidence of the negative effects of fatigue on work quality during normal shifts (rather than the very long work periods others have examined) in a different setting (health inspections), focusing on primary tasks (rather than secondary ones). Moreover, we investigate a different performance dimension (accuracy of quality assessments) and identify potential remedies. To the best of our knowledge, we are the first to provide evidence of the negative impact on quality assessment of within-day fatigue and potentially shift-prolonging tasks. The results of our extension analysis suggest that inspectors themselves might attempt to ameliorate these effects by focusing on critical violations at the expense of detecting fewer noncritical violations and producing less documentation.
In addition, our finding that inspectors inspect less stringently as they approach the time they typically end their workday contributes to a broader understanding of how workers alter their procedures as they approach the end of their shift (e.g., Chan's (2017) finding that hospital physicians concluding their shifts accept fewer patients and make different decisions about patient care). More broadly, our work shows that even workers who lack formal shifts and have some flexibility to schedule their own work hours behave differently as they approach the typical end of their workday, suggesting that work is done differently toward the end of a workday in more settings than previously conceived. Our work also responds to the call for behavioral research in the operations management field (Bendoly, Donohue, and Schultz 2006) by identifying ways in which task sequencing affects worker behavior. Our finding that inspectors' experiences at prior inspections bias their subsequent inspections shows that the outcome of tasks can affect how humans, unlike machines, perform their next task.

Managerial Implications
Extrapolating our study's results to the approximately one million food-handling establishments [...] cited per inspection, citing one fewer violation constitutes a 41% decrease, a large change that creates unfairness across facilities and impedes accurate decisions being taken in response to inspection reports.
Thus, more accurate inspections that result in fewer violations being overlooked could prompt more effort to fully comply with food safety standards. For example, franchisors could be better equipped to interpret inspection reports so as to know which franchisees require more (or less) oversight.
Moreover, regulators and private-sector inspectors across industries can take steps to mitigate these biases in order to create more accurate inspection reports, which would yield fairer and more comparable results across inspected establishments, generate more reliable information for consumers, and better motivate compliance. For example, our identified outcome effects imply that increasing the salience of noncompliance and thus the need to enforce regulation could increase the number of violations detected. This suggests that reminders of noncompliance to inspectors (or other ways to increase such salience to them) could be a lever for inspection managers to increase inspectors' stringency, even if the information is already available to those inspectors and despite their innate desire to protect consumers.
With respect to daily schedule effects, one way to reduce the extent to which these biases erode inspection accuracy is to limit fatigue effects by smoothing the number of inspections per day, or if inspection capacity needs allow it, by capping the number of inspections a given inspector can conduct each day. Another approach, which can be used at the same time, is to minimize the number of shift-prolonging inspections by reallocating an inspector's weekly schedule to reduce variation in the predicted completion time of their final inspection each day or by shifting administrative tasks (such as office meetings) from the beginning to the end of the day. Reorganizing inspectors' schedules could eliminate these negative outcomes and might (according to our interviews with health inspectors in the areas covered in our data as well as in other areas across the United States) be possible without adding cost.
Managers can also use our findings to develop policies to reduce the consequences of inspector biases eroding inspection accuracy. For example, understanding that scrutiny typically declines as inspectors (a) conduct successive inspections during the day and (b) conduct inspections that risk prolonging their shift, the inspectors themselves could be required to schedule establishments that pose greater risks earlier in the shift. Such changes could reduce risk to public health.

Limitations and Future Research
Our study has several limitations that could be explored in future research. Though our data contain details of inspections and citations, we do not observe inspectors' beliefs or their onsite interactions. We find that they cite fewer violations after inspecting establishments that had fewer violations. Perhaps they make less effort to find hidden violations and are more willing to take a coaching approach for borderline violations (training operators to operate with better hygiene rather than writing citations). Possible extensions of our study could use observations of these actions to quantify how they are affected by scheduling. In addition, although our research context (food safety inspections) is common worldwide, it is just one of many types of inspections conducted by companies and governments. Future research should examine whether the relationships we identified hold in other contexts.
Denotes whether an inspection is the establishment's nth (first, second, third, and so on) inspection in our sample period (modeled in our empirical specification as a series of dummies for second through tenth-or-more, using the first as the baseline category) 4.02 2.06 1 10
Lunch period (11:00 am-3:59 pm) Indicates if the inspection began 11:00 am-3:59 pm 0.66 0.47 0 1

Appendix B. Interpretation of Results
To illustrate the magnitude of the estimated effects, we consider interventions that exploit outcome effects and ameliorate daily schedule effects, both of which would lead inspectors to cite violations that currently go underreported. In particular, we consider various scenarios that both (a) amplify the outcome effects in order to more routinely trigger the heightened inspector scrutiny that ensues after inspections reveal many violations and worsening compliance trends, and (b) mitigate the daily schedule effects in order to attenuate the reduced scrutiny that accompanies successive inspections and potentially shift-prolonging inspections. We estimate the effects of such interventions on the average inspection based on our sample, scale up the results to estimate the impact across the entire United States, and estimate how such an increase in cited violations would translate into fewer foodborne illness cases and lower associated healthcare costs.
In the best-case scenario, outcome effects (which increase scrutiny) would be fully triggered all the time and daily schedule effects (which erode scrutiny) would be entirely eliminated. The full consequence of these biases is reflected by the difference in inspection outcomes between this best-case scenario and the status quo, which quantifies the number of unreported violations and excess illnesses and costs that could be avoided if steps were taken to address these biases. Our discussions with inspectors suggest that some interventions (such as limiting or smoothing the number of inspections each inspector conducts per day) are feasible, often without imposing any additional costs. We estimate a range of scenarios that consider the impacts associated with the daily schedule effects being attenuated by, and the outcome effects being actuated by, varying amounts.
We first consider the average impact on violations cited per inspection. Specifically, we compare the status quo (that is, the current practice with its associated scheduling effects) with alternative scenarios that consider various percentage changes (10% to 100% in 10% increments) of the effects we identified that would increase inspectors' detection rate (that is, decrease by 10% the daily schedule effects and increase by 10% the outcome effects). We make all these comparisons based on Model 1 in [...] For example, consider the very conservative "10% scenario" depicted in the second row of Table B1. In this scenario, we estimate the effects of (1) amplifying the outcome effects by increasing by 10% the actual values of prior inspected establishment's violations and prior inspected establishment's violation trend while also (2) [...]
The estimates we construct should be considered as an illustration of the possible implications of the biases. We acknowledge the possibility that our estimates might overestimate the effects if the conversion factors we use overestimate the benefits of citing a particular violation and that they might underestimate the effects because we do not incorporate the spillovers and system-wide benefits of citing a particular violation, as each citation may encourage establishments to improve health practices more broadly. Failing to cite one violation thus not only carries the health risks associated with that violation but may also encourage noncompliance, an effect similar to the broken window phenomenon. That said, while developing a more comprehensive methodology to estimate the health impacts of citing more food safety violations is a necessary and worthy endeavor, it is beyond the scope of this paper.
4 We are aware of little research that has estimated the effect of each food safety violation on health outcomes and we rely on Jin and Leslie (2003), which we believe presents the best estimate. They show that introducing restaurant grade cards (signs posted outside restaurants that report the establishment's letter grade based on its most recent food safety inspection results) affects food safety inspection violation scores and health outcomes, so restaurant grade cards can be viewed as an instrument that reveals the relationship between violations cited and health outcomes. Because violations are supposed to be corrected when cited, we assume that the new citations resulting from reducing the bias translate into fewer actual violations. (To be conservative, we are not accounting for how citations motivate compliance more broadly.) The relationship Jin and Leslie (2003) identified between compliance and health outcomes applies to our setting because it is based on a similar type of inspection and a compliance measure based on total violations, which implicitly controls for the heterogeneous effects of different types of violation on health.
Finally, we estimate the impact of citing more violations on the costs associated with foodborne illness cases based on two alternative estimates of the average cost per foodborne illness case of $747 (Minor et al. 2015) and $1,626 [...], which we use to construct the lower and upper bounds of our cost estimates (Columns 7 and 8). In the 10% scenario, applying these figures to the estimated 1.94 million fewer foodborne illness cases compared to the status quo yields a $1,446-million-to-$3,147-million drop in the annual costs associated with foodborne illness cases nationwide.
As noted, there are many assumptions and caveats associated with these analyses and one can consider alternative scenarios. Our estimations above assume that mitigating bias would yield citations of violations that are as correlated with foodborne incidents as the violations currently cited. But what if newly cited violations are less "important," meaning they impose less health risk? For example, suppose that remediating a newly cited violation would prevent half as many foodborne incidents as remediating a currently cited violation. Estimating the health impacts would then require adjusting Jin and Leslie's (2003) finding that a 5% improvement in restaurant compliance yields a 20% decline in foodborne illness hospitalizations to a 10% decline. In that scenario, if the drivers of outcome effects were doubled (that is, amplified by 100%) and the drivers of daily schedule effects were fully mitigated (that is, reduced by 100%), the 11.03% increase in citations (last row, Column 2) would translate into a 22.06% decline in hospitalizations [=11.03*(-10/5)] (compared to our original estimate of a 44.12% decline, calculated as 11.03*(-20/5) and reported in the last row of Column 4), which nationwide would result in 28,235 fewer foodborne-illness-related hospitalizations and 10.54 million fewer foodborne illness cases, saving $7.88 billion to $17.14 billion in foodborne illness costs.
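The arithmetic behind the hospitalization and cost figures in this appendix can be retraced with a short Python sketch. It is an illustration only, not the code used for the analysis: the function and constant names are hypothetical, the inputs are the figures quoted in the text, and the linear scaling mirrors the bracketed calculation 11.03*(20/5). The dollar results may differ slightly from the reported bounds, presumably because the case counts quoted in the text are rounded.

```python
# Figures quoted in the text; all names below are illustrative, not from the paper.
CITATION_INCREASE_PCT = 11.03      # 100% scenario: increase in citations (Table B1, last row, Column 2)
COMPLIANCE_IMPROVEMENT_PCT = 5.0   # Jin and Leslie (2003): reference improvement in compliance
COST_PER_CASE_USD = (747, 1626)    # lower and upper cost per foodborne illness case


def hospitalization_decline_pct(citation_increase_pct, decline_per_reference_pct):
    """Scale the hospitalization decline linearly with the increase in citations."""
    return citation_increase_pct * decline_per_reference_pct / COMPLIANCE_IMPROVEMENT_PCT


def cost_savings_usd(cases_avoided):
    """Return (lower, upper) bounds on avoided foodborne illness costs."""
    low, high = COST_PER_CASE_USD
    return cases_avoided * low, cases_avoided * high


# Original assumption: a 5% compliance improvement yields a 20% decline.
original = hospitalization_decline_pct(CITATION_INCREASE_PCT, 20.0)
# Alternative assumption: newly cited violations are half as consequential.
halved = hospitalization_decline_pct(CITATION_INCREASE_PCT, 10.0)
print(f"hospitalization decline: original {original:.2f}%, halved {halved:.2f}%")

# Case counts as quoted in the text (rounded).
for label, cases in [("10% scenario", 1.94e6),
                     ("100% scenario, halved effect", 10.54e6)]:
    low, high = cost_savings_usd(cases)
    print(f"{label}: ${low / 1e9:.2f} billion to ${high / 1e9:.2f} billion")
```

The sketch keeps the conversion factor as an explicit argument so that other assumptions about the health relevance of newly cited violations can be substituted.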
[...] Table B1, which is based on the methodology described in Appendix B. The horizontal axes represent different bias-reduction scenarios. For example, the 20% scenario illustrates the results of reducing bias by amplifying by 20% the outcome effects (which increase scrutiny) and mitigating by 20% the daily schedule effects (which erode scrutiny).
Notes: Poisson regression coefficients with robust standard errors clustered by establishment. *** p < 0.01, ** p < 0.05, * p < 0.10.

Appendix D. Supplemental Analysis: Persistence of Outcome Effects
We conduct additional analysis to examine the persistence of some of our outcome effects (Table D1; see Section 4.7, "Extensions").
To assess whether our hypothesized relationships differentially influence inspectors' behavior across different types of violation, we estimated our models on two subsets of violations. First, we predict the number of critical violations, which are related to food preparation practices and employee behaviors that more directly contribute to foodborne illness or injury. These factors are prioritized in Alaska and in Camden County by being displayed on the first page of the inspection report and in Lake County by being tagged in the reports. Second, we estimated our models on the number of noncritical violations (that is, violations of procedures often referred to as "good retail practices"). While less risky than the other type, these are also important for public health and include overall sanitation and preventative measures to protect foods, such as proper use of gloves. Inspections averaged 0.93 critical violations and 1.49 noncritical violations.
Fewer noncritical violations are cited in inspections conducted during the lunch period than in other periods, but the results yield no evidence that time of day affects critical violations (see Table F1).
The latter finding is consistent with critical violations being related to longer-term establishment practices that are insensitive to the number of customers being served or to the staff's busyness and thus ability to respond to the inspector's presence.
Outcome effects are ubiquitous, affecting critical and noncritical violations alike. Each additional violation cited at the inspector's prior inspected establishment is associated with 1.92% more critical violations (Column 1: β = 0.019, p < 0.01) and 1.61% more noncritical violations (Column 3: β = 0.016, p < 0.01) cited in the focal inspection.
As with total violations, there is no evidence of critical and noncritical violations being affected when the prior inspected establishment saliently improved. When the prior inspected establishment saliently deteriorated, inspections yield, on average, 7.57% more critical violations (Column 2: β = 0.073, p < 0.10) and 8.22% more noncritical violations (Column 4: β = 0.079, p < 0.05).
Turning to daily schedule effects, we find that fatigue affects inspectors' ability to discover and report both types of violations. Specifically, the estimated coefficients on time inspecting earlier today indicate that each additional hour conducting prior inspections during the day results, on average, in 2.86% fewer critical violations cited (Column 1: β = -0.029, p < 0.10) and 4.30% fewer noncritical violations cited (Column 3: β = -0.044, p < 0.01).
These results also indicate that the potentially shift-prolonging effects identified in our primary results are driven by noncritical violations rather than critical ones. In particular, potentially shift-prolonging inspections result in 6.29% fewer noncritical violations cited (Column 3: β = -0.065, p < 0.01). However, we find no evidence that citations of critical violations are affected by whether the inspection risks prolonging the shift: the coefficient on potentially shift-prolonging is not statistically significant when predicting critical violations (Columns 1 and 2). This suggests that avoiding prolonging the shift does not affect inspectors' ability to discover and report critical violations.
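The percentage effects quoted in this appendix are consistent with the usual reading of a Poisson coefficient, in which a one-unit increase in a covariate multiplies the expected count by exp(β). The Python sketch below is an illustration rather than the code used for the analysis; the function name and the dictionary labels are hypothetical, while the coefficients and percentages are those reported in the text.

```python
import math


def poisson_pct_change(beta: float) -> float:
    """Percent change in the expected count for a one-unit increase in a covariate."""
    return (math.exp(beta) - 1.0) * 100.0


# (coefficient, percentage quoted in the text) for each reported effect.
reported = {
    "prior violations, critical": (0.019, 1.92),
    "prior violations, noncritical": (0.016, 1.61),
    "salient deterioration, critical": (0.073, 7.57),
    "salient deterioration, noncritical": (0.079, 8.22),
    "hour inspecting earlier today, critical": (-0.029, -2.86),
    "hour inspecting earlier today, noncritical": (-0.044, -4.30),
    "potentially shift-prolonging, noncritical": (-0.065, -6.29),
}

for label, (beta, pct_in_text) in reported.items():
    pct = poisson_pct_change(beta)
    print(f"{label}: beta = {beta:+.3f}, implied {pct:+.2f}%, text {pct_in_text:+.2f}%")
```

Because the coefficients are small, the implied percentages are close to 100 times the coefficient, but the exponential conversion is what reproduces the second decimal place.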
Overall, these results indicate that inspectors' schedules have somewhat different effects on citing critical versus noncritical violations. Citing noncritical violations appears to be influenced by all types of daily schedule effects and outcome effects, while citing critical violations appears to be influenced by all but the potentially shift-prolonging effects.
Notes: Poisson regression coefficients with robust standard errors clustered by establishment. *** p < 0.01, ** p < 0.05, * p < 0.10.
Our primary results show how inspections of prior establishments and daily schedules are associated with the number of violations cited. To assess whether such results might be driven by inspectors spending more or less time and exhibiting more or less scrutiny in the subsequent (focal) inspection, we estimate our primary models on the log of inspection duration, the number of minutes between an inspection's start time and end time. Moreover, to assess the net effect of the changes in violations cited and inspection duration, we explore the inspector's citation pace (a measure of productivity in this setting) and estimate our primary models on the log (after adding 1) of violation citations per hour. The results are reported in Table G1.
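The two dependent variables described above can be constructed as in the following Python sketch. It is illustrative only: the function names and the example inspection are hypothetical, and the use of the natural logarithm is an assumption, since the text says only "log".

```python
import math
from datetime import datetime


def duration_minutes(start: datetime, end: datetime) -> float:
    """Minutes between an inspection's start time and end time."""
    return (end - start).total_seconds() / 60.0


def log_duration(start: datetime, end: datetime) -> float:
    """Log of inspection duration in minutes (natural log assumed)."""
    return math.log(duration_minutes(start, end))


def log_citation_pace(violations: int, start: datetime, end: datetime) -> float:
    """Log of (1 + violation citations per hour); adding 1 keeps zero-violation inspections defined."""
    hours = duration_minutes(start, end) / 60.0
    return math.log(1.0 + violations / hours)


# Hypothetical inspection: 90 minutes long with 3 violations cited (2 per hour).
start = datetime(2020, 1, 6, 10, 0)
end = datetime(2020, 1, 6, 11, 30)
print(log_duration(start, end))
print(log_citation_pace(3, start, end))
```

Adding 1 before taking the log matters because many inspections cite no violations, for which the log of the raw pace would be undefined.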
Considering potential outcome effects, we find that inspectors spend only slightly more time conducting inspections that follow inspections in which more violations were cited (Column 1: prior inspected establishment's violations β = 0.004, p < 0.10) and find no evidence that the violation trend of the inspector's prior inspection affects inspection duration (Columns 1 and 2: the estimated coefficients on the prior inspected establishment's violation trend, prior inspected establishment saliently improved and prior inspected establishment saliently deteriorated are not statistically significant). Recall that our primary results found that more violations or worsening trends at an inspector's prior establishment predicted more violations cited at the focal inspection. Results in Column 3 indicate that citation pace increases by 1.0% for each additional violation at the prior establishment (β = 0.010, p < 0.01) and by 1.9% for each one-standard-deviation increase in the prior inspected establishment's violation trend (β = 0.012, p < 0.05). Column 4 indicates that, as was the case with the number of violations, this effect is asymmetric and driven by negative trends: whereas we find no change in citation pace after inspecting an establishment with salient improvement, it does increase by 3.9% after inspecting an establishment with salient deterioration (Column 4: β = 0.039, p < 0.10). This indicates that our main outcome-effect findings (that more violations and worsening trends at an inspector's prior establishment increase the inspector's citations at his or her next inspection) result mostly from inspectors increasing their citation pace rather than spending more time onsite.
We next consider potential daily schedule effects. We find that inspectors conduct inspections more quickly as they progress through their shift: inspection duration decreases by 3.1% for each additional hour already spent conducting inspections that day (Column 1: time inspecting earlier today β = -0.031, p < 0.01). For context, recall that our primary results indicate that each additional hour spent inspecting earlier in the day is associated with an average of 3.73% fewer violations cited. The model reported in Column 3 indicates that the net effect is that inspector citation pace decreases by 1.9% for each additional hour of prior inspections that day (time inspecting earlier today β = -0.019, p < 0.05).
Turning to potentially shift-prolonging inspections, recall that our primary results indicated that these had 5.0% fewer citations. Column 1 reveals that inspectors conduct such inspections 4.3% more quickly (potentially shift-prolonging β = -0.044, p < 0.01). Column 3 reveals that the effect of potentially shift-prolonging on citation pace is not statistically significant. These results jointly suggest that the diminishment in citations results from shorter inspection duration rather than slower inspector speed, with inspectors' citation pace remaining largely unaffected by the risk of working late. This, in turn, suggests that our earlier finding that potentially shift-prolonging inspections result in fewer violations is likely due to inspectors' desire to avoid working late, rather than to fatigue eroding their citation pace.
Notes: Ordinary least squares coefficients with robust standard errors clustered by establishment. *** p < 0.01, ** p < 0.05, * p < 0.10.