CRYPTOGRAPHIC ENFORCEMENT OF LANGUAGE-BASED INFORMATION ERASURE

ASLAN ASKAROV, SCOTT MOORE, CHRISTOS DIMOULAS, AND STEPHEN CHONG

TR-01-15, Computer Science Group, Harvard University, Cambridge, Massachusetts

Abstract. Information erasure is a formal security requirement that stipulates when sensitive data must be removed from computer systems. In a system that correctly enforces erasure requirements, an attacker who observes the system after sensitive data is required to have been erased cannot deduce anything about the data. Practical obstacles to enforcing information erasure include: (1) correctly determining which data requires erasure; and (2) reliably deleting potentially large volumes of data, despite untrustworthy storage services. In this paper, we present a novel formalization of language-based information erasure that supports cryptographic enforcement of erasure requirements: sensitive data is encrypted before storage, and upon erasure, only a relatively small set of decryption keys needs to be deleted. This cryptographic technique has been used by a number of systems that implement data deletion to allow the use of untrustworthy storage services. However, these systems provide no support to correctly determine which data requires erasure, nor have the formal semantic properties of these systems been explained or proven to hold. We address these shortcomings. Specifically, we study a programming language extended with primitives for public-key cryptography, and demonstrate how information-flow control mechanisms can automatically track data that requires erasure and provably enforce erasure requirements even when programs employ cryptographic techniques for erasure.

1. Introduction

The security requirements of modern software often impose restrictions on the lifetime of sensitive information. For instance, to protect its users, electronic payment software should erase credit-card information after the transaction. Similarly, browsers running in "private mode" promise (but often fail) to not leak or permanently store any information after the end of a private session [3, 34, 42].

These examples reveal two significant challenges for programs whose security depends on erasing sensitive information when a dynamic condition is met. The first challenge is to track sensitive data in a program in order to erase it at the appropriate time. This is complicated by the need to track data derived from sensitive data, and erase that too. For example, a browser in private mode must delete cookies from visited websites, but also delete history, passwords, text box input, and other cached information that may reveal user activity from a private session. Hence, to correctly erase sensitive information, it is necessary to track and control information dependencies; a difficult task for complex programs.

The second challenge is that applications that handle sensitive information often receive, generate, and store large amounts of data: so much data that the application must use large but untrustworthy storage services (such as the cloud, or hard drives, depending on the trust model). For example, the back-end of a high-traffic credit-card processing service handles millions of requests and generates terabytes of logs every day. Untrustworthy storage services may not be able to delete data reliably or quickly.
In this paper, we address these two challenges by providing language-level support to track sensitive information that requires erasure and by using cryptographic mechanisms to efficiently and reliably delete data when dynamic conditions are met. Previous work considers these challenges in isolation; we bridge the gap between these separate lines of work and address both challenges. We improve on previous language-based techniques for information erasure, making them more practical, while clarifying and extending the guarantees offered by existing systems that use cryptography for data erasure.

Motivating example. Snapchat is a photo-messaging smartphone app that allows senders to specify a time limit after which a message should be erased from the recipient's device. Snapchat did not correctly erase messages, allowing recipients to access supposedly erased messages via the file system [24] and resulting in a complaint filed with the FTC [22].

Consider a Snapchat-like application that receives sensitive data that should be erased under certain conditions (e.g., after a fixed period of time, at the user's command, or when a certain event happens, such as 10 consecutive incorrect password attempts). The data is stored in an untrustworthy file system (untrustworthy because its contents may be viewed by untrusted parties), and deletion of data from the file system may not be immediate or permanent [9, 10, 19, 20, 21].¹ When the application computes on sensitive data, for example by allowing a user to manipulate sensitive documents, it produces additional sensitive data that should also be erased when required. Note that this simple application exhibits the two challenges described above: there is sensitive information that needs to be erased, and the application relies on an untrustworthy storage service to store data that may need to be erased.

¹ This reflects the Snapchat scenario, where a smartphone user could access Snapchat messages via the device's file system, which does not reliably delete data. It also applies to applications that use cloud storage services.

Consider how this application could be implemented using existing approaches: language-based information-flow control and cryptographic data deletion. Language-based information-flow control provides formal foundations for information erasure [11] and its provably correct enforcement through language techniques, such as security-type systems [12, 26], that track sensitive data. If our Snapchat-like application is written in a language that tracks and controls information flow to enforce information erasure, then the programmer can express the application's erasure requirements as security policies, and language mechanisms can ensure that the application correctly tracks all data that needs to be erased. However, the application's storage service is untrustworthy, and this approach is unable to ensure that the storage service correctly deletes data, even though the data that should be deleted is accurately identified.

Alternatively, our application could be written in a language or system that supports cryptographic data deletion. These systems allow sensitive information to be stored encrypted in untrusted storage services and erase information by erasing the encryption keys [9, 41, 44]. Following this approach, our Snapchat-like application would encrypt the sensitive data before storing it, and would "delete" the stored data by deleting the encryption keys. Thus, the application avoids trusting the untrustworthy storage service. The application does still require a trusted store for storing and reliably deleting cryptographic keys, but the requirements and scale of this trusted store are simpler than those of the (file system or cloud) storage service.
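To make the key-deletion pattern concrete, the following minimal Java sketch (ours, for illustration; the class and method names are hypothetical) keeps only ciphertext in the untrusted store and a small key in the trusted store, so that deleting the key logically erases the data. For brevity it uses symmetric AES-GCM, whereas the formal development later in this paper models public-key encryption.

    import javax.crypto.Cipher;
    import javax.crypto.KeyGenerator;
    import javax.crypto.SecretKey;
    import javax.crypto.spec.GCMParameterSpec;
    import java.security.SecureRandom;

    // Sketch of cryptographic deletion: the ciphertext may live in untrusted,
    // unreliable storage; "erasing" the data means destroying the key held in
    // a small trusted store.
    final class ErasableRecord {
        private SecretKey key;     // trusted store: small, reliably deletable
        private byte[] iv;
        byte[] ciphertext;         // untrusted store: may persist indefinitely

        ErasableRecord(byte[] plaintext) throws Exception {
            KeyGenerator kg = KeyGenerator.getInstance("AES");
            kg.init(128);
            key = kg.generateKey();
            iv = new byte[12];
            new SecureRandom().nextBytes(iv);
            Cipher c = Cipher.getInstance("AES/GCM/NoPadding");
            c.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(128, iv));
            ciphertext = c.doFinal(plaintext);
        }

        byte[] read() throws Exception {
            Cipher c = Cipher.getInstance("AES/GCM/NoPadding");
            c.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(128, iv));
            return c.doFinal(ciphertext);
        }

        void erase() {             // deleting the key "erases" the ciphertext
            key = null;            // real systems must also scrub key material
        }
    }

The point of the design is that the trusted store only ever holds keys, whose size is independent of the volume of data written to the untrusted store.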
However, this approach provides no support for ensuring that information derived from sensitive data is correctly erased. The developer must track information dependencies to ensure that all derived information is deleted appropriately. This is an error-prone manual process that leads to security vulnerabilities. Moreover, such systems lack formal foundations for the security guarantees they attempt to provide, and so it is unclear whether they handle sensitive information correctly.

In this work, we combine the two approaches while avoiding the shortcomings of each. Since our goal is a practical, yet provably correct, cryptographic enforcement mechanism for information erasure, it is important to provide both expressive erasure policies and a clear semantics for their correct cryptographic enforcement. This requires the development of a formal framework that can be extended with cryptographic primitives without obscuring the semantics of information erasure. At its core is a novel, simple, and intuitive semantic definition of information erasure that supports the cryptographic enforcement of erasure. Previous work has either achieved the same level of semantic simplicity by sacrificing the expressiveness of erasure policies [26], or sacrificed simplicity, and thus easy extensibility, of the semantics of erasure [11]. Our framework meets both requirements: expressive dynamic erasure policies and a simple, extensible semantic definition of erasure.

In our approach, security policies describe the current confidentiality level of data, when to erase it, and the security level to enforce on the data after erasure. Often the security level to enforce after erasure will indicate that the data should be removed from the system, but our framework can handle more general requirements where the data is made more restricted but does not need to be removed. For example, a suitable policy in our Snapchat-like application for a document that needs to be erased upon multiple incorrect password guesses would be L ↗10BadGuesses ⊤. Here, L is a low-confidentiality security level, ⊤ is the top security level, and 10BadGuesses is a program variable that the application sets upon ten consecutive incorrect password guesses. Intuitively, if this policy is associated with data, it requires that the data may initially be treated in accordance with security level L, but if and when variable 10BadGuesses is set, then the data (and any data derived from it) must have security level ⊤ enforced on it, requiring the removal of the data from the system.

We enforce erasure policies with a flow-sensitive type and effect system. To gain insight into how our enforcement mechanism works, consider our Snapchat-like application. The untrusted store (i.e., the file system or cloud storage service) is permitted to store only information at level L or lower (enforced by the type system restricting calls to the store's API). Since the sensitive document has policy L ↗10BadGuesses ⊤, which is not lower than L in our security policy ordering, the document cannot be stored in the untrusted store. However, given an appropriate cryptographic key, the document can be encrypted, and the resulting ciphertext (which has level L) can be stored in the untrusted store. An appropriate cryptographic key is a key that will be erased when variable 10BadGuesses is set. Our type system ensures that only appropriate keys are used and that the keys themselves are handled appropriately. Using standard information-flow control techniques, the type system also ensures that appropriate policies are enforced on data derived from sensitive data.
We have developed a tool, Keyrase, which extends our enforcement mechanism to the Java programming language, and we have used it to implement a chat client that supports information erasure of received messages stored in an encrypted log. This demonstrates the practicality and usefulness of our approach for expressing and enforcing realistic information erasure requirements.

This paper makes the following three contributions:
(1) A novel semantic security condition for information erasure. The security condition is easily extended with cryptographic primitives, providing an intuitive foundation for cryptographic information erasure and its correct enforcement. Moreover, it formalizes and justifies existing mechanisms for cryptographic data deletion.
(2) A flow-sensitive type and effect system that provably enforces information erasure for an imperative language with cryptographic primitives. This is the first such cryptographic enforcement mechanism for dynamic erasure policies. Proving correctness requires showing that the type and effect system is a sound and accurate static approximation of the dynamic erasure policies, a task significantly more complex than standard noninterference-like proofs. However, the complexity of the proof does not pollute the simplicity of the semantic definition or the type system.
(3) An implementation of the enforcement mechanism for Java (including an extension for erasing information from the heap). The implementation, together with a chat client that supports erasing information from previously received messages, provides evidence for the pragmatic value of our approach and its applicability to real languages.

Note that in this work we are concerned with "logical erasure" [15], not physical erasure [20, 21]: we do not address the problem of reliably erasing data from physical media. Also, we focus on confidentiality guarantees, and assume the attacker can observe but not modify data in untrustworthy storage.

2. Language, attacker model, and security

This section presents a novel security definition for information erasure. We first describe a language model and use it to express an appropriate attacker model. In our model, deletion of data corresponds to an assignment to the variable containing the data. For locations the program cannot reliably delete, we use write-once variables, which can be initialized but not overwritten. For example, the untrustworthy storage of our Snapchat-like example corresponds to one or more write-once variables. The identifier of a write-once variable is underlined (e.g., y, z).

We assume a set of security levels L, and use meta-variable ℓ to range over security levels. Security levels describe confidentiality requirements, and we assume a partial order ⊑ that describes the relative restrictiveness of security levels.
If ℓ ⊑ ℓ′, then security level ℓ′ is at least as restrictive as security level ℓ, i.e., ℓ′ requires at least as much confidentiality as ℓ. We assume the partial order has a greatest element, denoted ⊤. In our examples, we use the two-point security lattice with elements L and H, where L ⊑ H but H ⋢ L (i.e., H = ⊤). Programs interact with the external environment via communication channels. For simplicity, we assume there is exactly one communication channel per security level.

2.1. Language and attacker model. Figure 1 depicts the syntax of our language.

    c ::= skip | x := e | c0; c1 | if e then c1 else c2 | while e do c
        | set(x) | init y to e | read y into x | output(ℓ, e)
    e ::= v | x | e1 op e2 | (e1, e2) | proj_i(e)
    v ::= n | (v, v)
    x ∈ Var,  y ∈ Var (write-once),  ℓ ∈ L

    Figure 1. Simple imperative language

The commands of the language are standard with the exception of set, init, read, and output. Command set(x) sets variables that are used as erasure conditions; we discuss it further in Section 2.2. Command init y to e creates a write-once variable y and initializes it to the result of e; attempts to re-initialize a write-once variable block program execution. Command read y into x assigns the value stored in y to x. Command output(ℓ, e) outputs the value of e on channel ℓ. Outputs correspond to attacker observations.

For the evaluation of commands, we use a small-step semantics. A configuration is a tuple ⟨c, m, U, t⟩, where c is a command, m is a standard memory that maps standard variables to values, U is a write-once memory that maps write-once variables to values, and t is an output trace. Output traces have the form (ℓ1, v1) · … · (ℓn, vn), where (ℓi, vi) records that the i-th output was of value vi on channel ℓi. We write t · t′ for the concatenation of traces t and t′, and ε for the empty trace. The initial configuration for program c with initial memory m is ⟨c, m, U_init, ε⟩, where U_init is the initial write-once memory (i.e., the one where no write-once variable has been written to). The reduction relation → for configurations is straightforward and we omit it. Note that the semantics enforces that write-once variables are written to at most once, and that set(x) is semantically equivalent to assigning 1 to x. We use a large-step evaluation relation m, e ⇓ v for expressions.

The following program models the interaction of our Snapchat-like example with the phone display and an untrusted store. We use a write-once variable to model the untrusted store, since it is not trusted to reliably delete data. A standard variable represents the display, since the program can erase it by displaying new data. Observations of the display by the user are modeled by output statements. In this program, msg contains sensitive information that should be erased by the time condition variable delete is set.

    1  display := msg; output(L, display);        // Display message
    2  compressed := gzip(msg);                   // Compress message
    3  init store to compressed;                  // Store message
    4  display := 0; msg := 0; compressed := 0;
    5  set(delete); output(L, display);
    6  read store into tmp; output(L, tmp);       // Observe store

The program displays the message on the screen (line 1), compresses it using gzip (line 2), and then writes the compressed message to the storage device (line 3).² Importantly, even though the compressed message differs from msg, it carries information that depends on msg. That is, there is information flow from msg to the result of gzip(msg). Finally, the program attempts to delete the message by overwriting display, msg, and compressed (line 4) just before delete is set. However, it cannot overwrite the write-once variable store. Later, an attacker may be able to observe its contents (line 6), and thus learn the sensitive message after it should have been deleted. This program does not correctly erase all information about the sensitive message.

² Our model does not include functions, but we write gzip(msg) to abstract the process of compressing msg.
Attacker observations and knowledge. The previous example appeals to an intuitive notion of secure information erasure. To turn this notion into a formal definition, we develop a formal attacker model. An attacker is an entity that observes the outputs performed on some set of communication channels during some contiguous period of program execution. Based on these observations, the attacker infers facts about sensitive data contained in the initial memory.³ We assume an attacker observes only some of a program's execution because we are interested in whether an attacker that starts observing after sensitive information should be erased learns anything about the sensitive information.

³ We assume sensitive information comes from the initial memory; orthogonal generalizations to input streams [4, 6] or user strategies [38] are possible.

We associate each attacker with a security level ℓ, and assume that an attacker with security level ℓ can observe outputs on all channels ℓ′ such that ℓ′ ⊑ ℓ. Given a subtrace t, which represents the output of a program during some contiguous part of its execution, the projection of t to level ℓ, written t↾ℓ, describes the observations of an attacker with security level ℓ; it is the list of outputs in t on any channel ℓ′ such that ℓ′ ⊑ ℓ.

Suppose execution of a program produces trace t · t_obs and the attacker observes t_obs. The knowledge of the attacker is the set of initial standard memories that are consistent with these observations. An initial memory is consistent with an attacker's observation if execution of the program under that memory could produce a trace t1 · t′ such that t′↾ℓ = t_obs.

Definition 1 (Attacker knowledge). Given a program c, security level ℓ, and subtrace t_obs observed by the attacker, define attacker knowledge at level ℓ for subtrace t_obs as follows:

    k_ℓ(c, t_obs) = { m | ⟨c, m, U_init, ε⟩ →* ⟨c′, m′, U′, t1⟩ →* ⟨c″, m″, U″, t1 · t′⟩ ∧ t_obs = t′↾ℓ }

The smaller the attacker knowledge set, the more the attacker knows. For example, an attacker that makes no observations (i.e., t_obs = ε) has all initial memories in its attacker knowledge, meaning that the attacker has no information about the initial memory of the execution.

As an example of attacker knowledge, assume that the initial standard memory contains two variables x and y, and we execute the program output(L, x); output(L, y). Now assume that the attacker's observation is t_obs = (L, 1). Either of the two commands may have produced this output, so the attacker knows that either the initial value of x is 1 or the initial value of y is 1. Therefore, the knowledge of this attacker is {m | m(x) = 1 ∨ m(y) = 1}.

A similar scenario occurs in the Snapchat-like example from this section. Assume we execute the program with an initial memory m where m(msg) = 42, and an attacker at level L observes only the output on line 5. The attacker observes only the trace (L, 0). All initial memories are consistent with this observation, so the attacker learns nothing; the attacker's knowledge is the set of all memories. However, if the attacker also observes the output on line 6, then the attacker's observations are (L, 0) · (L, gzip(42)), and the attacker's knowledge is {m | m(msg) = 42}.
2.2. Erasure policies and erasure environments. So far, we have described erasure policies informally. In order to express these security requirements formally, we specify the syntax of erasure policies:

    p, q ::= ℓ | p ↗cnd q

An erasure policy is either a security level ℓ or a policy of the form p ↗cnd q, where cnd is the erasure condition of the policy. Intuitively, information with erasure policy p ↗cnd q is treated according to policy p until condition cnd is set, and thereafter must be treated according to policy q. Conditions cnd are program variables. We assume that initially all condition variables are set to 0. In Section 4, we give these variables the type cond, which prevents assignments to condition variables. Condition variables can be set only by the command set, which updates their value to 1. A simple security level ℓ means that no erasure is required, and the information has security level ℓ for the duration of program execution. Condition variables provide a simple but expressive way for a programmer to connect erasure conditions to program semantics. We conjecture that more general erasure conditions, such as arbitrary program expressions or semantic conditions [11, 12], do not significantly improve the practicality of the model.

We attach erasure policies to sensitive data through an erasure environment. An erasure environment γ : Var → Pol maps variables to erasure policies. Given erasure environment γ and variable x, γ(x) is the erasure policy that should currently be enforced for information derived from the initial contents of x, i.e., the value of x in the initial memory. The initial erasure environment of a program serves as the specification of its erasure requirements. For instance, in the Snapchat-like example the initial erasure environment maps msg to policy L ↗delete H. The erasure environment may change during program execution when the program sets condition variables. For example, after delete is set during the execution of our example program, the erasure environment changes to map msg to policy H: the initial value of msg and data derived from it should now be protected at level H.

To capture the dynamic nature of erasure requirements, we augment configurations with erasure environments, written ⟨c, m, U, t⟩_γ. We lift the reduction relation using the following rule:⁴

    POL-ENV-UPDATE
      ⟨c, m, U, t⟩ → ⟨c′, m′, U′, t′⟩      γ′ = update(m′, γ)
      ──────────────────────────────────────────────────────
              ⟨c, m, U, t⟩_γ → ⟨c′, m′, U′, t′⟩_γ′

⁴ Notice that γ-lifting does not change the behavior of programs.

The rule uses update to recursively update each policy in γ when a condition cnd is set. Function update takes two arguments, the memory after setting cnd and the erasure environment, and returns the updated erasure environment:

    update(m, γ) = λx. upd(m, γ(x))

    upd(m, p) = ℓ                              if p = ℓ
    upd(m, p) = upd(m, q)                      if p = p′ ↗cnd q and m(cnd) ≠ 0
    upd(m, p) = upd(m, p′) ↗cnd upd(m, q)      if p = p′ ↗cnd q and m(cnd) = 0
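The policy syntax and the update function admit a direct rendering as a small datatype. The following Java sketch (ours; the names Policy, Level, and Erase are illustrative, not from the paper) implements upd and update, and also the cur function introduced in Section 2.3 below.

    import java.util.Map;
    import java.util.function.Function;

    // Erasure policies as an ADT: a level ℓ, or p ↗cnd q.
    sealed interface Policy permits Level, Erase {}
    record Level(String l) implements Policy {}
    record Erase(Policy p, String cnd, Policy q) implements Policy {}

    final class Policies {
        // upd(m, p): once cnd is set, p ↗cnd q collapses to (the update of) q.
        static Policy upd(Map<String, Integer> m, Policy p) {
            if (p instanceof Erase e) {
                return m.getOrDefault(e.cnd(), 0) != 0
                        ? upd(m, e.q())
                        : new Erase(upd(m, e.p()), e.cnd(), upd(m, e.q()));
            }
            return p;
        }

        // update(m, γ) = λx. upd(m, γ(x))
        static Function<String, Policy> update(Map<String, Integer> m,
                                               Function<String, Policy> gamma) {
            return x -> upd(m, gamma.apply(x));
        }

        // cur(p): the security level currently enforced by p (Section 2.3).
        static String cur(Policy p) {
            return (p instanceof Erase e) ? cur(e.p()) : ((Level) p).l();
        }
    }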
2.3. Security for erasure. Using attacker knowledge and erasure environments, we can define what it means for a program to erase information securely. Intuitively, a program is secure iff at any point in its execution, it produces only observable effects that are consistent with the security requirements of the program at the moment of the effect. This accounts for changes of erasure policies during execution. In more concrete terms, a program is secure iff it outputs information on channel ℓ only when the currently enforced security level for that information permits it to flow to channel ℓ. Function cur(p) returns the current security level of p:

    cur(p) = ℓ          if p = ℓ
    cur(p) = cur(p′)    if p = p′ ↗c q

We use this function to describe what information an attacker at level ℓ can learn given an erasure environment γ. Specifically, for any variable x and initial memory m, the attacker is permitted to learn about the initial value of x only if the currently enforced security policy for m(x) (i.e., cur(γ(x))) is allowed to flow to ℓ. The following definition expresses this idea as a set of memories that agree on the contents of such variables:

Definition 2 (Indistinguishable memories). Given memory m, erasure environment γ, and security level ℓ, define the indistinguishable memories according to γ at ℓ to be the set

    ind_ℓ(m, γ) ≜ { m′ | m ≈_ℓ^{cur,γ} m′ }

i.e., the set of memories m′ that agree with m on every variable x such that cur(γ(x)) ⊑ ℓ.

We can use indistinguishable memories to define what information an attacker at level ℓ who observes a subtrace t_obs of the execution is permitted to learn: the intersection of the sets of memories that are indistinguishable under the erasure policies that were current during the production of t_obs. This means that if the erasure environment changes while an attacker observes the program, the information the attacker is permitted to learn includes information both before and after the change. For instance, if the execution of the program with initial memory m involves at least n configurations and the attacker observes a subtrace that corresponds to configurations i to n, the attacker is allowed to learn ⋂_{i≤j≤n} ind_ℓ(m, γ_j).

Since (i) a program is secure iff the information an attacker learns is bounded by the information the attacker is allowed to learn, and (ii) attacker knowledge (Definition 1) captures the information an attacker learns, we define security as follows:

Definition 3 (Security). Program c is secure for initial erasure environment γ if for all memories m and executions

    ⟨c, m, U_init, ε⟩_γ →* ⟨c_i, m_i, U_i, t_i⟩_{γ_i} →* ⟨c_n, m_n, U_n, t_i · t_obs⟩_{γ_n}

where i ≤ n, it holds that for all levels ℓ,

    k_ℓ(c, t_obs) ⊇ ⋂_{i≤j≤n} ind_ℓ(m, γ_j)

Note that an attacker who observes a subtrace of a program's execution is allowed to learn less than an attacker who observes a larger subtrace. Conversely, attackers observing a larger part of the computation may obtain more precise knowledge. Also note that this definition of security is progress sensitive: it rejects programs that leak sensitive information via their termination behavior [8]. Finally, when erasure environments do not change during execution, i.e., γ = γ_i for all i ≤ n, the definition coincides with noninterference.

We conclude this section with an example. Consider a two-point security lattice with elements L and H, an erasure condition variable t, a variable x, an erasure environment γ such that γ(t) = L and γ(x) = L ↗t H, and the program output(L, x); set(t); output(L, x). This program outputs the value of x after condition variable t is set. It is insecure and is rightfully rejected by Definition 3. If m(x) = 42 and m(t) = 0 in the initial memory, the program execution is as follows:

    ⟨output(L, x); set(t); output(L, x), m, U_init, ε⟩_γ
      → ⟨set(t); output(L, x), m, U_init, (L, 42)⟩_γ
      → ⟨output(L, x), m′, U_init, (L, 42)⟩_{γ′}
      → ⟨stop, m′, U_init, (L, 42) · (L, 42)⟩_{γ′}

Here m′ = m[t ↦ 1] and γ′ = update(m′, γ) = γ[x ↦ H]. For an attacker that observes the subtrace t_obs = (L, 42), attacker knowledge is k_L(c, (L, 42)) = {m | m(x) = 42 ∧ m(t) = 0}.⁵ In order to determine whether the attacker's knowledge is bounded by the information the attacker is permitted to learn, we must consider the sets of indistinguishable memories permitted by the configurations that produced t_obs.

⁵ The initial value of all condition variables is known to be 0.
Our example program is secure (Definition 3) only if ind_L(m, γ′) ⊆ k_L(c, (L, 42)). Since cur(γ′(t)) = L and cur(γ′(x)) = H ⋢ L, we have ind_L(m, γ′) = {m′ | m′(t) = 0}. Thus, the program is insecure, since {m′ | m′(t) = 0} ⊈ {m′ | m′(x) = 42 ∧ m′(t) = 0}.

Similarly, our security definition rejects our Snapchat-like example. Recall that initially m(msg) = 42 and γ(msg) = L ↗delete H. Assume that γ′ is the erasure environment for the configuration that is about to perform the output on line 6. For this configuration, we obtain that γ′(msg) = H and ind_L(m, γ′) = {m′ | m′(delete) = 0}. However, for an attacker at level L who observes the output on line 6, attacker knowledge is {m′ | m′(msg) = 42 ∧ m′(delete) = 0}, which is not a subset of ind_L(m, γ′), and so the program is insecure.

3. Language with cryptographic primitives

In this section we extend our semantic security condition for erasure to a setting with cryptographic primitives. Our notion of a symbolic cryptographic attacker is inspired by that of Askarov and Sabelfeld [5], but differs in that our extension is based on Laud's model for cryptographically-masked flows [30]. We use Laud's symbolic approach to cryptography because it has been proven sound for the computational cryptographic model. We do not directly prove computational soundness for this work, but such a proof would follow Laud's proof closely. Section 3.1 extends our language with cryptographic primitives; the technical development in that subsection is a direct adaptation of Laud's work [30, §8], and is not a contribution of this work. Section 3.2 extends our model of attacker knowledge to account for cryptographic operations, and correspondingly extends our semantic definition of security.

We use a public-key cryptographic model, although our development can be straightforwardly adapted to symmetric-key cryptography. Indeed, in this setting, symmetric-key cryptography is less challenging than public-key cryptography: with public-key cryptography, our security definitions and enforcement mechanism must ensure that both public and private keys are used correctly, whereas with symmetric cryptography we would only have to consider symmetric keys. An adapted enforcement mechanism would impose restrictions on symmetric keys similar to those it imposes on private keys. The model for encryption and decryption operations and ciphertexts would be similar as well.

3.1. Syntax and semantics. Figure 2 presents the extensions to our language syntax.

    v ::= … | pk_i | sk_i | {v}^{r_j}_{pk_i} | NaV
    c ::= … | newkeypair(x1, x2, p) | x := encrypt(e1, e2) | x := decrypt(e1, e2)

    Figure 2. Language with cryptographic primitives

We extend values with an indexed collection of key pairs, and refer to the individual keys as pk_i and sk_i for the public and private keys, respectively. We also use an indexed collection of random strings r_j.
These strings are not first-class values, but appear as components of encrypted values. Encrypted values have the form {v}^{r_j}_{pk_i}, and consist of a plaintext v, the indexed public key pk_i, and the indexed random string r_j that was used for the encryption of v. We model randomized encryption, and r_j encapsulates the randomness used in the encryption. Including it in the symbolic model of ciphertexts allows us to model the fact that an attacker can distinguish between distinct ciphertexts even if the attacker cannot decrypt them.

Semantics. We extend configurations to have the form ⟨c, m, U, Y, t⟩, where Y is the symbolic cryptographic environment that counts the number of key pairs and random strings generated in an execution. Y maps {pubkey, enc} to ℕ: Y(pubkey) is the number of key pairs that have been generated during the execution so far, and Y(enc) is the number of encryptions that have been performed so far (and thus is equal to the number of random strings generated). The initial symbolic cryptographic environment, written Y_init, maps both counters to 0. The counters are incremented by the commands for key generation and encryption, respectively. Figure 3 presents the operational semantics for the new commands; inference rules for the other commands are modified for the new configurations in the obvious way.

    S-PUB-NEWKEY
      Y(pubkey) = i
      ─────────────────────────────────────────────────────────────────────────────────────────
      ⟨newkeypair(x1, x2, p), m, U, Y, t⟩ → ⟨stop, m[x1 ↦ pk_i, x2 ↦ sk_i], U, Y[pubkey ↦ i + 1], t⟩

    S-PUB-ENCRYPT
      m, e1 ⇓ pk_i      m, e2 ⇓ v      Y(enc) = j
      ─────────────────────────────────────────────────────────────────────────────────────────
      ⟨x := encrypt(e1, e2), m, U, Y, t⟩ → ⟨stop, m[x ↦ {v}^{r_j}_{pk_i}], U, Y[enc ↦ j + 1], t⟩

    S-PUB-DECRYPT
      m, e1 ⇓ sk_i      m, e2 ⇓ {v}^{r_j}_{pk_i}
      ─────────────────────────────────────────────────────────────────────────────────────────
      ⟨x := decrypt(e1, e2), m, U, Y, t⟩ → ⟨stop, m[x ↦ v], U, Y, t⟩

    S-PUB-DECRYPT-FAIL
      m, e1 ⇓ sk_i      m, e2 ⇓ {v}^{r_j}_{pk_{i′}}      i ≠ i′
      ─────────────────────────────────────────────────────────────────────────────────────────
      ⟨x := decrypt(e1, e2), m, U, Y, t⟩ → ⟨stop, m[x ↦ NaV], U, Y, t⟩

    Figure 3. Semantics of cryptographic commands

The encryption command x := encrypt(e1, e2) evaluates e1 to a public key, evaluates e2 to a value v, encrypts v using the public key (generating a new random string to do so), and assigns the result to variable x. Decryption x := decrypt(e1, e2) evaluates e1 to a private key and evaluates e2 to a ciphertext. If the ciphertext was encrypted using the corresponding public key, the decryption succeeds, assigning to x the encrypted value; otherwise, the decryption fails, assigning to x the value NaV. Finally, newkeypair(x1, x2, p) generates a new key pair and assigns the public key to x1 and the private key to x2. Policy p is an annotation used by our enforcement mechanism, and is described in more detail in the following section. It is an upper bound on the information that the public key may be used to encrypt, and it also restricts how the private key is handled.

Symbolic equivalence of encrypted values. In order to define what values an attacker can and cannot distinguish, we define an equivalence relation over ciphertexts. For this, we define a few auxiliary concepts based on a Dolev-Yao-style attacker model [16]. First, given a set of values V, the Dolev-Yao observables of V, written DY-obs(V), are the values that an observer can obtain by splitting pairs into their constituent values and using any private keys it possesses to decrypt ciphertexts it possesses. Formally, for a set of values V, DY-obs(V) is the least superset of V such that
• if (v1, v2) ∈ DY-obs(V) then v1 ∈ DY-obs(V) and v2 ∈ DY-obs(V);
• if sk_i ∈ DY-obs(V) and {v}^{r_j}_{pk_i} ∈ DY-obs(V) then v ∈ DY-obs(V).

Second, given a value v and a set of private keys S, the value pattern of v with respect to S, denoted pat(v, S), is a template that describes which values an attacker that possesses S cannot distinguish from v.
In particular, any ciphertext {v′}^{r_j}_{pk_i} that appears in v for which the attacker does not possess the corresponding private key sk_i is replaced by ·^{r_j}, indexed by the randomness used to generate the ciphertext. As mentioned before, we use this index to allow the attacker to distinguish identical ciphertexts from distinct ciphertexts, even when the attacker cannot decrypt the ciphertexts. The definition of pat(v, S) is given in Figure 4.

    pat(NaV, S) = NaV
    pat(n, S) = n
    pat(sk_i, S) = sk_i
    pat(pk_i, S) = pk_i
    pat((v1, v2), S) = (pat(v1, S), pat(v2, S))
    pat({v}^{r_j}_{pk_i}, S) = {pat(v, S)}^{r_j}_{pk_i}    if sk_i ∈ S
    pat({v}^{r_j}_{pk_i}, S) = ·^{r_j}                     if sk_i ∉ S

    Figure 4. Value pattern pat(v, S)

We lift value patterns to lists of values: given a list of values v⃗ = v1, …, vn, define the list pattern, denoted pattern(v⃗), to be the list of values pat(v1, S), …, pat(vn, S), where S is the set of observable private keys within v⃗, defined as S ≜ DY-obs({vi | 1 ≤ i ≤ n}) ∩ {sk_i | i ∈ ℕ}.

Using the above concepts, we can define equivalence up to formal randomness: two lists of values are equivalent up to formal randomness iff the list patterns of the two lists are identical up to permutations of the indices of key pairs and random strings. Intuitively, two lists of values are equivalent up to formal randomness if a Dolev-Yao attacker is unable to distinguish the two lists.

Definition 4 (Equivalence up to formal randomness). Given two lists of values v⃗ = v1, v2, …, vn and v⃗′ = v′1, v′2, …, v′n, say that these lists are equivalent up to formal randomness, written v⃗ =_rnd v⃗′, when there exist two permutations φ, ψ : ℕ → ℕ such that pattern(v⃗) is identical to pattern(v⃗′) when public and private key indices are permuted according to φ and random string indices are permuted according to ψ, i.e.,

    pattern(v⃗) = pattern(v⃗′)[pk_{φ(1)}/pk_1, pk_{φ(2)}/pk_2, …, sk_{φ(1)}/sk_1, sk_{φ(2)}/sk_2, …, r_{ψ(1)}/r_1, r_{ψ(2)}/r_2, …]

Note that the definition uses permutations φ and ψ on keys and random strings respectively, implying that the specific keys and ciphertexts are not significant; what matters is how these keys and ciphertexts are used.

Example 1. The following program generates two key pairs, and then, depending on whether x > 0, encrypts the integer 0 using one of the two keys before outputting the resulting ciphertext.

    1  newkeypair(pk, sk, H); newkeypair(pk′, sk′, H);
    2  if x > 0 then v := encrypt(pk, 0) else v := encrypt(pk′, 0);
    3  output(L, v)

Consider two executions of this program: one where x = 5 and the other where x = 0. The first execution results in the output u1 = {0}^{r_0}_{pk_0}; the second results in the output u2 = {0}^{r_1}_{pk_1}. The corresponding list patterns of u1 and u2 are ·^{r_0} and ·^{r_1}, respectively. Using the permutation that maps 0 to 1 for both keys and random strings, we can derive that u1 =_rnd u2. This matches our intuition: in both executions the attacker observes a single ciphertext that it is unable to decrypt.

Example 2. The following program generates a key pair, encrypts the integer 0 and outputs the ciphertext, and then outputs either the same ciphertext or another encryption of 0, depending on the expression x > 0.

    1  newkeypair(pk, sk, H);
    2  u := encrypt(pk, 0); output(L, u);
    3  if x > 0 then v := u else v := encrypt(pk, 0); output(L, v)

Consider two executions of this program that differ in the value of x > 0. The execution where x > 0 outputs the values u1 = {0}^{r_0}_{pk_0}, v1 = {0}^{r_0}_{pk_0}. The other execution, where x ≤ 0, outputs the values u2 = {0}^{r_0}_{pk_0}, v2 = {0}^{r_1}_{pk_0}. The list patterns of these outputs are ·^{r_0}, ·^{r_0} and ·^{r_0}, ·^{r_1}, respectively. There are no permutations for which these two lists of values can satisfy Definition 4. Again, this matches our intuition: an attacker can distinguish an execution where the same ciphertext is output twice from an execution where two different ciphertexts are output, even though the attacker is unable to decrypt the ciphertexts.
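The symbolic values, the Dolev-Yao closure, and the pattern function can be rendered as a short program. The following Java sketch (ours, illustrative; the type and method names are not from the paper) mirrors DY-obs and Figure 4.

    import java.util.HashSet;
    import java.util.Set;

    // Symbolic values of Section 3.
    sealed interface Val permits Num, PubKey, PrivKey, Pair, Enc, Box, NaV {}
    record Num(int n) implements Val {}
    record PubKey(int i) implements Val {}
    record PrivKey(int i) implements Val {}
    record Pair(Val fst, Val snd) implements Val {}
    record Enc(Val body, int key, int rnd) implements Val {}  // {v}^{r_rnd}_{pk_key}
    record Box(int rnd) implements Val {}                     // opaque pattern ·^{r_rnd}
    record NaV() implements Val {}

    final class Symbolic {
        // DY-obs(V): close under pair projection and decryption with owned keys.
        static Set<Val> dyObs(Set<Val> v) {
            Set<Val> out = new HashSet<>(v);
            boolean changed = true;
            while (changed) {
                changed = false;
                for (Val w : Set.copyOf(out)) {
                    if (w instanceof Pair p) {
                        changed |= out.add(p.fst());
                        changed |= out.add(p.snd());
                    } else if (w instanceof Enc e && out.contains(new PrivKey(e.key()))) {
                        changed |= out.add(e.body());
                    }
                }
            }
            return out;
        }

        // pat(v, S): ciphertexts under unknown keys collapse to ·^{r_j}.
        static Val pat(Val v, Set<Integer> s) {
            return switch (v) {
                case Pair p -> new Pair(pat(p.fst(), s), pat(p.snd(), s));
                case Enc e -> s.contains(e.key())
                        ? new Enc(pat(e.body(), s), e.key(), e.rnd())
                        : new Box(e.rnd());
                default -> v;  // numbers, keys, and NaV are their own patterns
            };
        }
    }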
Example 3. The following program is similar to the first example, but outputs the two private keys before doing any encryptions.

    1  newkeypair(pk, sk, ⊤); newkeypair(pk′, sk′, ⊤);
    2  output(⊤, (sk, sk′));
    3  if x > 0 then u := encrypt(pk, 0) else u := encrypt(pk′, 0);
    4  output(⊤, u)

Again, consider two executions that differ in the value of x > 0. One of these executions results in the output sequence (sk_0, sk_1), {0}^{r_0}_{pk_0}, while the other results in (sk_0, sk_1), {0}^{r_0}_{pk_1}. The list patterns of these executions are not equivalent, because the attacker is able to use the private keys to decrypt the ciphertext, and thus distinguish the executions.

3.2. Cryptographic erasure. Using the notion of equivalence up to formal randomness, we can now extend our definitions of attacker knowledge and security to the extended language.

Definition 5 (Cryptographic attacker knowledge). Given program c, security level ℓ, and a subtrace t_obs observed by the attacker, define cryptographic attacker knowledge for subtrace t_obs at level ℓ as

    k^rnd_ℓ(c, t_obs) ≜ { m | ⟨c, m, U_init, Y_init, ε⟩ →* ⟨c1, m1, U1, Y1, t1⟩ →* ⟨c2, m2, U2, Y2, t1 · t′⟩ ∧ t′↾ℓ =_rnd t_obs }

The structure and intuition of this definition are similar to Definition 1. The only difference is the use of equivalence up to formal randomness between trace projections.

Definition 6 (Security with cryptography). Program c is secure for initial erasure environment γ if for all memories m and executions

    ⟨c, m, U_init, Y_init, ε⟩_γ →* ⟨c_i, m_i, U_i, Y_i, t_i⟩_{γ_i} →* ⟨c_n, m_n, U_n, Y_n, t_i · t_obs⟩_{γ_n}

where i ≤ n, it holds that for all levels ℓ,

    k^rnd_ℓ(c, t_obs) ⊇ ⋂_{i≤j≤n} ind_ℓ(m, γ_j)

Again, the structure and intuition of this definition are similar to Definition 3 from Section 2.3, differing only in the use of cryptographic attacker knowledge.

Example 4. The following variant of our Snapchat-like example demonstrates how encrypting the compressed message before writing it to store makes the example secure.

    1  newkeypair(pk, sk, L ↗delete H);
    2  display := msg; output(L, display);
    3  compressed := encrypt(pk, gzip(msg));
    4  init store to compressed;
    5  display := 0; msg := 0; sk := 0; set(delete);
    6  output(L, display); output(L, sk);
    7  read store into tmp; output(L, tmp);

This program first generates a key pair that can be used to encrypt information with policy up to L ↗delete H, and encrypts the compressed content of msg, storing the ciphertext in compressed. At line 5, the variables display, msg, and sk are overwritten and the condition delete is set (requiring erasure of the initial value of variable msg and of any information derived from that value). This program is secure: an attacker that starts observing program execution after delete is set will learn nothing about the initial value of msg, even if the program subsequently outputs the contents of all memory locations, including the ciphertext value of store (line 7). Intuitively, this program is secure because both the initial value of msg and the private key used to encrypt the information derived from that value were erased before delete was set, even though an encryption of the compressed message is still accessible. If the assignment to sk on line 5 were omitted (i.e., if the private key were not overwritten), the program would be insecure: after delete is set, an observer could learn the private key, and thus decrypt the ciphertext and learn the initial value of msg.
4. Enforcement

In this section, we develop an enforcement mechanism for secure erasure based on a flow-sensitive security type system. A straightforward extension of our mechanism combines our type system with a run-time enforcement mechanism for a language with a heap. This extension is inspired by the work of Chong and Myers [12] and is described in the Appendix.

In flow-sensitive type systems, the type of a variable may change during program execution when the variable is updated. For example, in the program x := y where, initially, variable x is typed as H and variable y is typed as L, the type of x may be updated to L after the assignment. The intuition behind our mechanism is the following:
(1) The type system prevents information flows through variables or locations that enforce a weaker policy than the information requires. For example, the type system allows information with erasure policy p ↗cnd q to flow to variables with policy p ↗cnd q, but prevents flows to variables with policy p, since those variables would not be erased to q when condition cnd is set. We formalize this relationship between security policies using a relabeling judgment (Section 4.1).
(2) A command set(cnd) can be typed only when the typing environment contains no variables whose security policies contain the erasure condition cnd. In combination with (1), this requirement ensures that a condition cnd can be set only when no program variable contains information that needs to be erased by the time cnd is set (Section 4.2).
(3) Encryption of sensitive information transfers the burden of preventing information flows from the data to the cryptographic keys used to encrypt it. As long as the keys are protected in accordance with the original policy on the information, the policy on the ciphertext can be relaxed (Section 4.3).

4.1. Relabeling judgment. Our first step in constructing an enforcement mechanism is to define a relabeling judgment that we can use to compare erasure policies. We say that policy p can be relabeled to q, written p ≤ q, when enforcement of policy q on some information implies enforcement of policy p on the same information. Figure 5 defines the relabeling judgment.

    R-LAB             R-LHS                  R-RHS                  R-SAMECOND
    ℓ1 ⊑ ℓ2           p ≤ p′    q ≤ p′       p′ ≤ p    p′ ≤ q       p1 ≤ p2    q1 ≤ q2
    ────────          ───────────────        ───────────────        ───────────────────
    ℓ1 ≤ ℓ2           p ↗c q ≤ p′            p′ ≤ p ↗c q            p1 ↗c q1 ≤ p2 ↗c q2

    Figure 5. Relabeling judgment on policies

Erasure policies consisting of security levels are compared using the ordering of the underlying security lattice, as per rule R-LAB. For compound policies, rule R-LHS specifies that a policy p ↗c q may be relabeled to policy p′ if both p and q can be relabeled to p′. The condition variable c can be ignored because policy p′ is restrictive enough to satisfy the requirements of both p and q, regardless of the condition variable. Rule R-RHS is similar. Finally, rule R-SAMECOND defines relabeling between two policies with the same condition variable: p1 ↗c q1 can be relabeled to p2 ↗c q2 provided p1 ≤ p2 and q1 ≤ q2.

The relabeling judgment is sound with respect to the semantic interpretation of erasure policies. That is, strengthening the policy required on information in accordance with the relabeling order strictly decreases the number of initial memories an attacker can distinguish (cf. Appendix).
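Read operationally, the four rules of Figure 5 give a recursive decision procedure for ≤. The following Java sketch (ours) reuses the Policy, Level, and Erase types from the sketch in Section 2.2 and hard-codes the two-point lattice of our examples; both choices are illustrative assumptions.

    final class Relabel {
        // The lattice order ⊑, here fixed to the two-point lattice L ⊑ H.
        static boolean flowsTo(String l1, String l2) {
            return l1.equals(l2) || (l1.equals("L") && l2.equals("H"));
        }

        // p ≤ q: enforcement of q implies enforcement of p (Figure 5).
        static boolean leq(Policy a, Policy b) {
            if (a instanceof Level la && b instanceof Level lb)
                return flowsTo(la.l(), lb.l());                    // R-LAB
            if (a instanceof Erase ea && b instanceof Erase eb
                    && ea.cnd().equals(eb.cnd())
                    && leq(ea.p(), eb.p()) && leq(ea.q(), eb.q()))
                return true;                                       // R-SAMECOND
            if (a instanceof Erase ea
                    && leq(ea.p(), b) && leq(ea.q(), b))
                return true;                                       // R-LHS
            if (b instanceof Erase eb
                    && leq(a, eb.p()) && leq(a, eb.q()))
                return true;                                       // R-RHS
            return false;
        }
    }

The procedure terminates because each recursive call shrinks one of the two policies, and it tries R-SAMECOND before falling back to R-LHS and R-RHS, mirroring the order in which the rules are discussed above.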
4.2. Flow-sensitive enforcement of erasure. Using the relabeling judgment, we introduce a security type system that enforces information erasure. Our type system has the following types:

    b ::= int | cond | pubkey_p | privkey_p | enc_p b | (b, b)
    τ ::= b p

Base types, denoted by b, range over integers, erasure conditions, public and private keys, encrypted values, and tuples (b, b). Types for public and private keys and encryptions are annotated with a key policy p. For keys, the policy bounds the level of information that may be encrypted with the key pair. For encrypted values, the policy is the same as the policy of the key that was used to produce the ciphertext. General types are denoted by τ and consist of a base type b and an associated policy p.

Typing environments and well-formed policies. Let Γ be a typing environment that maps standard variables and write-once variables to general types. We use x to range over the domain of Γ, and indicate explicitly when a variable is write-once rather than standard. We require that typing environments be well-formed. Defining well-formed typing environments requires first defining well-formed policies. Figure 6 defines when a policy p is well-formed in Γ, denoted Γ ⊢ p okpol.

    ─────────────          Γ(cnd) = cond q      q ≤ p_i (i = 1, 2)      p1 ≤ p2      Γ ⊢ p_i okpol (i = 1, 2)
    Γ ⊢ ℓ okpol            ──────────────────────────────────────────────────────────────────────────────────
                           Γ ⊢ p1 ↗cnd p2 okpol

    Figure 6. Well-formed erasure policies

Erasure policies that are simple security levels are always well-formed. There are two key restrictions for a well-formed policy p1 ↗cnd p2. First, the policy of the condition variable, Γ(cnd), must be no more restrictive than either p1 or p2. That is, information about whether cnd has been set is allowed to flow to both p1 and p2. This prevents covert information flows via condition variables. Second, policy p2 must be as restrictive as p1; thus the policy is indeed an erasure policy, rather than a declassification or an arbitrary relabeling. We lift well-formedness from policies to typing environments: an environment Γ is well-formed, written Γ okenv, if and only if all policies it contains are well-formed (cf. Definition 7 in the Appendix).

Flow-sensitive type system. The typing judgment for expressions has the form Γ ⊢ e : b p. The intuition and inference rules for this judgment are the standard ones for security type systems, and we omit them for brevity. The typing judgment for commands is flow-sensitive and has the form Γ, pc ⊢ c : Γ′, where Γ is the typing environment immediately before the execution of command c, the program counter policy pc is an erasure policy that is an upper bound on the information that may have influenced the decision to execute command c, and Γ′ is the updated typing environment that holds immediately after command c terminates. Figure 7 presents selected typing rules for the non-cryptographic commands. In these rules, we use a straightforward extension of relabeling to types and typing environments.

    T-ASSIGN
      Γ ⊢ e : b p      cond-free(b)      {p, pc} ≤ q
      ────────────────────────────────────────────────
      Γ, pc ⊢ x := e : Γ[x ↦ b q]

    T-INIT
      Γ ⊢ e : b p      Γ(x) = b q      {pc, p} ≤ q
      ────────────────────────────────────────────────
      Γ, pc ⊢ init x to e : Γ

    T-READ
      Γ(x) = b q      {pc, q} ≤ p
      ────────────────────────────────────────────────
      Γ, pc ⊢ read x into y : Γ[y ↦ b p]

    T-SET
      Γ(x) = cond p      pc ≤ p      x ∉ erasureconds(Γ) ∪ erasureconds(pc)
      ────────────────────────────────────────────────
      Γ, pc ⊢ set(x) : Γ

    T-OUTPUT
      Γ ⊢ e : b p      {p, pc} ≤ q      cur(q) ⊑ ℓ
      ────────────────────────────────────────────────
      Γ, pc ⊢ output(ℓ, e) : Γ

    Figure 7. Selected typing rules for basic commands
We require that all typing environments be well-formed, but omit the corresponding premises Γ okenv from Figure 7 for brevity. The structure of the rules follows the flow-sensitive security type system of Hunt and Sands [25]. The interested reader can find the remaining rules of the type system in the Appendix.

Rule T-ASSIGN updates the erasure policy for variable x with policy q, where q is an upper bound of both the policy of expression e and the current program counter policy pc. This accounts both for explicit information flows from the computed expression and for implicit information flows due to control flow. Premise cond-free(b) ensures that the type of the assigned variable is not cond or a tuple that contains a cond; thus only a set(x) command can mutate a condition variable.

The typing rules for write-once variables, T-INIT and T-READ, are similar to the typing rules for assignment and expressions. Crucially, however, the types of write-once variables are flow-insensitive, since information stored in them cannot be erased. The type system does not enforce that write-once variables are written to at most once (and thus the type system does not enforce progress). However, this property is not required to prove that the type system enforces security.

Rule T-SET requires that, when typing command set(x), neither the variables in the current typing environment nor the program counter policy pc have erasure policies with x as the erasure condition variable. This effectively ensures that any information that needs to be erased when the condition is set has already been erased by the program. Furthermore, it implies that information with an erasure policy whose condition is eventually set cannot be stored in a write-once variable: the type of a write-once variable is fixed, and so the environment would always contain that condition variable. These restrictions are imposed by the requirement that x ∉ erasureconds(Γ) ∪ erasureconds(pc). The type system also requires that pc ≤ p, where Γ(x) = cond p, which ensures that the decision to set the condition variable does not reveal sensitive information. Operator erasureconds(arg1, arg2, …) returns all condition variables that appear in the erasure policies of its arguments (cf. Appendix). The erasure conditions of a variable with type b p include the condition variables within private key types appearing in b, ensuring that keys which could be used to decrypt sensitive information are also erased. Condition variables in public keys are not included, because public keys do not protect sensitive information and do not need to be erased.

Rule T-OUTPUT emphasizes the temporal nature of erasure policies: outputting a value on channel ℓ is permitted if the currently enforced security level of the value's policy is allowed to flow to ℓ. For example, if the policy L ↗cnd H is enforced on a value, then the currently enforced security level of the policy is L, and thus the value may be output on a channel with level L.

Example 5. Under an initial typing environment Γ such that Γ(cnd) = cond L and Γ(x) = int L ↗cnd H, and program counter policy pc = L, the program output(L, x); x := 0; set(cnd); output(L, x) is well-typed. While x initially contains information that requires erasure, it has been overwritten with public information before the command set(cnd). Thus the program type checks. Both output statements are secure, since the first occurs while information with policy L ↗cnd H is allowed to be observed on channel L, and the second outputs only public information.
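The T-SET side condition can be phrased directly as a check over the policies in scope. The following Java sketch (ours) again reuses the Policy and Erase types from the Section 2.2 sketch; for simplicity it omits the condition variables contributed by private key types inside base types, which the full erasureconds operator also collects.

    import java.util.Collection;
    import java.util.HashSet;
    import java.util.Set;

    final class SetCheck {
        // erasureconds(p): all condition variables appearing in p.
        static Set<String> erasureConds(Policy p) {
            Set<String> out = new HashSet<>();
            if (p instanceof Erase e) {
                out.add(e.cnd());
                out.addAll(erasureConds(e.p()));
                out.addAll(erasureConds(e.q()));
            }
            return out;
        }

        // set(cnd) is permitted only if cnd occurs in no erasure policy of the
        // environment or of the program counter policy.
        static boolean canSet(String cnd, Collection<Policy> gamma, Policy pc) {
            if (erasureConds(pc).contains(cnd)) return false;
            for (Policy p : gamma)
                if (erasureConds(p).contains(cnd)) return false;
            return true;
        }
    }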
Example 6. Under the same initial typing environment, the program if x > 0 then set(cnd) else skip is rejected by the type system, because the information in x is not erased by the time cnd is set.

Example 7. Consider the program output(L, x); init file to x; x := 0; set(cnd); read file into y; output(L, y) under an initial program counter policy pc = L and initial typing environment Γ where Γ(cnd) = cond L, Γ(x) = int L ↗cnd H, and Γ(file) = int L ↗cnd H. This program is similar to Example 5, but stores the sensitive information in x to the write-once variable file, which cannot be erased. Right before set(cnd), the typing environment still maps file to L ↗cnd H. Thus set(cnd) does not type check. Alternatively, if the initial environment mapped file to H, then set(cnd) would type check, but output(L, y) would not.

4.3. Typing cryptography. Figure 8 introduces the additional typing rules for cryptographic operations. These rules control the flow of information in the program and ensure that encrypted values and keys can be safely used to enforce information erasure.

    T-PUB-NEWKEY
      pc ≤ p      pc ≤ pk      pk ≤ q      Γ′ = Γ[x ↦ pubkey_pk p, y ↦ privkey_pk q]
      ───────────────────────────────────────────────────────────────────────────────
      Γ, pc ⊢ newkeypair(x, y, pk) : Γ′

    T-PUB-ENC
      Γ ⊢ e1 : pubkey_pk p      Γ ⊢ e2 : b p′      p′ ≤ pk      {pc, p} ≤ p″      Γ′ = Γ[x ↦ enc_pk b p″]
      ───────────────────────────────────────────────────────────────────────────────
      Γ, pc ⊢ x := encrypt(e1, e2) : Γ′

    T-PUB-DEC
      Γ ⊢ e1 : privkey_pk p      Γ ⊢ e2 : enc_pk b q      {q, pk, p, pc} ≤ q′      Γ′ = Γ[x ↦ b q′]
      ───────────────────────────────────────────────────────────────────────────────
      Γ, pc ⊢ x := decrypt(e1, e2) : Γ′

    Figure 8. Typing for cryptography

Values of type pubkey_pk p and privkey_pk q are public and private keys, respectively, where pk is an upper bound on the information that those keys can encrypt and decrypt. Thus, in the rule for encryption (T-PUB-ENC), the information to be encrypted must be bounded above by pk, and in the rule for decryption (T-PUB-DEC), the result of decryption must be bounded below by pk. In rule T-PUB-NEWKEY, the policy q enforced on a private key must be at least as restrictive as pk, and pk must be bounded below by the decision to generate a new key pair (pc ≤ pk). Intuitively, if secret information is encrypted, the private key that can decrypt it should also be treated as secret information; if the private key were public information, then anyone could use the private key to decrypt the ciphertext and obtain secret information. Crucially, this restriction also ensures that if encrypted information needs to be erased, it suffices to erase the private key. That is, if a key pair can be used to encrypt information that must be erased, then the policy enforced on the private key must be at least as restrictive as the erasure policy, and the private key must be erased if needed. A single key pair may be used to protect multiple data items, so long as the bound of the key pair is at least as restrictive as the erasure policy to enforce on each sensitive datum.

In rule T-PUB-ENC, the ciphertext that results from encrypting a value with policy p′ using a public key of type pubkey_pk p has policy p″, and p″ need not be bounded below by either p′ or pk. That is, the result of encrypting a secret value is a non-secret ciphertext. Note, however, that both the decision to encrypt and the choice of which public key to use may reveal information, and so p″ must be bounded below by both p and pc. Similarly, in rule T-PUB-DEC, the result of decrypting a ciphertext must be bounded below by the decision to decrypt, the choice of which private key to use, and the choice of which ciphertext to decrypt.
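The policy premises of Figure 8 amount to a handful of relabeling checks. A sketch (ours), using the leq check from the relabeling sketch in Section 4.1; the parameter names are illustrative.

    final class CryptoTyping {
        // T-PUB-NEWKEY: pc ≤ p, pc ≤ pk, and pk ≤ q.
        static boolean newkeypairOk(Policy pcPol, Policy pkBound,
                                    Policy pubPol, Policy privPol) {
            return Relabel.leq(pcPol, pubPol)
                && Relabel.leq(pcPol, pkBound)
                && Relabel.leq(pkBound, privPol);
        }

        // T-PUB-ENC: the plaintext policy is bounded by the key bound pk, and
        // the ciphertext policy sits above pc and the public key's policy
        // (but not necessarily above pk itself).
        static boolean encryptOk(Policy pcPol, Policy pkBound, Policy keyPol,
                                 Policy plainPol, Policy cipherPol) {
            return Relabel.leq(plainPol, pkBound)
                && Relabel.leq(pcPol, cipherPol)
                && Relabel.leq(keyPol, cipherPol);
        }
    }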
Example 8. Recall the secure variant of the Snapchat-like example from Section 3 (Example 4). Consider an initial typing environment with Γ(delete) = cond L and Γ(msg) = int L ↗delete H, and pc = L. Also assume that the type of gzip(msg) is the same as the type of msg, i.e., the type captures the implicit flow from msg to the result of the expression. This program is well-typed, since the ciphertext compressed may have policy L, and thus store does not need to be erased before the condition is set. Moreover, as we discussed in Section 3, the private key sk is erased. Thus this program is also semantically secure.

Example 9. Consider the same program but with the command sk := 0 removed. This program is no longer secure, because an observer can use the private key to decrypt the second output, learning information about msg after the condition is set. This program is rejected by the type system because sk is not treated in accordance with the key policy L ↗delete H, which requires it to be erased before the command set(delete).

4.4. Assurance. We now formulate our main soundness result. First, observe that our type system is termination-insensitive [8] and thus cannot enforce Definition 6. Instead, we state soundness with respect to a weaker, termination-insensitive variant of Definition 6. Existing techniques that either combine type-based enforcement with termination analysis [35] or further restrict the expressiveness of programs to eliminate termination channels [38, 43, 47] can mitigate this limitation.

As a technical device in our definition of termination-insensitive security, we introduce Conv(c), the set of converging memories:

    Conv(c) ≜ { m | ⟨c, m, U_init, Y_init, ε⟩ →* ⟨stop, m′, U′, Y′, t⟩ }

In the statement of our security theorem below, we use the set of converging memories to exclude memories that would cause the program to diverge or get stuck.

Theorem 1 (Termination-insensitive security). Consider a program c such that Γ, pc ⊢ c : Γ′. Then for all m ∈ Conv(c) and all runs

    ⟨c, m, U_init, Y_init, ε⟩_γ →* ⟨c_i, m_i, U_i, Y_i, t_i⟩_{γ_i} →* ⟨c_n, m_n, U_n, Y_n, t_i · t_obs⟩_{γ_n}

where i ≤ n, it holds that for all levels ℓ,

    k^rnd_ℓ(c, t_obs) ⊇ Conv(c) ∩ ⋂_{i≤j≤n} ind_ℓ(m, γ_j)

Standard proof techniques for showing that a security type system enforces a noninterference-like property do not immediately work. There are two complicating factors. First, we need to prove a noninterference-like property for security policies ordered by the relabeling relation. However, channel outputs violate this ordering (since, e.g., information labeled L ↗cnd H can be output on channel L, even though L ↗cnd H ≰ L). This requires proving a property stronger than noninterference, where, for example, even though two L-equivalent executions may start with different values at policy L ↗cnd H, once condition cnd is set, the two executions are thereafter L ↗cnd H-equivalent. Second, our erasure policies are dynamic policies [4], and we must show that the type system is a sound and accurate static approximation of these policies. This requires powerful invariants, including that branches and loops modify the type of a variable only if the branch or loop might modify the variable's value. The Appendix includes the proof of Theorem 1 for our model extended with a heap enforcement mechanism; Theorem 1 as stated in this section is a corollary.

5. Implementation

We have implemented an extended version of the enforcement mechanism from Section 4 in the tool Keyrase. Keyrase combines static analysis and source-to-source translation to enforce erasure policies in Java programs.
To effectively support erasure in the presence of objects, the tool combines a flow-sensitive analysis that ensures erasure of information in local variables with a run-time mechanism that erases data in the heap. The run-time mechanism allows information with erasure policies to flow into the heap without requiring precise static reasoning. The extension of the language to model a heap and a run-time erasure mechanism is presented in the appendix. Keyrase uses the Accrue Interprocedural Java Analysis Framework [13] (or, simply, "Accrue"). Accrue is itself built as a compiler extension to Polyglot [37], an extensible compiler framework for Java. Keyrase is implemented in approximately 3,000 lines of Java code, but is built on top of an interprocedural constraint-based information-flow analysis, which consists of approximately 11,000 lines of Java code. Keyrase uses constraint-based type inference to reduce the number of required type annotations.

Language extensions. Keyrase extends Java with a distinguished type Condition (a sketch of these extensions appears at the end of this subsection). Conditions may appear only as static fields or local variables, must be declared final, and cannot be used in expressions or passed as arguments. This ensures that conditions used in erasure policies can always be statically identified. A program sets a condition with its set method. Programmers may annotate field declarations and expressions with a security policy for the contents of the field or the result of the expression. Erasure policies that appear in annotations can refer to any condition variable in scope. Keyrase also has annotations for declassification. While our formal model does not include declassification, we conjecture that it can be accounted for by weakening the indistinguishability relation in the style of Askarov and Sabelfeld [5].

Security analysis. When invoked, Keyrase attempts to infer security policies for every field, local variable, and expression in a program, and reports whether those policies are consistent with information-flow constraints and annotations. The constraints track explicit and implicit information flows through Java language features such as objects, fields, static and dynamic method dispatch, and exceptions. The constraints do not track information flow via termination, timing, or other covert channels. Like the enforcement mechanism described in Section 4, our analysis is flow sensitive for local variables and expressions but flow insensitive for heap locations. We do not support concurrency.

In addition to constraints that ensure standard information-flow restrictions, Keyrase uses several constraints to enforce the novel requirements of our type system:

    (1) Whenever a condition may be set (either by calling set directly, or by calling a method or constructor that may set a condition), we require that the condition does not appear in the security policies inferred for either the program counter or local variables in scope at that program point. This is analogous to the requirements of typing rule T-SET.
    (2) At calls to encryption and decryption functions, we constrain the policies associated with public and private keys according to typing rules T-PUB-ENC and T-PUB-DEC.
    (3) By default, fields are constrained to simple security-level policies unless they are explicitly annotated with an erasure policy. Furthermore, to ensure that fields with erasure policies can be erased by reassignment, they may not be marked final.
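The paper does not show Keyrase's concrete surface syntax. The following hypothetical sketch illustrates the shape of the extension: a Condition with a set method and listeners, a non-final field carrying an erasure policy (here written in a comment, since the real annotation syntax is not given), and erasure code of the kind the source-to-source translation described below injects.

    import java.util.ArrayList;
    import java.util.List;

    // Hypothetical Condition runtime; set() fires registered erasure actions.
    final class Condition {
        private final List<Runnable> listeners = new ArrayList<>();

        synchronized void onSet(Runnable erase) { listeners.add(erase); }

        synchronized void set() {
            for (Runnable erase : listeners) erase.run();
        }
    }

    class PrivateSession {
        static final Condition endSession = new Condition();

        // Hypothetical policy annotation: contents are L until endSession is
        // set and must be erased afterwards. Such a field may not be final,
        // so that it can be overwritten.
        /* @Policy("L ↗endSession H") */ byte[] cookies;

        PrivateSession() {
            // Code of this shape is what the translation injects automatically.
            endSession.onSet(() -> cookies = null);
        }
    }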
We rely on programmer-provided signatures to identify methods for key generation, encryption, and decryption. The programmer may also provide signatures describing the information-flow behavior of library code. When neither code nor a signature is available, Keyrase (unsoundly) assumes that information may flow from the receiver and arguments of a method call to the return value of the method and to all fields of the receiver.

Translation. Keyrase implements a source-to-source translation to enforce erasure in the heap. The translation injects code into the constructors or static initializers of classes that contain fields that may require erasure. This code registers a listener with the condition specified in the erasure policy. When the condition is set, it notifies the listeners, which overwrite the fields with default values.

6. Case study

We used Keyrase to check the security of the examples in this paper. We also implemented a simple Internet Relay Chat (IRC) client that keeps a compressed and encrypted log of messages on disk. The client comprises 316 non-comment lines of code (excluding libraries) and can connect to a server, join a chat channel, send and receive messages, and view previous messages from the log. The client also has a \clear command that erases all information about previous chat messages from the system. This security requirement is expressed as an erasure policy on received messages: L ↗clear H. This policy requires information to be erased when the condition clear is set, which happens when the \clear command is issued. Afterwards, the client can resume logging new messages. To reflect that data on disk cannot be reliably deleted, the signatures for methods that write to the file system accept only information with policy L. Thus, the client cannot log messages to the file system directly. Instead, it compresses and encrypts the log in screen-sized blocks before writing it to disk. Keyrase ensures that the keys used are also protected with policy L ↗clear H and thus cannot be written to the file system. Information about received messages also flows to the component that maintains a short buffer of messages used to redraw the screen. This buffer must also be erased when the clear condition is set. Of course, the terminal window itself may contain sensitive information; the client erases this trusted channel by sending a clear-screen command.

Other information flows in the program reveal sensitive information indirectly: for example, writing even encrypted information to the file system reveals information about the existence of messages, if not their contents. Keyrase detects this leak because the client attempts to write to the file system while the program counter is tainted. Since the information revealed is minimal, we added a declassification annotation to allow it rather than modify the program to always write a fixed-size log entry. Keyrase also detected leaks caused by insecure programming. For example, in Java, many methods that operate on sensitive data can throw exceptions that leak information by preventing subsequent outputs that would otherwise be secure. Avoiding these leaks requires careful handling of exceptions, which is difficult without tool support.

In total, the IRC client required eight annotations: two to declare the types of cryptographic keys, two to declare the erasure policies of object fields, and four declassification annotations. One is the declassification of whether a new log entry has been added, described above.
The remaining three declassify (1) the number of previous log entries, (2) output to the terminal window, and (3) whether a message was a "ping" (which generates multiple log entries). Our experience with Keyrase confirms that correctly identifying sensitive data that should be erased is challenging: most of the information flows we found were not immediately obvious. This reinforces the importance of cryptographic information erasure, rather than mere data deletion, in applications that use untrusted storage.

7. Related work

Information erasure. Chong and Myers introduce language-based information erasure [11] and define erasure policies via a denotational semantics with an attacker that has direct access to memory. Our semantics for erasure policies is operational. In addition, we use a precise and extensible knowledge-based attacker model, which allows for an extension to cryptographically-masked values [7]. Additional work has furthered the semantic definition of information erasure. Del Tedesco et al. [15] provide a semantic framework for expressive erasure policies. Their semantic condition, like ours, is based on attacker knowledge. Their policies can describe richer erasure requirements than ours, including erasure policies that depend on the data to be erased. In contrast to our work, though, they do not provide an enforcement mechanism. Beyond the sequential programs that our work focuses on, Jiang et al. [27] consider information erasure and release in multi-threaded programs, using a bisimulation-based semantic condition.

There are several language-based mechanisms for information erasure. However, unlike our work, none support cryptographic enforcement. Hunt and Sands [26] give the first such non-cryptographic mechanism, but support only lexically delimited erasure conditions. This restriction enables simple and elegant enforcement via a flow-sensitive security-type system. Hansen and Probst [23] enforce erasure policies in Java Card bytecode, with a single erasure condition set at the end of execution. Nanevski et al. [36] specify and enforce erasure policies using Relational Hoare Type Theory. Del Tedesco et al. [14] design a Python library for erasure but ignore implicit flows. Chong and Myers [12] enforce dynamic erasure conditions using an expensive run-time mechanism that requires local variables that may need erasure to be lifted to the heap. Our tool, Keyrase, combines static enforcement of erasure for local variables with a run-time enforcement mechanism similar to that of Chong and Myers [12] for the heap. Thus Keyrase mitigates the scalability issues of an exclusively run-time mechanism. Van Delft et al. [45] describe a general framework for enforcing dynamic policies like information erasure. Similar to our approach, they separate the enforcement of the policy into a flow-sensitive dependency analysis and a static approximation of the policy. In fact, proving the correctness of the static approximation is the most challenging aspect of our proofs.

Deletion without guaranteed erasure. Perlman proposes a file system that provides "assured deletion" [39], where data is stored encrypted and trusted stores (called ephemerizers) are responsible for managing cryptographic keys, including erasing them when needed. Tang et al. [44] implement a service on Amazon S3 that realizes these ideas.
Dunn et al. [17] use similar techniques to implement ephemeral channels that enable assured deletion of communication between a process and peripheral devices, and Bauer and Priyantha [9] propose similar techniques to securely delete files and backups. Several encrypted file systems use this approach to securely delete files (e.g., [32, 33, 40, 49]). Our work extends this line of work: it (i) provides firm foundations for assured deletion and (ii) extends it to information erasure.

Cryptography and information-flow control. We build on prior work on information-flow security for languages with cryptographic primitives. Volpano and Smith [46, 48] consider the secure use of one-way functions in programs. Abadi [1] introduces information-flow control type systems for a language with symmetric cryptographic operators. Abadi and Blanchet [2] consider asymmetric cryptographic operators, and Fournet et al. [18] consider homomorphic encryption operators. Laud [28] considers encryption operators, but assumes that keys can be statically distinguished [29]. Askarov et al. [7] introduce (and enforce) a semantic security condition that can reason about keys that can be dynamically (but not statically) distinguished. These prior works use a symbolic model of cryptography [16], which Laud proves sound for a computational model of cryptography [30]. Laud and Vene [31] present a type system that is sound for a computational model of cryptography. Our work is inspired by these results. In particular, Askarov et al. [7] and Laud [30] provide the starting point for our exploration of the use of cryptography for information erasure, a direction beyond the goals of previous work on cryptography-based information security.

8. Conclusion

This work combines cryptographic data deletion with information-flow control. The result is a novel semantic security condition for cryptographic information erasure, a flow-sensitive security type system that enforces it, and an implementation (and case study) that demonstrates the practicality of the approach. This work improves the practicality of language-based information erasure and provides stronger guarantees than existing cryptographic data deletion.

Acknowledgments

This work is supported by the National Science Foundation under Grant No. 1054172 and by the Air Force Research Laboratory. We would like to thank Anitha Gollamudi for her helpful feedback.

References

[1] M. Abadi. Secrecy by typing in security protocols. Journal of the ACM, 46(5):749–786, Sept. 1999.
[2] M. Abadi and B. Blanchet. Secrecy types for asymmetric communication. Theoretical Computer Science, 298(3):387–415, 2003.
[3] G. Aggarwal, E. Bursztein, C. Jackson, and D. Boneh. An analysis of private browsing modes in modern browsers. In USENIX Security Symposium, pages 79–94, 2010.
[4] A. Askarov and S. Chong. Learning is change in knowledge: Knowledge-based security for dynamic policies. In Proceedings of the 25th IEEE Computer Security Foundations Symposium, pages 308–322, 2012.
[5] A. Askarov and A. Sabelfeld. Gradual release: Unifying declassification, encryption and key release policies. In Proceedings of the IEEE Symposium on Security and Privacy, pages 207–221, 2007.
[6] A. Askarov and A. Sabelfeld. Tight enforcement of information-release policies for dynamic languages. In Proceedings of the IEEE Computer Security Foundations Symposium, July 2009.
[7] A. Askarov, D. Hedin, and A. Sabelfeld. Cryptographically-masked flows.
In Proceedings of the 13th International Static Analysis Symposium, Lecture Notes in Computer Science, Aug. 2006.
[8] A. Askarov, S. Hunt, A. Sabelfeld, and D. Sands. Termination-insensitive noninterference leaks more than just a bit. In Proceedings of the 13th European Symposium on Research in Computer Security, pages 333–348, 2008.
[9] S. Bauer and N. B. Priyantha. Secure data deletion for Linux file systems. In Proceedings of the 10th USENIX Security Symposium, Berkeley, CA, USA, 2001. USENIX Association.
[10] D. Boneh and R. J. Lipton. A revocable backup system. In Proceedings of the 6th USENIX Security Symposium, Focusing on Applications of Cryptography, page 9, Berkeley, CA, USA, 1996. USENIX Association.
[11] S. Chong and A. C. Myers. Language-based information erasure. In Proceedings of the 18th IEEE Computer Security Foundations Workshop, pages 241–254, June 2005.
[12] S. Chong and A. C. Myers. End-to-end enforcement of erasure and declassification. In Proceedings of the 21st IEEE Computer Security Foundations Symposium, pages 98–111, June 2008.
[13] S. Chong, A. Johnson, S. Moore, and O. Arden. Accrue Interprocedural Java Analysis Framework, 2013. http://people.seas.harvard.edu/~chong/accrue.html.
[14] F. Del Tedesco, A. Russo, and D. Sands. Implementing erasure policies using taint analysis. In Proceedings of the Nordic Conference in Secure IT Systems, 2010.
[15] F. Del Tedesco, S. Hunt, and D. Sands. A semantic hierarchy for erasure policies. In Seventh International Conference on Information Systems Security, 2011.
[16] D. Dolev and A. Yao. On the security of public-key protocols. IEEE Transactions on Information Theory, 29(2):198–208, 1983.
[17] A. M. Dunn, M. Z. Lee, S. Jana, S. Kim, M. Silberstein, Y. Xu, V. Shmatikov, and E. Witchel. Eternal sunshine of the spotless machine: Protecting privacy with ephemeral channels. In 10th USENIX Symposium on Operating Systems Design and Implementation, pages 61–75. USENIX, 2012.
[18] C. Fournet, J. Planul, and T. Rezk. Information-flow types for homomorphic encryptions. In Proceedings of the 18th ACM Conference on Computer and Communications Security, CCS '11, pages 351–360, 2011.
[19] S. L. Garfinkel and A. Shelat. Remembrance of data passed: A study of disk sanitization practices. IEEE Security and Privacy, 1(1):17–27, Jan. 2003.
[20] P. Gutmann. Secure deletion of data from magnetic and solid-state memory. In The Sixth USENIX Security Symposium Proceedings, pages 77–90, 1996.
[21] P. Gutmann. Data remanence in semiconductor devices. In The Tenth USENIX Security Symposium Proceedings, pages 39–54, 2001.
[22] J. Guynn. Privacy watchdog EPIC files complaint against Snapchat with FTC. Los Angeles Times, May 17, 2013.
[23] R. R. Hansen and C. W. Probst. Non-interference and erasure policies for Java Card bytecode. In Proceedings of the 6th International Workshop on Issues in the Theory of Security, Mar. 2006.
[24] K. Hill. Snapchats don't disappear: Forensics firm has pulled dozens of supposedly-deleted photos from Android phones. Forbes, May 9, 2013.
[25] S. Hunt and D. Sands. On flow-sensitive security types. In Proceedings of the 33rd ACM Symposium on Principles of Programming Languages (POPL), pages 79–90, Jan. 2006.
[26] S. Hunt and D. Sands. Just forget it—the semantics and enforcement of information erasure. In Proceedings of the 17th European Symposium on Programming, pages 239–253, 2008.
[27] L. Jiang, L. Ping, and X. Pan. Handling information release and erasure in multi-threaded programs.
In Computational Intelligence and Security: International Conference, pages 824–828, Dec. 2007.
[28] P. Laud. Semantics and program analysis of computationally secure information flow. In Proceedings of the 10th European Symposium on Programming Languages and Systems, pages 77–91, 2001.
[29] P. Laud. Handling encryption in analyses for secure information flow. In Proceedings of the 12th European Symposium on Programming, pages 159–173, 2003.
[30] P. Laud. On the computational soundness of cryptographically masked flows. In Proceedings of the 35th ACM Symposium on Principles of Programming Languages (POPL), Jan. 2008.
[31] P. Laud and V. Vene. A type system for computationally secure information flow. In Proceedings of the 15th International Conference on Fundamentals of Computation Theory, pages 365–377, 2005.
[32] B. Lee, K. Son, D. Won, and S. Kim. Secure data deletion for USB flash memory. Journal of Information Science and Engineering, 27:933–952, 2011.
[33] J. Lee, S. Yi, J. Heo, H. Park, S. Y. Shin, and Y. Cho. An efficient secure deletion scheme for flash file systems. Journal of Information Science and Engineering, 26:27–38, 2010.
[34] B. S. Lerner, L. Elberty, N. Poole, and S. Krishnamurthi. Verifying web browser extensions' compliance with private-browsing mode. In Computer Security–ESORICS 2013, pages 57–74, 2013.
[35] S. Moore, A. Askarov, and S. Chong. Precise enforcement of progress-sensitive security. In Proceedings of the 19th ACM Conference on Computer and Communications Security, pages 881–893, Oct. 2012.
[36] A. Nanevski, A. Banerjee, and D. Garg. Dependent type theory for verification of information flow and access control policies. ACM Transactions on Programming Languages and Systems, 35(2), 2013.
[37] N. Nystrom, M. R. Clarkson, and A. C. Myers. Polyglot: An extensible compiler framework for Java. In 12th International Conference on Compiler Construction, pages 138–152, 2003.
[38] K. R. O'Neill, M. R. Clarkson, and S. Chong. Information-flow security for interactive programs. In Proceedings of the 19th IEEE Computer Security Foundations Workshop, pages 190–201, June 2006.
[39] R. Perlman. File system design with assured delete. In Proceedings of the Network and Distributed System Security Symposium, 2007.
[40] Z. N. J. Peterson, R. Burns, J. Herring, A. Stubblefield, and A. D. Rubin. Secure deletion for a versioning file system. In Proceedings of the 4th USENIX Conference on File and Storage Technologies, page 11, Berkeley, CA, USA, 2005. USENIX Association.
[41] J. Reardon, D. Basin, and S. Capkun. SoK: Secure data deletion. In Proceedings of the 2013 IEEE Symposium on Security and Privacy, pages 301–315, 2013.
[42] K. Satvat, M. Forshaw, F. Hao, and E. Toreini. On the privacy of private browsing: A forensic approach. Journal of Information Security and Applications, 19(1):88–100, 2014.
[43] G. Smith and D. Volpano. Secure information flow in a multi-threaded imperative language. In Proceedings of the ACM Symposium on Principles of Programming Languages, pages 355–364, Jan. 1998.
[44] Y. Tang, P. Lee, J. Lui, and R. Perlman. Secure overlay cloud storage with access control and assured deletion. IEEE Transactions on Dependable and Secure Computing, 9(6):903–916, Nov. 2012.
[45] B. van Delft, S. Hunt, and D. Sands. Very static enforcement of dynamic policies. In R. Focardi and A. Myers, editors, Principles of Security and Trust, volume 9036 of Lecture Notes in Computer Science, pages 32–52. Springer Berlin Heidelberg, 2015.
[46] D. Volpano.
Secure introduction of one-way functions. In Proceedings of the 13th IEEE Workshop on Computer Security Foundations, pages 246–254, 2000.
[47] D. Volpano and G. Smith. Eliminating covert flows with minimum typings. In Proceedings of the 10th IEEE Computer Security Foundations Workshop, pages 156–168, 1997.
[48] D. Volpano and G. Smith. Verifying secrets and relative secrecy. In Conference Record of the Twenty-Seventh Annual ACM Symposium on Principles of Programming Languages, pages 268–276, Jan. 2000.
[49] E. Zadok, I. Badulescu, and A. Shender. Cryptfs: A stackable vnode level encryption file system. Technical Report CUCS-021-98, Columbia University, 1998.

Appendix A. Enforcement and extensions

A.1. Enforcement of information erasure. This section presents the complete set of rules of the type system of Section 4 for the basic commands of the language of Section 2, i.e., our language model without cryptographic primitives. Before delving into the rules, we introduce a few auxiliary definitions that, for brevity, we described only informally in Section 4.

A.1.1. Well-formedness of environments. We lift well-formedness from policies to typing environments. An environment Γ is well-formed if and only if all erasure policies in Γ are well-formed. To formalize this definition, we introduce an auxiliary function nestedpols(κ, b) that returns the set of erasure policies appearing within a base type b. The argument κ is a key-type selector: a subset of the set of strings {privkey, pubkey}. Operator nestedpols is defined as follows:

    nestedpols(κ, privkey_p) = {p}                                     if privkey ∈ κ
    nestedpols(κ, pubkey_p)  = {p}                                     if pubkey ∈ κ
    nestedpols(κ, (b1, b2))  = nestedpols(κ, b1) ∪ nestedpols(κ, b2)
    nestedpols(κ, b)         = ∅                                       otherwise

For a base type b, we refer to nestedpols({privkey}, b) as the set of critical policies within b; this set includes all nested policies except those appearing in public keys, and is used in a few places within the type system. With this operator, we can now define well-formedness of a typing environment.

Definition 7 (Well-formedness of typing environments). A typing environment Γ is well formed, written Γ okenv, if and only if for all x such that Γ(x) = b p and all policies q such that q ∈ {p} ∪ nestedpols({privkey, pubkey}, b), we have Γ ⊢ q okpol.

A.1.2. The predicate cond-free(b). The predicate cond-free(b) succeeds when b is not cond and is not a tuple that contains a cond. The predicate is defined recursively as follows:

    cond-free(int).
    cond-free(pubkey_p).
    cond-free(privkey_p).
    cond-free(enc_p b')   if cond-free(b').
    cond-free((b1, b2))   if cond-free(b1) and cond-free(b2).

A.1.3. The operator erasureconds(p). The operator erasureconds(arg1, arg2, . . .) returns all condition variables that appear in the erasure policies of its arguments:

    erasureconds(ℓ)            = ∅
    erasureconds(p1 ↗x p2)     = {x} ∪ erasureconds(p1) ∪ erasureconds(p2)
    erasureconds(p1, p2, . . .) = erasureconds(p1) ∪ erasureconds(p2) ∪ . . .
    erasureconds(Γ)            = ⋃_{Γ(x) = b p} ( erasureconds(p)
                                 ∪ ⋃ {erasureconds(q) | q ∈ nestedpols({privkey}, b) ∧ ¬(q ≤ p)} )

A.1.4. The constrained merge operator Γ1 ⊔ Γ2. The constrained merge operator Γ1 ⊔ Γ2 returns an environment that combines Γ1 and Γ2:

    (Γ1 ⊔ Γ2)(x) = Γ1(x)   if Γ1(x) = Γ2(x)
    (Γ1 ⊔ Γ2)(x) = τ       if Γ1(x) ≠ Γ2(x), where Γi(x) ≤ τ for i = 1, 2
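As a reading aid for the definitions above, the following Java sketch (our own hypothetical encoding of base types, not part of the paper's artifacts; Java 21) shows nestedpols and cond-free as the straightforward recursions they are:

    import java.util.HashSet;
    import java.util.Set;

    // Base types: int | cond | pubkey_p | privkey_p | enc_p b | (b1, b2)
    sealed interface BaseType permits IntT, CondT, PubKey, PrivKey, Enc, Pair {}
    record IntT() implements BaseType {}
    record CondT() implements BaseType {}
    record PubKey(Policy p) implements BaseType {}
    record PrivKey(Policy p) implements BaseType {}
    record Enc(Policy p, BaseType body) implements BaseType {}
    record Pair(BaseType fst, BaseType snd) implements BaseType {}
    record Policy(String repr) {}

    final class Aux {
        // nestedpols(κ, b): nested key policies of b, selected by κ.
        static Set<Policy> nestedpols(Set<String> kappa, BaseType b) {
            Set<Policy> out = new HashSet<>();
            switch (b) {
                case PrivKey k -> { if (kappa.contains("privkey")) out.add(k.p()); }
                case PubKey k  -> { if (kappa.contains("pubkey"))  out.add(k.p()); }
                case Pair t    -> { out.addAll(nestedpols(kappa, t.fst()));
                                    out.addAll(nestedpols(kappa, t.snd())); }
                default -> {}  // int, cond, enc: ∅, per the equations above
            }
            return out;
        }

        // cond-free(b): b is not cond and contains no cond.
        static boolean condFree(BaseType b) {
            return switch (b) {
                case CondT c -> false;
                case Enc e   -> condFree(e.body());
                case Pair t  -> condFree(t.fst()) && condFree(t.snd());
                default      -> true;
            };
        }
    }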
A.1.5. The type system rules. With the above definitions in hand, we can now provide the inference rules of the flow-sensitive type system of Section 4 for the basic commands of the language model of Section 2.

[Figure: typing rules for the basic commands, comprising T-SKIP, T-ASSIGN, T-INIT, T-SET, T-READ, T-OUTPUT, T-SEQ, T-IF, and T-WHILE, together with the constrained merge operator Γ1 ⊔ Γ2 of A.1.4.]

A.2. Language extension: erasable heap. This section presents an extension to our language in the form of an erasable heap. To enforce information erasure in the heap, we use a run-time mechanism that enforces coarse-grained policies that entirely delete values when an erasure condition is set. Locations in an erasable heap are associated with erasure policies of the form ↗cnd. When the condition cnd is set, the run-time mechanism overwrites the value with a default, erasing any sensitive information.

A.2.1. Syntax and semantics of the extension. We extend values with heap locations a, and introduce syntax for heap dereference, update, and allocation:

    v ::= . . . | a
    c ::= . . . | x := !e | x ← e | x := new(e, ↗cnd p)

Note that the command for heap allocation takes an erasure policy as one of its arguments; this allows extra precision in the typing, as we explain later in this section. Let an erasable heap environment H map locations to pairs of values and condition variables: H : Loc → Val × CondVar. We extend semantic configurations to include heap environments. Most of the semantic rules simply propagate the heap environment unchanged, with the exception of rules for the new syntactic forms and an updated rule for set(cnd). Figure 9 presents the new rules. The two interesting cases are S-HEAP-NEW and S-SET-HEAP: S-HEAP-NEW records the erasure condition in the heap, and S-SET-HEAP deletes all values in the heap that are associated with the condition being set.

    (S-HEAP-DEREF)   ⟨e, m⟩ ⇓ a    H(a) = (v, _)
                     ⟨x := !e, m, U, H, Y, t⟩ −→ ⟨stop, m[x ↦ v], U, H, Y, t⟩

    (S-HEAP-UPDATE)  m(x) = a    H(a) = (_, cnd)    ⟨e, m⟩ ⇓ v
                     ⟨x ← e, m, U, H, Y, t⟩ −→ ⟨stop, m, U, H[a ↦ (v, cnd)], Y, t⟩

    (S-HEAP-NEW)     ⟨e, m⟩ ⇓ v    a ∉ dom(H)
                     ⟨x := new(e, ↗cnd p), m, U, H, Y, t⟩ −→ ⟨stop, m[x ↦ a], U, H[a ↦ (v, cnd)], Y, t⟩

    (S-SET-HEAP)     {a1, . . . , an} = {a | H(a) = (_, cnd)}    H' = H[a1 ↦ NaV, . . . , an ↦ NaV]
                     ⟨set(cnd), m, U, H, Y, t⟩ −→ ⟨stop, m[cnd ↦ 1], U, H', Y, t⟩

    Figure 9. Semantics for erasable heap
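The run-time part of this mechanism is small. The following Java sketch (a hypothetical encoding of Figure 9, not the paper's implementation) shows a heap whose entries remember their erasure condition, with set overwriting every guarded value with the default NaV:

    import java.util.HashMap;
    import java.util.Map;

    final class ErasableHeap {
        record Cell(Object value, String cnd) {}
        static final Object NaV = new Object();   // "not a value" default

        private final Map<Integer, Cell> heap = new HashMap<>();
        private int next = 0;

        int alloc(Object v, String cnd) {          // x := new(e, ↗cnd p)
            heap.put(next, new Cell(v, cnd));
            return next++;
        }

        Object deref(int a) { return heap.get(a).value(); }

        void update(int a, Object v) {             // in-place update keeps the condition
            heap.put(a, new Cell(v, heap.get(a).cnd()));
        }

        void set(String cnd) {                     // analogue of S-SET-HEAP
            heap.replaceAll((a, c) ->
                cnd.equals(c.cnd()) ? new Cell(NaV, c.cnd()) : c);
        }
    }

This is also, in essence, what Keyrase's injected condition listeners do for Java object fields.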
A.2.2. Typing the erasable heap. To enforce secure usage of the erasable heap, we extend our base types with reference types ref_p b, where p is the policy on the heap content. Note that this policy may be more precise than the policy enforced by the run-time mechanism, which has the form ↗cnd:

    b ::= . . . | ref_p b

We also extend the definition of nested policies to references:

    nestedpols(κ, ref_p b) = {p} ∪ nestedpols(κ, b)

Figure 10 presents typing rules for the new commands. Rule T-DEREF constrains the resulting policy p' on the dereferenced value by requiring that p' is an upper bound of pc, p, and the heap content policy q. Rule T-UPDATE-VAR, which typechecks an in-place reference update, requires that the type of the expression used to update the reference coincides with the content type of the reference. Note that, similar to the typing rules for write-once variables, this rule is not flow-sensitive for the heap location. Rule T-NEW requires that the policy ↗cnd p' is sufficiently restrictive: it bounds the pc and all sensitive policies associated with expression e.

    (T-DEREF)       Γ ⊢ e : (ref_q b) p    {pc, q, p} ≤ p'
                    Γ, pc ⊢ x := !e : Γ[x ↦ b p']

    (T-UPDATE-VAR)  Γ(x) = (ref_q b) p    Γ ⊢ e : b q'    {pc, q', p} ≤ q    {pc, p} ≤ p'
                    Γ, pc ⊢ x ← e : Γ[x ↦ (ref_q b) p']

    (T-NEW)         Γ ⊢ e : b p    Γ ⊢ ↗cnd p' okpol    nestedpols({privkey}, b) ∪ {p, pc} ≤ ↗cnd p'
                    Γ, pc ⊢ x := new(e, ↗cnd p') : Γ[x ↦ (ref_{↗cnd p'} b) pc]

    Figure 10. Typing the erasable heap

Note that our choice of allowing the policy argument of command new to have the form ↗cnd p has the benefit of retaining the original erasure policy of expressions that are stored in and retrieved from the heap without an intervening set(cnd) operation.

With the definition of the model extended with an erasable heap in hand, we adapt Theorem 1 to the extended language.

Theorem 1 (Termination-insensitive security, extended). Consider a program c such that Γ, pc ⊢ c : Γ'. Then for all m ∈ Conv(c) and all runs

    ⟨c, m, Uinit, Hinit, Yinit, ε⟩ −→* ⟨ci, mi, Ui, Hi, Yi, ti⟩ −→* ⟨cn, mn, Hn, Un, Yn, ti · tobs⟩

with policy environments γi, . . . , γn, it holds that for all levels ℓ,

    k_ℓ(c, tobs) ⊇ Conv(c) ∩ ⋂_{i≤j≤n} ind_ℓ(m, γj).

A high-level description of the proof follows. For completeness, Figure 11 presents the evaluation rules for the expression semantics.

    (S-INT)    ⟨m, n⟩ ⇓ n
    (S-VAR)    m(x) = v  implies  ⟨m, x⟩ ⇓ v
    (S-OP)     ⟨m, ei⟩ ⇓ vi (i = 1, 2) and u = v1 op v2  implies  ⟨m, e1 op e2⟩ ⇓ u
    (S-TUPLE)  ⟨m, ei⟩ ⇓ vi (i = 1, 2)  implies  ⟨m, (e1, e2)⟩ ⇓ (v1, v2)
    (S-PROJ)   ⟨m, e⟩ ⇓ (v1, v2)  implies  ⟨m, proji(e)⟩ ⇓ vi (i = 1, 2)

    Figure 11. Semantics of expressions

Appendix B. Soundness

Soundness of relabeling for the full language.

Property 1 (Soundness of the relabeling judgment). Consider a run ⟨c0, m0, U0, H0, Yinit, ε⟩ −→ . . . −→ ⟨cn, mn, Un, Hn, Yn, tn⟩ and two policy environments γ0^1 and γ0^2 such that, for j = 1, 2, the γ0^j-liftings of the above run are ⟨c0, m0, U0, H0, Yinit, ε⟩_{γ0^j} −→ . . . −→ ⟨cn, mn, Un, Hn, Yn, tn⟩_{γn^j}. If for all x we have γ0^1(x) ≤ γ0^2(x), then for all levels ℓ and all i ≥ 0 it holds that ind_ℓ(m, γi^1) ⊆ ind_ℓ(m, γi^2).

Proof. By induction on ≤ followed by induction on the number of steps.

B.1. Auxiliary semantics. We annotate commands with the policies assigned by the type system:

    c ::= · · · | {x := e}^p | {set(x)}^p | {output(ℓ, e)}^p
        | cond {e}^pc' then c1 else c2 | loop {e}^pc' do c | S∆{c}
        | {newkeypair(x, y, pk)}^{p,q} | {x := encrypt(pk, e)}^p | {x := decrypt(sk, e)}^p
        | {x ← e}^{p,p'} | {x := new(e, ↗cnd p)}^q | {x := !e}^p
        | init x to e | {read x into y}^p

We extend the cryptographic environment to record formal keys and formal random indices generated at different levels. In particular, we add two environments, Keys and Rnds, each of which is a function from an erasure policy to a set of natural numbers. Figure 12 presents the transition rules for the auxiliary semantics. Formally, auxiliary semantic transitions have the form

    ⟨c, m, U, H, Y, t | Γ̃⟩_γ −→ ⟨c', m', U', H', Y', t' | Γ̃'⟩_γ'

where Γ̃ is a tuple (Γ, pc), and pc is a stack of pc-policies. We treat pc as a set of policies when used in the relabel ordering ≤.
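The bookkeeping that Y performs can be pictured concretely. The following Java sketch is our own hypothetical encoding (not part of the formalism) of the counters and of the Keys and Rnds maps, as they are updated by S-PUB-NEWKEY-AUX and S-PUB-ENCRYPT-AUX:

    import java.util.HashMap;
    import java.util.HashSet;
    import java.util.Map;
    import java.util.Set;

    final class CryptoEnv {
        record Policy(String repr) {}

        int pubkey = 0;   // next formal key-pair index
        int enc = 0;      // next formal randomness index
        final Map<Policy, Set<Integer>> keys = new HashMap<>();
        final Map<Policy, Set<Integer>> rnds = new HashMap<>();

        int newKeyPair(Policy privKeyPolicy) {    // as in S-PUB-NEWKEY-AUX
            int i = pubkey++;
            keys.computeIfAbsent(privKeyPolicy, p -> new HashSet<>()).add(i);
            return i;                              // formal keys pk_i, sk_i
        }

        int encrypt(Policy ciphertextPolicy) {    // as in S-PUB-ENCRYPT-AUX
            int j = enc++;
            rnds.computeIfAbsent(ciphertextPolicy, p -> new HashSet<>()).add(j);
            return j;                              // formal randomness r_j
        }
    }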
The semantics of how the new environments are updated is given in rules S-PUB-ENCRYPT-AUX and S-PUB-NEWKEY-AUX in Figure 13. We use the notation Y.Keys↾p for the union of all sets Keys(q) in environment Y where q ≤ p (and similarly for Rnds). We extend the heap state to include auxiliary information about the allocation counter and the erasure policy associated with each location, writing entries as (v, cnd | j, p). We write H.Locs↾p for the information about allocations at levels below p: the set of tuples (i, pi, ai) such that pi ≤ p.

B.1.1. Auxiliary typing judgment. Figure 15 presents the type system for expressions. Typing for commands is given by Figures 16 through 18.

[Figure 12 (Auxiliary semantics for commands): rules S-ASSIGN-AUX, S-SET-AUX, S-OUTPUT-AUX, S-SCOPE-COND-i, S-SCOPE-LOOP-ENTER, S-SCOPE-LOOP-SKIP, S-SCOPE-LIFT, S-SCOPE-RESET, S-SEQ-AUX-1, S-SEQ-AUX-2, S-INIT-AUX, and S-READ-AUX. S-SET-AUX additionally overwrites, with NaV, all heap entries guarded by the condition being set.]

B.1.2. Adequacy.

Definition 8 (Translation from source to auxiliary). Given a command c such that Γ, pc ⊢ c : Γ', we define a relation Γ, pc ⊢trans c : Γ' : c̃ that produces a translated command c̃ as part of type-checking. The translation is a straightforward adaptation of the source type system.

Theorem 2 (Adequacy of the auxiliary semantics). Given a program c such that Γ, pc ⊢ c : Γ' and a program d such that Γ, pc ⊢trans c : Γ' : d, the programs c and d agree on outputs and on the policy environments at the times of the outputs.

Proof. By induction on the typing derivation.
[Figure 13 (Auxiliary semantics for cryptographic commands): rules S-PUB-ENCRYPT-AUX, S-PUB-DECRYPT-AUX, S-PUB-DECRYPT-FAIL-AUX, and S-PUB-NEWKEY-AUX. Encryption draws a fresh formal randomness index j from Y(enc) and records it in Rnds at the ciphertext policy; key generation draws a fresh index i from Y(pubkey), records it in Keys at the private-key policy, and binds the formal keys pk_i and sk_i. Decryption with a mismatched key yields NaV.]

[Figure 14 (Auxiliary semantics for heap commands): rules S-HEAP-DEREF-AUX, S-HEAP-UPDATE-AUX, and S-HEAP-NEW-AUX. Allocation additionally records the allocation counter and the erasure policy ↗cnd p in the heap entry.]

    (T-CONST)  Γ ⊢ n : int p
    (T-VAR)    Γ(x) = τ  implies  Γ ⊢ x : τ
    (T-EXP)    Γ ⊢ ei : int pi (i = 1, 2) and {p1, p2} ≤ q  implies  Γ ⊢ e1 op e2 : int q
    (T-TUPLE)  Γ ⊢ ei : bi pi (i = 1, 2) and {p1, p2} ≤ q  implies  Γ ⊢ (e1, e2) : (b1, b2) q
    (T-PROJ)   Γ ⊢ e : (b1, b2) q  implies  Γ ⊢ proji(e) : bi q (i = 1, 2)

    Figure 15. Typing for expressions

B.1.3. Preservation of the auxiliary typing. We extend the notion of well-formedness of typing environments to require that for any expression e such that Γ ⊢ e : privkey_pk p, it must be that pk ≤ p.

Lemma 1 (Preservation). Consider a configuration ⟨c, m, U, H, Y, t | Γ, pc⟩_γ such that Γ, pc ⊢annot c : ∆, pc'. If

    ⟨c, m, U, H, Y, t | Γ, pc⟩_γ −→ ⟨c', m', U', H', Y', t' | Γ', pc''⟩_γ'

then Γ', pc'' ⊢annot c' : ∆, pc'.

Proof. By induction on the typing derivation Γ, pc ⊢annot c : ∆, pc'.

[Figure 16 (Typing rules for the auxiliary commands): rules T-STOP, T-ASSIGN-AUX, T-SET-AUX, T-OUTPUT-AUX, T-SCOPE-COND, T-SCOPE-LOOP, T-SCOPE-CMD, T-SEQ-AUX, T-INIT-AUX, and T-READ-AUX, mirroring the constraints of the source type system of A.1.5 for annotated commands.]
[Figure 17 (Typing rules for auxiliary cryptographic commands): rules T-PUB-NEWKEY-AUX, T-PUB-ENC-AUX, and T-PUB-DEC-AUX, mirroring the constraints of T-PUB-NEWKEY, T-PUB-ENC, and T-PUB-DEC from Figure 8.]

[Figure 18 (Typing rules for auxiliary heap commands): rules T-DEREF-AUX, T-UPDATE-VAR-AUX, and T-NEW-AUX, mirroring the constraints of Figure 10.]

B.2. Equivalence of semantic configurations up to relabel ordering.

B.2.1. Policy sets. We use the notation p for a set of levels p = {p1, . . . , pn}. We write p ≤ q when for all p ∈ p there is q ∈ q such that p ≤ q.

Definition 9 (Policy of a variable in an environment). Given a typing environment Γ and a variable x, we define the operator Γ⟨x⟩ that returns the erasure policy of the variable x in Γ: Γ⟨x⟩ = p, where Γ(x) = b p for some base type b and erasure policy p.

Definition 10 (Currently allowed policies for output). We define the operator

    OutPols(ℓ, Γ) ≜ {p | ∃x. Γ⟨x⟩ ≤ p ∧ cur(p) = ℓ}

B.2.2. Constructing permutations. A permutation list π is a list of tuples of the form (a1, b1) . . . (an, bn) such that (1) the values a1 . . . an are pairwise distinct, and (2) the values b1 . . . bn are pairwise distinct. Given a permutation list π, we construct the corresponding permutation function by defining π(a) to be the value b that corresponds to a in the permutation list, and to be undefined otherwise. Given two sets of natural numbers A and B with equal cardinality, define perm(A, B) to be the function that constructs a permutation list from these sets (a code sketch of this construction appears below):

    perm(A, B) = ε                       if A = B = ∅
    perm(A, B) = (u, v) · perm(A', B')   otherwise, where u = min(A), v = min(B),
                                         A' = A \ {u}, B' = B \ {v}

Given two pairs of environments Y1, Y2 and W1, W2, and a set of policy levels q, define the permutation extension from Y1/W1 to Y2/W2, denoted permext_q(Y2 − Y1, W2 − W1), to be the tuple (φ', ψ') where

    φ' = perm(Y2.Keys↾q \ Y1.Keys↾q, W2.Keys↾q \ W1.Keys↾q)
    ψ' = perm(Y2.Rnds↾q \ Y1.Rnds↾q, W2.Rnds↾q \ W1.Rnds↾q)

B.2.3. Extended memory environment. For clarity, we combine the write-once environment U and the memory m into an extended memory environment, denoted m̄. Formally, given U and m, we define

    m̄(x) = m(x)   when m(x) is defined
    m̄(x) = U(x)   otherwise

B.2.4. Memory equivalence.

Definition 11 (Projected memory). Given a typing environment Γ, define

    (m̄↾Γ,p)(x) = m̄(x)       if Γ⟨x⟩ ≤ p
    (m̄↾Γ,p)(x) = undefined   otherwise

Definition 12 (Projected heap).

    (H↾p)(a) = (v, cnd | i, p')   if p' ≤ p and H(a) = (v, cnd | i, p')
    (H↾p)(a) = (NaV | i, p')      if p' ≰ p and H(a) = (v, cnd | i, p')
    (H↾p)(a) = undefined          otherwise

Definition 13 (Pattern of memory). We lift the notion of value pattern from Section 3.1 (symbolic equivalence of encrypted values) to extended memories, and define the memory pattern for a set of keys S as follows:

    pattern(m̄, S) = λx. pat(m̄(x), S)

Definition 14 (List representation of the heap). Given a heap H, let listify(H) be the list of heap values ordered by allocation. We extend the pattern function to lists by applying it pointwise to the values in the list.
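As promised above, here is a small Java sketch (our own hypothetical encoding) of the perm(A, B) construction of B.2.2: repeatedly pair the minima of two equal-size sets, yielding a bijection between them.

    import java.util.ArrayList;
    import java.util.List;
    import java.util.TreeSet;

    final class Perm {
        static List<int[]> perm(TreeSet<Integer> a, TreeSet<Integer> b) {
            assert a.size() == b.size();
            List<int[]> pairs = new ArrayList<>();
            TreeSet<Integer> ra = new TreeSet<>(a), rb = new TreeSet<>(b);
            while (!ra.isEmpty()) {
                int u = ra.pollFirst(), v = rb.pollFirst();  // u = min(A), v = min(B)
                pairs.add(new int[] {u, v});                 // append (u, v) to π
            }
            return pairs;
        }
    }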
Definition 15 (Observable keys given an initial key set). Given an extended memory m̄, heap H, policy set p, typing environment Γ, and a set of keys S, define the set of observable keys given the initial set S as follows:

    ObsKeys_{p,Γ}(m̄, H | S) ≜ DY-obs({m̄(x) | Γ⟨x⟩ ≤ p} ∪ {v | H(a) = (v, cnd | i, q) ∧ q ≤ p} ∪ S) ∩ {sk_i | i ∈ ℕ}

When we want to be explicit about the permutations used in the various definitions of equivalence up to formal randomness, we use the notation =^{φ,ψ} to denote equality up to substitution of the corresponding permutations.

Definition 16 (Memory and heap equivalence up to randomness with extra keys). Given memories m̄1, m̄2, heaps H1, H2, two sets of keys S1, S2, permutations φ, ψ, policy set p, and environment Γ, define memory equivalence up to randomness with extra keys as follows:

    ⟨m̄1, H1, S1⟩ ≈^{φ,ψ}_{p,Γ} ⟨m̄2, H2, S2⟩  ≜  pattern(m̄1↾Γ,p, S1') =^{φ,ψ} pattern(m̄2↾Γ,p, S2')
                                                ∧ pattern(L1, S1') =^{φ,ψ} pattern(L2, S2')

where Li = listify(Hi↾p) and Si' = ObsKeys_{p,Γ}(m̄i, Hi | Si), for i = 1, 2.

Definition 17 (Initial memory equivalence at p).

    m1 ≈init_{p,Γ} m2  ≜  ⟨m1, H∅, ∅⟩ ≈_{p,Γ} ⟨m2, H∅, ∅⟩

where H∅ is the empty heap.

B.2.5. Properties of memory equivalence.

Lemma 2 (Dropping information about related keys).

    ⟨m̄1, H1, S1⟩ ≈^{φ,ψ}_{p,Γ} ⟨m̄2, H2, S2⟩  implies  ⟨m̄1, H1, S1 \ {i}⟩ ≈^{φ,ψ}_{p,Γ} ⟨m̄2, H2, S2 \ {φ(i)}⟩

Proof. By Definition 16 (memory and heap equivalence up to randomness with extra keys).

Lemma 3 (Memory equivalence at higher levels implies memory equivalence at lower ones).

    ⟨m̄1, H1, S1⟩ ≈^{φ,ψ}_{p∪q,Γ} ⟨m̄2, H2, S2⟩  iff  ⟨m̄1, H1, S1⟩ ≈^{φ,ψ}_{p,Γ} ⟨m̄2, H2, S2⟩ and ⟨m̄1, H1, S1⟩ ≈^{φ,ψ}_{q,Γ} ⟨m̄2, H2, S2⟩.

Proof. By Definition 16 (memory and heap equivalence up to randomness with extra keys).

B.3. Noninterference for expressions.

Lemma 4 (Limited expansion of DY-obs for integers). Consider a set of values V and an integer u. Then

    DY-obs(V ∪ {u}) = DY-obs(V) ∪ {u}

Proof. By the definition of DY-obs, observing that because u is an integer, u is neither a tuple nor, more importantly, sk_i for any i.

Lemma 5 (Noninterference for expressions with initial keys). Consider two extended memories m̄ and s̄, heaps H and J, two sets of keys S and T, policy set q, and permutations φ and ψ such that ⟨m̄, H, S⟩ ≈^{φ,ψ}_{q,Γ} ⟨s̄, J, T⟩. Consider an expression e such that Γ ⊢ e : b p, for some base type b and policy p such that p ≤ q, and two values u and v such that ⟨m̄, e⟩ ⇓ u and ⟨s̄, e⟩ ⇓ v. Define S' = ObsKeys_{q,Γ}(m̄, H | S) and T' = ObsKeys_{q,Γ}(s̄, J | T). Then it must hold that

    (1) pat(u, S') =^{φ,ψ} pat(v, T');
    (2) DY-obs({m̄(x) | Γ⟨x⟩ ≤ q} ∪ {u} ∪ {v' | H(a) = (v', cnd | i, q') ∧ q' ≤ q} ∪ S) ∩ {sk_i | i ∈ ℕ} = S';
    (3) DY-obs({s̄(x) | Γ⟨x⟩ ≤ q} ∪ {v} ∪ {v' | J(a) = (v', cnd | i, q') ∧ q' ≤ q} ∪ T) ∩ {sk_i | i ∈ ℕ} = T'.

Proof. By induction on the typing derivation for expressions.

• Case T-CONST. Requirement (1) is straightforward from the definition of the pattern function in Figure 4. To show requirement (2), observe that u must be an integer. Then, by Lemma 4 (limited expansion of DY-obs for integers),

    DY-obs({m̄(x) | Γ⟨x⟩ ≤ q} ∪ {u} ∪ {v' | H(a) = (v', cnd | i, q') ∧ q' ≤ q} ∪ S)
      = DY-obs({m̄(x) | Γ⟨x⟩ ≤ q} ∪ {v' | H(a) = (v', cnd | i, q') ∧ q' ≤ q} ∪ S) ∪ {u}

Because u is an integer, intersecting these sets with {sk_i | i ∈ ℕ} yields the same set as S'. Requirement (3) is shown similarly.
• Case T-VAR. Follows from the assumption of memory equivalence.
• Case T-EXP. By the induction hypothesis and Lemma 4, applied to the integer result of the operation.
• Case T-TUPLE. By the induction hypothesis and the definition of DY-obs for tuples.
• Case T-PROJ. By the induction hypothesis.
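The definitions and lemmas above lean heavily on the Dolev-Yao observables closure DY-obs from Section 3.1. As a reading aid, here is a hypothetical Java encoding of that closure (the value grammar and the two closure rules are our reconstruction of the standard symbolic attacker, not code from the paper): from a set of values, an observer can split tuples, and can decrypt a ciphertext {v}_{pk_i}^{r_j} whenever the matching secret key sk_i is already observable.

    import java.util.HashSet;
    import java.util.List;
    import java.util.Set;

    sealed interface Val permits IntV, PubK, SecK, Tup, CipherV {}
    record IntV(int n) implements Val {}
    record PubK(int i) implements Val {}
    record SecK(int i) implements Val {}
    record Tup(Val fst, Val snd) implements Val {}
    record CipherV(Val body, int key, int rnd) implements Val {}  // {body}_{pk_key}^{r_rnd}

    final class DY {
        static Set<Val> obs(Set<Val> init) {
            Set<Val> out = new HashSet<>(init);
            boolean changed = true;
            while (changed) {                     // fixpoint of the closure rules
                changed = false;
                for (Val v : List.copyOf(out)) {
                    if (v instanceof Tup t) {
                        changed |= out.add(t.fst()) | out.add(t.snd());
                    } else if (v instanceof CipherV c
                               && out.contains(new SecK(c.key()))) {
                        changed |= out.add(c.body());
                    }
                }
            }
            return out;
        }
    }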
B.4. Coverage. Given an extended environment Γ̃ = (Γ, pc), we define

    Γ̃⟨x⟩ = pc_i    if x = pc_i and pc = pc_0 : . . . : pc_n
    Γ̃⟨x⟩ = Γ⟨x⟩    otherwise

Definition 18 (ConnectedInitPols set). Given a sequence of typing environments Γ̃1, . . . , Γ̃n and a set of policies q,

    ConnectedInitPols(Γ̃1, . . . , Γ̃n | q) ≜ {Γ̃1⟨x1⟩ | ∃x2, . . . , xn : Γ̃1⟨x1⟩ ≤ Γ̃2⟨x2⟩ ≤ . . . ≤ Γ̃n⟨xn⟩ ≤ q}

Definition 19 (Coverage of policies via typing environments). Given two sets of policies p and q and a sequence of typing environments Γ̃1, . . . , Γ̃n, where n ≥ 1, we say that p covers q via Γ̃1, . . . , Γ̃n, written p ⊒_{Γ̃1...Γ̃n} q, and defined as follows:

    p ⊒_{Γ̃1...Γ̃n} q  ≜  ConnectedInitPols(Γ̃1, . . . , Γ̃n | q) ≤ p

B.4.1. Properties of coverage.

Lemma 6 (Symmetric coverage for one typing environment). p ⊒_{Γ̃} q ⟺ {Γ̃⟨x⟩ | Γ̃⟨x⟩ ≤ q} ≤ p.

Proof. Immediate by Definition 19 (coverage of policies via typing environments).

Lemma 7 (Reflexivity under one typing environment). p ⊒_{Γ̃} p.

Proof. Immediate by Definition 19 (coverage of policies via typing environments).

Lemma 8 (Coverage composition). For all policy sets p, q' and all typing environments Γ̃1, . . . , Γ̃n, . . . , Γ̃m we have

    (∃q : p ⊒_{Γ̃1...Γ̃n} q ∧ q ⊒_{Γ̃n+1...Γ̃m} q')  ⟺  p ⊒_{Γ̃1...Γ̃m} q'

Proof. We consider each direction separately.

(⟹) We define two sets X and Y by unfolding the definitions and establishing the corresponding relations:

    X = ConnectedInitPols(Γ̃1, . . . , Γ̃n | q)      = {Γ̃1⟨x1⟩ | ∃x2 . . . xn : Γ̃1⟨x1⟩ ≤ . . . ≤ Γ̃n⟨xn⟩ ≤ q}  ≤ p
    Y = ConnectedInitPols(Γ̃n+1, . . . , Γ̃m | q')   = {Γ̃n+1⟨xn+1⟩ | ∃xn+2 . . . xm : Γ̃n+1⟨xn+1⟩ ≤ . . . ≤ Γ̃m⟨xm⟩ ≤ q'}  ≤ q

We want to show that Z = ConnectedInitPols(Γ̃1, . . . , Γ̃m | q') ≤ p; that is, for all x1 such that Γ̃1⟨x1⟩ ∈ Z, it must be that Γ̃1⟨x1⟩ ≤ p. Indeed, Γ̃1⟨x1⟩ ∈ Z implies there is a sequence x1, . . . , xn, xn+1, . . . , xm such that Γ̃n+1⟨xn+1⟩ ≤ . . . ≤ Γ̃m⟨xm⟩ ≤ q'. This means that Γ̃n+1⟨xn+1⟩ ∈ Y, and consequently it must be that Γ̃n+1⟨xn+1⟩ ≤ q. Then it must be that Γ̃1⟨x1⟩ ∈ X, and hence Γ̃1⟨x1⟩ ≤ p. This completes the proof in this direction.

(⟸) We define q as ConnectedInitPols(Γ̃n+1, . . . , Γ̃m | q'). By construction, this gives us q ⊒_{Γ̃n+1...Γ̃m} q'. Next, we need to show that p ⊒_{Γ̃1...Γ̃n} q. Consider x1 such that Γ̃1⟨x1⟩ ∈ ConnectedInitPols(Γ̃1, . . . , Γ̃n | q); we need to show that Γ̃1⟨x1⟩ ≤ p. From Γ̃1⟨x1⟩ ∈ ConnectedInitPols(Γ̃1, . . . , Γ̃n | q), there exist x2, . . . , xn such that Γ̃1⟨x1⟩ ≤ . . . ≤ Γ̃n⟨xn⟩ ≤ q. From the definition of ≤ on policy sets, this means that there is qn+1 ∈ q such that Γ̃n⟨xn⟩ ≤ qn+1. But from the construction of q, it must be that qn+1 = Γ̃n+1⟨xn+1⟩ for some variable xn+1, and further there are variables xn+2, . . . , xm such that Γ̃n+1⟨xn+1⟩ ≤ . . . ≤ Γ̃m⟨xm⟩ ≤ q'. Putting all these xi together, we obtain that Γ̃1⟨x1⟩ ∈ ConnectedInitPols(Γ̃1, . . . , Γ̃m | q'), and from our assumption it follows that Γ̃1⟨x1⟩ ≤ p, as required.

Lemma 9 (Relation of coverage to relabeling). If p ⊒_{Γ̃1...Γ̃n} q, q' ≤ q, and p ≤ p', then p' ⊒_{Γ̃1...Γ̃n} q'.

Proof. The case for p' is immediate from Definition 18 (ConnectedInitPols set). We consider the case q' ≤ q. We need to show that ConnectedInitPols(Γ̃1, . . . , Γ̃n | q') ≤ p. Define the sets

    S_q  = ConnectedInitPols(Γ̃1, . . . , Γ̃n | q)  = {Γ̃1⟨x1⟩ | ∃x2, . . . , xn : Γ̃1⟨x1⟩ ≤ . . . ≤ Γ̃n⟨xn⟩ ≤ q}
    S_q' = ConnectedInitPols(Γ̃1, . . . , Γ̃n | q') = {Γ̃1⟨x1⟩ | ∃x2, . . . , xn : Γ̃1⟨x1⟩ ≤ . . . ≤ Γ̃n⟨xn⟩ ≤ q'}

Because q' ≤ q, we can show that S_q' ⊆ S_q, and consequently S_q' ≤ S_q ≤ p.

Lemma 10 (Coverage of uninhabited levels).
For all p and q,

    p ⊒_{Γ̃} q and q ≰ p  implies  {Γ̃⟨x⟩ | Γ̃⟨x⟩ ≤ q ∧ Γ̃⟨x⟩ ≰ p} = ∅

Proof. By contradiction. Assume {Γ̃⟨x⟩ | Γ̃⟨x⟩ ≤ q ∧ Γ̃⟨x⟩ ≰ p} ≠ ∅; that is, there is some variable x such that Γ̃⟨x⟩ ≤ q and Γ̃⟨x⟩ ≰ p. Because p ⊒_{Γ̃} q, we have that {Γ̃⟨x⟩ | Γ̃⟨x⟩ ≤ q} ≤ p, and therefore it must be that Γ̃⟨x⟩ ≤ p, which contradicts the original choice of x with Γ̃⟨x⟩ ≰ p.

Lemma 11 (Coverage of a defined variable in a single environment). For any policy set p, extended typing environment Γ̃, and variable x, we have that

    p ⊒_{Γ̃} Γ̃⟨x⟩  implies  Γ̃⟨x⟩ ≤ p

Proof. By Definition 19 (coverage of policies via typing environments), we have ConnectedInitPols(Γ̃ | Γ̃⟨x⟩) ≤ p, or equivalently {Γ̃⟨x1⟩ | Γ̃⟨x1⟩ ≤ Γ̃⟨x⟩} ≤ p. This implies that Γ̃⟨x⟩ ≤ p.

B.5. Relatively low and high events.

Definition 20 (Relatively low and high events). Given an initial configuration ⟨c0, m0, U0, H0, Y0, t0 | Γ̃0⟩_γ0 and a set of policies p, we call a transition a relatively low event, written

    ⟨ci, mi, Ui, Hi, Yi, ti | Γ̃i⟩_γi −→_{p,Γ̃0} ⟨ci+1, mi+1, Ui+1, Hi+1, Yi+1, ti+1 | Γ̃i+1⟩_γi+1

when there exists a sequence of transitions

    ⟨c0, m0, U0, H0, Y0, t0 | Γ̃0⟩_γ0 −→* ⟨ci, mi, Ui, Hi, Yi, ti | Γ̃i⟩_γi −→ ⟨ci+1, mi+1, Ui+1, Hi+1, Yi+1, ti+1 | Γ̃i+1⟩_γi+1

and the last transition is produced by one of the following commands:

    • an assignment of the form {x := e}^q, or
    • a condition set of the form {set(x)}^q, or
    • an output {output(ℓ, e)}^q, or
    • a read command {read x into y}^q, or
    • a heap dereference of the form {x := !e}^q,

where in the above cases p ⊒_{Γ̃0,...,Γ̃i+1} q; or

    • an initialization of a write-once variable init x to e, where p ⊒_{Γ̃0,...,Γ̃i+1} Γ̃i+1⟨x⟩; or
    • a heap update of the form {x ← e}^{q,q'}, where p ⊒_{Γ̃0,...,Γ̃i+1} q or p ⊒_{Γ̃0,...,Γ̃i+1} q'; or
    • a heap allocation of the form {x := new(e, ↗cnd p')}^q, where p ⊒_{Γ̃0,...,Γ̃i+1} q or p ⊒_{Γ̃0,...,Γ̃i+1} ↗cnd p'.

Otherwise, we define the transition to be a relatively high event.

Definition 21 (Low/stop events). Given an initial configuration ⟨c0, m0, U0, H0, Y0, t0 | Γ̃0⟩_γ0 and a set of policies p, we say that a transition produces a p, Γ̃0-low/stop event, written −→_{p,Γ̃0/stop}, when one of the following holds: (1) the transition is a p, Γ̃0-low event, or (2) ci+1 = stop.

B.5.1. Bounded updates in high contexts.

Proposition 1 (Bounded updates in high contexts). Consider a program c, memory m, environment Γ0, a stack pc, and a policy set p0 such that

    (1) Γ0, pc0 ⊢annot c : ∆, pcn;
    (2) ⟨c, m, U, H, Y, t | Γ̃0⟩_γ0 −→* ⟨stop, mn, Un, Hn, Yn, tn | Γ̃n⟩_γn;
    (3) pc is the shortest pc-stack in the above sequence of transitions, i.e., for all j, 0 ≤ j ≤ n, it holds that pcj = pc : pcj' for a possibly empty pcj'; and
    (4) p0 does not cover pc via Γ̃0.

Then

    (1) p0 does not cover pc via Γ̃0 . . . Γ̃n;
    (2) if p0 ⊒_{Γ̃0...Γ̃n} q, then for all x it holds that if Γ̃n⟨x⟩ ≤ q then Γ0(x) = Γn(x) and Γ̃n⟨x⟩ ≤ p0;
    (3) ⟨c, m, U, H, Y, t | Γ̃0⟩_γ0 −→* ⟨stop, mn, Un, Hn, Yn, tn | Γ̃n⟩_γn contains no relatively low events; and
    (4) for all q such that p0 ⊒_{Γ̃0...Γ̃n} q, it holds that
        (a) for all x with Γ̃n⟨x⟩ ≤ q, m(x) = mn(x), and
        (b) listify(H↾q) = listify(Hn↾q).

Proof. By induction on the typing derivation Γ0, pc0 ⊢annot c : ∆, pcn.

Case T-STOP: The only applicable case is when the configuration is related to itself, and that case is immediate.
Case T-SKIP: Immediate, because memories and typing environments are not changed.

Case T-ASSIGN-AUX: We have that c is {x := e}^p. By T-ASSIGN-AUX, it must be that pc0 ≤ p and q ≤ p, where Γ ⊢ e : b q. The only applicable semantic rule is S-ASSIGN-AUX:

    ⟨{x := e}^p, m, U, H, Y, t | Γ0, pc0⟩_γ0 −→ ⟨stop, m1, U, H, Y, t | Γ1, pc0⟩_γ0

where m1 = m[x ↦ v] and Γ1 = Γ0[x ↦ p]. From pc0 = pc : pc0', we obtain that p0 does not cover pc0 via Γ̃0 Γ̃1, and from pc ≤ p we obtain that p0 does not cover p via Γ̃0 Γ̃1. Hence, this transition is indeed a relatively high event.

Consider q' such that p0 ⊒_{Γ̃0 Γ̃1} q'. Then, by Lemma 8 (coverage composition), there is q0 such that p0 ⊒_{Γ̃0} q0 and q0 ⊒_{Γ̃1} q'. We want to show item (4) of the proposition: for any variable x' such that Γ1⟨x'⟩ ≤ q' (where Γ1 = Γ0[x ↦ p]), we have m(x') = m1(x'). We consider the following cases.

    Case q' ≤ q0: There are two possibilities.
        Case x' ≠ x: In this case, m1(x') = m(x'). In addition, Γ1(x') = Γ0(x'), and by Lemma 10 (coverage of uninhabited levels) it must be that Γ̃0⟨x'⟩ ≤ p0, so we are done.
        Case x' = x: Using T-ASSIGN-AUX, we have pc0 ≤ p = Γ̃1⟨x⟩ ≤ q'. But then, by Lemma 9 (relation of coverage to relabeling), it must be that p0 covers pc0 via Γ̃0 Γ̃1, which is a contradiction; therefore this case is impossible.
    Case q' ≰ q0: By Lemma 10 (coverage of uninhabited levels), because q0 ⊒_{Γ̃1} q', we have {Γ̃1⟨y⟩ | Γ̃1⟨y⟩ ≤ q' ∧ Γ̃1⟨y⟩ ≰ q0} = ∅. This means we could not have picked such an x'.

This also shows item (2) of the proposition.

Cases T-PUB-ENC-AUX, T-PUB-DEC-AUX, T-PUB-NEWKEY-AUX: Similar to T-ASSIGN-AUX.
Cases T-DEREF-AUX, T-UPDATE-VAR-AUX, T-NEW-AUX: Similar to T-ASSIGN-AUX.
Cases T-SET-AUX, T-INIT-AUX, T-READ-AUX: Similar to T-ASSIGN-AUX.

Case T-OUTPUT-AUX: This command does not change typing environments. By reasoning similar to the case for assignment, we can show that the output is indeed a relatively high event.

Case T-SEQ-AUX: We have that Γ0, pc0 ⊢annot ca ; cb : ∆, pcn, and by T-SEQ-AUX it must be that Γ0, pc0 ⊢annot ca : Γu, pcu and Γu, pcu ⊢annot cb : Γn, pcn, for some u < n. Consider q such that p0 ⊒_{Γ̃0...Γ̃n} q. By Lemma 8 (coverage composition), there is qu such that p0 ⊒_{Γ̃0...Γ̃u} qu and qu ⊒_{Γ̃u+1...Γ̃n} q. We can also rewrite the latter as qu ⊒_{Γ̃u...Γ̃n} q, because qu ⊒_{Γ̃u} qu by Lemma 6 (symmetric coverage for one typing environment). We want to show that

    (1) Γ̃n⟨x⟩ ≤ q implies Γ0(x) = Γn(x), and
    (2) for any variable x it holds that Γ̃n⟨x⟩ ≤ q implies m(x) = mn(x).

Consider x such that Γ̃n⟨x⟩ ≤ q. By applying the induction hypothesis to cb with initial policy set qu, we have

    (1) Γ̃n⟨x⟩ ≤ q implies Γn(x) = Γu(x) and Γ̃n⟨x⟩ ≤ qu, which gives us Γ̃u⟨x⟩ ≤ qu, and
    (2) Γ̃n⟨x⟩ ≤ q implies mu(x) = mn(x), which gives us mn(x) = mu(x).

By applying the induction hypothesis to ca, we have that

    (1) Γ̃u⟨x⟩ ≤ qu implies m(x) = mu(x), which gives us mu(x) = m(x), and therefore m(x) = mn(x), and
    (2) Γ̃u⟨x⟩ ≤ qu implies Γu(x) = Γ0(x), which gives us Γ0(x) = Γn(x).

Case T-SCOPE-CMD: By the induction hypothesis, noting that the resulting environment Γ' of the scoped command must be bounded by the scoped environment ∆, and observing that we are only interested in scoped transitions where pc remains on the stack.

Case T-SCOPE-COND: By the induction hypothesis, noting that the final environment ∆ is an upper bound of the final environments of each branch.

The remaining cases follow by the induction hypothesis.

B.6. Advancement for relabel ordering.

Proposition 2 (Progress-sensitive advancement for relabel ordering).
Consider a program c0, environment Γ0, stack pc0, a set of policies p0, a natural number N ≥ 0, two configurations ⟨c0, m0, U0, H0, Y0, t0 | Γ0, pc0⟩_γ0 and ⟨c0, s0, X0, J0, W0, r0 | Γ0, pc0⟩_δ0, and two sets of keys S and T, such that

    (1) Γ0, pc0 ⊢annot c0 : ∆, pc';
    (2) {m0, s0} ⊆ Conv(c0);
    (3) p0 ⊒_{Γ̃0} pc0;
    (4) φ0 = perm(Y0.Keys↾p0, W0.Keys↾p0);
    (5) ψ0 = perm(Y0.Rnds↾p0, W0.Rnds↾p0);
    (6) max(S) ≤ Y0(pubkey) and max(T) ≤ W0(pubkey);
    (7) |S| = |T|;
    (8) ⟨m̄0, H0, S⟩ ≈^{φ0,ψ0}_{p0,Γ0} ⟨s̄0, J0, T⟩; and
    (9) ⟨c0, m0, U0, H0, Y0, t0 | Γ0, pc0⟩_γ0 −→^N_{high} ⟨c1, m1, U1, H1, Y1, t1 | Γ̃1⟩_γ1 −→_{p0,Γ̃0/stop} ⟨c2, m2, U2, H2, Y2, t2 | Γ̃2⟩_γ2,
        where the N steps are relatively high events with intermediate environments Γ̃α1, . . . , Γ̃α(N−1).

Then there is M ≥ 0 and there are memories s1, s2 such that

    (1) ⟨c0, s0, X0, J0, W0, r0 | Γ0, pc0⟩_δ0 −→^M_{high} ⟨c1, s1, X1, J1, W1, r1 | ∆̃1⟩_δ1 −→_{p0,Γ̃0/stop} ⟨c2, s2, X2, J2, W2, r2 | Γ̃2⟩_δ2, with intermediate environments Γ̃β1, . . . , Γ̃β(M−1);
    (2) for any q it holds that p0 ⊒_{Γ̃0 Γ̃α1...Γ̃α(N−1) Γ̃1 Γ̃2} q ⟺ p0 ⊒_{Γ̃0 Γ̃β1...Γ̃β(M−1) ∆̃1 Γ̃2} q, where Γ̃2 = (Γ2, pc2);
    (3) p0 ⊒_{Γ̃0 Γ̃α1...Γ̃α(N−1) Γ̃1 Γ̃2} pc2, where Γ̃2 = (Γ2, pc2);
    (4) if we let
        • S0 = ObsKeys_{p0,Γ0}(m̄0, H0 | S),
        • T0 = ObsKeys_{p0,Γ0}(s̄0, J0 | T), and
        • φ2 = φ0 · φ' and ψ2 = ψ0 · ψ', where (φ', ψ') = permext_Q(Y2 − Y0, W2 − W0) and Q = ∪{q | p0 ⊒_{Γ̃0 Γ̃α1...Γ̃α(N−1) Γ̃1 Γ̃2} q},
    then
        (a) for any q such that p0 ⊒_{Γ̃0 Γ̃α1...Γ̃α(N−1) Γ̃1 Γ̃2} q, it holds that ⟨m̄2, H2, S0⟩ ≈^{φ2,ψ2}_{q,Γ2} ⟨s̄2, J2, T0⟩;
        (b) moreover, if ⟨c1, m1, U1, H1, Y1, t1 | Γ̃1⟩_γ1 −→ ⟨c2, m2, U2, H2, Y2, t2 | Γ̃2⟩_γ2 is a p0, Γ̃0-low event, then ⟨c1, s1, X1, J1, W1, r1 | ∆̃1⟩_δ1 −→ ⟨c2, s2, X2, J2, W2, r2 | Γ̃2⟩_δ2 is a p0, Γ̃0-low event, and Γ̃1 = ∆̃1;
        (c) in addition, if this p0, Γ̃0-low event is an output on a channel ℓ such that p0 ⊒_{Γ̃0 Γ̃α1...Γ̃α(N−1) Γ̃1} OutPols(ℓ, Γ̃1), then there are events θ and ι such that
            • t2 = t1 · θ and r2 = r1 · ι, and
            • pat(θ, S') =^{φ2,ψ2} pat(ι, T').

Proof. By strong induction on N.

Case N = 0: We proceed by an inner induction on the typing derivation Γ0, pc0 ⊢annot c0 : ∆0, pc'.

Case T-STOP: Not applicable.

Case T-SKIP: This command modifies neither typing environments nor memories, so we are done immediately.

Case T-ASSIGN-AUX: In this case c0 is {x := e}^p, and we have

    ⟨{x := e}^p, m0, U0, H0, Y0, t | Γ0, pc0⟩_γ0 −→_{p0,Γ̃0/stop} ⟨stop, m0[x ↦ u], U0, H0, Y0, t | Γ0[x ↦ p], pc0⟩_γ0

Because the environments in both runs are updated in the same manner, the s-trace also performs the corresponding assignment:

    ⟨{x := e}^p, s0, X0, J0, W0, r | Γ0, pc0⟩_δ0 −→_{p0,Γ̃0/stop} ⟨stop, s0[x ↦ v], X0, J0, W0, r | Γ0[x ↦ p], pc0⟩_δ0

We set M = 0. This satisfies requirement (1) of the proposition. Requirements (2) and (3) are straightforward. Assume the definitions in (4). Requirement (4c) is vacuously satisfied, since the transition is not an output; we focus on the memory equivalence requirement. Because the semantics of assignment changes neither the key nor the randomness environments, the permutation extension is empty. Hence, we have φ2 = φ0 and ψ2 = ψ0. Let u and v be the values of expression e in memories m0 and s0, respectively. Consider q such that p0 ⊒_{Γ̃0 (Γ0[x↦p], pc0)} q.
Proof. By strong induction on N.

Case N = 0: We proceed by an inner induction on the typing derivation Γ0, pc0 ⊢annot c0 : ∆0, pc′.

T-STOP: Not applicable.

T-SKIP: This command modifies neither typing environments nor memories, so we are done immediately.

T-ASSIGN-AUX: In this case c0 is {x := e}_p, and we have

  ⟨{x := e}_p, m0, U0, H0, Y0, t | Γ̃0, pc0⟩_γ0 ⟶_{p0,Γ̃0}/stop ⟨stop, m0[x ↦ u], U0, H0, Y0, t | Γ0[x ↦ p], pc0⟩_γ0.

Because the environments in both runs are updated in the same manner, the s-trace performs a matching assignment:

  ⟨{x := e}_p, s0, X0, J0, W0, r | Γ̃0, pc0⟩_δ0 ⟶_{p0,Γ̃0}/stop ⟨stop, s0[x ↦ v], X0, J0, W0, r | Γ0[x ↦ p], pc0⟩_δ0.

We set M = 0. This satisfies requirement (1) of the proposition; requirements (2) and (3) are straightforward. Assume the definitions in (4). Requirement (4a) is immediate, and requirement (4c) is vacuously satisfied, because an assignment is not an output. We focus on requirement (4b). Because the semantics of assignment changes neither the key nor the randomness environments, the permutation extension is empty; hence φ2 = φ0 and ψ2 = ψ0. Let u and v be the values of expression e in memories m0 and s0, respectively. Consider q such that p0 ⇝_{Γ̃0 Γ̃1} q, where Γ̃1 = (Γ0[x ↦ p], pc0). We need to show that

  m0[x ↦ u], H0, S0 ≈_{q,Γ0[x↦p]}^{φ0,ψ0} s0[x ↦ v], J0, T0.

Unfolding Definition 16 (Memory and heap equivalence up to randomness with extra keys), this means we need to show that
  • pattern(⌊m0[x ↦ u]⌋_{Γ0[x↦p]}↾q, S∗) =_{φ0,ψ0} pattern(⌊s0[x ↦ v]⌋_{Γ0[x↦p]}↾q, T∗), and
  • pattern(L1, S∗) =_{φ0,ψ0} pattern(L2, T∗),
where
  • L1 = listify(H0↾q),
  • L2 = listify(J0↾q),
  • S∗ = ObsKeys_{q,Γ0[x↦p]}(m0[x ↦ u], H0 | S0),
  • T∗ = ObsKeys_{q,Γ0[x↦p]}(s0[x ↦ v], J0 | T0).

Consider some variable y such that (Γ0[x ↦ p])(y) ≤ q. We want to show that

  pat((m0[x ↦ u])(y), S∗) =_{φ0,ψ0} pat((s0[x ↦ v])(y), T∗).

By Lemma 8 (Coverage composition), p0 ⇝_{Γ̃0 Γ̃1} q implies p0 ⇝_{Γ̃0} p0 and p0 ⇝_{Γ̃1} q. By Lemma 9 (Relation of coverage to relabeling), we obtain that p0 ⇝_{Γ0[x↦p]} (Γ0[x ↦ p])(y), and by Lemma 11 (Coverage of a defined variable in a single environment) it holds that (Γ0[x ↦ p])(y) ≤ p0. By analyzing the DY-observables, we see that the DY-observables at q are bounded by those at p0; we are then done using noninterference for expressions.

T-SET-AUX: We establish memory equivalence similarly to T-ASSIGN-AUX. For heap equivalence we use the fact that S-SET erases the heap values, so the resulting heaps are low-equivalent.

T-OUTPUT-AUX, T-PUB-DEC-AUX, T-INIT-AUX, T-READ-AUX: Similar to T-ASSIGN-AUX.

T-PUB-NEWKEY-AUX: We have that

  ⟨{newkeypair(x1, x2, p)}_{q1,q2}, m0, U0, H0, Y0, t | Γ, pc⟩_γ ⟶ ⟨stop, m0[x1 ↦ pk_i, x2 ↦ sk_i], U0, H0, Y1, t | Γ1, pc⟩_γ

where
  • Y0(pubkey) = i,
  • Y0(Keys) = K,
  • K′ = K[q2 ↦ K(q2) ∪ {i}],
  • Γ1 = Γ[x1 ↦ q1, x2 ↦ q2],
  • Y1 = Y0[pubkey ↦ i + 1, Keys ↦ K′].

We have two cases.
(1) Neither q1 nor q2 is covered by p0. In this case, this key generation is a stop event for both traces and we are done.
(2) One or both of q1 and q2 are covered by p0. Because both runs reach the same command, the s-trace must also increment its pubkey counter, say to i′. We define our key permutation φ2 to be the extension of φ0 that also relates the key indices i and i′. Moreover, in case q2 is covered by p0, we define the sets S′ and T′ to be extensions of S and T that include the corresponding newly generated secret keys. For all other variables, we are done by assumption.
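Case (2) of T-PUB-NEWKEY-AUX repairs the key permutation by relating the two freshly allocated indices. A small Python sketch of this bookkeeping (the counter and dictionary representations are ours, not the paper's):

    # Each run draws the next key index from its pubkey counter, as
    # Y(pubkey) and W(pubkey) do; phi is then extended with the pair (i, i').

    def newkeypair(key_env):
        """Allocate a fresh key index; key_env models Y (or W)."""
        i = key_env['pubkey']
        key_env['pubkey'] = i + 1
        return i  # index of the fresh pair (pk_i, sk_i)

    def match_fresh_keys(phi, key_env_m, key_env_s):
        """Run the allocation in both executions and extend phi."""
        i = newkeypair(key_env_m)
        i_prime = newkeypair(key_env_s)
        phi = dict(phi)
        phi[i] = i_prime
        return phi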
T-PUB-ENC-AUX: We have

  ⟨{x := encrypt(e1, e2)}_p, m, U, H, Y, t | Γ, pc⟩_γ ⟶ ⟨stop, m[x ↦ ⟦v⟧_{pk_i}^{r_j}], U, H, Y′, t | Γ[x ↦ p], pc⟩_γ

where
  • ⟨m, e1⟩ ⇓ pk_i,
  • ⟨m, e2⟩ ⇓ v,
  • Y(enc) = j,
  • Y(Rnds) = R,
  • R′ = R[p ↦ R(p) ∪ {j}],
  • Y′ = Y[enc ↦ j + 1, Rnds ↦ R′].

It must be that the s-trace reaches the same command, and we have
  • ⟨s, e1⟩ ⇓ pk_{i′},
  • ⟨s, e2⟩ ⇓ u,
  • W(enc) = j′,
  • W(Rnds) = R″,
  • R‴ = R″[p ↦ R″(p) ∪ {j′}],
  • W′ = W[enc ↦ j′ + 1, Rnds ↦ R‴].

By reasoning similar to case T-ASSIGN-AUX, consider the case when p is covered by p0, where we need to show that there are φ2 and ψ2 such that

  pat(⟦v⟧_{pk_i}^{r_j}, S′) =_{φ2,ψ2} pat(⟦u⟧_{pk_{i′}}^{r_{j′}}, T′)

where S′ = ObsKeys_{p0,Γ}(m | S) and T′ = ObsKeys_{p0,Γ}(s | T). Define φ2 = φ0, and ψ2 to be the extension of ψ0 that relates j and j′. By T-PUB-ENC-AUX, it must be that Γ ⊢ e1 : pubkey^{pk′} q for some q such that {pc, q} ≤ p. Because environment Γ is well-formed, it must be that if there is an expression e such that Γ ⊢ e : privkey^{pk′} p′, then pk′ ≤ p′. We consider two cases.
(1) p0 ⇝_Γ pk′: In this case, both u and v must have matching patterns by the assumptions of our proposition, and we are done.
(2) ¬(p0 ⇝_Γ pk′): In this case, the corresponding private key sk_i does not belong to the set ObsKeys_{p0,Γ}(m | ∅) (and, similarly, the matching key sk_{φ0(i)} ∉ ObsKeys_{p0,Γ}(s | ∅)). We have two further cases.
  (a) pat(⟦v⟧_{pk_i}^{r_j}, S′) = ⟦·⟧_{pk_i}^{r_j}: in this case the secret key sk_i ∉ S′ and, therefore, the matching key sk_{φ2(i)} ∉ T′, so both patterns are opaque and related by φ2 and ψ2.
  (b) pat(⟦v⟧_{pk_i}^{r_j}, S′) = ⟦pat(v, S′)⟧_{pk_i}^{r_j}: in this case the secret key sk_i ∈ S′ and, therefore, the matching key sk_{φ2(i)} ∈ T′. But then, because m0, H0, S ≈_{p0,Γ0}^{φ0,ψ0} s0, J0, T, we obtain that ⟦pat(v, S′)⟧_{pk_i}^{r_j} =_{φ2,ψ2} ⟦pat(u, T′)⟧_{pk_{φ2(i)}}^{r_{ψ2(j)}}, as desired.
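The case split on pat(·) above is the standard Dolev–Yao pattern construction: a ciphertext is transparent exactly when the observer holds the matching secret key, and is otherwise an opaque blob that is compared only up to the permutations φ and ψ. A compact Python rendering, with a tuple term representation of our own choosing:

    # Terms: ('enc', i, r, payload) stands for the ciphertext of `payload`
    # under pk_i with randomness index r; ('sk', i) and ('pk', i) are keys;
    # ('val', v) is plain data. `keys` is the set of secret-key indices the
    # observer holds (the role ObsKeys plays in the proof).

    def pat(term, keys):
        if term[0] == 'enc':
            _, i, r, payload = term
            if i in keys:                     # sk_i observable: transparent
                return ('enc', i, r, pat(payload, keys))
            return ('box', i, r)              # opaque ciphertext blob
        return term

    def rename(p, phi, psi):
        """Rename key indices by phi and randomness indices by psi."""
        if p[0] == 'enc':
            return ('enc', phi.get(p[1], p[1]), psi.get(p[2], p[2]),
                    rename(p[3], phi, psi))
        if p[0] == 'box':
            return ('box', phi.get(p[1], p[1]), psi.get(p[2], p[2]))
        if p[0] in ('sk', 'pk'):
            return (p[0], phi.get(p[1], p[1]))
        return p

    def pat_equiv(t1, t2, keys1, keys2, phi, psi):
        """pat(t1, S) =_{phi,psi} pat(t2, T), as used in the case analysis."""
        return rename(pat(t1, keys1), phi, psi) == pat(t2, keys2)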
T-SEQ: Assume c0 is ca; cb. We have two cases.
(1) Command ca produces the (p0, Γ̃0)-event. In this case we are done by applying the induction hypothesis to ca.
(2) Command ca runs to completion without producing any (p0, Γ̃0)-events, and the (p0, Γ̃0)-event is produced by command cb. Then we are done by applying the induction hypothesis to ca and cb, and using Lemma 8 (Coverage composition).

T-SCOPE-COND, T-SCOPE-LOOP: Not applicable, because these commands do not step to a terminal configuration in one step.

Inductive case: Again, we proceed by an inner induction on the typing derivation Γ0, pc0 ⊢annot c0 : ∆0, pc′. Cases where commands transition to a terminal configuration in one step are not applicable; hence, we focus on the inductive cases only, using the outer induction hypothesis when necessary.

T-SEQ: By the induction hypothesis.

T-SCOPE-COND: We have Γ0, pc0 ⊢annot S∆{cond {e}_{pc′} then c1 else c2} : ∆, pc0. Assume Γ0 ⊢ e : int p. We consider two possible cases.

Case ∀x ∈ Vars(e). Γ0(x) ≤ p0: Using Lemma 5 (Noninterference for expressions with initial keys), we can show that both executions take the same branch, and we are done by the induction hypothesis.

Case ∃x ∈ Vars(e). Γ0(x) ≰ p0: We have that, for i = 1, 2, it holds that Γ0, pc0 : pc′ ⊢ ci : Γi, pc0 : pc′, where {p, pc0} ≤ pc′. In this case, it holds that pc′ ≰ p0. Assume the two executions pick different branches:

  ⟨S∆{cond {e}_{pc′} then c1 else c2}, m0, U0, H0, Y0, t0 | Γ̃0, pc0⟩_γ0 ⟶ ⟨S∆{c1}, m, U0, H0, Y0, t0 | Γ̃0, pc0 : pc′⟩_γ ⟶* ⟨S∆{stop}, m′, U′, H′, Y′, t′ | Γ̃αn⟩_γ′ ⟶ ⟨stop, m′, U′, H′, Y′, t′ | ∆̃⟩_γ′

and

  ⟨S∆{cond {e}_{pc′} then c1 else c2}, s, X, J, W, r | Γ̃0, pc0⟩_δ ⟶ ⟨S∆{c2}, s, X, J, W, r | Γ̃0, pc0 : pc′⟩_δ ⟶* ⟨S∆{stop}, s′, X′, J′, W′, r′ | Γ̃βm⟩_δ′ ⟶ ⟨stop, s′, X′, J′, W′, r′ | ∆̃⟩_δ′,

where ∆̃ is the scoped environment, an upper bound of both Γ̃αn and Γ̃βm. Denote Γ̃′ = (Γ0, pc0 : pc′). Consider q such that p0 ⇝_{Γ̃0 Γ̃′ … Γ̃αn ∆̃} q. The proof then proceeds as follows.

(1) First, we show that p0 ⇝_{Γ̃0 Γ̃′ … Γ̃βm ∆̃} q. From p0 ⇝_{Γ̃0 Γ̃′ … Γ̃αn ∆̃} q, we have that

  {Γ̃0(x) | ∃x1, x2, …, xn+1 : Γ̃0(x) ≤ Γ̃′(x1) ≤ … ≤ Γ̃αn(xn) ≤ ∆̃(xn+1) ≤ q} ≤ p0.

We want to show that

  {Γ̃0(y) | ∃y1, y2, …, ym+1 : Γ̃0(y) ≤ Γ̃′(y1) ≤ … ≤ Γ̃βm(ym) ≤ ∆̃(ym+1) ≤ q} ≤ p0.

Assume, for the sake of contradiction, that there is a sequence of variables y, y1, y2, …, ym+1 such that

  Γ̃0(y) ≤ Γ̃′(y1) ≤ … ≤ Γ̃βm(ym) ≤ ∆̃(ym+1) ≤ q

and Γ̃0(y) ≰ p0. We consider two cases.
(a) pc′ ≤ q: By construction, all intermediate environments Γ̃′, …, Γ̃αn contain pc′, and we also know there is at least one variable x such that Γ̃0(x) ≤ pc′ and Γ̃0(x) ≰ p0. Consider the chain x, pc′, …, pc′ of length n + 2. From Γ̃0(x) ≰ p0, it must be that this chain cannot be extended so that its (n+3)-rd component is bounded by q, because that would contradict p0 ⇝_{Γ̃0 Γ̃′ … Γ̃αn ∆̃} q. Hence, it must be that

  {xn+1 | pc′ ≤ ∆̃(xn+1) ≤ q} = ∅,

and therefore pc′ ≰ ∆̃(ym+1). This means that ym+1 is not updated in either of the branches, and therefore we can construct a chain consisting of the single variable y followed by n + 2 copies of ym+1, such that

  Γ̃0(y) ≤ Γ̃′(ym+1) ≤ … ≤ Γ̃αn(ym+1) ≤ ∆̃(ym+1) ≤ q.

This contradicts our assumption that p0 ⇝_{Γ̃0 Γ̃′ … Γ̃αn ∆̃} q; hence the sequence y1, …, ym+1 cannot exist.
(b) pc′ ≰ q: Because ∆̃(ym+1) ≤ q, we have that pc′ ≰ ∆̃(ym+1). The rest of the proof of this case is similar to the case above.

(2) From Proposition 1 (Bounded updates in high contexts) applied to c1 and c2, and the facts that Γ̃αn ≤ ∆̃ and Γ̃βm ≤ ∆̃, we obtain both memory equivalence at q and that there are no (p0, Γ̃0)-events within either of the branches.

T-SCOPE-CMD: We have that Γ, pc : pc′ ⊢ S∆{c} : ∆, pc, where Γ, pc : pc′ ⊢ c : Γ′, pc : pc′ and Γ′ ≤ ∆, and hence the scoped command c is not a conditional or a loop. The two matching transitions are S-SCOPE-LIFT, in which case we are done immediately by the induction hypothesis, and S-SCOPE-RESET, in which case we have that

  ⟨S∆{stop}, m, t | Γ, pc : pc′⟩_γ ⟶ ⟨stop, m, t | ∆, pc⟩_γ

and we need to show that p0 ⇝_{Γ̃ ∆̃} pc. We have that pc ≤ pc : pc′, and hence, by Lemma 9 (Relation of coverage to relabeling), from p0 ⇝_{Γ̃} pc : pc′ it must be that p0 ⇝_{Γ̃} pc. Then, because pc is defined in ∆̃ by construction, it must be that pc ⇝_{∆̃} pc, and therefore, by Lemma 8 (Coverage composition), it holds that p0 ⇝_{Γ̃ ∆̃} pc.

T-SCOPE-LOOP: Similar to T-SCOPE-COND. □

B.7. Security for relabel ordering.

Proposition 3 (Backbone for relabel ordering). Consider program c, environment Γ, stack pc, a policy set p, and two configurations ⟨c, m, U, H, Y, t | Γ, pc⟩_γ and ⟨c, s, X, J, W, r | Γ, pc⟩_δ, such that
(1) Γ, pc ⊢annot c : ∆, pc′,
(2) {m, s} ⊆ Conv(c),
(3) m ≈init_{p,Γ} s,
(4) ⟨c, m, U, H, Y, t | Γ, pc⟩_γ ⟶* ⟨c0, m0, U0, H0, Y0, t0 | Γ̃0, pc0⟩_γ0 ⟶*_{p,Γ̃} ⟨c1, m1, U1, H1, Y1, t1 | Γ̃1⟩_γ1 ⟶_{p,Γ̃}/stop ⟨c2, m2, U2, H2, Y2, t2 | Γ̃2⟩_γ2,
(5) ⟨c, s, X, J, W, r | Γ, pc⟩_δ ⟶* ⟨c0, s0, X0, J0, W0, r0 | Γ̃0, pc0⟩_δ0,
(6) for any q it holds that p ⇝_{Γ̃ [Γ̃αi]_{i=1}^{N} Γ̃0} q ⟺ p ⇝_{Γ̃ [∆̃βj]_{j=1}^{M} Γ̃0} q, where we denote the typing environments in the m-execution by Γ̃αi, 1 ≤ i ≤ N, and the typing environments in the s-execution by ∆̃βj, 1 ≤ j ≤ M,
(7) p ⇝_{Γ̃ [Γ̃αi]_{i=1}^{N} Γ̃0} pc0,
(8) there are S0 and T0 such that, for any q such that p ⇝_{Γ̃ [Γ̃αi]_{i=1}^{N} Γ̃0} q, it holds that m0, H0, S0 ≈_{q,Γ̃0}^{φ0,ψ0} s0, J0, T0.

Then there are memories s1, s2, … such that
(1) ⟨c0, s0, X0, J0, W0, r0 | Γ̃0, pc0⟩_δ0 ⟶*_{p,Γ̃} ⟨c1, s1, X1, J1, W1, r1 | ∆̃1⟩_δ1 ⟶_{p,Γ̃}/stop ⟨c2, s2, X2, J2, W2, r2 | ∆̃2⟩_δ2;
(2) for any q it holds that p ⇝_{Γ̃ [Γ̃αi]_{i=1}^{N} Γ̃0 … Γ̃2} q ⟺ p ⇝_{Γ̃ [∆̃βj]_{j=1}^{M} Γ̃0 … ∆̃2} q;
(3) there are sets S2, T2 and permutations φ2, ψ2 such that m2, H2, S2 ≈_{q,Γ̃2}^{φ2,ψ2} s2, J2, T2, for any q such that p ⇝_{Γ̃ … Γ̃2} q;
(4) p ⇝_{Γ̃ … Γ̃2} pc2;
(5) moreover, if ⟨c1, m1, U1, H1, Y1, t1 | Γ̃1⟩_γ1 ⟶_{p,Γ̃} ⟨c2, m2, U2, H2, Y2, t2 | Γ̃2⟩_γ2 is a (p, Γ̃)-event, then ⟨c1, s1, X1, J1, W1, r1 | ∆̃1⟩_δ1 ⟶_{p,Γ̃} ⟨c2, s2, X2, J2, W2, r2 | ∆̃2⟩_δ2 is a (p, Γ̃)-event. In addition, if this (p, Γ̃)-event is an output on channel ℓ such that p ⇝_{Γ̃ … Γ̃1} OutPols(ℓ, Γ̃1), then there are events θ and ι such that
  (a) t2 = t1 · θ and r2 = r1 · ι, and
  (b) pat(θ, S2) =_{φ2,ψ2} pat(ι, T2).

Proof. By Lemma 1 (Preservation), we obtain that Γ0, pc0 ⊢annot c0 : ∆, pc′. Assume that the (p, Γ̃)-event is obtained by an assignment/output/set of level p′. By Lemma 8 (Coverage composition), we get that there is p0 such that p ⇝_{Γ̃ … Γ̃0} p0 and p0 ⇝_{Γ̃0 … Γ̃1} p′, i.e., the same event can be viewed as a (p0, Γ̃0)-event w.r.t. the c0-th configuration. By Proposition 2 (Progress-sensitive advancement for relabel ordering), the s-trace produces a configuration that matches the m-trace w.r.t. this (p0, Γ̃0)-event. By applying Lemma 8 (Coverage composition) once more, we can show that it must also be a (p, Γ̃)-event w.r.t. the initial c-configuration. □
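Several statements above thread the sets S and T through ObsKeys(· | ·), the set of secret keys observable at a level: the keys passed in explicitly, the keys stored directly in observable values, and the keys recoverable by decrypting ciphertexts that are already readable. The fixpoint below, reusing the tuple terms of the previous sketch, is only our reading of that definition:

    # Close the initial key set under decryption: if the observer can open
    # a ciphertext whose payload is (directly) a secret key, that key index
    # becomes observable too.

    def obs_keys(observable_terms, extra_keys):
        keys = set(extra_keys)
        keys |= {t[1] for t in observable_terms if t[0] == 'sk'}
        changed = True
        while changed:
            changed = False
            for t in observable_terms:
                if (t[0] == 'enc' and t[1] in keys
                        and t[3][0] == 'sk' and t[3][1] not in keys):
                    keys.add(t[3][1])
                    changed = True
        return keys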
B.7.1. Security for the relabel ordering.

Lemma 12 (Progress-sensitive security for relabel ordering). Consider program c1, memory m1, level ℓ, and a policy set P̂n, such that
(1) Γ1, pc1 ⊢annot c1 : ∆, pc′,
(2) m1 ∈ Conv(c1),
(3) ⟨c1, m1, U1, H1, Y1, ε | Γ1, pc1⟩_γ1 ⟶* ⟨ci, mi, Ui, Hi, Yi, ti | Γ̃i⟩_γi ⟶* ⟨cn, mn, Un, Hn, Yn, ti · tobs | Γ̃n⟩_γn,
(4) P̂n = ⋃_{j=i..n} ConnectedInitPols(Γ̃1, …, Γ̃i, …, Γ̃j | OutPols(ℓ, Γ̃j)).

Then Conv(c1) ∩ {m′ | m1 ≈init_{P̂n,Γ1} m′} ⊆ k_ℓ(c1, tobs).

Proof. Let Z be the index of the configuration that produces the last event in ti · tobs · θ, i.e.,

  ⟨cZ, mZ, UZ, HZ, YZ, tZ | Γ̃Z⟩_γZ ⟶_{P̂n,Γ̃} ⟨cZ+1, mZ+1, UZ+1, HZ+1, YZ+1, tZ+1 | Γ̃Z+1⟩_γZ+1

where tZ+1 = ti · tobs · θ. Similarly, let A be the index of the configuration that produces the last event in ti. Pick s ∈ Conv(c1) ∩ {m′ | m1 ≈init_{P̂n,Γ1} m′}. We proceed as follows.

• We show that there are two sets of indices idx0, idx1, …, idxN and alt0, alt1, …, altN such that, for all j with 1 ≤ j ≤ N, configuration ⟨c_{idx_{j−1}+1}, m_{idx_{j−1}+1}, U_{idx_{j−1}+1}, H_{idx_{j−1}+1}, Y_{idx_{j−1}+1}, t_{idx_{j−1}+1} | Γ̃_{idx_{j−1}+1}⟩ is "matched" by configuration ⟨c_{alt_{j−1}+1}, s_{alt_{j−1}+1}, X_{alt_{j−1}+1}, J_{alt_{j−1}+1}, W_{alt_{j−1}+1}, r_{alt_{j−1}+1} | ∆̃_{alt_{j−1}+1}⟩, meaning:
(1) c_{idx_{j−1}+1} = c_{alt_{j−1}+1};
(2) P̂n ⇝_{Γ̃1…Γ̃_{idx_{j−1}+1}} q ⟺ P̂n ⇝_{Γ̃1…∆̃_{alt_{j−1}+1}} q;
(3) there are sets S_{j−1} and T_{j−1}, and permutations φ_{j−1} and ψ_{j−1}, such that m_{idx_{j−1}+1}, H_{idx_{j−1}+1}, S_{j−1} ≈_{q,Γ̃[j−1]}^{φ_{j−1},ψ_{j−1}} s_{alt_{j−1}+1}, J_{alt_{j−1}+1}, T_{j−1}, for any q such that P̂n ⇝_{Γ̃1…Γ̃_{idx_{j−1}+1}} q;
(4) P̂n ⇝_{Γ̃1…Γ̃_{idx_{j−1}+1}} pc_{idx_{j−1}+1} and P̂n ⇝_{Γ̃1…∆̃_{alt_{j−1}+1}} pc_{alt_{j−1}+1};
(5) Γ̃_{idx_{j−1}+1} = ∆̃_{alt_{j−1}+1};
(6) for idx_{j−1} ≥ i, if c_{idx_{j−1}} is an ℓ-output command — that is, it has the form {output(ℓ, e)}_q or {output(ℓ, e)}_q; c_{idx_{j−1}+1} — where P̂n ⇝_{Γ̃1…Γ̃_{idx_{j−1}}} OutPols(ℓ, Γ̃_{idx_{j−1}}), then there are events θ and ι such that
  (a) t_{idx_{j−1}+1} = t_{idx_{j−1}} · θ and r_{alt_{j−1}+1} = r_{alt_{j−1}} · ι, and
  (b) pat(θ, S_{j−1}) =_{φ_{j−1},ψ_{j−1}} pat(ι, T_{j−1}).

Because in matched configurations the commands and typing environments must agree, we denote them by c[j−1] and Γ̃[j−1] respectively: c[j−1] = c_{idx_{j−1}+1} = c_{alt_{j−1}+1} and Γ̃[j−1] = Γ̃_{idx_{j−1}+1} = ∆̃_{alt_{j−1}+1}.

We show the existence of these sets of indices by induction on j. For the base case j = 1, we set idx0 and alt0 to 0, we set t0 = t1 = ε, and we set S0 and T0 to empty sets. Therefore, idx0 + 1 points to the initial configuration ⟨c1, m1, U1, H1, Y1, ε | Γ1, pc1⟩_γ1, and, similarly, the configuration indexed by alt0 + 1 is the initial s-configuration.

For the inductive case, we assume that we have two matching configurations indexed by idx_{j−1} and alt_{j−1}, and proceed to construct indices idx_j and alt_j. Given that the m-trace continues as

  ⟨c[j−1], m_{idx_{j−1}+1}, U_{idx_{j−1}+1}, H_{idx_{j−1}+1}, Y_{idx_{j−1}+1}, t_{idx_{j−1}+1} | Γ̃[j−1]⟩ ⟶*_{P̂n,Γ̃} ⟨c_{idx_j}, m_{idx_j}, U_{idx_j}, H_{idx_j}, Y_{idx_j}, t_{idx_j} | Γ̃_{idx_j}⟩ ⟶_{P̂n,Γ̃} ⟨c_{idx_j+1}, m_{idx_j+1}, U_{idx_j+1}, H_{idx_j+1}, Y_{idx_j+1}, t_{idx_j+1} | Γ̃_{idx_j+1}⟩,

we define idx_j ≤ Z to be the index of the configuration producing the next (P̂n, Γ̃)-event. We can then apply Proposition 2 (Progress-sensitive advancement for relabel ordering) to obtain the matching alt_j configuration in the s-trace. For all idx_j ≤ A, we drop S_j and T_j to ∅ using Lemma 2 (Dropping information about related keys).
• By construction, we have that P̂n ⇝_{Γ̃1…Γ̃j} OutPols(ℓ, Γ̃j), and therefore the s-trace agrees on the patterns of the outputs θj that are produced in tobs, producing outputs with matching patterns ιj. Moreover, because of the way Sj and Tj are constructed starting from the first event in tobs, we can show that these sets are "self-sufficient", i.e., further DY observations down the trace do not change the pattern-equivalence of the outputs already produced. Hence the run of c1 on s produces the same observation tobs, i.e., s ∈ k_ℓ(c1, tobs), as required. □

B.8. Security of enforcement.

Definition 22 (Policy consistency w.r.t. a memory). Say that an erasure policy p is consistent w.r.t. a memory m when for all erasure conditions cnd such that cnd ∈ erasureconds(p), it holds that m(cnd) = 0. A policy environment γ is consistent w.r.t. memory m when for all variables x the corresponding erasure policies γ(x) are consistent w.r.t. m.

Lemma 13 (Preservation of policy consistency by policy update). Given a memory m and an erasure policy p such that p is consistent w.r.t. m, a memory m′ such that m′ = m[cnd ↦ 1] for some condition variable cnd, and p′ = upd(m′, p), then p′ is consistent w.r.t. m′.

Proof. By induction on the structure of p, and analysis of the definition of upd(m′, p). □

Lemma 14 (Preservation of policy environment consistency by policy update). Given a memory m and a policy environment γ such that γ is consistent w.r.t. m, a memory m′ such that m′ = m[cnd ↦ 1] for some condition variable cnd, and γ′ = update(m′, γ), then γ′ is consistent w.r.t. m′.

Proof. Direct lifting of Lemma 13 (Preservation of policy consistency by policy update). □

Lemma 15 (Empty erasure conditions constrain relabeling). Consider a memory m, a condition variable cnd, and policies pα and q such that
  • pα is consistent w.r.t. m,
  • cnd ∉ erasureconds(q),
  • pα ≤ q.
Then, for pβ = upd(m[cnd ↦ 1], pα), we have pβ ≤ q.

Proof. By induction on the height of the derivation pα ≤ q. □

Lemma 16 (Relationship between policy and type environments). Given a program c, memory m, and environments γ and Γ, such that
(1) Γ, pc ⊢annot c : ∆, pc′,
(2) ⟨c, m, U0, H, Y0, ε | Γ, pc⟩_γ ⟶* ⟨cn, mn, Un, Hn, Yn, tn | Γ̃n, pcn⟩_γn,
(3) there are variables x, x1, …, xn−1, y such that γ(x) ≤ Γ̃1(x1) ≤ … ≤ Γ̃n−1(xn−1) ≤ Γ̃n(y),
then γn(x) ≤ Γ̃n(y).

Proof. By induction on n. The base case n = 0 is immediate. For the inductive case, we show that whenever

  ⟨c, m, U0, H, Y0, ε | Γ, pc⟩_γ ⟶* ⟨cn−1, mn−1, Un−1, Hn−1, Yn−1, tn−1 | Γ̃n−1, pcn−1⟩_γn−1 ⟶ ⟨cn, mn, Un, Hn, Yn, tn | Γ̃n, pcn⟩_γn

such that there are variables x1, …, xn−1 with γ(x) ≤ Γ̃1(x1) ≤ … ≤ Γ̃n−1(xn−1) ≤ Γ̃n(y) and γn−1(x) ≤ Γ̃n−1(xn−1), then γn(x) ≤ Γ̃n(y). We proceed by induction on the structure of cn−1.
• For all base cases that are not {set(cnd)}_p, we have that γn−1 = γn, and hence γn(x) = γn−1(x) ≤ Γ̃n−1(xn−1) ≤ Γ̃n(y).
• Consider the case where cn−1 is {set(cnd)}_p. We examine the following cases.
  – γn−1(x) = γn(x): This is similar to the case above.
  – γn−1(x) ≠ γn(x): In this case, it must be that
    ∗ mn−1(cnd) = 0,
    ∗ mn = mn−1[cnd ↦ 1],
    ∗ γn−1(x) = pn−1,
    ∗ cnd ∈ erasureconds(pn−1),
    ∗ γn(x) = pn = upd(mn, pn−1),
    ∗ pn−1 ≠ pn.
  By T-SET-AUX, we have that environment Γ̃n−1 is free of erasure condition cnd. In particular, if Γ̃n−1(xn−1) is some level q, then cnd ∉ erasureconds(q). We have that pn−1 = γn−1(x) ≤ Γ̃n−1(xn−1) = q. Lemma 15 (Empty erasure conditions constrain relabeling) gives us pn ≤ q. Thus, we have that γn(x) = pn ≤ q = Γ̃n−1(xn−1) ≤ Γ̃n(y), as required.
• The other cases follow by the induction hypothesis. □
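Definition 22 and Lemmas 13–15 are easier to read against a concrete policy representation. The following Python model is entirely our own encoding: a policy is either a base level or the tuple ('erase', p, cnd, q), read "behave as p until condition cnd is set, then require erasure to q"; upd discharges the conditions that the memory has set.

    # Condition variables map to 0/1 in the memory; setting cnd to 1 fires
    # every ('erase', p, cnd, q) node, which steps to its erasure target q.

    def erasure_conds(policy):
        """All condition variables occurring in the policy (erasureconds)."""
        if isinstance(policy, tuple) and policy[0] == 'erase':
            _, sub, cnd, target = policy
            return {cnd} | erasure_conds(sub) | erasure_conds(target)
        return set()

    def upd(mem, policy):
        """Advance a policy under a memory, discharging set conditions."""
        if isinstance(policy, tuple) and policy[0] == 'erase':
            _, sub, cnd, target = policy
            if mem.get(cnd) == 1:          # condition fired: erase
                return upd(mem, target)
            return ('erase', upd(mem, sub), cnd, upd(mem, target))
        return policy

    def consistent(policy, mem):
        """Definition 22: every condition in the policy is still unset."""
        return all(mem.get(c) == 0 for c in erasure_conds(policy))

Under this encoding, Lemma 13 is the observation that upd removes every node guarded by a freshly set condition, so no set condition survives in upd(m′, p).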
Lemma 17 (Relabeling and current policy). If q ≤ p and cur(p) ⊑ ℓ, then cur(q) ⊑ ℓ.

Proof. By induction on the height of the derivation q ≤ p. □

Lemma 18 (Relationship between current and policy equivalences). Consider a program c such that Γ, pc ⊢ c : Γfin, pc′ and a sequence of transitions

  ⟨c, m, U0, H0, Y0, ε | Γ, pc⟩_γ ⟶* ⟨ci, mi, Ui, Hi, Yi, ti | Γ̃i⟩_γi ⟶ ⟨ci+1, mi+1, Ui, Hi, Yi, ti · θ | Γ̃i⟩_γi.

Then it holds that, for the set pi = ConnectedInitPols(Γ̃, …, Γ̃i | OutPols(ℓ, Γ̃i)), we have

  ind_ℓ(m, γi) = {m′ | m ≈cur_{ℓ,γi} m′} ⊆ {m′ | m ≈init_{pi,γ} m′}.

Proof. Suppose we have m′ such that m ≉init_{pi,γ} m′. That is, there is some variable x such that m(x) ≠ m′(x) and γ(x) ∈ pi. That means there is some variable y such that Γ̃i(y) ∈ OutPols(ℓ, Γ̃i) and there is a chain γ(x) ≤ Γ̃1(x1) ≤ ⋯ ≤ Γ̃i(y). By Lemma 16 (Relationship between policy and type environments), we have γi(x) ≤ Γ̃i(y). Since Γ̃i(y) ∈ OutPols(ℓ, Γ̃i), i.e., cur(Γ̃i(y)) ⊑ ℓ, then by Lemma 17 (Relabeling and current policy), cur(γi(x)) ⊑ ℓ. Therefore, m ≉cur_{ℓ,γi} m′, which establishes the contrapositive of the required inclusion. □

B.8.1. Proof of Theorem 1. We restate our main theorem.

Theorem 1. Given a program c such that Γ, pc ⊢ c : Γfin, a memory m such that m ∈ Conv(c), and

  ⟨c, m, Uinit, Hinit, Yinit, ε⟩_γ ⟶* ⟨ci, mi, Ui, Hi, Yi, ti⟩_γi ⟶* ⟨cn, mn, Un, Hn, Yn, ti · tobs⟩_γn,

it holds that for all levels ℓ

  k_ℓ(c, tobs) ⊇ Conv(c) ∩ ⋂_{i≤j≤n} ind_ℓ(m, γj).

Proof. From the typing derivation Γ, pc ⊢ c : Γfin, we use Definition 8 (Translation from source to auxiliary) and Theorem 2 (Adequacy of the auxiliary semantics) to obtain a decorated matching program d with a typing derivation Γ, pc ⊢annot d : Γfin, pc′ in the auxiliary type system, and the auxiliary trace

  ⟨d, m, Uinit, Hinit, Yinit, ε | Γ, pc⟩_γ ⟶* ⟨di, mi, Ui, Hi, Yi, ti | Γ̃i⟩_γi ⟶* ⟨dn, mn, Un, Hn, Yn, ti · tobs | Γ̃n⟩_γn.

By Lemma 18 (Relationship between current and policy equivalences), we have that for all j

  ind_ℓ(m, γj) ⊆ {m′ | m ≈init_{pj,γ} m′},

where pj = ConnectedInitPols(Γ̃, …, Γ̃j | OutPols(ℓ, Γ̃j)). From this, we have

  ⋂_{i≤j≤n} ind_ℓ(m, γj) ⊆ ⋂_{i≤j≤n} {m′ | m ≈init_{pj,γ} m′} = {m′ | m ≈init_{⋃_{i≤j≤n} pj, γ} m′}.

Then we are done by Lemma 12 (Progress-sensitive security for relabel ordering). □
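Finally, the knowledge-based statement of Theorem 1 can be read operationally: k_ℓ(c, tobs) contains every initial memory that could have produced the observed trace, and the theorem asks that this set be no smaller than the intersection of the indistinguishability sets ind_ℓ(m, γj). A brute-force sketch over a finite universe of candidate memories — with run an assumed oracle for the ℓ-observable trace of a program, and memories represented as hashable values — might look like:

    # Illustrative only: `run`, `memories`, and `ind_sets` are assumptions,
    # standing for the observable-trace semantics, Conv(c), and the sets
    # ind_l(m, gamma_j) along the trace, respectively.

    def knowledge(run, c, t_obs, memories):
        """k_l(c, t_obs): initial memories consistent with the observation."""
        return {m for m in memories if run(c, m) == t_obs}

    def satisfies_theorem1(run, c, t_obs, memories, ind_sets):
        """Check k_l(c, t_obs) >= Conv(c) intersected with all ind_l sets."""
        allowed = set(memories)
        for s in ind_sets:
            allowed &= s
        return allowed <= knowledge(run, c, t_obs, memories)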