Research 158 SRC Report - HP Labs(including bytecode verifier) Figure 1: The Java VM and security...

June 11, 1998

SRCResearchReport 158

A Type System for Java Bytecode Subroutines

Raymie Stata and Martın Abadi

d i g i t a lSystems Research Center130 Lytton AvenuePalo Alto, California 94301

http://www.research.digital.com/SRC/

Systems Research Center

The charter of SRC is to advance both the state of knowledge and the state of the artin computer systems. From our establishment in 1984, we have performed basic andapplied research to support Digital’s business objectives. Our current work includesexploring distributed personal computing on multiple platforms, networking, pro-gramming technology, system modelling and management techniques, and selectedapplications.

Our strategy is to test the technical and practical value of our ideas by buildinghardware and software prototypes and using them as daily tools. Interesting systemsare too complex to be evaluated solely in the abstract; extended use allows us toinvestigate their properties in depth. This experience is useful in the short term inrefining our designs, and invaluable in the long term in advancing our knowledge.Most of the major advances in information systems have come through this strategy,including personal computing, distributed systems, and the Internet.

We also perform complementary work of a more mathematical flavor. Some of itis in established fields of theoretical computer science, such as the analysis of algo-rithms, computational geometry, and logics of programming. Other work exploresnew ground motivated by problems that arise in our systems research.

We have a strong commitment to communicating our results; exposing and testingour ideas in the research and development communities leads to improved under-standing. Our research report series supplements publication in professional jour-nals and conferences. We seek users for our prototype systems among those withwhom we have common interests, and we encourage collaboration with universityresearchers.

A Type System for Java Bytecode Subroutines

Raymie Stata and Martın Abadi

June 11, 1998

This paper will appear in the ACM Transactions on Programming Languages andSystems. A preliminary version appeared in the Proceedings of the 25th ACMSymposium on Principles of Programming Languages.

c©ACM 1998

All rights reserved. Published by permission. Permission to make digital or hardcopies of part or all of this work for personal or classroom use is granted without feeprovided that copies are not made or distributed for profit or commercial advantageand that copies bear this notice and the full citation on the first page. Copyrightsfor components of this work owned by others than ACM, Inc. must be honored.Abstracting with credit is permitted. To copy otherwise, to republish, to post onservers, or to redistribute to lists, requires prior specific permission and/or a fee.Request permission from Publications Dept, ACM Inc., fax +1 (212) 869-0481, [email protected].

Abstract

Java is typically compiled into an intermediate language, JVML, that is interpretedby the Java Virtual Machine. Because mobile JVML code is not always trusted,a bytecode verifier enforces static constraints that prevent various dynamic errors.Given the importance of the bytecode verifier for security, its current descriptionsare inadequate. This paper proposes using typing rules to describe the bytecodeverifier because they are more precise than prose, clearer than code, and easier toreason about than either.

JVML has a subroutine construct which is used for the compilation of Java’s try-finally statement. Subroutines are a major source of complexity for the bytecodeverifier because they are not obviously last-in/first-out and because they require akind of polymorphism. Focusing on subroutines, we isolate an interesting, smallsubset of JVML. We give typing rules for this subset and prove their correctness.Our type system constitutes a sound basis for bytecode verification and a rationalreconstruction of a delicate part of Sun’s bytecode verifier.

Contents

1 Bytecode verification and typing rules 1

2 Overview of JVML subroutines and our type system 4

3 Syntax and informal semantics of JVML0 7

4 Semantics without subroutines 84.1 Dynamic semantics . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94.2 Static semantics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104.3 One-step soundness theorem . . . . . . . . . . . . . . . . . . . . . . . 12

5 Structured semantics 125.1 Dynamic semantics . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125.2 Static semantics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135.3 One-step soundness theorem . . . . . . . . . . . . . . . . . . . . . . . 16

6 Stackless semantics 186.1 Dynamic semantics . . . . . . . . . . . . . . . . . . . . . . . . . . . . 196.2 Static semantics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 196.3 One-step soundness theorem . . . . . . . . . . . . . . . . . . . . . . . 22

7 Main soundness theorem 25

8 Sun’s rules 258.1 Scope . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 258.2 Technical differences . . . . . . . . . . . . . . . . . . . . . . . . . . . 26

9 Other related work 27

10 Conclusions 28

Appendix 28

A Proof of soundness for the structured semantics 28A.1 Preservation of WFCallStack . . . . . . . . . . . . . . . . . . . . . . 28A.2 Lemmas about types . . . . . . . . . . . . . . . . . . . . . . . . . . . 31A.3 Preservation of stack typing . . . . . . . . . . . . . . . . . . . . . . . 34A.4 Preservation of local variable typing . . . . . . . . . . . . . . . . . . 35

B Proof of soundness for the stackless semantics 36B.1 Analogous steps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38B.2 Preservation of Consistent . . . . . . . . . . . . . . . . . . . . . . . . 39

B.2.1 Miscellaneous lemmas . . . . . . . . . . . . . . . . . . . . . . 40B.2.2 Preservation of WFCallStack2 . . . . . . . . . . . . . . . . . . 44B.2.3 Preservation of GoodStackRets . . . . . . . . . . . . . . . . . 47B.2.4 Preservation of GoodFrameRets . . . . . . . . . . . . . . . . . 50

C Proof of main soundness theorem 54C.1 Making progress . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56C.2 Chained soundness theorem . . . . . . . . . . . . . . . . . . . . . . . 57

Acknowledgements 58

References 61

1 Bytecode verification and typing rules

The Java language is typically compiled into an intermediate language that is inter-preted by the Java Virtual Machine (VM) [LY96]. This intermediate language, whichwe call JVML, is an object-oriented language similar to Java. Its features includepackages, classes with single inheritance, and interfaces with multiple inheritance.However, unlike method bodies in Java, method bodies in JVML are sequences ofbytecode instructions. These instructions are fairly high-level but, compared to thestructured statements used in Java, they are more compact and easier to interpret.

JVML code is often shipped across networks to Java VMs embedded in webbrowsers and other applications. Mobile JVML code is not always trusted by theVM that receives it. Therefore, a bytecode verifier enforces static constraints onmobile JVML code. These constraints rule out type errors (such as dereferencing aninteger), access control violations (such as accessing a private method from outsideits class), object initialization failures (such as accessing a newly allocated objectbefore its constructor has been called), and other dynamic errors.

Figure 1 illustrates how bytecode verification fits into the larger picture of Javasecurity. The figure represents trusted and untrusted code. At the base of thetrusted code is the Java VM itself—including the bytecode verifier—plus the operat-ing system, which provides access to privileged resources. On top of this base layeris the Java library, which provides controlled access to those privileged resources.Java security depends on the VM correctly interpreting JVML code, which in turndepends on the verifier rejecting illegal JVML code. If the verifier were broken butthe rest of the VM assumed it was correct, then JVML code could behave in waysnot anticipated by the Java library, circumventing the library’s access controls.

For example, the current implementation of the library contains private methodsthat access privileged resources without performing access control checks. This im-plementation depends on the verifier preventing untrusted code from calling thoseprivate methods. But if the verifier were broken, an attacker might be allowed totreat an instance of a library class as an instance of a class defined by the attacker.In this situation, the class defined by the attacker could be used as a back door. Atruntime, most VMs use dispatch-table indices to name methods. Successful invoca-tion of a method depends only on this index and on passing the right number andtype of arguments to the method. If the verifier allowed the attacker to treat aninstance of the library class as an instance of the back-door class, then the attackercould call the instance’s methods, including the private ones that do not performaccess control checks.

As this example shows, a successful attack requires more than a broken verifierthat accepts rogue JVML code. If, in our example, a VM had some runtime mech-anism for preventing the use of an instance of an unexpected class, then the attack

1

Trusted code

Untrustred code

Hostile JVML code

Librarylevel

Baselevel

Java library (includingreference monitors)

C libraries/syscalls forprotected resources

Java VM(including bytecode verifier)

Figure 1: The Java VM and security

would fail. However, this mechanism might be expensive. Thus, the purpose ofthe verifier is to permit implementations of JVML that are fast without sacrificingsafety. The primary customer of the verifier is the rest of the VM implementation:instruction interpretation code, JIT-compiler routines, and other code that assumesthat illegal input code is filtered out by the verifier.

Given its importance for security, current descriptions of the verifier are deficient.The official specification of the verifier is a prose description [LY96]. Although goodby the standards of prose, this description is ambiguous, imprecise, and hard to rea-son about. In addition to this specification, Sun distributes what could be considereda reference implementation of the verifier. As a description, this implementation isprecise, but it is hard to understand and, like the prose description, is hard to reasonabout. Furthermore, the implementation disagrees with the prose description.

This paper proposes using typing rules to describe the verifier. Typing rulesare more precise than prose, easier to understand than code, and they can be ma-nipulated formally. Such rules would give implementors of the verifier a systematicframework on which to base their code, increasing confidence in its correctness. Suchrules would also give implementors of the rest of the VM an unambiguous statementof what they can and cannot assume about legal JVML code.

From a typing perspective, JVML is interesting in at least two respects:

• In JVML, a location can hold different types of values at different programpoints. This flexibility allows locations to be reused aggressively, allowinginterpreters to save space. Thus, JVML contrasts with most typed languages,in which a location has only one type throughout its scope.

• JVML has subroutines to help compilers generate compact code, for example

2

for Java try-finally statements. JVML subroutines are subsequences of thelarger sequence of bytecode instructions that make up a method’s body. TheJVML instruction jsr jumps to the start of one of these subsequences, and theJVML instruction ret returns from one. Subroutines introduce two significantchallenges to the design of the bytecode verifier: the verifier should ensure thatret is used in a well-structured manner, and it should support a certain kindof polymorphism. We describe these challenges in more detail in Section 2.

This paper addresses these typing problems. It defines a small subset of JVML,called JVML0, presents a type system and a dynamic semantics for JVML0, andproves the soundness of the type system with respect to the dynamic semantics.JVML0 includes only 9 instructions, and ignores objects and several other featuresof JVML. This restricted scope allows us to focus on the challenges introduced bysubroutines. Thus, our type system provides a precise, sound approach to bytecodeverification, and a rational reconstruction of a delicate part of Sun’s bytecode verifier.

We present the type system in three stages:

1. The first stage is a simplified version for a subset of JVML0 that excludes jsrand ret. This simplified version provides an introduction to our notation andapproach, and illustrates how we give different types to locations at differentprogram points.

2. The next stage considers all of JVML0 but uses a structured semantics for jsrand ret. This structured semantics has an explicit subroutine call stack forensuring that subroutines are called on a last-in, first-out basis. In the contextof this structured semantics, we show how to achieve the polymorphism desiredfor subroutines.

3. The last stage uses a stackless semantics for jsr and ret in which return ad-dresses are stored in random-access memory. The stackless semantics is closerto Sun’s. It admits more efficient implementations, but it does not dynamicallyenforce a last-in, first-out discipline on calls to subroutines. Because such adiscipline is needed for type safety, we show how to enforce it statically.

The next section describes JVML subroutines in more detail and summarizesour type system. Section 3 gives the syntax and an informal semantics for JVML0.Sections 4–6 present our type system in the three stages outlined above. Section 7states and proves the main soundness theorem for our type system. Sections 8 and 9discuss related work, including Sun’s bytecode verifier. Section 8 also considers howour type system could be extended to the full JVML. Section 10 concludes. Severalappendices contain proofs.

3

2 Overview of JVML subroutines and our type system

JVML subroutines are subsequences of a method’s larger sequence of instructions;they behave like miniature procedures within a method body. Subroutines are usedfor compiling Java’s try-finally statement.

Consider, for example, the Java method named bar at the top of Figure 2.The try body can terminate in one of three ways: immediately when i does notequal 3, with an exception raised by the call to foo, or with an execution of thereturn statement. In all cases, the compiler must guarantee that the code in thefinally block is executed when the try body terminates. Instead of replicating thefinally code at each escape point, the compiler can put the finally code in a JVMLsubroutine. Compiled code for an escape from a try body executes a jsr to thesubroutine containing the finally code.

Figure 2 illustrates the use of subroutines for compiling try-finally statements.It contains a possible result of compiling the method bar into JVML, putting thefinally code in a subroutine in lines 13–16.

Figure 2 also introduces some of JVML’s runtime structures. JVML bytecodeinstructions read and write three memory regions. The first region is an object heapshared by all method activations; the heap does not play a part in the examplecode of Figure 2. The other two regions are private to each activation of a method.The first of these private regions is the operand stack, which is intended to be usedon a short-term basis in the evaluation of expressions. For example, the instructioniconst 3 pushes the integer constant 3 onto this stack, while ireturn terminates thecurrent method and returns the integer at the top of the stack. The second privateregion is a set of locations known as local variables, which are intended to be used ona longer-term basis to hold values across expressions and statements (but not acrossmethod activations). Local variables are not operated on directly. Rather, values inlocal variables are pushed onto the stack and values from the stack are popped intolocal variables via the load and store instructions respectively. For example, theinstruction aload 0 pushes the object reference in local variable 0 onto the operandstack, while the instruction istore 2 pops the top value off the operand stack andsaves it in local variable 2.

Subroutines pose two challenges to the design of a type system for JVML:

• Polymorphism. Subroutines are polymorphic over the types of the locationsthey do not touch. For example, consider how variable 2 is used in Figure 2. Atthe jsr on line 7, variable 2 contains an integer and is assumed to contain aninteger when the subroutine returns. At the jsr on line 18, variable 2 containsa pointer to an exception object and is assumed to contain a pointer to anexception object when the subroutine returns. Inside a subroutine, the type

4

int bar(int i) {try {

if (i == 3) return this.foo();} finally {

this.ladida();}return i;

}

01 iload 1 // Push i02 iconst 3 // Push 303 if icmpne 10 // Goto 10 if i does not equal 3

// Then case of if statement04 aload 0 // Push this05 invokevirtual foo // Call this.foo06 istore 2 // Save result of this.foo()07 jsr 13 // Do finally block before returning08 iload 2 // Recall result from this.foo()09 ireturn // Return result of this.foo()

// Else case of if statement10 jsr 13 // Do finally block before leaving try

// Return statement following try statement11 iload 1 // Push i12 ireturn // Return i

// finally block13 astore 3 // Save return address in variable 314 aload 0 // Push this15 invokevirtual ladida // Call this.ladida()16 ret 3 // Return to address saved on line 13

// Exception handler for try body17 astore 2 // Save exception18 jsr 13 // Do finally block19 aload 2 // Recall exception20 athrow // Rethrow exception

// Exception handler for finally body21 athrow // Rethrow exception

Exception table (maps regions of code to their exception handlers):Region Target1–12 1713–16 21

Figure 2: Example compilation of try-finally into JVML

5

// Assume variable 0 contains pointer to this01 iconst 1 // Push 102 istore 1 // Initialize variable 1 to 103 jsr 13 // Call subroutine04 aload 0 // Push this05 iconst 0 // Push 006 putfield x // Set this.x to 007 iconst 1 // Push 108 istore 0 // Set variable 0 to 109 iconst 0 // Push 010 istore 1 // Set variable 1 to 011 jsr 13 // Call subroutine12 return // End of method

// Top of subroutine13 iload 1 // Push variable 114 iconst 0 // Push 015 if icmpeq 18 // Goto 18 if variable 1 equals 016 astore 2 // Save return address in variable 217 ret 2 // Return to return address in variable 218 pop // Throw out current return address19 ret 2 // Return to old address saved in variable 2

Figure 3: Illegal JVML code

of a location such as variable 2 can depend on the call site of the subroutine.(Subroutines are not parametric over the types of the locations they touch; thepolymorphism of JVML is thus weaker than that of ML.)

• Last-in, first-out behavior. In most languages, when a return statement inprocedure P is executed, the dynamic semantics guarantees that control willreturn to the point from which P was most recently called. The same is nottrue of JVML. The ret instruction takes a variable as a parameter and jumpsto whatever address that variable contains. This semantics means that, unlessadequate precautions are taken, the ret instruction can transfer control toalmost anywhere.

Using ret to jump to arbitrary places in a program is inimical to static typing,especially in the presence of polymorphism. For example, consider the illegal JVMLmethod body in Figure 3. This method body has two calls to the subroutine thatstarts at line 13, which is polymorphic over variable 0. The first time this subroutineis called, it stores the return address into variable 2 and returns. The second timethis subroutine is called, it returns to the old return address stored away in variable 2.

6

This causes control to return to line 4, at which point the code assumes that variable 0contains a pointer. However, between the first and second calls to the subroutine, aninteger has been stored into variable 0. Thus, the code in Figure 3 will dereferencean integer.

Our type system allows polymorphic subroutines and enforces last-in, first-outbehavior. It consists of rules that relate a program (a sequence of bytecode instruc-tions) to static information about types and subroutines. This information mapseach memory location of the VM to a type at each program point, identifies theinstructions that make up subroutines, indicates the variables over which subrou-tines are polymorphic, and gives static approximations to the dynamic subroutinecall stack.

Our type system guarantees the following properties for well-typed programs:

• Type safety. An instruction will never be given an operand stack with too fewvalues in it, or with values of the wrong type.

• Program counter safety. Execution will not jump to undefined addresses.

• Bounded operand stack. The size of the operand stack will never grow beyonda static bound.

3 Syntax and informal semantics of JVML0

In JVML0, our restricted version of JVML, a program is a sequence of instructions:

P ::= instruction∗

We treat programs as partial maps from addresses to instructions. We write Addr

for the set of all addresses. Addresses are very much like positive integers, and we usethe constant 1 and the function + on addresses. However, to provide more structureto our semantics, we treat numbers and addresses as separate sets. When P is aprogram, we write Dom(P ) for the domain of P (its set of addresses); P [i] is definedonly for i ∈ Dom(P ). We assume that 1 ∈ Dom(P ) for every program P .

In JVML0, there are no classes, methods, or objects. There is no object heap,but there is an operand stack and a set of local variables. We write Var for the setof names of local variables. (In JVML, these names are actually natural numbers,but we do not require this.) Local variables and the operand stack both containvalues. A value is either an integer or an address.

7

JVML0 has only 9 instructions:

instruction ::= inc | pop | push0| load x | store x| if L| jsr L | ret x| halt

where x ranges over Var, and L ranges over Addr. Informally, these instructionsbehave as follows:

• The inc instruction increments the value at the top of the operand stack ifthat value is an integer. The pop instruction pops the top off the operandstack. The push0 instruction pushes the integer 0 onto the operand stack.

• The load x instruction pushes the current value of local variable x onto theoperand stack. The store x instruction pops the top value off the operandstack and stores it into local variable x.

• The if L instruction pops the top value off the operand stack and either fallsthrough when that value is the integer 0 or jumps to L otherwise.

• At address p, the jsr L instruction jumps to address L and pushes returnaddress p + 1 onto the operand stack. The ret x instruction jumps to theaddress stored in x.

• The halt instruction halts execution.

4 Semantics without subroutines

This section introduces our approach and some notation. It presents static anddynamic semantics for the subset of JVML0 that excludes jsr and ret.

We use (partial) maps extensively throughout this paper. When g is a map,Dom(g) is the domain of g; for x ∈ Dom(g), g[x] is the value of g at x, and g[x 7→ v]is the map with the same domain as g defined by the following equation, for ally ∈ Dom(g):

(g[x 7→ v])[y] =

{g[y] x 6= yv x = y

That is, g[x 7→ v] has the same value as g at all points except x, where g[x 7→ v] hasvalue v. We define equality on maps as follows:

f = g ≡ Dom(f) = Dom(g) ∧ ∀x ∈ Dom(f). f [x] = g[x]

8

That is, f = g exactly when f and g have the same domains and have equal valuesat all points.

We often use maps with domain Addr. We call those maps vectors. When F isa vector and i is an address, we may write F i instead of F [i].

We also use strings. The constant ε denotes the empty string. If s is a string,then v · s denotes the string obtained by prepending v to s. Sometimes we treatstrings as maps from indices to integers. Then Dom(s) is the set of indices of s, ands[i] is the ith element of s from the right. Concatenation adds a new index for thenew element but does not change the indices of the old elements; that is:

∀s, i, n. i ∈ Dom(s)⇒ i ∈ Dom(n · s) ∧ (n · s)[i] = s[i]

4.1 Dynamic semantics

We model a state of an execution as a tuple 〈pc, f, s〉, where pc is the programcounter, f is the current state of local variables, and s is the current state of theoperand stack.

• The program counter pc is an address, that is, an element of Addr.

• The current state of local variables f is a total map from Var to the set ofvalues.

• The current state of the stack s is a string of values.

All executions start from states of the form 〈1, f, ε〉, where f is arbitrary and whereε represents the empty stack.

Figure 4 contains a small-step operational semantics for all instructions otherthan jsr and ret. This semantics relies on the judgment

P ` 〈pc, f, s〉 → 〈pc′, f ′, s′〉

which means that program P can, in one step, go from state 〈pc, f, s〉 to state〈pc′, f ′, s′〉. By definition, the judgment P ` 〈pc, f, s〉 → 〈pc′, f ′, s′〉 holds onlywhen P is a program and 〈pc, f, s〉 and 〈pc′, f ′, s′〉 are states. In Figure 4 (and inthe rest of this paper), n matches only integers while v matches any value. Thus,for example, the pattern “n · s” represents a non-empty stack whose top elementis an integer. Note that there is no rule for halt—execution stops when a haltinstruction is reached. Execution may also stop as the result of a dynamic error, forexample attempting to execute a pop instruction with an empty stack.

9

P [pc] = incP ` 〈pc, f, n · s〉 → 〈pc + 1, f, (n+ 1) · s〉

P [pc] = popP ` 〈pc, f, v · s〉 → 〈pc + 1, f, s〉

P [pc] = push0P ` 〈pc, f, s〉 → 〈pc + 1, f, 0 · s〉

P [pc] = load xP ` 〈pc, f, s〉 → 〈pc + 1, f, f [x] · s〉

P [pc] = store xP ` 〈pc, f, v · s〉 → 〈pc + 1, f [x 7→ v], s〉

P [pc] = if LP ` 〈pc, f, 0 · s〉 → 〈pc + 1, f, s〉

P [pc] = if Ln 6= 0

P ` 〈pc, f, n · s〉 → 〈L, f, s〉

Figure 4: Dynamic semantics without jsr and ret

4.2 Static semantics

Our static semantics employs the following typing rules for values:

v is a valuev : Top

n is an integern : Int

K,L are addressesK : (ret-from L)

The type Top includes all values. The type Int is the type of integers. Types of theform (ret-from L) include all addresses. However, the typing rules for programsof Sections 5 and 6 make a more restricted use of address types, preserving stronginvariants. As the syntax (ret-from L) suggests, we use L as the name for thesubroutine that starts at address L, and use (ret-from L) as the type of returnaddresses generated when L is called. Collectively, we refer to the types Top, Int,and (ret-from L) as value types.

Types are extended to stacks as follows:

(Empty hypothesis)ε : ε

v : T s : αv · s : T · α

where T is a value type and α is a string of value types. The empty stack is typedby the empty string; a stack with n values is typed by a string with n types, eachtype correctly typing the corresponding value in the string of values.

A program is well-typed if there exist a vector F of maps from variables to typesand a vector S of strings of types satisfying the judgment:

F, S ` P

10

P [i] = incF i+1 = F i

Si+1 = Si = Int · αi+ 1 ∈ Dom(P )F, S, i ` P

P [i] = if LF i+1 = FL = F i

Si = Int · Si+1 = Int · SLi+ 1 ∈ Dom(P )L ∈ Dom(P )F, S, i ` P

P [i] = popF i+1 = F iSi = T · Si+1

i+ 1 ∈ Dom(P )F, S, i ` P

P [i] = push0F i+1 = F i

Si+1 = Int · Sii+ 1 ∈ Dom(P )F, S, i ` P

P [i] = load xx ∈ Dom(Fi)F i+1 = F i

Si+1 = F i[x] · Sii+ 1 ∈ Dom(P )F, S, i ` P

P [i] = store xx ∈ Dom(Fi)

F i+1 = F i[x 7→ T ]Si = T · Si+1

i+ 1 ∈ Dom(P )F, S, i ` P

P [i] = haltF, S, i ` P

Figure 5: Static semantics without jsr and ret

The vectors F and S contain static information about the local variables and operandstack, respectively. The map F i assigns types to local variables at program point i.The string Si gives the types of the values in the operand stack at program point i.(Hence the size of Si gives the number of values in the operand stack.) For notationalconvenience, the vectors F and S are defined on all of Addr even though P is not;F j and Sj are dummy values for out-of-bounds j.

We have one rule for proving F, S ` P :

F 1 = ES1 = ε

∀i ∈ Dom(P ). F , S, i ` PF, S ` P

where E is the map that maps all variables to Top and ε is (as usual) the emptystring. The first two hypotheses are initial conditions; the third is a local judgmentapplied to each program point. Figure 5 has rules for the local judgment F, S, i ` P .

11

These rules constrain F and S at point i by referring to F j and Sj for all points jthat are control-flow successors of i.

4.3 One-step soundness theorem

The rules above are sound in that any step from a well-typed state leads to anotherwell-typed state:

Theorem 1 For all P , F , and S such that F, S ` P :

∀pc, f, s, pc′, s′, f ′.∧ P ` 〈pc, f, s〉 → 〈pc′, f ′, s′〉∧ ∀y. f [y] : Fpc[y]∧ s : Spc⇒ s′ : Spc′ ∧ ∀y. f ′[y] : Fpc′ [y]

We do not prove this theorem, but we do prove similar theorems for programs in-cluding jsr and ret. (See the appendices.)

5 Structured semantics

This section shows how to handle jsr and ret with the kind of polymorphismdescribed in Section 2. To isolate the problem of polymorphism from the problem ofensuring that subroutines are used in a well-structured manner, this section presentswhat we call the structured semantics for JVML0. This semantics is structuredin that it defines the semantics of jsr and ret in terms of an explicit subroutinecall stack. This section shows how to achieve polymorphism in the context of thestructured semantics. The next section shows how to ensure that subroutines areused in a well-structured manner in the absence of an explicit call stack.


In the structured semantics, we augment the state of an execution to include asubroutine call stack. This call stack holds the return addresses of subroutines thathave been called but have not yet returned. We model this call stack as a string ρof addresses. Thus, the state of an execution is now a four-tuple 〈pc, f, s, ρ〉.

Figure 6 defines the structured dynamic semantics of JVML0. The rules ofFigure 6 use the subroutine call stack to communicate return addresses from jsrinstructions to the corresponding ret instructions. Although ret takes an operandx, the structured dynamic semantics ignores this operand; similarly, the structureddynamic semantics of jsr pushes the return address onto the operand stack as well as

12

P [pc] = incP s 〈pc, f, n · s, ρ〉 → 〈pc + 1, f, (n+ 1) · s, ρ〉

P [pc] = popP s 〈pc, f, v · s, ρ〉 → 〈pc + 1, f, s, ρ〉

P [pc] = push0P s 〈pc, f, s, ρ〉 → 〈pc + 1, f, 0 · s, ρ〉

P [pc] = load xP s 〈pc, f, s, ρ〉 → 〈pc + 1, f, f [x] · s, ρ〉

P [pc] = store xP s 〈pc, f, v · s, ρ〉 → 〈pc + 1, f [x 7→ v], s, ρ〉

P [pc] = if LP s 〈pc, f, 0 · s, ρ〉 → 〈pc + 1, f, s, ρ〉

P [pc] = if Ln 6= 0

P s 〈pc, f, n · s, ρ〉 → 〈L, f, s, ρ〉

P [pc] = jsr LP s 〈pc, f, s, ρ〉 → 〈L, f, (pc + 1) · s, (pc + 1) · ρ〉

P [pc] = ret xP s 〈pc, f, s, pc′ · ρ〉 → 〈pc′, f, s, ρ〉

Figure 6: Structured dynamic semantics

onto the subroutine call stack. These definitions enable us to transfer the propertiesof the structured semantics of this section to the stackless semantics of the nextsection.


The structured static semantics relies on a new typing judgment:

F, S s P

13

This judgment is defined by the rule:

F 1 = ES1 = εR1 = {}

∀i ∈ Dom(P ). R, i ` P labeled∀i ∈ Dom(P ). F , S, i s P

F, S s P

The new, auxiliary judgment

R, i ` P labeled

is used to define what it means to be “inside” a subroutine. Unlike in most lan-guages, where procedures are demarcated syntactically, in JVML0 and JVML theinstructions making up a subroutine are identified by constraints in the static se-mantics. For the instruction at address i, Ri is a subroutine label that identifies thesubroutine to which the instruction belongs. These labels take the form of eitherthe empty set or a singleton set consisting of an address. If an instruction’s labelis the empty set, then the instruction belongs to the top level of the program. Ifan instruction’s label is the singleton set {L}, then the instruction belongs to thesubroutine that starts at address L. Figure 7 contains rules for labeling subroutines.These rules do not permit subroutines to have multiple entry points, but they dopermit multiple exits.

For some programs, more than one R may satisfy both R1 = {} and the con-straints of Figure 7 because the labeling of unreachable code is not unique. It isconvenient to assume a canonical R for each program P , when one exists. (Theparticular choice of R does not matter.) We write RP for this canonical R, and RP,ifor the value of RP at address i.

Much as in Section 5, the judgment

F, S, i s P

imposes local constraints near program point i. For the instructions considered inSection 5 (that is, for all instructions but jsr and ret), the rules are the same as inFigure 5. Figure 8 contains rules for jsr and ret.

The elements of F need not be defined on all variables. For an address i insidea subroutine, the domain of F i includes only the variables that can be read andwritten inside that subroutine. The subroutine is polymorphic over variables outsideDom(F i).

Figure 9 gives an example of polymorphism. (Simpler examples could be foundif the typing rules allowed a more liberal use of Top.) The code, in the rightmost

14

P [i] ∈ {inc, pop, push0, load x, store x}Ri+1 = Ri

R, i ` P labeled

P [i] = if LRi+1 = RL = RiR, i ` P labeled

P [i] = jsr LRi+1 = RiRL = {L}

R, i ` P labeled

P [i] ∈ {halt, ret x}R, i ` P labeled

Figure 7: Rules labeling instructions with subroutines

P [i] = jsr LDom(F i+1) = Dom(F i)Dom(FL) ⊆ Dom(F i)

∀y ∈ Dom(F i)\Dom(FL). F i+1[y] = F i[y]∀y ∈ Dom(FL). FL[y] = F i[y]SL = (ret-from L) · Si

i+ 1 ∈ Dom(P )L ∈ Dom(P )F, S, i s P

P [i] = ret xRP,i = {L}

∀j. P [j] = jsr L⇒(∧ ∀y ∈ Dom(F i). F j+1[y] = F i[y]∧ Sj+1 = Si

)F, S, i s P

Figure 8: Structured static semantics for jsr and ret

15

F i[0] F i[1] Si i P [i]Top Top ε 01 push0Top Top Int · ε 02 store 1Top Int ε 03 jsr 6(ret-from 6) Int ε 04 jsr 11(ret-from 11) (ret-from 15) Int · ε 05 haltTop Int (ret-from 6) · ε 06 store 0(ret-from 6) Int ε 07 load 1(ret-from 6) Int Int · ε 08 jsr 15(ret-from 6) (ret-from 15) Int · ε 09 store 1(ret-from 6) Int ε 10 ret 0(ret-from 6) Int (ret-from 11) · ε 11 store 0(ret-from 11) Int ε 12 load 1(ret-from 11) Int Int · ε 13 jsr 15(ret-from 11) (ret-from 15) Int · ε 14 ret 0undefined Int (ret-from 15) · Int · ε 15 store 1undefined (ret-from 15) Int · ε 16 incundefined (ret-from 15) Int · ε 17 ret 1

Figure 9: Example of typing

column of Figure 9, computes the number 2 and leaves it at the top of the stack.The code contains a main body and three subroutines, which we have separated withhorizontal lines. The subroutine that starts at line 6 increments the value in variable0. The subroutine that starts at line 11 also increments the value in variable 0 butleaves its result on the stack. These subroutines are implemented in terms of theone that starts at line 15, which increments the value at the top of the stack. Thislast subroutine is polymorphic over variable 0.


To state the one-step soundness theorem for the structured semantics, two definitionsare needed.

In the previous section, where programs do not have subroutines, the type oflocal variable x at program point pc is denoted by Fpc[x]. However, for programpoints inside subroutines, this definition is not quite adequate. For x ∈ Dom(Fpc),the type of x is still Fpc[x]. For other x, the type of x depends on where executionhas come from. A subroutine is polymorphic over local variables that are not inDom(Fpc): different call sites of the subroutine may have values of different typesin those local variables. Therefore, we construct an assignment of types to local

16

variables that takes into account where execution has come from. The expression:

F(F, pc, ρ)[x]

denotes the type assigned to local variable x when execution reaches point pc witha subroutine call stack ρ. The type F(F, pc, ρ)[x] is defined by the rules:

x ∈ Dom(Fpc)F(F, pc, ρ)[x] = Fpc[x]

(tt0)

x 6∈ Dom(Fpc)F(F, p, ρ)[x] = T

F(F, pc, p · ρ)[x] = T

(tt1)

Note that F(F, pc, ρ)[x] is not defined when x is not in Dom(Fpc) or any of theDom(F p) for the p’s in ρ. However, the theorems and lemmas in the followingsubsections consider only situations in which F is defined.

The second auxiliary definition for our theorem concerns an invariant on thestructure of the subroutine call stack. The theorem of the previous section saysthat, when a program takes a step from a well-typed state, it reaches a well-typedstate. With subroutines, we need to assume that the subroutine call stack of thestarting state is well-formed. For example, if the address p is in the call stack,then it must be the case that the instruction at p − 1 is a jsr instruction. Thenew theorem says that if the program takes a step from a well-typed state with awell-formed subroutine call stack, then it reaches a well-typed state with a well-formed subroutine call stack. The following judgment expresses what we mean by awell-formed subroutine call stack:

WFCallStack(P, F , pc, ρ)

This judgment is defined by the rules:

Dom(Fpc) = Var

WFCallStack(P, F , pc, ε)(wf0)

P [p− 1] = jsr LL ∈ RP,pc

Dom(Fpc) ⊆ Dom(F p)WFCallStack(P, F , p, ρ)

WFCallStack(P, F , pc, p · ρ)

(wf1)

Informally, a subroutine call stack ρ is well-formed when:

17

• given a return address p in ρ, the instruction just before p is a jsr instructionthat calls the routine from which p returns;

• the current program counter pc and all return addresses in ρ have the correctsubroutine label associated with them;

• the return address at the bottom of the call stack is for code that can accessall local variables;

• any variable that can be read and written by a callee can also be read andwritten by its caller.

With these two definitions, we can state a one-step soundness theorem for thestructured semantics. The proof of this theorem is in Appendix A.

Theorem 2 For all P , F , and S such that F, S s P :

∀pc, f, s, ρ, pc′, f ′, s′, ρ′.∧ P s 〈pc, f, s, ρ〉 → 〈pc′, f ′, s′, ρ′〉∧ WFCallStack(P, F , pc, ρ)∧ ∀y. ∃T. F(F, pc, ρ)[y] = T ∧ f [y] : T∧ s : Spc⇒∧ WFCallStack(P, F , pc′, ρ′)∧ ∀y. ∃T ′. F(F, pc′, ρ′)[y] = T ′ ∧ f ′[y] : T ′

∧ s′ : Spc′

6 Stackless semantics

The remaining problem is to eliminate the explicit subroutine call stack of the pre-vious section, using instead the operand stack and local variables to communicatereturn addresses from a jsr to a ret. As discussed in Section 2, when the semanticsof jsr and ret are defined in terms of the operand stack and local variables, uncon-trolled use of ret combined with the polymorphism of subroutines is inimical to typesafety. This section presents a static semantics that rules out problematic uses ofret. This static semantics restricts programs to operate as if an explicit subroutinecall stack like the one from the previous section were present. In fact, the soundnessargument for the stackless semantics relies on a simulation between the structuredsemantics and the stackless semantics.

The static and dynamic semantics described in this section are closest to typicalimplementations of JVML. Thus, we consider these the official semantics of JVML0.

18

P [pc] = jsr LP ` 〈pc, f, s〉 → 〈L, f, (pc + 1) · s〉

P [pc] = ret xP ` 〈pc, f, s〉 → 〈f [x], f, s〉

Figure 10: Stackless dynamic semantics for jsr and ret


The stackless dynamic semantics consists of the rules of Figure 4 plus the rulesfor jsr and ret of Figure 10. The jsr instruction pushes the return address ontothe operand stack. To use this return address, a program must first pop it intoa local variable and then reference that local variable in a ret instruction. Theret x instruction extracts an address from local variable x; if x does not contain anaddress, then the program gets stuck (because 〈f [x], f, s〉 is not a state).


To define the stackless static semantics, we revise the rule for F, S ` P of Section 4.The new rule is:

F 1 = ES1 = εC1 = ε

∀i ∈ Dom(P ). C, i ` P strongly labeled∀i ∈ Dom(P ). F , S, i ` P

F, S ` PThe new, auxiliary judgment

C, i ` P strongly labeled

constrains C to be an approximation of the subroutine call stack. Each element of Cis a string of subroutine labels. For each address i, the string Ci is a linearization ofthe subroutine call graph to i. Of course, such a linearization of the subroutine callgraph exists only when the call graph is acyclic, that is, when subroutines do notrecurse. (We believe that we can prove our theorems while allowing recursion, butdisallowing recursion simplifies our proofs and agrees with Sun’s specification [LY96,p. 124].) Figure 11 contains the rules for this new judgment; note that a subsequenceis not necessarily a consecutive substring. Figure 12 shows an application of the rules

19

P [i] ∈ {inc, pop, push0, load x, store x}Ci+1 = Ci


P [i] = if LCi+1 = CL = Ci


P [i] = jsr LL 6∈ Ci

Ci+1 = CiCL = L · c

Ci is a subsequence of cC, i ` P strongly labeled

P [i] ∈ {halt, ret x}C, i ` P strongly labeled

Figure 11: Rules approximating the subroutine call stack at each instruction

Ci i P [i]ε 01 push0ε 02 store 1ε 03 jsr 6ε 04 jsr 11ε 05 halt

6 · ε 06 store 06 · ε 07 load 16 · ε 08 jsr 156 · ε 09 store 16 · ε 10 ret 0

11 · ε 11 store 011 · ε 12 load 111 · ε 13 jsr 1511 · ε 14 ret 0

15 · 11 · 6 · ε 15 store 115 · 11 · 6 · ε 16 inc15 · 11 · 6 · ε 17 ret 1

Figure 12: Example of C labeling

20

P [i] = jsr LDom(F i+1) = Dom(F i)Dom(FL) ⊆ Dom(F i)

∀y ∈ Dom(F i)\Dom(FL). F i+1[y] = F i[y]∀y ∈ Dom(FL). FL[y] = F i[y]SL = (ret-from L) · Si(ret-from L) 6∈ Si

∀y ∈ Dom(FL). FL[y] 6= (ret-from L)i+ 1 ∈ Dom(P )L ∈ Dom(P )F, S, i ` P

P [i] = ret xRP,i = {L}x ∈ Dom(F i)

F i[x] = (ret-from L)

∀j. P [j] = jsr L⇒(∧ ∀y ∈ Dom(F i). F j+1[y] = F i[y]∧ Sj+1 = Si

)F, S, i ` P

Figure 13: Stackless static semantics for jsr and ret

of Figure 11 to the code from Figure 9. In this example, the order of 6 and 11 couldbe reversed in C15, C16, and C17.

As with R, more than one C may satisfy both C1 = ε and the constraints inFigure 11. We assume a canonical C for each program P , when one exists. We writeCP for this canonical C, and CP,i for the value of CP at address i. Programs thatsatisfy the constraints in Figure 11 also satisfy the constraints in Figure 7; we defineRP from CP as follows:

RP,i =

{{} when CP,i = ε{L} when CP,i = L · c for some c

On the other hand, programs with recursive subroutines may satisfy the constraintsin Figure 7 but never those in Figure 11.

Figure 13 contains the rules that define F, S, i ` P for jsr and ret; rules forother instructions are in Figure 5. The rule for jsr L assigns the type (ret-from L)to the return address pushed onto the operand stack. This type will propagate to anylocation into which this return address is stored, and it is checked by the following

21

hypotheses in the rule for ret:

x ∈ Dom(F i)F i[x] = (ret-from L)

Typing return addresses helps ensure that the return address used by a subroutine Lis a return address for L, not for some other subroutine. By itself, ensuring that thereturn address used by a subroutine L is a return address for L does not guaranteelast-in, first-out behavior. One also has to ensure that the only return address for Lavailable inside L is the most recent return address, not one tucked away during aprevious invocation. This is achieved by the following hypotheses in the rule for jsr:

(ret-from L) 6∈ Si∀y ∈ Dom(FL). FL[y] 6= (ret-from L)

These hypotheses guarantee that the only value of type (ret-from L) availableinside L is the most recent value of this type pushed by the jsr instruction. (Thesehypotheses might be redundant for reachable code; we include them because ourrules apply also to unreachable code.) Except for the lines discussed above, therules for jsr and ret are the same as those of the structured static semantics. Asan example, we may simply reuse the one of Figure 9, since the typing given thereremains valid in the stackless static semantics.


The one-step soundness theorem for the stackless semantics requires some auxiliarydefinitions. We develop those definitions top-down, after stating the theorem. Theproof of the theorem is in Appendix B.


∀pc, f, s, ρ, pc′, f ′, s′.∧ P ` 〈pc, f, s〉 → 〈pc′, f ′, s′〉∧ Consistent(P, F , S, pc, f, s, ρ)∧ ∀y. ∃T. F(F, pc, ρ)[y] = T ∧ f [y] : T∧ s : Spc⇒ ∃ρ′.∧ Consistent(P, F , S, pc′, f ′, s′, ρ′)∧ ∀y. ∃T ′. F(F, pc′, ρ′)[y] = T ′ ∧ f ′[y] : T ′

∧ s′ : Spc′

In this theorem, the judgment Consistent performs several functions:

22

• Like WFCallStack in the one-step soundness theorem for the structured se-mantics, Consistent implies an invariant on the structure of the subroutine callstack.

• In the stackless semantics, the subroutine call stack is not explicit in the dy-namic state. The judgment Consistent relates an implicit subroutine call stack(ρ or ρ′) to the state of an execution.

• The implicit subroutine call stacks of the stackless semantics are closely relatedto the explicit subroutine call stacks of the structured semantics. In particular,the following implication holds:

∀pc, f, s, ρ, pc′, f ′, s′.∧ P ` 〈pc, f, s〉 → 〈pc′, f ′, s′〉∧ Consistent(P, F , S, pc, f, s, ρ)⇒ ∃ρ′. P s 〈pc, f, s, ρ〉 → 〈pc′, f ′, s′, ρ′〉

Thus, Consistent helps us relate the structured and the stackless semantics.

The judgment Consistent is the conjunction of four other judgments:

Consistent(P, F , S, pc, f, s, ρ) ≡∧ WFCallStack(P, F , pc, ρ)∧ WFCallStack2(P, pc,ΛP (ρ), ρ)∧ GoodStackRets(Spc, s,ΛP (ρ), ρ)∧ GoodFrameRets(F, pc, {}, f,ΛP (ρ), ρ)

The first judgment is the judgment WFCallStack from the previous section. Thenext judgment adds more constraints on the subroutine call stack. The last twojudgments constrain the values of return addresses in s (the operand stack) and f(local variables) to match those found in ρ (the implicit subroutine call stack).

The auxiliary function Λ returns the sequence of caller labels associated with asequence of return addresses. It is a partial function defined by two rules:

(Empty hypothesis)ΛP (ε) = ε

(l0)

P [p− 1] = jsr LΛP (ρ′) = λ′

ΛP (p · ρ′) = L · λ′(l1)

According to this definition, ΛP (ρ) is the string of labels of subroutines that havebeen called in ρ. The function ΛP (ρ) is used only when P is well-typed andWFCallStack holds, in which case ΛP (ρ) is always defined.

23

The extensions of WFCallStack made by WFCallStack2 are defined by two rules:

CP,pc = ε

WFCallStack2(P, pc, ε, ε)(wf20)

L 6∈ λ′L · λ′ is a subsequence of CP,pc

WFCallStack2(P, p, λ′, ρ′)WFCallStack2(P, pc, L · λ′, p · ρ′)

(wf21)

These rules ensure that no recursive calls have occurred. They also ensure thatthe dynamic subroutine call stack is consistent with the static approximation of thesubroutine call stack.

The judgment GoodStackRets constrains the values of return addresses in theoperand stack:

∀j, L. j ∈ Dom(α) ∧ α[j] = (ret-from L)⇒ GoodRet(L, s[j], λ, ρ)GoodStackRets(α, s, λ, ρ)

(gsr0)

(Empty hypothesis)GoodRet(K, p, ε, ε)

(gr0)

(Empty hypothesis)GoodRet(K, p,K · λ, p · ρ)

(gr1)

K ′ 6= KGoodRet(K, p, λ′, ρ′)

GoodRet(K, p,K ′ · λ′, p′ · ρ′)(gr2)

In these definitions, we introduce GoodRet as an auxiliary judgment. These defi-nitions say that, when a subroutine L has been called but has not returned, thevalues of return addresses for L found in s (the operand stack) agree with the returnaddress for L in ρ (the implicit subroutine call stack).

Similarly to the judgment GoodStackRets, the judgment GoodFrameRets con-strains the values of return addresses in local variables:

(Empty hypothesis)GoodFrameRets(F, pc, µ, f, ε, ε)

(gfr0)

∀y, L.(∧ y ∈ (Dom(Fpc)\µ)∧Fpc[y] = (ret-from L)

)⇒ GoodRet(L, f [y],K · λ′, p · ρ′)

GoodFrameRets(F, p,Dom(Fpc), f, λ′, ρ′)GoodFrameRets(F, pc, µ, f,K · λ′, p · ρ′)

(gfr1)

24

Because subroutines may not be able to read and write all variables, GoodFrameRetscannot constrain all variables simultaneously in the way that GoodStackRets con-strains all elements of the operand stack. Instead, the rules for GoodFrameRetsmust work inductively on the subroutine call stack, looking at the return addressesavailable to each subroutine in turn.

7 Main soundness theorem

Our main soundness theorem is:


∀pc, f0, f, s.(∧ P ` 〈1, f0, ε〉 →∗ 〈pc, f, s〉∧ 6∃pc′, f ′, s′. P ` 〈pc, f, s〉 → 〈pc′, f ′, s′〉

)⇒ P [pc] = halt ∧ s : Spc

This theorem says that if a computation stops, then it stops because it has reacheda halt instruction, not because the program counter has gone out of bounds orbecause a precondition of an instruction does not hold. This theorem also says thatthe operand stack is well-typed when a computation stops. This last condition isimportant because, when a JVML method returns, its return value is on the operandstack.

The proof of the main soundness theorem is in Appendix C.

8 Sun’s rules

Sun has published two descriptions of the bytecode verifier, a prose specificationand a reference implementation. This section compares our rules with both of thesedescriptions.

8.1 Scope

While our rules simply check static information, Sun’s bytecode verifier infers thatinformation. Inference may be important in practice, but only checking is crucialfor type safety (and for security). It is therefore reasonable to study checking apartfrom inference.

JVML has around 200 instructions, while JVML0 has only 9. A rigorous treat-ment of most of the remaining JVML instructions should pose only minor problems.

25

In particular, many of these instructions are for well understood, arithmetic opera-tions; small difficulties may arise because of their exceptions and other idiosyncrasies.The other instructions (around 20) concern objects and concurrency. Their rigor-ous treatment would require significant additions to our semantics—for example, amodel of the heap. Fortunately, some of these additions are well understood in thecontext of higher-level, typed languages. As discussed below, Freund and Mitchellare currently extending our rules to a large subset of JVML.

8.2 Technical differences

Our rules differ from Sun’s reference implementation in the handling of recursivesubroutines. Sun’s specification disallows recursive subroutines, as do our rules, butSun’s reference implementation allows recursion in certain cases. We believe thatrecursion is sound in the sense that it does not introduce security holes. However,recursion is an unnecessary complication since it is not useful for compiling Java.Therefore, we believe that the specification should continue to disallow recursion andthat the reference implementation should be corrected.

Our rules deviate from Sun’s specification and reference implementation in a fewrespects.

• Sun’s rules forbid load x when x is uninitialized or holds a return address.Our rules are more general without compromising soundness.

• Sun’s rules allow at most one ret instruction per subroutine, while our rulesallow an arbitrary number.

• According to Sun’s rules, “each instruction keeps track of the list of jsr targetsneeded to reach that instruction” [LY96, p. 135]. Using this information, Sun’srules allow a ret to return not only from the most recent call but also from callsfurther up the subroutine call stack. Adding this flexibility to our rules shouldnot be difficult, but it would complicate the structured dynamic semantics andwould require additional information in the static semantics.

Finally, our rules differ from Sun’s reference implementation on a couple of otherpoints. Sun’s specification is ambiguous on these points and, therefore, does notprovide guidance.

• Sun’s reference implementation does not constrain unreachable code. Our rulesput constraints on all code. Changing our rules to ignore unreachable codewould not require fundamental changes.

26

• When it comes to identifying what subroutine an instruction belongs to, ourrules are more restrictive than the rules implicit in Sun’s reference implementa-tion. The flexibility of Sun’s reference implementation is important for compil-ing finally clauses that can throw exceptions. Changing our rules to captureSun’s approach would not be difficult, but changing our soundness proof tosupport this approach may be.

9 Other related work

In addition to Sun’s, there exist several implementations of the bytecode verifier.Only recently has there been any systematic attempt to understand all these imple-mentations. In particular, the Kimera project has tested several implementations,pointing out some mistakes and discrepancies [SMB97]. We take a complementaryapproach, based on rigorous reasoning rather than on testing. Both rigorous rea-soning and testing may affect our confidence in bytecode verification. While testingdoes not provide an adequate replacement for precise specifications and proofs, it isa cost-effective way to find certain flaws and oddities.

More broadly, there have been several other implementations of the Java VM. Ofparticular interest is a partial implementation developed at Computational Logic,Inc. [Coh97]. This implementation is defensive, in the sense that it includes strong(and expensive) dynamic checks that remove the need for bytecode verification. Theimplementation is written in a formal language, and is intended as a model ratherthan for production use. Ultimately, one may hope to prove that the defensiveimplementation is equivalent to an aggressive implementation plus a sound bytecodeverifier (perhaps one based on our rules).

There have also been typed intermediate languages other than JVML. Severalhave been developed for ML and Haskell [TIC97]. Here we discuss the TIL inter-mediate languages [Mor95, MTC+96] as representative examples. The TIL inter-mediate languages provide static guarantees similar to those of JVML. Althoughthese languages have sophisticated type systems, they do not include an analogueto JVML subroutines; instead, they include constructs as high-level as Java’s try-finally statement. Therefore, the main problems addressed in this paper do notarise in the context of TIL.

Finally, the literature contains many proofs of type soundness for higher-levellanguages, and in particular proofs for a fragment of Java [DE97, Nv98, Sym97].Those proofs have not had to deal with JVML peculiarities (in particular, withsubroutines); nevertheless, their techniques may be helpful in extending our work tothe full JVML.

In summary, there has not been much work closely related to ours. We do not

27

find this surprising, given that the handling of subroutines is one of the most originalparts of the bytecode verifier; it was not derived from prior papers or systems [Yel97].However, interest in the formal treatment of bytecode verification is mounting; sev-eral approaches are currently being pursued [FM98, Gol97, HT98, Qia98, Sar97].Goldberg, Qian, and Saraswat all develop other formal frameworks for bytecode ver-ification, basing them on constraints and dataflow analysis; their work is rather broadand not focused on subroutines. Hagiya and Tozawa generalize our rules for subrou-tines. Building on our type system, Freund and Mitchell study object initializationand its problematic interaction with subroutines; work is under way on a subset ofJVML that includes objects, classes, constructors, interfaces, and exceptions.

10 Conclusions

The bytecode verifier is an important part of the Java VM; through static checks, ithelps reconcile safety with efficiency. Common descriptions of the bytecode verifierare ambiguous and contradictory. This paper suggests the use of a type system asan alternative to those descriptions. It explores the viability of this suggestion bydeveloping a sound type system for a subset of JVML. This subset, despite its smallsize, is interesting because it includes JVML subroutines, a source of substantialdifficulty in the design of a type system.

Our results so far support the hypothesis that a type system is a good way todescribe the bytecode verifier. Significant problems remain in scaling up to the fullJVML, such as handling objects and concurrency. However, we believe that theseproblems will be no harder than those posed by subroutines, and that a completetype system for JVML could be both tractable and useful.

Appendix

A Proof of soundness for the structured semantics

First, we prove that a step of execution preserves WFCallStack. Next, we prove somelemmas about types. Finally, we prove the soundness theorem for the structuredsemantics (Theorem 2), first by showing that well-typing of the stack is preservedand then by showing that well-typing of local variables is preserved.

A.1 Preservation of WFCallStack

The following lemma states certain insensitivities of WFCallStack to the exact valueof pc:

28

Lemma 1 For all P , F , and S such that F, S s P :

∀pc, pc′, ρ.∧ WFCallStack(P, F , pc, ρ)∧ RP,pc′ = RP,pc∧ Dom(Fpc′) = Dom(Fpc)⇒WFCallStack(P, F , pc′, ρ)

Proof Assume that P , F , S, pc, pc′, and ρ satisfy the hypotheses of the lemma.We do a case split on ρ:

1. Case: ρ = ε. In this case, the assumption WFCallStack(P, F , pc, ρ) mustbe true by rule (wf0), so Dom(Fpc) = Var. Given this and the assumptionthat Dom(Fpc′) = Dom(Fpc), it follows that Dom(Fpc′) = Var. Thus, byrule (wf0), we can conclude that WFCallStack(P, F , pc′, ρ).

2. Case: ρ 6= ε. In this case, we know that ρ = p · ρ′ for some p and ρ′. Also, theassumption WFCallStack(P, F , pc, ρ) must be true by rule (wf1), so, for someL, we know that

P [p− 1] = jsr LL ∈ RP,pc

Dom(Fpc) ⊆ Dom(F p)WFCallStack(P, F , p, ρ′)

Substituting RP,pc′ for RP,pc and Dom(Fpc′) for Dom(Fpc), we have all weneed to establish WFCallStack(P, F , pc′, ρ) by rule (wf1).

2

Now we show that the structured dynamic semantics preserves WFCallStack:

Restatement of part of Theorem 2 For all P , F , and S such that F , S s P :

∀pc, f, s, ρ, pc′, f ′, s′, ρ′.∧ P s 〈pc, f, s, ρ〉 → 〈pc′, f ′, s′, ρ′〉∧ WFCallStack(P, F , pc, ρ)⇒WFCallStack(P, F , pc′, ρ′)

Proof Assume that P , F , S, pc, f , s, ρ, pc′, f ′, s′, and ρ′ satisfy the hypothesesof the lemma. The assumption that a step of execution is possible starting from〈pc, f, s, ρ〉 implies that P [pc] is defined; it also implies that P [pc] 6= halt. We doa case split on the instruction possible at P [pc], proceeding in three cases:

29

1. The first case is instructions that do not affect the subroutine call stack: inc,if, load, store, pop, and push0. For these instructions, their structureddynamic semantics implies that ρ′ = ρ and their structured static semanticsimplies that RP,pc = RP,pc′ and Dom(Fpc) = Dom(Fpc′). Given these equal-ities, we can conclude WFCallStack(P, F , pc′, ρ′) by Lemma 1.

2. The second case is P [pc] = jsr K for some K. In this case, the structureddynamic semantics implies that ρ′ = (pc + 1) · ρ, so we want to show that:

WFCallStack(P, F ,K, (pc + 1) · ρ)

To prove this judgment using rule (wf1), it suffices to show:

(a) that P [(pc + 1)− 1] = jsr K, which we are assuming;(b) that K ∈ RP,K , which is true because RP,K = {K} by the structured

static semantics of jsr (specifically, line 3 of the jsr rule in Figure 7);(c) that Dom(FK) ⊆ Dom(F (pc+1)), which also follows from the structured

static semantics of jsr K (specifically, lines 2 and 3 of the jsr rule inFigure 8);

(d) that WFCallStack(P, F , pc + 1, ρ). We are assuming that


From the structured static semantics of jsr K we know that RP,(pc+1) =RP,pc (line 2, jsr rule, Figure 7) and Dom(F (pc+1)) = Dom(Fpc) (line 2,jsr rule, Figure 8). Under these conditions,

WFCallStack(P, F , pc + 1, ρ)

follows from Lemma 1.

3. The third case is P [pc] = ret x for some x. The assumption that a step ofexecution is possible starting from 〈pc, f, s, ρ〉 implies that ρ is not empty. Ifρ is not empty, then by the structured dynamic semantics of ret we know thatρ = pc′ · ρ′. Given the assumption that


by substitution we can conclude that

WFCallStack(P, F , pc, pc′ · ρ′)This can be true only by rule (wf1), so

WFCallStack(P, F , pc′, ρ′)

must be true.

2

30

A.2 Lemmas about types

A few lemmas about types are needed. The first lemma helps us reason about thetypes of locations at program points that dynamically follow a ret instruction:


∀pc, f, s, p, ρ′, x.∧ WFCallStack(P, F , pc, p · ρ′)∧ P [pc] = ret x⇒∧ ∀y ∈ Dom(Fpc). F p[y] = Fpc[y]∧ Sp = Spc

Proof Assume that P , F , S, pc, f , s, p, ρ′, and x satisfy the hypotheses of thelemma. The assumption WFCallStack(P, F , pc, p · ρ′) must hold by rule (wf1), sothere exists an L such that L ∈ RP,pc and P [p− 1] = jsr L. Given such an L, theconjuncts of the conclusion are instantiations of the quantified term found in thestructured static semantics of ret (Figure 8). 2

The following lemma states an insensitivity of F to the exact value of pc:Lemma 3

∀F, ρ, y, T, pc, pc′.∧ Dom(Fpc′) = Dom(Fpc)∧ y ∈ Dom(Fpc)⇒ Fpc′ [y] = Fpc[y]∧ F(F, pc, ρ)[y] = T⇒ F(F, pc′, ρ)[y] = T

Proof Assume that F , ρ, y, T , pc, and pc′ satisfy the hypotheses of the lemma.If y ∈ Dom(Fpc), then the assumption that F(F, pc, ρ)[y] = T must be true

by rule (tt0), so we can conclude that T = Fpc[y]. Given y ∈ Dom(Fpc), byassumption Fpc′ [y] = Fpc[y], thus Fpc′ [y] = T . Therefore, by rule (tt0)

F(F, pc′, ρ)[y] = T

If y 6∈ Dom(Fpc), then F(F, pc, ρ)[y] = T must be true by rule (tt1), so we canconclude, for some p and ρ′, that ρ = p · ρ′ and that

F(F, p, ρ′)[y] = T (1)

Given the assumptions that y 6∈ Dom(Fpc) and Dom(Fpc) = Dom(Fpc′), it followsthat y 6∈ Dom(Fpc′). Given this and (1),

F(F, pc′, ρ)[y] = T

follows from rule (tt1). 2

31

The next two lemmas say that F does not change as a result of executing a jsror ret instruction. The lemma for jsr is:


∀pc, f, ρ, L, y, T.∧ P [pc] = jsr L∧ F(F, pc, ρ)[y] = T⇒ F(F,L, (pc + 1) · ρ)[y] = T

Proof Assume that P , F , S, pc, f , ρ, L, y, and T satisfy the hypotheses of thelemma. We do a case split on y:

• Case: y ∈ Dom(FL), that is, y is one of the variables accessible in the sub-routine being called. By the structured static semantics of jsr (line 3 of jsrrule in Figure 8), Dom(FL) ⊆ Dom(Fpc), so y is an element of Dom(Fpc) aswell. Given that y ∈ Dom(Fpc), the assumption that

F(F, pc, ρ)[y] = T

could be true only by rule (tt0), so we know that T = Fpc[y]. From y ∈Dom(FL) and the structured static semantics of jsr (Figure 8, line 5) it followsthat FL[y] = Fpc[y], so T = FL[y]. From y ∈ Dom(FL) and T = FL[y], itfollows from rule (tt0) that

F(F,L, (pc + 1) · ρ)[y] = T

• Case: y 6∈ Dom(FL). We know that

Dom(Fpc) = Dom(Fpc+1)y ∈ Dom(Fpc)⇒ Fpc[y] = Fpc+1[y]

F(F, pc, ρ)[y] = T

The first two lines follow from the structured static semantics of jsr (Figure 8,lines 2 and 4); the last we are assuming. From Lemma 3 it follows that

F(F, pc + 1, ρ)[y] = T

Since y 6∈ Dom(FL), it follows from rule (tt1) that

F(F,L, (pc + 1) · ρ)[y] = T

2

32

The next lemma says for ret what the previous lemma says for jsr. Here,however, we need to assume that we have a well-formed subroutine call stack.


∀pc, f, p, ρ′, x, y, T.∧ WFCallStack(P, F , pc, p · ρ′)∧ P [pc] = ret x∧ F(F, pc, p · ρ′)[y] = T⇒ F(F, p, ρ′)[y] = T

Proof Assume that P , F , S, pc, f , p, ρ′, x, y, and T satisfy the hypotheses of thelemma. We do a case split on y:

• Case: y ∈ Dom(Fpc). Given the WFCallStack assumption plus the as-sumptions that P [pc] = ret x and y ∈ Dom(Fpc), we can conclude thatF p[y] = Fpc[y] by Lemma 2. Given that y ∈ Dom(Fpc), the assumption that

F(F, pc, p · ρ′)[y] = T

must hold by rule (tt0), so T = Fpc[y] and thus T = F p[y]. The WFCallStackassumption could be true only by rule (wf1), so Dom(Fpc) ⊆ Dom(F p) andy ∈ Dom(F p). Given T = F p[y] and y ∈ Dom(F p):

F(F, p, ρ′)[y] = T

follows from rule (tt0).

• Case: y 6∈ Dom(Fpc). Given y 6∈ Dom(Fpc), the assumption that

F(F, pc, p · ρ′)[y] = T

could be true only by rule (tt1), so we can conclude

F(F, p, ρ′)[y] = T

2

The final lemma says that F is defined when WFCallStack holds:

Lemma 6 For all P , F , and S such that F, S ` P :

∀y, ρ, pc.WFCallStack(P, F , pc, ρ)⇒ ∃T. F(F, pc, ρ)[y] = T

33

Proof This lemma holds because WFCallStack ensures that Dom(F p) = Var forsome p in ρ, so we are guaranteed that rule (tt0) will apply for at least this p.

More formally, assume an arbitrary y in Var and P , F , S such that F , S ` P .We proceed by induction on ρ:

• In the base case, ρ = ε. Assume pc such that WFCallStack(P, F , pc, ρ). Be-cause ρ = ε, this WFCallStack assumption could be true only by rule (wf0),so Dom(Fpc) = Var and thus y ∈ Dom(Fpc). Therefore, by rule (tt0),F(F, pc, ρ)[y] = Fpc[y].

• In the inductive step, ρ = p · ρ′ for some p and ρ′. Assume pc such thatWFCallStack(P, F , pc, ρ).

If y ∈ Dom(Fpc), then we can conclude F(F, pc, ρ)[y] = Fpc[y] by rule (tt0).

If y 6∈ Dom(Fpc), then we need to use the induction hypothesis:

∀q. WFCallStack(P, F , q, ρ′)⇒ ∃T. F(F, q, ρ′)[y] = T

Because ρ is not empty, the assumption WFCallStack(P, F , pc, ρ) could be trueonly by rule (wf1), so WFCallStack(P, F , p, ρ′). This matches the antecedent ofthe induction hypothesis, so we can conclude that F(F, p, ρ′)[y] = T for someT . Thus, by rule (tt1), we can conclude that F(F, pc, ρ)[y] = T .

2

A.3 Preservation of stack typing

We now turn our attention back to the soundness theorem. This subsection showsthat the typing of the operand stack is preserved; the next shows that the typing ofvariables is preserved.


∀pc, f, s, ρ, pc′, f ′, s′, ρ′.∧ P s 〈pc, f, s, ρ〉 → 〈pc′, f ′, s′, ρ′〉∧ WFCallStack(P, F , pc, ρ)∧ ∀y. ∃T. F(F, pc, ρ)[y] = T ∧ f [y] : T∧ s : Spc⇒ s′ : Spc′

Proof Assume that P , F , S, pc, f , s, ρ, pc′, f ′, s′, and ρ′ satisfy the hypothesesof the lemma. The assumption that a step of execution is possible starting from〈pc, f, s, ρ〉 implies that P [pc] is defined; it also implies that P [pc] 6= halt. We doa case split on the instruction possible at P [pc]:

34

1. The first case is ret. We know from the structured dynamic semantics of retthat s′ = s and ρ = pc′ · ρ′. Given the latter equality and the WFCallStackassumption, Spc′ = Spc follows from Lemma 2. Given Spc′ = Spc, s′ = s, andthe assumption that s : Spc, it follows that s′ : Spc′ .

2. The second case is inc, pop, push0, if, jsr, and store.

Consider inc, for example. We know from the structured dynamic semanticsof inc that s = n · s′′ for some integer n and value string s′′, and that s′ =(n+ 1) · s′′. We know from the structured static semantics of inc that

Spc = Spc′ = Int · α

for some type string α. Given s : Spc, s = n · s′′, and Spc = Int · α, it followsthat s′′ : α. Given s′′ : α, s′ = (n+ 1) · s′′, and Spc′ = Int · α, it follows thats′ : Spc′ .

As another example, consider store x. We know from the structured dynamicsemantics of store that s = v ·s′ for some v, and we know from the structuredstatic semantics that Spc = T · Spc′ for some T . Given these equations andthe assumption that s : Spc, it follows that s′ : Spc′.

The proofs for the other instructions in this category are along similar lines.

3. The last case is load x. In this case, we use the assumptions to infer that f [x] :T for some T such that F(F, pc, ρ)[x] = T . The structured static semantics forload says that x ∈ Dom(Fpc), so F(F, pc, ρ)[x] = T must be true by rule (tt0),so that T = Fpc[x]. We know from the structured static semantics of load xthat Spc′ = Fpc[x] · Spc, so Spc′ = T · Spc. We know from the structureddynamic semantics of load x that s′ = f [x] · s. Given f [x] : T , s : Spc,s′ = f [x] · s, and Spc′ = T · Spc, it follows that s′ : Spc′ .

2

A.4 Preservation of local variable typing


∀pc, f, s, ρ, pc′, f ′, s′, ρ′, y.∧ P s 〈pc, f, s, ρ〉 → 〈pc′, f ′, s′, ρ′〉∧ WFCallStack(P, F , pc, ρ)∧ s : Spc∧ ∃T. F(F, pc, ρ)[y] = T ∧ f [y] : T⇒ ∃T ′. F(F, pc′, ρ′)[y] = T ′ ∧ f ′[y] : T ′

35

This statement is stronger than that in Section 5.3 in that the quantifier for y hasbeen moved to the top level.

Proof Assume that P , F , S, pc, f , s, ρ, pc′, f ′, s′, ρ′, and y satisfy the hypothesesof the lemma. Let T be a type such that F(F, pc, ρ)[y] = T and f [y] : T ; by assump-tion, such a T exists. The assumption that a step of execution is possible startingfrom 〈pc, f, s, ρ〉 implies that P [pc] is defined; it also implies that P [pc] 6= halt. Wedo a case split on the instruction possible at P [pc]:

1. The first case is instructions that do not modify y or ρ: inc, if, load,pop, push0, and also store x for x 6= y. For these instructions, it followsfrom their structured static semantics that Dom(Fpc′) = Dom(Fpc) and thatFpc′ [y] = Fpc[y] when y ∈ Dom(Fpc). It follows from their structured dy-namic semantics that ρ′ = ρ. Thus, by Lemma 3, F(F, pc′, ρ′)[y] = T . Whatremains to be shown is f ′[y] : T . It follows from the structured dynamic se-mantics of these instructions that f [y] = f ′[y]; from f [y] : T and f [y] = f ′[y]it follows that f ′[y] : T .

2. The next case is store y. By the structured static semantics of store y,y ∈ Dom(Fpc′). Thus, by rule (tt0), F(F, pc′, ρ′)[y] = Fpc′ [y]. What remainsto be shown is f ′[y] : Fpc′ [y]. The structured static semantics for store ysays that Spc = Fpc′ [y] · Spc′ ; the structured dynamic semantics for store ysays that s = f [y] · Spc′ . Given these equations and the assumption s : Spc, itfollows that f ′[y] : Fpc′ [y].

3. The last case is jsr L and ret x. We can conclude that F(F, pc′, ρ′)[y] = Tby Lemma 4 when P [i] is jsr L and by Lemma 5 when P [i] is ret x. Whatremains to be shown is f ′[y] : T . From the structured dynamic semantics ofthese instructions we know that f ′ = f . Given f [y] : T and f [y] = f ′[y], itfollows that f ′[y] : T .

2

B Proof of soundness for the stackless semantics

This section proceeds roughly top-down, first stating a proposition and two lemmas,next using these to prove the soundness theorem for the stackless semantics, thenproving the two lemmas.

If F and S are a typing for P in the stackless semantics, then they are a typingfor P in the structured semantics:

Proposition 1 For all P , F , S, if F, S ` P , then F , S s P .

36

The proofs below implicitly use this proposition when they apply lemmas and theo-rems that have F, S s P in their hypotheses.

The first lemma establishes a simulation between the structured semantics andthe stackless semantics:



The second lemma states that Consistent is preserved by the structured dynamicsemantics:


∀pc, f, s, ρ, pc′, f ′, s′, ρ′.∧ P s 〈pc, f, s, ρ〉 → 〈pc′, f ′, s′, ρ′〉∧ Consistent(P, F , S, pc, f, s, ρ)⇒ Consistent(P, F , S′, pc′, f ′, s′, ρ′)

Given these two lemmas and the soundness theorem for the structured semantics,the soundness theorem of the stackless semantics follows directly.

Restatement of Theorem 3 For all P , F , and S such that F, S ` P :

∀pc, f, s, ρ, pc′, f ′, s′.∧ P ` 〈pc, f, s〉 → 〈pc′, f ′, s′〉∧ Consistent(P, F , S, pc, f, s, ρ)∧ ∀y. ∃T. F(F, pc, ρ)[y] = T ∧ f [y] : T∧ s : Spc⇒ ∃ρ′.∧ Consistent(P, F , S, pc′, f ′, s′, ρ′)∧ ∀y. ∃T ′. F(F, pc′, ρ′)[y] = T ′ ∧ f ′[y] : T ′

∧ s′ : Spc′

Proof Assume that P , F , S, pc, f , s, ρ, pc′, f ′, and s′ satisfy the hypotheses ofthe theorem. Use Lemma 7 to pick ρ′ such that

P s 〈pc, f, s, ρ〉 → 〈pc′, f ′, s′, ρ′〉

The first conjunct of the conclusion follows from our selection for ρ′ and Lemma 8.The last two conjuncts follow from our selection for ρ′ and Theorem 2. 2

37

B.1 Analogous steps

Restatement of Lemma 7 For all P , F , and S such that F, S ` P :


Proof Assume that P , F , S, pc, f , s, ρ, pc′, f ′, and s′ satisfy the hypothesesof the lemma. The assumption that a step of execution is possible starting from〈pc, f, s〉 implies that P [pc] is defined; it also implies that P [pc] 6= halt. We do acase split on the instruction possible at P [pc]:

• For all instructions except jsr and ret, pick ρ′ = ρ. A comparison of thestructured and stackless dynamics in Figure 10 and Figure 6 shows that theconclusion of our lemma follows directly from our assumptions.

• For jsr L, pick ρ′ = (pc + 1) · ρ. Again, a comparison of Figures 10 and 6shows that the conclusion of our lemma follows from our assumptions in thiscase as well.

• The case for the ret x instruction is the only complicated one.

First, we show (by contradiction) that our assumptions imply that ρ has atleast one element. If ρ had no elements, then the WFCallStack2 part of theConsistent assumption would have to be true by rule (wf20). This wouldrequire that CP,pc be empty, but the stackless static semantics for ret x (line 2,ret rule, Figure 13) together with the relationship between RP and CP implythat CP,pc must have at least one element. Thus, WFCallStack2 cannot holdby rule (wf20). Instead, it must hold by rule (wf21), so ρ must have at leastone element.

Therefore, ρ = p · ρ′′ for some p and ρ′′. We let ρ′ = ρ′′. To prove that thisselection of ρ′ satisfies the conclusion, we need to show that:

P s 〈pc, f, s, ρ〉 → 〈pc′, f ′, s′, ρ′′〉

In order to establish this judgment, we must show that pc′ = p. By thestackless dynamic semantics of ret x, pc′ = f [x], so this reduces to showingthat f [x] = p.

Because ρ has at least one element, the WFCallStack assumption must be trueby rule (wf1). From this we can infer that, for some L, P [p− 1] = jsr L andL ∈ RP,pc.

38

– Given P [p − 1] = jsr L and ρ = p · ρ′, we know by rule (l1) thatΛP (ρ) = L ·λ′ for some λ′. (As mentioned in Section 6.3, ΛP (ρ) is alwaysdefined for well-typed programs when WFCallStack holds.)

– Given that L ∈ RP,pc, we know by the stackless static semantics of ret xthat Fpc[x] = (ret-from L) (line 4, ret rule, Figure 13). We also knowfrom the stackless static semantics of ret x that x ∈ Dom(Fpc) (line 3).

Because ρ is not empty, the GoodFrameRets component of the Consistent as-sumption must be true by rule (gfr1). Given this, x ∈ Dom(Fpc), and

Fpc[x] = (ret-from L)

it must be the case that

GoodRet(L, f [x],ΛP (ρ), p · ρ′)

We know that ΛP (ρ) = L · λ′, so by substitution:

GoodRet(L, f [x], L · λ′, p · ρ′)

This could be true only by rule (gr1), so we can conclude that f [x] = p.

2

B.2 Preservation of Consistent

Expanding the definition of Consistent, we want to prove the following lemma:


∀pc, f, s, ρ, pc′, f ′, s′, ρ′.∧ P s 〈pc, f, s, ρ〉 → 〈pc′, f ′, s′, ρ′〉∧ WFCallStack(P, F , pc, ρ)∧ WFCallStack2(P, pc,ΛP (ρ), ρ)∧ GoodStackRets(Spc, s,ΛP (ρ), ρ)∧ GoodFrameRets(F, pc, {}, f,ΛP (ρ), ρ)⇒∧ WFCallStack(P, F , pc′, ρ′)∧ WFCallStack2(P, pc′,ΛP (ρ′), ρ′)∧ GoodStackRets(Spc′ , s′,ΛP (ρ′), ρ′)∧ GoodFrameRets(F, pc′, {}, f ′,ΛP (ρ), ρ′)

We prove this by assuming the hypotheses and proving separately each conjunctof the conclusion. We prove the first conjunct (preservation of WFCallStack) inSection A.1. We prove the remaining conjuncts below after we state and prove somemiscellaneous lemmas.

39

B.2.1 Miscellaneous lemmas

The following lemma describes how a program step can change ΛP (ρ):


∀pc, f, s, ρ, pc′, f ′, s′, ρ′.∧ P s 〈pc, f, s, ρ〉 → 〈pc′, f ′, s′, ρ′〉∧ WFCallStack(P, F , pc, ρ)⇒∧ ∀L. (P [pc] = jsr L ⇒ ΛP (ρ′) = L · ΛP (ρ))∧ ∀x. (P [pc] = ret x ⇒ ∃L. L · ΛP (ρ′) = ΛP (ρ))


• jsr L: From the structured dynamic semantics of jsr L, ρ′ = (pc + 1) · ρ. Byassumption, P [pc] = jsr L. Thus, by rule (l1), ΛP (ρ′) = L · ΛP (ρ).

• ret x: From the structured dynamic semantics of ret x, ρ = pc′ ·ρ′. Thus, theWFCallStack assumption must be true by rule (wf1), so P [pc′−1] = jsr L forsome L. Thus, by rule (l1), ΛP (ρ) = L · ΛP (ρ′).

2

Lemma 10 For all P , pc, λ, and ρ:

WFCallStack2(P, pc, λ, ρ)⇒ ρ and λ have the same length

This can be proven by induction on the derivation of WFCallStack2(P, pc, λ, ρ).

Lemma 11 For all K, p, λ, ρ:

K 6∈ λ ∧ λ and ρ have the same length⇒ GoodRet(K, p, λ, ρ)

This lemma can be proven by induction on λ using rule (gr0) in the base case andrule (gr2) in the inductive step.

The following lemma states an insensitivity of GoodRet:

40

Lemma 12

∀P, F , pc, p, ρ′, L, λ′, v,K.∧ WFCallStack2(P, pc, L · λ′, p · ρ′)∧ GoodRet(K, v, L · λ′, p · ρ′)⇒ GoodRet(K, v, λ′, ρ′)

Proof Assume P , F , pc, p, ρ′, L, λ′, v, and K satisfy the hypotheses of ourlemma. The WFCallStack2 assumption implies that ρ′ and λ′ have the same length(by Lemma 10). We proceed with a case split on ρ′:

• When ρ′ is empty, so is λ′, thus GoodRet(K, v, λ′, ρ′) follows from rule (gr0).

• When ρ′ is not empty, we proceed with a case split on K. When K 6= L, theGoodRet assumption must be true by rule (gr2), so

GoodRet(K, v, λ′, ρ′)

The next case is K = L. Because ρ′ is not empty, the WFCallStack2 assump-tion must be true by rule (wf21), so L 6∈ λ′ and thus K 6∈ λ′. Given this andthe fact that λ′ and ρ′ have the same length, we can conclude by Lemma 11that

GoodRet(K, v, λ′, ρ′)

2

The next lemma is used to prove the following lemma:

Lemma 13

∀ρ, P, F , pc, µ, f, f ′, x, v.∧ WFCallStack(P, F , pc, ρ)∧ µ ⊆ Dom(Fpc)∧ x ∈ µ∧ f ′ = f [x 7→ v]∧ GoodFrameRets(F, pc, µ, f,ΛP (ρ), ρ)⇒ GoodFrameRets(F, pc, µ, f ′,ΛP (ρ), ρ)

Proof We proceed by induction on ρ:

• In the base case, ρ = ε. Pick arbitrary P , F , pc, µ, f , f ′, x, and v. In thiscase, GoodFrameRets(F, pc′, µ, f ′,ΛP (ρ), ρ) follows from rule (gfr0).

41

• In the inductive step, ρ = p ·ρ′ for some p and ρ′. Pick arbitrary P , F , pc, µ, f ,f ′, x, and v. Assume all hypotheses of Lemma 13 hold. The GoodFrameRetsassumption could be true only by rule (gfr1), so

∀y, L.(∧ y ∈ (Dom(Fpc)\µ)∧ Fpc[y] = (ret-from L)

)⇒ GoodRet(L, f [y],ΛP (ρ), ρ) (2)

and alsoGoodFrameRets(F, p,Dom(Fpc), f,ΛP (ρ′), ρ′) (3)

First, we want to show that

∀y, L.(∧ y ∈ (Dom(Fpc)\µ)∧ Fpc[y] = (ret-from L)

)⇒ GoodRet(L, f ′[y],ΛP (ρ), ρ) (4)

Assume y and L such that y is in Dom(Fpc)\µ and Fpc[y] = (ret-from L).Because x ∈ µ, it must be that y 6= x, so by our assumption relating f to f ′,f [y] = f ′[y]. Given this equality, GoodRet(L, f ′[y],ΛP (ρ), ρ) follows from (2).

Next, we want to show that

GoodFrameRets(F, p,Dom(Fpc), f ′,ΛP (ρ′), ρ′) (5)

We prove this using the induction hypothesis, which we instantiate as follows:

∧ WFCallStack(P, F , p, ρ′)∧ Dom(Fpc) ⊆ Dom(F p)∧ x ∈ Dom(Fpc)∧ f ′ = f [x 7→ v]∧ GoodFrameRets(F, p,Dom(Fpc), f,ΛP (ρ′), ρ′)

⇒ GoodFrameRets(F, p,Dom(Fpc), f ′,ΛP (ρ′), ρ′)

The first two antecedents of this instantiation of the induction hypothesis fol-low from the fact that the WFCallStack assumption must be true by rule (wf1).The next antecedent follows from the assumptions that x ∈ µ and µ ⊆Dom(Fpc). The fourth antecedent we are assuming, and the last is (3). Thus,we can use the induction hypothesis to conclude (5).

Given (4) and (5),

GoodFrameRets(F, pc, µ, f ′,ΛP (ρ), ρ)

follows from rule (gfr1).

2

42

The final lemma in this set states an insensitivity of GoodFrameRets:Lemma 14

∀P, F , pc, f, ρ, pc′, f ′, x, v, T.∧ WFCallStack(P, F , pc, ρ)∧ x ∈ Dom(Fpc)∧ f ′ = f [x 7→ v]∧ Fpc′ = Fpc[x 7→ T ]∧ ∀K. T = (ret-from K)⇒ GoodRet(K, v,ΛP (ρ), ρ)∧ GoodFrameRets(F, pc, {}, f,ΛP (ρ), ρ)⇒ GoodFrameRets(F, pc′, {}, f ′,ΛP (ρ), ρ)

Proof Pick arbitrary P , F , pc, f , ρ, pc′, f ′, x, v, and T . We proceed by a casesplit on ρ:

• The first case is ρ = ε. In this case, GoodFrameRets(F, pc′, {}, f ′,ΛP (ρ), ρ)follows from rule (gfr0).

• The second case is ρ = p · ρ′ for some p and ρ′. Assume all hypothesesof Lemma 14 hold. The GoodFrameRets assumption could be true only byrule (gfr1), so

∀y, L.(∧ y ∈ (Dom(Fpc))∧ Fpc[y] = (ret-from L)


and alsoGoodFrameRets(F, p,Dom(Fpc), f,ΛP (ρ′), ρ′) (7)

First, we want to show that

∀y, L.(∧ y ∈ (Dom(Fpc′))∧ Fpc′ [y] = (ret-from L)

)⇒ GoodRet(L, f ′[y],ΛP (ρ), ρ) (8)

By assumption, Dom(Fpc′) = Dom(Fpc), so (8) is equivalent to

∀y, L.(∧ y ∈ (Dom(Fpc))∧ Fpc′ [y] = (ret-from L)

)⇒ GoodRet(L, f ′[y],ΛP (ρ), ρ)

Assume y and L such that y is in Dom(Fpc) and Fpc′ [y] = (ret-from L).For y 6= x, GoodRet(L, f ′[y],ΛP (ρ), ρ) follows from (6) and our assumptionsrelating f to f ′ and Fpc to Fpc′ . For y = x, GoodRet(L, f ′[y],ΛP (ρ), ρ) followsfrom the assumption

∀K. T = (ret-from K)⇒ GoodRet(K, v,ΛP (ρ), ρ)

43

Next, we want to show that

GoodFrameRets(F, p,Dom(Fpc′), f ′,ΛP (ρ′), ρ′) (9)

By assumption, Dom(Fpc) = Dom(Fpc′), so (9) is equivalent to

GoodFrameRets(F, p,Dom(Fpc), f ′,ΛP (ρ′), ρ′)

We prove this using Lemma 13. Our assumptions imply the following conclu-sions:

WFCallStack(P, F , p, ρ′)Dom(Fpc) ⊆ Dom(F p)x ∈ Dom(Fpc)f ′ = f [x 7→ v]GoodFrameRets(F, p,Dom(Fpc), f,ΛP (ρ′), ρ′)

The first two conclusions follow from the fact that ρ is not empty and thusthe WFCallStack assumption can only be true by rule (wf1). The next two weare assuming directly. The last conclusion is (7). These conclusions are theantecedents of Lemma 13, so (9) follows.

Given (8) and (9),

GoodFrameRets(F, pc′, {}, f ′,ΛP (ρ), ρ)

follows from rule (gfr1).

2

B.2.2 Preservation of WFCallStack2

Restatement of Part of Lemma 8 For all P , F , and S such that F, S ` P :

∀pc, f, s, ρ, pc′, f ′, s′, ρ′.∧ P s 〈pc, f, s, ρ〉 → 〈pc′, f ′, s′, ρ′〉∧ WFCallStack(P, F , pc, ρ)∧ WFCallStack2(P, pc,ΛP (ρ), ρ)⇒WFCallStack2(P, pc′,ΛP (ρ′), ρ′)

This restatement omits unneeded conjuncts in the hypotheses.


44

• For all instructions except jsr and ret, we have ρ′ = ρ (by the structured dy-namic semantics), so ΛP (ρ′) = ΛP (ρ) (by substitution). Also, by the stacklessstatic semantics of these instructions, CP,pc′ = CP,pc. Given these equations,

WFCallStack2(P, pc′,ΛP (ρ′), ρ′)

follows from the WFCallStack2 assumption, whether that assumption is trueby rule (wf20) or rule (wf21).

• In the case of jsr L, we have L = pc′ (by the structured dynamic semantics).We first prove that the WFCallStack2 assumption implies that

ΛP (ρ) is a subsequence of CP,pc (10)

If ρ 6= ε, then WFCallStack2 must hold by rule (wf21), so (10) must be truein this case. If ρ = ε, then ΛP (ρ) = ε (by rule (l0)) and CP,pc = ε (becauseWFCallStack2 must hold by rule (wf20)), so (10) must be true in this case aswell.

We use rule (wf21) to show that WFCallStack2(P, pc′,ΛP (ρ′), ρ′) is true. Ac-cording to this rule, it suffices to establish the following three intermediateresults:

1. L 6∈ ΛP (ρ). This follows from (10) and L 6∈ CP,pc, which follows from thestackless static semantics of jsr L (line 2, jsr rule, Figure 11).

2. ΛP (ρ′) is a subsequence of CP,pc′ . From Lemma 9 we know that ΛP (ρ′) =L · ΛP (ρ). Together with (10), this implies that

ΛP (ρ′) is a subsequence of L · CP,pc (11)

From the stackless static semantics of jsr L, we know that

L · CP,pc is a subsequence of CP,pc′ (12)

(lines 4 and 5, jsr rule, Figure 11). Given (11) and (12), we can concludethat

ΛP (ρ′) is a subsequence of CP,pc′

3. WFCallStack2(P, pc + 1,ΛP (ρ), ρ). We prove this in two cases, the firstwhen the WFCallStack2 assumption is true by rule (wf20), the secondwhen the WFCallStack2 assumption is true by rule (wf21). In both ofthese cases we have CP,pc+1 = CP,pc, which follows from the stacklessstatic semantics of jsr L (line 3, jsr rule, Figure 11).

45

If the WFCallStack2 assumption is true by rule (wf20), then ρ, ΛP (ρ),and CP,pc must be empty. Given CP,pc+1 = CP,pc, CP,pc+1 must beempty too. Thus,

WFCallStack2(P, pc + 1,ΛP (ρ), ρ)

follows from rule (wf20).If the WFCallStack2 assumption is true by rule (wf21), then there existp, ρ′′, K, and λ′′ such that

ρ = p · ρ′′ΛP (ρ) = K · λ′′

K 6∈ λ′′K · λ′′ is a subsequence of CP,pc

WFCallStack2(P, p, λ′′, ρ′′)

Because CP,pc+1 = CP,pc, K ·λ′′ is a subsequence of CP,pc+1 as well. SinceK 6∈ λ′′, K·λ′′ is a subsequence of CP,pc+1, and WFCallStack2(P, p, λ′′, ρ′′),

WFCallStack2(P, pc + 1,K · λ′′, p · ρ′′)

follows from rule (wf21). Since ρ = p · ρ′′ and ΛP (ρ) = K · λ′′, we obtain

WFCallStack2(P, pc + 1,ΛP (ρ), ρ)

• In the case of ret x, we know from Lemma 9 and from the structured dynamicsemantics that

ΛP (ρ) = L · ΛP (ρ′)ρ = pc′ · ρ′

for some L. Applying these equations to the WFCallStack2 assumption, wecan conclude that

WFCallStack2(P, pc, L · ΛP (ρ′), pc′ · ρ′)

This judgment can be true only by rule (wf21), so we can conclude

WFCallStack2(P, pc′,ΛP (ρ′), ρ′)

2

46

B.2.3 Preservation of GoodStackRets


∀pc, f, s, ρ, pc′, f ′, s′, ρ′.∧ P s 〈pc, f, s, ρ〉 → 〈pc′, f ′, s′, ρ′〉∧ WFCallStack(P, F , pc, ρ)∧ WFCallStack2(P, pc,ΛP (ρ), ρ)∧ GoodFrameRets(F, pc, {}, f,ΛP (ρ), ρ)

∧ ∀j, L.(∧ j ∈ Dom(Spc)∧ Spc[j] = (ret-from L)

)⇒ GoodRet(L, s[j],ΛP (ρ), ρ)

⇒ ∀j, L.(∧ j ∈ Dom(Spc′)∧ Spc′ [j] = (ret-from L)

)⇒ GoodRet(L, s′[j],ΛP (ρ′), ρ′)

Again, this restatement omits unneeded conjuncts in the hypotheses. Also, for con-venience, this restatement expands the definition of GoodStackRets.


• First we look at all instructions except jsr and ret. For these instructions,ρ = ρ′ by the structured dynamic semantics of the individual instructions; also,by substitution, ΛP (ρ) = ΛP (ρ′) as well.

Assume j and L such that j ∈ Dom(Spc′) and Spc′ [j] = (ret-from L); weneed to show that

GoodRet(L, s′[j],ΛP (ρ′), ρ′)

This follows from rule (gr0) when ρ′ = ε. Assume that ρ′ 6= ε and proceedwith a case split on j:

– The first case is stack slots that are unchanged, that is, stack slots thatare not written by the instruction but rather are carried over. The exactset is going to vary by instruction, but values of j in this set will satisfythe following:

j ∈ Dom(Spc) ∧ Spc′ [j] = Spc[j] ∧ s′[j] = s[j]

For these stack elements, if Spc′ [j] equals (ret-from L), then Spc[j] alsoequals (ret-from L). This implies GoodRet(L, s[j],ΛP (ρ), ρ), which bysubstitution means GoodRet(L, s′[j],ΛP (ρ′), ρ′).

47

– The second case is stack slots that are written by the instruction. It turnsout that this case applies only to load for the following reasons:∗ The instructions pop, if, and store do not write the stack, so this

second case does not apply to these instructions.∗ For inc and push0, if j is the element written (that is, the top el-

ement of s′), we know from the stackless static semantics that thetype Spc′ [j] is Int, which violates our assumption that Spc′ [j] equals(ret-from L), so this second case does not apply to these instruc-tions either.

For load x, we know from the stackless static semantics of load x thatx ∈ Dom(Fpc) and Fpc[x] = Spc′ [j]. Given the assumption that ρ′ 6=ε, the GoodFrameRets assumption must be true by rule (gfr1). Theantecedents of this rule and the assumptions that x ∈ Dom(Fpc) andSpc′ [j] = Fpc[x] = (ret-from L) imply that

GoodRet(L, f [x],ΛP (ρ), ρ)

By the structured dynamic semantics of load x, s′[j] = f [x] and ρ′ = ρ,so by substitution

GoodRet(L, s′[j],ΛP (ρ′), ρ′)

• For jsr and ret, assume j and L such that j ∈ Dom(Spc′) and Spc′ [j] =(ret-from L); we need to show that

GoodRet(L, s′[j],ΛP (ρ′), ρ′) (13)

But first we prove a helpful fact. For both jsr and ret, when j ∈ Dom(Spc) wecan assume that Spc[j] = Spc′ [j]. For jsr, this follows from the stackless staticsemantics (line 6, jsr rule, Figure 13). For ret, this follows from Lemma 2;to apply Lemma 2, we need that ρ = pc′ · ρ′, which follows from the structureddynamic semantics of ret.

Given Spc[j] = Spc′ [j] for j ∈ Dom(Spc), Spc′ [j] = (ret-from L), and theGoodStackRets assumption, we can conclude:

j ∈ Dom(Spc)⇒ GoodRet(L, s[j],ΛP (ρ), ρ)

Given this and the fact that s[j] = s′[j] for j ∈ Dom(Spc) (which follows fromthe structured dynamic semantics of jsr and ret), we can conclude:

j ∈ Dom(Spc)⇒ GoodRet(L, s′[j],ΛP (ρ), ρ) (14)

Now we return to proving (13) for the cases of interest:

48

– In the case of jsr K, we do a case split on j:

1. Case: j 6∈ Dom(Spc), that is, j is the index of the top of the stack.In this case we know from the stackless static semantics of jsrK thatK = L (line 6, jsr rule, Figure 13), we know from the structureddynamic semantics that s′[j] = pc + 1 and ρ′ = (pc + 1) · ρ, and weknow by Lemma 9 that ΛP (ρ′) = L · ΛP (ρ). Given these equations,(13) follows by rule (gr1).

2. Case: j ∈ Dom(Spc). In this case, we know from the stackless staticsemantics of jsr K that K 6= L (line 7, jsr rule, Figure 13). Weknow from the structured dynamic semantics that ρ′ = (pc + 1) · ρ,and we know from Lemma 9 that ΛP (ρ′) = K · ΛP (ρ). Given theseequations and (14), (13) follows from rule (gr2).

– In the case of ret, we know from Lemma 9 and the structured dynamicsemantics that:

ΛP (ρ) = K · ΛP (ρ′)ρ = pc′ · ρ′

for some K. Given these equations, we prove (13) by cases on L:

1. Case: K 6= L. By the stackless static semantics, Dom(Spc′) is equalto Dom(Spc), so j ∈ Dom(Spc). Given this, the conclusion of (14)must be true. Given ΛP (ρ) = L · ΛP (ρ′) and K 6= L, the conclusionof (14) can be true only by rule (gr2), so we can conclude

GoodRet(K, s′[j],ΛP (ρ′), ρ′)

2. Case: K = L. From the WFCallStack2 assumption and Lemma 10we know that ΛP (ρ) and ρ have the same length; given this, ΛP (ρ′)and ρ′ must also have the same length. The assumption that astep of execution is possible starting from 〈pc, f, s, ρ〉 implies thatρ is not empty, thus the WFCallStack2 assumption must be true byrule (wf21), so L 6∈ ΛP (ρ′). Thus, (13) follows from Lemma 11.

2

49

B.2.4 Preservation of GoodFrameRets


∀pc, f, s, ρ, pc′, f ′, s′, ρ′.∧ P s 〈pc, f, s, ρ〉 → 〈pc′, f ′, s′, ρ′〉∧ WFCallStack(P, F , pc, ρ)∧ WFCallStack2(P, pc,ΛP (ρ), ρ)∧ GoodStackRets(Spc, s,ΛP (ρ), ρ)∧ GoodFrameRets(F, pc, {}, f,ΛP (ρ), ρ)⇒ GoodFrameRets(F, pc′, {}, f ′,ΛP (ρ′), ρ′)

This time, we need all conjuncts in the hypotheses.

Proof Assume that P , F , S, pc, f , s, ρ, pc′, f ′, s′, and ρ′ satisfy the hypothesesof the lemma. The assumption that a step of execution is possible starting from〈pc, f, s, ρ〉 implies that P [pc] is defined; it also implies that P [pc] 6= halt.

When ρ′ = ε, we have ΛP (ρ′) = ε by rule (l0). Thus, we can conclude

GoodFrameRets(F, pc′, {}, f ′,ΛP (ρ′), ρ′)

by rule (gfr0).When ρ′ is not empty, we do a case split on the instruction possible at P [pc]:

• The first case is inc, pop, push0, if, and load x for any x. It follows from thestackless static semantics of these instructions that Dom(Fpc) = Dom(Fpc′)and, for all y in Dom(Fpc), Fpc[y] = Fpc′ [y]. It follows from the structureddynamic semantics of these instructions that f = f ′ and ρ = ρ′.

Because ρ is not empty, we can assume that ρ′ = p · ρ′′ for some p and ρ′′. TheGoodFrameRets assumption must be true by rule (gfr1), so we can concludethat

∧ ∀y, L.(∧ y ∈ Dom(Fpc)∧ Fpc[y] = (ret-from L)

)⇒ GoodRet(L, f [y],ΛP (ρ), ρ)

∧ GoodFrameRets(F, p,Dom(Fpc), f,ΛP (ρ′′), ρ′′)

Using the equalities f = f ′, ρ = ρ′, Dom(Fpc) = Dom(Fpc′) and, for y inDom(Fpc), Fpc[y] = Fpc′ [y], we can conclude:

∧ ∀y, L.(∧ y ∈ Dom(Fpc′)∧ Fpc′ [y] = (ret-from L)

)⇒ GoodRet(L, f ′[y],ΛP (ρ′), ρ′)

∧ GoodFrameRets(F, p,Dom(Fpc′), f ′,ΛP (ρ′′), ρ′′)

50

From this we can conclude


by rule (gfr1).

• The next case is store x. Let v and T satisfy s = v · s′ and Spc = T · Spc′ .We know these exist by the structured dynamic semantics and stackless staticsemantics of store x, respectively. We want to apply Lemma 14 to conclude:


To do so, we must discharge the antecedents of Lemma 14. The first an-tecedent, WFCallStack, we are assuming. The second and third antecedentsfollow from the stackless static semantics and the structured dynamic semanticsof store x, respectively. The fourth antecedent follows from the stackless staticsemantics. The fifth and sixth antecedents follow from the GoodStackRetsand GoodFrameRets assumptions, respectively. Thus, we can indeed applyLemma 14 to conclude


The structured dynamic semantics of store x implies that ρ = ρ′. Substitut-ing, we get


• The next case is jsr K. From Lemma 9 and the structured dynamic semanticsof jsr K we know that

ΛP (ρ′) = K · ΛP (ρ)ρ′ = (pc + 1) · ρ

f ′ = fpc′ = K

From the stackless static semantics we know that

Dom(FK) ⊆ Dom(Fpc)Dom(Fpc+1) = Dom(Fpc)

We want to prove:


Substituting, this is equivalent to

GoodFrameRets(F,K, {}, f,K · ΛP (ρ), (pc + 1) · ρ)

51

To establish this via rule (gfr1), it suffices to show that

∀y, L.(∧ y ∈ Dom(FK)∧ FK [y] = (ret-from L)

)⇒ GoodRet(L, f [y],K · ΛP (ρ), (pc + 1) · ρ)

(15)

and also that

GoodFrameRets(F, pc + 1,Dom(FK), f,ΛP (ρ), ρ) (16)

To prove (15), we assume y and L such that y is in Dom(FK) and FK [y] =(ret-from L) and show that

GoodRet(L, f [y],K · ΛP (ρ), (pc + 1) · ρ) (17)

To show this, first we show that

GoodRet(L, f [y],ΛP (ρ), ρ) (18)

If ρ is empty, then (18) follows from rule (gr0). If ρ is not empty, then theGoodFrameRets assumption must hold by rule (gfr1). Given the first line ofrule (gfr1) and the fact that y ∈ Dom(FK) and FK [y] = (ret-from L), (18)follows. We know K 6= L from the stackless static semantics of jsr K (line 8,Figure 13). Given this and (18), we can conclude (17) by rule (gr2).

Next, we need to prove (16). If ρ is empty, then (16) follows from rule (gfr0).Otherwise, we assume that ρ = p · ρ′′ for some p and ρ′′. Given ρ = p · ρ′′, toestablish (16) via rule (gfr1), it suffices to show that

∀y, L.(∧ y ∈ (Dom(Fpc+1)\Dom(FK))∧ Fpc+1[y] = (ret-from L)

)⇒ GoodRet(L, f [y],ΛP (ρ), ρ)

(19)and also that

GoodFrameRets(F, p,Dom(Fpc+1), f,ΛP (ρ′′), ρ′′) (20)

Because ρ 6= ε, the GoodFrameRets assumption must be true by rule (gfr1),so we can conclude that

∀y, L.(∧ y ∈ Dom(Fpc)∧ Fpc[y] = (ret-from L)


and also that

GoodFrameRets(F, p,Dom(Fpc), f,ΛP (ρ′′), ρ′′) (22)

52

To prove (20), we simply observe that, because Dom(Fpc+1) = Dom(Fpc),(22) is equivalent to (20).

To prove (19), we assume y and L such that y is in Dom(Fpc+1)\Dom(FK)and Fpc+1[y] equals (ret-from L) and show that

GoodRet(L, f [y],ΛP (ρ), ρ) (23)

From the stackless static semantics of jsr K (line 4, jsr rule, Figure 13),because y is in Dom(Fpc+1)\Dom(FK), we can conclude that Fpc[y] equalsFpc+1[y] which in turn equals (ret-from L). Also, given Dom(Fpc+1) =Dom(Fpc), we can conclude that y ∈ Dom(Fpc). Given (21), y ∈ Dom(Fpc),and Fpc[y] = (ret-from L), we can conclude (23).

• The last case is ret x. From Lemma 9 and the structured dynamic semanticsof ret x we know that, for some K:

ΛP (ρ) = K · ΛP (ρ′)ρ = pc′ · ρ′f ′ = f

We want to prove:


If ρ′ is empty, then this follows from rule (gfr0).

Otherwise, we have that ρ′ = p · ρ′′ for some p and ρ′′. We are assuming that

GoodFrameRets(F, pc, {}, f,ΛP (ρ), ρ)

Applying the equations above, this is equivalent to

GoodFrameRets(F, pc, {}, f ′,K · ΛP (ρ′), pc′ · ρ′)

This must be true by rule (gfr1), so we can conclude that

∀y, L.(∧ y ∈ Dom(Fpc)∧ Fpc[y] = (ret-from L)

)⇒ GoodRet(L, f ′[y],K · ΛP (ρ′), pc′ · ρ′)

(24)

and also that

GoodFrameRets(F, pc′,Dom(Fpc), f ′,ΛP (ρ′), ρ′) (25)

53

Because ρ′ = p · ρ′′, (25) must be true by rule (gfr1), so

∀y, L.(∧ y ∈ (Dom(Fpc′)\Dom(Fpc))∧ Fpc′ [y] = (ret-from L)

)⇒ GoodRet(L, f ′[y],ΛP (ρ′), ρ′)

(26)and also

GoodFrameRets(F, p,Dom(Fpc′), f ′,ΛP (ρ′′), ρ′′) (27)

Because ρ 6= ε, the WFCallStack assumption must be true by rule (wf1), soP [pc′−1] = jsrM for M such that RP,pc = {M}. Given this, by the stacklessstatic semantics of ret x (lines 2 and 5, ret rule, Figure 13), we know that,for y in Dom(Fpc), Fpc′ [y] = Fpc[y]. Combining this with (24), we get

∀y, L.(∧ y ∈ Dom(Fpc)∧ Fpc′ [y] = (ret-from L)

)⇒ GoodRet(L, f ′[y],K · ΛP (ρ′), pc′ · ρ′)

Using Lemma 12 and the WFCallStack2 assumption, we can conclude

∀y, L.(∧ y ∈ Dom(Fpc)∧ Fpc′ [y] = (ret-from L)

)⇒ GoodRet(L, f ′[y],ΛP (ρ′), ρ′) (28)

Combining (26) and (28), we can conclude

∀y, L.(∧ y ∈ Dom(Fpc′)∧ Fpc′ [y] = (ret-from L)

)⇒ GoodRet(L, f ′[y],ΛP (ρ′), ρ′) (29)

Given (29) and (27), we can use rule (gfr1) to conclude


2

C Proof of main soundness theorem

To prove Theorem 4, our main soundness theorem, we use two lemmas. This sectionfirst states these lemmas, proves the main soundness theorem, then proves the twolemmas.

The first lemma says that well-typed programs get “stuck” only at halt instruc-tions:

54


∀pc, f, s, ρ.∧ Consistent(P, F , S, pc, f, s, ρ)∧ ∀y. ∃T. F(F, pc, ρ)[y] = T ∧ f [y] : T∧ s : Spc∧ pc ∈ Dom(P )∧ P [pc] 6= halt⇒ ∃pc′, f ′, s′. P ` 〈pc, f, s〉 → 〈pc′, f ′, s′〉

The next lemma takes invariants stated about individual steps of execution andextends them over multi-step execution sequences:


∀pc, f0, f, s.P ` 〈1, f0, ε〉 →∗ 〈pc, f, s〉⇒ ∃ρ.∧ Consistent(P, F , S, pc, f, s, ρ)∧ ∀y. ∃T. F(F, pc, ρ)[y] = T ∧ f [y] : T∧ s : Spc∧ pc ∈ Dom(P )

We use these lemmas to prove the main soundness theorem:

Restatement of main soundness theorem For all P , F , and S such thatF, S ` P :

∀pc, f0, f, s.(P ` 〈1, f0, ε〉 →∗ 〈pc, f, s〉∧ 6∃pc′, f ′, s′. P ` 〈pc, f, s〉 → 〈pc′, f ′, s′〉

)⇒ P [pc] = halt ∧ s : Spc

Proof Assume that P , F , S, pc, f , and s satisfy the hypotheses of this theorem.Under this assumption, the second conjunct of the conclusion (s : Spc) followsdirectly from Lemma 16.

We prove the first conjunct of the conclusion by contradiction. Assume thatP [pc] 6= halt. Our previous assumptions about P , F , S, pc, f , and s imply thehypotheses of Lemma 16, so we can conclude that there exists ρ such that P , F ,S, pc, f , s, and ρ satisfy all hypotheses of Lemma 15. However, the conclusion ofLemma 15 contradicts our assumption that

6 ∃pc′, f ′, s′. P ` 〈pc, f, s〉 → 〈pc′, f ′, s′〉

Thus, we are forced to conclude that P [pc] = halt. 2

55

C.1 Making progress

Lemma 15 says that a well-typed program does not get stuck unless it hits a haltinstruction:


∀pc, f, s, ρ.∧ Consistent(P, F , S, pc, f, s, ρ)∧ ∀y. ∃T. F(F, pc, ρ)[y] = T ∧ f [y] : T∧ s : Spc∧ pc ∈ Dom(P )∧ P [pc] 6= halt⇒ ∃pc′, f ′, s′. P ` 〈pc, f, s〉 → 〈pc′, f ′, s′〉

Proof Assume that P , F , S, pc, f , s, and ρ satisfy the hypotheses of the lemma.We proceed with a case split on the instruction at P [pc], constructing pc′, f ′, and s′

satisfying the stackless dynamic semantics of the instruction at P [pc].

• For load x, push0, and jsr L, progress can always be made. For load x andpush0, s′ = s, f ′ = f , and pc′ = pc + 1. For jsr L, s′ = (L + 1) · s, f ′ = f ,and pc′ = pc + 1.

• For inc, if L, pop, and store x, progress can be made when the stack has atleast one value of the appropriate type in it. The assumption that s : Spc plusthe static constraints on Spc for each instruction ensure that the stack doesindeed have an appropriately typed value on top. For all instructions, assumes = v · s′′ for some v and s′′. For inc, if L, and pop, we take f ′ = f ; forstore x, we take f ′ = f [x 7→ v]. For if L, pop, and store x, we take s′ = s′′.For inc, v must be some integer n, and we take s′ = (n+ 1) · s′′. For inc, pop,and store x, pc′ = pc + 1. For if L, v must be some integer n, and we takepc′ = pc + 1 if n is zero and we take pc′ = L for other values of n.

• For ret x, progress can be made in states where f [x] is an address. Theassumption that

∃T. F(F, pc, ρ)[x] = T ∧ f [x] : T

and the stackless static semantics of ret x imply that f [x] does contain anaddress. Thus, progress is possible, and we take pc′ = f [x], f ′ = f , and s′ = s.

2

56

C.2 Chained soundness theorem

Before proving Lemma 16, we state and prove one more invariant about individualsteps of execution:


∀pc, f, s, ρ, pc′, f ′, s′.∧ P ` 〈pc, f, s〉 → 〈pc′, f ′, s′〉∧ Consistent(P, F , S, pc, f, s, ρ)⇒ pc′ ∈ Dom(P )

Proof Assume that P , F , S, pc, f , s, and ρ satisfy the hypotheses of the lemma.The assumption that a step of execution is possible starting from 〈pc, f, s〉 impliesthat P [pc] is defined; it also implies that P [pc] 6= halt. We do a case split on theinstruction possible at P [pc]:

• For pop, push0, inc, load, and store, pc′ = pc + 1, and pc + 1 is constrainedby the stackless static semantics of these instructions to be in Dom(P ).

• For if L, pc′ equals either pc + 1 or L, depending on which way the branchgoes. Both pc + 1 and L are constrained by the stackless static semantics ofif L to be in Dom(P ).

• For jsr L, pc′ = L, and L is constrained by the stackless static semantics ofjsr L to be in Dom(P ).

• For ret x, we first show that our assumptions imply that ρ has at least oneelement. If ρ had no elements, then the WFCallStack2 part of the Consistentassumption would have to be true by rule (wf20). This would require that CP,pcbe empty, but the stackless static semantics for ret x implies that CP,pc musthave at least one element. Thus, WFCallStack2 cannot hold by rule (wf20).Instead, it must hold by rule (wf21), so ρ must have at least one element.

Let ρ = p · ρ′ for some p and ρ′. Using Lemma 7, we have that pc′ = p. TheWFCallStack part of the Consistent assumption must be true by rule (wf1), soP [p− 1] = jsr L for some L. From the stackless static semantics of jsr L wehave that (i+1) ∈ Dom(P ) for all i such that P [i] = jsr L. Thus, substitutingp− 1 for i, we can conclude that p, and thus pc′, is in Dom(P ).

2

Lemma 16 applies Lemma 17 and Theorem 3 to multi-step executions startingfrom the initial state:

57


∀pc, f0, f, s.P ` 〈1, f0, ε〉 →∗ 〈pc, f, s〉⇒ ∃ρ.∧ Consistent(P, F , S, pc, f, s, ρ)∧ ∀y. ∃T. F(F, pc, ρ)[y] = T ∧ f [y] : T∧ s : Spc∧ pc ∈ Dom(P )

Proof Assume P , F , and S such that F, S ` P and proceed by induction on thenumber of execution steps in P ` 〈1, f0, ε〉 →∗ 〈pc, f, s〉:

• In the base case, 〈pc, f, s〉 equals 〈1, f0, ε〉, that is, no steps of execution aretaken. In Section 3, it is assumed that P [1] is defined for all programs, so wecan conclude pc ∈ Dom(P ). We pick ρ = ε and observe that:

∧ Consistent(P, F , S, 1, f0, ε, ε)∧ ∀y. ∃T. F(F, 1, ε)[y] = T ∧ f0[y] : T∧ ε : S1

Given the values of F 1, S1, and CP,1 it is not hard to check this.

• In the inductive step, let 〈pcn, fn, sn〉 be a state such that

P ` 〈1, f0, ε〉 →∗ 〈pcn, fn, sn〉 → 〈pc, f, s〉

By induction, we know that there exists a ρn such that:

∧ Consistent(P, F , S, pcn, fn, sn, ρn)∧ ∀y. ∃T. F(F, pcn, ρn)[y] = T ∧ fn[y] : T∧ sn : Spcn∧ pcn ∈ Dom(P )

Our assumptions and these conjuncts satisfy the hypotheses of Theorem 3 andLemma 17, so we can conclude that

∧ Consistent(P, F , S, pc, f, s, ρ)∧ ∀y. ∃T. F(F, pc, ρ)[y] = T ∧ f [y] : T∧ s : Spc∧ pc ∈ Dom(P )

2

58

Acknowledgements

We thank Luca Cardelli, Drew Dean, Sophia Drossopoulou, Stephen Freund, Cyn-thia Hibbard, Mark Lillibridge, Greg Morrisett, George Necula, Frank Yellin, andanonymous referees for useful information and suggestions.

59

References

[Coh97] Richard M. Cohen. Defensive Java Virtual Machine version 0.5 alpharelease. Web pages at http://www.cli.com/, May 1997.

[DE97] Sophia Drossopoulou and Susan Eisenbach. Java is type safe—probably.In Proceedings of ECOOP ’97, pages 389–418, June 1997.

[FM98] Stephen N. Freund and John C. Mitchell. A type system for objectinitialization in the Java bytecode language. Technical Report STAN-CS-TN-98-62, Department of Computer Science, Stanford University, April1998. To appear in Proceedings of OOPSLA ’98.

[Gol97] Allen Goldberg. A specification of Java loading and bytecode ver-ification. To appear in Proceedings of the Fifth ACM Conferenceon Computer and Communications Security, 1998; Web page athttp://www.kestrel.edu/~goldberg/Bytecode.html, 1997.

[HT98] Masami Hagiya and Akihiko Tozawa. On a new method for dataflowanalysis of Java Virtual Machine subroutines. On the Web at http://nicosia.is.s.u-tokyo.ac.jp/members/hagiya.html; a preliminaryversion appeared in SIG-Notes, PRO-17-3, Information Processing So-ciety of Japan, pages 13–18, 1998.

[LY96] Tim Lindholm and Frank Yellin. The Java Virtual Machine Specification.Addison-Wesley, 1996.

[Mor95] Greg Morrisett. Compiling with Types. PhD thesis, Carnegie MellonUniversity, 1995.

[MTC+96] G. Morrisett, D. Tarditi, P. Cheng, C. Stone, R. Harper, and P. Lee. TheTIL/ML compiler: Performance and safety through types. In Workshopon Compiler Support for Systems Software, 1996.

[Nv98] Tobias Nipkow and David von Oheimb. Javalight is type-safe—definitely.In Proceedings of the 25th ACM Symposium on Principles of Program-ming Languages, pages 161–170, January 1998.

[Qia98] Zhenyu Qian. A formal specification of Java Virtual Machine instructionsfor objects, methods and subroutines. In Jim Alves-Foss, editor, FormalSyntax and Semantics of JavaTM. Springer-Verlag, 1998. To appear.

[Sar97] Vijay Saraswat. The Java bytecode verification problem. Web page athttp://www.research.att.com/~vj/main.html, 1997.

61

[SMB97] Emin Gun Sirer, Sean McDirmid, and Brian Bershad. Kimera: AJava system security architecture. Web pages at http://kimera.cs.washington.edu/, 1997.

[Sym97] Don Syme. Proving Java type soundness. Technical Report 427, Univer-sity of Cambridge Computer Laboratory, June 1997.

[TIC97] ACM SIGPLAN Workshop on Types in Compilation (TIC97). 1997.

[Yel97] Frank Yellin. Private communication. March 1997.

62

Research 158 SRC Report - HP Labs(including bytecode verifier) Figure 1: The Java VM and security...

Documents

Transcript of Research 158 SRC Report - HP Labs(including bytecode verifier) Figure 1: The Java VM and security...