The Java Virtual Machine 1 Course Overview PART I: overview material 1Introduction 2Language...

33
1 The Java Virtual Machine Course Overview PART I: overview material 1 Introduction 2 Language processors (tombstone diagrams, bootstrapping) 3 Architecture of a compiler PART II: inside a compiler 4 Syntax analysis 5 Contextual analysis 6 Runtime organization 7 Code generation PART III: conclusion 8 Interpretation 9 Review Supplementary material: Java’s runtime organization and the Java Virtual Machine

Transcript of The Java Virtual Machine 1 Course Overview PART I: overview material 1Introduction 2Language...

Page 1: The Java Virtual Machine 1 Course Overview PART I: overview material 1Introduction 2Language processors (tombstone diagrams, bootstrapping) 3Architecture.

1The Java Virtual Machine

Course Overview

PART I: overview material1 Introduction

2 Language processors (tombstone diagrams, bootstrapping)

3 Architecture of a compiler

PART II: inside a compiler4 Syntax analysis

5 Contextual analysis

6 Runtime organization

7 Code generation

PART III: conclusion8 Interpretation

9 Review

Supplementary material:Java’s runtime organizationand the Java Virtual Machine

Page 2: The Java Virtual Machine 1 Course Overview PART I: overview material 1Introduction 2Language processors (tombstone diagrams, bootstrapping) 3Architecture.

2The Java Virtual Machine

What This Topic is About

We look at the JVM as an example of a real-world runtime system for a modern object-oriented programming language.

JVM is probably the most common and widely used VM in the world, so you’ll get a better idea what a real VM looks like.

• JVM is an abstract machine.• What is the JVM architecture?• What is the structure of .class files?• How are JVM instructions executed?• What is the role of the constant pool in dynamic linking?

Also visit this site for more complete information about the JVM:

http://java.sun.com/docs/books/vmspec/2nd-edition/html/VMSpecTOC.doc.html

Page 3: The Java Virtual Machine 1 Course Overview PART I: overview material 1Introduction 2Language processors (tombstone diagrams, bootstrapping) 3Architecture.

3The Java Virtual Machine

Recap: Interpretive Compilers

Why?A tradeoff between fast(er) compilation and a reasonable runtime performance.

How?Use an “intermediate language”• more high-level than machine code => easier to compile to• more low-level than source language => easy to implement as an

interpreter

Example: A “Java Development Kit” for machine M

Java–>JVM

M

JVMM

Page 4: The Java Virtual Machine 1 Course Overview PART I: overview material 1Introduction 2Language processors (tombstone diagrams, bootstrapping) 3Architecture.

4The Java Virtual Machine

Abstract Machines

Abstract machine implements an intermediate language in between the high-level language (e.g. Java) and the low-level hardware (e.g. Pentium)

Java

Pentium

Java

Pentium

JVM (.class files)

High level

Low level

Java compiler

Java JVM interpreteror JVM JIT compiler

Implemented in Java: Machine independent

Page 5: The Java Virtual Machine 1 Course Overview PART I: overview material 1Introduction 2Language processors (tombstone diagrams, bootstrapping) 3Architecture.

5The Java Virtual Machine

Abstract Machines

An abstract machine is intended specifically as a runtime system for a particular (kind of) programming language.

• JVM is a virtual machine for Java programs.

• It directly supports object-oriented concepts such as classes, objects, methods, method invocation etc.

• Easy to compile Java to JVM=> 1. easy to implement compiler 2. fast compilation

• Another advantage: portability

Page 6: The Java Virtual Machine 1 Course Overview PART I: overview material 1Introduction 2Language processors (tombstone diagrams, bootstrapping) 3Architecture.

6The Java Virtual Machine

Class Files and Class File Format

The JVM is an abstract machine in the truest sense of the word.The JVM specification does not give implementation details (can be dependent on target OS/platform, performance requirements, etc.) The JVM specification defines a machine independent “class file format” that all JVM implementations must support.

.class files

JVM

load

External representation(platform independent)

Internal representation(implementation dependent)

objects

classes

methods

arraysstrings

primitive types

Page 7: The Java Virtual Machine 1 Course Overview PART I: overview material 1Introduction 2Language processors (tombstone diagrams, bootstrapping) 3Architecture.

7The Java Virtual Machine

Data Types

JVM (and Java) distinguishes between two kinds of types:

Primitive types:• boolean: boolean • numeric integral: byte, short, int, long, char• numeric floating point: float, double• internal, for exception handling: returnAddress

Reference types:• class types• array types• interface types

Note: Primitive types are represented directly, reference types are represented indirectly (as pointers to array or class instances).

Page 8: The Java Virtual Machine 1 Course Overview PART I: overview material 1Introduction 2Language processors (tombstone diagrams, bootstrapping) 3Architecture.

8The Java Virtual Machine

JVM: Runtime Data Areas

Besides OO concepts, JVM also supports multi-threading. Threads are directly supported by the JVM.

=> Two kinds of runtime data areas: 1. shared between all threads 2. private to a single thread

Shared Thread 1 Thread 2

pc

JavaStack

NativeMethodStack

pc

JavaStack

NativeMethodStack

Garbage CollectedHeap

Method area

Page 9: The Java Virtual Machine 1 Course Overview PART I: overview material 1Introduction 2Language processors (tombstone diagrams, bootstrapping) 3Architecture.

9The Java Virtual Machine

Java Stacks

JVM is a stack based machine, much like TAM.

JVM instructions • implicitly take arguments from the stack top• put their result on the top of the stack

The stack is used to• pass arguments to methods• return a result from a method• store intermediate results while evaluating expressions• store local variables

This works similarly to (but not exactly the same as) what we previously discussed about stack-based storage allocation and routines.

Page 10: The Java Virtual Machine 1 Course Overview PART I: overview material 1Introduction 2Language processors (tombstone diagrams, bootstrapping) 3Architecture.

10The Java Virtual Machine

Stack Frames

The Java stack consists of frames. The JVM specification does not say exactly how the stack and frames should be implemented.

The JVM specification specifies that a stack frame has areas for:

Pointer to runtime constant pool

local vars

operand stack

args+

A new call frame is created by executing some JVM instruction for invoking a method (e.g. invokevirtual, invokenonvirtual, ...)

The operand stack is initially empty, but grows and shrinks during execution.

Page 11: The Java Virtual Machine 1 Course Overview PART I: overview material 1Introduction 2Language processors (tombstone diagrams, bootstrapping) 3Architecture.

11The Java Virtual Machine

Stack Frames

local vars

operand stack

args+

Stack for storing intermediate results during the execution of the method.• Initially it is empty.• The maximum depth is known at compile time.

The role/purpose of each of the areas in a stack frame:

pointer toconstant pool

Used implicitly when executing JVMinstructions that contain entries into theconstant pool (more about this later).

Space where the arguments and local variablesof a method are stored. This includes a space for the receiver (this) at position/offset 0.

Page 12: The Java Virtual Machine 1 Course Overview PART I: overview material 1Introduction 2Language processors (tombstone diagrams, bootstrapping) 3Architecture.

12The Java Virtual Machine

Stack Frames

An implementation using registers such as SB, ST, and LB and a dynamic link is one possible implementation.

LB

ST

dynamic link

SB to previous frame on the stack

to runtime constant pool

local vars

operand stack

args+

JVM instructions store and load(for accessing args and locals) use addresses which are numbersfrom 0 to #args + #locals - 1

Page 13: The Java Virtual Machine 1 Course Overview PART I: overview material 1Introduction 2Language processors (tombstone diagrams, bootstrapping) 3Architecture.

13The Java Virtual Machine

JVM Interpreter

The core of a JVM interpreter is basically this:do { byte opcode = fetch an opcode; switch (opcode) { case opCode1 : fetch operands for opCode1; execute action for opCode1; break; case opCode2 : fetch operands for opCode2; execute action for opCode2; break; case ...} while (more to do)

Page 14: The Java Virtual Machine 1 Course Overview PART I: overview material 1Introduction 2Language processors (tombstone diagrams, bootstrapping) 3Architecture.

14The Java Virtual Machine

Instruction-set: typed instructions!

JVM instructions are explicitly typed: different opCodes for instructions for integers, floats, arrays, reference types, etc.

This is reflected by a naming convention in the first letter of the opCode mnemonics:

Example: different types of “load” instructions

iloadlloadfloaddloadaload

integer loadlong loadfloat loaddouble loadreference-type load

Page 15: The Java Virtual Machine 1 Course Overview PART I: overview material 1Introduction 2Language processors (tombstone diagrams, bootstrapping) 3Architecture.

15The Java Virtual Machine

Instruction set: kinds of operands

JVM instructions have three kinds of operands: - from the top of the operand stack - from the bytes following the opCode - part of the opCode itselfEach instruction may have different “forms” supporting different kinds of operands.Example: different forms of “iload”

iload_0iload_1iload_2iload_3

Assembly code Binary instruction code layout26272829

21 niload n

wide iload n 196 n21

Page 16: The Java Virtual Machine 1 Course Overview PART I: overview material 1Introduction 2Language processors (tombstone diagrams, bootstrapping) 3Architecture.

16The Java Virtual Machine

Instruction-set: accessing arguments and locals

locals: indexes #args .. #args + #locals - 1

args: indexes 0 .. #args - 1

arguments and locals area inside a stack frame

Instruction examples:

iload_1iload_3aload 5aload_0

istore_1astore_1fstore_3

0:1:2:3:

• A load instruction takes something from the args/locals area and pushes it onto the top of the operand stack.

• A store instruction pops something from the top of the operand stack and places it in the args/locals area.

Page 17: The Java Virtual Machine 1 Course Overview PART I: overview material 1Introduction 2Language processors (tombstone diagrams, bootstrapping) 3Architecture.

17The Java Virtual Machine

Instruction-set: non-local memory access

In the JVM, the contents of different “kinds” of memory can be accessed by different kinds of instructions.

accessing locals and arguments: load and store instructions

accessing fields in objects: getfield, putfield

accessing static fields: getstatic, putstatic

Note: Static fields are a lot like global variables. They are allocated in the “method area” where also code for methods and representations for classes (including method tables) are stored.

Note: getfield and putfield access memory in the heap.

Note: JVM doesn’t have anything similar to registers L1, L2, etc.

Page 18: The Java Virtual Machine 1 Course Overview PART I: overview material 1Introduction 2Language processors (tombstone diagrams, bootstrapping) 3Architecture.

18The Java Virtual Machine

Instruction-set: operations on numbers

add: iadd, ladd, fadd, daddsubtract: isub, lsub, fsub, dsubmultiply: imul, lmul, fmul, dmul

etc.

Arithmetic

Conversion

i2l, i2f, i2d,l2f, l2d, f2d,f2i, d2i, …

Page 19: The Java Virtual Machine 1 Course Overview PART I: overview material 1Introduction 2Language processors (tombstone diagrams, bootstrapping) 3Architecture.

19The Java Virtual Machine

Instruction-set …

Operand stack manipulationpop, pop2, dup, dup2, swap, …

Control transferUnconditional: goto, jsr, ret, …Conditional: ifeq, iflt, ifgt, if_icmpeq, …

Page 20: The Java Virtual Machine 1 Course Overview PART I: overview material 1Introduction 2Language processors (tombstone diagrams, bootstrapping) 3Architecture.

20The Java Virtual Machine

Instruction-set …

Method invocation:invokevirtual: usual instruction for calling a method on an

object. invokeinterface: same as invokevirtual, but used

when the called method is declared in an interface (requires a different kind of method lookup)

invokespecial: for calling things such as constructors, which are not dynamically dispatched (this instruction is also known as invokenonvirtual).

invokestatic: for calling methods that have the “static” modifier (these methods are sent to a class, not to an object).

Returning from methods:return, ireturn, lreturn, areturn, freturn, …

Page 21: The Java Virtual Machine 1 Course Overview PART I: overview material 1Introduction 2Language processors (tombstone diagrams, bootstrapping) 3Architecture.

21The Java Virtual Machine

Instruction-set: Heap Memory Allocation

Create new class instance (object):new

Create new array:newarray: for creating arrays of primitive types.anewarray, multianewarray: for arrays of reference

types.

Page 22: The Java Virtual Machine 1 Course Overview PART I: overview material 1Introduction 2Language processors (tombstone diagrams, bootstrapping) 3Architecture.

22The Java Virtual Machine

Instructions and the “Constant Pool”

Many JVM instructions have operands which are indexes pointing to an entry in the so-called constant pool.

The constant pool contains all kinds of entries that represent “symbolic” references for “linking”. This is the way that instructions refer to things such as classes, interfaces, fields, methods, and constants such as string literals and numbers.

These are the kinds of constant pool entries that exist:

• Class_info• Fieldref_info• Methodref_info• InterfaceMethodref_info• String

• Integer• Float• Long• Double• Name_and_Type_info• Utf8_info (Unicode characters)

Page 23: The Java Virtual Machine 1 Course Overview PART I: overview material 1Introduction 2Language processors (tombstone diagrams, bootstrapping) 3Architecture.

23The Java Virtual Machine

Instructions and the “Constant Pool”

Example: We examine the getfield instruction in detail.

180 indexbyte1 indexbyte2Format:

CONSTANT_Fieldref_info { u1 tag; u2 class_index; u2 name_and_type_index;}

Class_info { u1 tag; u2 name_index;}

Utf8Infofully qualified class name

CONSTANT_Name_and_Type_info { u1 tag; u2 name_index; u2 descriptor_index;}

Utf8Infoname of field

Utf8Infofield descriptor

Page 24: The Java Virtual Machine 1 Course Overview PART I: overview material 1Introduction 2Language processors (tombstone diagrams, bootstrapping) 3Architecture.

24The Java Virtual Machine

Instructions and the “Constant Pool”

180 indexbyte1 indexbyte2Format:

Fieldref

Class

Utf8Infofully qualified class name

Name_and_Type

Utf8Infoname of field

Utf8Infofield descriptor

That previous picture is rather complicated, let’s simplify it a little:

Page 25: The Java Virtual Machine 1 Course Overview PART I: overview material 1Introduction 2Language processors (tombstone diagrams, bootstrapping) 3Architecture.

25The Java Virtual Machine

Instructions and the “Constant Pool”

Luckily, we have a Java assembler that allows us to write a kind of textual assembly code and that is then transformed into a binary .class file.

This assembler takes care of creating the constant pool entries for us. When an instruction operand expects a constant pool entry the assembler allows you to enter the entry “in place” in an easy syntax.

Example:getfield mypackage/Queue i I

The constant entries format is part of the Java class file format.

Page 26: The Java Virtual Machine 1 Course Overview PART I: overview material 1Introduction 2Language processors (tombstone diagrams, bootstrapping) 3Architecture.

26The Java Virtual Machine

Instructions and the “Constant Pool”

Fully qualified class names and descriptors in constant pool UTF8 entries.

1. Fully qualified class name: a package + class name string. Note this uses “/” instead of “.” to separate each level along the path.

2. Descriptor: a string that defines a type for a method or field.

Java descriptorboolean Zinteger IObject Ljava/lang/Object;String[] [Ljava/lang/String;int foo(int,Object) (ILjava/lang/Object;)I

Page 27: The Java Virtual Machine 1 Course Overview PART I: overview material 1Introduction 2Language processors (tombstone diagrams, bootstrapping) 3Architecture.

27The Java Virtual Machine

Linking

In general, linking is the process of resolving symbolic references in binary files.

Most programming language implementations have what we call “separate compilation”. Modules or files can be compiled separately and transformed into some binary format. But since these separately compiled files may have connections to other files, they have to be linked.

=> The binary file is not yet executable, because it has some kind of “symbolic links” in it that point to things (classes, methods, functions, variables, etc.) in other files/modules.

Linking is the process of resolving these symbolic links and replacing them by real addresses so that the code can be executed.

Page 28: The Java Virtual Machine 1 Course Overview PART I: overview material 1Introduction 2Language processors (tombstone diagrams, bootstrapping) 3Architecture.

28The Java Virtual Machine

Loading and Linking in JVM

In JVM, loading and linking of class files happens at runtime, while the program is running!

Classes are loaded as needed.

The constant pool contains symbolic references that need to be resolved before a JVM instruction that uses them can be executed (this is the equivalent of linking).

In JVM a constant pool entry is resolved the first time it is used by a JVM instruction.

Example: When a getfield is executed for the first time, the constant pool entry index in the instruction can be replaced by the offset of the field.

Page 29: The Java Virtual Machine 1 Course Overview PART I: overview material 1Introduction 2Language processors (tombstone diagrams, bootstrapping) 3Architecture.

29The Java Virtual Machine

Closing Example

class Factorial {

int fac(int n) { int result = 1; for (int i=2; i<n; i++) { result = result * i; } return result; }}

As a closing example on the JVM, we will take a look at the compiled code of the following simple Java class declaration.

Page 30: The Java Virtual Machine 1 Course Overview PART I: overview material 1Introduction 2Language processors (tombstone diagrams, bootstrapping) 3Architecture.

30The Java Virtual Machine

Compiling and Disassembling

% javac Factorial.java % javap -c -verbose FactorialCompiled from Factorial.javaclass Factorial extends java.lang.Object { Factorial(); /* Stack=1, Locals=1, Args_size=1 */ int fac(int); /* Stack=2, Locals=4, Args_size=2 */}

Method Factorial() 0 aload_0 1 invokespecial #1 <Method java.lang.Object()> 4 return

% javac Factorial.java % javap -c -verbose FactorialCompiled from Factorial.javaclass Factorial extends java.lang.Object { Factorial(); /* Stack=1, Locals=1, Args_size=1 */ int fac(int); /* Stack=2, Locals=4, Args_size=2 */}

Method Factorial() 0 aload_0 1 invokespecial #1 <Method java.lang.Object()> 4 return

Page 31: The Java Virtual Machine 1 Course Overview PART I: overview material 1Introduction 2Language processors (tombstone diagrams, bootstrapping) 3Architecture.

31The Java Virtual Machine

Compiling and Disassembling ...

// address: 0 1 2 3Method int fac(int) // stack: this n result i 0 iconst_1 // stack: this n result i 1 1 istore_2 // stack: this n result i 2 iconst_2 // stack: this n result i 2 3 istore_3 // stack: this n result i 4 goto 14 7 iload_2 // stack: this n result i result 8 iload_3 // stack: this n result i result i 9 imul // stack: this n result i result*i 10 istore_2 // stack: this n result i 11 iinc 3 1 // stack: this n result i 14 iload_3 // stack: this n result i i 15 iload_1 // stack: this n result i i n 16 if_icmplt 7 // stack: this n result i 19 iload_2 // stack: this n result i result 20 ireturn

Page 32: The Java Virtual Machine 1 Course Overview PART I: overview material 1Introduction 2Language processors (tombstone diagrams, bootstrapping) 3Architecture.

32The Java Virtual Machine

Writing Factorial in “jasmin”

.class package Factorial

.super java/lang/Object

.method package <init>( )V

.limit stack 50

.limit locals 1aload_0invokenonvirtual java/lang/Object/<init>( )Vreturn.end method

.class package Factorial

.super java/lang/Object

.method package <init>( )V

.limit stack 50

.limit locals 1aload_0invokenonvirtual java/lang/Object/<init>( )Vreturn.end method

Jasmin is a Java Assembler Interface. It takes ASCII descriptionsfor Java classes, written in a simple assembler-like syntax and usingthe Java Virtual Machine instruction set. It converts them into binaryJava class files suitable for loading into a JVM implementation.

Page 33: The Java Virtual Machine 1 Course Overview PART I: overview material 1Introduction 2Language processors (tombstone diagrams, bootstrapping) 3Architecture.

33The Java Virtual Machine

Writing Factorial in “jasmin” (continued)

.method package fac(I)I

.limit stack 50

.limit locals 4

iconst_1

istore 2

iconst_2

istore 3

Label_1:

iload 3

iload 1

if_icmplt Label_4

iconst_0

goto Label_5

Label_4:

iconst_1

Label_5:

ifeq Label_2

iload 2iload 3imuldupistore 2popLabel_3:iload 3dupiconst_1iaddistore 3popgoto Label_1Label_2:iload 2ireturniconst_0ireturn.end method