Machine-Level Representation of Programs I
Outline
• Compiler drivers
• History of the Intel IA-32 architecture
• Assembly code and object code
• Memory and registers
• Addressing modes
• Data formats
• Suggested reading
– Chap 1.2, 1.4.1, 1.7.3, 3.1, 3.2, 3.3, 3.4.1
The Hello Program
• It begins life as a high-level C program
– Can be read and understood by human beings
• The individual C statements must be translated by a compiler driver
– So that the hello program can run on a computer system
• The C program is translated into
– A sequence of low-level machine-language instructions
• These instructions are then packaged in a form
– Called an object program
• Object programs are stored as binary disk files
– Also referred to as executable object files
The Context of a Compiler (gcc)
Figure 1.3 P5
hello.c (source program, text)
  → Preprocessor (cpp) → hello.i (modified source program, text)
  → Compiler (cc1) → hello.s (assembly program, text)
  → Assembler (as) → hello.o (relocatable object program, binary)
  → Linker (ld) → hello (executable object program, binary)
Characteristics of high-level programming languages
• Abstraction
– Productive
– Reliable
• Type checking
• Compiled code is often as efficient as hand-written assembly code
• Can be compiled and executed on a number of different machines, whereas assembly code is highly machine specific
Characteristics of assembly programming languages
• Managing memory
• Low-level instructions to carry out the computation
• Highly machine specific
Why should we understand assembly code?
• Understand the optimization capabilities of the compiler
• Analyze the underlying inefficiencies in the code
• Sometimes information about the run-time behavior of a program is needed that the source code does not show
From writing assembly code to understanding assembly code
• Different set of skills
– Transformations
– Relation between source code and assembly code
• Reverse engineering
– Trying to understand the process by which a system was created
  • By studying the system and
  • By working backward
A Historical Perspective
• Long evolutionary development
– Started from rather primitive 16-bit processors
– Added more features
  • Take advantage of technology improvements
  • Satisfy the demands for higher performance and for supporting more advanced operating systems
– Laden with features, many now obsolete, that provide backward compatibility
X86 family
• 8086 (1978, 29K transistors)
– The heart of the IBM PC & DOS
– 1M bytes addressable, 640K for users
• 80286 (1982, 134K transistors)
– More (now obsolete) addressing modes
– Basis of the IBM PC-AT & Windows
• i386 (1985, 275K transistors)
– 32-bit architecture, flat addressing model
– Capable of running a Unix operating system
• i486 (1989, 1.9M transistors)
– Integrated the floating-point unit onto the processor chip
• Pentium (1993, 3.1M transistors)
• PentiumPro (1995, 6.5M transistors)
– P6 microarchitecture
– Conditional move instructions
• Pentium/MMX (1997, 4.5M transistors)
– New class of instructions for manipulating vectors of integers
• Pentium II (1997, 7M transistors)
– Implemented MMX instructions within the P6 microarchitecture
• Pentium III (1999, 8.2M transistors)
– New class of instructions for manipulating vectors of floating-point numbers (SSE, Streaming SIMD Extensions)
• Pentium 4 (2001, 42M transistors)
– NetBurst microarchitecture
– 144 new SSE2 instructions
• Advanced Micro Devices (AMD)
– Now a close competitor to Intel
– Developing its own extension to 64 bits
• Transmeta
– In January 2000, introduced the Crusoe™ processor
– Radically different approach to implementation
  • Translates x86 code into “Very Long Instruction Word” (VLIW) code
  • High degree of parallelism
– Aimed at low-power markets such as laptop computers
Hardware Organization Figure 1.4 P7
• CPU: Central Processing Unit
• ALU: Arithmetic/Logic Unit
• PC: Program Counter
• USB: Universal Serial Bus
Virtual address space
• A linear array of bytes
– Each with its own unique address (array index), starting at zero
addresses     contents
0xffffffff    …
0xfffffffe    …
…             …
0x2           …
0x1           …
0x0           …
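The "linear array of bytes" picture can be tried out directly in C. The sketch below is only an illustration: the helper names `byte_at` and `address_gap` are made up here and do not come from the slides. It treats any object as an array of bytes indexed from its starting address.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: view the memory starting at `start` as an array of
   bytes and return the byte at index `offset`. */
unsigned char byte_at(const void *start, size_t offset)
{
    assert(start != NULL);
    const unsigned char *bytes = (const unsigned char *)start;
    return bytes[offset];
}

/* Hypothetical helper: number of bytes between two addresses that lie
   inside the same object; `hi` must not be below `lo`. */
size_t address_gap(const void *lo, const void *hi)
{
    const unsigned char *l = (const unsigned char *)lo;
    const unsigned char *h = (const unsigned char *)hi;
    assert(h >= l);
    return (size_t)(h - l);
}
```

For an array `unsigned char buf[4] = {0x10, 0x20, 0x30, 0x40}`, `byte_at(buf, 2)` reads the byte 0x30, and `address_gap(&buf[0], &buf[3])` is 3, because consecutive bytes have consecutive addresses.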
Data layout
• Object model in C
– Different data types can be declared
• Object model in assembly
– A large, byte-addressable array
– No distinction even between signed and unsigned integers
– Holds code, user data, and OS data
– Run-time stack for managing procedure calls and returns
– Blocks of memory allocated by the user
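The point that machine-level memory does not distinguish signed from unsigned integers can be checked with a small C sketch. The function names below are illustrative assumptions, not part of the slides. The sketch compares the bit pattern produced by signed addition with the result of adding the same bit patterns as unsigned 32-bit values.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Return the raw 32-bit pattern that stores the signed value v. */
uint32_t bits_of(int32_t v)
{
    uint32_t bits;
    memcpy(&bits, &v, sizeof bits);   /* copy the bytes, no conversion */
    return bits;
}

/* Add two 32-bit patterns modulo 2^32, with no notion of sign. */
uint32_t add_bits(uint32_t a, uint32_t b)
{
    return (uint32_t)(a + b);
}

/* Add two signed values in C and return the bit pattern of the result.
   The caller must pick operands whose sum fits in int32_t. */
uint32_t signed_sum_bits(int32_t x, int32_t y)
{
    int64_t wide = (int64_t)x + (int64_t)y;
    assert(wide >= INT32_MIN && wide <= INT32_MAX);
    return bits_of((int32_t)wide);
}
```

For example, `signed_sum_bits(-5, 3)` and `add_bits(bits_of(-5), bits_of(3))` both give 0xFFFFFFFE: one addition on the bits serves both interpretations.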
• Figure 1.13 P17
Operations in C constructs
• Arithmetic expression evaluation
• Loops
• Procedure calls and returns
• Translated into sequences of instructions
Operations in Assembly Instructions
• Each instruction performs only a very elementary operation
• Instructions normally execute one by one, in sequence
• Operate on data stored in registers
• Transfer data between memory and a register
• Conditionally branch to a new instruction address
Assembly Programmer’s View Figure 3.2 P136
[Figure: the CPU and memory exchange addresses, data, and instructions.
CPU registers: %eax, %edx, %ecx, %ebx, %esi, %edi, %esp, %ebp (low-order bytes accessible as %al/%ah, %dl/%dh, %cl/%ch, %bl/%bh), plus %eip and %eflags.
Memory regions (upper address byte marks FF, C0, BF, 80, 7F, 40, 3F, 08, 00): stack, DLLs, heap, data, text.]
Programmer-Visible States P129
• Program Counter(%eip)
– Address of the next instruction
• Register File
– Heavily used program data
– Integer and floating-point
• Condition code register
– Holds status information about the most recently executed instruction
– Used to implement conditional changes in the control flow
Code Examples P130
C code
    int sum(int x, int y)
    {
        int t = x + y;
        return t;
    }
Obtain with command
    gcc -O2 -S code.c
Assembly file code.s
    _sum:
        pushl %ebp
        movl %esp,%ebp
        movl 12(%ebp),%eax
        addl 8(%ebp),%eax
        movl %ebp,%esp
        popl %ebp
        ret
Code Examples P131
Obtain with command
    gcc -O2 -c code.c
Relocatable object file code.o
    55 89 e5 8b 45 0c 03 45 08 01 05 00 00 00 00 89 ec 5d c3
Obtain with command
    objdump -d code.o
Disassembly output (P132)
    0x80483b4 <sum>:
    0x80483b4:  55                  push %ebp
    0x80483b5:  89 e5               mov  %esp,%ebp
    0x80483b7:  8b 45 0c            mov  0xc(%ebp),%eax
    0x80483ba:  03 45 08            add  0x8(%ebp),%eax
    0x80483bd:  01 05 00 00 00 00   add  %eax,0x0
    0x80483c3:  89 ec               mov  %ebp,%esp
    0x80483c5:  5d                  pop  %ebp
    0x80483c6:  c3                  ret
                                    nop
C Code
• Add two signed integers
• int t = x+y;
Assembly Code
• Operands:
– x: Register %eax
– y: Memory M[%ebp+8]
– t: Register %eax
• Instruction
– addl 8(%ebp),%eax
– Add two 4-byte integers
– Similar to expression x += y
• Return function value in %eax
Object Code
• 3-byte instruction
• Stored at address 0x80483ba
• 0x80483ba: 03 45 08
Operands P137
• In high-level languages
– Either constants
– Or variables
• Example
– A = A + 4 (A is a variable, 4 is a constant)
• Counterparts in assembly languages
– Immediate (constant)
– Register (variable)
– Memory (variable)
• Example
    movl 8(%ebp), %eax    # 8(%ebp) is a memory operand, %eax a register operand
    addl $4, %eax         # $4 is an immediate operand
Simple Addressing Mode
• Immediate
– Represents a constant
– The format is $Imm (for example $4, $0xffffffff)
• Registers
– The fastest storage units in computer systems
– Typically 32 bits long
– Register mode Ea
  • The operand is the value stored in the register
  • Denoted R[Ea]
Memory References
• Memory is viewed as an array of bytes, denoted M
• If addr is a memory address
– M[addr] is the content of the memory starting at addr
– addr is used as an array index
• How many bytes are there in M[addr]?
– It depends on the context
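The idea that the size of M[addr] depends on context can be sketched in C. The function name `load` and the sample array are assumptions made for illustration. The sketch reads 1, 2, or 4 bytes starting at the same address and assembles them least-significant byte first, which is the byte order IA-32 uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read nbytes (1, 2, or 4) starting at index addr of the byte array M and
   assemble them least-significant byte first (little-endian). */
uint32_t load(const uint8_t *M, size_t addr, size_t nbytes)
{
    assert(nbytes == 1 || nbytes == 2 || nbytes == 4);
    uint32_t value = 0;
    for (size_t i = 0; i < nbytes; i++) {
        value |= (uint32_t)M[addr + i] << (8 * i);
    }
    return value;
}
```

With `uint8_t M[8] = {0x78, 0x56, 0x34, 0x12, 0xEF, 0xCD, 0xAB, 0x89}`, the same address 0 yields 0x78 as a byte, 0x5678 as a word, and 0x12345678 as a double word.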
Memory Addressing Mode
• An expression for
– A memory address (or an array index)
• Most general form
– Imm(Eb, Ei, s)
– The scale factor s must be 1, 2, 4, or 8
• The address represented by the above form
– Imm + R[Eb] + R[Ei] * s
• It gives the value
– M[Imm + R[Eb] + R[Ei] * s]
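The address formula Imm + R[Eb] + R[Ei] * s can be written as a one-line C function. This is a minimal sketch; the name `effective_address` and its parameter names are chosen here for illustration, and the register contents are passed in as plain numbers.

```c
#include <assert.h>
#include <stdint.h>

/* Address denoted by Imm(Eb, Ei, s), given the register contents
   base = R[Eb] and index = R[Ei]. Uses 32-bit unsigned arithmetic. */
uint32_t effective_address(uint32_t imm, uint32_t base,
                           uint32_t index, uint32_t scale)
{
    assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);
    return imm + base + index * scale;
}
```

For instance, with R[Eb] = 0x1000 and R[Ei] = 3, the operand 8(Eb, Ei, 4) refers to address 8 + 0x1000 + 3 * 4 = 0x1014.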
Addressing Mode Figure 3.3 P137
Type       Form             Operand value               Name
Immediate  $Imm             Imm                         Immediate
Register   Ea               R[Ea]                       Register
Memory     Imm              M[Imm]                      Absolute
Memory     (Ea)             M[R[Ea]]                    Indirect
Memory     Imm(Eb)          M[Imm + R[Eb]]              Base + displacement
Memory     (Eb, Ei)         M[R[Eb] + R[Ei]]            Indexed
Memory     Imm(Eb, Ei)      M[Imm + R[Eb] + R[Ei]]      Indexed
Memory     (, Ei, s)        M[R[Ei]*s]                  Scaled indexed
Memory     (Eb, Ei, s)      M[R[Eb] + R[Ei]*s]          Scaled indexed
Memory     Imm(Eb, Ei, s)   M[Imm + R[Eb] + R[Ei]*s]    Scaled indexed
• Practice problem 3.1 P138

Address   Value
0x100     0xFF
0x104     0xAB
0x108     0x13
0x10C     0x11

Register  Value
%eax      0x100
%ecx      0x1
%edx      0x3

Operand          Value   Comment
%eax             0x100   Register
(%eax)           0xFF    Address 0x100
$0x108           0x108   Immediate
0x108            0x13    Absolute address
260(%ecx,%edx)   0x13    Address 0x108
(%eax,%edx,4)    0x11    Address 0x10C
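The practice problem can be replayed in C as a rough check of the answers. The sketch below hard-codes the memory and register values from the tables; the names `M`, `address_of`, `EAX`, `ECX`, and `EDX` are chosen here for illustration and are not an actual machine interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The four memory words from the practice problem. */
static const uint32_t mem_addr[]  = {0x100, 0x104, 0x108, 0x10C};
static const uint32_t mem_value[] = {0xFF,  0xAB,  0x13,  0x11};

/* M[addr]: look up the value stored at one of the listed addresses. */
uint32_t M(uint32_t addr)
{
    for (size_t i = 0; i < sizeof mem_addr / sizeof mem_addr[0]; i++) {
        if (mem_addr[i] == addr) {
            return mem_value[i];
        }
    }
    assert(!"address not listed in the table");
    return 0;
}

/* Register values from the practice problem. */
enum { EAX = 0x100, ECX = 0x1, EDX = 0x3 };

/* Address of the operand Imm(Eb, Ei, s), given base = R[Eb], index = R[Ei]. */
uint32_t address_of(uint32_t imm, uint32_t base, uint32_t index, uint32_t scale)
{
    assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);
    return imm + base + index * scale;
}
```

The operand 260(%ecx,%edx) becomes `M(address_of(260, ECX, EDX, 1))`: the address is 0x104 + 0x1 + 0x3 = 0x108 and the value is 0x13. The operand (%eax,%edx,4) becomes `M(address_of(0, EAX, EDX, 4))`: the address is 0x100 + 3 * 4 = 0x10C and the value is 0x11.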
Data Formats Figure 3.1 P135
C declaration    Intel data type      GAS suffix   Size (bytes)
char             Byte                 b            1
short            Word                 w            2
int              Double word          l            4
unsigned         Double word          l            4
long int         Double word          l            4
unsigned long    Double word          l            4
char *           Double word          l            4
float            Single precision     s            4
double           Double precision     l            8
long double      Extended precision   t            10/12
• Move data instruction
– mov (general)
– movb (move byte)
– movw (move word)
– movl (move double word)
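The effect of the size suffix can be modeled in C. This is a simplified sketch for a 32-bit register destination only, and the function names `operand_size` and `mov` are assumptions for illustration. It relies on the IA-32 behavior that a byte or word move into a register replaces only the low-order 1 or 2 bytes and leaves the rest of the register unchanged.

```c
#include <assert.h>
#include <stdint.h>

/* Operand size in bytes for the GAS suffixes b, w, and l. */
unsigned operand_size(char suffix)
{
    switch (suffix) {
    case 'b': return 1;   /* movb: byte */
    case 'w': return 2;   /* movw: word */
    case 'l': return 4;   /* movl: double word */
    default:
        assert(!"unknown suffix");
        return 0;
    }
}

/* Model a move into a 32-bit register: the low operand_size(suffix) bytes
   of dest are replaced by the low bytes of src; the other bytes are kept. */
uint32_t mov(uint32_t dest, uint32_t src, char suffix)
{
    unsigned size = operand_size(suffix);
    uint32_t mask = (size == 4) ? 0xFFFFFFFFu : ((1u << (8 * size)) - 1u);
    return (dest & ~mask) | (src & mask);
}
```

Starting from a destination of 0x11223344 and a source of 0xAABBCCDD, `mov(..., 'b')` gives 0x112233DD, `mov(..., 'w')` gives 0x1122CCDD, and `mov(..., 'l')` gives 0xAABBCCDD.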