COS2014 IA-32 Processor Architecture
description
Transcript of COS2014 IA-32 Processor Architecture
COS2014IA-32 Processor
Architecture
2
Overview
Goal: Understand IA-32 architecture
Basic Concepts of Computer Organization Instruction execution cycle Basic computer organization Data storage in memory How programs run
IA-32 Processor Architecture IA-32 Memory Management Components of an IA-32 Microcomputer Input-Output System
3
Recall: Computer Model for ASM
CPU
MemoryMOV AX, aADD AX, bMOV x, AX…
010100110010101a
110010110001010b
000000000010010x
AXBX
...
+ -
PC
Register
ALU
x a + b
Meanings of the Code (assumed)
Assembly code Machine codeMOV AX, a(Take the data stored inmemory address ‘a’, and move it to register AX)
ADD AX, b(Take the data stored inmemory address ‘b’, and add it to register AX)
MOV x, AX(Take the data stored inregister AX, and move it tomemory address ‘x’)
4
01 0000001 1000010
MOV registeraddress
AX memoryaddress
a
01 1000011 0000001
11 0000001 1000110
ADD
Another Computer Model for ASM
5
…
ALU
MemoryRegister
AXBX
address
01 0000001 1000010 11 0000001 1000110
…
a
b
x
01 1000011 0000001
MOV AX, aADD AX, bMOV x, AX
data
PCIR
PC: program counterIR: instruction register
Stored program
architecture
Processor
Step 1: Fetch (MOV AX, a)
6
…
ALU
MemoryRegister
AXBX
address
01 0000001 1000010 11 0000001 1000110
…
a
b
x
01 1000011 0000001
MOV AX, aADD AX, bMOV x, AX
data
PC
000011101 0000001 1000010
IR
Step 2: Decode (MOV AX,a)
7
…
ALU
MemoryRegister
AXBX
address
01 0000001 1000010 11 0000001 1000110
…
a
b
x
01 1000011 0000001
MOV AX, aADD AX, bMOV x, AX
data
PCIR
000011101 0000001 1000010
Controller
clock
Step 3: Execute (MOV AX,a)
8
…
ALU
MemoryRegister
AXBX
address
00000000 000000001
01 0000001 1000010 11 0000001 1000110
…
a
b
x
01 1000011 0000001
MOV AX, aADD AX, bMOV x, AX
data
PCIR
000011101 0000001 1000010
Controller
clock
00000000 00000001
Step 1: Fetch (ADD AX,b)
9
…
ALU
MemoryRegister
AXBX
address
01 0000001 1000010 11 0000001 1000110
…
a
b
x
01 1000011 0000001
MOV AX, aADD AX, bMOV x, AX
data
PC
000100011 0000001 1000110
IR
00000000 00000001
Step 2: Decode (ADD AX,b)
10
…
ALU
MemoryRegister
AXBX
address
01 0000001 1000010 11 0000001 1000110
…
a
b
x
01 1000011 0000001
MOV AX, aADD AX, bMOV x, AX
data
PCIR
000100011 0000001 1000110
Controller
clock
00000000 00000001
Step 3a: Execute (ADD AX,b)
11
…
ALU
MemoryRegister
AXBX
address
00000000 0000001001 0000001 1000010 11 0000001 1000110
…
a
b
x
01 1000011 0000001
MOV AX, aADD AX, bMOV x, AX
data
PCIR
000100011 0000001 1000110
Controller
clock
00000000 00000001
00000000 00000011
++
Step 3b: Write Back (ADD AX,b)
12
…
ALU
MemoryRegister
AXBX
address
01 0000001 1000010 11 0000001 1000110
…
a
b
x
01 1000011 0000001
MOV AX, aADD AX, bMOV x, AX
data
PCIR
000100011 0000001 1000110
Controller
clock
00000000 00000011
00000000 00000011
00000000 00000001
13
Basic Computer Organization
Clock synchronizes CPU operations Control unit (CU) coordinates execution sequence ALU performs arithmetic and bitwise processing
14
Clock
Operations in a computer are triggered and thus synchronized by a clock
Clock tells “when”: (no need to ask each other!!) When to put data on output lines When to read data from input lines
Clock cycle measures time of a single operation Must long enough to allow signal propagation
one cycle
1
0
15
Instruction/Data for Operations
Where are the instructions needed for computer operations from?
Stored-program architecture: The whole program is stored in main memory,
including program instructions (code) and data CPU loads the instructions and data from memory
for execution Don’t worry about the disk for now
Where are the data needed for execution? Registers (inside the CPU, discussed later) Memory Constant encoded in the instructions
16
Memory
Organized like mailboxes, numbered 0, 1, 2, 3,…, 2n-1. Each box can hold 8 bits (1 byte) So it is called byte-addressing
Address of mailboxes: 16-bit address is enough for up to 64K 20-bit for 1M 32-bit for 4G Most servers need more than 4G!!
That’s why we need 64-bit CPUs like Alpha (DEC/Compaq/HP) or Merced (Intel)
…
17
Storing Data in Memory
Character String: So how are strings like “Hello, World!” are stored
in memory? ASCII Code! (or Unicode…etc.) Each character is stored as a byte Review: how is “1234” stored in memory?
Integer: A byte can hold an integer number:
‒ between 0 and 255 (unsigned) or ‒ between –128 and 127 (2’s complement)
How to store a bigger number? Review: how is 1234 stored in memory?
18
Big or Little Endian?
Example: 1234 is stored in 2 bytes. = 100 1101 0010 in binary= 04 D2 in hexadecimal
Do you store 04 or D2 first? Big Endian: 04 first Little Endian: D2 first Intel’s choice
Reason: more consistent for variable length (e.g., 2 bytes, 4 bytes, 8 bytes…etc.)
19
Cache Memory
High-speed expensive static RAM both inside and outside the CPU. Level-1 cache: inside the CPU chip Level-2 cache: often outside the CPU chip
Cache hit: when data to be read is already in cache memory
Cache miss: when data to be read is not in cache memory
20
How a Program Runs?
21
Load and Execute Process
OS searches for program’s filename in current directory and then in directory path
If found, OS reads information from directory OS loads file into memory from disk OS allocates memory for program information OS executes a branch to cause CPU to execute
the program. A running program is called a process
Process runs by itself. OS tracks execution and responds to requests for resources
When the process ends, its handle is removed and memory is released
How?OS is only a program!
22
Multitasking
OS can run multiple programs at same time Multiple threads of execution within the same
program Scheduler utility assigns a given amount of CPU
time to each running program Rapid switching of tasks
Gives illusion that all programs are running at the same time
Processor must support task switching
What supports are needed from hardware?
23
What's Next
General Concepts IA-32 Processor Architecture
Modes of operation Basic execution environment Floating-point unit Intel microprocessor history
IA-32 Memory Management Components of an IA-32 Microcomputer Input-Output System
24
Modes of Operation
Protected mode native mode (Windows, Linux) Programs are given separate memory areas
named segments Real-address mode
native MS-DOS System management mode
power management, system security, diagnostics
• Virtual-8086 mode hybrid of Protected each program has its own 8086 computer
25
Basic Execution Environment
Address space: Protected mode
4 GB 32-bit address
Real-address and Virtual-8086 modes 1 MB space 20-bit address
26
Basic Execution Environment
Program execution registers: named storage locations inside the CPU, optimized for speed
CS
SS
DS
ES
EIP
EFLAGS
16-bit Segment Registers
EAX
EBX
ECX
EDX
32-bit General-Purpose Registers
FS
GS
EBP
ESP
ESI
EDI
ZN
Register
Memory
PCIR
ALU
clock
Controller
27
General Purpose Registers
Used for arithmetic and data movement Addressing:
AX, BX, CX, DX: 16 bits
Split into H and L parts, 8 bits each Extended into E?X to become 32-bit register (i.e.,
EAX, EBX,…etc.)
28
Index and Base Registers
Some registers have only a 16-bit name for their lower half:
29
Some Specialized Register Uses
General purpose registers EAX: accumulator, automatically used by
multiplication and division instructions ECX: loop counter ESP: stack pointer ESI, EDI: index registers (source, destination) for
memory transfer, e.g. a[i,j] EBP: frame pointer to reference function
parameters and local variables on stack EIP: instruction pointer (i.e. program counter)
30
Some Specialized Register Uses
Segment registers In real-address mode: indicate base addresses of
preassigned memory areas named segments In protected mode: hold pointers to segment
descriptor tables CS: code segment DS: data segment SS: stack segment ES, FS, GS: additional segments
EFLAGS Status and control flags (single binary bits) Control the operation of the CPU or reflect the
outcome of some CPU operation
31
Status Flags (EFLAGS)
Reflect the outcomes of arithmetic and logical operations performed by the CPU
Carry: unsigned arithmetic out of range Overflow: signed arithmetic out of range Sign: result is negative Zero: result is zero Auxiliary Carry: carry from bit 3 to bit 4 Parity: sum of 1 bits is an even number
ZN
Register
Memory
PCIR
ALU
clock
Controller
32
System Registers
Application programs cannot access system registers
IDTR (Interrupt Descriptor Table Register) GDTR (Global Descriptor Table Register) LDTR (Local Descriptor Table Register) Task Register Debug Registers Control registers CR0, CR2, CR3, CR4 Model-Specific Registers
33
Floating-Point, MMX, XMM Reg.
Eight 80-bit floating-point data registers ST(0), ST(1), . . . , ST(7) arranged in a stack used for all floating-point arithmetic
Eight 64-bit MMX registers Eight 128-bit XMM registers for
single-instruction multiple-data(SIMD) operations
34
Intel Microprocessors
Early microprocessors: Intel 8080:
‒ 64K addressable RAM, 8-bit registers‒ CP/M operating system‒ S-100 BUS architecture‒ 8-inch floppy disks!
Intel 8086/8088‒ IBM-PC used 8088‒ 1 MB addressable RAM, 16-bit registers‒ 16-bit data bus (8-bit for 8088)‒ separate floating-point unit (8087)
This is where “real-address mode” comes from!
35
Intel Microprocessors
The IBM-AT Intel 80286
‒ 16 MB addressable RAM‒ Protected memory‒ Introduced IDE bus architecture‒ 80287 floating point unit
Intel IA-32 Family Intel386: 4 GB addressable RAM, 32-bit registers,
paging (virtual memory) Intel486: instruction pipelining Pentium: superscalar, 32-bit address bus, 64-bit
internal data path
36
Intel Microprocessors
Intel P6 Family Pentium Pro: advanced optimization techniques in
microcode Pentium II: MMX (multimedia) instruction set Pentium III: SIMD (streaming extensions)
instructions Pentium 4 and Xeon: Intel NetBurst micro-
architecture, tuned for multimedia
37
What’s Next
General Concepts of Computer Architecture IA-32 Processor Architecture IA-32 Memory Management
Real-address mode Calculating linear addresses Protected mode Multi-segment model Paging
Components of an IA-32 Microcomputer Input-Output System
Understand it from the view point of the processor
Real-address Mode
Programs assigned 1MB (220) of memory Programs can access any area of memory Can run only one program at a time Segmented memory scheme
16-bit segment * 10h + 16-bit offset = 20-bit linear (or absolute) address
Segment value in CS, SS, DS, ES Offset value in IP, SP, BX & SI, DI
Example
Accessing a variable in the data segment DS (data segment) = 0A43 (16-bits) BX (offset) = 0030 (16-bits)
0A43 * 10 = 0A430 (20-bits) + 0030 (16-bits) 0A460 (linear address)
40
Segmented Memory
l inea
r ad
dres
s es
one segment
Segmented memory addressing: absolute (linear) address is a combination of a 16-bit segment value (in CS, DS, SS, or ES) added to a 16-bit offset
8000 0000segmentvalue
offset
representedas
41
Calculating Linear Addresses
Given a segment address, multiply it by 16 (add a hexadecimal zero), and add it to the offset all done by the processor
Example:convert 08F1:0100 to a linear address
Adjusted Segment value: 0 8 F 1 0
Add the offset: 0 1 0 0
Linear address: 0 9 0 1 0
42
Protected Mode
Designed for multitasking Each process (running program) is assigned a
total of 4GB of addressable RAM Two parts:
Segmentation: provides a mechanism of isolating individual code, data, and stack so that multiple programs can run without interfering one another
Paging: provides demand-paged virtual memory where sections of a program’s execution environ. are moved into physical memory as needed Give segmentation the illusion that it has 4GB of physical memory
43
Segmentation in Protected Mode
Segment: a logical unit of storage (not the same as the “segment” in real-address mode) e.g., code/data/stack of a program, system data
structures Variable size Processor hardware provides protection All segments in the system are in the processor’s
linear address space (physical space if without paging)
Need to specify: base address, size, type, … segment descriptor & descriptor table linear address = base address + offset
44
Flat Segment Model
Use a single global descriptor table (GDT) All segments (at least 1 code and 1 data)
mapped to entire 32-bit address space
45
Multi-Segment Model
Local descriptor table (LDT) for each program One descriptor for each segment
located in a system segment of LDT type
46
Segmentation Addressing
Program references a memory location with a logical address: segment selector + offset Segment selector: provides an offset into the
descriptor table CS/DS/SS points to descriptor table for
code/data/stack segment
47
Convert Logical to Linear Address
Segment selector points to a segment descriptor, which contains base address of the segment.The 32-bit offset from the logical address is added to the segment’s base address, generating a 32-bit linear address
Selector Offset
Logical address
Segment Descriptor
Descriptor table
+
GDTR/LDTR
(contains base address ofdescriptor table)
Linear address
48
Paging
Supported directly by the processor Divides each segment into 4096-byte blocks
called pages Part of running program is in memory, part is on
disk Sum of all programs can be larger than physical
memory Virtual memory manager (VMM): An OS utility
that manages loading and unloading of pages Page fault: issued by processor when a page
must be loaded from disk
49
What's Next
General Concepts IA-32 Processor Architecture IA-32 Memory Management Components of an IA-32 Microcomputer
Skipped … Input-Output System
50
What's Next
General Concepts IA-32 Processor Architecture IA-32 Memory Management Components of an IA-32 Microcomputer Input-Output System
How to access I/O systems?
51
Different Access Levels of I/O Call a HLL library function (C++, Java)
easy to do; abstracted from hardware slowest performance
Call an operating system function specific to one OS; device-independent medium performance
Call a BIOS (basic input-output system) function may produce different results on different systems knowledge of hardware required usually good performance
Communicate directly with the hardware May not be allowed by some operating systems
52
Displaying a String of Characters
When a HLL program displays a string of characters, the following steps take place: Calls an HLL library function to write the string to
standard output Library function (Level 3) calls an OS function,
passing a string pointer OS function (Level 2) calls a BIOS subroutine,
passing ASCII code and color of each character BIOS subroutine (Level 1) maps the character to a
system font, and sends it to a hardware port attached to the video controller card
Video controller card (Level 0) generates timed hardware signals to video display to display pixels
IA-32: Addressing Modes
IA-32 machine instructions act on zero or more operands. Some operands are specified explicitly in an instruction whereas other operands are implicit in an instruction. The data for a source operand can be located in any of the following places -
–The instruction itself (immediate operand) –A register –A memory location –An I/O Port
ตั�วแสดงการทำ�างานของรหั�สคำ�าส��ง ข�อมู�ลสามูารถเก�บไว�ทำ��–A register –A memory location –An I/O Port
– เราจะด�คำวามูแตักตั างของ addressing modes ใน IA-32. จะด�จากตั�วอย่ างคำ�าส��ง MOV ทำ��จะอธิ$บาย่การอ�งอ$งแอดเดรสในร�ปแบบตั างๆ .
– คำ�าส��ง MOV ใช้�เคำล(�อนย่�าย่ข�อมู�ลไปหัร(อมูาจาก memory, register หัร(อ immediate values. มู�กฏเกณฑ์,ด�งน�-
– MOV <destination operand>, <source operand>
– ตั�วแสดงการทำ�างานของรหั�สคำ�าส��งในส วนของ destination และ source อย่� ใน memory, register or immediate. However, they can’t both be memory at the same time.
ตั�วอย่ าง – MOV EAX, 5 ; Moves into the EAX register the value
5
– MOV EBX, EAX
; Moves into register EBX the value of register EAX
– MOV [EBX], EAX ; Moves into the memory address
pointed to by EBX the value EAX
The IA-32 provides a total of seven distinct addressing modes:
–register –immediate –direct –register indirect –based relative –indexed relative –based indexed relative
1.Register Addressing ModeExamples of register addressing mode follow:
– MOV BX, DX ; copy contents of register BX into register DX
– MOV ES, AX ; copy contents of register AX into segment register ES
– MOV EDX, ESI ;copy contents of register ESI into register EDX
Basic Operand TypesRegister Operands
Very efficient Examples
inc ECX mov BX, AX
mov SI, 10
– 2. Immediate Addressing Mode
– Examples:– MOV AX, 2550h ; move value 2550h
into the AX register
– MOV ESI, 12345678h ; move value 12345678h into the ESI register
– MOV BL, 40h ; move value 40h into the BL register
Basic Operand TypesImmediate Operands
Immediate operands are constants Examples
add AX, 10mov BL, 'a'mov BL, 61h
Numbers do not have a length attribute. For example, we can add 10 to AH, AX, and EAX.
Of course the immediate value fit in the destination mov AL, 12345678h ; Illegal
To move information to the segment registers, the data must first be moved to a general-purpose register and then into the segment registers. Example:
– real-address mode:– MOV AX, 2550h ;AX contains the segment address
– MOV DS, AX ; load the DS segment register with AX
– protected mode:– MOV AX, 08h ; AX contains the segment selector
– MOV DS, AX ; load the DS segment register with AX
3. Direct Addressing Mode
– real-address mode:– MOV DL, [2400h] ; moves content of DS:2400h
into DL register– MOV AX, [5500h] ; moves content of DS:5500h
into the AX register
– protected mode:– MOV EAX, [12345678h] ; moves content of
DS:12345678h into the EAX register
Basic Operand TypesDirect Operands
Allows accessing memory variables by name
Slower because memory is "slow" Example
mov DX, wVal1 add DX, wVal2 mov wData, DX
4. Register Indirect Addressing Mode– real-address mode:– MOV AL, [BX] ; moves into AL the contents of
the memory location pointed to by DS:BX– MOV [DI], AL ; moves into DS:DI the contents
of the register AL– protected mode:– MOV AL, [EAX] ; moves into AL the contents of
the memory location pointed to by DS:EAX– MOV [EBX], ESI ; moves into DS:EBX the
contents of the register ESI– MOV BL, [ESI] ; moves into BL the contents of
the memory location pointed to by DS:ESI
Indirect AddressingIndirect Addressing Modes
mov AX, [ESI]
AX
ESI
Memory
26
2426282A2C2E30
600
600
Indirect AddressingIndirect Addressing Modes: Example
.dataList DWORD 1, 3, 10, 6, 2, 9, 2, 8, 9 Number = ($ - List)/4 or LENGTHOF List.code …
; sum values in list mov EAX, 0 ; sum = 0 mov ECX, Number ; number of values mov ESI, OFFSET List ; ptr to ListL3: add EAX, [ESI] ; add value add ESI, 4 ; point to next value loop L3 ; repeat as needed
Indirect Addressing
Indirect Addressing Modes
String BYTE "the words"Long = $ - String or LENGTHOF String …; Print string 1 character at a time ; using WriteChar mov ECX, Long ; ECX <- number char mov ESI, OFFSET String; pt to stringL9: mov AL, [ESI] ; get char. call WriteChar ; print char inc ESI ; point to next char loop L9 ; repeat
5. Based Relative Addressing Mode– In real-address mode, base registers BX and BP ,
as well as a displacement value, are used to calculate what is called an effective address. The default segment registers used for the calculation of the physical address are DS for BX and SS for BP. Example:
– MOV CX, [BX+10h] ; moves 16-bit word at DS:BX+10h into CX register
– MOV [BP+22h], CX ; moves contents of CX register into memory location SS:BP+22h
– In case of protected mode, registers EAX, EBX, ECX, EDX, ESI, EDI, EBP and ESP can be used as base registers. The default segment registers used for calculation of physical address are DS for EAX, EBX, ECX, EDX, ESI and EDI and SS for EBP and ESP.
– MOV EAX, [EBX+10h] ; moves 32-bit quantity at DS:EBX+10h into the EAX register
– MOV [EBP+22h], EDX ; moves contents of EDX register into memory location SS: EBP+22h
6. Indexed Relative Addressing ModeIn real-address mode, index registers SI
and DI , as well as a displacement value, are used to calculate what is called an effective address. The default segment register used for the calculation of the physical address is the DS register. MOV CX, [SI+10h] ; moves 16-bit word at DS:SI+10h into CX register
– MOV [DI+22h], CX ; moves contents of CX register into memory location DS:DI+22h
– MOV EAX, [ESI+10h] ; moves 32-bit quantity at DS:ESI+10h into the EAX register
– In case of protected mode, registers EAX, EBX, ECX, EDX, ESI, EDI, EBP and ESP can be used as index registers. The default segment registers used for calculation of physical address are DS for EAX, EBX, ECX, EDX, ESI and EDI and SS for EBP and ESP.
– MOV EAX, [ESI+10h] ; moves 32-bit quantity at DS:ESI+10h into the EAX register
– MOV [EDI+22h], EDX ; moves contents of EDX register into memory location DS: EDI+22h
7. Based Indexed Addressing Mode– Examples:– real-address mode:– MOV CL, [BX+SI+8h] ; moves into CL
register, byte at memory location DS:BX+SI+8h
– MOV CL, [BX+DI+8h] ; moves into CL register, byte at memory location DS:BX+DI+8h
– MOV CL, [BP+SI+8h] ; moves into CL register, byte at memory location SS:BP+SI+8h
– MOV CL, [BP+DI+8h] ; moves into CL register, byte at memory location DS:BP+DI+8h
– protected mode:– MOV CL, [EBX+EAX+10h] ;moves into CL
register, byte at memory location DS:EBX+EAX+10h
– MOV CL, [EBP+EAX+10h] ;moves into CL register, byte at memory location SS:EBP+EAX+10h
– MOV CL, [ESI+EDI+10h] ;moves into CL register, byte at memory location DS:ESI+EDI+10h
– Segment overrides and 32-bit scaling factors
– MOV CX, ES:[SI+10h] ; moves 16-bit word at ES:SI+10h into CX register
– In case of 32-bit instructions, we can also add a scaling factor in any of the based, indexed or based indexed addressing modes. The scaling factor must always be a power of 2. Example :
– MOV EAX, [ESI + EAX*4] ; moves into EAX the 32-bit quantity at DS:ESI+EAX*4
Indirect AddressingBased and Indexed Addressing
mov AX, List[ESI]
AX
ESI 4
ListList+2List+4List+6List+8List+10List+12
Memory
100
100
Indirect AddressingBased and Indexed Addressing
mov ESI, OFFSET Listmov AX, [ESI+4]
AX
ESI OFFSET List
ListList+2List+4List+6List+8List+10List+12
Memory
100
100
Indirect AddressingBased - Indexed Addressing
mov EBX, OFFSET Listmov ESI, 4mov AX, [EBX+ESI]
AX
ESI 4
ListList+2List+4List+6List+8List+10List+12
Memory
100
100
EBX OFFSET List+
78
Summary
Central Processing Unit (CPU) Arithmetic Logic Unit (ALU) Instruction execution cycle Multitasking Floating Point Unit (FPU) Complex Instruction Set Real mode and Protected mode Motherboard components Memory types Input/Output and access levels