Date posted: 20-Dec-2015
OK, we are now ready to begin Chapter 2 of our text
We will begin by looking at some preliminary stuff
Then we will look at the Intel IA-32 (CISC)
Then we will concentrate on the MIPS 32 (RISC)
Note: A number of the slides I will use for Patterson & Hennessy material are adapted, with permission, from slides of a computer engineering colleague, Professor Mary Jane Irwin of Penn State.
Where is the Market?
[Figure: bar chart of millions of computers sold per year, 1998 to 2002, in three categories: Embedded, Desktop, Servers. Bar labels as extracted: 290, 933, 488, 1143, 892, 135, 4, 862, 1294, 1122, 1315.]
ISA Type Sales
[Figure: chart of millions of processors sold per year, 1998 to 2002, by ISA: Other, SPARC, Hitachi SH, PowerPC, Motorola 68K, MIPS, IA-32, ARM.]
Moore’s Law
In 1965, Gordon Moore predicted that the number of transistors that can be integrated on a die would double every 18 to 24 months (i.e., grow exponentially with time).
The million transistor/chip barrier was crossed in the 1980’s.
- 2300 transistors, 1 MHz clock (Intel 4004) - 1971
- 16 million transistors (Ultra Sparc III)
- 42 million transistors, 2 GHz clock (Intel Xeon) - 2001
- 55 million transistors, 3 GHz, 130 nm technology, 250 mm² die (Intel Pentium 4) - 2004
- 140 million transistors (HP PA-8500)
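The doubling rule is easy to check with a little arithmetic. The sketch below is only an illustration: the function name `projected_transistors` is made up for this example, and it assumes perfectly regular doubling starting from the Intel 4004 figure quoted above.

```python
def projected_transistors(start_count, start_year, end_year, doubling_months):
    """Project a transistor count assuming one doubling every `doubling_months`."""
    elapsed_months = (end_year - start_year) * 12
    doublings = elapsed_months / doubling_months
    return start_count * 2 ** doublings


# Starting point from the slide: Intel 4004, 2300 transistors, 1971.
for months in (18, 24):
    estimate = projected_transistors(2300, 1971, 2001, months)
    print(f"doubling every {months} months -> about {estimate:,.0f} transistors in 2001")
```

With a 24-month doubling period the projection for 2001 is about 75 million transistors, the same order of magnitude as the 42 million quoted for the Intel Xeon. An 18-month period gives about 2.4 billion, which overshoots considerably.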
Processor Performance Increase
[Figure: processor performance (SPECint, log scale from 1 to 10000) by year, 1987 to 2003. Machines plotted: SUN-4/260, MIPS M/120, MIPS M2000, IBM RS6000, HP 9000/750, DEC AXP/500, IBM POWER 100, DEC Alpha 4/266, DEC Alpha 5/300, DEC Alpha 5/500, DEC Alpha 21264/600, DEC Alpha 21264A/667, Intel Xeon/2000, Intel Pentium 4/3000.]
DRAM Capacity Growth
[Figure: DRAM capacity in Kbit (log scale from 10 to 1000000) by year of introduction, 1976 to 2002. Generations plotted: 16K, 64K, 256K, 1M, 4M, 16M, 64M, 128M, 256M, 512M.]
Computer Instruction Formats
Three operand
- e.g. Opcode Source1, Source2, Destination
Two operand
- e.g. Opcode Source1, Source2/Destination
- One operand is used as both source and destination
One operand
- e.g. Opcode Source
- Result is deposited in an accumulator
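To see the difference between the three formats, here is a toy sketch (not any real machine) that performs the same addition in each style. The function names, the dictionary-based "registers", and the `acc` key are all assumptions made for this illustration.

```python
import operator


def run_three_operand(regs, op, src1, src2, dst):
    """Opcode Source1, Source2, Destination: both sources are left unchanged."""
    regs[dst] = op(regs[src1], regs[src2])


def run_two_operand(regs, op, src1, src2_dst):
    """Opcode Source1, Source2/Destination: the second operand is overwritten."""
    regs[src2_dst] = op(regs[src1], regs[src2_dst])


def run_one_operand(state, op, src):
    """Opcode Source: the implicit operand and the result live in the accumulator."""
    state["acc"] = op(state["acc"], state[src])


three = {"a": 5, "b": 7, "c": 0}
run_three_operand(three, operator.add, "a", "b", "c")

two = {"a": 5, "b": 7}
run_two_operand(two, operator.add, "a", "b")

one = {"acc": 5, "b": 7}
run_one_operand(one, operator.add, "b")

print(three, two, one)
```

Note that only the three operand form keeps both inputs intact. In the other two, a value must be copied first if it is needed again later.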
The Intel IA-32
History of the IA-32 (Intel)
1971 - 4004: built by Intel as a calculator engine
1972 - 8008: introduced as an 8-bit computer
1974 - 8080: an 8-bit machine (16 address bits) with enough power to build a computer around it: Altair 8800, IMSAI 8080, Osborne I (first portable computer, 1981)
1976 - 8085: 8080 with two interrupts
1978 - 8086: 16-bit machine using enhanced 8080 instructions and registers
1980 - 8087: floating-point coprocessor for the 8086
1981 - 8088: 8086 with an 8-bit external data bus; the engine for the first IBM PC
1982 - 80186 & 80286: the latter was the engine for the IBM PC/AT and added memory management to become a multiuser machine
1985 - 80386: 32-bit machine with a 32-bit address space
1989 - 80486: multiprogramming, pseudo GPR machine
1993 - Pentium, then Pentium Pro (1995): higher performance
1997 - Added MMX media extensions
1999 - Added another 70 instructions (SSE)
2001 - Added another 144 instructions (SSE2)
2003 - AMD64 architecture increased the address space to 64 bits and broke the legacy chain
2004 - Intel adopts the AMD64 architecture with slight additions
Building a legacy nightmare!
IA-32 Registers
IA-32 Flags Register
Example IA-32 Instruction Format
Sample IA-32 Instruction Formats
Note: Instruction lengths vary from 1 to 17 bytes
The MIPS 32
RISC - Reduced Instruction Set Computer
RISC philosophy (keep it simple!)
- fixed instruction length(s) (one word?)
- load-store instruction sets (don’t do anything else)
- limited addressing modes
- limited operations
MIPS, Sun SPARC, HP PA-RISC, IBM PowerPC, Intel (Compaq) Alpha, …
Instruction sets are measured by how well compilers use them, as opposed to how well assembly language programmers use them
Design goals: speed, cost (design, fabrication, test, packaging), size, power consumption, reliability, memory space (embedded systems)
MIPS R3000 Instruction Set Architecture (ISA)
Instruction Categories
- Computational
- Load/Store
- Jump and Branch
- Floating Point (coprocessor)
- Memory Management
- Special
Registers: R0 - R31, PC, HI, LO
3 Instruction Formats: all 32 bits wide
- R format: OP | rs | rt | rd | sa | funct
- I format: OP | rs | rt | immediate
- J format: OP | jump target
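The three formats can be made concrete by packing the fields into 32-bit words. This is a minimal sketch, assuming the standard MIPS32 field widths (6-bit OP, 5-bit rs/rt/rd/sa, 6-bit funct, 16-bit immediate, 26-bit jump target), which the slide itself does not spell out. The helper names `encode_r`, `encode_i`, and `encode_j` are invented for this example.

```python
def encode_r(op, rs, rt, rd, sa, funct):
    """R format: op(6) rs(5) rt(5) rd(5) sa(5) funct(6)."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (sa << 6) | funct


def encode_i(op, rs, rt, immediate):
    """I format: op(6) rs(5) rt(5) immediate(16)."""
    return (op << 26) | (rs << 21) | (rt << 16) | (immediate & 0xFFFF)


def encode_j(op, target):
    """J format: op(6) jump target(26), where the target is a word address."""
    return (op << 26) | (target & 0x3FFFFFF)


add_word = encode_r(0, 17, 18, 8, 0, 0x20)  # add $t0, $s1, $s2
lw_word = encode_i(0x23, 19, 8, 32)         # lw  $t0, 32($s3)
j_word = encode_j(0x02, 0x00400000 >> 2)    # j   0x00400000

print(f"{add_word:#010x} {lw_word:#010x} {j_word:#010x}")
```

Every instruction comes out as exactly one 32-bit word, which is the fixed-length property that the RISC philosophy asks for.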
MIPS Addressing Modes
1. Operand: Register addressing
   op | rs | rt | rd | funct
   The word operand is in a register.
2. Operand: Base addressing
   op | rs | rt | offset
   The word or byte operand is in memory at base register + offset.
3. Operand: Immediate addressing
   op | rs | rt | operand
   The operand is in the instruction itself.
4. Instruction: PC-relative addressing
   op | rs | rt | offset
   The branch destination instruction is in memory at Program Counter (PC) + offset.
5. Instruction: Pseudo-direct addressing
   op | jump address
   The jump destination instruction is in memory at Program Counter (PC) || jump address, where || is concatenation.
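The address arithmetic behind modes 2, 4, and 5 can be sketched as follows. This assumes the usual MIPS32 rules, which go beyond what the slide states: 16-bit offsets are sign-extended, branch offsets count words relative to PC+4, and a jump keeps the upper 4 bits of PC+4 and appends the 26-bit jump address followed by two zero bits. All function names are invented for this example.

```python
def sign_extend_16(value):
    """Interpret the low 16 bits of `value` as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def base_address(base_register_value, offset16):
    """Base addressing: memory address = base register + signed byte offset."""
    return (base_register_value + sign_extend_16(offset16)) & 0xFFFFFFFF


def branch_target(pc, offset16):
    """PC-relative addressing: signed word offset added to the next instruction's address."""
    return (pc + 4 + (sign_extend_16(offset16) << 2)) & 0xFFFFFFFF


def jump_target(pc, target26):
    """Pseudo-direct addressing: upper 4 bits of PC+4 || 26-bit jump address || 00."""
    return ((pc + 4) & 0xF0000000) | ((target26 & 0x3FFFFFF) << 2)


print(hex(base_address(0x10010000, 0xFFFC)))  # offset of -4 bytes
print(hex(branch_target(0x00400010, 3)))      # 3 instructions past PC+4
print(hex(jump_target(0x00400010, 0x100000)))
```

Because instructions are word aligned, the two low address bits are always zero, so both branches and jumps store a word offset and gain a factor of four in reach.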
MIPS Register Convention
Name        Register Number   Usage                    Preserve on call?
$zero       0                 constant 0 (hardware)    n.a.
$at         1                 reserved for assembler   n.a.
$v0 - $v1   2-3               returned values          no
$a0 - $a3   4-7               arguments                yes
$t0 - $t7   8-15              temporaries              no
$s0 - $s7   16-23             saved values             yes
$t8 - $t9   24-25             temporaries              no
$gp         28                global pointer           yes
$sp         29                stack pointer            yes
$fp         30                frame pointer            yes
$ra         31                return addr (hardware)   yes
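An assembler needs this table as a name-to-number lookup. The sketch below builds one from the rows above. The function name `build_register_map` is made up for this example, and register numbers 26 and 27 are left out because the table does not list them.

```python
def build_register_map():
    """Map MIPS register names to numbers, following the convention table."""
    names = {"$zero": 0, "$at": 1, "$gp": 28, "$sp": 29, "$fp": 30, "$ra": 31}
    # (prefix, first register number, how many registers in the group)
    groups = [("v", 2, 2), ("a", 4, 4), ("t", 8, 8), ("s", 16, 8)]
    for prefix, first_number, count in groups:
        for i in range(count):
            names[f"${prefix}{i}"] = first_number + i
    # $t8 and $t9 do not follow on from $t7; they sit after the saved registers.
    names["$t8"] = 24
    names["$t9"] = 25
    return names


REGISTERS = build_register_map()
print(REGISTERS["$t0"], REGISTERS["$s1"], REGISTERS["$ra"])
```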
MIPS 32 “Card”
MIPS Register File
[Figure: register file with 32 locations, each 32 bits wide. Inputs: src1 addr (5 bits), src2 addr (5 bits), dst addr (5 bits), write data (32 bits), write control. Outputs: src1 data (32 bits), src2 data (32 bits).]
Holds thirty-two 32-bit registers
- Two read ports and one write port
Registers are
- Faster than main memory
  - But register files with more locations are slower (e.g., a 64 word file could be as much as 50% slower than a 32 word file)
  - Read/write port increase impacts speed quadratically
- Easier for a compiler to use
  - e.g., (A*B) - (C*D) - (E*F) can do multiplies in any order vs. stack
- Can hold variables so that
  - code density improves (since registers are named with fewer bits than a memory location)
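A small software model can make the port structure concrete. This is only a behavioral sketch, not a hardware description: the class name `RegisterFile` and its method names are invented here, and keeping register 0 fixed at zero follows the "constant 0 (hardware)" entry in the register convention.

```python
class RegisterFile:
    """Thirty-two 32-bit registers with two read ports and one write port."""

    def __init__(self):
        self._regs = [0] * 32

    def read(self, src1_addr, src2_addr):
        """Two read ports: return (src1 data, src2 data) for two 5-bit addresses."""
        return self._regs[src1_addr & 0x1F], self._regs[src2_addr & 0x1F]

    def write(self, dst_addr, write_data, write_control=True):
        """One write port: store 32 bits only when write control is asserted."""
        dst_addr &= 0x1F
        if write_control and dst_addr != 0:  # register 0 stays constant 0
            self._regs[dst_addr] = write_data & 0xFFFFFFFF


rf = RegisterFile()
rf.write(8, 5)
rf.write(9, 0x100000003)               # truncated to 32 bits
rf.write(10, 77, write_control=False)  # ignored: write control not asserted
rf.write(0, 99)                        # ignored: $zero is hardwired
print(rf.read(8, 9), rf.read(0, 10))
```

The 5-bit addresses are why a register can be named in far fewer bits than a memory location, which is the code density point made above.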
MIPS Organization
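[Figure: block diagram of the processor and memory.
- Memory: 2^30 words of 32 bits each, with read/write addr, read data, and write data ports (32 bits each). Word addresses in binary run 0…0000, 0…0100, 0…1000, 0…1100, up to 1…1100. Bytes within words are labeled 0 1 2 3 and 7 6 5 4 (byte address, big Endian).
- Register File: 32 registers ($zero - $ra) of 32 bits each, with src1 addr, src2 addr, and dst addr inputs (5 bits each), a write data input (32 bits), and src1 data and src2 data outputs (32 bits each).
- ALU: two 32-bit inputs and one 32-bit output.
- Fetch: PC = PC+4, computed by a 32-bit adder with the constant 4. A second 32-bit adder adds the branch offset.
- Decode and Exec follow Fetch.]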
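The relationship between byte addresses, word addresses, and big Endian byte order in this picture can be sketched in a few lines. The function names are invented for this example, and the code assumes aligned 32-bit words as drawn in the diagram.

```python
def word_address(byte_address):
    """Address of the aligned 32-bit word holding this byte (low 2 bits cleared)."""
    return byte_address & ~0x3


def bytes_of_word_big_endian(word):
    """Split a word into 4 bytes; big Endian puts the most significant byte at the lowest address."""
    return list((word & 0xFFFFFFFF).to_bytes(4, "big"))


def next_pc(pc):
    """Fetch step: PC = PC + 4, since every instruction is one 32-bit word."""
    return (pc + 4) & 0xFFFFFFFF


print(word_address(7))
print([hex(b) for b in bytes_of_word_big_endian(0x11223344)])
print(hex(next_pc(0x00400000)))
```

With 2^30 words of 4 bytes each, byte addresses cover the full 32-bit range, and consecutive word addresses differ by 4, matching the binary addresses ending in 0000, 0100, 1000, 1100 in the diagram.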