OK, we are now ready to begin Chapter 2 of our text.

We will begin by looking at some preliminary material.

Then we will look at the Intel IA-32 (CISC).

Then we will concentrate on the MIPS 32 (RISC).

Note: A number of the slides I will use for Patterson & Hennessy material are adapted, with permission, from slides of a computer engineering colleague:

Professor Mary Jane Irwin of Penn State

Where is the Market?

[Figure: bar chart of millions of computers sold per year, 1998 to 2002, broken down by market segment: Embedded, Desktop, Servers.]

ISA Type Sales

[Figure: bar chart of millions of processors sold per year, 1998 to 2002, broken down by ISA: ARM, IA-32, MIPS, Motorola 68K, PowerPC, Hitachi SH, SPARC, Other.]

Moore’s Law

In 1965, Gordon Moore predicted that the number of transistors that can be integrated on a die would double every 18 to 24 months (i.e., grow exponentially with time).

The million transistor/chip barrier was crossed in the 1980’s.

- 2300 transistors, 1 MHz clock (Intel 4004) – 1971
- 16 million transistors (Ultra Sparc III)
- 42 million transistors, 2 GHz clock (Intel Xeon) – 2001
- 55 million transistors, 3 GHz, 130 nm technology, 250 mm² die (Intel Pentium 4) – 2004
- 140 million transistors (HP PA-8500)
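As a rough check of the doubling claim, the short Python sketch below estimates the average doubling period implied by two of the data points listed here (the Intel 4004 in 1971 and the Intel Xeon in 2001). The function name `doubling_period_years` is made up for this illustration, and the calculation assumes smooth exponential growth between the two chips.

```python
import math

def doubling_period_years(count_start, year_start, count_end, year_end):
    """Average time (in years) for the transistor count to double,
    assuming steady exponential growth between the two data points."""
    doublings = math.log2(count_end / count_start)
    return (year_end - year_start) / doublings

# Data points from the slide: Intel 4004 (1971) and Intel Xeon (2001).
period = doubling_period_years(2_300, 1971, 42_000_000, 2001)
print(f"about {period * 12:.0f} months per doubling")
```

For these two chips the estimate comes out at a little over two years per doubling, which is close to the slow end of the 18 to 24 month range.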

Processor Performance Increase

[Figure: log-scale plot of performance (SPECint) versus year, 1987 to 2003. Labeled machines: SUN-4/260, MIPS M/120, MIPS M2000, IBM RS6000, HP 9000/750, DEC AXP/500, IBM POWER 100, DEC Alpha 4/266, DEC Alpha 5/300, DEC Alpha 5/500, DEC Alpha 21264/600, DEC Alpha 21264A/667, Intel Xeon/2000, Intel Pentium 4/3000.]

DRAM Capacity Growth

[Figure: log-scale plot of DRAM capacity (Kbit) versus year of introduction, 1976 to 2002. Labeled generations: 16K, 64K, 256K, 1M, 4M, 16M, 64M, 128M, 256M, 512M.]

Computer Instruction Formats

Three operand

e.g. Opcode Source1, Source2, Destination

Two operand

e.g. Opcode Source1, Source2/Destination

One operand is used as Source & Destination

One operand

e.g. Opcode Source

Result is deposited in an Accumulator
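The three formats above can be sketched as a toy Python model that computes A + B in each style. This is only an illustration: the function names, the dictionary used as a register set, and the register name `ACC` are all assumptions made for the sketch, not part of any real ISA.

```python
import operator

def three_operand(regs, op, src1, src2, dst):
    """Opcode Source1, Source2, Destination: both sources survive."""
    regs[dst] = op(regs[src1], regs[src2])

def two_operand(regs, op, src1, src2_dst):
    """Opcode Source1, Source2/Destination: the second operand is overwritten."""
    regs[src2_dst] = op(regs[src2_dst], regs[src1])

def one_operand(regs, op, src):
    """Opcode Source: the accumulator is the implicit other source and the destination."""
    regs["ACC"] = op(regs["ACC"], regs[src])

regs3 = {"A": 5, "B": 7, "C": 0}
three_operand(regs3, operator.add, "A", "B", "C")  # C = A + B

regs2 = {"A": 5, "B": 7}
two_operand(regs2, operator.add, "A", "B")         # B = B + A, the old B is lost

regs1 = {"A": 5, "B": 7, "ACC": 0}
one_operand(regs1, operator.add, "A")              # ACC = 0 + A
one_operand(regs1, operator.add, "B")              # ACC = A + B
```

The trade-off is visible in the calls: fewer explicit operands make each instruction shorter, but the two-operand form destroys one input and the one-operand form needs two instructions to form the sum.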

The Intel IA 32

History of the IA-32 (Intel)

1971 – 4004: built by Intel as a calculator engine

1972 – 8008: introduced as an 8-bit computer

1974 – 8080: an 8-bit machine (16 address bits) with enough power to build a computer around it: Altair 8800, IMSAI 8080, Osborne I (first portable computer, 1981)

1976 – 8085: 8080 with two interrupts

1978 – 8086: 16-bit machine using enhanced 8080 instructions and registers

1979 – 8088: 8086 with an 8-bit external data bus; the engine for the first IBM PC (1981)

1980 – 8087: floating-point coprocessor for the 8086

1982 – 80186 & 80286: the latter was the engine for the IBM PC/AT and added memory management to become a multiuser machine

1985 – 80386: 32-bit machine with a 32-bit address space

1989 – 80486: multiprogramming, pseudo GPR machine

1992 – Pentium & Pentium Pro (1995): higher performance

1997 – Added MMX media extensions

1999 – Added another 70 instructions

2001 – Added another 144 instructions

2003 – AMD architecture increased the address space to 64 bits and breaks the legacy chain

2004 – Intel adopts the AMD64 architecture with slight additions

Building a legacy nightmare!

IA-32 Registers

IA-32 Flags Register

Example IA-32 Instruction Format

Sample IA-32 Instruction Formats

Note: Instruction lengths vary from 1 to 17 bytes

The MIPS 32

RISC - Reduced Instruction Set Computer

RISC philosophy (keep it simple!)

- fixed instruction length(s) (one word?)
- load-store instruction sets (don’t do anything else)
- limited addressing modes
- limited operations

MIPS, Sun SPARC, HP PA-RISC, IBM PowerPC, Intel (Compaq) Alpha, …

Instruction sets are measured by how well compilers use them as opposed to how well assembly language programmers use them

Design goals: speed, cost (design, fabrication, test, packaging), size, power consumption, reliability, memory space (embedded systems)

MIPS R3000 Instruction Set Architecture (ISA)

Instruction Categories

- Computational
- Load/Store
- Jump and Branch
- Floating Point (coprocessor)
- Memory Management
- Special

Registers: R0 - R31, PC, HI, LO

3 Instruction Formats: all 32 bits wide

R format: OP | rs | rt | rd | sa | funct

I format: OP | rs | rt | immediate

J format: OP | jump target
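To make the three formats concrete, here is a minimal Python sketch that packs the fields into a 32-bit word. The field widths (6/5/5/5/5/6 for R, 6/5/5/16 for I, 6/26 for J) and the opcode, funct, and register numbers in the examples are the standard MIPS32 values, which the slide itself does not list; the function names are made up for this illustration.

```python
def encode_r(op, rs, rt, rd, sa, funct):
    """R format: op(6) rs(5) rt(5) rd(5) sa(5) funct(6)."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (sa << 6) | funct

def encode_i(op, rs, rt, immediate):
    """I format: op(6) rs(5) rt(5) immediate(16)."""
    return (op << 26) | (rs << 21) | (rt << 16) | (immediate & 0xFFFF)

def encode_j(op, target):
    """J format: op(6) jump target(26), where the target is a word address."""
    return (op << 26) | (target & 0x03FFFFFF)

add_word = encode_r(0x00, 17, 18, 8, 0, 0x20)  # add $t0, $s1, $s2
lw_word = encode_i(0x23, 19, 8, 32)            # lw  $t0, 32($s3)
j_word = encode_j(0x02, 0x00400000 >> 2)       # j   0x00400000

print(f"{add_word:08X} {lw_word:08X} {j_word:08X}")
```

Because every format is exactly one word and the OP, rs, and rt fields sit in the same bit positions wherever they appear, decoding can start before the hardware knows which format it is looking at.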

MIPS Addressing Modes

1. Operand: Register addressing

op rs rt rd funct (the word operand is in a register)

2. Operand: Base addressing

op rs rt offset (base register plus offset selects a word or byte operand in memory)

3. Operand: Immediate addressing

op rs rt operand (the operand is a field of the instruction itself)

4. Instruction: PC-relative addressing

op rs rt offset (Program Counter (PC) plus offset selects the branch destination instruction in memory)

5. Instruction: Pseudo-direct addressing

op jump address (upper bits of the Program Counter (PC) || jump address selects the jump destination instruction in memory)
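The three modes that compute a memory address can be sketched in Python as follows. The details are the usual MIPS32 rules rather than anything stated on the slide: the 16-bit offset is sign-extended, branch offsets count words relative to PC + 4, and a jump keeps the upper 4 bits of PC + 4. The function names are invented for this sketch.

```python
MASK32 = 0xFFFFFFFF

def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value

def base_address(base, offset):
    """Base addressing: register contents plus sign-extended offset."""
    return (base + sign_extend16(offset)) & MASK32

def branch_target(pc, offset):
    """PC-relative addressing: (PC + 4) plus the word offset shifted to bytes."""
    return (pc + 4 + (sign_extend16(offset) << 2)) & MASK32

def jump_target(pc, address26):
    """Pseudo-direct addressing: upper 4 bits of PC + 4 || 26-bit address || 00."""
    return ((pc + 4) & 0xF0000000) | ((address26 & 0x03FFFFFF) << 2)

print(hex(base_address(0x10010000, 0xFFFC)))   # offset of -4 from the base
print(hex(branch_target(0x00400000, 3)))       # 3 instructions past PC + 4
print(hex(jump_target(0x00400000, 0x100000)))  # jump within the same 256 MB region
```

Register and immediate addressing need no address calculation at all, which is why only three of the five modes appear here.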

MIPS Register Convention

Name         Register Number   Usage                     Preserve on call?
$zero        0                 constant 0 (hardware)     n.a.
$at          1                 reserved for assembler    n.a.
$v0 - $v1    2-3               returned values           no
$a0 - $a3    4-7               arguments                 no
$t0 - $t7    8-15              temporaries               no
$s0 - $s7    16-23             saved values              yes
$t8 - $t9    24-25             temporaries               no
$gp          28                global pointer            yes
$sp          29                stack pointer             yes
$fp          30                frame pointer             yes
$ra          31                return addr (hardware)    yes

MIPS 32 “Card”

MIPS Register File

[Figure: register file with 32 locations of 32 bits each. Inputs: src1 addr (5 bits), src2 addr (5 bits), dst addr (5 bits), write data (32 bits), and write control. Outputs: src1 data (32 bits) and src2 data (32 bits).]

Holds thirty-two 32-bit registers

Two read ports and one write port

Registers are

- Faster than main memory
  - But register files with more locations are slower (e.g., a 64 word file could be as much as 50% slower than a 32 word file)
  - Read/write port increase impacts speed quadratically
- Easier for a compiler to use
  - e.g., (A*B) – (C*D) – (E*F) can do multiplies in any order vs. stack
- Able to hold variables so that
  - code density improves (since registers are named with fewer bits than a memory location)
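The register file described above can be modelled in a few lines of Python. This is a behavioural sketch only: the class and method names are assumptions made for the illustration, and it ignores timing. It follows the register convention in treating register 0 as a hardwired constant 0.

```python
class RegisterFile:
    """Thirty-two 32-bit registers with two read ports and one write port."""

    def __init__(self):
        self.regs = [0] * 32

    def read(self, src1_addr, src2_addr):
        """Both read ports are used at once; addresses are 5 bits wide."""
        return self.regs[src1_addr & 0x1F], self.regs[src2_addr & 0x1F]

    def write(self, dst_addr, write_data, write_control=True):
        """The single write port updates a register only when write control is set."""
        dst_addr &= 0x1F
        if write_control and dst_addr != 0:  # $zero always reads as 0
            self.regs[dst_addr] = write_data & 0xFFFFFFFF

rf = RegisterFile()
rf.write(8, 0x1_0000_0005)           # value is truncated to 32 bits
rf.write(0, 99)                      # ignored: register 0 is the constant 0
rf.write(9, 7, write_control=False)  # ignored: write control is off
src1_data, src2_data = rf.read(8, 0)
```

Two read ports and one write port match the R format exactly: one instruction names two source registers (rs, rt) and one destination (rd).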

MIPS Organization

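[Figure: block diagram of the MIPS organization, split into Processor and Memory. The processor contains the register file (32 registers, $zero - $ra, with 5-bit src1, src2, and dst addresses and 32-bit src1 data, src2 data, and write data), the PC, and the ALU, connected by 32-bit paths. A Fetch (PC = PC+4), Decode, Exec cycle is shown; one adder adds 4 to the PC and a second adder adds the branch offset. Memory holds 2^30 words of 32 bits each, with read/write addr, read data, and write data connections. Word addresses (binary) run 0…0000, 0…0100, 0…1000, 0…1100, up to 1…1100; byte addresses are big Endian, with bytes 0 1 2 3 in the first word and 4 5 6 7 in the second.]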
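The memory side of this organization can be sketched in Python as a byte-addressable array with big-endian word access, together with the PC = PC + 4 fetch step. The class name, method names, and the tiny 16-byte memory are assumptions made for the illustration; the alignment check reflects the word addresses in the figure, which are all multiples of 4.

```python
class Memory:
    """Byte-addressable memory with big-endian 32-bit word access."""

    def __init__(self, size_bytes):
        self.data = bytearray(size_bytes)

    def store_word(self, byte_addr, value):
        if byte_addr % 4:
            raise ValueError("word address must be a multiple of 4")
        self.data[byte_addr:byte_addr + 4] = (value & 0xFFFFFFFF).to_bytes(4, "big")

    def load_word(self, byte_addr):
        if byte_addr % 4:
            raise ValueError("word address must be a multiple of 4")
        return int.from_bytes(self.data[byte_addr:byte_addr + 4], "big")

    def load_byte(self, byte_addr):
        return self.data[byte_addr]

def fetch(mem, pc):
    """Fetch step: read the instruction word at PC and return it with PC + 4."""
    return mem.load_word(pc), pc + 4

mem = Memory(16)
mem.store_word(4, 0x11223344)
instruction, next_pc = fetch(mem, 4)
```

With big-endian ordering the most significant byte (0x11 here) sits at the lowest byte address of the word, so byte 4 holds 0x11 and byte 7 holds 0x44.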
