ISA Bus Architecture

8/11/2019 ISA Bus Architecture http://slidepdf.com/reader/full/isa-bus-architecture 1/22 © Wen-mei Hwu and S. J. Patel, 2005 ECE 511, University of Illinois Lecture 3: Instruction Set Architecture

Transcript of ISA Bus Architecture

Page 1: ISA Bus Architecture

Lecture 3:

Instruction Set Architecture

Page 2: ISA Bus Architecture

Outline

• Instruction Set Architecture

 – Traditional issues

 – The (old) debate: RISC vs. CISC

 – New issues

Page 3: ISA Bus Architecture

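The Big Picture

Requirements
Algorithms
Prog. Lang./OS
ISA
uArch
Circuit
Device

[Figure: the abstraction layers above, annotated "Problem Focus" and "Performance Focus". Illustrations include a SPEC label, a call graph over f1 to f5, the source fragment and instruction sequence below, and a transistor cross-section labeled Gate, Source, Drain, Si fin (body), and BOX.]

f2() {
    f3(s2, &j, &i);
    *s2->p = 10;
    i = *s2->q + i;
}

i1: ld r1, b <p1>
i2: ld r2, c <p1>
i3: ld r5, z <p3>
i4: mul r6, r5, 3 <p3>
i5: add r3, r1, r2 <p1>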

Page 4: ISA Bus Architecture

Instruction Set Architecture

Application

Instruction Set Architecture

Implementation

…SPARC MIPS  ARM x86 HP-PA IA-64… 

Intel Pentium X

AMD K6, Athlon, Opteron

Transmeta Crusoe TM5x00

Page 5: ISA Bus Architecture

Instruction Set Architecture

• Strong influence on cost/performance

• New ISAs are rare, but versions are not

 – 16-bit, 32-bit and 64-bit X86 versions

• Longevity is a strong function of marketing prowess

Page 6: ISA Bus Architecture

Traditional Issues

• Strongly constrained by the number of bits available for instruction encoding
• Opcodes/operands
• Registers/memory
• Addressing modes
• Orthogonality
• 0, 1, 2, 3 address machines
• Instruction formats
• Decoding uniformity

Page 7: ISA Bus Architecture

Instruction Formats

Alpha (fixed length: 32 bits, 6-bit opcode)

Format     Fields shown
TRAP       opcode
Branch     opcode  RA
Mem        opcode  RA  RB
Operate    opcode  RA  RB  RC

x86 (variable length)

Field         Size
prefixes      0 to 4 bytes
opcode        1 or 2 bytes
addr mode     0 to 2 bytes (ModR/M and SIB)
displ, imm    0 to 8 bytes
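
With a fixed-length format, every field sits at a known bit position, so decoding is a handful of shifts and masks. The sketch below pulls the register fields out of an Operate-format word. The bit positions (opcode in bits 31..26, RA in 25..21, RB in 20..16, RC in 4..0) are the commonly documented Alpha layout and are not given on the slide, so treat them as an assumption; the function name and the sample word are made up for illustration.

```python
def decode_operate(word):
    """Split a 32-bit Alpha Operate-format word into the fields named on the slide."""
    return {
        "opcode": (word >> 26) & 0x3F,  # bits 31..26: the 6-bit opcode
        "ra": (word >> 21) & 0x1F,      # bits 25..21 (assumed position)
        "rb": (word >> 16) & 0x1F,      # bits 20..16 (assumed position)
        "rc": word & 0x1F,              # bits 4..0   (assumed position)
    }


# Hypothetical word, assembled only to exercise the decoder.
word = (0x10 << 26) | (1 << 21) | (2 << 16) | 3
fields = decode_operate(word)
print(fields)
```

No field's position depends on the value of another field, which is what lets all fields be extracted in parallel in hardware.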

Page 8: ISA Bus Architecture

The (old) Debate: RISC vs. CISC

• At the time, IBM 370 and VAX dominated
• CMOS was an up-and-coming technology
 – Small number of transistors per chip
• RISC was appealing
 – lower design complexity
 – easier to pipeline
 – higher performance when the design fit on a chip

IBM 801 (Cocke et al, 1982)
RISC I (Patterson et al, 1982)
MIPS (Hennessy et al, 1982)

Page 9: ISA Bus Architecture

What is RISC?

• Fixed length instructions

• Few formats

• Load/Store

• Few addressing modes

• Simple decode/control

• Many registers

• Few “unpipelinable” insts 

Compiler Complexity

Hardware Complexity

Page 10: ISA Bus Architecture

The MIPS Pipeline

• Compiler knows the pipeline organization

• Schedules instructions around “hazards”

• Branches are handled by delay slots

• No need to “interlock” the pipeline

Fetch Decode ALU Memory WriteBack
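
To make "schedules instructions around hazards" concrete, here is a minimal sketch of what a compiler pass for a non-interlocked pipeline has to guarantee. It assumes a one-instruction load delay and a toy `(op, dest, sources)` tuple representation; both are illustrative assumptions, not details from the slide. This version only pads with a nop, whereas a real scheduler would first try to move an independent instruction into the slot.

```python
def fill_load_delay(program):
    """Insert a nop after any load whose result the very next instruction reads."""
    scheduled = []
    for i, (op, dest, srcs) in enumerate(program):
        scheduled.append((op, dest, srcs))
        if op == "lw" and i + 1 < len(program):
            next_srcs = program[i + 1][2]
            if dest in next_srcs:
                # Without hardware interlocks, the consumer would read a stale value.
                scheduled.append(("nop", None, ()))
    return scheduled


program = [
    ("lw", "r1", ("r10",)),       # r1 <- mem[r10]
    ("add", "r2", ("r1", "r3")),  # reads r1 immediately after the load
    ("sub", "r4", ("r5", "r6")),  # independent of the load
]
result = fill_load_delay(program)
for instr in result:
    print(instr)
```

In this example the independent `sub` could have been hoisted into the slot instead of the nop, which is the kind of scheduling the slide refers to.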

Page 11: ISA Bus Architecture

Pipelining a CISC [Patt et al 85]

Fetch → Instruction Bytes → Decode → µOp store → RF Read → Execute → Mem → WB

Decode emits RISC-like micro-operations.
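
As a rough illustration of the decode step, the sketch below "cracks" a toy CISC instruction with a memory destination into load, operate, and store micro-operations. The instruction syntax, the `tmp` register name, and the function name are all hypothetical; this is not the encoding used in the cited design.

```python
def crack(instr):
    """Translate one toy CISC instruction into RISC-like micro-ops."""
    op, dst, src = instr
    if dst.startswith("["):            # destination is a memory operand
        addr = dst.strip("[]")
        return [
            ("load", "tmp", addr),     # tmp <- mem[addr]
            (op, "tmp", src),          # tmp <- tmp op src
            ("store", addr, "tmp"),    # mem[addr] <- tmp
        ]
    return [(op, dst, src)]            # register form maps to a single micro-op


uops = crack(("add", "[r1]", "r2"))
for u in uops:
    print(u)
```

Once cracked, each micro-op fits the simple RF Read, Execute, Mem, WB flow, so the back end can be pipelined like a RISC machine.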

Page 12: ISA Bus Architecture

RISC Baggage

In hindsight, like CISC, even RISC architectures suffered from legacy effects.

 – Delay slots
   • Used for dealing with branches in short pipelines
   • Help primarily with target generation
   • Become a burden for future generations whose pipelines need to be deeper
 – Register windows
   • Quick save/restore of state for procedure calls
   • Reduce procedure call overhead to 1 cycle
   • Make register renaming and out-of-order execution more complex

Page 13: ISA Bus Architecture

The Dynamic-Static Interface

Perhaps the main contribution of the “RISC Revolution”

John Cocke (IBM) is credited with the original idea.

John Hennessy was a major driving force later, followed by the IMPACT team at Illinois.

“…a willingness to make design tradeoffs freely… between the architecture and implementation…” Colwell et al, 1985.

This legacy is still alive and kicking today.

[Figure labels: DSI (dynamic-static interface)]

Page 14: ISA Bus Architecture

DSI and Static Optimization

[Figure: ISAs plotted by granularity of ISA instruction and potential for static optimization. Labeled points: VAX ISA, MIPS ISA, Itanium ISA, and ISAs for reconfigurable architectures.]

Page 15: ISA Bus Architecture

Variable Instruction Format

• Motivation 1: to accommodate a large number of opcodes with nonuniform frequency of occurrence
 – VAX has 304 opcodes. If we insisted on a uniform opcode encoding, we would need 9 bits. Because of the byte-alignment policy, each VAX opcode would then need two bytes.
 – An observation: some opcodes are used more often than others. The top 200 opcodes account for about 98% of the dynamic opcode usage.
 – Instead of using 2 bytes to encode all 304 opcodes, use 1 byte to encode the frequently used ones and 2 bytes to encode the infrequently used ones.
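
The arithmetic behind Motivation 1 can be checked in a few lines. The sketch assumes that all of the top 200 opcodes receive one-byte codes and every other opcode takes exactly two bytes; the slide gives the 304, 200, and 98% figures but not the exact code assignment, so that part is an assumption.

```python
import math

NUM_OPCODES = 304      # from the slide
HOT_FRACTION = 0.98    # dynamic share of the top 200 opcodes, from the slide

# Uniform encoding: enough bits to distinguish every opcode, rounded up to bytes.
uniform_bits = math.ceil(math.log2(NUM_OPCODES))
uniform_bytes = math.ceil(uniform_bits / 8)

# Variable encoding: 1 byte for frequent opcodes, 2 bytes for the rest (assumed split).
avg_bytes = HOT_FRACTION * 1 + (1 - HOT_FRACTION) * 2

print(uniform_bits, uniform_bytes, round(avg_bytes, 2))
```

Under these assumptions the average dynamic opcode shrinks from 2 bytes to about 1.02 bytes, roughly halving opcode fetch traffic.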

Page 16: ISA Bus Architecture

Variable Instruction Format (cont.)

• Motivation 2: to allow each instruction exactly the number of operands it needs.
 – RET (0), INC (1), ADD (2 or 3), …
• Motivation 3: to allow each operand specifier exactly the number of bytes it needs.
 – Reg (1), Disp (1 to 8), …
• All three motivations aim to reduce the number of bytes that must be
 – stored to represent the program
 – fetched during execution

Page 17: ISA Bus Architecture

VIF Cost

• Sequential Decoding Problem
 – The decoder cannot be sure where the 1st operand specifier is until the opcode is decoded.
 – The decoder cannot locate the ith operand specifier until the (i-1)th operand specifier is decoded.
 – The decoder cannot be sure where the jth instruction starts until the last operand specifier of the (j-1)th instruction is decoded.
 – Typical solution: instruction buffering with a multi-stage decode pipeline, plus a post-decode I-cache or trace cache
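
The dependence chain described above shows up directly in a software decoder. The sketch below uses a made-up variable-length encoding (the opcode table, mode bytes, and specifier lengths are all hypothetical, not VAX or x86) and finds where each instruction starts. Note that every update of `pc` needs a value that is only available after the previous byte has been decoded.

```python
OPERAND_COUNT = {0x01: 0, 0x02: 1, 0x03: 2}  # hypothetical opcode -> operand count
SPECIFIER_LEN = {0: 1, 1: 2, 2: 5}           # hypothetical mode byte -> specifier bytes


def instruction_starts(code):
    """Return the byte offset at which each instruction begins."""
    starts = []
    pc = 0
    while pc < len(code):
        starts.append(pc)
        n_operands = OPERAND_COUNT[code[pc]]  # known only after the opcode is decoded
        pc += 1
        for _ in range(n_operands):
            pc += SPECIFIER_LEN[code[pc]]     # known only after this mode byte is decoded
    return starts


code = bytes([
    0x03, 0, 1, 0x7F,       # two operands: register, then 1-byte displacement
    0x02, 2, 0, 0, 0, 0,    # one operand: 4-byte displacement
    0x01,                   # no operands
])
starts = instruction_starts(code)
print(starts)
```

A post-decode cache or trace cache avoids repeating this serial walk by storing the already-delimited instructions.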

Page 18: ISA Bus Architecture

VIF Cost (cont.)

• Non-aligned Instruction Access: instructions are not aligned to any fixed byte position within a memory word.
 – Instruction opcodes and operand specifiers are not aligned to the decoding logic when fetched from memory.
 – Instructions may spill over cache block boundaries and page boundaries
 – Typical solution: an instruction buffer that decouples fetch and decode
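
A quick way to see the spill-over problem is to lay a sequence of variable-length instructions end to end and check which ones straddle a block boundary. The instruction lengths and the 16-byte block size below are arbitrary illustrative values, and the helper name is hypothetical.

```python
def straddlers(lengths, block_size):
    """Return indices of instructions whose bytes span more than one cache block."""
    crossing = []
    start = 0
    for i, n in enumerate(lengths):
        end = start + n - 1                        # offset of the instruction's last byte
        if start // block_size != end // block_size:
            crossing.append(i)                     # first and last byte are in different blocks
        start += n                                 # next instruction begins right after
    return crossing


crossing = straddlers([3, 7, 7, 2, 5, 9], block_size=16)
print(crossing)
```

Each straddling instruction needs bytes from two fetches before it can be decoded, which is why a buffer between fetch and decode is the usual remedy.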

Page 19: ISA Bus Architecture

Data Dependent Decoding

• What an instruction does depends on the values of the explicit and/or implicit input operands.
 – Cause: generality of instructions
 – Example: string move instructions in x86 generate a different number of loads and stores depending on an input operand value
 – Typical solution: use microcode when executing these instructions
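
The string-move example can be modeled as a loop whose trip count is a run-time operand. The sketch below is a simplified, forward-only byte copy loosely modeled on a repeated string move; the function name and the micro-op tuples are illustrative and do not reflect how any particular processor's microcode is written.

```python
def rep_movsb(memory, src, dst, count):
    """Copy count bytes and return the micro-ops a microcode sequencer would issue."""
    uops = []
    for i in range(count):                  # trip count comes from a register value
        memory[dst + i] = memory[src + i]
        uops.append(("load", src + i))
        uops.append(("store", dst + i))
    return uops


memory = bytearray(b"abcdefgh" + bytes(8))
short_run = rep_movsb(memory, 0, 8, 2)
long_run = rep_movsb(memory, 0, 8, 5)
print(len(short_run), len(long_run))
```

The same instruction bytes produce 4 micro-ops in one case and 10 in the other, so the decoder alone cannot know how much work to emit.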

Page 20: ISA Bus Architecture

Number of Registers

• A large number of registers allows the compiler to eliminate memory references and redundant computation by keeping more values in the register file
 – Cost: more bits to encode register operands
 – Benefit: support for the compiler to achieve high performance
 – MIPS had 32, IA-64 has 128 (levels of metal lines are a factor here)
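
The encoding cost can be put in numbers. The sketch assumes a three-operand instruction (two sources and one destination), which is an illustrative choice rather than something stated on the slide; the register counts are the ones given above.

```python
import math


def register_field_bits(num_registers, operands=3):
    """Bits spent naming registers in one instruction with the given operand count."""
    return operands * math.ceil(math.log2(num_registers))


mips_bits = register_field_bits(32)    # 5 bits per specifier
ia64_bits = register_field_bits(128)   # 7 bits per specifier
print(mips_bits, ia64_bits)
```

Going from 32 to 128 registers costs 6 extra bits per three-operand instruction under this assumption, which no longer leaves much room in a 32-bit instruction word.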

Page 21: ISA Bus Architecture

Compatibility – subtle issues

• Most arise from an incomplete ISA specification
 – Needed to allow extensibility
• User imposed requirements
 – Inappropriate use of ISA
   • Undefined bits being used
• Implementation imposed compatibility
 – Bug compatibility
   • Pentium II had to reproduce the bugs of Pentium

Page 22: ISA Bus Architecture

Today’s Issues

• Where should the DSI be placed? What control is given to the compiler (static) and what is relegated to the hardware (dynamic)?
 – This is becoming a more pressing issue as the power crisis continues to grow
• Information flow across the DSI.
 – Speculation, predication, registers, analysis info
• There is an emerging difference between the target architecture and the implementation architecture.
 – Java, .NET