1 COSC 3P92 Cosc 3P92 Week 8 Lecture slides It is dangerous to be right when the government is wrong. Voltaire (1694 - 1778)


RISC machines

• Reduced instruction set computer (RISC) vs. CISC -- complex ... (680x0, IBM 360, ...)

• CISC technology has evolved highly complex instruction sets, to bridge "semantic gap" between hardware and software

– simplify compilers

– alleviate software crisis

– improve architecture quality

• But has CISC design gone "over the top"?

• If one looks at the software actually being executed, it is typically simple and unsophisticated, and does not exploit the sophisticated features of CISC instruction sets.

• Software studies (all values represent %)

Statement    SAL  XPL  Fortran    C  Pascal  Average
Assignment    47   55       51   38      45       47
If            17   17       10   43      29       23
Call          25   17        5   12      15       15
Loop           6    5        9    3       5        6
Goto           0    1        9    3       0        3
Other          5    5       16    1       6        7

       Assignments    Vars/Proc     Params/Call
N      (N terms)      (N locals)    (N params)
0          -              22            41
1         80              17            19
2         15              20            15
3          3              14             9
4          2               8             7
>=5        0              20             8

• RISC philosophy: create an instruction set that lets you do the most common computations, while maximising their efficiency

• To do this, throw away microprogramming, and aim for instructions which execute in 1 cycle

• RISC chips have many features which are exportable to contemporary CISC chips; also, there are points of contention about design.

• There are also chips which seem to have both CISC and RISC-like qualities.

History

                             IBM 801   RISC I   MIPS
Year                            1980     1982   1983
No. of instructions              120       39     55
Control memory size (Kbit)         0        0      0
Instr. size (bits)                32       32     32
Machine type                register register register

                             IBM 370/168   VAX-11/780   Dorado   iAPX-432
Year                                1973         1978     1978       1982
No. of instructions                  208          303      270        222
Control memory size (Kbit)           420          480      136         64
Instr. size (bits)                 16-48       16-456     8-24      6-321
Machine type                    register     register    stack      stack

History

• The IBM 801 project (1975) was designed with the following principles:

– choose an instruction set to be a good target for a compiler

– provide a hardware engine that can execute its instructions in one machine cycle

– design the storage hierarchy so that the control engine does not have to wait for storage access

– base the entire system design on an optimizing compiler

RISC vs CISC: characteristics

   RISC                                    CISC
1. simple instns taking 1 cycle            complex instns taking multiple cycles
2. only LOADs, STOREs access memory        any instn. may access memory
3. designed around pipeline                designed around instn. set
4. instns. executed by h/w                 instns interpreted by microprogram
5. fixed format instns                     variable format instns
6. few instns and modes                    many instns and modes
7. complexity in the compiler              complexity in the microprogram
8. multiple register sets                  single register set

RISC Design

* Sacrifice everything to reduce the data path cycle time.

* Microcode is not magic.

• Five steps:

1. Find key operations in intended applications.

2. Design optimal data path for these operations.

3. Design instructions which perform these operations on this data path.

4. Add new instructions if they don't slow down machine

5. Repeat for other resources (cache, MMU,...)

Design Issues

• Single-cycle instructions

– key RISC characteristic

– rapid execution of simple instructions

– complex instns will require more compiled code

• Only LOAD and STORE instns access memory

– permits pipelining efficiency

– not as many addressing modes

Load word

Store byte

Store halfword

Store word

Load signed byte

Load unsigned byte

Load signed halfword

Load unsigned halfword

Fig. 8.5 Load and Store instructions for a typical 32-bit machine
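
The difference between the signed and unsigned loads in Fig. 8.5 is only in how the upper bits of the 32-bit register are filled. A minimal sketch, assuming a big-endian byte order and a hypothetical helper name `load` (neither comes from the slides):

```python
def load(memory: bytes, addr: int, size: int, signed: bool) -> int:
    """Return the 32-bit register contents after loading `size` bytes.

    Assumption: big-endian byte order. A signed load copies the sign bit
    into the upper register bits; an unsigned load fills them with zeros.
    """
    raw = memory[addr:addr + size]
    value = int.from_bytes(raw, byteorder="big", signed=signed)
    return value & 0xFFFFFFFF  # keep the 32-bit register pattern


memory = bytes([0x7F, 0x80, 0xFF, 0x01])

signed_byte = load(memory, 1, 1, signed=True)      # Load signed byte
unsigned_byte = load(memory, 1, 1, signed=False)   # Load unsigned byte
signed_half = load(memory, 2, 2, signed=True)      # Load signed halfword
unsigned_half = load(memory, 2, 2, signed=False)   # Load unsigned halfword
```

The same byte 0x80 ends up as 0xFFFFFF80 in the register for the signed load and as 0x00000080 for the unsigned one. The stores need no signed variants because they simply write the low-order bits.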

Design Issues

• Maximal pipelining

– permits n instructions in n cycles

• Problems:

(i) memory accesses take 2 cycles

(ii) jumps ruin pipeline

• Solutions: For (i):

- hardware interlock (wait)

- use incorrect register (the instruction after a LOAD sees the old register value, which means that the compiler needs to correct the situation)

Cycle                   1  2  3  4  5  6  7  8  9  10
Instruction fetch       1  2  L  4  5  6  S  8  9  10
Instruction execution      1  2  L  4  5  6  S  8  9
Memory reference                    L           S

Fig. 8.6 A pipelined RISC machine with delayed LOAD, L, and STORE, S.
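
The compiler-side fix for the delayed LOAD can be sketched as a pass over the instruction list. This is only an illustration under stated assumptions: the tuple layout `(opcode, destination, sources)` and the function name `fill_load_delays` are hypothetical, and the pass simply inserts a NO-OP instead of searching for a useful instruction to move into the slot.

```python
def fill_load_delays(program):
    """Insert a NO-OP after any LOAD whose result is read by the next instruction.

    Each instruction is a hypothetical tuple (opcode, destination, sources).
    On a machine without a hardware interlock, the instruction directly
    after a LOAD would otherwise see the old register value.
    """
    fixed = []
    for index, (opcode, dest, sources) in enumerate(program):
        fixed.append((opcode, dest, sources))
        is_last = index + 1 == len(program)
        if opcode == "LOAD" and not is_last and dest in program[index + 1][2]:
            fixed.append(("NO-OP", None, ()))
    return fixed


program = [
    ("LOAD", "r1", ("x",)),
    ("ADD", "r2", ("r1", "r3")),   # reads r1 right after the LOAD: needs a delay
    ("LOAD", "r4", ("y",)),
    ("SUB", "r5", ("r2", "r3")),   # does not read r4: safe in the delay slot
    ("ADD", "r6", ("r4", "r5")),
]
scheduled = fill_load_delays(program)
```

A real optimising compiler would try to reorder independent instructions into the slot, as the second LOAD/SUB pair here happens to do already, and fall back to a NO-OP only when nothing useful fits.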

Design Issues

– For (ii): need to optimise pipeline (instruction order) at compile time.

• No microcode

– eliminate interpretation, max. data path efficiency.

– frees a lot of chip space

• Fixed format instructions

– simple to decode

Field   OPCODE   C   DEST   SOURCE   I   OFFSET
Bits         7   1      5        5   1       13

C: 0 = Do not set condition codes, 1 = Set condition codes

I: 0 = Not immediate, 1 = Immediate

Fig. 8-7. RISC 1 basic instruction format.
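
A fixed format is simple to decode because every field sits at a known bit position. The sketch below packs and unpacks the six fields of Fig. 8-7 using the listed widths (7, 1, 5, 5, 1, 13). The assumption that the fields are laid out from the most significant bit downward in the order shown, and the field name `imm` for the immediate flag, are mine and not stated on the slide.

```python
# (name, width in bits), in the order listed in Fig. 8-7.
FIELDS = [("opcode", 7), ("c", 1), ("dest", 5), ("source", 5), ("imm", 1), ("offset", 13)]


def encode(**values):
    """Pack the named fields into one 32-bit instruction word."""
    word = 0
    for name, width in FIELDS:
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        word = (word << width) | value
    return word


def decode(word):
    """Split a 32-bit instruction word back into its fields."""
    fields = {}
    shift = 32
    for name, width in FIELDS:
        shift -= width
        fields[name] = (word >> shift) & ((1 << width) - 1)
    return fields


example = {"opcode": 0x12, "c": 1, "dest": 3, "source": 4, "imm": 1, "offset": 100}
word = encode(**example)
decoded = decode(word)
```

Decoding needs only shifts and masks, with no dependence on the opcode, which is what lets the hardware extract all fields in parallel.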

• Reduced instruction set

– because of simple instruction format

– with RISC I, offset can double as operand, yielding 3 operand instructions.

– to effect complex addressing, need to generate code to explicitly construct the addressing

• More compile-time complexity

– compiler technology is the reason that RISC technology is feasible.

– compiled code is executed directly, so compiler must account for delayed instructions, register usage, ...

– lack of sophisticated instructions adds compiler complexity (e.g. Multiply)

• Multiple register sets

– RISC chips have lots of registers (100's!)

– techniques for organizing them.
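
To illustrate the remark above about missing sophisticated instructions such as multiply: when there is no hardware multiply, the compiler (or a runtime routine) has to build it from shifts and adds. A minimal sketch of unsigned 32-bit shift-and-add multiplication; the function name and the 32-bit wrap-around are assumptions for illustration only.

```python
MASK32 = 0xFFFFFFFF


def multiply(a: int, b: int) -> int:
    """Unsigned 32-bit multiply using only shift, add and test-bit steps."""
    a &= MASK32
    b &= MASK32
    product = 0
    while b:
        if b & 1:                      # lowest bit of the multiplier set?
            product = (product + a) & MASK32
        a = (a << 1) & MASK32          # next power-of-two multiple of a
        b >>= 1
    return product
```

Each loop iteration corresponds to a handful of simple single-cycle instructions, so the cost moves from the hardware into the generated code.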

Register Usage

• Need to maximize pipelining and minimise memory access.

• memory traffic in CISC is largely caused by procedure calling

• RISC organises registers to minimize (remove) memory accesses during procedure calls

– overlapping register window organisation

[Figure: the 32 registers, each a 32-bit word, grouped as R0-R7, R8-R15, R16-R24 and R25-R31 and labelled Global Variables, Incoming Parameters, Local Variables and Outgoing Parameters.]

Fig. 8.8 The 32 registers visible to a program at any instant of time.

• CWP - current window pointer

• Output and input register sets double up in usage during procedure calls

• No stack needed UNLESS

– too many parameters

– parameters are too large in value

– too many nested calls cause all registers to be used

• ... in which case standard stack techniques are used.

• Remember: most programs are simple!

• Philosophical point:

– Registers vs Memory?
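
The overlapping-window idea can be sketched as a small simulation. Everything concrete here is an assumption made for illustration: 8 globals, 8 incoming, 8 local and 8 outgoing registers (R0-R7, R8-R15, R16-R23, R24-R31), 8 windows, and the class and method names. The point is only that moving the CWP by one window makes the caller's outgoing registers the callee's incoming registers, with no copying and no memory traffic.

```python
class WindowedRegisters:
    """Toy register file with overlapping windows selected by a CWP."""

    def __init__(self, n_windows: int = 8):
        self.n_windows = n_windows
        self.cwp = 0                                # current window pointer
        self.globals = [0] * 8                      # R0-R7, shared by all windows
        self.file = [0] * (16 * n_windows + 8)      # 16 new registers per window

    def _slot(self, r: int) -> int:
        # R8-R31 of window w map to physical slots 16*w .. 16*w + 23, so the
        # outgoing registers of window w are the incoming ones of window w + 1.
        return 16 * self.cwp + (r - 8)

    def read(self, r: int) -> int:
        if not 0 <= r < 32:
            raise IndexError("register number out of range")
        return self.globals[r] if r < 8 else self.file[self._slot(r)]

    def write(self, r: int, value: int) -> None:
        if not 0 <= r < 32:
            raise IndexError("register number out of range")
        if r < 8:
            self.globals[r] = value
        else:
            self.file[self._slot(r)] = value

    def call(self) -> None:
        if self.cwp + 1 >= self.n_windows:
            raise OverflowError("window overflow: spill to the memory stack")
        self.cwp += 1

    def ret(self) -> None:
        if self.cwp == 0:
            raise OverflowError("window underflow")
        self.cwp -= 1


regs = WindowedRegisters()
regs.write(24, 7)          # caller puts a parameter in an outgoing register
regs.call()
seen_by_callee = regs.read(8)   # callee finds it in an incoming register
regs.write(16, 5)          # callee's own local, invisible to the caller
regs.write(8, 99)          # callee returns a result through the shared register
regs.ret()
result_in_caller = regs.read(24)
```

The `OverflowError` in `call` marks the case listed above where too many nested calls use up all the windows and standard stack techniques take over.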

RISC vs CISC

• Benchmarking computers is difficult

– effects of hardware organisation (I/O, memory mgmt, ...)

– different chip technologies: ECL (emitter coupled logic) vs MOS

– operating system

– language effects: C vs Prolog vs COBOL vs ...

– type of program: recursive vs iterative

• Overlapping register windows:

– not part of MIPS chip

– could it be exported to CISC chips too?

CISC vs RISC

• Compiler writing for RISC

• Delayed JUMP

Normal Branch

    100 LOAD X, A
    101 ADD 1, A
    102 JUMP 105
    103 ADD A, B
    104 SUB C, B
    105 STORE A, Z
    106

Delayed Branch

    100 LOAD X, A
    101 ADD 1, A
    102 JUMP 106
    103 NO-OP
    104 ADD A, B
    105 SUB C, B
    106 STORE A, Z

Optimized Delayed Branch

    100 LOAD X, A
    101 JUMP 105
    102 ADD 1, A
    103 ADD A, B
    104 SUB C, B
    105 STORE A, Z
    106
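
The three listings above can be checked with a toy interpreter. This is a sketch under assumptions that are not spelled out on the slide: operands are read as "source, destination" (so `LOAD X, A` copies memory location X into register A), registers start at 0, and a delayed JUMP takes effect only after the instruction in the following slot has executed. The function name `run` and the dictionary encoding of the programs are hypothetical.

```python
def run(program, delayed, memory):
    """Execute a program and return (final memory, list of executed addresses)."""
    memory = dict(memory)
    regs = {"A": 0, "B": 0, "C": 0}
    executed = []
    pending = None          # target of a delayed JUMP issued one instruction ago
    pc = min(program)

    def value(operand):
        return operand if isinstance(operand, int) else regs[operand]

    while pc in program:
        op, x, y = program[pc]
        executed.append(pc)
        target, pending = pending, None
        if op == "JUMP":
            if delayed:
                pending = x         # the next instruction still executes
            else:
                target = x
        elif op == "LOAD":
            regs[y] = memory[x]
        elif op == "ADD":
            regs[y] += value(x)
        elif op == "SUB":
            regs[y] -= value(x)
        elif op == "STORE":
            memory[y] = regs[x]
        elif op != "NO-OP":
            raise ValueError(f"unknown opcode {op}")
        pc = pc + 1 if target is None else target
    return memory, executed


NORMAL = {100: ("LOAD", "X", "A"), 101: ("ADD", 1, "A"), 102: ("JUMP", 105, None),
          103: ("ADD", "A", "B"), 104: ("SUB", "C", "B"), 105: ("STORE", "A", "Z")}
DELAYED = {100: ("LOAD", "X", "A"), 101: ("ADD", 1, "A"), 102: ("JUMP", 106, None),
           103: ("NO-OP", None, None), 104: ("ADD", "A", "B"), 105: ("SUB", "C", "B"),
           106: ("STORE", "A", "Z")}
OPTIMIZED = {100: ("LOAD", "X", "A"), 101: ("JUMP", 105, None), 102: ("ADD", 1, "A"),
             103: ("ADD", "A", "B"), 104: ("SUB", "C", "B"), 105: ("STORE", "A", "Z")}

start = {"X": 41, "Z": 0}
normal_mem, normal_trace = run(NORMAL, delayed=False, memory=start)
delayed_mem, delayed_trace = run(DELAYED, delayed=True, memory=start)
optimized_mem, optimized_trace = run(OPTIMIZED, delayed=True, memory=start)
```

All three versions store the same value in Z. The delayed version spends one extra instruction on the NO-OP, while the optimized version fills the delay slot with the `ADD 1, A` that had to run anyway.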

• Compilers need to account for:

– memory delays

– jump delays

– register allocation

– simple instruction set

• RISC compilers need to make the best use of registers

– preferable to use all the regs in a single window (and not memory)

• Optimising compilers can do data flow analysis on programs to see when variables are "active"
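
A minimal sketch of the kind of data flow analysis meant here, restricted to straight-line code (no branches): walking the instructions backwards gives, for each one, the set of variables that are still "active" (live) after it. The instruction encoding `(destination, sources)` and the function name are hypothetical.

```python
def live_after(instructions, live_at_exit=()):
    """For each instruction, return the set of variables live just after it.

    Each instruction is a hypothetical pair (destination, sources).
    A variable is live if a later instruction reads it before it is redefined.
    """
    live = set(live_at_exit)
    result = [None] * len(instructions)
    for index in range(len(instructions) - 1, -1, -1):
        dest, sources = instructions[index]
        result[index] = set(live)
        live.discard(dest)       # the old value of dest is dead above this point
        live.update(sources)     # the sources must be available above this point
    return result


# t1 = a + b; t2 = t1 * c; d = t2 + a   (d is needed afterwards)
code = [("t1", ("a", "b")), ("t2", ("t1", "c")), ("d", ("t2", "a"))]
liveness = live_after(code, live_at_exit={"d"})
registers_needed = max(len(live) for live in liveness)
```

Two variables that are never live at the same time can share one register, which is how the compiler tries to keep everything inside a single window.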

Example 1: Pentium II (CISC)

• Recall:

– instruction formats [5.13]

– addressing modes [5.26]

• Instruction set: [5.33]

• CISC instruction set

– design determined by backward compatibility

– superscalar microprocessor tries to “deconstruct” CISC instns into pipelineable microinstructions

– irregular variants for instn type, register usage, addressing modes

– [reference pages]

Example 2: UltraSparc II

• Recall: formats [5.14]

– addressing: either immediate or register

– only load, store access memory

• [5.34]

Example 4: MIPS R4000

• Microprocessor without Interlocked Pipeline Stages

• similarities with UltraSPARC:

– 64 bit design

– LOAD/STORE architecture

– 2^64 byte-addressable memory

– paging, coprocessors, ...

• differences:

– configurable to either Big- or Little-endian (byte ordering in words)

– no windowed register file (no register windows)

– no condition codes: results of tests saved in regs

– 8 stage pipeline

• Generally, MIPS does not give as orthogonal an instruction set to the programmer as SPARC, for the sake of hardware efficiency.
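
The big- versus little-endian choice mentioned above only changes the order in which the bytes of a word are laid out in memory. A short illustration; the example value is arbitrary.

```python
word = 0x0A0B0C0D

# Big-endian: most significant byte at the lowest address.
big = word.to_bytes(4, byteorder="big")

# Little-endian: least significant byte at the lowest address.
little = word.to_bytes(4, byteorder="little")

# Reading the bytes back with the matching order recovers the same word.
from_big = int.from_bytes(big, byteorder="big")
from_little = int.from_bytes(little, byteorder="little")
```

A machine configured one way that reads a word written the other way sees the bytes reversed, which is why the setting matters when data is exchanged between systems.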

• No window file:

– pro:

» with saved space, can fit MMU, cache controller, MUL/DIV on chip

» removes overhead of saving 500 regs when multitasking

» registers not fixed in purpose

– con:

» more (software) overhead in procedure calling

Summary

• Pentium II:

– 2-address, 32-bit CISC

– irregular

• UltraSPARC

– 3-address, 64-bit RISC

– 128-bit bus

– somewhat complex formats

• MIPS

– another 64-bit RISC

SPARC                  vs.   MIPS
orthogonal instns            optimised H/W
register windows             none
software MUL/DIV             hardware MUL/DIV
condition codes              none

Itanium

• P4 has severe problems (IA-32)

– CISC

– 2 Address Memory Oriented ISA

– Small Register Set

» 6 registers

– Lack of Regs requires internal renaming of Regs,

» requires out of order execution to compensate for memory reference waits

» Hence expensive h/w

– Deep pipeline (result of out of order execution),

» Flushing becomes very expensive

– Speculative execution causing traps to set.

• A large portion of the P4 is devoted to dealing with the problems of its CISC architecture.

Itanium.

• Itanium (EPIC, a better RISC)

– Has many functional units, each able to work in parallel.

– 3-address RISC.

– Much of the instruction work (reordering etc.) is moved to the compiler.

– Parallelism of the h/w is known by the compiler

» Take advantage of the h/w, producing efficient code.

– Simple memory model: 2^64 bytes

– 128 registers, reducing memory references

» 32 static

» 96 for a register stack (like the register windows of UltraSPARC III).

– Procedure calls put the call stack on the register stack

» Parameters are placed in registers as part of the call frame.

» Local variables are allocated on the stack by the procedure.

Itanium..

– 128 Floating point registers

– 64 Predicate registers

» Used for conditional branch prediction.

– 8 branch registers

– 128 special-purpose

» Inter application communication.

Itanium…

• Branch Prediction

• Branches are removed by allowing all instructions to execute.

– A Condition sets a predicate bit

– An instruction will write back the result if the predicate is true.

(a) An if statement.

(b) Generic assembly code for a).

(c) A conditional instruction.

Itanium….

• CMOVZ will execute if R1 is 0

• CMOVN will execute if R1 is not 0

• This means:

– Executing a few instrs. is cheaper than a branch.

– Most branches can be eliminated.

» No pipeline problems.
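
A small simulation of how conditional moves replace a branch. The operand order (destination, source, condition register) and the register names in the example are assumptions made for illustration; the slides only state that CMOVZ acts when R1 is 0 and CMOVN when it is not.

```python
def cmovz(regs, dst, src, cond):
    """Conditional move: write src to dst only if regs[cond] is zero."""
    if regs[cond] == 0:
        regs[dst] = regs[src]


def cmovn(regs, dst, src, cond):
    """Conditional move: write src to dst only if regs[cond] is non-zero."""
    if regs[cond] != 0:
        regs[dst] = regs[src]


def select_with_branch(r1, r3, r4):
    # if (R1 == 0) R2 = R3; else R2 = R4;  (the compiled form needs jumps)
    if r1 == 0:
        r2 = r3
    else:
        r2 = r4
    return r2


def select_with_cmov(r1, r3, r4):
    # Both instructions are always issued; exactly one writes its result.
    regs = {"R1": r1, "R2": 0, "R3": r3, "R4": r4}
    cmovz(regs, "R2", "R3", "R1")
    cmovn(regs, "R2", "R4", "R1")
    return regs["R2"]
```

In hardware the condition test is part of the write-back step, so the instruction stream stays strictly sequential and the pipeline never has to guess a branch direction.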

Itanium…..

• Predicate registers are pairs,

– E.g. if P4 is false then P5 is true, and vice versa.

– Any instruction can be predicated.

The end