Chapter 5: Processor Design—Advanced Topics

Computer Systems Design and Architecture by V. Heuring and H. Jordan © 1997 V. Heuring and H. Jordan

Page 1: Chapter 5: Processor Design—Advanced Topics

Chapter 5: Processor Design—Advanced Topics

Topics

5.3 Microprogramming

• Control store and microbranching

• Horizontal and vertical microprogramming

Page 2: Chapter 5: Processor Design—Advanced Topics

Control Unit Implemented as Hardware - Hardwired Control Unit

[Figure: Fig 4.11 Control Unit Detail with Inputs and Outputs. Clocking logic (master clock, Strt, Wait, Done, Enable) drives the control step generator: a counter (CountIn, Reset, Load; 4 bits) followed by the control step decoder, which produces T0, T1, T2, ..., Tn – 1. A decoder on the IR opcode produces one line per instruction (ld, add, br, shc, ...). The control signal encoder combines the step lines, the instruction lines, CON, n = 0, other signals from the data path, and interrupts and other external signals to produce the generated control signals (Gra, PCin, ADD, Rout, PCout, Wait, ...).]

Page 3: Chapter 5: Processor Design—Advanced Topics

Microprogramming: Basic Idea

• Control unit job is to generate the sequence of control signals

• Hardwired approach uses an FSM implemented in hardware to generate these sequences

• An alternate solution is to build a smaller “computer” to perform this function - microcode engine

• Recall control sequence for 1-bus SRC

Step  Concrete RTN            Control Sequence
T0    MA ← PC: C ← PC + 4;    PCout, MAin, INC4, Cin, Read
T1    MD ← M[MA]: PC ← C;     Cout, PCin, Wait
T2    IR ← MD;                MDout, IRin
T3    A ← R[rb];              Grb, Rout, Ain
T4    C ← A + R[rc];          Grc, Rout, ADD, Cin
T5    R[ra] ← C;              Cout, Gra, Rin, End

Page 4: Chapter 5: Processor Design—Advanced Topics

The Microcode Engine

• A computer to generate control signals is much simpler than an ordinary computer

• At the simplest, it just reads the control signals in order from a read-only memory

• The memory is called the control store

• A control store word, or microinstruction, contains a bit pattern telling which control signals are true in a specific step

• The major issue is determining the order in which microinstructions are read
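As a minimal sketch of the simplest engine described above, the Python below stores the six-step control sequence recalled earlier as bit patterns in a list standing in for the control store, then reads the words in address order until End is asserted. The bit position assigned to each signal and the helper names (word, active_signals, run) are assumptions made for illustration; they are not taken from the slides.

```python
# Hypothetical bit assignment: one bit per control signal (order is arbitrary).
SIGNALS = ["PCout", "MAin", "INC4", "Cin", "Read", "Cout", "PCin", "Wait",
           "MDout", "IRin", "Grb", "Rout", "Ain", "Grc", "ADD", "Gra", "Rin", "End"]
BIT = {name: 1 << i for i, name in enumerate(SIGNALS)}


def word(*names):
    """Build one microinstruction: the OR of the bits of the true signals."""
    w = 0
    for name in names:
        w |= BIT[name]
    return w


# Control store holding steps T0..T5 of the control sequence.
CONTROL_STORE = [
    word("PCout", "MAin", "INC4", "Cin", "Read"),  # T0
    word("Cout", "PCin", "Wait"),                  # T1
    word("MDout", "IRin"),                         # T2
    word("Grb", "Rout", "Ain"),                    # T3
    word("Grc", "Rout", "ADD", "Cin"),             # T4
    word("Cout", "Gra", "Rin", "End"),             # T5
]


def active_signals(w):
    """List the signals whose bit is 1 in microinstruction w."""
    return [name for name in SIGNALS if w & BIT[name]]


def run(store):
    """Read microinstructions in address order until End is asserted."""
    trace = []
    upc = 0  # microprogram counter
    while True:
        w = store[upc]
        trace.append(active_signals(w))
        if w & BIT["End"]:
            return trace
        upc += 1


for step, signals in enumerate(run(CONTROL_STORE)):
    print(f"T{step}: {', '.join(signals)}")
```

With purely sequential reading there is no way to share the fetch steps among instructions, which is why the order in which microinstructions are read becomes the main design issue.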

Page 5: Chapter 5: Processor Design—Advanced Topics

Microcode Engine: Basic Implementation

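[Figure: Microcoded Control Unit. The microsequence generator (a small computer) sends a microaddress to the microstore (a memory). The microinstructions read from memory pass through an optional decoder to become the generated control signals (PCout, PCin, Cin, AND, OR, Gra, MAin, ...). Microbranch information, if needed, is fed back from the microstore to the microsequence generator.]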

• Some information in microstore can be used to alter control flow of microsequence generator (microbranches)

• Optional decoder can be used to expand microcode to control larger number of control points (horizontal vs. vertical microcode)

Page 6: Chapter 5: Processor Design—Advanced Topics

Microcode Engine: More Details

• Microinstruction has branch control, branch address, and control signal fields

• Microprogram counter can be set from several sources to do the required sequencing

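[Figure: Fig 5.16 Block Diagram of Microcoded Control Unit. A 4–1 mux selects the next value of the microprogram counter (µPC) from the incrementer, the PLA (which computes the start address from the IR opcode), the branch address field, and an external source; these paths are n bits wide. The µPC addresses the control store, whose m-bit word is held in the microinstruction register (µIR) as k branch control bits, an n-bit branch address, and the control signals (PCout, etc.). The sequencer takes the branch control bits, the clock (Ck), the condition codes (CCs), and other inputs, and drives the 2-bit mux select.]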

Page 7: Chapter 5: Processor Design—Advanced Topics

Parts of the Microprogrammed Control Unit

• Since the control signals are just read from memory, the main function is sequencing

• This is reflected in the several ways the microprogram counter (µPC) can be loaded

• Output of incrementer: µPC + 1

• PLA output: start address for a macroinstruction

• Branch address from the microinstruction

• External source: say for exception or reset

• Micro conditional branches can depend on condition codes, data path state, external signals, etc.

Page 8: Chapter 5: Processor Design—Advanced Topics

Contents of a Microinstruction

• Main component is list of 1/0 control signal values

• There is a branch address in the control store

• There are branch control bits to determine when to use the branch address and when to use µPC + 1

[Figure: Microinstruction format. Fields: Branch control | Control signals (PCout, MAin, PCin, Cout, Ain, ..., End) | Branch address.]

Page 9: Chapter 5: Processor Design—Advanced Topics

Fig 5.17 The Control Store

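[Figure: the control store spans microaddresses 0 to 2^n – 1. Each word is m bits wide: k branch control bits, n branch address bits, and c control signals. The code for instruction fetch comes first, followed by the code for add, the code for br, and the code for shr, at start addresses a1, a2, and a3.]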

• Common instruction fetch sequence

• Separate sequences for each (macro) instruction

• Wide words

Page 10: Chapter 5: Processor Design—Advanced Topics

Tbl 5.2 Control Signals for the add Instruction

• Addresses 101–103 are the instruction fetch

• Addresses 200–202 do the add

• Change of control from 103 to 200 uses a kind of branch

[Table: one row per microaddress (101, 102, 103, 200, 201, 202) and one column per control signal, with each entry 1 or 0.]

Page 11: Chapter 5: Processor Design—Advanced Topics

Uses for branching in the Microprogrammed Control Unit

• (1) Branch to start of code for a specific inst.

• (2) Conditional control signals, e.g. CON → PCin

• (3) Looping on conditions, e.g. n ≠ 0 → ... Goto6

• Conditions will control branches instead of being ANDed with control signals

• Microbranches are frequent and control store addresses are short, so it is reasonable to have a branch address field in every instruction

Page 12: Chapter 5: Processor Design—Advanced Topics

Illustration of branching Control Logic

• We illustrate a branching control scheme by a machine having condition code bits N and Z

• Branch control has 2 parts:

• (1) selecting the input applied to the microprogram counter (µPC) and

• (2) specifying whether this input or µPC + 1 is used

• We allow 4 possible inputs to the µPC

• The incremented value µPC + 1

• The PLA lookup table for the start of a macroinstruction

• An externally supplied address

• The branch address field in the microinstruction word

Page 13: Chapter 5: Processor Design—Advanced Topics

Fig 5.18 Branching Controls in the Microcoded Control Unit

• 5 branch conditions

• NotN

• N

• NotZ

• Z

• Unconditional

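• To 1 of 4 places

• Next instruction

• PLA

• External address

• Branch address

[Figure: the control store word supplies the control signals, a 2-bit mux control field, the branch control bits BrUn, BrNotZ, BrZ, BrNotN, and BrN, and a branch address. The sequencer combines the branch control bits with the condition codes Z and N to drive the select lines of a 4–1 mux, which loads the µPC from the incrementer, the PLA, the external address, or the branch address.]

Mux Ctl  Select
00       Increment µPC
01       PLA
10       External address
11       Branch address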

Page 14: Chapter 5: Processor Design—Advanced Topics

Some Possible branches Using the Illustrated Logic (Refer to Tbl 5.3)

• If the control signals are all zero, the instruction only does a test

• Otherwise test is combined with data path activity

Mux Ctl  BrUn  BrNotZ  BrZ  BrNotN  BrN  Control Signals  Branch Address  Branching action
00       0     0       0    0       0    • • •            XXX             None: next instruction
01       1     0       0    0       0    • • •            XXX             Branch to output of PLA
10       0     0       1    0       0    • • •            XXX             Br if Z to Extern. Addr.
11       0     0       0    0       1    • • •            300             Br if N to 300 (else next)
11       0     0       0    1       0    0 • • • 0        206             Br if NotN to 206 (else next)
11       1     0       0    0       0    • • •            204             Br to 204
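The branching actions listed above can be reproduced with a small next-address function. The Python below is a sketch under stated assumptions: the mux control codes (00 increment, 01 PLA, 10 external address, 11 branch address) and the five branch-control names come from Fig 5.18, but the exact gating inside the sequencer is not shown in the slides, so the rule used here (take the selected input when any asserted branch condition is satisfied, otherwise increment) is an assumption. The function name next_upc and its parameters are hypothetical, and addresses are treated as plain integers.

```python
def next_upc(upc, mux_ctl, br, n, z, pla=0, external=0, branch_addr=0):
    """Return the next microprogram counter value.

    mux_ctl: 2-bit select (0b00 increment, 0b01 PLA,
             0b10 external address, 0b11 branch address).
    br:      set of asserted branch-control bits, drawn from
             {"BrUn", "BrNotZ", "BrZ", "BrNotN", "BrN"}.
    n, z:    condition-code bits (0 or 1).
    """
    # Assumed sequencer rule: branch if any asserted condition holds.
    take = (
        "BrUn" in br
        or ("BrZ" in br and z == 1)
        or ("BrNotZ" in br and z == 0)
        or ("BrN" in br and n == 1)
        or ("BrNotN" in br and n == 0)
    )
    if not take:
        return upc + 1
    sources = {0b00: upc + 1, 0b01: pla, 0b10: external, 0b11: branch_addr}
    return sources[mux_ctl]


# No branch bits set: fall through to the next microinstruction.
print(next_upc(150, 0b00, set(), n=0, z=0))
# Unconditional branch to the PLA output (here a hypothetical start address 200).
print(next_upc(102, 0b01, {"BrUn"}, n=0, z=0, pla=200))
# Branch if N to 300, else next.
print(next_upc(150, 0b11, {"BrN"}, n=1, z=0, branch_addr=300))
print(next_upc(150, 0b11, {"BrN"}, n=0, z=0, branch_addr=300))
```

Because the condition test lives in the sequencer, a microinstruction whose control signals are all zero still performs useful work: it only steers the µPC.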

Page 15: Chapter 5: Processor Design—Advanced Topics

Horizontal versus Vertical Microcode Schemes

• In horizontal microcode, each control signal is represented by a bit in the instruction

• In vertical microcode, a set of true control signals is represented by a shorter code

• The name horizontal implies fewer control store words of more bits per word

• Vertical code only allows RTs in a step for which there is a vertical instruction code

• Thus vertical code may take more control store words of fewer bits

Page 16: Chapter 5: Processor Design—Advanced Topics

Fig 5.19 A Somewhat Vertical Encoding

• Scheme would save (16 + 7) - (4 + 3) = 16 bits/word in the case illustrated

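[Figure: two encoded fields of the microinstruction register (µIR), labeled F5 and F8, are expanded by decoders. The 4-bit ALU ops field feeds a 4–16 decoder that produces 16 ALU control signals; the 3-bit register-out field feeds a 3–8 decoder that produces 7 Regout control signals.]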
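The decoding step in Fig 5.19 can be sketched in a few lines of Python. The function decode below is a hypothetical helper that models a k-to-2^k decoder as a one-hot list; treating the eighth output of the 3–8 decoder as "no register-out signal" is an assumption made here, since the figure only says that 7 of the outputs are used.

```python
def decode(field_value, width):
    """Model a k-to-2^k decoder: return 2**width lines, exactly one of them 1."""
    if not 0 <= field_value < (1 << width):
        raise ValueError("field value does not fit in the field width")
    return [1 if i == field_value else 0 for i in range(1 << width)]


# 4-bit ALU ops field -> 16 ALU control lines (line 3 asserted here).
alu_lines = decode(0b0011, 4)

# 3-bit register-out field -> 8 lines; 7 are used as Regout signals and
# the remaining code is assumed to mean "no register drives the bus".
regout_lines = decode(0b010, 3)

# Bits saved per control store word, as computed on the slide.
saved = (16 + 7) - (4 + 3)

print(alu_lines)
print(regout_lines)
print(saved)
```

The price of the saving is that only one signal per field can be true in any step, which is acceptable when the signals in a field are mutually exclusive anyway.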

Page 17: Chapter 5: Processor Design—Advanced Topics

Fig 5.20 Completely Horizontal and Vertical Microcoding

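[Figure: in completely horizontal microcoding, the µPC addresses a horizontal control store whose output bits are the control signals themselves (PCout, MAin, Inc4, Cin, ...). In completely vertical microcoding, the µPC addresses a vertical control store whose n-bit output feeds an n-to-2^n decoder, and the decoder outputs drive the data path signals (PCout, MAin, Inc4, Cin, ...).]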

Page 18: Chapter 5: Processor Design—Advanced Topics

Saving Control Store Bits with Horizontal Microcode

• Some control signals cannot possibly be true at the same time

• One and only one ALU function can be selected

• Only one register out gate can be true with a single bus

• Memory read and write cannot be true at the same step

• A set of m such signals can be encoded using log2(m) bits (log2(m + 1) to allow for no signal true), rounded up to a whole number of bits

• The raw control signals can then be generated by a k-to-2^k decoder, where 2^k ≥ m (or 2^k ≥ m + 1)

• This is a compromise between horizontal and vertical encoding
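The field-width rule above is easy to check numerically. The following Python sketch computes the number of bits needed for a group of m mutually exclusive signals, with or without a spare code for "no signal true". The function name field_width and the group sizes in the example are assumptions chosen for illustration (16 ALU functions with exactly one selected, 7 register-out gates, and a read/write pair).

```python
def field_width(m, allow_none=True):
    """Bits needed to encode m mutually exclusive control signals.

    Equivalent to ceil(log2(m)), or ceil(log2(m + 1)) when one extra
    code is reserved to mean that no signal in the group is true.
    """
    codes = m + 1 if allow_none else m
    return (codes - 1).bit_length()  # integer form of ceil(log2(codes))


# Hypothetical groups: (number of signals, whether a "none" code is needed).
groups = {
    "ALU function": (16, False),      # one and only one function selected
    "register out": (7, True),        # at most one out gate on a single bus
    "memory read/write": (2, True),   # read and write never in the same step
}

horizontal_bits = sum(m for m, _ in groups.values())
encoded_bits = sum(field_width(m, none) for m, none in groups.values())

for name, (m, none) in groups.items():
    print(f"{name}: {m} signals -> {field_width(m, none)} bits")
print(f"horizontal: {horizontal_bits} bits, encoded: {encoded_bits} bits")
```

Each encoded field still needs a k-to-2^k decoder between the control store and the data path, so the saving in control store bits is traded for a small amount of decoding logic and delay.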

Page 19: Chapter 5: Processor Design—Advanced Topics

A Microprogrammed Control Unit for the 1-Bus SRC

• Using the 1-bus SRC data path design gives a specific set of control signals

• There are no condition codes, but data path signals CON and n = 0 will need to be tested

• We will use branches BrCON, Brn=0, and Brn≠0

• We adopt the clocking logic of Fig. 4.14

• Logic for exception and reset signals is added to the microcode sequencer logic

• Exception and reset are assumed to have been synchronized to the clock

Page 20: Chapter 5: Processor Design—Advanced Topics

Tbl 5.4 The add Instruction

• Microbranching to the output of the PLA is shown at 102

• Microbranch to 100 at 202 starts next fetch

Addr.  Mux Ctl  Other Control Signals  Br Addr.  Actions
100    00       • • •                  XXX       MA ← PC: C ← PC + 4;
101    00       • • •                  XXX       MD ← M[MA]: PC ← C;
102    01       • • •                  XXX       IR ← MD; µPC ← PLA;
200    00       • • •                  XXX       A ← R[rb];
201    00       • • •                  XXX       C ← A + R[rc];
202    11       • • •                  100       R[ra] ← C: µPC ← 100;
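To make the flow of control through Tbl 5.4 concrete, the Python sketch below steps a microprogram counter through the six microinstructions of the add instruction: three fetch steps at 100 to 102, a microbranch to the PLA output, three execute steps at 200 to 202, and a microbranch back to 100. The control signal bits are omitted and each word is reduced to a mux control code, an unconditional-branch flag, a branch address, and the action text. The PLA contents (add mapping to 200) and all names in the code are assumptions for illustration.

```python
# Each entry: (mux_ctl, branch_unconditionally, branch_addr, action).
# mux_ctl: 0b00 increment, 0b01 PLA output, 0b11 branch address field.
CONTROL_STORE = {
    100: (0b00, False, None, "MA <- PC: C <- PC + 4"),
    101: (0b00, False, None, "MD <- M[MA]: PC <- C"),
    102: (0b01, True, None, "IR <- MD"),
    200: (0b00, False, None, "A <- R[rb]"),
    201: (0b00, False, None, "C <- A + R[rc]"),
    202: (0b11, True, 100, "R[ra] <- C"),
}

# Hypothetical PLA: maps a macroinstruction opcode to its start microaddress.
PLA = {"add": 200}


def run_one_instruction(opcode, start=100):
    """Trace the microaddresses visited for one macroinstruction."""
    upc = start
    trace = []
    while True:
        mux_ctl, br_un, br_addr, action = CONTROL_STORE[upc]
        trace.append((upc, action))
        if not br_un:
            upc += 1                 # no branch: next microinstruction
        elif mux_ctl == 0b01:
            upc = PLA[opcode]        # microbranch to the PLA output
        elif mux_ctl == 0b11:
            upc = br_addr            # microbranch to the branch address
        else:
            raise ValueError("unsupported mux control in this sketch")
        if upc == start:             # back at the fetch code: instruction done
            return trace


for addr, action in run_one_instruction("add"):
    print(addr, action)
```

Adding another macroinstruction in this model means adding its microcode at a new start address and one more PLA entry; the fetch code at 100 to 102 is shared.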

Page 21: Chapter 5: Processor Design—Advanced Topics

Getting the PLA Output in Time for the Microbranch

• For the input to the PLA to be correct for the branch in 102, it has to come from MD, not IR

• An alternative is to use see-through latches for IR so the opcode can pass through IR to PLA before the end of the clock cycle

Page 22: Chapter 5: Processor Design—Advanced Topics

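See-Through Latch Hardware for IR So µPC Can Load Immediately

[Figure: the 5 opcode bits IR31..27 are held in see-through D latches fed from the Bus. The latched opcode drives the PLA, whose 10-bit output is loaded into µPC9..0 under control of strobe S. Points P and R are marked along this path. Timing diagram rows: clock cycle, strobe S, Bus (valid data), data at P (valid), data at R (valid); the intervals are labeled bus delay, latch delay, and PLA delay, and the PLA output is strobed into the µPC at the end.]

• Data must have time to get from MD across the Bus, through IR, through the PLA, and satisfy the µPC setup time before the trailing edge of S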
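The timing requirement above amounts to a simple inequality: bus delay + latch delay + PLA delay + µPC setup time must not exceed the time at which the trailing edge of S occurs. The Python sketch below checks that budget. All delay values are hypothetical numbers in nanoseconds chosen only to exercise the check, and time is assumed to be measured from the moment MD begins driving the Bus.

```python
def latch_path_slack(bus_delay, latch_delay, pla_delay, upc_setup, strobe_edge):
    """Return the timing slack (same units as the inputs).

    strobe_edge is the time of the trailing edge of S, measured from the
    moment MD starts driving the Bus. Negative slack means the PLA output
    is not stable early enough to be strobed into the uPC.
    """
    data_valid_at_pla_output = bus_delay + latch_delay + pla_delay
    return strobe_edge - (data_valid_at_pla_output + upc_setup)


def latch_path_ok(bus_delay, latch_delay, pla_delay, upc_setup, strobe_edge):
    """True when the see-through path meets the uPC setup time."""
    return latch_path_slack(bus_delay, latch_delay, pla_delay,
                            upc_setup, strobe_edge) >= 0


# Hypothetical delays in nanoseconds.
print(latch_path_ok(bus_delay=3, latch_delay=1, pla_delay=4,
                    upc_setup=1, strobe_edge=10))   # path fits
print(latch_path_ok(bus_delay=3, latch_delay=1, pla_delay=6,
                    upc_setup=1, strobe_edge=10))   # PLA too slow
```

If the budget cannot be met, the alternatives are a longer clock period or taking the PLA input from a point that is valid earlier in the cycle.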

Page 23: Chapter 5: Processor Design—Advanced Topics

Fig 5.21 SRC Microcode Sequencer

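[Figure: the sequencer takes the branch control bits BrUn, BrCON, Brn≠0, Brn=0, and End from the microinstruction, the data path signals CON and n = 0, and the Exception and Reset signals, and drives the 2-bit mux control. A 4–1 mux loads the 10-bit µPC from the incremented µPC, the PLA, the external address, or the branch address. A 2–1 mux with inputs 000 and 400 supplies the external address.]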

Page 24: Chapter 5: Processor Design—Advanced Topics

Tbl 5.6 Somewhat Vertical Encoding of the SRC Microinstruction

Field  Name            Width    Encoding
F1     Mux Ctl         2 bits   00, 01, 10, 11
F2     Branch control  3 bits   000 BrUn, 001 BrNotCON, 010 BrCON, 011 Br n=0, 100 Br n≠0, 101 None
F3     End             1 bit    0 Cont., 1 End
F4     Out signals     3 bits   000 PCout, 001 Cout, 010 MDout, 011 Rout, 100 BAout, 101 c1out, 110 c2out, 111 None
F5     In signals      3 bits   000 MAin, 001 PCin, 010 IRin, 011 Ain, 100 Rin, 101 MDin, 110 None
F6     Misc.           3 bits   000 Read, 001 Wait, 010 Ld, 011 Decr, 100 CONin, 101 Cin, 110 Stop, 111 None
F7     Gate regs.      2 bits   00 Gra, 01 Grb, 10 Grc, 11 None
F8     ALU             4 bits   0000 ADD, 0001 C=B, 0010 SHR, 0011 Inc4, • • •, 1111 NOT
F9     Branch address  10 bits
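The field widths in Tbl 5.6 add up to 2 + 3 + 1 + 3 + 3 + 3 + 2 + 4 + 10 = 31 bits per microinstruction. The Python sketch below packs and unpacks such a word. The field names, the choice of F1 as the most significant field, and the example values are assumptions; only the widths and the individual codes (for example 011 for Rout, 011 for Ain, 01 for Grb, 101 for no branch) are taken from the table.

```python
# (field name, width in bits), listed from F1 to F9.
# Placing F1 in the most significant bits is an assumption of this sketch.
FIELDS = [
    ("mux_ctl", 2), ("branch_control", 3), ("end", 1),
    ("out_signals", 3), ("in_signals", 3), ("misc", 3),
    ("gate_regs", 2), ("alu", 4), ("branch_addr", 10),
]
WORD_BITS = sum(width for _, width in FIELDS)


def pack(values):
    """Pack a dict of field values into one microinstruction word."""
    word = 0
    for name, width in FIELDS:
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        word = (word << width) | value
    return word


def unpack(word):
    """Split a microinstruction word back into its fields."""
    values = {}
    for name, width in reversed(FIELDS):
        values[name] = word & ((1 << width) - 1)
        word >>= width
    return values


# Example field values for a step that gates R[rb] onto the bus and loads A.
example = {
    "mux_ctl": 0b00,          # increment the microprogram counter
    "branch_control": 0b101,  # None
    "end": 0,                 # Cont.
    "out_signals": 0b011,     # Rout
    "in_signals": 0b011,      # Ain
    "misc": 0b111,            # None
    "gate_regs": 0b01,        # Grb
    "alu": 0b0000,            # placeholder: the table lists no "none" ALU code
    "branch_addr": 0,         # unused when no branch is taken
}

word = pack(example)
print(WORD_BITS, format(word, f"0{WORD_BITS}b"))
print(unpack(word))
```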

Page 25: Chapter 5: Processor Design—Advanced Topics

Other Microprogramming Issues

• Multiway branches: often an instruction can have 4–8 cases, say address modes

• Could take 2–3 successive branches, i.e. clock pulses

• The bits selecting the case can be ORed into the branch address of the instruction to get a several way branch

• Say if 2 bits were ORed into the 3rd and 4th bits from the low end, 4 possible addresses ending in 0000, 0100, 1000, and 1100 would be generated as branch targets

• Advantage is a multiway branch in one clock

• A hardware push-down stack for the PC can turn repeated sequences into subroutines

• Vertical code can be implemented using a horizontal engine, sometimes called nanocode
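The multiway branch described above, where case-selecting bits are ORed into the branch address, can be illustrated with a short Python sketch. The function name multiway_target and the base address are hypothetical; the base is chosen with its 3rd and 4th bits from the low end (bit positions 2 and 3) clear, so that ORing in 2 case bits yields four distinct targets.

```python
def multiway_target(branch_addr, case_bits, shift=2, width=2):
    """OR `width` case-selecting bits into branch_addr starting at bit `shift`."""
    mask = (1 << width) - 1
    return branch_addr | ((case_bits & mask) << shift)


# Hypothetical 10-bit branch address whose low four bits are 0000.
base = 0b0100110000

targets = [multiway_target(base, case) for case in range(4)]
for case, target in enumerate(targets):
    print(f"case {case:02b} -> {target:010b}")
```

The four targets end in 0000, 0100, 1000, and 1100, so each case gets room for up to four microinstructions before the next case begins, and the selection costs a single clock.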