A3 Computer Architecture - University of Oxforddwm/Courses/3CO_2000/3CO-L3.pdf · 2012-03-24 ·...

Computer Architecture MT 2011 A3 Computer Architecture Engineering Science 3rd year A3 Lectures Prof David Murray [email protected] www.robots.ox.ac.uk/dwm/Courses/3CO Michaelmas 2000 1/1


Computer Architecture MT 2011

A3 Computer Architecture

Engineering Science

3rd year A3 Lectures

Prof David Murray

[email protected]
www.robots.ox.ac.uk/∼dwm/Courses/3CO

Michaelmas 2000


3. The Control Unit

3A3 Michaelmas 2000


Last time we saw ...

1. that computation required transferring words
   - from memory to register
   - between registers
   - sometimes via the ALU
   - then back to memory.

2. how transfers of information in our BSA could be described using assembler mnemonics which described several lines of RTL

3. how the Instruction Fetch sequence was described, and how the LDA operation was executed.

4. an 8-membered toy instruction set

5. that the Control Unit (CU) is responsible for generating the CSLs and CSPs


In this lecture we look at ...

the RTL for the remaining instructions

a couple of ways of implementing the CU: the random logic controller. (There is nothing random about it; ‘ad hoc’ would be a better word. The result is fast but inflexible.)

In the first Tute sheet you get to think about the D-type latch controller.


Inputs to and Outputs from the CU

Before starting to design the CU, it is useful to define what the inputs are and what the outputs are.

The inputs are:
1. Any one of the 8 defined 8-bit opcodes corresponding to our instruction set. (Of course there could be 256 of them, but ...)
2. The status register, in particular the Z bit.
3. The central clock signal.

The outputs are:
1. the CSL levels, which configure pathways
2. the CSP pulses, which drive the register transfers


Details of instruction fetch and execution

A reminder about the instruction fetch sequence for any instruction.

FETCH
  MAR ← PC
  MBR ← 〈MAR〉
  IR ← MBR; PC ← PC + 1
  (CU ← IR (opcode) for decoding)

Instr   Description              Opcode     Meaning
LDA x   AC ← 〈x〉                00000000   Load AC with contents at x
STA x   〈x〉 ← AC                001        Store AC in mem at memloc x
ADD x   AC ← AC + 〈x〉           010        Add mem contents at x to AC
AND x   AC ← AC ∧ 〈x〉           011        Logical and ...
JMP x   PC ← x                   100        Jump to instruction at memloc x
BEQ x   if Z-flag=1 then PC ← x  101        If Z-flag is set then jump
CMP     AC ← A̅C̅                  110        Two’s complement the AC
RSH     AC ← RightShift(AC)      111        Shift the AC 1 bit to right

We now develop detailed RTL for each of these instructions.
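The fetch sequence can be traced in a few lines of Python. This is only a sketch: the register dictionary, the list-based memory and the `(opcode, address)` tuple used for an instruction word are illustrative assumptions, not part of the BSA definition.

```python
# Hypothetical model: registers live in a dict, memory is a list, and an
# instruction word is an (opcode, address) tuple. All names are illustrative.
def fetch(regs, memory):
    """Apply the fetch steps in order and return the opcode handed to the CU."""
    regs["MAR"] = regs["PC"]             # MAR <- PC
    regs["MBR"] = memory[regs["MAR"]]    # MBR <- <MAR>
    regs["IR"] = regs["MBR"]             # IR <- MBR
    regs["PC"] = regs["PC"] + 1          # PC <- PC + 1 (same step as IR <- MBR)
    opcode, _address = regs["IR"]
    return opcode                        # CU <- IR(opcode) for decoding


memory = [("LDA", 2), ("STA", 3), 7, 0]
regs = {"PC": 0, "MAR": 0, "MBR": 0, "IR": None}
first_opcode = fetch(regs, memory)
```

After the call, `regs["PC"]` already points at the next instruction, which is why the same fetch sequence serves every opcode.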


Execution of LDA x and STA x

[Figure: BSA block diagram showing Memory, MAR, MBR, ALU, AC, CU, IR (opcode and address fields), PC, SP, Status register, Inc(PC) and the control lines]

LDA x
This is to read memory location x, and put its contents into the AC.

  MAR ← IR (address)
  MBR ← 〈MAR〉
  AC ← MBR

STA x
This is the reverse, where a memory write takes place to store the accumulator at location x.

  MAR ← IR (address)
  MBR ← AC
  〈MAR〉 ← MBR
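As a rough illustration, the two execute sequences map line for line onto assignments. The dictionary keys (including `IR_address` for the address field of the IR) and the list-based memory are assumptions made for this sketch only.

```python
# Sketch only: one Python statement per line of RTL. The register names in the
# dict and the "IR_address" key are hypothetical stand-ins for the BSA registers.
def lda(regs, memory):
    regs["MAR"] = regs["IR_address"]     # MAR <- IR(address)
    regs["MBR"] = memory[regs["MAR"]]    # MBR <- <MAR>   (memory read)
    regs["AC"] = regs["MBR"]             # AC  <- MBR


def sta(regs, memory):
    regs["MAR"] = regs["IR_address"]     # MAR <- IR(address)
    regs["MBR"] = regs["AC"]             # MBR <- AC
    memory[regs["MAR"]] = regs["MBR"]    # <MAR> <- MBR   (memory write)


memory = [0, 0, 42, 0]
regs = {"AC": 0, "MAR": 0, "MBR": 0, "IR_address": 2}
lda(regs, memory)                        # load the word at location 2
regs["IR_address"] = 3
sta(regs, memory)                        # store it again at location 3
```

Note that both sequences start with the same transfer; only the direction of the MBR traffic differs.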


Execution of ADD x and AND x

[Figure: BSA block diagram showing Memory, MAR, MBR, ALU, AC, CU, IR (opcode and address fields), PC, SP, Status register, Inc(PC) and the control lines]

ADD x
This instruction involves the ALU. To perform this line of RTL we must, before firing the transfer, set up the ALU to perform the binary addition of its inputs.

  MAR ← IR (address)
  MBR ← 〈MAR〉
  AC ← AC + MBR

AND x
This again involves the ALU, but now we are performing a logical AND.

  MAR ← IR (address)
  MBR ← 〈MAR〉
  AC ← AC ∧ MBR


Unconditional and conditional branching

[Figure: BSA block diagram showing Memory, MAR, MBR, ALU, AC, CU, IR (opcode and address fields), PC, SP, Status register, Inc(PC) and the control lines]

JMP x
This instruction allows an unconditional branching to a non-consecutive instruction.

  PC ← IR (address)

BEQ x
Branch if EQual allows conditional branching. The condition here is the Z flag, which is set Z=1 by the ALU when it makes a calculation whose result is zero. If the condition is false (ie Z=0) the PC needs no alteration.

  Sn     → (Z̅)/(Sn+2)        (if Z=0, skip to step Sn+2)
  Sn+1   PC ← IR (address)
  Sn+2   etc
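The conditional step can be sketched as a tiny step sequencer. The sketch assumes the reading given in the text, namely that when Z=0 the sequence skips from Sn straight to Sn+2 so the PC is left alone; the step offsets 0, 1, 2 and the dictionary keys are illustrative.

```python
# Sketch of the BEQ execute steps, using offsets 0, 1, 2 for Sn, Sn+1, Sn+2.
# Assumption: step Sn skips to Sn+2 when Z = 0, so the PC load is bypassed.
def beq_execute(regs, z_flag):
    """Run the BEQ steps and return the list of step offsets actually visited."""
    visited = []
    step = 0
    while step < 2:
        visited.append(step)
        if step == 0:
            step = 2 if z_flag == 0 else 1      # Sn: conditional skip
        else:
            regs["PC"] = regs["IR_address"]     # Sn+1: PC <- IR(address)
            step = 2
    return visited


taken = {"PC": 5, "IR_address": 20}
not_taken = {"PC": 5, "IR_address": 20}
steps_taken = beq_execute(taken, z_flag=1)
steps_not_taken = beq_execute(not_taken, z_flag=0)
```

With Z=1 both steps run and the PC is overwritten; with Z=0 only Sn runs and the PC keeps the incremented value left by the fetch.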


Execution of Complement and RightShift

[Figure: BSA block diagram showing Memory, MAR, MBR, ALU, AC, CU, IR (opcode and address fields), PC, SP, Status register, Inc(PC) and the control lines]

CMP
Complements the contents of the AC. An ALU operation, so again we need a level to configure the ALU.

  AC ← A̅C̅

RSH
This involves use of a shifter. The shifter is placed at the back end of the ALU.

  AC ← RightShift(AC)
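A small sketch of the two accumulator-only operations follows. Several things here are assumptions rather than statements from the slides: the 8-bit word width, the choice of two's complement for CMP (following the "Meaning" column of the instruction table), and a logical shift that fills the top bit with 0.

```python
WORD_BITS = 8                      # assumed word width, for illustration only
MASK = (1 << WORD_BITS) - 1


def cmp_ac(ac):
    """Two's complement of AC within the word (assumed meaning of CMP)."""
    return (-ac) & MASK


def rsh_ac(ac):
    """Shift AC one bit to the right; assumes a logical shift (0 enters at the top)."""
    return (ac & MASK) >> 1


negated_one = cmp_ac(1)            # 0b11111111
shifted = rsh_ac(0b10110)          # 0b01011
```

Neither operation touches memory, which is why both instructions need only one execute step.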


What control signals are required?

We have now specified the opcode inputs, and the sequence of actions required.
Each action (= line of RTL) requires the generation of a CSL and CSP.
As the CSP is generated from the CSL (+ clock) we can just think about a CSL.

CS    Action
C0    AC ← AC + MBR
C1    AC ← AC ∧ MBR
C2    AC ← A̅C̅
C3    MBR ← M〈MAR〉
C4    M〈MAR〉 ← MBR
C5    MBR ← AC
C6    AC ← MBR
C7    MAR ← IR (address)
C8    PC ← IR (address)
C9    PC ← PC + 1 (or INC(PC))
C10   MAR ← PC
C11   IR ← MBR
C12   AC ← RightShift(AC)


What signals are required /ctd

The table is:

Binary →  0000  0001  0010  0011  0100  0101         0110  0111
Phase ↓   LDA   STA   ADD   AND   JMP   BEQ          CMP   RSH
0         C10 (all instructions)
1         C3 (all instructions)
2         C11;C9 (all instructions)
3         C7    C7    C7    C7    C8    if Z=1, C8   C2    C12
4         C3    C5    C3    C3    -     -            -     -
5         C6    C4    C0    C1    -     -            -     -

The longest command now takes 6 steps, but there are some that take fewer.
Remember that the fetch cycle is the same for all.
Where from? For LDA ...

Fetch     MAR ← PC                 C10
          MBR ← 〈MAR〉             C3
          IR ← MBR; PC ← PC + 1    C11;C9

Execute   MAR ← IR (address)       C7
          MBR ← 〈MAR〉             C3
          AC ← MBR                 C6
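One way to make the table concrete is to hold it as data and read off the control signals for a given instruction. The sketch below transcribes the table; the names `FETCH`, `EXECUTE` and `control_sequence` are invented for this example.

```python
# The phase table held as data: one list of control signals per phase.
FETCH = [["C10"], ["C3"], ["C11", "C9"]]          # phases 0-2, same for all
EXECUTE = {                                        # phases 3 onwards
    "LDA": [["C7"], ["C3"], ["C6"]],
    "STA": [["C7"], ["C5"], ["C4"]],
    "ADD": [["C7"], ["C3"], ["C0"]],
    "AND": [["C7"], ["C3"], ["C1"]],
    "JMP": [["C8"]],
    "BEQ": [["C8"]],                               # asserted only when Z = 1
    "CMP": [["C2"]],
    "RSH": [["C12"]],
}


def control_sequence(instr, z=0):
    """Return the control signals asserted in each phase for one instruction."""
    execute = EXECUTE[instr]
    if instr == "BEQ" and z == 0:
        execute = [[]]                             # phase 3 happens, nothing fires
    return FETCH + execute


lda_sequence = control_sequence("LDA")
jmp_sequence = control_sequence("JMP")
```

Here `lda_sequence` has six phases and `jmp_sequence` only four, matching the remark that some commands take fewer steps.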


Implementation of the CU in hardware

Several possibilities for hardware implementation:

1. PLA
2. Discrete components
3. D-type latch loop (Tute Sheet)
4. PROM (Lecture 5)

Several needs are shared.

1. Need a clock
2. Need to decode the opcode between fetch and execute
3. Need to worry about flag inputs (from the ALU via Status Register)


The clock

Convention demands that the CPU clock tick once per instruction.
But the longest fetch-execute cycle (so far) takes 6 lines of RTL.
Hence subdivide it into subphases, φn.
Here we have up to 8 subphases ... could reset it after 6.

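[Figure: multiphase clock. Three JK flip-flops (J = K = 1, with CLR inputs) are clocked from CK and their outputs decoded into the subphases φ0 ... φ7; a reset returns the clock to state φ0. Timing diagram of φ0, φ1, φ2, φ3, φ4 and so on.]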
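The behaviour of the multiphase clock can be sketched as a 3-bit counter whose state is decoded into one-hot subphase lines. This is a behavioural model under stated assumptions, not the flip-flop circuit itself: the class name is invented, and the reset is sampled on the clock tick rather than acting through CLR inputs.

```python
class MultiphaseClock:
    """Sketch: a 3-bit count decoded into eight one-hot subphase lines."""

    def __init__(self):
        self.count = 0                       # reset state corresponds to phi0

    def phases(self):
        # Exactly one of the eight subphase lines is 1 at any time
        return [int(i == self.count) for i in range(8)]

    def tick(self, reset=False):
        # Assumption: reset is sampled on the clock tick in this model
        self.count = 0 if reset else (self.count + 1) % 8


clock = MultiphaseClock()
visited = []
for _ in range(8):
    visited.append(clock.count)
    clock.tick(reset=(clock.count == 5))     # restart after the sixth subphase
```

Resetting when the count reaches 5 gives the six-subphase cycle needed by the longest instructions, so φ6 and φ7 are never visited.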


Decoding the opcodes

For clarity, let us produce a single output line for each instruction, by decoding the opcode.
Here’s a 3-to-8 line decoder ...

[Figure: 3-to-8 line decoder. Opcode bits Bit2, Bit1 and Bit0 are decoded into one output line per instruction: LDA, STA, ADD, AND, and so on.]
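A 3-to-8 decoder is easy to sketch in software. The opcode order follows the instruction table (LDA = 000 through RSH = 111); the function and variable names are assumptions made for this illustration.

```python
# Mnemonics in opcode order, LDA = 000 ... RSH = 111 (from the instruction table)
MNEMONICS = ["LDA", "STA", "ADD", "AND", "JMP", "BEQ", "CMP", "RSH"]


def decode(bit2, bit1, bit0):
    """Return one 0/1 output line per instruction; exactly one line is 1."""
    lines = {}
    for index, name in enumerate(MNEMONICS):
        want2, want1, want0 = (index >> 2) & 1, (index >> 1) & 1, index & 1
        # Each output is the AND of the three inputs, taken true or complemented
        lines[name] = int(bit2 == want2 and bit1 == want1 and bit0 == want0)
    return lines


add_lines = decode(0, 1, 0)     # opcode 010
beq_lines = decode(1, 0, 1)     # opcode 101
```

Having one line per instruction means later logic can be written in terms of names such as LDA or BEQ rather than raw opcode bits.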


Implementation using a PLA

Recall that a PLA is able to create a general sum of products, in which each input may enter a product term either true or complemented.

O1 = (I1·I2·I3 ...) + (I1·I2·I3 ...) + ...


Implementation using a PLA

Now recall the table ...

Binary →  0000  0001  0010  0011  0100  0101         0110  0111
Phase ↓   LDA   STA   ADD   AND   JMP   BEQ          CMP   RSH
0         C10 (all instructions)
1         C3 (all instructions)
2         C11;C9 (all instructions)
3         C7    C7    C7    C7    C8    if Z=1, C8   C2    C12
4         C3    C5    C3    C3    -     -            -     -
5         C6    C4    C0    C1    -     -            -     -

So we write down the outputs C in terms of the inputs:

C10 = φ0

C3 = φ1 + φ4 · (LDA + AND + ADD)

C8 = φ3 · (JMP + Z · BEQ)

etc

Note that the multiphase CK reset line can be set early.

Reset = φ3·(JMP+BEQ+CMP+RSH)+φ5·(LDA+STA+ADD+AND)
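These sum-of-products expressions can be checked against the table with a short sketch. The function name and argument conventions are assumptions; the C8 term is written to follow the table, which asserts C8 at phase 3 for JMP and, when Z = 1, for BEQ.

```python
def pla_outputs(phase, instr, z):
    """Evaluate a few PLA product terms for one subphase; z must be 0 or 1."""
    phi = [int(phase == i) for i in range(8)]       # one-hot subphase lines

    def op(*names):
        return int(instr in names)                  # OR of decoder output lines

    return {
        "C10": phi[0],
        "C3": phi[1] | (phi[4] & op("LDA", "AND", "ADD")),
        "C8": phi[3] & (op("JMP") | (z & op("BEQ"))),
        "Reset": (phi[3] & op("JMP", "BEQ", "CMP", "RSH"))
                 | (phi[5] & op("LDA", "STA", "ADD", "AND")),
    }


lda_phase4 = pla_outputs(4, "LDA", 0)    # C3 fires: MBR <- M<MAR>
beq_phase3 = pla_outputs(3, "BEQ", 1)    # C8 and Reset fire together
```

Each output is an OR of AND terms over the subphase lines, the decoder lines and the Z flag, which is exactly the form a PLA provides.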


Implementation using a PLA /ctd

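[Figure: block diagram of the PLA controller. The opcode decoder outputs, the multiphase clock subphases φ0 ... φ7 and the Z-flag are the PLA inputs; the PLA produces the control line outputs and the Reset for the multiphase clock, which is driven by CK.]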


Another way using discrete components

The fetch and execute stages are separated using an RS flip-flop. Here only 4 of the 8 clock subphases are used. At the ends of both the fetch and execute stages, a reset signal is sent to the 8-phase clock.

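[Figure: discrete-component controller. An RS fetch/execute flip-flop (inputs S and R, outputs Q and Q̅) selects the Fetch or Execute stage; subphases φ0 to φ3 generate C10, C3 and C11;C9; the opcode decoder output LDA is combined with Execute ("Execute and LDA") to generate C7, C3 and C6, and so on; the final steps of the execute cycles feed back to the flip-flop.]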


Multiplexing using CSL signal

In our BSA several registers feed out onto the same bus. For example, via the ALU the AC and the MBR feed onto the data bus. Both the PC and the IR (address) feed onto the address bus, etc. Clearly this will cause contention.

The piece of hardware that chooses one of several inputs is called a multiplexer. To select one of 2^n inputs, n selection lines enter a decoder, and the 2^n outputs from the decoder are AND-ed with the inputs. The outputs from the AND gates are OR-ed.

[Figure: multiplexer. Selection lines (binary code) enter an n-to-2^n decoder whose outputs are AND-ed with the inputs and OR-ed to form the output.]
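The decoder, AND and OR structure of the multiplexer can be written out directly. This is a one-bit sketch with invented names; the selection bits are taken most significant bit first, which is an assumption of the example.

```python
def multiplexer(inputs, select_bits):
    """Select one of 2^n one-bit inputs using n selection bits (MSB first)."""
    n = len(select_bits)
    if len(inputs) != 2 ** n:
        raise ValueError("need exactly 2^n inputs for n selection lines")

    # n-to-2^n decoder: exactly one enable line is 1
    index = 0
    for bit in select_bits:
        index = (index << 1) | bit
    enables = [int(i == index) for i in range(2 ** n)]

    # AND each input with its enable line, then OR the results together
    output = 0
    for enable, value in zip(enables, inputs):
        output |= enable & value
    return output


selected = multiplexer([0, 1, 0, 0], [0, 1])    # picks input number 1
```

For a multi-bit bus the same structure is simply repeated once per bit, with the decoder shared.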


Using Tristate Logic for OE’ing

However, when outputting onto buses it makes more sense to use individual output-enable stages on the back of each transmitting register.

Why? Less knitting to do ...


Using Tristate Logic for OE’ing

The output-enable stage on the back of a register is made using tri-state logic driven by the CSL signal.

Tri-state logic has three states: assertive 0, assertive 1, and non-assertive.

In the assertive mode, the output resistance of a tri-state gate is low, and the bus is forced to follow the output, whether it be 0 or 1. In the non-assertive state, the output resistance of the gate is high, and the gate’s nominal output can safely be in a different state from that of the attached bus.

[Figure: tri-state gate with INPUT, OE and OUTPUT]
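A behavioural sketch of tri-state outputs sharing a bus is given below. It uses `None` to stand for the non-assertive (high-resistance) state and raises an error if two drivers assert at once; both choices, and the function names, are assumptions of this model rather than anything in the hardware.

```python
def tristate(value, oe):
    """Return the driven value when OE is set, else None (non-assertive)."""
    return value if oe else None


def bus_value(drivers):
    """Resolve a bus from (value, oe) pairs; at most one driver may assert."""
    asserted = [v for v in (tristate(value, oe) for value, oe in drivers)
                if v is not None]
    if len(asserted) > 1:
        raise ValueError("bus contention: more than one output enabled")
    return asserted[0] if asserted else None    # None: nothing drives the bus


# Three registers share the bus; only the second has its OE (its CSL) set
on_bus = bus_value([(0x12, 0), (0x34, 1), (0x56, 0)])
```

The registers with OE clear still hold their own values, but those values are invisible to the bus, which is the point of the non-assertive state.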


Using Tristate Logic for OE’ing

[Figure: register of four D-type latches (inputs D3 ... D0, outputs Q3 ... Q0) followed by a tri-state OE stage, enabled by CSL=OE, that drives the bus]


Using Tristate Logic for OE’ing

Suppose you wanted to do MAR ← PC. You use the CSL to provide OE2 and the CSP to clock the MAR.

[Figure: IR(adrs), PC and SP feed the MAR through output-enable stages OE1, OE2 and OE3; CK clocks the MAR; 24 units in parallel]


Texas Instruments Data Sheet
