
ELECTRICAL AND COMPUTER ENGINEERING DEPARTMENT, OAKLAND UNIVERSITY ECE-3710: Computer Hardware Design Winter 2019

1 Instructor: Daniel Llamocca

Unit 1 - Digital System Design

DIGITAL SYSTEM MODEL

FSM (CONTROL) + DATAPATH CIRCUIT

EXAMPLES

CAR LOT COUNTER
▪ If A = 1: no light received (car obstructing LED A).
▪ If B = 1: no light received (car obstructing LED B).
▪ If a car enters the lot, the following sequence (A|B) must be followed: 00 → 10 → 11 → 01 → 00.
▪ If a car leaves the lot, the following sequence (A|B) must be followed: 00 → 01 → 11 → 10 → 00.
▪ A car might stay in a state for many clock cycles, since a car takes a very long time to pass compared with the clock period.

DIGITAL SYSTEM (FSM + Datapath circuit)
▪ Usually, when resetn (asynchronous clear) and clock are not drawn, they are implied.

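[Figure: generic digital system. A finite state machine (control circuit) and a datapath circuit share resetn and clock; the system has Inputs and Outputs.]

[Figure: car lot counter. The photoreceptor signals A and B feed the FSM (control circuit), which drives the E and ud inputs of a 10-bit counter (datapath circuit) with output Q.]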


▪ Finite State Machine (FSM):

[Figure: state diagram with states S1 to S8, entered at S1 when resetn = 0. Transitions are labeled A|B / E|ud. Every transition outputs 00 except two that occur when AB returns to 00: 00/11 at the end of the entering sequence and 00/10 at the end of the leaving sequence.]

▪ Algorithmic State Machine (ASM) chart:

[Figure: ASM chart of the same machine. Each state tests AB; the only conditional outputs are "E, ud ← 1" and "E ← 1".]
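The entering and leaving sequences can be sketched in software as a small table-driven state machine. This is only a behavioural model under stated assumptions: the state names (`idle`, `enter1`, `bad`, ...) and the function `car_lot_counter` are hypothetical and do not map one-to-one onto S1 to S8 of the chart, and the handling of invalid sequences (wait for 00) is an assumption.

```python
# Behavioural sketch of the car lot counter (hypothetical state names,
# not the S1..S8 encoding of the notes).
NEXT_STATE = {
    "idle":   {(0, 0): "idle",   (1, 0): "enter1", (0, 1): "leave1"},
    "enter1": {(1, 0): "enter1", (1, 1): "enter2", (0, 0): "idle"},
    "enter2": {(1, 1): "enter2", (0, 1): "enter3", (1, 0): "enter1"},
    "enter3": {(0, 1): "enter3", (1, 1): "enter2", (0, 0): "idle"},
    "leave1": {(0, 1): "leave1", (1, 1): "leave2", (0, 0): "idle"},
    "leave2": {(1, 1): "leave2", (1, 0): "leave3", (0, 1): "leave1"},
    "leave3": {(1, 0): "leave3", (1, 1): "leave2", (0, 0): "idle"},
    "bad":    {(0, 0): "idle"},  # assumption: wait for 00 after an invalid step
}


def car_lot_counter(samples, count=0, bits=10):
    """Return the count after feeding one (A, B) sample per clock cycle."""
    state = "idle"
    for ab in samples:
        if ab == (0, 0):
            if state == "enter3":    # sequence 00,10,11,01,00 completed: E=1, ud=1
                count = (count + 1) % (1 << bits)
            elif state == "leave3":  # sequence 00,01,11,10,00 completed: E=1, ud=0
                count = (count - 1) % (1 << bits)
        state = NEXT_STATE[state].get(ab, "bad")
    return count


enter = [(0, 0), (1, 0), (1, 0), (1, 1), (0, 1), (0, 1), (0, 0)]
leave = [(0, 0), (0, 1), (1, 1), (1, 0), (0, 0)]
print(car_lot_counter(enter + enter + leave))
```

Repeated samples model a car that stays in one position for many clock cycles; the count changes only on the final return to 00.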


DEBOUNCING CIRCUIT
▪ Mechanical bouncing lasts approximately 20 ms. Thus, we have to make sure that the input signal w is stable ('1') for at least 20 ms before we assert w_db. Then, to deassert w_db, we have to make sure that w is stable ('0') for at least 20 ms.

DIGITAL SYSTEM (FSM + Datapath circuit)
▪ Counter 0 to N-1: if E = 1, Q ← Q+1. sclr: synchronous clear. The way it is designed, if sclr = '1' and E = '1', then Q = 0.
▪ If T is the period of the clock signal, then $N = \frac{20\,\text{ms}}{T}$. For example, for a 100 MHz input clock, T = 10 ns. Then $N = \frac{20\,\text{ms}}{10\,\text{ns}} = 2 \times 10^6$.
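The debouncing rule and the value of N can be checked with a short behavioural model. This is a minimal sketch, not the FSM of the notes: the function names `cycles_for` and `debounce` are hypothetical, and a single counter that restarts whenever w agrees with w_db stands in for the counter, comparator, and FSM.

```python
def cycles_for(stable_time_s, clock_hz):
    """Number of clock cycles N that span the required stable time."""
    return round(stable_time_s * clock_hz)


def debounce(samples, n):
    """Return w_db for each sample of w (one sample per clock cycle).

    w_db changes only after w has differed from it for n consecutive cycles.
    """
    w_db = 0
    count = 0
    out = []
    for w in samples:
        if w != w_db:
            count += 1
            if count == n:   # w has been stable long enough
                w_db = w
                count = 0
        else:
            count = 0        # a bounce restarts the wait
        out.append(w_db)
    return out


print(cycles_for(20e-3, 100e6))
print(debounce([0, 1, 0, 1, 1, 1, 1, 0, 0, 1, 0, 0, 0], n=3))
```

The small n = 3 is used only to keep the trace readable; at 100 MHz the same model would use N = 2,000,000.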

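▪ Algorithmic State Machine:

[Figure: ASM chart with states S0 to S4 (S0 on resetn = 0), inputs w and z, and outputs E, sclr, and w_db. Annotations: "waits for the first '1'"; "for w_db = 1, w must be 1 for at least 20 ms"; "waits for the first '0'"; "for w_db = 0, w must be 0 for at least 20 ms".]

[Figure: datapath. A counter 0 to N-1 (n bits, inputs E and sclr, output Q) feeds a comparator "Q = N-1?" that produces z; the FSM takes w and z and produces E, sclr, and w_db.]

[Figure: timing diagram of w and w_db. w_db changes only after w has been stable for 20 ms; pulses shorter than 20 ms are ignored.]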


BIT-COUNTING CIRCUIT (COUNTING 1'S)
▪ This circuit counts the number of bits in register A whose value is '1'. Example: A = 00110111 → C = 0101.
▪ The sequential algorithm (pseudo-code) is available. In this case, we can follow this procedure to design a digital system:
✓ Sketch the block diagram: We need start and done signals to indicate when the process starts and finishes. We also include input data (Data for the Register A) and output data (Count).
✓ Sketch the high-level control mechanism (state machine) for the bit-counting circuit. Here, you can include combinational blocks as well as common synchronous blocks (registers, shift registers, counters).
✓ With the block diagram and high-level control mechanism, we can start including the components and their signals in the datapath. Based on the high-level state machine, we can design the actual state machine that includes specific signals controlling the components.

DIGITAL SYSTEM (FSM + Datapath circuit)
▪ The digital system is depicted below: FSM + Datapath. Example: For n = 8: if A = 00110110, then C = 0100.
✓ m-bit counter: If E = sclr = 1, the count is initialized to zero. If E = 1, sclr = 0, the count is increased by 1.
✓ Parallel access shift register: If E = 1: s_l = 1 → Load, s_l = 0 → Shift.

[Figure: datapath and FSM. A parallel access shift-right register (din = DA, s_l = LA, E = EA, serial input 0) holds A and provides a0 and z; an m-bit counter (E = EC, sclr = sclrC), with $m = \lceil \log_2(n+1) \rceil$, holds C. The FSM (states S1 to S3, input s, output done) drives LA, EA, EC, and sclrC.]

Sequential Algorithm:

    C ← 0
    while A ≠ 0
        if a0 = 1 then
            C ← C + 1
        end if
        right shift A
    end while

[Figure: block diagram of the bit-counting circuit (inputs s and DA; outputs C and done) and high-level FSM with states S1 to S3. Labels: Reset, s, "C ← 0", "Load A", "Shift right A", "A = 0?", a0, "C ← C + 1", done.]
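The sequential algorithm translates almost line by line into software. The sketch below is a plain software version, not a cycle-accurate model of the circuit; the function name `count_ones` and the default width n = 8 are assumptions made for illustration.

```python
def count_ones(a, n=8):
    """Count the bits equal to 1 in the n-bit value a (shift-and-test loop)."""
    a &= (1 << n) - 1      # keep only the n bits held by register A
    c = 0                  # C <- 0
    while a != 0:          # while A != 0
        if a & 1:          # if a0 = 1 then
            c += 1         #     C <- C + 1
        a >>= 1            # right shift A
    return c


print(format(count_ones(0b00110111), "04b"))
print(format(count_ones(0b00110110), "04b"))
```

The loop stops as soon as A becomes zero, so the number of iterations depends on the position of the most significant 1, as in the hardware version.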


Timing diagram (n = 8, m = 4):

[Timing diagram of the bit-counting circuit: clock, resetn, s, DA, A, z, state, C, done, EA, LA, sclrC, EC. Two values of DA appear (00110110 and 00001110); in each run A is shifted right in state S2 until it reaches 00.]

INTEGER BASE-2 LOGARITHM
▪ This circuit computes $\lceil \log_2 A \rceil$. Data (unsigned integer) is contained in register A (n bits).
▪ The sequential algorithm (pseudo-code) is available. We follow the procedure specified in the previous example.
▪ Note that instead of performing a right shift on A to get $\lfloor A/2 \rfloor$, we get $\lfloor A/2 \rfloor$ by creating a shifted version of A. Example (n = 4): $A = a_3 a_2 a_1 a_0 \rightarrow \lfloor A/2 \rfloor = 0 a_3 a_2 a_1$.
▪ Also, for $A \geq 1$, asking whether A = 1 is the same as asking whether $\lfloor A/2 \rfloor = 0$.

Sequential Algorithm:

    L ← 0
    while A ≠ 1
        A ← A − ⌊A/2⌋
        L ← L + 1
    end while

[Figure: block diagram of the circuit (inputs s and DA; outputs L and done) and high-level FSM with states S1 to S3. Labels: Reset, s, "L ← 0", "Load A", "A = 1", "update A", "L ← L + 1", done.]
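The update A ← A − ⌊A/2⌋ replaces A with ⌈A/2⌉, which is why the loop count equals the ceiling of the logarithm. A minimal software sketch follows; the function name `ceil_log2` is hypothetical, and rejecting A = 0 is an assumption (the pseudo-code would never terminate for that input).

```python
def ceil_log2(a):
    """Return ceil(log2(a)) for an integer a >= 1 using A <- A - floor(A/2)."""
    if a < 1:
        raise ValueError("a must be at least 1")  # assumption: A = 0 is excluded
    l = 0                  # L <- 0
    while a != 1:          # while A != 1
        a = a - (a >> 1)   # A <- A - floor(A/2), i.e. ceil(A/2)
        l += 1             # L <- L + 1
    return l


for value in (1, 8, 11, 106):
    print(value, ceil_log2(value))
```

For example, 106 goes through 53, 27, 14, 7, 4, 2, 1, which is seven updates.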


DIGITAL SYSTEM (FSM + Datapath circuit)
▪ The digital system is depicted below. For n bits, $\lceil \log_2 A \rceil \in [0, n]$. Thus, the result needs $\lceil \log_2(n+1) \rceil$ bits.
✓ m-bit counter: If E = sclr = 1, the count is initialized to zero. If E = 1, sclr = 0, the count is increased by 1.
✓ Register: If E = 1: sclr = 1 → Clear, sclr = 0 → Load.

[Figure: datapath and FSM. Register A (E = EA) is loaded through a multiplexer controlled by sA, which selects between DA and the output of an unsigned subtractor; the subtractor takes A and the rewired word $0 a_{n-1} a_{n-2} \ldots a_1$. A counter 0 to n (E = EL, sclr = sclrL), $m = \lceil \log_2(n+1) \rceil$ bits wide, holds L. The FSM (states S1 to S3, inputs s and z, output done) drives sA, EA, EL, and sclrL.]

[Timing diagram: clock, resetn, s, DA, A, z, state, L, done, EA, sA, sclrL, EL, for DA = 01101010 and DA = 00001011.]


LEADING ZERO DETECTOR (ITERATIVE)
▪ A sequence of N bits is fed to the circuit serially. The circuit generates the number of 0's that exist before the first 1.
▪ Example (N=15):
✓ If the number is: 000000000000101 → Output: 12
✓ If the number is: 000000000000000 → Output: 15
✓ If the number is: 000010001000001 → Output: 4

DIGITAL SYSTEM (FSM + Datapath circuit)
▪ Inputs: X (serial data, MSB first), start.
▪ Outputs: z (leading 1 detected), done (N bits processed), and R (number of 0's before the first 1).
▪ The process begins with the assertion of start. When the first 1 is detected, z = 1 (for 1 clock cycle) and R is frozen. When all the bits of the sequence have been processed, done = 1. We can restart the process after this. Note that if X is all 0's, z is never '1'.
▪ Counters R and Q: modulo-(N+1): they count from 0 to N.

[Figure: block diagram of the leading-0 detector (inputs X and start; outputs R, z, and done) and timing diagram of x, start, Q, R, z, and done for N = 15. R stops counting once the first 1 arrives, while Q keeps counting the processed bits.]

[Figure: datapath and ASM chart. Two counters 0 to N, each $n = \lceil \log_2(N+1) \rceil$ bits wide: Q (E = EQ, sclr = sclrQ) and R (E = ER, sclr = sclrR). A flip-flop b (enable eb, data db) is also part of the datapath. The FSM (states S1 to S3, S1 on resetn = 0) tests start, x, b, and "Q = N-1", and drives EQ, sclrQ, ER, sclrR, eb, db, z, and done.]
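The behaviour described for R and z can be sketched as a serial loop over the N input bits. This is a behavioural model only; the function name `leading_zero_detector` and the boolean `frozen` (standing in for the flip-flop that remembers that a 1 has been seen) are assumptions, and the done signal is implied by the function returning.

```python
def leading_zero_detector(x):
    """Process the bit string x MSB first; return (R, z_trace).

    R counts the 0's before the first 1 and is frozen afterwards.
    z_trace holds z for every processed bit (1 only on the first 1).
    """
    r = 0
    frozen = False
    z_trace = []
    for bit in x:
        z = 0
        if not frozen:
            if bit == "1":
                z = 1          # first 1 detected: pulse z and freeze R
                frozen = True
            else:
                r += 1         # still in the run of leading zeros
        z_trace.append(z)
    return r, z_trace


for word in ("000000000000101", "000000000000000", "000010001000001"):
    print(word, leading_zero_detector(word)[0])
```

For the all-zero word the z trace stays at 0 for all 15 bits, matching the note that z is never asserted in that case.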


SIMPLE PROCESSOR

DIGITAL SYSTEM (FSM + Datapath circuit)
▪ This system is a basic Central Processing Unit (CPU). For completeness, a memory would need to be included.
▪ Here, the Control Circuit could be implemented as a State Machine. However, in order to simplify the State Machine design, the Control Circuit is partitioned into a datapath circuit and an FSM.

OPERATION
▪ Every time w = '1', we grab the instruction from fun and execute it.
▪ Instruction = |f2|f1|f0|Ry1|Ry0|Rx1|Rx0|. This is called a 'machine language instruction' or Assembly instruction:
✓ f2f1f0: Opcode (operation code). This is the portion that specifies the operation to be performed.
✓ Rx: Register where the result of the operation is stored (we also read data from Rx). Rx can be R1, R2, R3, R4.
✓ Ry: Register from which we only read data. Ry can be R1, R2, R3, R4.

f = f2f1f0   Operation       Function
000          Load Rx, Data   Rx ← Data
001          Move Rx, Ry     Rx ← Ry
010          Add Rx, Ry      Rx ← Rx + Ry
011          Sub Rx, Ry      Rx ← Rx - Ry
100          Not Rx          Rx ← NOT (Rx)
101          And Rx, Ry      Rx ← Rx AND Ry
110          Or Rx, Ry       Rx ← Rx OR Ry
111          Xor Rx, Ry      Rx ← Rx XOR Ry
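The instruction format and the function table can be exercised with a small software interpreter. This is a sketch under assumptions, not the control circuit of the notes: the function name `execute`, the 8-bit register width, and the indexing of the four registers as 0 to 3 (the raw value of the 2-bit Rx and Ry fields) are chosen for illustration.

```python
def execute(regs, instr, data=0, n=8):
    """Execute one 7-bit instruction |f2 f1 f0|Ry1 Ry0|Rx1 Rx0| on regs (list of 4)."""
    mask = (1 << n) - 1             # assumption: n-bit registers, n = 8 by default
    f = (instr >> 4) & 0b111        # opcode f2 f1 f0
    ry = (instr >> 2) & 0b11        # source register index
    rx = instr & 0b11               # destination (and first operand) register index
    x, y = regs[rx], regs[ry]
    results = {
        0b000: data,    # Load Rx, Data
        0b001: y,       # Move Rx, Ry
        0b010: x + y,   # Add  Rx, Ry
        0b011: x - y,   # Sub  Rx, Ry
        0b100: ~x,      # Not  Rx
        0b101: x & y,   # And  Rx, Ry
        0b110: x | y,   # Or   Rx, Ry
        0b111: x ^ y,   # Xor  Rx, Ry
    }
    regs[rx] = results[f] & mask    # results wrap to n bits
    return regs


regs = [0, 0, 0, 0]
execute(regs, 0b000_00_00, data=5)   # Load register 0 with 5
execute(regs, 0b000_00_01, data=3)   # Load register 1 with 3
execute(regs, 0b010_01_00)           # Add: register 0 <- register 0 + register 1
print(regs)
```

Each call corresponds to one assertion of w in the hardware; the multi-cycle sequencing through registers A and G is deliberately hidden in this model.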

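[Figure: processor datapath. Registers R0 to R3 (enables E_R0 to E_R3, bus output enables O_R0 to O_R3), register A (E_A), the ALU (4-bit op, operands A and B), register G (E_G, O_G), and the external Data_in (E_ext) share an n-bit BUS. The control circuit receives w and the 7-bit fun, and produces the control signals and done.]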


▪ Control Circuit: This is made out of some combinational units, a register, and an FSM:
✓ Ex: Every time we want to enable register Rx, the FSM only asserts Ex (instead of controlling E_R0, E_R1, E_R2, E_R3 directly). The decoder takes care of generating the enable signal for the corresponding register Rx.
✓ Eo, so: Every time we want to read from register Ry (or Rx), the FSM only asserts Eo (instead of controlling O_R0, O_R1, O_R2, O_R3 directly) and so (which signals whether to read from Rx or Ry). The decoder takes care of generating the enable signal for the corresponding register Rx or Ry.

▪ Arithmetic-Logic Unit (ALU):

op     Operation        Function                Unit
0000   y <= A           Transfer 'A'            Arithmetic
0001   y <= A + 1       Increment 'A'           Arithmetic
0010   y <= A - 1       Decrement 'A'           Arithmetic
0011   y <= B           Transfer 'B'            Arithmetic
0100   y <= B + 1       Increment 'B'           Arithmetic
0101   y <= B - 1       Decrement 'B'           Arithmetic
0110   y <= A + B       Add 'A' and 'B'         Arithmetic
0111   y <= A - B       Subtract 'B' from 'A'   Arithmetic
1000   y <= not A       Complement 'A'          Logic
1001   y <= not B       Complement 'B'          Logic
1010   y <= A AND B     AND                     Logic
1011   y <= A OR B      OR                      Logic
1100   y <= A NAND B    NAND                    Logic
1101   y <= A NOR B     NOR                     Logic
1110   y <= A XOR B     XOR                     Logic
1111   y <= A XNOR B    XNOR                    Logic
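The ALU table above maps directly onto a lookup keyed by op. The sketch below is a software model only; the function name `alu` and the default width n = 8 are assumptions, and results are truncated to n bits as an n-bit datapath would do.

```python
def alu(op, a, b, n=8):
    """Return y for the 4-bit op code, with operands and result n bits wide."""
    mask = (1 << n) - 1
    table = {
        0b0000: a,          0b0001: a + 1,      0b0010: a - 1,
        0b0011: b,          0b0100: b + 1,      0b0101: b - 1,
        0b0110: a + b,      0b0111: a - b,
        0b1000: ~a,         0b1001: ~b,
        0b1010: a & b,      0b1011: a | b,
        0b1100: ~(a & b),   0b1101: ~(a | b),
        0b1110: a ^ b,      0b1111: ~(a ^ b),
    }
    return table[op] & mask   # wrap arithmetic results and trim inverted ones


print(alu(0b0110, 5, 3))
print(format(alu(0b1111, 0b1100, 0b1010), "08b"))
```

Masking after the operation is what turns Python's unbounded integers into fixed-width values, for instance 3 - 5 becomes 254 for n = 8.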

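[Figure: control circuit. A 2-to-4 decoder with enable Ex decodes Rx (Rx1, Rx0) into E_R0 to E_R3. A second 2-to-4 decoder with enable Eo generates O_R0 to O_R3 from Ry or Rx, selected by a multiplexer controlled by so. A 7-bit register with enable E_fun latches fun into funq = |f2|f1|f0|Ry1|Ry0|Rx1|Rx0|. The FSM receives w and the 3-bit f, and produces Ex, Eo, so, E_A, E_G, O_G, E_ext, E_fun, the 4-bit op, and done.]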


▪ Algorithmic State Machine (ASM): Every branch of the FSM implements an Assembly instruction.

[Figure: ASM chart. Labels recoverable from the chart: S1 (entered when resetn = 0) tests w and asserts E_fun; S2 branches on f.
  f = 000: E_ext, Ex ← 1; done ← 1
  f = 001: Eo, Ex ← 1; done ← 1
  f = 010 to 111: Eo, so ← 1; E_A ← 1, followed by two states per instruction:
    f = 010: S3a (Eo, E_G ← 1; op ← 0110), S3b (O_G, Ex ← 1; done ← 1)
    f = 011: S4a (Eo, E_G ← 1; op ← 0111), S4b (O_G, Ex ← 1; done ← 1)
    f = 100: S5a (E_G ← 1; op ← 1000), S5b (O_G, Ex ← 1; done ← 1)
    f = 101: S6a (Eo, E_G ← 1; op ← 1010), S6b (O_G, Ex ← 1; done ← 1)
    f = 110: S7a (Eo, E_G ← 1; op ← 1011), S7b (O_G, Ex ← 1; done ← 1)
    f = 111: S8a (Eo, E_G ← 1; op ← 1110), S8b (O_G, Ex ← 1; done ← 1)]


BINARY TO BCD CONVERSION

DOUBLE DABBLE ALGORITHM
▪ Given an N-bit unsigned binary number, we left shift every one of the N bits into a new register (that holds the BCD digits). We thus have N iterations. If, after shifting, any of the BCD digits (one or more) is greater than 4, we increment it by 3. The exception is the last iteration, where, after shifting, we do not increment any nibble.
▪ For N bits, we need $\lceil N/3 \rceil$ BCD digits. So the BCD register is $4 \times \lceil N/3 \rceil$ bits wide.

Number of bits   Binary range       Number of BCD digits
4                [0, 15]            2
7                [0, 127]           3
10               [0, 1023]          4
14               [0, 16383]         5
N                $[0, 2^N - 1]$     $\lceil N/3 \rceil$

▪ Why does it work? Think of BCD shifting: we want to shift the bits of a BCD number while preserving the BCD representation. If a number lower than or equal to 4 is shifted, we get at most 8 (9 if a 1 is shifted in), still representable in BCD. If a number greater than 4 is shifted, the minimum number we get is $10 = 1010_2$, which is not a BCD digit. If we add 3 to the number and then shift, we get the proper BCD result with 2 BCD digits.
✓ Example: $5 = 0101_2$. If we shift, we get $1010_2$, which should be 0001 0000 in BCD. To get this, we add 3 to $0101_2$, resulting in $1000_2$. After shifting, we get 0001 0000. The same happens for numbers 6, 7, 8, and 9.

▪ Example: $255 = 11111111_2$ (N=8). We need $\lceil N/3 \rceil = 3$ BCD digits. Note that there are N = 8 iterations. The table shows the state after an operation has been applied. In this example, only one nibble gets incremented at a time.

BCD number       Binary number   Procedure just applied   Procedure to apply
0000 0000 0000   1111 1111       Initialization           Left shift
0000 0000 0001   1111 1110       Left shift               Left shift
0000 0000 0011   1111 1100       Left shift               Left shift
0000 0000 0111   1111 1000       Left shift               0111 > 4: add 3 to 0111
0000 0000 1010   1111 1000       Add 3 to 0111            Left shift
0000 0001 0101   1111 0000       Left shift               0101 > 4: add 3 to 0101
0000 0001 1000   1111 0000       Add 3 to 0101            Left shift
0000 0011 0001   1110 0000       Left shift               Left shift
0000 0110 0011   1100 0000       Left shift               0110 > 4: add 3 to 0110
0000 1001 0011   1100 0000       Add 3 to 0110            Left shift
0001 0010 0111   1000 0000       Left shift               0111 > 4: add 3 to 0111
0001 0010 1010   1000 0000       Add 3 to 0111            Left shift
0010 0101 0101   0000 0000       Left shift               None: we do not add 3 in the last iteration

▪ Example: $172 = 10101100_2$ (N=8). We need $\lceil N/3 \rceil = 3$ BCD digits. Note that there are N = 8 iterations. The table shows the state after an operation has been applied. In general, more than one nibble can be incremented at a time.

BCD number       Binary number   Procedure just applied    Procedure to apply
0000 0000 0000   1010 1100       Initialization            Left shift
0000 0000 0001   0101 1000       Left shift                Left shift
0000 0000 0010   1011 0000       Left shift                Left shift
0000 0000 0101   0110 0000       Left shift                0101 > 4: add 3 to 0101
0000 0000 1000   0110 0000       Add 3 to 0101             Left shift
0000 0001 0000   1100 0000       Left shift                Left shift
0000 0010 0001   1000 0000       Left shift                Left shift
0000 0100 0011   0000 0000       Left shift                Left shift
0000 1000 0110   0000 0000       Left shift                1000 > 4, 0110 > 4: add 3 to 1000 and 0110
0000 1011 1001   0000 0000       Add 3 to 1000 and 0110    Left shift
0001 0111 0010   0000 0000       Left shift                None: we do not add 3 in the last iteration
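The two worked examples can be reproduced with a short routine that follows the same order of operations (shift first, then add 3 to every nibble greater than 4, except on the last iteration). This is a software sketch, not the FSM and datapath of the notes; the function name `double_dabble` is hypothetical.

```python
def double_dabble(value, n):
    """Convert the n-bit unsigned value to packed BCD (one digit per nibble)."""
    digits = -(-n // 3)                        # ceil(n / 3) BCD digits
    bcd = 0
    for i in range(n):
        msb = (value >> (n - 1 - i)) & 1       # next bit, MSB first
        bcd = (bcd << 1) | msb                 # left shift the bit into the BCD register
        if i != n - 1:                         # no correction on the last iteration
            for d in range(digits):
                if (bcd >> (4 * d)) & 0xF > 4:
                    bcd += 3 << (4 * d)        # add 3 to that nibble only
    return bcd


print(format(double_dabble(255, 8), "03x"))
print(format(double_dabble(172, 8), "03x"))
```

Printing the result in hexadecimal shows the decimal digits directly, since each nibble holds one BCD digit.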


DIGITAL SYSTEM (FSM + Datapath circuit)

▪ Algorithmic State Machine (ASM) Chart:

[Figure: ASM chart with states S1 to S4 (S1 on resetn = 0). Inputs tested: start, z_C, z_CB, and "F > 4 and z_C = 0". Outputs: LA, EA, EB, sclrB, E_LB, E_C, sclr_C, E_CB, sclr_CB, and done.]

[Figure: datapath. An N-bit left shift register A (loaded from i_bin) shifts its serial output into a chain of K 4-bit left shift registers B(0) to B(K-1), which hold the BCD digits o_bcd(0) to o_bcd(K-1). A multiplexer controlled by CB selects one digit as F; an adder computes F + 0011, and a decoder enabled by E_LB generates LB(0) to LB(K-1). Counter C (0 to N-1, flag z_C) and counter CB (0 to K-1, flag z_CB) complete the datapath. The FSM receives start, z_C, z_CB, and F.]


EXTRA TOPICS

LINEAR FEEDBACK SHIFT REGISTERS (LFSRS)

▪ LFSRs have a simple and fairly regular structure. Typical components: D-type flip flops and logic gates.
▪ Advantages: very little hardware and high-speed operation.
▪ Inputs to flip flops are linear functions of the previous state.
▪ Despite the simple appearance, LFSRs are based on rather complex mathematical theory and have interesting applications in the areas of digital system testing, cryptography, and fault-tolerant computing.
▪ Typical applications: error correction/detection, pseudo-random number generators, fast counters.
▪ A generic 4-bit LFSR is depicted below:

APPLICATION: CYCLIC REDUNDANCY CHECK (CRC)
▪ This error-detection code is used in digital communications (e.g.: Ethernet, CAN) and storage devices (e.g.: RAMs).

Communication system
▪ k-bit message: $M = m_{k-1} m_{k-2} \ldots m_1 m_0$.
▪ CRC code: $R = r_{n-1} r_{n-2} \ldots r_1 r_0$.
▪ Stream sent: $m_{k-1} m_{k-2} \ldots m_1 m_0 r_{n-1} r_{n-2} \ldots r_1 r_0$
▪ Associated polynomials:
✓ $M(x) = m_{k-1}x^{k-1} + m_{k-2}x^{k-2} + \cdots + m_1 x^1 + m_0 x^0$. Order: $k-1$
✓ $R(x) = r_{n-1}x^{n-1} + r_{n-2}x^{n-2} + \cdots + r_1 x^1 + r_0 x^0$. Order: $n-1$
▪ The CRC code is calculated as a remainder: $R(x) = \text{remainder}\left(\frac{x^n M(x)}{G(x)}\right)$
▪ $G(x) = g_n x^n + g_{n-1}x^{n-1} + \cdots + g_1 x^1 + g_0 x^0$. This is the generating polynomial of order n. $G = g_n g_{n-1} \ldots g_1 g_0$
▪ Here, when dealing with polynomial operations, we use modulo-2 polynomial arithmetic (Galois field of two elements: GF(2)). The coefficients of the polynomials can only be 0 or 1. The multiplication operation can be implemented as a simple AND gate, and the addition (or subtraction) operation can be implemented by an XOR gate: $a \pm b = a \oplus b$, with $a, b \in \{0,1\}$.
✓ Example: $x^n + x^n = x^n - x^n = 0$.
▪ CRC generator: It computes $R(x)$. Its input is $x^n M(x) \equiv m_{k-1} m_{k-2} \ldots m_1 m_0 000 \ldots 0$ (the message followed by n zeros).
▪ CRC checker: On the receiver side, both the message and the CRC code might be corrupted by the channel: $M'(x)$ and $R'(x)$.
✓ So, we must check if $\text{remainder}\left(\frac{x^n M'(x)}{G(x)}\right) = R'(x)$. If so, we say that the system passed the CRC check.
✓ Note that this implies: $x^n M'(x) = Q'(x)G(x) + R'(x) \rightarrow x^n M'(x) - R'(x) = x^n M'(x) + R'(x) = Q'(x)G(x)$. In modulo-2 arithmetic, $-R'(x) = R'(x)$. It follows that $x^n M'(x) + R'(x)$ should be divisible by $G(x)$.
✓ The CRC checker tests if $\text{remainder}\left(\frac{x^n M'(x) + R'(x)}{G(x)}\right) = 0$. If the remainder is zero, we say that it passed the CRC check. It is still possible for a corrupted $R'(x)$ and $M'(x)$ to make the remainder equal to zero, but this is far less likely.
✓ The input to the CRC checker is $x^n M'(x) + R'(x) \equiv m'_{k-1} m'_{k-2} \ldots m'_0 r'_{n-1} r'_{n-2} \ldots r'_0$.

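[Figure: communication system. CRC generator → transmitted bits (subject to error) → CRC checker.]

[Figure: generic 4-bit LFSR. Four D flip-flops with outputs Q3 to Q0 (common enable E, clk, resetn), input x, and a "linear function" block that feeds the flip-flop inputs.]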


▪ Example: CRC-4 (n = 4): M = 11100110, G = 11001
$M(x) = x^7 + x^6 + x^5 + x^2 + x \rightarrow x^n M(x) = x^4(x^7 + x^6 + x^5 + x^2 + x) = x^{11} + x^{10} + x^9 + x^6 + x^5$
$G(x) = x^4 + x^3 + 1$
$R(x) = x^2 + x$

CRC circuit
▪ The following LFSR implements a 4-bit CRC. Components: flip flops, AND gates, and XOR gates.
▪ Note that $g_4 = 1$ is not used. This is because it is common in CRC to have $g_n = 1$. This is already considered in the design of the circuit.
▪ Example with M = 11100110, G = 11001:
✓ When integrated into the CRC generator, the input is given by: $x^n M(x) \equiv 111001100000$. The output result is R = 0110.
✓ When integrated into the CRC checker, the input is given by: $x^n M'(x) + R'(x) \equiv 111001100110$. The result of the circuit is remainder = 0000, indicating a valid transmission.
▪ Applications: The circuit above can be implemented for any number of bits n. For example: CRC-32 (Ethernet), CRC-15 (CAN), and CRC-1 (trivial even parity generator with $G(x) = x + 1$).
▪ This is a serial implementation, i.e., one bit is fed to the circuit at a time. Other implementations process some bits in parallel (for CRC-1, the parity generator in ECE2700: Notes - Unit 1 processes all the bits at once).
▪ The design of the generating polynomial $G(x)$ is outside the scope of this class.
▪ These circuits are better implemented in hardware for high speed and low resource utilization.

[Figure: 4-bit CRC LFSR. Flip-flops R0 to R3 (common enable E, clk, resetn), serial input x, coefficients G0 to G3, and XOR (+) gates between stages.]

[Timing diagrams: clk, resetn, E, x, and R for the CRC generator input 111001100000 and for the CRC checker input 111001100110.]

[Figure: single flip-flop version of the circuit (R0, G0, input x, enable E).]
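The CRC-4 example (M = 11100110, G = 11001) can be verified with bitwise modulo-2 long division. Note the assumptions: this is a software division over bit strings, not a cycle-accurate model of the LFSR in the figure, and the function name `crc_remainder` is hypothetical.

```python
def crc_remainder(bits, g):
    """Return the remainder of bits (a bit string) divided by g, modulo 2.

    g is the generating polynomial as a bit string of n + 1 coefficients.
    """
    n = len(g) - 1                 # order of the generating polynomial
    poly = int(g, 2)
    r = 0
    for b in bits:                 # one bit per step, MSB first (serial input)
        r = (r << 1) | int(b)      # shift the next bit into the remainder
        if r >> n:                 # coefficient of x^n is 1: subtract G (XOR)
            r ^= poly
    return format(r, f"0{n}b")


message = "11100110"
crc = crc_remainder(message + "0000", "11001")      # generator: input is x^n M(x)
print(crc)
print(crc_remainder(message + crc, "11001"))        # checker: remainder should be 0000
```

The generator call appends n = 4 zeros to the message, and the checker call appends the computed CRC instead, mirroring the two inputs listed in the example.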


MODULO-N COUNTER DESIGN
▪ A modulo-N counter could be designed by the State Machine method, but this can be very cumbersome if N is a large number. For example, if N = 1000, we need 1000 states.
▪ The figure shows a modulo-200 counter that uses a register, an adder, a comparator, and logic gates. Q: count. z: output signal asserted only when the maximum count (199) is reached. This can be easily described in VHDL using the structural description. You can also use the behavioral description, where the count is increased by 1 every clock cycle and 'z' is asserted when the count reaches 'N-1'.
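The behavioral description mentioned above can be sketched in software before writing any VHDL. The model below is an assumption-laden illustration: the function name `modulo_counter` is hypothetical, the enable E is taken as always 1, and each loop iteration stands for one clock cycle.

```python
def modulo_counter(n, cycles):
    """Return a list of (Q, z) pairs, one per clock cycle, for a modulo-n counter."""
    q = 0
    trace = []
    for _ in range(cycles):
        z = 1 if q == n - 1 else 0    # comparator: Q = N-1?
        trace.append((q, z))
        q = 0 if z else q + 1         # synchronous clear at the maximum count
    return trace


trace = modulo_counter(200, 401)
print(trace[198:202])
```

With N = 200 the pair (199, 1) is followed by (0, 0), so z is high for exactly one cycle out of every 200.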

MEMORY DECODING

▪ Physical implementation of memory is not homogeneous:
✓ Different portions of memory are used for different purposes: RAM, ROM, I/O devices.
✓ A processor can usually address a memory space that is much larger than the memory space covered by an individual memory chip. Even if all the memory were of one type, we would still have to implement it using multiple ICs.
✓ For a given valid address, one and only one memory-mapped component must be accessed.
▪ Address Decoding: Process of generating chip select (CS, CE) signals from the address bus for each device in the system.
▪ The address bus (N+M bits) is split into two sections:
✓ The N most significant bits are used to generate the CS signals for the different devices.
✓ The M least significant bits are passed to the devices as addresses.

EXAMPLE
▪ A 20-bit address line in a processor handles up to $2^{20}$ = 1 MB of addresses, each address containing one byte of information. We want to connect four 256 KB memory chips to the processor.
▪ The 2 MSBs of the address line are used to generate the CS signals. The 18 LSBs are passed to the devices as addresses.
▪ The pink-shaded circuit: i) addresses the memory chips, and ii) enables only one memory chip (via CE: chip enable) when the address falls in the corresponding range. Example: if address = 0x5FFFF, only memory chip 2 is enabled (CE=1). If address = 0xD0123, only memory chip 4 is enabled.

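[Figure: modulo-200 counter. An 8-bit register (E, sclr, resetn, clock) holds Q; an adder computes Q + 00000001; an 8-bit equality comparator (A = B, built from bitwise comparisons of a7..a0 with b7..b0) compares Q with 11000111 ("= 199?") to generate z. sclr: synchronous clear; if E = sclr = 1, then Q = 0.]

[Figure: memory decoding. Memory space of the 20-bit address:
  00000 to 3FFFF: memory device 1 (256 KB)
  40000 to 7FFFF: memory device 2 (256 KB)
  80000 to BFFFF: memory device 3 (256 KB)
  C0000 to FFFFF: memory device 4 (256 KB)
address(17..0) goes to all four devices; address(19) and address(18) drive a 2-to-4 decoder (inputs w1, w0; outputs y0 to y3) that generates the four CE signals.]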
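The split of the address into chip-select bits and device address bits can be sketched as follows. This is a software illustration of the decoder, with hypothetical names (`decode`, `addr_bits`, `select_bits`); chip enables are returned as a list whose first entry corresponds to memory chip 1.

```python
def decode(address, addr_bits=20, select_bits=2):
    """Split an address into one-hot chip enables and the per-chip address."""
    offset_bits = addr_bits - select_bits          # 18 LSBs passed to the devices
    chip = address >> offset_bits                  # 2 MSBs select the device
    offset = address & ((1 << offset_bits) - 1)
    ce = [1 if i == chip else 0 for i in range(1 << select_bits)]
    return ce, offset


for addr in (0x5FFFF, 0xD0123):
    ce, offset = decode(addr)
    print(hex(addr), ce, hex(offset))
```

Exactly one entry of the CE list is 1 for any 20-bit address, which is the "one and only one component" requirement stated in the text.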


FLIP FLOPS: PRACTICAL ASPECTS

Reducing rate of change of a synchronous circuit:
▪ Here, we use an FSM as an example of a synchronous circuit, but we can apply this technique to any synchronous circuit.
▪ We usually would like to reduce the rate at which FSM transitions occur. A straightforward option is to reduce the frequency of the input clock. But this is a very complicated problem when a high-precision clock is required.
▪ Alternatively, we can reduce the rate at which FSM transitions occur by including an enable signal in our FSM: this means including an enable in every flip flop in the FSM. For any FSM transition to occur, the enable signal has to be '1'. Then we assert the enable signal only when we need it. The effect is the same as reducing the frequency of the input clock.
✓ The figure below depicts a modulo-N counter (from 0 to N-1) connected to a comparator that generates a pulse (output signal z) of one clock period every time we hit the count 'N-1'. The number of bits of the counter is given by $n = \lceil \log_2 N \rceil$. The effect is the same as reducing the frequency of the FSM to $f/N$, where $f$ is the frequency of the clock.
✓ Example: Timing diagram of a modulo-10 counter. Notice that 'z' is only asserted when the count reaches "1001". This 'z' signal controls the enable of an FSM, so that the FSM transitions only occur every 10 clock cycles, thereby having the same effect as reducing the frequency by 10.
▪ We can apply the same technique not only to FSMs, but also to any sequential circuit. This way, we can reduce the rate of any sequential circuit (e.g. another counter) by including an enable signal in every flip flop in the circuit.

Flip flop timing parameters:
▪ Propagation Delay.
▪ Setup Time: Interval of time before the clock edge during which the data must be held stable.
▪ Hold Time: Interval of time after the clock edge during which the data must be held stable.
▪ If the setup time or the hold time is violated, the output may become unpredictable or, even worse, the flip flop might enter metastability.

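[Timing diagram: modulo-10 counter with clk, resetn, E, Q (0000 up to 1001, then back to 0000), and z, which is asserted only when Q = 1001.]

[Figure: counter 0 to N-1 (n bits, output Q) with a comparator "Q = N-1?" whose output z drives the enable E of an FSM with Inputs and Outputs; both share resetn and clock.]

[Figure: D flip-flop timing (clk, D, Q), showing the restricted region around the clock edge.]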
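Returning to the enable-based rate reduction described earlier in this section, the effect of gating a circuit with the z pulse of a modulo-N counter can be sketched as below. The gated circuit here is a stand-in: `run_with_enable` is a hypothetical function, and the "FSM" is reduced to a state index that advances by one on every enabled clock edge.

```python
def run_with_enable(n, cycles):
    """Return the gated circuit's state at every clock cycle.

    A modulo-n counter produces z once every n cycles; the gated circuit
    (a placeholder state index) changes only on cycles where z = 1.
    """
    q = 0
    state = 0
    states = []
    for _ in range(cycles):
        z = 1 if q == n - 1 else 0
        states.append(state)
        if z:                      # enable asserted: the transition happens
            state += 1
        q = 0 if z else q + 1      # modulo-n counter
    return states


states = run_with_enable(10, 30)
print(states[8:12], states[18:22])
```

With N = 10 the state changes once every 10 clock cycles, which has the same effect as clocking the gated circuit at f/10.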