ee457 Quiz Sp2020 -

Page 1: ee457 Quiz Sp2020 -

February 13, 2020 10:04 am EE457 Quiz - Spring 2020 © Copyright 2020 Gandhi Puvvada

EE457 Quiz (~10%)

Closed-book Closed-notes Exam; No cheat sheets;

Ordinary calculators may be used, but not smart phones with calculators. Verilog Guides are not needed and are not allowed. Smart phones, tablets (and any kind of computing/Internet devices) are not allowed.

This is a Crowdmark exam. Please do not write on margins or on backside. Use HB or 1H pencil.

Spring 2020

Instructor: Gandhi Puvvada

Thursday, 2/13/2020 (A 3-hour exam) 05:00 PM - 08:00 PM (180 min) in SGM123

Please do not write your student ID

Student’s DEN D2L username: @usc.edu

Viterbi School of Engineering, University of Southern California

Ques#  Topic                                             Page#   Time      Points  Score
1      State Diagram, RTL Design                         2-4     60 min.   100
2      Unsigned and Signed numbers                       5-6     25 min.   54
3      CPU Performance                                   7-7     20 min.   30
4      MIPS processor ISA, Byte-addressable processors   8-9     30 min.   77
5      Single-Cycle CPU                                  10-11   25 min.   64
Total                                                    1+10+1  160 min.  325
Perfect Score                                                              300

Page 2: ee457 Quiz Sp2020 -

1 ( 100 points) 60 min. State Diagram and RTL design

1.1 Mealy machine design: Reproduced below is a partial solution of Q#2 on Array Division, C[I] <= A[I]/B[I]; of ee354_MT_Spring2017, which you were asked to go through.

Here A[I] is divided by the constant 3, and we want to know whether the division is an even division (exact division with no remainder left) and, if so, whether the quotient is an even number.

Here you are given an array A[I] of 24 non-zero 8-bit unsigned numbers (A[0:23]). You are asked to consider the numbers in A[I] which are evenly divisible by 3 (= exactly divisible by 3 without leaving any remainder). From those, keep a count of the cases where the quotients are even. If the total number of such even quotients is even, then go to the ENEQ (Even Number of Even Quotients) state. Otherwise, go to the ONEQ (Odd Number of Even Quotients) state. Zero is an even number. So, if none of the quotients are even, then you go to the ENEQ state. Instead of maintaining a count of even quotients and checking if the count is even, we can start with a flag called (say) ENEQ_F (Even Number of Even Quotients Flag) and set it to 1 (for True) in the INI state, to say we have so far found 0 even quotients (i.e. an even number of even quotients). We flip it every time we find an even quotient.

Like in the EE354L problem above, the access time of A[I] is nearly one clock. So, you can only deposit A[I] into X (X <= A[I];) at the end of the clock, and you do not have any time left in that clock to start processing A[I] directly. There is no B[I] here, as the divisor is the constant 3. Unlike in the above EE354L problem, we do not need to store the quotient in C[I]. Actually, we do not even need a 7-bit or an 8-bit quotient register Q here! We can have a 1-bit Q starting at zero and flip it every time we are able to successfully subtract 3 from X. If A[I] is 6, then X becomes 6 and Q is zero in the first clock. X = 3 and Q is scheduled to become 1 in the second clock. X = 0 and Q goes back to zero in the third clock. You probably do not need to go through the third clock, as (X == 3) indicates that A[I] is evenly divisible. Your grader will look for correctness and efficiency.
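The repeated-subtraction idea with a 1-bit Q and the ENEQ_F flag can be sketched behaviorally in Python. This is only a software model under stated assumptions, not the expected state machine: it ignores clock-level timing, always subtracts down to a remainder below 3 (so it does not use the X == 3 shortcut), and the function name `classify` is hypothetical.

```python
def classify(values, divisor=3):
    """Behavioral model: return "ENEQ" or "ONEQ" for a list of non-zero numbers."""
    eneq_f = 1                      # flag starts True: zero even quotients so far
    for a in values:
        x = a                       # models X <= A[I];
        q = 0                       # 1-bit quotient parity, models Q <= 1'b0;
        while x >= divisor:         # subtract 3 while it is still possible
            x -= divisor
            q ^= 1                  # flip Q on every successful subtraction
        if x == 0 and q == 0:       # evenly divisible and quotient is even
            eneq_f ^= 1             # flip the flag for each even quotient found
    return "ENEQ" if eneq_f else "ONEQ"


if __name__ == "__main__":
    print(classify([2, 6, 7, 6]))
```

For the four-element array 2, 6, 7, 6, the flag is flipped twice (once for each 6), so the model ends with ENEQ_F = 1.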

Page 3: ee457 Quiz Sp2020 -

Complete the following table for a 4-element array A[0:3] (instead of a 24-element array, A[0:23]) containing the elements 2, 6, 7, and 6. Here, since we have 6 and 6, which are not only evenly divisible by 3 but also generate even quotients (namely 2 and 2), we should go to the ENEQ state.

Show under each clock, for each variable, what the value is at the start of the clock and what will happen to the variable at the end of the clock. For example, we have already shown what happens in the LF state clock: X starts with the unknown value x and becomes 2 (only at the end of the clock), and I starts with 0 and becomes 1.

State transition conditions shall be arrived at carefully by considering the following facts.

1. Terminal count value of I? Did we access the last element? Did we increment I after (or while) accessing that element? Do I expect to have (I == 23) or (I == 24) at that time?

2. If we accessed the last element, is it true that we are about to finish processing that last element? What indicates that? Is it (X < 3) or (X == 3) or (X > 3) or some combination of them? If this is an evenly-divisible-by-3 case, is it also an even-quotient case? Is Q a zero or a 1?

3. Previously, did we find an even number of (or an odd number of) even-quotient cases? What variable and what values of the variable indicate this? Does ENEQ_F help in this matter?

40pts

You can use these decision boxes to arrive at the state transition conditions, if you wish

Page 4: ee457 Quiz Sp2020 -

[Figure: partial state diagram to be completed. States: INI, LF, DIV, ENEQ, ONEQ. Transition labels shown: Reset, Start, ACK. RTL shown in the diagram: I <= 0; X <= A[I]; I <= I + 1; Q <= 1'b0; ENEQ_F <= 1'b1;]

Rough work area: for this question

60pts

Page 5: ee457 Quiz Sp2020 -

2 ( 10+10+4+12+10+8 = 54 points) 25 min. Signed and unsigned numbers

2.1 Given below is part of the solution to Q#2 from the Fall 2018 Quiz that you were asked to go through.

2.1.1 Instead of the above 4XgtY (4X greater than Y), if we needed 4XleY (4X less than or equal to Y), i.e. treating the numbers as signed numbers represented in 2’s complement form, how would you produce it?

Student #1: Simply invert the above 4XgtY to produce 4XleY.

Student #2: But Y0 is being ignored above and we are talking about equality here.

Student #3: Well, 740 is less than or equal to 74X in decimal, where X can be anything from 0 to 9. Also, 730 is less than or equal to 74X in decimal, whatever X is. So I agree with S#1.

____________________________________________________________________________
____________________________________________________________________________
____________________________________________________________________________
____________________________________________________________________________

2.1.2 How would you produce 4XlosY (4X lower than or same as Y, treating them as unsigned numbers)?

____________________________________________________________________________
____________________________________________________________________________
____________________________________________________________________________
____________________________________________________________________________

2.2 We know that all ones except for the least significant bit (example: 11110 (=11101+1)) is a minus 2 in 2’s complement notation. So, to decrement by 2, would you add all ones except for the least significant bit in the case of _______ (A/B/C) (A = signed numbers represented in 2’s complement notation, B = unsigned numbers, C = both signed and unsigned). Similarly, to increment by 2, would you add a number like 00010 (size depending on the finite number system in use, for example, for 8-bit system, it is 00000010) in the case of _______ (A/B/C) (legend for A, B, C same as above).
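The bit pattern arithmetic mentioned above can be tried out with a small Python sketch. It assumes a 5-bit finite number system, models the adder as addition modulo 2^5, and ignores overflow entirely; the helper names `add_mod` and `as_signed` are hypothetical and only illustrate how the same result bits read under the two interpretations.

```python
WIDTH = 5
MASK = (1 << WIDTH) - 1          # 0b11111
MINUS_TWO = MASK & ~1            # all ones except the least significant bit: 0b11110


def add_mod(x, k):
    """Add two WIDTH-bit patterns and keep only WIDTH result bits."""
    return (x + k) & MASK


def as_signed(x):
    """Read a WIDTH-bit pattern as a 2's complement signed number."""
    return x - (1 << WIDTH) if x & (1 << (WIDTH - 1)) else x


if __name__ == "__main__":
    pattern = 0b11101                      # 29 unsigned, -3 signed
    result = add_mod(pattern, MINUS_TWO)
    print(format(result, "05b"), result, as_signed(result))
```

The sketch only shows the raw result bits; whether a given operation overflows in the unsigned or the signed reading has to be judged separately.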

10pts

10pts

4pts

Page 6: ee457 Quiz Sp2020 -

2.2.1 Your lab partner started the following 5-bit design, utilizing the standard adder/subtracter design and adding or subtracting the constant 00010. She calls it an "increment-by-2 or decrement-by-2" unit. She believes that it can be used for both (i) unsigned numbers and (ii) signed numbers represented in 2’s complement notation. She believes that the idea can be used for any larger number system, such as a 32-bit or a 64-bit finite number system. Of course, for a finite number system (of 5 bits here, or any number of bits), one needs to produce the overflow signals UOV (Unsigned Overflow) and SOV (Signed Overflow). If you agree with her, then produce UOV and SOV below. If you do not agree with her, say so with a brief explanation.
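As a rough aid for reasoning about overflow in such a unit, here is a bit-level Python sketch of a generic 5-bit ripple-carry adder/subtracter. It is an assumption-laden model, not the design in the figure: it uses the commonly taught rules that signed overflow is the XOR of the carries into and out of the most significant bit, and that unsigned overflow is the raw carry on an add and its complement on a subtract. The name `add_sub` is hypothetical.

```python
WIDTH = 5
MASK = (1 << WIDTH) - 1


def add_sub(x, y, sub):
    """Return (result, uov, sov) for x + y (sub=0) or x - y (sub=1) on WIDTH bits."""
    y_eff = (y ^ MASK) if sub else y      # XOR gates invert Y when subtracting
    carries = [sub]                       # C0 = 1 completes the 2's complement of Y
    result = 0
    for i in range(WIDTH):
        a = (x >> i) & 1
        b = (y_eff >> i) & 1
        cin = carries[-1]
        result |= (a ^ b ^ cin) << i                       # full-adder sum bit
        carries.append((a & b) | (a & cin) | (b & cin))    # full-adder carry out
    raw_carry = carries[WIDTH]
    uov = raw_carry ^ sub                 # carry out on add, borrow on subtract
    sov = carries[WIDTH] ^ carries[WIDTH - 1]
    return result, uov, sov


if __name__ == "__main__":
    for x, sub in [(30, 0), (15, 0), (1, 1), (16, 1)]:
        print(x, "dec2" if sub else "inc2", add_sub(x, 0b00010, sub))
```

Running the four cases exercises each overflow flag once for increment-by-2 and once for decrement-by-2.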

2.2.2 By the way, she is quite outspoken, and did not like the Fall 2018 Midterm Incrementer/Decrementer design (reproduced on the left side below) and offered her design on the right side below. Again, if you agree with her, then produce UOV and SOV below. If you do not agree with her, say so. You _____________ (agree/disagree) with her.

2.2.2.1 She asked you to help her simplify her incrementer/decrementer design (whether or not you agree with her on the correctness of her design). Look at the 5 XOR gates with constants and show how it can be simplified on the left side below.

12pts

[Figure: 5-bit adder/subtracter made of five full-adder cells (inputs a, b, cin; outputs s, cout). Labels: X4 X3 X2 X1 X0, Y4 Y3 Y2 Y1 Y0, C0, Raw Carry, Carry, V, Add/Sub, Inc2/Dec2. Constant bits as extracted: 1 0000]

Explanation if you disagree with her:

10pts

[Figure: 5-bit adder/subtracter made of five full-adder cells (inputs a, b, cin; outputs s, cout). Labels: X4 X3 X2 X1 X0, Y4 Y3 Y2 Y1 Y0, C0, Raw Carry, Carry, V, Add/Sub, Inc1/Dec1. Constant bits as extracted: 0 1000]

Reason for disagreeing:

8pts

[Figure: 5-bit adder/subtracter made of five full-adder cells (inputs a, b, cin; outputs s, cout). Labels: X4 X3 X2 X1 X0, Y4 Y3 Y2 Y1 Y0, C0, Raw Carry, Carry, V, Add/Sub, Inc1/Dec1. Constant bits as extracted: 0 1000]

Page 7: ee457 Quiz Sp2020 -

3 ( 12 + 10 + 8 = 30 points) 20 min. CPU Performance

3.1 CPI, IC, and performance: Our new CPU has only three types of instructions: A, B, and C. The hardware team did not tell us the CPI of category B. Two compiler designers, #1 and #2, designed totally different compilers with the following information.

Do you have adequate information to conclude (a) which compiler is better and (b) by what factor? If you do not have adequate information for either or both of them, state what minimal information you need to arrive at the answer.

Better compiler is _______________________________ (#1 / #2 / inadequate data). Explain: _______________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________

Better by a factor of ___________________________ (state value or inadequate data) Explain:________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________

3.2 ABC and XYZ are implementing the same ISA licensed from ARM and use the same compiler. Suppose ABC claims a MIPS rating (millions of instructions per second) double that of XYZ. Do we have adequate data to arrive at the performance ratio (speed-up factor)? Yes / No

If yes, what is the speed-up, and if no, what data is lacking?
____________________________________________________________________________________
____________________________________________________________________________________

What are the possible ways in which ABC could have boosted their MIPS rating?
____________________________________________________________________________________
____________________________________________________________________________________

Category            CPI          Frequency #1   Frequency #2
A                   CPIA = 5     fA1 = 10%      fA2 = 50%
B                   CPIB = n     fB1 = 40%      fB2 = 20%
C                   CPIC = 20    fC1 = 50%      fC2 = 30%
Total                            100%           100%
Instruction Count                100,000        200,000
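The cycle counts implied by the table can be written as a function of the unknown CPIB = n. The Python sketch below uses only the table values (CPIA = 5, CPIC = 20, the two frequency mixes, and instruction counts of 100,000 and 200,000) and assumes both compiled programs run on the same CPU at the same clock rate, so execution time is proportional to total cycles. The function name `total_cycles` and the trial values of n are hypothetical.

```python
CPI_A, CPI_C = 5, 20
MIX_1 = {"A": 10, "B": 40, "C": 50}     # compiler #1 instruction mix, in percent
MIX_2 = {"A": 50, "B": 20, "C": 30}     # compiler #2 instruction mix, in percent
IC_1, IC_2 = 100_000, 200_000           # instruction counts


def total_cycles(instruction_count, mix_percent, cpi_b):
    """Total cycles = IC * sum(frequency_i * CPI_i), with CPI_B supplied as n."""
    cpi = {"A": CPI_A, "B": cpi_b, "C": CPI_C}
    weighted = sum(mix_percent[k] * cpi[k] for k in cpi)   # 100 * average CPI
    return instruction_count * weighted // 100


if __name__ == "__main__":
    for n in (1, 10, 40):               # trial values for the unknown CPI_B
        c1 = total_cycles(IC_1, MIX_1, n)
        c2 = total_cycles(IC_2, MIX_2, n)
        print(n, c1, c2, c2 - c1, c2 / c1)
```

Printing a few trial values shows how the difference and the ratio of the two cycle counts behave as n changes, which is the distinction the two blanks above ask about.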

12pts

10pts

8pts

Page 8: ee457 Quiz Sp2020 -

4 ( 4+8+24+7+16+18 = 77 points) 30 min. MIPS Instructions and Memory addresses

4.1 MIPS ISA (a RISC ISA)

4.1.1 Stack Pointer is a ________ (GPR/SPR) where GPR stands for a General Purpose Register and SPR stands for a Special Purpose Register. MIPS ______________ (hardware/compiler) did not implement SP as a SPR.

4.1.2 The $31 is the __________ (link register/stack pointer) and is known to be having that functionality to _____ (A/B/C). The $29 is the __________ (link register/stack pointer) and is known to be having that functionality to _____ (A/B/C). Here, A = hardware implementation team, B = compiler implementation team, C = both teams.

4.1.3 A representative assembly language instruction listing is given on the side, demonstrating nested subroutine calls and returns. Here A calls B, which in turn calls C.

3 pts Is the listing written assuming that the stack grows in the direction of __________________________ (decreasing/increasing) memory addresses?

9 pts How can you tell?
____________________________________________________________________________________
____________________________________________________________________________________
____________________________________________________________________________________

6 pts Label the two instruction(s) _______________ (preceding/following) the JAL instruction in execution (in execution = in the dynamic execution trace), together with the JAL Subroutine instruction in MIPS (that make up the CISC CALL instruction) as C1, C2, C3 (C1 = Call Part 1, so forth).

6 pts Similarly label the two instruction(s) _______________ (preceding/following) the JR $31 instruction in execution (in execution = in the dynamic execution trace), together with the JR $31 instruction in MIPS (that make up the CISC RTN (Return) instruction) R1, R2, R3 (R1 = Return Part1, so forth).

4.2 Intel follows the ___________ (Little Endian / Big Endian) system. In the Intel 80486 processor system address space, byte 0000_824CH is the ____________ (most / least) significant byte of the 32-bit word with system address ______________ (state in hexadecimal).

The 32-bit word 4000 consists of the four bytes 4000, 4001, 4002, and 4003 in ________________________________ (Little-Endian / Big-Endian / both kinds of / neither kind of) processor.
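Byte ordering within a 32-bit word can be inspected with Python's standard `struct` module. This is a generic sketch: the word value 0x11223344 is a hypothetical example, and the helper `word_address` simply assumes 4-byte-aligned words in a byte-addressable space.

```python
import struct

WORD = 0x11223344                       # hypothetical 32-bit value


def bytes_by_offset(word, byte_order):
    """Return the four bytes of a word in increasing address order.

    byte_order is "<" for little-endian or ">" for big-endian.
    """
    return list(struct.pack(byte_order + "I", word))


def word_address(byte_address):
    """Address of the aligned 32-bit word that contains the given byte."""
    return byte_address & ~0x3


if __name__ == "__main__":
    print([hex(b) for b in bytes_by_offset(WORD, "<")])   # little-endian layout
    print([hex(b) for b in bytes_by_offset(WORD, ">")])   # big-endian layout
    print(hex(word_address(0x0000824C)), hex(word_address(0x0000824F)))
```

In both layouts the word occupies the same four consecutive byte addresses; only the assignment of significance to those addresses differs.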

4pts

8pts

A:  ----
    ----
    jal B;
    ----
B:  ----
    addi $29, $29, -4;
    sw $31, 0($29);
    jal C;
    lw $31, 0($29);
    addi $29, $29, +4;
    ----
C:  ----
    jr $31
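The save and restore of $31 around the nested call in B can be traced with a tiny Python model of the register file and memory. This is only an illustrative sketch: the starting stack pointer and the two return addresses are hypothetical values, and only the five instructions shown in B are modeled.

```python
def trace_b(sp_on_entry, ra_on_entry, ra_from_jal_c):
    """Step through B's five listed instructions; return (registers, address used by sw)."""
    regs = {29: sp_on_entry, 31: ra_on_entry}
    memory = {}

    regs[29] = regs[29] + (-4)          # addi $29, $29, -4
    memory[regs[29]] = regs[31]         # sw   $31, 0($29)
    saved_at = regs[29]
    regs[31] = ra_from_jal_c            # jal  C overwrites $31 with a new return address
    regs[31] = memory[regs[29]]         # lw   $31, 0($29)
    regs[29] = regs[29] + 4             # addi $29, $29, +4
    return regs, saved_at


if __name__ == "__main__":
    regs, saved_at = trace_b(0x7FFFFFF0, 0x00400010, 0x00400120)
    print(hex(saved_at), hex(regs[29]), hex(regs[31]))
```

Comparing the address used by the store with the stack pointer value on entry shows which way the stack moved when the return address was pushed.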

24pts

5+2pts

Page 9: ee457 Quiz Sp2020 -

4.3 The Intel processors 80486 and i860 are both 32-bit logical address, byte-addressable processors. The 80486 is a 32-bit data processor, whereas the i860 is a 64-bit data processor. State the size of their address space(s): 80486: ___________, Intel i860: ________________________. If stacks of 16 MByte SRAM chips are placed in their byte-wide memory banks to fill up their entire address spaces, what are the lowest and highest system byte addresses which map to the bottom and the top of the specific 16 MByte chip to which the system byte address 2E45_94CB hex maps?

In the case of 80486, the bottom is _ _ _ _ _ _ _ _ _ hex and the top is _ _ _ _ _ _ _ _ _ hex.

And in the case of i860, the bottom is _ _ _ _ _ _ _ _ _ hex and the top is _ _ _ _ _ _ _ _ _ hex.

And if this chip goes bad, what is the total system address range in hex that needs to be declared as unusable?

(i) in the case of 80486 processor, it is _ _ _ _ _ _ _ _ _ hex to _ _ _ _ _ _ _ _ _ hex.

(ii) in the case of i860 processor, it is _ _ _ _ _ _ _ _ _ hex to _ _ _ _ _ _ _ _ _ hex.
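The bottom and top byte addresses of one chip can be computed mechanically once an interleaving scheme is fixed. The Python sketch below assumes low-order interleaving across byte-wide banks (4 banks for a 32-bit data bus, 8 banks for a 64-bit data bus), so one row of 16 MByte chips spans banks x 16 MByte of contiguous system addresses and consecutive bytes of a chip are spaced by the number of banks. The function name `chip_range` is hypothetical, and a different memory organization would give different numbers.

```python
CHIP_BYTES = 16 * 2**20                 # 16 MByte per chip


def chip_range(byte_address, banks):
    """Return (bottom, top) system byte addresses of the chip holding byte_address."""
    row_bytes = banks * CHIP_BYTES                      # address span of one row of chips
    row_base = (byte_address // row_bytes) * row_bytes  # selected by the upper address bits
    bank = byte_address % banks                         # selected by the lowest address bits
    bottom = row_base + bank
    top = row_base + row_bytes - banks + bank
    return bottom, top


if __name__ == "__main__":
    for name, banks in (("80486", 4), ("i860", 8)):
        bottom, top = chip_range(0x2E4594CB, banks)
        print(name, format(bottom, "08X"), format(top, "08X"))
```

Under this assumption, the row that contains a chip covers the contiguous range from `row_base` to `row_base + row_bytes - 1`, which is a useful quantity when thinking about what must be marked unusable.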

4.3.1 Complete address decoding to generate Group-Selects (/GS_486 and /GS_860) for the row of chips and also show the rest of the labels for address, data, and byte-enable.

16 pts

18pts

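[Figure: two memory-chip diagrams to be completed, one labeled Intel 80486 (group select /GS_486) and one labeled Intel i860 (group select /GS_860). Each shows upper address lines A31 A30 A29 A28 for decoding and a 16 MByte chip with pins CS, WE, RD, A[ ], D[7:0], plus blank system-side labels D[ ], A[ : ], and BE.]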

Rough work box:

Page 10: ee457 Quiz Sp2020 -

5 ( 15 + 30 + 10 + 10 + 9 = 64 points) 25 min. Single-cycle CPU:

You are familiar with the branch instruction, the ordinary jump instruction J (Jump with the 26-bit jump address field), and also the indirect jump instruction Jr rs, (Jump register rs).
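As a reminder of how the two kinds of jump form their target, here is a small Python sketch based on the standard MIPS definitions: J concatenates the upper four bits of PC+4, the 26-bit jump address field, and two zero bits, while JR takes the target from register rs. The function names and the example encodings are hypothetical illustrations, not part of the datapath on the next page.

```python
def jump_target(pc, instruction):
    """Target of J: {(PC+4)[31:28], instruction[25:0], 2'b00}."""
    pc_plus_4 = (pc + 4) & 0xFFFFFFFF
    field = instruction & 0x03FFFFFF            # 26-bit jump address field
    return (pc_plus_4 & 0xF0000000) | (field << 2)


def jr_target(register_file, rs):
    """Target of JR rs: the full 32-bit value held in register rs."""
    return register_file[rs] & 0xFFFFFFFF


if __name__ == "__main__":
    print(hex(jump_target(0x00400000, 0x08100004)))
    print(hex(jr_target({31: 0x00400024}, 31)))
```

The J example uses opcode 000010 with a field of 0x0100004, which shifts left by two to give the word-aligned target.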

5.1 The data path on the next page is nearly complete. Complete the connections to the 9 loose ends which were marked with numbered arrows.

5.2 Control Signal Table: Complete the three rows for addi, JR Rs, and J and three columns for RegWrite, JR, Jump and a few other erased cells. Whenever possible, use don’t cares.

5.2.1 Occasionally it is possible to have two columns in the Control Signal Table to have identical bits. T / F
Explain: __________________________________________________________________________________
__________________________________________________________________________________

Occasionally it is possible to have two rows in the Control Signal Table to have identical bits. T / F
Explain: __________________________________________________________________________________
__________________________________________________________________________________

5.3 You save hardware components such as muxes/adder in the datapath if you do not have to support (circle your choices): (i) addi (ii) JR Rs (iii) J Explain the ones you did not circle: _______________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________

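Instruction  MemRead  MemWrite  RegWrite  MemtoReg  RegDst  ALUSrc  ALUOp1  ALUOp0  Branch  JR   Jump
R-format     0        0         ___       0         1       0       1       0       0       ___  ___
lw           1        0         ___       1         0       1       0       0       0       ___  ___
sw           0        1         ___       X         ___     1       0       0       0       ___  ___
addi         ___      ___       ___       ___       ___     ___     ___     ___     ___     ___  ___
beq          0        0         ___       X         ___     0       0       1       1       ___  ___
JR rs        ___      ___       ___       ___       ___     ___     ___     ___     ___     1    ___
J            ___      ___       ___       ___       ___     ___     ___     ___     ___     ___  1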

15pts

1

30pts

10pts

9pts

Page 11: ee457 Quiz Sp2020 -

[Figure: single-cycle MIPS datapath with nine numbered loose ends (1 to 9) to be connected. Components: PC, Instruction memory (Read address, Instruction [31-0]), Registers (Read register 1, Read register 2, Write register, Write data, Read data 1, Read data 2), Sign extend (16 to 32), Shift left 2, ALU (Zero, ALU result), ALU control, Data memory (Address, Write data, Read data), two Add units (PC + 4 and branch target), several 2-to-1 muxes, and Control. Instruction fields: [31-26], [25-21], [20-16], [15-11], [15-0], [5-0]. Control signals: RegDst, Branch, MemRead, MemtoReg, ALUOp, MemWrite, ALUSrc, RegWrite, JR, Jump, PCSrc. Other labels: Jump Address [31:0], Instruction [31:0], PC+4 [31:28].]

Page 12: ee457 Quiz Sp2020 -

Blank page: Please write your name and email. Tear it off and use for rough work. Do not submit at the end.

Student’s Last Name:____________________ email: __________________

It is not difficult to get an A in EE457. You need to work for it and seek help from the 457 teaching team on whatever you do not understand. We are eager to help you. The next four topics, Multi-cycle CPU, pipelined CPU, cache and virtual memory are interesting and challenging too. They are the focus of the midterm exam. Then we cover advanced topics. Best! Gandhi, TA: Kartik, Mentors: Sanjanai, Gengyu HW Graders: Adithya, Gurucharan, Lab Graders: Guowei, and Ting-Yu