CS250 Section 4 (cs250/fa10/handouts/section4-web.pdf)

9/21/10, Yunsup Lee

Image Courtesy: Tilera

Any questions on lab 2 & lab 3?

• Doing okay with gate-level simulations?

Announcements

• I’m still working on getting the physical libraries for lab 3 to work

• probably a day or more

• RTL simulation is working, so in the meantime you can get your circuit to work

• RISC-V Specification is taking a little bit more time

• We are still changing the ISA

• You should get lab 2 done during this week

• Lab 3 is also a lot of work

Lab 2: RISC-V Processor

[Figure: Lab 2 block diagram. Blocks: riscvTestHarness, riscvProc, InstructionMemory, DataMemory. Signals: clk, reset, testrig_fromhost, testrig_tohost, imemreq_val, imemreq_bits_addr, imemresp_bits_data, dmemreq_val, dmemreq_rw, dmemreq_bits_addr, dmemreq_bits_data, dmemresp_bits_data.]

Lab 3: RISC-V Core

[Figure: Lab 3 block diagram. Blocks: riscvTestHarness, riscvCore, riscvProc, InstructionCache, DataCache, Arbiter. Bus-width labels on the diagram: 32 and 128.

Global signals: clk, reset, reset_ext, testrig_fromhost, testrig_tohost, log_control.

Instruction-side signals: imemreq_val, imemreq_rdy, imemreq_bits_addr, imemresp_val, imemresp_bits_data.

Data-side signals: dmemreq_val, dmemreq_rdy, dmemreq_rw, dmemreq_bits_addr, dmemreq_bits_data, dmemresp_val, dmemresp_bits_data.

Instruction cache to arbiter: ic_mem_req_val, ic_mem_req_rdy, ic_mem_req_addr, ic_mem_resp_val, mem_resp_data.

Data cache to arbiter: dc_mem_req_val, dc_mem_req_rdy, dc_mem_req_rw, dc_mem_req_addr, mem_req_data, dc_mem_resp_val, mem_resp_data.

Arbiter to memory: mem_req_val, mem_req_rdy, mem_req_rw, mem_req_addr, mem_req_data, mem_req_tag, mem_resp_data, mem_resp_tag.]

Lab 3: Note on the arbiter

Lab 3: Things you need to do

• Clean up your RISC-V v2 3-stage pipeline with a BTB

• Now change the pipeline structure (add stall signals) to deal with the cache memory interface

• You can’t always issue a memory request (the cache is not ready when [i|d]memreq_rdy is deasserted)

• Memory response may take some time (when [i|d]memresp_val is deasserted)

• Start with the Instruction cache first

• Run RISC-V v2 assembly test programs and benchmarks

• Globally installed programs

• ~cs250/install/{riscv-tests, riscv-bmarks}

• Local programs in your lab harness
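The two stall conditions above can be sketched as a cycle-by-cycle Python model. This is only an illustration, not the lab's Verilog: the function `fetch_cycles` and its parameters `rdy_trace` and `resp_latency` are hypothetical, and the model assumes a single outstanding request and a fixed response latency of at least one cycle.

```python
def fetch_cycles(num_instructions, rdy_trace, resp_latency):
    """Count cycles needed to fetch instructions over a val/rdy interface.

    rdy_trace[c] is the assumed value of imemreq_rdy in cycle c; once the
    trace runs out, the cache is assumed to be always ready.
    resp_latency is the assumed number of cycles between an accepted
    request and imemresp_val being asserted.
    """
    cycle = 0
    fetched = 0
    resp_due = None  # cycle in which imemresp_val will be asserted
    while fetched < num_instructions:
        if resp_due is None:
            # The pipeline asserts imemreq_val; the request is accepted
            # only when imemreq_rdy is high, otherwise it stalls.
            rdy = rdy_trace[cycle] if cycle < len(rdy_trace) else True
            if rdy:
                resp_due = cycle + resp_latency
        elif cycle >= resp_due:
            # imemresp_val is high: the instruction arrives, stall ends.
            fetched += 1
            resp_due = None
        # In every other case imemresp_val is low and the pipeline stalls.
        cycle += 1
    return cycle


# Two cycles with imemreq_rdy low add two stall cycles to one fetch.
print(fetch_cycles(1, [], 1), fetch_cycles(1, [False, False], 1))
```

The same structure applies to the data side with dmemreq_rdy and dmemresp_val; a real pipeline would overlap requests rather than serialize them as this model does.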

Lab 3: Things you need to do

• Push your design all the way through the flow

• Count instruction mix for all assembly tests and benchmarks

• Measure energy consumption

• Generate analytic energy model

• A: instruction mix matrix

• x: energy/instruction

• b: energy consumption

• Ax = b should hold (approximately). Use MATLAB regression (least squares) to get a best estimate of x

• Use your energy model and measure the error
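The regression step can be sketched as follows. The lab asks for MATLAB; this Python version only illustrates the idea, using the normal equations and assuming A has full column rank. The instruction classes, counts, and energy numbers are made up for the example, and `least_squares` is a hypothetical helper.

```python
def least_squares(A, b):
    """Minimize ||Ax - b||^2 by solving the normal equations (A^T A) x = A^T b."""
    rows, cols = len(A), len(A[0])
    # Augmented normal-equation matrix [A^T A | A^T b].
    M = [[sum(A[k][i] * A[k][j] for k in range(rows)) for j in range(cols)]
         + [sum(A[k][i] * b[k] for k in range(rows))]
         for i in range(cols)]
    # Gaussian elimination with partial pivoting.
    for col in range(cols):
        pivot = max(range(col, cols), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, cols):
            factor = M[r][col] / M[col][col]
            for c in range(col, cols + 1):
                M[r][c] -= factor * M[col][c]
    # Back substitution.
    x = [0.0] * cols
    for i in reversed(range(cols)):
        rhs = M[i][cols] - sum(M[i][j] * x[j] for j in range(i + 1, cols))
        x[i] = rhs / M[i][i]
    return x


# Made-up data: 3 programs (rows) and 2 instruction classes (columns).
A = [[10.0, 0.0],
     [0.0, 10.0],
     [5.0, 5.0]]            # instruction mix matrix (counts per program)
b = [20.0, 50.0, 35.0]      # measured energy per program, arbitrary units

x = least_squares(A, b)     # estimated energy per instruction class

# Use the model to predict each program's energy and measure the error.
predicted = [sum(n * e for n, e in zip(row, x)) for row in A]
rel_error = [abs(p - m) / m for p, m in zip(predicted, b)]
print(x, rel_error)
```

With real measurements the system is overdetermined and noisy, so the relative errors will not be zero as they are for this consistent toy data.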

Area Breakdown

ALU

• If you make your ALU using behavioral Verilog, this is what you get

• It’s always good to start with behavioral Verilog to get things working, but after you get things working, you might optimize your ALU

ALU operations

• Here are the ALU operations you need to support

• ADD, SUB, SLT, SLTU, SLTI, SLTIU

• SL, SR, SRA

• AND, OR, XOR, NOR

ALU Operations: ADD/SUB

• ADDW xc,xa,xb

• xc = xa + xb

• SUBW xc,xa,xb

• xc = xa - xb

• Simple arithmetic for SUBW

• xc = xa + (~xb + 1)

• two inputs: xa and ~xb

• carry input: 1
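The SUBW identity above can be checked with a small bit-level model. This is a Python sketch rather than Verilog, and the names `adder32` and `alu_addsub` are hypothetical; it only shows that one adder covers both operations when the second input is inverted and the carry input is set to 1.

```python
MASK32 = 0xFFFFFFFF


def adder32(a, b, carry_in):
    """Model of a 32-bit adder: returns (sum, carry_out)."""
    total = (a & MASK32) + (b & MASK32) + carry_in
    return total & MASK32, total >> 32


def alu_addsub(xa, xb, is_sub):
    """ADD when is_sub is 0, SUB when is_sub is 1, on the same adder."""
    # For SUB, feed ~xb and a carry input of 1: xa + (~xb + 1) = xa - xb.
    b_in = (~xb & MASK32) if is_sub else (xb & MASK32)
    result, _ = adder32(xa, b_in, 1 if is_sub else 0)
    return result


print(hex(alu_addsub(5, 3, 1)), hex(alu_addsub(3, 5, 1)))
```

Results are 32-bit two's-complement values, so 3 - 5 comes back as 0xFFFFFFFE.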

ALU Operations: SLTU/SLTIU

• Did you notice the operands of SLTU and SLTIU are flipped?

• SLTU xc, xa, xb: xc = (xa < xb) ? 32'd1 : 32'd0

• SLTIU xa, xb, imm: xa = (xb < imm) ? 32'd1 : 32'd0

• You can’t reuse the same adder setup as is, since the operands arrive on different inputs

• The natural alignment is “imm - xb” since we use “xa - xb”.

• We can now tell whether imm - xb < 0 or imm - xb >= 0

• That is, imm < xb or imm >= xb. We’re almost there.

• If we can do imm >= xb+1, we know imm > xb

• imm - (xb+1) = imm + (~xb)

• Add with no carry

• Think about the signed case
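The unsigned case of the trick above can be sketched in Python. Reading the comparison result from the adder's carry-out is an assumption of this sketch (the slide does not say which bit to inspect), and `add_carry_out` and `sltiu` are hypothetical names. The `width` parameter exists only so that the check can be run exhaustively on a narrow adder.

```python
def add_carry_out(a, b, carry_in, width=32):
    """Carry-out of a width-bit add of a, b, and carry_in."""
    mask = (1 << width) - 1
    total = (a & mask) + (b & mask) + carry_in
    return total >> width


def sltiu(xb, imm, width=32):
    """Return 1 if xb < imm (unsigned), else 0.

    Computes imm + ~xb with no carry input, which equals imm - (xb + 1).
    The add carries out exactly when imm >= xb + 1, that is, imm > xb.
    """
    mask = (1 << width) - 1
    return add_carry_out(imm, ~xb & mask, 0, width)


# Exhaustive comparison against the definition on a 4-bit adder.
ok = all(sltiu(x, i, 4) == int(x < i) for x in range(16) for i in range(16))
print(ok, sltiu(3, 5), sltiu(5, 5))
```

The signed case needs the sign bits taken into account as well and is left as the slide suggests.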

Suggested Datapath

[Figure: suggested datapath, divided into a Fetch Stage and an Execute Stage.

Blocks: PC, +4, Instruction Mem, IR, Decoder, Control Signals, RegFile, SignExtend, ZeroExtend, JumpTargGen, BranchTargGen, BranchCondGen (eq?, lt?, ltu?), ALU, Data Mem, tohost.

Control and data labels: pc_sel, wa_sel, wb_sel, rf_wen, tohost_en, testrig_fromhost, testrig_tohost, killf, nop, branch, jump, jalr, pc+4, rd0, rd1, val, rw, addr, wdata, rdata, constants 0 and 31.

Instruction fields: ir[4:0], ir[11:0], ir[19:0], ir[19:15], ir[24:20], ir[26:0].]

ALU Operations: SL/SR/SRA

• Use a single arithmetic (signed) right shifter

• How would you do a shift left with a right shifter?

• Reverse bits!

• How would you do logical right shift?

• Control the MSB of your operand
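The two tricks above can be modeled in Python as a sketch. One assumption is made loudly here: "control the MSB" is interpreted as widening the operand to 33 bits, with the extra top bit set to the sign for SRA and to 0 otherwise. The helper names `reverse32`, `sra33`, and `shift` are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def reverse32(x):
    """Reverse the bit order of a 32-bit value."""
    return int(format(x & MASK32, "032b")[::-1], 2)


def sra33(value33, shamt):
    """The single shifter: arithmetic right shift of a 33-bit value."""
    if value33 & (1 << 32):
        value33 -= 1 << 33          # bit 32 is the sign bit
    return (value33 >> shamt) & ((1 << 33) - 1)


def shift(x, shamt, op):
    """op is 'sl', 'sr' (logical right), or 'sra', all using sra33."""
    x &= MASK32
    shamt &= 31
    # Shift left = reverse bits, shift right, reverse bits again.
    operand = reverse32(x) if op == "sl" else x
    # Control the MSB: copy the sign for SRA, force 0 for SL and SR.
    msb = (operand >> 31) & 1 if op == "sra" else 0
    shifted = sra33((msb << 32) | operand, shamt) & MASK32
    return reverse32(shifted) if op == "sl" else shifted


print(hex(shift(1, 4, "sl")), hex(shift(0x80000000, 4, "sr")),
      hex(shift(0x80000000, 4, "sra")))
```

In hardware the bit reversal is just wiring, so the cost is two multiplexers around one shifter.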

Optimize ALU

• You can make an ALU with one adder, one shifter, and one logic unit

• Let’s form groups of three

• (ASCII code of the first letter of your name) % 4