Spr 2015, Jan 26 . . . ELEC 5200-001/6200-001 Lecture 3

ELEC 5200-001/6200-001
Computer Architecture and Design

Spring 2015

Instruction Set Architecture
(Chapter 2)

Vishwani D. Agrawal
James J. Danaher Professor

Department of Electrical and Computer Engineering
Auburn University, Auburn, AL 36849

http://www.eng.auburn.edu/~vagrawal
[email protected]

Designing a Computer

FIVE PIECES OF HARDWARE
– Control
– Datapath
– Memory
– Input
– Output

Control and datapath together form the central processing unit (CPU), or “processor”.

Start by Defining ISA

What is instruction set architecture (ISA)?

ISA
– Defines registers
– Defines data transfer modes (instructions) between registers, memory and I/O
– There should be sufficient instructions to efficiently translate any program for machine processing

Next, define instruction set format – binary representation used by the hardware
– Variable-length vs. fixed-length instructions

Types of ISA

Complex instruction set computer (CISC)
– Many instructions (several hundreds)
– An instruction takes many cycles to execute
– Example: Intel Pentium

Reduced instruction set computer (RISC)
– Small set of instructions (typically 32)
– Simple instructions, each executes in one clock cycle – REALLY? Well, almost.
– Effective use of pipelining
– Example: ARM

On Two Types of ISA

Brad Smith, “ARM and Intel Battle over the Mobile Chip’s Future,” Computer, vol. 41, no. 5, pp. 15-18, May 2008.

Compare 3Ps:
– Performance
– Power consumption
– Price

Pipelining of RISC Instructions

Fetch Instruction → Decode Opcode → Fetch Operands → Execute Operation → Store Result

Although an instruction takes five clock cycles, one instruction can be completed every cycle.
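The throughput claim can be checked with a little arithmetic. The sketch below assumes an ideal five-stage pipeline with no stalls or hazards; the function names are illustrative and not part of the lecture.

```python
def unpipelined_cycles(num_instructions, num_stages=5):
    """Each instruction runs all stages before the next one starts."""
    return num_stages * num_instructions


def pipelined_cycles(num_instructions, num_stages=5):
    """Ideal pipeline: no stalls, one new instruction enters every cycle."""
    if num_instructions == 0:
        return 0
    # The first instruction needs num_stages cycles to drain through;
    # every later instruction finishes one cycle after its predecessor.
    return num_stages + (num_instructions - 1)


for n in (1, 5, 1000):
    print(n, unpipelined_cycles(n), pipelined_cycles(n))
```

For long instruction sequences the pipelined count approaches one cycle per instruction, which is the sense in which "one instruction can be completed every cycle".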

Language of the Machine

We’ll be working with the MIPS instruction set architecture
– similar to other architectures developed since the 1980's
– Almost 100 million MIPS processors manufactured in 2002
– used by NEC, Nintendo, Cisco, Silicon Graphics, Sony, …

[Figure: Growth of Processors. Chart by year, 1998 to 2002, vertical axis 0 to 1400; series: ARM, IA-32, MIPS, Motorola 68K, PowerPC, Hitachi SH, SPARC, Other.]

2004 © Morgan Kaufman Publishers

MIPS Instruction Set (RISC)

Instructions execute simple functions.
Maintain regularity of format – each instruction is one word, contains opcode and arguments.
Minimize memory accesses – whenever possible use registers as arguments.
Three types of instructions:
– Register (R)-type – only registers as arguments.
– Immediate (I)-type – arguments are registers and numbers (constants or memory addresses).
– Jump (J)-type – argument is an address.

MIPS Arithmetic Instructions

All instructions have 3 operands

Operand order is fixed (destination first)

Example:

    C code:        a = b + c;

    MIPS ‘code’:   add a, b, c

“The natural number of operands for an operation like addition is three… requiring every instruction to have exactly three operands conforms to the philosophy of keeping the hardware simple”

Arithmetic Instr. (Continued)

Design Principle: simplicity favors regularity.

Of course this complicates some things...

    C code:      a = b + c + d;

    MIPS code:   add a, b, c
                 add a, a, d

Operands must be registers (why?) Remember von Neumann bottleneck.

32 registers provided

Each register contains 32 bits

Registers vs. Memory

[Figure: Processor (Control, Datapath), Memory, and I/O (Input, Output).]

Arithmetic instruction operands must be registers
– 32 registers provided

Compiler associates variables with registers.
What about programs with lots of variables? Must use memory.

Memory Organization

Viewed as a large, single-dimension array, with an address.

A memory address is an index into the array.

"Byte addressing" means that the index points to a byte of memory.

[Figure: memory drawn as a column of 32 bit words; each word holds four bytes of 8 bits of data, labeled byte 0, byte 1, byte 2, byte 3, byte 4, ..., byte 10, ...]

Memory Organization

Bytes are nice, but most data items use larger "words"

For MIPS, a word contains 32 bits or 4 bytes.

[Figure: word addresses 0, 4, 8, 12, ..., each holding 32 bits of data. Registers hold 32 bits of data. Use 32 bit address.]

2^32 bytes with addresses from 0 to 2^32 – 1

2^30 words with addresses 0, 4, 8, ..., 2^32 – 4

Words are aligned, i.e., what are the least 2 significant bits of a word address?
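The counts on this slide, and the answer to the alignment question, can be worked out directly. This is a minimal sketch assuming 32 bit byte addresses and 4 byte words as stated above; the helper names are illustrative.

```python
WORD_BYTES = 4
ADDRESS_BITS = 32

num_bytes = 2 ** ADDRESS_BITS               # 2^32 addressable bytes
num_words = num_bytes // WORD_BYTES         # 2^30 words
last_word_address = num_bytes - WORD_BYTES  # 2^32 - 4


def is_word_aligned(byte_address):
    """A word address is a multiple of 4, so its two lowest bits are 00."""
    return (byte_address & 0b11) == 0


def word_index(byte_address):
    """Convert an aligned byte address into a word number (divide by 4)."""
    if not is_word_aligned(byte_address):
        raise ValueError("address is not word aligned")
    return byte_address >> 2


print(num_words == 2 ** 30, last_word_address)
print(is_word_aligned(12), is_word_aligned(10), word_index(12))
```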

Instructions

Load and store instructions
Example:

    C code:      A[12] = h + A[8];

    MIPS code:   lw  $t0, 32($s3)     # addr of A in reg s3
                 add $t0, $s2, $t0    # h in reg s2
                 sw  $t0, 48($s3)

Can refer to registers by name (e.g., $s2, $t2) instead of number
Store word has destination last
Remember arithmetic operands are registers, not memory!

    Can’t write:  add 48($s3), $s2, 32($s3)
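The offsets 32 and 48 are the array indices 8 and 12 multiplied by the 4 byte word size. The sketch below mimics the three instructions on a toy memory; the dictionary-based memory model and the function name are assumptions made for illustration, and 32 bit overflow is ignored.

```python
WORD_BYTES = 4


def run_load_add_store(memory, base_address, h):
    """Mimic lw/add/sw for A[12] = h + A[8] on a dict of byte address -> word."""
    s3 = base_address                       # $s3 holds the address of A
    s2 = h                                  # $s2 holds h
    t0 = memory[s3 + 8 * WORD_BYTES]        # lw  $t0, 32($s3)
    t0 = s2 + t0                            # add $t0, $s2, $t0
    memory[s3 + 12 * WORD_BYTES] = t0       # sw  $t0, 48($s3)
    return memory


base = 1000
mem = {base + WORD_BYTES * i: 10 * i for i in range(16)}  # A[i] = 10 * i
run_load_add_store(mem, base, h=5)
print(mem[base + 48])  # A[12] after the store
```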

Our First Example

Can we figure out the code of subroutine?

Initially, k is in reg 5; base address of v is in reg 4; return addr is in reg 31

    swap(int v[], int k)
    {
        int temp;
        temp = v[k];
        v[k] = v[k+1];
        v[k+1] = temp;
    }

    swap:
        sll $2, $5, 2
        add $2, $4, $2
        lw  $15, 0($2)
        lw  $16, 4($2)
        sw  $16, 0($2)
        sw  $15, 4($2)
        jr  $31

What Happens?

    .
    .
    call swap
    .            ← return address
    .
    .

When the program reaches “call swap” statement:
– Jump to swap routine
    Registers 4 and 5 contain the arguments (register convention)
    Register 31 contains the return address (register convention)
– Swap two words in memory
– Jump back to return address to continue rest of the program

Memory and Registers

[Figure: memory and register contents on entry to swap.
Memory (byte addr.): Word 0 at 0, Word 1 at 4, Word 2 at 8, 12, ...; v[0] (Word n) at 4n; v[1] (Word n+1); ...; v[k] (Word n+k) at 4n+4k; v[k+1] (Word n+k+1); ...
Registers 0 to 31: Register 4 holds 4n, Register 5 holds k, Register 31 holds Ret. addr.]

Our First Example

Now figure out the code:

    swap(int v[], int k)
    {
        int temp;
        temp = v[k];
        v[k] = v[k+1];
        v[k+1] = temp;
    }

    swap:
        sll $2, $5, 2
        add $2, $4, $2
        lw  $15, 0($2)
        lw  $16, 4($2)
        sw  $16, 0($2)
        sw  $15, 4($2)
        jr  $31
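One way to convince yourself of what each line does is to trace the routine on a toy machine. The sketch below is not an assembler or a full simulator; it assumes a register file modeled as a list of 32 integers and memory modeled as a dictionary from byte address to word, both chosen only for illustration.

```python
def swap(memory, regs):
    """Trace the MIPS swap routine; returns the address that jr $31 jumps to."""
    regs[2] = regs[5] << 2            # sll $2, $5, 2    -> $2 = 4 * k
    regs[2] = regs[4] + regs[2]       # add $2, $4, $2   -> $2 = address of v[k]
    regs[15] = memory[regs[2] + 0]    # lw  $15, 0($2)   -> $15 = v[k]
    regs[16] = memory[regs[2] + 4]    # lw  $16, 4($2)   -> $16 = v[k+1]
    memory[regs[2] + 0] = regs[16]    # sw  $16, 0($2)   -> v[k] = old v[k+1]
    memory[regs[2] + 4] = regs[15]    # sw  $15, 4($2)   -> v[k+1] = old v[k]
    return regs[31]                   # jr  $31          -> back to the caller


regs = [0] * 32
regs[4] = 400     # base address of v (4n in the figure)
regs[5] = 3       # k
regs[31] = 1234   # return address
memory = {400 + 4 * i: value for i, value in enumerate([10, 20, 30, 40, 50])}

return_address = swap(memory, regs)
print([memory[400 + 4 * i] for i in range(5)], return_address)
```

The shift left by 2 is what turns the word index k into a byte offset, which is why the two loads and stores use offsets 0 and 4.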

So Far We’ve Learned

MIPS
– loading words but addressing bytes
– arithmetic on registers only

Instruction             Meaning
add $s1, $s2, $s3       $s1 = $s2 + $s3
sub $s1, $s2, $s3       $s1 = $s2 – $s3
lw  $s1, 100($s2)       $s1 = Memory[$s2+100]
sw  $s1, 100($s2)       Memory[$s2+100] = $s1

Machine Language

Instructions, like registers and words of data, are also 32 bits long
– Example: add $t1, $s1, $s2
– registers are numbered, $t1=9, $s1=17, $s2=18

Instruction Format:

    opcode   rs      rt      rd      shamt   funct
    000000   10001   10010   01001   00000   100000

Can you guess what the field names stand for?
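Packing the six R-type fields into one 32 bit word is a matter of shifting each field to its bit position. The sketch below uses the field widths 6, 5, 5, 5, 5, 6 and the register numbers from the register usage convention ($t1 = 9, $s1 = 17, $s2 = 18); the function name is illustrative.

```python
def encode_r_type(opcode, rs, rt, rd, shamt, funct):
    """Pack R-type fields: opcode(6) rs(5) rt(5) rd(5) shamt(5) funct(6)."""
    return (
        (opcode << 26)
        | (rs << 21)
        | (rt << 16)
        | (rd << 11)
        | (shamt << 6)
        | funct
    )


# add $t1, $s1, $s2: opcode 0, rs = $s1 = 17, rt = $s2 = 18, rd = $t1 = 9, funct 32
word = encode_r_type(0, 17, 18, 9, 0, 32)
bits = f"{word:032b}"
# Print the word split at the field boundaries.
print(bits[0:6], bits[6:11], bits[11:16], bits[16:21], bits[21:26], bits[26:32])
```

Note that the destination register, written first in assembly, ends up in the fourth field of the machine word.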

Violating Regularity for a Good Cause

[Figure: picture labeled Grand Central Station and Times Square.]

Machine Language

Consider the load-word and store-word instructions,
– What would the regularity principle have us do?
– New principle: Good design demands a compromise

Introduce a new type of instruction format
– I-type for data transfer instructions
– other format was R-type for register

Example: lw $t0, 32($s2)

    opcode   rs   rt   16 bit number
    35       18   8    32

Where's the compromise?
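The I-type layout keeps the first three fields in the same positions as R-type and merges the remaining 16 bits into one number. A minimal sketch, assuming the register numbers from the register usage convention ($t0 = 8, $s2 = 18) and a hypothetical helper name:

```python
def encode_i_type(opcode, rs, rt, immediate):
    """Pack I-type fields: opcode(6) rs(5) rt(5) immediate(16)."""
    # Masking keeps negative offsets in 16 bit two's complement form.
    return (opcode << 26) | (rs << 21) | (rt << 16) | (immediate & 0xFFFF)


# lw $t0, 32($s2): opcode 35, base register rs = $s2 = 18, rt = $t0 = 8
word = encode_i_type(35, 18, 8, 32)
bits = f"{word:032b}"
print(bits[0:6], bits[6:11], bits[11:16], bits[16:32])
```

The compromise is visible in the code: opcode, rs and rt are placed exactly as in R-type, so the hardware can decode them the same way, but the instruction still fits in one 32 bit word.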

Stored Program Concept

Instructions are bits

Programs are stored in memory to be read or written just like data

[Figure: Processor connected to Memory; memory for data, programs, compilers, editors, etc.]

Fetch and Execute Cycles
– Instructions are fetched and put into a special register
– Opcode bits in the register "control" the subsequent actions
– Fetch the “next” instruction and continue

Control

Decision making instructions
– alter the control flow,
– i.e., change the "next" instruction to be executed

MIPS conditional branch instructions:

    bne $t0, $t1, Label
    beq $t0, $t1, Label

Example:    if (i==j) h = i + j;

            bne $s0, $s1, Label
            add $s3, $s0, $s1
    Label:  ....

Control

MIPS unconditional branch instructions:

    j label

Example:

    if (i!=j)            beq $s4, $s5, Lab1
        h=i+j;           add $s3, $s4, $s5
    else                 j   Lab2
        h=i-j;     Lab1: sub $s3, $s4, $s5
                   Lab2: ...

Can you build a simple for loop?

So Far We’ve Learned

Instruction             Meaning
add $s1,$s2,$s3         $s1 = $s2 + $s3
sub $s1,$s2,$s3         $s1 = $s2 – $s3
lw  $s1,100($s2)        $s1 = Memory[$s2+100]
sw  $s1,100($s2)        Memory[$s2+100] = $s1
bne $s4,$s5,Label       Next instr. is at Label if $s4 ≠ $s5
beq $s4,$s5,Label       Next instr. is at Label if $s4 = $s5
j   Label               Next instr. is at Label

Formats:

    R    op   rs   rt   rd   shamt   funct
    I    op   rs   rt   16 bit address
    J    op   26 bit address

Three Ways to Jump: j, jr, jal

    j   instr    # jump to machine instruction instr (unconditional jump)

    jr  $ra      # jump to address in register ra (used by callee to go back to caller)

    jal addr     # set $ra = PC+4 and go to addr (jump and link; used to jump to a procedure)

Control Flow

We have: beq, bne, what about Branch-if-less-than?

New instruction:
                                if $s1 < $s2 then
    slt $t0, $s1, $s2               $t0 = 1
                                else
                                    $t0 = 0

Can use this instruction to build new “pseudoinstruction”

    blt $s1, $s2, Label

Note that the assembler needs a register to do this,
– there are policy of use conventions for registers

Pseudoinstructions

    blt $s1, $s2, reladdr

Assembler converts to:

    slt $1, $s1, $s2
    bne $1, $zero, reladdr

Other pseudoinstructions: bgt, ble, bge, li, move
Not implemented in hardware
Assembler expands pseudoinstructions into machine instructions
Register 1, called $at, is reserved for converting pseudoinstructions into machine code.
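The expansion an assembler performs can be sketched as a small text rewriting function. The blt and move rules follow the slides; the bgt, ble and bge rules are one common way to build them from slt, shown here as an assumption rather than as the output of any particular assembler. The function name is hypothetical, and li is omitted.

```python
def expand_pseudo(op, a, b, label=None):
    """Return real MIPS instructions for a few pseudoinstructions."""
    if op == "move":                      # move a, b
        return [f"add {a}, {b}, $zero"]
    # Each rule: operands for slt, then the branch that tests $at against $zero.
    rules = {
        "blt": ((a, b), "bne"),   # taken when a < b
        "bgt": ((b, a), "bne"),   # taken when b < a
        "ble": ((b, a), "beq"),   # taken when not (b < a)
        "bge": ((a, b), "beq"),   # taken when not (a < b)
    }
    (x, y), branch = rules[op]
    return [f"slt $at, {x}, {y}", f"{branch} $at, $zero, {label}"]


for line in expand_pseudo("blt", "$s1", "$s2", "reladdr"):
    print(line)
```

Every branch expansion overwrites $at, which is why that register is reserved for the assembler and should not hold program data.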

Policy of Register Usage (Conventions)

Name        Register number    Usage
$zero       0                  the constant value 0
$v0-$v1     2-3                values for results and expression evaluation
$a0-$a3     4-7                arguments
$t0-$t7     8-15               temporaries
$s0-$s7     16-23              saved
$t8-$t9     24-25              more temporaries
$gp         28                 global pointer
$sp         29                 stack pointer
$fp         30                 frame pointer
$ra         31                 return address

Register 1 ($at) reserved for assembler, 26-27 for operating system
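When encoding instructions by hand it helps to have the convention as a lookup table. The sketch below builds the name to number mapping from the ranges listed above; the names $k0 and $k1 for registers 26 and 27 do not appear on the slide and are the usual MIPS names for the operating system registers.

```python
def build_register_numbers():
    """Map MIPS register names to numbers following the usage convention."""
    numbers = {"$zero": 0, "$at": 1, "$k0": 26, "$k1": 27,
               "$gp": 28, "$sp": 29, "$fp": 30, "$ra": 31}
    # (prefix, first index in the name, how many, first register number)
    ranges = [
        ("$v", 0, 2, 2),    # $v0-$v1 -> 2-3
        ("$a", 0, 4, 4),    # $a0-$a3 -> 4-7
        ("$t", 0, 8, 8),    # $t0-$t7 -> 8-15
        ("$s", 0, 8, 16),   # $s0-$s7 -> 16-23
        ("$t", 8, 2, 24),   # $t8-$t9 -> 24-25
    ]
    for prefix, first_index, count, first_number in ranges:
        for i in range(count):
            numbers[f"{prefix}{first_index + i}"] = first_number + i
    return numbers


REGISTERS = build_register_numbers()
print(REGISTERS["$t0"], REGISTERS["$t1"], REGISTERS["$s1"], REGISTERS["$s2"])
```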

Constants

Small constants are used quite frequently (50% of operands)

    e.g.,   A = A + 5;
            B = B + 1;
            C = C - 18;

Solutions? Why not?
– put 'typical constants' in memory and load them.
– create hard-wired registers (like $zero) for constants like one.

MIPS Instructions:

    addi $29, $29, 4
    slti $8, $18, 10
    andi $29, $29, 6
    ori  $29, $29, 4

Design Principle: Make the common case fast. Which format?

How About Larger Constants?

We'd like to be able to load a 32 bit constant into a register

Must use two instructions, new "load upper immediate" instruction

    lui $t0, 1010101010101010

    $t0:   1010101010101010  0000000000000000     (lower half filled with zeros)

Then must get the lower order bits right, i.e.,

    ori $t0, $t0, 1010101010101010

           1010101010101010  0000000000000000
    ori    0000000000000000  1010101010101010
           ----------------------------------
           1010101010101010  1010101010101010
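The two step construction can be sketched with plain integer operations. The functions below model only the register result of lui and ori, under the assumption that ori zero-extends its 16 bit immediate as the slide's bit pattern shows; the names are illustrative.

```python
def lui(immediate16):
    """Load upper immediate: place 16 bits in the upper half, zeros below."""
    return (immediate16 & 0xFFFF) << 16


def ori(register_value, immediate16):
    """OR immediate: the 16 bit immediate is zero-extended before the OR."""
    return register_value | (immediate16 & 0xFFFF)


def load_constant(value32):
    """Build a 32 bit constant with lui followed by ori."""
    upper = (value32 >> 16) & 0xFFFF
    lower = value32 & 0xFFFF
    return ori(lui(upper), lower)


t0 = lui(0b1010101010101010)
print(f"{t0:032b}")
t0 = ori(t0, 0b1010101010101010)
print(f"{t0:032b}")
```

Using ori rather than addi for the lower half matters: addi sign-extends its immediate, which would disturb the upper half whenever bit 15 of the constant is 1.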

Assembly Language vs. Machine Language

Assembly provides convenient symbolic representation
– much easier than writing down numbers
– e.g., destination first

Machine language is the underlying reality
– e.g., destination is no longer first

Assembly can provide 'pseudoinstructions'
– e.g., “move $t0, $t1” exists only in Assembly
– implemented using “add $t0, $t1, $zero”

When considering performance you should count real instructions and clock cycles

Overview of MIPS

– simple instructions, all 32 bits wide
– very structured, no unnecessary baggage
– only three instruction formats

    R    op   rs   rt   rd   shamt   funct
    I    op   rs   rt   16 bit address
    J    op   26 bit address

– rely on compiler to achieve performance

Addresses in Branches and Jumps

Instructions:

    bne $t4, $t5, Label     Next instruction is at Label if $t4 ≠ $t5
    beq $t4, $t5, Label     Next instruction is at Label if $t4 = $t5
    j Label                 Next instruction is at Label

Formats:

    I    op   rs   rt   16 bit rel. address
    J    op   26 bit absolute address

Addresses in Branches

Instructions:

    bne $t4,$t5,Label     Next instruction is at Label if $t4 ≠ $t5
    beq $t4,$t5,Label     Next instruction is at Label if $t4 = $t5

Formats:

    op   rs   rt   16 bit address       –2^15 to 2^15 – 1 ~ ±32 Kwords
    op   26 bit address                 2^26 = 64 Mwords

Relative addressing
– with respect to PC (program counter)
– most branches are local (principle of locality)

Jump instruction just uses high order bits of PC
– address boundaries of 256 MBytes (maximum jump 64 Mwords)
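The reach of each format follows from its field width. The sketch below computes both limits and checks whether a target is reachable; it assumes byte addresses that are multiples of 4 and takes the address of the following instruction (PC + 4) as the reference point. The helper names are hypothetical.

```python
WORD_BYTES = 4
BRANCH_BITS = 16
JUMP_BITS = 26

BRANCH_MIN_WORDS = -(2 ** (BRANCH_BITS - 1))        # -2^15
BRANCH_MAX_WORDS = 2 ** (BRANCH_BITS - 1) - 1       # 2^15 - 1
JUMP_REGION_BYTES = (2 ** JUMP_BITS) * WORD_BYTES   # 64 Mwords = 256 MBytes


def branch_reachable(pc_plus_4, target):
    """True if the word offset from PC + 4 fits in a signed 16 bit field."""
    offset_words = (target - pc_plus_4) // WORD_BYTES
    return BRANCH_MIN_WORDS <= offset_words <= BRANCH_MAX_WORDS


def jump_reachable(pc_plus_4, target):
    """True if the target lies in the same 256 MByte region as PC + 4."""
    return (pc_plus_4 >> 28) == (target >> 28)


print(BRANCH_MIN_WORDS, BRANCH_MAX_WORDS, JUMP_REGION_BYTES)
print(branch_reachable(80016, 80024), jump_reachable(80024, 80000))
```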

Example: Loop in C (p. 74)

    while ( save[i] == k )
        i += 1;

Given a value for k, set i to the index of the first element in array save[ ] that does not equal k.

MIPS Code for While Loop

Compiler assigns variables to registers:

    $s3 (reg 19) ← i, initially 0
    $s5 (reg 21) ← k
    $s6 (reg 22) ← memory address where save[ ] begins

Then generates the following assembly code:

    Loop: sll  $t1, $s3, 2       # Temp reg $t1 = 4 * i
          add  $t1, $t1, $s6     # $t1 = address of save[i]
          lw   $t0, 0($t1)       # Temp reg $t0 = save[i]
          bne  $t0, $s5, Exit    # go to Exit if save[i] ≠ k
          addi $s3, $s3, 1       # i = i + 1
          j    Loop              # go to Loop
    Exit:

Machine Code and Mem. Addresses

Memory               Machine code
Byte addr.           Bits 31-26 | 25-21 | 20-16 | 15-11 | 10-6 | 5-0

80000     sll         0      0     19      9      2      0
80004     add         0      9     22      9      0     32
80008     lw         35      9      8      0
80012     bne         5      8     21      Exit = +2
80016     addi        8     19     19      1
80020     j           2      Loop = 20000 (memory word address)
80024     . . . . .

Note: $t0 ≡ Reg 8 (temp), $t1 ≡ Reg 9 (temp), $s3 ≡ Reg 19 (i), $s5 ≡ Reg 21 (k), $s6 ≡ Reg 22 (save)

Finding Branch Address Exit

Exit = +2 is a 16 bit integer in bne instruction
    000101 01000 10101 0000000000000010 = 2
$PC = 80016 is the byte address of the next instruction
    00000000000000010011100010010000 = 80016
Multiply bne argument by 4 (convert to byte address)
    0000000000001000 = 8
$PC ← $PC + 8
    00000000000000010011100010011000 = 80024
Thus, Exit is memory byte address 80024.
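The same steps can be written as a short function. This sketch assumes the branch field is a signed 16 bit word offset relative to the address of the next instruction, as in the calculation above; the function names are illustrative.

```python
def sign_extend_16(value):
    """Interpret the low 16 bits as a two's complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def branch_target(branch_address, immediate16):
    """Byte address reached when a branch at branch_address is taken."""
    pc = branch_address + 4                         # address of the next instruction
    byte_offset = sign_extend_16(immediate16) * 4   # words -> bytes
    return (pc + byte_offset) & 0xFFFFFFFF


# bne at byte address 80012 with field value +2
print(branch_target(80012, 2))
```

A negative field value branches backward; for example, the field 0xFFFC (that is, -4 words) in the same bne would lead back to 80000.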

Finding Jump Address Loop

J 20000
    000010 00000000000100111000100000 = 20000
$PC = 80024, when jump is being executed
    0000 0000000000010011100010011000 = 80024
Multiply J argument by 4 (convert to byte address)
    0000000000010011100010000000 = 80000
Insert four leading bits from $PC
    0000 0000000000010011100010000000 = 80000
Thus, Loop is memory byte address 80000.
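The pseudodirect calculation combines the top four bits of the PC with the 26 bit field shifted left by two. A minimal sketch, with a hypothetical function name and the PC taken as the address of the instruction after the jump:

```python
def jump_target(jump_address, address26):
    """Byte address reached by a j instruction located at jump_address."""
    pc = jump_address + 4                       # $PC while the jump executes
    upper_four_bits = pc & 0xF0000000           # bits 31-28 come from the PC
    byte_address = (address26 & 0x3FFFFFF) << 2 # 26 bit word address -> 28 bit byte address
    return upper_four_bits | byte_address


# j at byte address 80020 with field value 20000
print(jump_target(80020, 20000))
```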

Summary: MIPS Registers and Memory

32 registers: $s0-$s7, $t0-$t9, $zero, $a0-$a3, $v0-$v1, $gp, $fp, $sp, $ra, $at
    Fast locations for data. In MIPS, data must be in registers to perform arithmetic. MIPS register $zero always equals 0. Register $at is reserved for the assembler to handle large constants.

2^30 memory words: Memory[0], Memory[4], ..., Memory[4294967292]
    Accessed only by data transfer instructions. MIPS uses byte addresses, so sequential words differ by 4. Memory holds data structures, such as arrays, and spilled registers, such as those saved on procedure calls.

Summary: MIPS Instructions

MIPS assembly language

Category             Instruction               Example               Meaning                                 Comments
Arithmetic           add                       add $s1, $s2, $s3     $s1 = $s2 + $s3                         Three operands; data in registers
                     subtract                  sub $s1, $s2, $s3     $s1 = $s2 - $s3                         Three operands; data in registers
                     add immediate             addi $s1, $s2, 100    $s1 = $s2 + 100                         Used to add constants
Data transfer        load word                 lw $s1, 100($s2)      $s1 = Memory[$s2 + 100]                 Word from memory to register
                     store word                sw $s1, 100($s2)      Memory[$s2 + 100] = $s1                 Word from register to memory
                     load byte                 lb $s1, 100($s2)      $s1 = Memory[$s2 + 100]                 Byte from memory to register
                     store byte                sb $s1, 100($s2)      Memory[$s2 + 100] = $s1                 Byte from register to memory
                     load upper immediate      lui $s1, 100          $s1 = 100 * 2^16                        Loads constant in upper 16 bits
Conditional branch   branch on equal           beq $s1, $s2, 25      if ($s1 == $s2) go to PC + 4 + 100      Equal test; PC-relative branch
                     branch on not equal       bne $s1, $s2, 25      if ($s1 != $s2) go to PC + 4 + 100      Not equal test; PC-relative
                     set on less than          slt $s1, $s2, $s3     if ($s2 < $s3) $s1 = 1; else $s1 = 0    Compare less than; for beq, bne
                     set less than immediate   slti $s1, $s2, 100    if ($s2 < 100) $s1 = 1; else $s1 = 0    Compare less than constant
Unconditional jump   jump                      j 2500                go to 10000                             Jump to target address
                     jump register             jr $ra                go to $ra                               For switch, procedure return
                     jump and link             jal 2500              $ra = PC + 4; go to 10000               For procedure call

Addressing Modes

1. Immediate addressing (example: addi)
       op | rs | rt | Immediate
2. Register addressing (example: add)
       op | rs | rt | rd | . . . | funct        operand is in a register
3. Base addressing (example: lw, sw)
       op | rs | rt | Address                   register + Address selects a byte, halfword, or word in memory
4. PC-relative addressing (example: beq, bne)
       op | rs | rt | Address                   PC + Address selects a word in memory
5. Pseudodirect addressing (example: j)
       op | Address                             Address combined with the PC selects a word in memory

Alternative Architectures

Design alternative:
– provide more powerful operations
– goal is to reduce number of instructions executed
– danger is a slower cycle time and/or a higher CPI

“The path toward operation complexity is thus fraught with peril. To avoid these problems, designers have moved toward simpler instructions”

Let’s look (briefly) at IA-32


IA-32 (a.k.a. x86)

1978: The Intel 8086 is announced (16-bit architecture)
1980: The 8087 floating-point coprocessor is added
1982: The 80286 increases address space to 24 bits, + instructions
1985: The 80386 extends to 32 bits, new addressing modes
1989-1995: The 80486, Pentium, Pentium Pro add a few instructions (mostly designed for higher performance)
1997: 57 new “MMX” instructions are added, Pentium II
1999: The Pentium III added another 70 instructions (SSE, streaming SIMD extensions)
2001: Another 144 instructions (SSE2)
2003: AMD extends the architecture to increase address space to 64 bits, widens all registers to 64 bits and makes other changes (AMD64)
2004: Intel capitulates and embraces AMD64 (calls it EM64T) and adds more media extensions

This history illustrates the impact of the “golden handcuffs” of compatibility:
“adding new features as someone might add clothing to a packed bag”
“an architecture that is difficult to explain and impossible to love”

2004 © Morgan Kaufmann Publishers


IA-32 Overview

Complexity:
– Instructions from 1 to 17 bytes long
– one operand must act as both a source and destination
– one operand can come from memory
– complex addressing modes, e.g., “base or scaled index with 8 or 32 bit displacement”

Saving grace:
– the most frequently used instructions are not too difficult to build
– compilers avoid the portions of the architecture that are slow

“what the x86 lacks in style is made up in quantity, making it beautiful from the right perspective”

2004 © Morgan Kaufmann Publishers


IA-32 Registers

Registers in the 32-bit subset that originated with 80386 (bits numbered 31 down to 0 in the figure)

Name | Use
EAX | GPR 0
ECX | GPR 1
EDX | GPR 2
EBX | GPR 3
ESP | GPR 4
EBP | GPR 5
ESI | GPR 6
EDI | GPR 7
CS | Code segment pointer
SS | Stack segment pointer (top of stack)
DS | Data segment pointer 0
ES | Data segment pointer 1
FS | Data segment pointer 2
GS | Data segment pointer 3
EIP | Instruction pointer (PC)
EFLAGS | Condition codes

EAX through EDI are the eight general-purpose registers.

2004 © Morgan Kaufmann Publishers


IA-32 Register Restrictions

Fourteen major registers.

Eight 32-bit general-purpose registers.

ESP or EBP cannot contain memory address.

ESP cannot contain displacement from base address.

. . .

See Figure 2.38, page 154 (Fifth Edition).

2004 © Morgan Kaufmann Publishers


IA-32 Typical Instructions

Four major types of integer instructions:
– Data movement including move, push, pop
– Arithmetic and logical (destination register or memory)
– Control flow (use of condition codes / flags)
– String instructions, including string move and string compare

2004 © Morgan Kaufmann Publishers

Some IA-32 Instructions

PUSH: 5-bit opcode, 3-bit register operand

5-b | 3-b

JE: 4-bit opcode, 4-bit condition, 8-bit jump offset

4-b | 4-b | 8-b
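The two layouts above can be packed into bytes with a short Python sketch. The field widths come from the slide, and the register numbers follow the GPR numbering on the IA-32 Registers slide. The opcode and condition bit patterns (01010, 0111, 0100) are not on the slide; they are taken from the standard IA-32 encodings and should be read as assumptions here. The function names are hypothetical.

```python
# Register numbers follow the GPR numbering on the IA-32 Registers slide.
REG = {"EAX": 0, "ECX": 1, "EDX": 2, "EBX": 3,
       "ESP": 4, "EBP": 5, "ESI": 6, "EDI": 7}

PUSH_OPCODE = 0b01010  # 5-bit opcode (assumption: not stated on the slide)
JCC_OPCODE = 0b0111    # 4-bit opcode of short conditional jumps (assumption)
COND_EQUAL = 0b0100    # 4-bit condition field for "equal" (assumption)


def encode_push(reg_name):
    """Pack 5-bit opcode | 3-bit register into one byte."""
    return bytes([(PUSH_OPCODE << 3) | REG[reg_name]])


def encode_je(offset):
    """Pack 4-bit opcode | 4-bit condition, then a signed 8-bit jump offset."""
    if not -128 <= offset <= 127:
        raise ValueError("offset must fit in 8 signed bits")
    return bytes([(JCC_OPCODE << 4) | COND_EQUAL, offset & 0xFF])


print(encode_push("ECX").hex())
print(encode_je(-2).hex())
```

PUSH fits in a single byte and JE in two, which is the short end of the 1-to-17-byte range mentioned in the IA-32 Overview.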

Some IA-32 Instructions

MOV: 6-bit opcode, 8-bit register/mode*, 8-bit offset

6-b | d | w | 8-b | 8-b

d bit indicates move to or from memory

w bit indicates byte or double word operation

XOR: 8-bit opcode, 8-bit reg/mode*, 8-bit base, 8-bit index

8-b | 8-b | 8-b | 8-b

*8-bit register/mode: See Figure 2.42, page 158 (Fifth Edition).

Some IA-32 Instructions

ADD: 4-bit opcode, 3-bit register, 32-bit immediate

4-b | 3-b | w | 32-b

TEST: 7-bit opcode, 8-bit reg/mode, 32-bit immediate

7-b | w | 8-b | 32-b


Additional References

IA-32, IA-64 (CISC)
A. S. Tanenbaum, Structured Computer Organization, Fifth Edition, Upper Saddle River, New Jersey: Pearson Prentice-Hall, 2006, Chapter 5.

ARM (RISC)
D. Seal, ARM Architecture Reference Manual, Second Edition, Addison-Wesley Professional, 2000.

SPARC (Scalable Processor Architecture)

PowerPC
V. C. Hamacher, Z. G. Vranesic and S. G. Zaky, Computer Organization, Fourth Edition, New York: McGraw-Hill, 1996.

Instruction Complexity

[Figure: three curves plotted against increasing instruction complexity (horizontal axis): program size in machine instructions (P), average execution time per instruction (T), and the product P×T.]
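The product P×T is the total execution time of the program. A minimal sketch with made-up numbers follows. It assumes that P falls and T rises as instructions get more complex, which is the trade-off named on the Alternative Architectures slide. None of the values are measurements, and T is in arbitrary integer units.

```python
# Hypothetical data, invented only to illustrate the trade-off.
# Each entry: (complexity level, P = instructions executed, T = time per instruction)
designs = [
    (1, 1000, 10),
    (2, 700, 12),
    (3, 550, 16),
    (4, 480, 22),
    (5, 450, 30),
]


def total_time(p, t):
    """Execution time = instruction count times average time per instruction."""
    return p * t


# Pick the design point with the smallest product P x T.
best = min(designs, key=lambda d: total_time(d[1], d[2]))

for level, p, t in designs:
    print(level, p, t, total_time(p, t))
print("lowest P x T at complexity level", best[0])
```

With these numbers the minimum is at an intermediate level: neither the simplest nor the most complex instruction set gives the shortest run time.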

URISC: The Other Extreme

Instruction set has a single instruction:

label: urisc dest, src1, target

Subtract operand 1 from operand 2, replace operand 2 with the result, and jump to target address if the result is negative.

See B. Parhami, Computer Architecture, from Microprocessors to Supercomputers, New York: Oxford, 2005, pp. 151-153.
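A tiny interpreter shows how one instruction can do useful work. This is a sketch under assumptions, not the textbook's definition. It reads dest as the operand that is subtracted from and overwritten (operand 2 in the wording above) and src1 as operand 1. Instructions are kept in a separate list, targets are list indices, and the program halts when control leaves the list.

```python
def run_urisc(program, memory, max_steps=1000):
    """Run a list of (dest, src1, target) triples against a list of memory words."""
    pc = 0
    steps = 0
    while 0 <= pc < len(program):
        if steps >= max_steps:
            raise RuntimeError("step limit reached")
        dest, src1, target = program[pc]
        memory[dest] -= memory[src1]                      # dest = dest - src1
        pc = target if memory[dest] < 0 else pc + 1       # jump only if negative
        steps += 1
    return memory


# Add memory[1] into memory[0], using memory[2] as a temporary.
program = [
    (2, 2, 1),  # temp = temp - temp = 0
    (2, 1, 2),  # temp = 0 - src; the next instruction runs either way
    (0, 2, 3),  # dest = dest - temp = dest + src; index 3 is past the end
]
memory = [5, 7, 99]
run_urisc(program, memory)
print(memory[0])   # 5 + 7
```

Addition takes three instructions here, which is the flip side of the IA-32 slides: the simplest possible instruction set gives the largest instruction count.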


Summary

Instruction complexity is only one variable
– lower instruction count vs. higher CPI / lower clock rate (we will see performance measures later)

Design Principles:
– simplicity favors regularity
– smaller is faster
– good design demands compromise
– make the common case fast

Instruction set architecture
– a very important abstraction indeed!

Links to some instruction sets – next slide.

2004 © Morgan Kaufmann Publishers

Some Instruction Sets

MIPS
http://www.d.umn.edu/~gshute/mips/MIPS.html

ARM
http://simplemachines.it/doc/arm_inst.pdf

IA32/64
http://brokenthorn.com/Resources/OSDevX86.html

PowerPC
http://pds.twi.tudelft.nl/vakken/in101/labcourse/instruction-set/

SPARC
http://www.cs.unm.edu/~maccabe/classes/341/labman/node9


Preview: Project – to be assigned

Part 1 – Design an instruction set for a 16-bit processor.

The ISA may contain no more than 16 unique instructions. However, you may have multiple formats for a given type of instruction, if necessary.

Of the 16 instructions, at least one instruction should make your processor HALT.

The ISA is to support 16-bit data words only. (No byte operands.) All operands are to be 16-bit signed integers (2’s complement). Each instruction must be encoded using one 16-bit word.


Project Preview (Cont.)

The ISA is to support linear addressing of a memory of 1K 16-bit words. The memory is to be word-addressable only, not byte-addressable.

The ISA should contain appropriate numbers and types of user-programmable registers to support it. Since this is a small processor, the hardware does not necessarily need to support dedicated registers for stack pointer, frame pointer, etc.

The ISA must “support” C Programming Language constructs.

Control flow structures: “if-else” structures, “while” loops, “for” loops.

Functions (call and return).
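The size limits in the project preview fix a bit budget, which a few lines of Python can work out. The encode function shows one hypothetical layout (opcode in the top bits, a direct memory address in the low bits) purely to illustrate the arithmetic. It is an assumption, not a required or recommended format.

```python
WORD_BITS = 16          # each instruction is one 16-bit word
MAX_INSTRUCTIONS = 16   # at most 16 unique instructions
MEMORY_WORDS = 1024     # 1K words, word-addressable

opcode_bits = (MAX_INSTRUCTIONS - 1).bit_length()     # bits to tell 16 instructions apart
address_bits = (MEMORY_WORDS - 1).bit_length()        # bits to address 1K words
spare_bits = WORD_BITS - opcode_bits - address_bits   # left over in a direct-address format

# Range of a 16-bit two's complement operand
min_value = -(1 << (WORD_BITS - 1))
max_value = (1 << (WORD_BITS - 1)) - 1


def encode(opcode, address):
    """Hypothetical format: opcode in the top 4 bits, address in the low 10 bits."""
    if not 0 <= opcode < MAX_INSTRUCTIONS:
        raise ValueError("opcode out of range")
    if not 0 <= address < MEMORY_WORDS:
        raise ValueError("address out of range")
    return (opcode << (WORD_BITS - opcode_bits)) | address


print(opcode_bits, address_bits, spare_bits)
print(min_value, max_value)
print(hex(encode(0xF, 1023)))
```

An opcode plus a full memory address leaves only 2 bits in such a format, so a register field and a direct address compete for the same word. This is one place where "good design demands compromise" applies.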
