ECE2030 Introduction to Computer Engineering

Lecture 18: Instruction Set Architecture

Prof. Hsien-Hsin Sean Lee

School of Electrical and Computer Engineering

Georgia Tech

2

Breakdown of a Computing Problem

(Figure: layers of abstraction, top to bottom)

• Problem
• Algorithms
• Programming in High-Level Language
• Compiler/Assembler/Linker
• Instruction Set Architecture (ISA)
• Target Machine (one implementation): System architecture, Micro-architecture
• Functional units/Data Path (RTL Level)
• Gates Level Design (Logic Level)
• Transistors (Circuit Level)
• Manufacturing (Silicon Level)

Side labels in the figure: Human Level, System Level

3

Instruction Set Architecture (ISA)

• An abstraction
– Interface between hardware and low-level software
– Relieves programmers of specifying control signals to drive a machine

• Defined by
– An instruction set
– Software conventions

• Independent of any specific internal implementation (microarchitecture + system architecture)

4

ISA design principles

• Compatibility
• Implementability
• Programmability
• Usability
• Encoding efficiency

High Level Language → Compiler → ISA → Assembler → Binary code

High Level Language:

    main() { int i,b,c,a[10]; for (i=0; i<10; i++)… a[2] = b + c*i;}

ISA:

    …
    lw r2, mem[r7]
    add r3, r4, r2
    st r3, mem[r8]

5

General Purpose Computer

Central Processing Unit (CPU)

Memory: Data & Instruction

0101 1001 1010 1001 1000 0100 1000 1110 1111 0011 0010 1011 1000 …… ……

Von Neumann Machine

• A stored-program computer called EDVAC was proposed in 1944 during the development of ENIAC, the first general-purpose computer
• Contributors: Presper Eckert, John Mauchly, John von Neumann
• EDSAC, built by Maurice Wilkes, was among the first operational stored-program machines

6

Basic Operation

1000 1100 1110 0010 0000 0000 0000 0000 (= lw R2, mem[R7])

MICROPROCESSOR

• Instruction fetch from memory
• Instruction Decoder/Microcode ROM
• Datapath Unit
• Data written back to memory

It is called a Harvard Architecture (Mark-III/IV) if instruction and data memory are separated.
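The decode step can be illustrated in software. The following is a minimal sketch, assuming the MIPS I-format field layout shown later in this lecture (opcode in bits 31-26, rs in 25-21, rt in 20-16, a 16-bit immediate in 15-0). The function name `decode_fields` is hypothetical and chosen only for this example.

```python
def decode_fields(word):
    """Split a 32-bit MIPS instruction word into I-format fields."""
    opcode = (word >> 26) & 0x3F   # bits 31-26
    rs = (word >> 21) & 0x1F       # bits 25-21
    rt = (word >> 16) & 0x1F       # bits 20-16
    imm = word & 0xFFFF            # bits 15-0
    if imm & 0x8000:               # sign-extend the 16-bit immediate
        imm -= 0x10000
    return opcode, rs, rt, imm

# The bit pattern from the slide, fetched as one 32-bit word.
bits = "1000 1100 1110 0010 0000 0000 0000 0000"
word = int(bits.replace(" ", ""), 2)

opcode, rs, rt, imm = decode_fields(word)
print(f"word=0x{word:08X} opcode=0x{opcode:02X} rs={rs} rt={rt} imm={imm}")
```

The decoded fields give opcode 0x23 (lw), rs = 7, rt = 2 and an immediate of 0, which matches the slide's reading of the word as lw R2, mem[R7].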

7

Commercial ISA

• CDC 6600, IBM 360, DEC VAX (good old days; the 360 is now IBM z-series)
• x86 (Intel 32, Intel 64, AMD64), Itanium (IA-64)
• Sun SPARC
• XScale (PocketPC)
• IBM PowerPC (Mac, BlueGene)
• ARM, MIPS (embedded; MIPS was once popular in workstations)

8

Basic Instruction Format (Assembly code)

• ISA defines a set of “architectural registers” to avoid going to memory all the time
– x86: EAX, EBX, ECX, EDX, ESI, EDI, EBP, ESP
– MIPS: r0 to r31 and Hi, Lo (sometimes aliases are used to show the software convention for these registers)

• Main instruction classes
– Arithmetic / Logical
– Data transfer (load or store for different data sizes)
– Change-of-flow
    • Conditional branches
    • Unconditional branches (e.g., jumps, subroutine calls)

• Operands
– Architectural registers
– Memory addresses
– Target address for change-of-flow

<instruction mnemonic> <destination operand>, <source op>, <source op>

9

MIPS Register Aliases

Register     Name         Usage by Software Convention
$0           $zero        Hardwired to zero
$1           $at          Reserved by assembler
$2 - $3      $v0 - $v1    Function return result registers
$4 - $7      $a0 - $a3    Function argument passing registers
$8 - $15     $t0 - $t7    Temporary registers, caller saved
$16 - $23    $s0 - $s7    Saved registers, callee saved
$24 - $25    $t8 - $t9    Temporary registers, caller saved
$26 - $27    $k0 - $k1    Reserved for OS kernel
$28          $gp          Global pointer
$29          $sp          Stack pointer
$30          $fp          Frame pointer
$31          $ra          Return address (written by the call instruction)
$hi          $hi          High result register (remainder/div, high word/mult)
$lo          $lo          Low result register (quotient/div, low word/mult)

10

Basic Instruction Format (Assembly code)

<instruction mnemonic> <destination operand>, <source op>, <source op>

Operation                      MIPS assembly
R8 = R6 + R7                   add $8, $6, $7   or   add $t0, $a2, $a3
R9 = R9 + 2004                 addi $9, $9, 2004
R3 = R4 ⊕ R5                   xor $3, $4, $5
R10 = R8 << R9                 sllv $10, $8, $9
R24 = R15 >> 2                 sra $24, $15, 2   (arithmetic right shift)
R2 = mem[R3+100]               lw $2, 100($3)
mem[R3+100] = R2               sw $2, 100($3)
if (R2<R3) R4=1 else R4=0      slt $4, $2, $3
Procedure call                 jal _func   ($31 = PC+4; go to the address of label _func, assuming no delay slot)
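The semantics of a few of these operations can be checked with a small model. This is a minimal sketch, assuming 32-bit registers held as unsigned Python integers; the helper names (`to_signed`, `sllv`, `sra`, `srl`, `slt`) are hypothetical, and `srl` (logical right shift) is not in the table but is included only to contrast with `sra`.

```python
MASK = 0xFFFFFFFF  # registers are modeled as 32-bit unsigned values

def to_signed(x):
    """Interpret a 32-bit pattern as a two's-complement integer."""
    x &= MASK
    return x - (1 << 32) if x & 0x80000000 else x

def sllv(rt, rs):
    """Shift left by a register amount; only the low 5 bits of rs are used."""
    return (rt << (rs & 0x1F)) & MASK

def sra(rt, shamt):
    """Arithmetic right shift: the sign bit is replicated."""
    return (to_signed(rt) >> shamt) & MASK

def srl(rt, shamt):
    """Logical right shift: zeros are shifted in (shown for contrast)."""
    return (rt & MASK) >> shamt

def slt(rs, rt):
    """Set on less than, using a signed comparison."""
    return 1 if to_signed(rs) < to_signed(rt) else 0

minus_eight = 0xFFFFFFF8
print(hex(sra(minus_eight, 2)))   # 0xfffffffe, i.e. -2
print(hex(srl(minus_eight, 2)))   # 0x3ffffffe
print(slt(0xFFFFFFFF, 1))         # 1, because -1 < 1 as signed values
```

The arithmetic shift keeps a negative value negative, which is why the table marks sra explicitly as an arithmetic right shift.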

11

MIPS R-format Encoding

Bits      31-26     25-21    20-16    15-11    10-6     5-0
Field     opcode    rs       rt       rd       shamt    funct

add $4, $3, $2   (rd = $4, rs = $3, rt = $2)

          000000    00011    00010    00100    00000    100000

Encoding = 0x00622020

12

MIPS R-format Encoding

Bits      31-26     25-21    20-16    15-11    10-6     5-0
Field     opcode    rs       rt       rd       shamt    funct

sll $3, $5, 7   (rd = $3, rt = $5, shamt = 7)

          000000    00000    00101    00011    00111    000000

Encoding = 0x000519C0
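The two R-format examples (add and sll) can be reproduced by packing the six fields with shifts. A minimal sketch, assuming the field positions from the R-format slides; the function name `encode_r` and the constant names are hypothetical, and the funct values are the standard MIPS ones for add (0x20) and sll (0x00).

```python
def encode_r(rs, rt, rd, shamt, funct, opcode=0):
    """Pack R-format fields into one 32-bit instruction word."""
    for name, value, width in (("opcode", opcode, 6), ("rs", rs, 5),
                               ("rt", rt, 5), ("rd", rd, 5),
                               ("shamt", shamt, 5), ("funct", funct, 6)):
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
    return ((opcode << 26) | (rs << 21) | (rt << 16)
            | (rd << 11) | (shamt << 6) | funct)

FUNCT_ADD = 0x20  # 100000
FUNCT_SLL = 0x00  # 000000

# add $4, $3, $2  ->  rd = 4, rs = 3, rt = 2
add_word = encode_r(rs=3, rt=2, rd=4, shamt=0, funct=FUNCT_ADD)
# sll $3, $5, 7   ->  rd = 3, rt = 5, shamt = 7 (rs is unused and left 0)
sll_word = encode_r(rs=0, rt=5, rd=3, shamt=7, funct=FUNCT_SLL)

print(f"0x{add_word:08X}")  # 0x00622020
print(f"0x{sll_word:08X}")  # 0x000519C0
```

Note that the assembly operand order (destination first) differs from the field order in the encoded word (rs, rt, then rd).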

13

MIPS I-format Encoding

Bits      31-26     25-21    20-16    15-0
Field     opcode    rs       rt       Immediate Value

lw $5, 3000($2)   (rt = $5, rs = $2, Immediate = 3000)

          100011    00010    00101    0000101110111000

Encoding = 0x8C450BB8

14

MIPS I-format Encoding

Bits      31-26     25-21    20-16    15-0
Field     opcode    rs       rt       Immediate Value

sw $5, 3000($2)   (rt = $5, rs = $2, Immediate = 3000)

          101011    00010    00101    0000101110111000

Encoding = 0xAC450BB8
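The lw and sw examples differ only in their opcode, which a small encoder makes visible. A minimal sketch, assuming the I-format field positions from the slides; `encode_i` and the constant names are hypothetical, and the opcode values are the standard MIPS ones for lw (0x23) and sw (0x2B).

```python
def encode_i(opcode, rs, rt, imm):
    """Pack I-format fields into one 32-bit instruction word."""
    if not -0x8000 <= imm <= 0xFFFF:
        raise ValueError("immediate does not fit in 16 bits")
    # Negative immediates are stored in 16-bit two's complement.
    return (opcode << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)

OPCODE_LW = 0x23  # 100011
OPCODE_SW = 0x2B  # 101011

# lw $5, 3000($2) and sw $5, 3000($2): base register rs = 2, rt = 5
lw_word = encode_i(OPCODE_LW, rs=2, rt=5, imm=3000)
sw_word = encode_i(OPCODE_SW, rs=2, rt=5, imm=3000)

print(f"0x{lw_word:08X}")  # 0x8C450BB8
print(f"0x{sw_word:08X}")  # 0xAC450BB8
```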

15

MIPS J-format Encoding

Bits      31-26     25-0
Field     opcode    Target Address

jal 0x00400030

          000011    00000100000000000000001100

Encoding = 0x0C10000C

Target address 0x00400030 = 0000 0000 0100 0000 0000 0000 0011 0000. Each instruction is 4 bytes, so the two low-order address bits are always zero and are not encoded; the 26-bit Target Address field holds address bits 27 to 2.

• jal jumps and saves the return address in $ra ($31)
• Use “jr $31” to return
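The relationship between the jump address and the 26-bit field can be sketched as follows. This is a minimal sketch, not taken from the slides: `encode_j` and `jump_target` are hypothetical names, and the rule that the upper four address bits come from the address of the following instruction (PC + 4) is standard MIPS behavior that the slide does not state explicitly.

```python
OPCODE_JAL = 0x03  # 000011

def encode_j(opcode, target_addr):
    """Pack a J-format word; the low two address bits are dropped."""
    if target_addr % 4 != 0:
        raise ValueError("jump targets must be word aligned")
    return (opcode << 26) | ((target_addr >> 2) & 0x03FFFFFF)

def jump_target(word, pc):
    """Rebuild the full 32-bit target from the 26-bit field.

    Assumption: the top 4 bits are taken from PC + 4.
    """
    field = word & 0x03FFFFFF
    return ((pc + 4) & 0xF0000000) | (field << 2)

jal_word = encode_j(OPCODE_JAL, 0x00400030)
print(f"0x{jal_word:08X}")                           # 0x0C10000C
print(f"0x{jump_target(jal_word, 0x00400000):08X}")  # 0x00400030
```

Dropping the two always-zero bits lets a 26-bit field span a 28-bit (256 MB) region of the address space.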

16

JR and JALR

• JALR (Jump And Link Register) and JR (Jump Register)
– Considered as R-type
– Unconditional jump
– JALR used for procedure calls

Bits      31-26     25-21    20-16    15-11              10-6     5-0
Field     opcode    rs       0        rd (default=31)    0        funct

jalr r2   000000    00010    00000    11111              00000    001001
jr r2     000000    00010    00000    00000              00000    001000

17

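Assembly Program Example

        .data
        .globl array
array:  .word 0x12345678
        .word 0x98765432
        .word 0x66bbccdd
        .word 0x44332211

        .text
        .globl __start
__start:
        jal main
        # more code below

        .globl main
main:
        la    $8, array       # $8 = address of array
        lb    $9, ($8)        # load byte, sign-extended
        lb    $10, 1($8)
        add   $11, $9, $10
        sb    $11, ($8)       # store the low byte of the sum
        addiu $8, $8, 4       # advance to the next word
        lh    $9, ($8)        # load halfword, sign-extended
        lhu   $10, 2($8)      # load halfword, zero-extended
        add   $11, $9, $10
        sh    $11, ($8)       # store the low halfword of the sum
        addiu $8, $8, 4       # advance to the next word
        lw    $9, ($8)        # load word
        lw    $10, 4($8)
        sub   $11, $9, $10
        sw    $11, ($8)       # store the full word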
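The effect of this program on memory depends on byte order, which the slide does not specify. The following is a rough model, assuming a little-endian machine and placing the array at offset 0 of a toy memory. The `load` and `store` helpers are hypothetical and use Python's `struct` module to mimic the sign extension of lb/lh and the zero extension of lhu.

```python
import struct

# The four .word values from the program, stored little-endian (an assumption).
mem = bytearray(struct.pack("<4I", 0x12345678, 0x98765432,
                            0x66BBCCDD, 0x44332211))

def load(fmt, addr):
    """Load with struct format: b/h/i are signed, B/H/I are unsigned."""
    return struct.unpack_from("<" + fmt, mem, addr)[0]

def store(fmt, addr, value):
    struct.pack_into("<" + fmt, mem, addr, value)

base = 0                                  # la $8, array

r9 = load("b", base)                      # lb  $9, ($8)
r10 = load("b", base + 1)                 # lb  $10, 1($8)
store("B", base, (r9 + r10) & 0xFF)       # add, then sb keeps the low byte

base += 4                                 # addiu $8, $8, 4
r9 = load("h", base)                      # lh  (sign-extended)
r10 = load("H", base + 2)                 # lhu (zero-extended)
store("H", base, (r9 + r10) & 0xFFFF)     # add, then sh keeps the low half

base += 4                                 # addiu $8, $8, 4
r9 = load("i", base)                      # lw  $9, ($8)
r10 = load("i", base + 4)                 # lw  $10, 4($8)
store("I", base, (r9 - r10) & 0xFFFFFFFF) # sub, then sw

words = struct.unpack("<4I", mem)
print([f"0x{w:08X}" for w in words])
```

Under the little-endian assumption the array ends up as 0x123456CE, 0x9876ECA8, 0x2288AACC, 0x44332211. On a big-endian machine the byte and halfword loads would pick up different bytes, so the first two words would differ.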

18

Interface for System Services

Backup

20

ISA Design Philosophy

RISC Reduced Instruction Set Computers

versus

CISC Complex Instruction Set Computers

• IBM 801, led by John Cocke, pioneered the RISC concept
• Berkeley’s RISC-I and Stanford’s MIPS led the first academic implementations

21

RISC versus CISC

• Why CISC?
– Memory was expensive and slow back then
– Cramming more functions into one instruction
– Using microcode ROM (μROM) for “complex” operations

• Justification for RISC
– Complex apps are mostly composed of simple assignments
– RAM speed catching up
– Compilers (and humans) getting smarter
– Higher frequency calls for shorter pipe stages (a regular pipeline is also easier to design)

CISC                                           RISC
Variable-length instructions                   Fixed-length instructions
Abundant instructions and addressing modes     Fewer instructions and addressing modes
Longer decoding                                Easier decoding
Mem-to-mem operations                          Load/store architecture
Use on-core microcode                          No microinstructions, directly executed by HW logic
Less pipelineability                           Better pipelineability
Smaller semantic gap between high level        Needs smart compilers
  code and assembly (shift complexity
  to microcode)
Intel IA32, IBM 360, DEC VAX,                  MIPS, IBM 801, IBM PowerPC, Sun SPARC
  Motorola 68030

22

Other ISA Design Philosophy

• VLIW (Very Long Instruction Word)
– A Dumb Machine with a Smart Compiler
– Packing multiple (RISC-like) operations into one VLIW
– Instruction scheduling performed completely by the compiler
– Multiflow and Cydrome in the 80s, and most digital signal processors (DSPs) today

• EPIC (Explicitly Parallel Instruction Computing)
– The return of the VLIW
– With new features in the ISA such as
    • Data and control speculation
    • Full predication
– Intel/HP’s Itanium and Itanium 2 (once called IA-64)