Lecture 5: Instruction Set Architecture

Lecture 5: Instruction Set Architecture Computer Engineering 585 Fall 2001


Page 1: Lecture 5: Instruction Set Architecture

Lecture 5: Instruction Set Architecture

Computer Engineering 585, Fall 2001

Page 2: Lecture 5: Instruction Set Architecture

Summary, #1

• Designing to Last through Trends

             Capacity         Speed
    Logic    2x in 3 years    2x in 3/2 years
    DRAM     4x in 3 years    2x in 10 years
    Disk     4x in 3 years    2x in 10 years

• 6 years to graduate => 16x CPU speed, 16x DRAM/disk size

• Time to run the task: execution time, response time, latency

• Tasks per day, hour, week, sec, ns, …: throughput, bandwidth

• “X is n times faster than Y” means

$$n = \frac{\text{ExTime}(Y)}{\text{ExTime}(X)} = \frac{\text{Performance}(X)}{\text{Performance}(Y)}$$
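
As a quick check of the trend arithmetic and the "n times faster" definition on this slide, here is a minimal Python sketch. The function names `growth` and `times_faster` and the sample run times are illustrative assumptions, not part of the lecture.

```python
def growth(factor, period_years, elapsed_years):
    """Compound improvement: `factor`x every `period_years`, after `elapsed_years`."""
    return factor ** (elapsed_years / period_years)


def times_faster(extime_y, extime_x):
    """Return n such that X is n times faster than Y: ExTime(Y) / ExTime(X)."""
    return extime_y / extime_x


if __name__ == "__main__":
    # Logic speed doubles every 1.5 years; capacity of DRAM/disk grows 4x every 3 years.
    print(growth(2, 1.5, 6))
    print(growth(4, 3, 6))
    # Hypothetical run times in seconds for machines Y and X.
    print(times_faster(15.0, 10.0))
```

Both growth rates compound to the same factor over six years, which is where the slide's 16x figure comes from.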

Page 3: Lecture 5: Instruction Set Architecture

Summary, #2

• Amdahl’s Law:

$$\text{Speedup}_{\text{overall}} = \frac{\text{ExTime}_{\text{old}}}{\text{ExTime}_{\text{new}}} = \frac{1}{(1 - \text{Fraction}_{\text{enhanced}}) + \dfrac{\text{Fraction}_{\text{enhanced}}}{\text{Speedup}_{\text{enhanced}}}}$$

• CPI Law:

$$\text{CPU time} = \frac{\text{Seconds}}{\text{Program}} = \frac{\text{Instructions}}{\text{Program}} \times \frac{\text{Cycles}}{\text{Instruction}} \times \frac{\text{Seconds}}{\text{Cycle}}$$

• Execution time is the REAL measure of computer performance!

• Good products are created when you have good benchmarks and good ways to summarize performance.

• Die cost goes roughly with (die area)^4.

• Can the PC industry support engineering/research investment?
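
Both laws on this slide are easy to evaluate numerically. The following Python sketch is one way to do so; the function names and the sample inputs (a 40% fraction sped up 10x, a 500 MHz clock) are assumptions chosen for illustration.

```python
def amdahl_speedup(fraction_enhanced, speedup_enhanced):
    """Overall speedup when `fraction_enhanced` of the time is sped up by `speedup_enhanced`."""
    if not 0.0 <= fraction_enhanced <= 1.0:
        raise ValueError("fraction_enhanced must be in [0, 1]")
    if speedup_enhanced <= 0:
        raise ValueError("speedup_enhanced must be positive")
    return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)


def cpu_time(instruction_count, cycles_per_instruction, clock_rate_hz):
    """CPU time in seconds: instructions x cycles/instruction x seconds/cycle."""
    return instruction_count * cycles_per_instruction / clock_rate_hz


if __name__ == "__main__":
    # Hypothetical enhancement: 40% of execution time made 10x faster.
    print(amdahl_speedup(0.4, 10.0))
    # Hypothetical program: 1e9 instructions, CPI of 2, 500 MHz clock.
    print(cpu_time(1e9, 2.0, 500e6))
```

Note how the unenhanced 60% bounds the result: even an infinitely fast enhancement could not push the overall speedup past 1 / 0.6.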

Page 4: Lecture 5: Instruction Set Architecture

Computer Architecture Is …

"… the attributes of a [computing] system as seen by the programmer, i.e., the conceptual structure and functional behavior, as distinct from the organization of the data flows and controls, the logic design, and the physical implementation."

Amdahl, Blaauw, and Brooks, 1964


Page 5: Lecture 5: Instruction Set Architecture

Computer Architecture’s Changing Definition

1950s to 1960s: Computer Architecture Course: Computer Arithmetic

1970s to mid 1980s: Computer Architecture Course: Instruction Set Design, especially ISA appropriate for compilers

1990s-2000s: Computer Architecture Course: Design of CPU, memory system, I/O system, multiprocessors

Page 6: Lecture 5: Instruction Set Architecture

Instruction Set Architecture (ISA)

[Figure: the instruction set as the interface between software (above) and hardware (below)]

Page 7: Lecture 5: Instruction Set Architecture

Interface Design

A good interface:

• Lasts through many implementations (portability, compatibility)

• Is used in many different ways (generality)

• Provides convenient functionality to higher levels

• Permits an efficient implementation at lower levels

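[Figure: a single interface with several uses above it and successive implementations (imp 1, imp 2, imp 3) below it, laid out along a time axis]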

Page 8: Lecture 5: Instruction Set Architecture

Evolution of Instruction Sets

Single Accumulator (EDSAC, 1950)
  => Accumulator + Index Registers (Manchester Mark I, IBM 700 series, 1953)
    => Separation of Programming Model from Implementation
         High-level Language Based (B5000, 1963)
         Concept of a Family (IBM 360, 1964)
      => General Purpose Register Machines
           Complex Instruction Sets (VAX, Intel 432, 1977-80)
           Load/Store Architecture (CDC 6600, Cray 1, 1963-76)
             => RISC (MIPS, SPARC, HP-PA, IBM RS6000, ..., 1987)

Page 9: Lecture 5: Instruction Set Architecture

A "Typical" RISC

• 32-bit fixed format instruction (3 formats)

• 32 32-bit GPRs (R0 contains zero, DP takes a pair)

• 3-address, reg-reg arithmetic instruction

• Single address mode for load/store: base + displacement, no indirection

• Simple branch conditions

• Delayed branch

see: SPARC, MIPS, HP PA-RISC, DEC Alpha, IBM PowerPC, CDC 6600, CDC 7600, Cray-1, Cray-2, Cray-3

Page 10: Lecture 5: Instruction Set Architecture

Evolution of Instruction Sets

• Major advances in computer architecture are typically associated with landmark instruction set designs. Ex: Stack vs GPR (System 360)

• Design decisions must take into account:
    technology
    machine organization
    programming languages
    compiler technology
    operating systems

• And they in turn influence these

Page 11: Lecture 5: Instruction Set Architecture

Example: MIPS

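Register-Register

    Bits     31-26   25-21   20-16   15-11   10-0
    Field    Op      Rs1     Rs2     Rd      Opx
    (the slide also marks bit positions 6 and 5 within the low-order field)

Register-Immediate

    Bits     31-26   25-21   20-16   15-0
    Field    Op      Rs1     Rd      immediate

Branch

    Bits     31-26   25-21   20-16     15-0
    Field    Op      Rs1     Rs2/Opx   immediate

Jump / Call

    Bits     31-26   25-0
    Field    Op      target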
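
To make the field boundaries concrete, here is a small Python sketch that packs and unpacks the Register-Immediate format (Op in bits 31-26, Rs1 in 25-21, Rd in 20-16, immediate in 15-0) and the Jump / Call format (Op in 31-26, target in 25-0). The function names and the sample field values are assumptions for illustration; the sketch does not model any particular opcode assignment.

```python
def encode_reg_imm(op, rs1, rd, imm):
    """Pack a Register-Immediate instruction into a 32-bit word."""
    return ((op & 0x3F) << 26) | ((rs1 & 0x1F) << 21) | ((rd & 0x1F) << 16) | (imm & 0xFFFF)


def decode_reg_imm(word):
    """Unpack the Op, Rs1, Rd, and immediate fields from a 32-bit word."""
    return {
        "op": (word >> 26) & 0x3F,    # bits 31-26
        "rs1": (word >> 21) & 0x1F,   # bits 25-21
        "rd": (word >> 16) & 0x1F,    # bits 20-16
        "imm": word & 0xFFFF,         # bits 15-0
    }


def decode_jump(word):
    """Unpack the Op and target fields of a Jump / Call instruction."""
    return {"op": (word >> 26) & 0x3F, "target": word & 0x3FFFFFF}


if __name__ == "__main__":
    word = encode_reg_imm(op=8, rs1=2, rd=1, imm=100)  # hypothetical field values
    print(hex(word))
    print(decode_reg_imm(word))
```

Because every format keeps Op in the same six bits, a decoder can read the opcode first and only then decide how to interpret the remaining 26 bits.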

Page 12: Lecture 5: Instruction Set Architecture

Architecture, Implementation

• Architecture deals with the functions provided to the programmer: addressing, addition, interrupt, and I/O.

• Implementation deals with the method used to achieve these functions, such as a parallel datapath and a microprogrammed control.

• Realization is the means used to materialize this method: electrical, magnetic, or mechanical devices; power and packaging.

Page 13: Lecture 5: Instruction Set Architecture

Clock Architecture

[Figure: a clock face with the numbers 1 to 12, labeled "Architecture", shown alongside "Variant Realizations"]

Page 14: Lecture 5: Instruction Set Architecture

Architecture: two arms, a small one for hours and a longer one for minutes; possibly an alarm.

Realization: shape of the clock arms, dial, and numbers; mechanical or digital mechanism; energy source is a wound spring or a battery.

Page 15: Lecture 5: Instruction Set Architecture

Instruction Set Design: (1) Ease of Use

• consistency: with a partial knowledge of the system, one can predict the remainder. E.g., including square-root as an instruction should almost fully define everything else. Counterexample: the FP halve operation was added to the IBM 360 as an afterthought and lacked post-normalization.

• orthogonality: two independent concerns should be handled as such. E.g., in the clock architecture: (1) luminous dial, (2) alarm. Counterexample: on the IBM 650, the low-order address bits determine the amount of shift, yet if the address exceeds the address space, a violation occurs.

Page 16: Lecture 5: Instruction Set Architecture

transparency: an architectural function is transparent if its implementation does not produce any architecturally visible side-effects. e.g. pipelining should not affect the compiler-visible machine.

generality: The designer should not limit a function by his/her own notions about its use. The Intel 8080 has a restart op intended to restart after an interrupt. Its larger use is as a one-byte subroutine call, since it was designed in all its generality.

open-endedness: provision for future expansion.

completeness: all functions of a given class are provided. special case: symmetry: inverse is also provided.

Page 17: Lecture 5: Instruction Set Architecture

Instruction Set Design:

(2) Program size: memory size; CPU-MM (CPU to main memory) bandwidth; frequently used (written-down) instructions should be short.

(3) Execution speed: time required to execute an instruction. Can they be pipelined? Are they uniform in execution length? Control and cache are often in the critical path of a processor design. Uniform length requirements are at loggerheads with (2) above.

(4) Complexity of control unit: some instructions should not even be in the instruction set (RISC).

Page 18: Lecture 5: Instruction Set Architecture

Instruction Set Classification

internal CPU operand storage mechanism: registers, stack, accumulator

# explicit operands / instruction: 0, 1, 2, 3

presumed operand locations: memory, stack

Operations

type and size of operands

Instruction = opcode + operands, e.g. ADD R1, 20

Page 19: Lecture 5: Instruction Set Architecture

Instruction Formats

    #Ops   Instruction    Semantics         Machine
    4      NI op A B C    C = A op B        IBM 650 µ-code
    3      op A B C       C = A op B        RISC, Cray
    2      op A B         A = A op B        IBM 370, VAX
    1      op A           Acc = Acc op A    PDP-8, M6809
    0      op             X = X op Y        stack machines: transputer, B5500

Page 20: Lecture 5: Instruction Set Architecture

Stack/Reg/Acc Architectures

Code sequences for C = A + B:

    Stack      Accumulator   Reg-Mem        Reg-Reg
    PUSH A     LOAD A        LOAD R1, A     LOAD R1, A
    PUSH B     ADD B         ADD R1, B      LOAD R2, B
    ADD        STORE C       STORE C, R1    ADD R3, R1, R2
    POP C                                   STORE C, R3
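
To see how the stack and accumulator sequences for C = A + B actually execute, here is a toy Python interpreter for each. It is only a sketch: memory is modeled as a dictionary of named variables, only the opcodes used on this slide are supported, and the function names are hypothetical.

```python
def run_stack(program, mem):
    """Execute a zero-address (stack) program; operands are implicit on the stack."""
    stack = []
    for instr in program:
        op, *args = instr.split()
        if op == "PUSH":
            stack.append(mem[args[0]])
        elif op == "ADD":
            b = stack.pop()
            a = stack.pop()
            stack.append(a + b)
        elif op == "POP":
            mem[args[0]] = stack.pop()
        else:
            raise ValueError(f"unknown opcode: {op}")
    return mem


def run_accumulator(program, mem):
    """Execute a one-address program; the accumulator is the implicit operand."""
    acc = 0
    for instr in program:
        op, name = instr.split()
        if op == "LOAD":
            acc = mem[name]
        elif op == "ADD":
            acc += mem[name]
        elif op == "STORE":
            mem[name] = acc
        else:
            raise ValueError(f"unknown opcode: {op}")
    return mem


if __name__ == "__main__":
    print(run_stack(["PUSH A", "PUSH B", "ADD", "POP C"], {"A": 2, "B": 3}))
    print(run_accumulator(["LOAD A", "ADD B", "STORE C"], {"A": 2, "B": 3}))
```

The stack version needs four short instructions and the accumulator version three, but in both every operand is fetched from memory, which is the memory-traffic cost that register machines avoid.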

Page 21: Lecture 5: Instruction Set Architecture

• Stack: short instructions; postfix model of expression evaluation; sequential operand access, which is hard for compilers. Implementation issues: how deep should the stack be, and how are exceptions handled (e.g. when it is empty)?

• Accumulator: short instructions and relatively small machine state (easier context switch); high memory traffic.

• Reg-Reg: easiest for compiler optimization, the most general model; long instructions and large state.

Page 22: Lecture 5: Instruction Set Architecture

Endian-ness of Memory Addressing

Cohen's article: "On Holy Wars and a Plea for Peace," IEEE Computer, Oct. 1981.

CPU: words, pages

Memory: bits, bytes

In what order are they composed to form the next object in the hierarchy?

• LSB (less-significant unit) travels first: little endians (Lilliputians)

• MSB (more-significant unit) travels first: big endians (Blefuscians)

Page 23: Lecture 5: Instruction Set Architecture

Endian-ness

• Big-endian: IBM 360, MIPS, Motorola 680xx, SPARC, DLX

• Little-endian: DEC VAX, Compaq/HP Alpha, Intel 80x86

• Selectable: PowerPC, MIPS (mode bit: 0 = Big, 1 = Little)

    Addr   Contents
    4      0x10
    5      0x20
    6      0x30
    7      0x40

Word at Addr 4: 0x10203040 (Big), 0x40302010 (Little)
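
The word values on this slide can be reproduced with Python's built-in `int.from_bytes`. In this sketch the dictionary `memory` and the helper `load_word` are illustrative names; the byte contents are the ones given on the slide.

```python
# Byte-addressable memory holding the four bytes from the slide.
memory = {4: 0x10, 5: 0x20, 6: 0x30, 7: 0x40}


def load_word(mem, addr, byteorder):
    """Assemble the 4 bytes starting at `addr` into one 32-bit value.

    `byteorder` is "big" (lowest address holds the most significant byte)
    or "little" (lowest address holds the least significant byte).
    """
    raw = bytes(mem[addr + i] for i in range(4))
    return int.from_bytes(raw, byteorder)


if __name__ == "__main__":
    print(hex(load_word(memory, 4, "big")))
    print(hex(load_word(memory, 4, "little")))
```

The bytes in memory are identical in both cases; only the rule for composing them into a word differs.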

Page 24: Lecture 5: Instruction Set Architecture

Memory addressing contd: data alignment

Most machines are byte addressable.

    Object        Aligned at byte addr   Misaligned at byte addr
    Byte          0,1,2,3,4,5,6,7        Never
    Half word     0,2,4,6                1,3,5,7
    Word          0,4                    1,2,3,5,6,7
    Double word   0                      1,2,3,4,5,6,7
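
The alignment table follows from a single rule: an object of size s bytes is aligned when its byte address is a multiple of s. The Python sketch below regenerates the table for addresses 0 to 7; the names `is_aligned`, `aligned_addresses`, and `SIZES` are assumptions made for this example.

```python
# Object sizes in bytes, as used in the table.
SIZES = {"byte": 1, "half word": 2, "word": 4, "double word": 8}


def is_aligned(addr, size):
    """An object of `size` bytes is aligned if its address is a multiple of `size`."""
    return addr % size == 0


def aligned_addresses(size, limit=8):
    """All aligned byte addresses below `limit` for an object of `size` bytes."""
    return [a for a in range(limit) if is_aligned(a, size)]


if __name__ == "__main__":
    for name, size in SIZES.items():
        aligned = aligned_addresses(size)
        misaligned = [a for a in range(8) if a not in aligned]
        print(name, aligned, misaligned)
```

Since the sizes are powers of two, the same test can be done in hardware by checking that the low-order log2(s) address bits are zero.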

Page 25: Lecture 5: Instruction Set Architecture

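Physical Rationale for Alignment

[Figure: two organizations of a 1K-byte memory addressed by 10 address bits. In the first, a 1K x 1 B memory is selected by a single 1K-to-1 decoder and delivers 1 B. In the second, a 32 x 32 B memory uses the 5 MSB address bits to drive a 32-to-1 decoder that selects a 32 B row, and the 5 LSB address bits to drive a 32-to-1 multiplexor that picks 1 B from that row.]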
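
The 32 x 32 B organization splits a 10-bit address into two 5-bit halves: the 5 MSB address bits select a row and the 5 LSB address bits select a byte within it. A minimal Python sketch of that split follows; the constant and function names are hypothetical.

```python
ROW_BITS = 5   # 5 MSB address bits: select one of 32 rows
COL_BITS = 5   # 5 LSB address bits: select one of 32 bytes within the row


def split_address(addr):
    """Split a 10-bit byte address into (row, column) for a 32 x 32 B array."""
    if not 0 <= addr < (1 << (ROW_BITS + COL_BITS)):
        raise ValueError("address must fit in 10 bits")
    row = addr >> COL_BITS               # drives the row decoder
    col = addr & ((1 << COL_BITS) - 1)   # drives the multiplexor
    return row, col


if __name__ == "__main__":
    for addr in (0, 31, 32, 37, 1023):
        print(addr, split_address(addr))
```

Bytes that share a row number come out of the array together in one access, which is why an object that stays within a row is cheaper to fetch than one that straddles two rows.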

Page 26: Lecture 5: Instruction Set Architecture

Costs of misalignment

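[Figure: an 8-byte memory with bytes 0 to 3 in one row and bytes 4 to 7 in the other, addressed by 3 address bits (a3, a2, a1). The labels a3=0 / a3=1 and a2=0 / a2=1 mark the selections that feed the memory multiplexor.]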
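
One simple way to quantify the cost is to count how many memory rows an access touches. The Python sketch below assumes the layout in the figure (4-byte rows holding bytes 0 to 3 and 4 to 7) and assumes that each row touched costs one memory access; both the cost model and the name `rows_touched` are illustrative, not taken from the lecture.

```python
def rows_touched(addr, size, row_bytes=4):
    """Number of `row_bytes`-wide memory rows covered by an access of `size` bytes at `addr`."""
    first_row = addr // row_bytes
    last_row = (addr + size - 1) // row_bytes
    return last_row - first_row + 1


if __name__ == "__main__":
    print(rows_touched(0, 4))  # aligned word: bytes 0-3
    print(rows_touched(2, 4))  # misaligned word: bytes 2-5
    print(rows_touched(3, 2))  # misaligned half word: bytes 3-4
```

This counts only extra row accesses. A misaligned object that stays inside one row (for example a half word at address 1) still needs its bytes shifted into place by the multiplexor, which this sketch does not model.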