ISA Bus Architecture
Transcript of ISA Bus Architecture
8/11/2019 ISA Bus Architecture
http://slidepdf.com/reader/full/isa-bus-architecture 1/22
© Wen-mei Hwu and S. J. Patel, 2005. ECE 511, University of Illinois
Lecture 3:
Instruction Set Architecture
Outline
• Instruction Set Architecture
– Traditional issues
– The (old) debate: RISC vs. CISC
– New issues
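The Big Picture
Layers, top to bottom: Requirements, Algorithms, Prog. Lang./OS, ISA, uArch, Circuit, Device
Side labels: Problem Focus, Performance Focus
[Figure: per-layer illustrations: a call graph of functions f1 to f5 with data labels s, q, p, j, i; the C and assembly fragments below; a SPEC label; and a transistor cross-section (BOX, Si fin/body, source, drain, gate)]
f2() {
  f3(s2, &j, &i);
  *s2->p = 10;
  i = *s2->q + i;
}
i1: ld r1, b <p1>
i2: ld r2, c <p1>
i3: ld r5, z <p3>
i4: mul r6, r5, 3 <p3>
i5: add r3, r1, r2 <p1>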
Instruction Set Architecture
• Application
• Instruction Set Architecture: … SPARC, MIPS, ARM, x86, HP-PA, IA-64 …
• Implementation (x86 examples)
– Intel Pentium X
– AMD K6, Athlon, Opteron
– Transmeta Crusoe TM5x00
Instruction Set Architecture
• Strong influence on cost/performance
• New ISAs are rare, but versions are not
– 16-bit, 32-bit and 64-bit x86 versions
• Longevity is a strong function of marketing prowess
Traditional Issues
• Strongly constrained by the number of bits available to instruction encoding
• Opcodes/operands
• Registers/memory
• Addressing modes
• Orthogonality
• 0, 1, 2, 3 address machines
• Instruction formats
• Decoding uniformity
Instruction Formats
Alpha (fixed length: 32 bits, 6-bit opcode)
TRAP:    opcode
Branch:  opcode | RA
Mem:     opcode | RA | RB
Operate: opcode | RA | RB | RC
x86 (variable length)
prefixes | opcode | addr mode | displ | imm
– prefixes: 0 to 4 bytes
– opcode: 1 or 2 bytes
– addr mode: 0 to 2 bytes (ModR/M and SIB)
– displ and imm: 0 to 8 bytes
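Because every Alpha instruction is 32 bits wide with fields in fixed places, decoding reduces to a few shifts and masks. The sketch below assumes the usual Alpha bit positions (opcode in bits 31:26, RA in 25:21, RB in 20:16, RC in 4:0), which the slide itself does not list. The function name is illustrative only.

```python
def decode_alpha_operate(word: int) -> dict:
    """Split a 32-bit Alpha operate-format word into its register fields."""
    if not 0 <= word < 2**32:
        raise ValueError("Alpha instructions are exactly 32 bits")
    return {
        "opcode": (word >> 26) & 0x3F,  # 6-bit opcode, bits 31:26
        "ra": (word >> 21) & 0x1F,      # bits 25:21
        "rb": (word >> 16) & 0x1F,      # bits 20:16
        "rc": word & 0x1F,              # bits 4:0
    }


# Build a word from known field values and decode it again.
word = (0x10 << 26) | (1 << 21) | (2 << 16) | 3
fields = decode_alpha_operate(word)
print(fields)
```

No field position depends on the value of another field, which is the decoding uniformity that the variable-length x86 layout gives up.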
The (old) Debate: RISC vs. CISC
• At the time, IBM 370 and VAX dominated
• CMOS was an up-and-coming technology
– Small number of transistors per chip
• RISC was appealing
– lower design complexity
– easier to pipeline
– higher performance when the design fit on a chip
IBM 801 (Cocke et al., 1982)
RISC I (Patterson et al., 1982)
MIPS (Hennessy et al., 1982)
What is RISC?
• Fixed length instructions
• Few formats
• Load/Store
• Few addressing modes
• Simple decode/control
• Many registers
• Few “unpipelinable” insts
Compiler Complexity
Hardware Complexity
The MIPS Pipeline
• Compiler knows the pipeline organization
• Schedules instructions around “hazards”
• Branches are handled by delay slots
• No need to “interlock” the pipeline
Fetch → Decode → ALU → Memory → WriteBack
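Without hardware interlocks, the compiler must keep a dependent instruction out of the slot right after a load. The sketch below shows the simplest form of that scheduling, assuming a one-instruction load delay and a hypothetical tuple representation of instructions. It pads with a nop, whereas a real scheduler would first try to move an independent instruction into the slot.

```python
from typing import List, Tuple

Instr = Tuple[str, str, Tuple[str, ...]]  # (opcode, destination, sources)

NOP: Instr = ("nop", "", ())


def pad_load_delays(program: List[Instr]) -> List[Instr]:
    """Insert a nop after any load whose result is read by the next instruction."""
    out: List[Instr] = []
    for i, instr in enumerate(program):
        out.append(instr)
        op, dest, _ = instr
        if op == "ld" and i + 1 < len(program):
            _, _, next_srcs = program[i + 1]
            if dest in next_srcs:
                out.append(NOP)  # hardware will not stall, so software must wait
    return out


program = [
    ("ld", "r1", ("b",)),
    ("ld", "r2", ("c",)),
    ("add", "r3", ("r1", "r2")),
]
padded = pad_load_delays(program)
for instr in padded:
    print(instr)
```

Only the second load needs padding here. The first load's result is not read until two instructions later, so its delay slot is already filled by useful work.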
Pipelining a CISC [Patt et al 85]
Fetch → Instruction Bytes → Decode → µOp store → RF Read → Execute → Mem → WB
• Decode emits RISC-like micro-operations
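The decode step can be pictured as "cracking" one memory-operand instruction into several load/store-style micro-operations. The sketch below uses a hypothetical two-operand instruction syntax and made-up temporaries (t0, t1). It is not the actual translation used by any x86 implementation.

```python
from typing import List, Tuple

MicroOp = Tuple[str, str, str, str]  # (operation, dest, src1, src2)


def crack(op: str, dst: str, src: str) -> List[MicroOp]:
    """Translate one two-operand CISC-style instruction into micro-ops."""
    uops: List[MicroOp] = []
    a, b = dst, src
    if src.startswith("["):          # memory source: load it into a temporary
        uops.append(("ld", "t0", src, ""))
        b = "t0"
    if dst.startswith("["):          # memory destination: read-modify-write
        uops.append(("ld", "t1", dst, ""))
        a = "t1"
    uops.append((op, a, a, b))       # the register-to-register operation
    if dst.startswith("["):
        uops.append(("st", dst, a, ""))
    return uops


uops = crack("add", "[r1+8]", "r2")
for u in uops:
    print(u)
```

After cracking, each micro-op touches memory at most once, so the back end (RF Read, Execute, Mem, WB) can be pipelined much like a RISC machine.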
RISC Baggage
In hindsight, like CISC, even RISC architectures suffered from legacy effects.
– Delay slots
  • Used for dealing with branches in short pipelines
  • Help primarily with target generation
  • Become a burden for future generations whose pipelines need to be deeper
– Register windows
  • Quick save/restore of state for procedure calls
  • Reduce procedure call overhead to 1 cycle
  • Make register renaming and out-of-order execution more complex
The Dynamic-Static Interface
Perhaps the main contribution of the “RISC Revolution”
John Cocke (IBM) is credited with the original idea.
John Hennessy was a major driving force later, followed by the IMPACT team at Illinois.
“…a willingness to make design tradeoffs freely… between the architecture and implementation…”
Colwell et al., 1985.
This legacy is still alive and kicking today.
DSI and Static Optimization
[Figure: ISAs placed on two axes, granularity of ISA instruction and potential for static optimization. Points shown: ISAs for reconfigurable architectures, VAX ISA, MIPS ISA, Itanium ISA.]
Variable Instruction Format
• Motivation 1: to accommodate a large number of opcodes with nonuniform frequency of occurrence
– VAX has 304 opcodes. If we insisted on using uniform opcode encoding, we would need 9 bits. Due to the policy of byte alignment, one needs two bytes to encode each VAX opcode.
– An observation: some opcodes are used more often than others. The top 200 opcodes account for about 98% of the dynamic opcode usage.
– Instead of using 2 bytes to encode all 304 opcodes, use 1 byte to encode the frequently used ones and 2 bytes to encode the infrequently used ones.
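The arithmetic behind Motivation 1 can be checked directly. The sketch below takes the slide's figures (304 opcodes, top 200 covering about 98% of dynamic usage). It also assumes a simple escape-byte scheme in which a rare opcode is one escape byte followed by one more byte, which the slide does not spell out.

```python
import math

NUM_OPCODES = 304        # from the slide
FREQUENT = 200           # opcodes given a 1-byte encoding
FREQUENT_SHARE = 0.98    # their share of dynamic opcode usage

# Uniform encoding: enough bits for all opcodes, rounded up to whole bytes.
uniform_bits = math.ceil(math.log2(NUM_OPCODES))
uniform_bytes = math.ceil(uniform_bits / 8)

# Variable encoding: 1 byte for frequent opcodes, 2 bytes for the rest.
avg_variable_bytes = FREQUENT_SHARE * 1 + (1 - FREQUENT_SHARE) * 2

# Assumed escape scheme: each escape byte opens 256 two-byte opcodes.
escapes_needed = math.ceil((NUM_OPCODES - FREQUENT) / 256)
fits_in_one_byte_space = FREQUENT + escapes_needed <= 256

print(uniform_bits, uniform_bytes, round(avg_variable_bytes, 2), fits_in_one_byte_space)
```

Under these assumptions the average dynamic opcode costs about 1.02 bytes instead of 2, close to a halving of opcode fetch bandwidth.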
Variable Instruction Format (cont.)
• Motivation 2: to allow each instruction exactly the number of operands it needs.
– RET (0), INC (1), ADD (2 or 3), …
• Motivation 3: to allow each operand specifier exactly the number of bytes it needs.
– Reg (1), Disp (1 to 8), …
• All motivations come from reducing the number of bytes needed to
– represent the program
– be fetched during execution
VIF Cost
• Sequential Decoding Problem
– The decoder cannot be sure where the 1st operand specifier is until the opcode is decoded.
– The decoder cannot locate the ith operand specifier until the (i-1)th operand specifier is decoded.
– The decoder cannot be sure where the jth instruction starts until the last operand specifier of the (j-1)th instruction is decoded.
– Typical solution: instruction buffering with a multi-stage decode pipeline, plus a post-decode I-cache or trace cache
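The chain of dependences above shows up as a strictly serial loop when written out. The sketch below uses a hypothetical toy encoding, not any real ISA. The opcode byte selects the operand count, and the high nibble of each specifier's first byte gives the number of extra bytes that follow it.

```python
from typing import List

# Hypothetical toy encoding (illustration only):
#   opcode byte -> number of operand specifiers
#   first byte of a specifier: high nibble = count of extra bytes that follow
OPERAND_COUNT = {0x00: 0, 0x01: 1, 0x02: 2}  # think RET, INC, ADD


def instruction_starts(code: bytes) -> List[int]:
    """Return the byte offset at which each instruction begins."""
    starts: List[int] = []
    pc = 0
    while pc < len(code):
        starts.append(pc)
        n_operands = OPERAND_COUNT[code[pc]]  # opcode must be decoded first
        pc += 1
        for _ in range(n_operands):
            extra = code[pc] >> 4             # specifier i-1 locates specifier i
            pc += 1 + extra
    return starts


# ADD with a 3-byte and a 1-byte specifier, INC with a 1-byte specifier, RET.
code = bytes([0x02, 0x20, 0xAA, 0xBB, 0x00, 0x01, 0x00, 0x00])
starts = instruction_starts(code)
print(starts)
```

Each value of `pc` depends on the bytes just decoded, so nothing in the loop can be done in parallel without guessing. A post-decode cache avoids repeating this work for instructions that have been seen before.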
VIF Cost (cont.)
• Non-aligned Instruction Access: instructions are not aligned to any particular byte position in each memory word.
– Instruction opcodes and operand specifiers are not aligned to the decoding logic when fetched from memory.
– Instructions may spill over cache block boundaries and page boundaries
– Typical solution: instruction buffer that decouples fetch and decode
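Whether a given instruction spills over a block boundary is a one-line check. The sketch below assumes a 64-byte cache block, a common size but not one the slide states, and the function name is illustrative.

```python
def crosses_block(address: int, length: int, block_size: int = 64) -> bool:
    """True if an instruction of `length` bytes at `address` spans two cache blocks."""
    first_block = address // block_size
    last_block = (address + length - 1) // block_size
    return first_block != last_block


print(crosses_block(60, 6))   # bytes 60..65 straddle the boundary at 64
print(crosses_block(62, 2))   # bytes 62..63 stay inside the first block
```

With fixed-length, naturally aligned instructions this function would never return True for a block size that is a multiple of the instruction size. Variable-length instructions make the split case routine, so the fetch unit needs a buffer to assemble the two halves.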
Data Dependent Decoding
• What an instruction does depends on the values of the explicit and/or implicit input operands.
– Cause: generality of instructions
– Example: string move instructions in x86 generate a different number of loads and stores according to an input operand value
– Typical solution: use microcode when executing these instructions
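The string-move case can be sketched as a microcode-style loop whose length is only known from a register value at run time. In x86 the repeat count for `rep movsb` comes from the (E)CX register. The micro-op names and tuple format below are hypothetical, and the sketch ignores details such as the direction flag.

```python
from typing import List, Tuple


def expand_string_move(count: int) -> List[Tuple[str, int]]:
    """Expand one string-move instruction into `count` load/store pairs."""
    uops: List[Tuple[str, int]] = []
    for i in range(count):
        uops.append(("load", i))    # read element i from the source
        uops.append(("store", i))   # write element i to the destination
    return uops


print(len(expand_string_move(0)))   # a zero count moves nothing
print(len(expand_string_move(3)))   # three elements: three loads and three stores
```

The decoder cannot emit a fixed micro-op sequence for this instruction because the sequence length is an operand value, which is why the work is handed to a microcode sequencer.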
Number of Registers
• A large number of registers allows the compiler to eliminate memory references and redundant computation by storing more values in the register file
– Cost: more bits to encode register operands
– Benefit: support for the compiler to achieve high performance
– MIPS had 32, IA-64 has 128 (the number of levels of metal lines is a factor here)
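The encoding cost is easy to quantify. The sketch below assumes three register operands per instruction, typical of a three-address operate format but an assumption here, and computes the bits spent on register names for the two register-file sizes on the slide.

```python
import math


def register_field_bits(num_registers: int, operands: int = 3) -> int:
    """Bits spent naming `operands` registers out of `num_registers`."""
    return operands * math.ceil(math.log2(num_registers))


mips_bits = register_field_bits(32)    # 3 operands x 5 bits each
ia64_bits = register_field_bits(128)   # 3 operands x 7 bits each
print(mips_bits, ia64_bits)
```

Fifteen bits fit comfortably in a 32-bit MIPS instruction, while 21 bits would leave little room for anything else, which is one reason IA-64 uses 41-bit instruction slots.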
Compatibility – subtle issues
• Most stem from incomplete ISA specification
– Needed to have extensibility
• User imposed requirements
– Inappropriate use of ISA
  • Undefined bits being used
• Implementation imposed compatibility
– Bug compatibility
  • Pentium II had to reproduce the bugs of Pentium
Today’s Issues
• Where should the DSI be placed? What control is given to the compiler (static) and what is relegated to the hardware (dynamic)?
– This is becoming a more pressing issue as the power crisis continues to grow
• Information flow across the DSI.
– Speculation, predication, registers, analysis info
• There is an emerging difference between the target architecture and the implementation architecture.
– Java, .NET