Slide 1
Michael Flynn EE382 Winter/99
EE382 Processor Design
Stanford University
Winter Quarter 1998-1999
Instructor: Michael Flynn
Teaching Assistant: Steve Chou
Administrative Assistant: Susan Gere
Lecture 1 - Introduction
Slide 2
Class Objectives
Learn theoretical analysis and limits
  - develop intuition
  - project long-term trends and bound design space more efficiently than simulation
Learn models for VLSI component cost tradeoffs
  - emphasis on microprocessor
Learn modeling techniques for computer system performance
  - emphasis on queuing
Put it all together to balance system performance and cost
  - Emphasis on multiprocessors, memory, and I/O
  - Practical examples and design targets
Slide 3
Course Prerequisites
Computer Architecture and Organization (EE282)
  - Instruction Set Architecture
  - Machine Organization
  - Basic Pipeline Design
  - Cache Organization
  - Branch Prediction
  - Superscalar Execution
    • In-Order
    • Out-of-Order
Statistics
  - Basic probability
    • distribution functions
    • statistical measures
  - Familiarity with stochastic processes and Markov models is helpful, but not required
Slide 4
Course Information
Access to the course web page is necessary
  http://www-leland.stanford.edu/class/ee382/
  - Course info, assignments, old exams, design tools, FAQs, ...
Textbook and reference material
  - Computer Architecture: Pipelined and Parallel Processor Design, Michael J. Flynn
Problem set and design problem philosophy
  - Learn by doing: maximize learning/effort
Exam philosophy
  - Extend what you have learned
  - Open-book, not a speed or trick contest
You are expected to give us feedback
  - Questions, office hours, email, surveys
Slide 5
Grading
Problem Sets and Design Problems 40%
  - 6 problem sets
  - 2 design problems
Midterm 20%
Final Exam 40%
  - Covers entire course
  - Scheduled March 15, 8:30-11:30 AM
Slide 6
Key Concepts of Abstraction
Instruction Set Architecture (ISA)
  - Functional interface for assembly-language programmer
  - Examples: SGI MIPS, Sun SPARC, PowerPC, HPPA, DEC Alpha, Intel (x86), IBM System/390, IBM AS/400
Implementation (Machine Organization)
  - Partitioning into units and logic design
  - Examples
    • Intel386 CPU, Intel486 CPU, Pentium® Processor, Pentium® Pro Processor
    • Alpha 21064, 21164, 21264
Realization
  - Physical fabrication and assembly
  - Examples
    • IBM 709 ('54) built with vacuum tubes and 7090 ('59) built with transistors
    • Pentium Processor in 0.8 µm, 0.6 µm, 0.35 µm BiCMOS/CMOS
Slide 7
Instruction Set Architecture
“... the attributes of a [computing] system as seen by the programmer, i.e. the conceptual structure and functional behavior, as distinct from the organization of the data flow and controls, the logical design, and the physical implementation.” Amdahl, Blaauw, and Brooks, 1964
Consists of:
  - Organization of storage
  - Data types
  - Encodings and representations (instruction formats)
  - Instruction (or Operation Code) Set
  - Modes for addressing data items and instructions
  - Program-visible exceptional conditions
Specifies requirements for binary compatibility across implementations
Slide 8
Instruction Set Types
Load/Store (L/S)
  - Only load and store instructions refer to memory
    • no memory ALU ops
  - Used by several microprocessors
    • PowerPC, HP, DEC Alpha
Register/Memory (R/M)
  - ALU operations can have either source or destination in memory
  - Used by mainframes and most microprocessors
    • IBM System/370, Intel Architecture (x86), all x86 compatibles
Register or Memory (R+M)
  - ALU operations can have any/all operands in memory
  - Not used commonly now
    • DEC VAX
Slide 9
L/S ISA General Characteristics
32 GPR x 32b ... more recently 64b
instr size: 32b ... more recently 64b
instr types
  - R1 <- R2 op R3 for ALU ops
  - R1 <-> MEM [RB,D] for LD/ST
Slide 10
R/M ISA General Characteristics
16 GPR x 32b
instr size ... 16b, 32b, 48b
instr types
  - RR  R1 <- R1 op R2
  - RM  R1 <- R1 op MEM [RB,RX,D]
  - MM  MEM1 [RB,RX,D] <- MEM1 [RB,RX,D] op MEM2 [RB,RX,D]
        used for character, decimal ops only.
Slide 11
ISA Syntax Terminology
OP.type destination, source1, source2
  - e.g. ADD.F R1,R2,R3 puts the result of a floating-point add in floating-point register 1.
  - OP without type implies integer type unless fp is clear from the context.
  - destination is always the first operand, so a store is ST MEM [RB,RX,D], R2
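The operand convention above can be illustrated with a small parser. This is only a sketch: the function names `split_operands` and `parse_instruction` are hypothetical, and defaulting an untyped OP to integer is a simplification of the slide's rule (the slide allows floating point to be inferred from context, which this sketch does not attempt).

```python
def split_operands(operand_text):
    """Split on commas, but keep the commas inside MEM [RB,RX,D] together."""
    operands, current, depth = [], "", 0
    for ch in operand_text:
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
        if ch == "," and depth == 0:
            operands.append(current.strip())
            current = ""
        else:
            current += ch
    if current.strip():
        operands.append(current.strip())
    return operands


def parse_instruction(text):
    """Parse 'OP.type destination, source1, source2' into named fields."""
    mnemonic, _, operand_text = text.strip().partition(" ")
    op, _, op_type = mnemonic.partition(".")
    operands = split_operands(operand_text)
    return {
        "op": op,
        # Assumption: no type suffix means integer (context is ignored here).
        "type": op_type or "int",
        "dest": operands[0],        # destination is always the first operand
        "sources": operands[1:],
    }


fp_add = parse_instruction("ADD.F R1,R2,R3")
store = parse_instruction("ST MEM [RB,RX,D], R2")
print(fp_add)
print(store)
```

For the store, the destination field is the memory operand and the only source is R2, which matches the "destination first" rule on the slide.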
Slide 12
ISA Assumptions
assume all instruction sets have a PSW and condition codes (CC)
branch is BC.CC target; target is either R or Mem.
unconditional branch is BR, even though it’s implemented with BC
other branches: BCT, BAL (branch and link)
Slide 13
Moore’s Law
Moore’s Law: number of transistors per chip increases 4X every 3 years
CAGR = 60%
Source: Intel
[Figure: transistors per die (log scale, 1 to 10^8) versus year (1970 to 2000). Memory series: 1K, 4K, 16K, 64K, 256K, 1M, 4M, 16M. Microprocessor series: 4004, 8080, 8086, 80286, Intel386™ Processor, Intel486™ Processor, Pentium™ Processor.]
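The link between "4X every 3 years" and the quoted CAGR can be checked with a few lines of arithmetic. This is a minimal sketch; the function name `annual_growth_rate` is an assumption made for illustration.

```python
def annual_growth_rate(factor, years):
    """Compound annual growth rate implied by growing `factor`-fold in `years`."""
    return factor ** (1.0 / years) - 1.0


moore_cagr = annual_growth_rate(4.0, 3.0)   # 4X every 3 years
print(f"CAGR implied by 4X every 3 years: {moore_cagr:.1%}")
```

The result is a little under 59%, which the slide rounds to 60%. The same growth is equivalent to a doubling every 1.5 years.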
Slide 14
Die Size Growth
Source: Intel
[Figure: die size (mm², log scale, 10 to 1000) versus year (1975 to 2000). DRAM series: 64K, 256K, 1M, 4M, 16M. Logic series: 8086, 68000, 80286, 68020, 80386, 80486, 68040, Pentium (tm).]
Slide 15
Finer Lithography
Source: Intel
[Figure: resolution (µm, log scale, 0.01 to 10) versus year ('83 to '01). Curves: Resolution, Overlay, CD Control. Generations: 1.0, 0.8, 0.5, 0.35, 0.25.]
Slide 16
Limits on scaling
As device sizes get smaller, it becomes difficult to maintain the rate at which feature sizes shrink
It currently appears that around 50 nm several factors may limit scaling
  - hot carrier effects
  - time-dependent dielectric breakdown
  - gate tunneling current
  - short channel effects and effect on VT
Slide 17
Beyond CMOS MOSFETs
If “limits” prove real, there are alternative technologies with system implications
  - low temperature CMOS
  - sub-threshold logic
  - new gate oxide materials
  - SOI
Slide 18
Fabrication Facility Costs
[Figure: fab cost in millions of dollars (log scale, 1 to 10000) versus year (1965 to 2000).]
Source: VLSI Research, Inc.
Moore’s Second Law: Fab Costs Grow 40% Per Year
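To get a feel for what 40% annual growth in fab cost means, the sketch below computes how long it takes costs to double or to grow tenfold at a constant rate. The helper name `years_to_multiply` is hypothetical, and a constant 40% rate is taken from the slide rather than from any additional data.

```python
import math


def years_to_multiply(annual_rate, factor):
    """Years needed to grow by `factor` at a constant compound annual rate."""
    return math.log(factor) / math.log(1.0 + annual_rate)


fab_doubling_years = years_to_multiply(0.40, 2.0)
fab_tenfold_years = years_to_multiply(0.40, 10.0)
print(f"fab cost doubles in about {fab_doubling_years:.2f} years")
print(f"fab cost grows 10X in about {fab_tenfold_years:.2f} years")
```

At this rate costs double roughly every two years, which is consistent with each decade on the figure's log axis spanning about seven years.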
Slide 19
Microprocessor Business Model
New “generation” of silicon technology every 2.5-3 years
  - 30% reduction in linear dimensions => 50% reduction in area
  - 30% reduction in device delay => 50% increase in speed
  - Used to reduce cost and improve performance on previous generation microprocessor
  - Used to enable new generation of microprocessor with wider, more parallel, more functional machine organization
  - Incremental changes between generations
Business growth enables investment in new technology
  - Driven by performance, new applications, and “dancing bunny people”
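The per-generation scaling figures can be worked out directly. The sketch below assumes area scales with the square of the linear dimension and speed scales as the reciprocal of device delay; the function name `scale_generation` is made up for illustration.

```python
def scale_generation(linear_shrink=0.30, delay_reduction=0.30):
    """Return (area ratio, speedup) for one technology generation.

    Assumptions: area ~ (linear dimension)^2 and speed ~ 1 / device delay.
    """
    area_ratio = (1.0 - linear_shrink) ** 2
    speedup = 1.0 / (1.0 - delay_reduction)
    return area_ratio, speedup


area_ratio, speedup = scale_generation()
print(f"area shrinks to {area_ratio:.0%} of the previous generation")
print(f"speed increases by {speedup - 1.0:.0%}")
```

Under these assumptions the area falls to 49% of its previous value, matching the slide's 50%, while the speed gain comes out nearer 43%, so the slide's 50% reads as a round figure.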
Slide 20
Performance Growth
[Figure: workstation performance (0 to 1200) versus year (1987 to 1997). Labeled systems: HP 9000/750, SUN-4/260, MIPS M2000, MIPS M/120, IBM RS6000, DEC Alpha 5/500, DEC Alpha 21264/600, DEC Alpha 5/300, DEC Alpha 4/266, DEC AXP/500, IBM POWER 100.]
Workstation Performance Improving 54% per year
That’s almost 1% per week!
Figure 1.20 from P&H
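The "almost 1% per week" remark follows from compounding the annual rate over 52 weeks. A minimal sketch, with hypothetical helper names `weekly_rate` and `cumulative_gain`:

```python
def weekly_rate(annual_rate, weeks_per_year=52):
    """Weekly compound rate equivalent to a given annual compound rate."""
    return (1.0 + annual_rate) ** (1.0 / weeks_per_year) - 1.0


def cumulative_gain(annual_rate, years):
    """Total improvement factor after compounding for `years` years."""
    return (1.0 + annual_rate) ** years


per_week = weekly_rate(0.54)
decade_gain = cumulative_gain(0.54, 10)
print(f"weekly improvement: {per_week:.2%}")
print(f"gain over 10 years: {decade_gain:.0f}x")
```

The weekly figure comes out a little above 0.8%, and ten years of 54% growth compounds to roughly a 75-fold improvement.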
Slide 21
PC Shipment Growth
Performance Growth and New Applications Drive Volume
Source: Dataquest by A. Yu in IEEE Micro 12/96
Slide 22
System Price/Performance
System                  Year  Performance  Memory  Price            Price/Performance
IBM System 360/50       1965  0.15 MIPS    64 KB   $1M              $6.6M per MIPS
DEC VAX 11/780          1977  1 MIPS       1 MB    $200K            $200K per MIPS
Dell Dimension XPS-300  1998  725 MIPS     64 MB   $2412 (1/4/98)   $3.33 per MIPS
Photographs from Virtual Computing History Group
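The price-per-MIPS figures on this slide are simple ratios and can be reproduced as follows. The data are the three systems listed on the slide; the variable and function names are assumptions made for this sketch.

```python
# (system, year, MIPS, price in dollars), as given on the slide
systems = [
    ("IBM System 360/50", 1965, 0.15, 1_000_000),
    ("DEC VAX 11/780", 1977, 1.0, 200_000),
    ("Dell Dimension XPS-300", 1998, 725.0, 2412),
]


def price_per_mips(price, mips):
    """Dollars paid per MIPS of performance."""
    return price / mips


cost = {name: price_per_mips(price, mips) for name, _, mips, price in systems}
for name, year, _, _ in systems:
    print(f"{year}  {name:<24} ${cost[name]:,.2f} per MIPS")

improvement = cost["IBM System 360/50"] / cost["Dell Dimension XPS-300"]
print(f"1965 to 1998 improvement: about {improvement:,.0f}x")
```

The 360/50 figure computes to about $6.67M per MIPS, which the slide shows as $6.6M. The 1965 to 1998 improvement works out to roughly a factor of two million.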
Slide 23
Representative System
[Figure: block diagram of a representative system. Components shown: L2 cache; pipelines, registers, L1 Icache, L1 Dcache; CPU ... CPU; chipset; memory; I/O bus(es).]
Slide 24
Summary
Current architectures exploit parallelism for performance
  - Multiple pipelines and caches
  - Multiprocessors
Technology costs are increasing rapidly
  - High volume is critical to recover costs
    • interface standards and evolution necessary
  - Product success depends on cost-effective area allocation and partitioning
Technology capacity and performance increasing rapidly
  - Critical to evaluate broad space of design options at each generation
    • Opportunity to learn from the past and to innovate
Theoretical analysis and modeling combined with design targets are powerful tools for developing computer systems.
This course will help prepare you to apply those tools in your future career in theory or practice.
Slide 25
This Week
Check access to the web page
  - Make sure you can read and print
  - First problem set will be posted by Friday
Reading
  - Scan Chapter 1
  - Sections 2.1, 2.2
Room Change
  - move to Gates B03
  - no festival Friday lecture