Ch6a - EE/CS/CPE 3760 - Computer Organization - Seattle Pacific University
Automobile Manufacturing
1. Build frame. 60 min.
2. Add engine. 50 min.
3. Build body. 80 min.
4. Paint. 40 min.
5. Finish. 80 min.
Total: 310 min.
Latency: Time from start to finish for one car. 310 minutes per car. (smaller is better)
Throughput: Number of finished cars per time unit. 1 car/310 min = 0.19 cars/hour. (larger is better)
Issues: How can we make the process better by adding more workers?
6.1
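The latency and throughput figures above can be checked with a few lines of Python. This is only a sketch of the arithmetic; the stage times come from the slide, while the function and variable names are made up for illustration.

```python
# Stage times in minutes, taken from the slide above.
STAGE_MINUTES = {"frame": 60, "engine": 50, "body": 80, "paint": 40, "finish": 80}


def latency_minutes(stages):
    """Time for one car when the five steps run back to back."""
    return sum(stages.values())


def throughput_cars_per_hour(stages):
    """Cars finished per hour when only one car is built at a time."""
    return 60 / latency_minutes(stages)


if __name__ == "__main__":
    print(latency_minutes(STAGE_MINUTES))                      # 310
    print(round(throughput_cars_per_hour(STAGE_MINUTES), 2))   # 0.19
```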
An Assembly line
6.1
[Figure: cars 1 through 4 moving down the five-stage line over time; stage times 60, 50, 80, 40, 80 min; one car enters every 80 min]
Short stages can’t produce faster than one car/80 min, or a backlog will occur at longer stages.
Latency: 400 min/car
Throughput: 4 cars/640 min (1 car/160 min). Will approach 1 car/80 min as time goes on
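The assembly-line numbers can be reproduced with a short sketch. It assumes, as the 400 min latency implies, that every stage slot is stretched to the slowest stage (80 min) and that a new car enters once per slot. The function names are illustrative only.

```python
# Stage times in minutes, taken from the slide above.
STAGE_MINUTES = [60, 50, 80, 40, 80]


def line_period(stages):
    """Time between cars: the line can go no faster than its slowest stage."""
    return max(stages)


def line_latency(stages):
    """Time for one car to cross the line when every slot lasts one period."""
    return len(stages) * line_period(stages)


def finish_time(stages, n_cars):
    """Minute at which car number n_cars leaves the line."""
    return line_latency(stages) + (n_cars - 1) * line_period(stages)


def average_minutes_per_car(stages, n_cars):
    """Average time per finished car after n_cars have been built."""
    return finish_time(stages, n_cars) / n_cars


if __name__ == "__main__":
    print(line_latency(STAGE_MINUTES))                    # 400
    print(finish_time(STAGE_MINUTES, 4))                  # 640
    print(average_minutes_per_car(STAGE_MINUTES, 4))      # 160.0
    print(average_minutes_per_car(STAGE_MINUTES, 1000))   # 80.32
```

With 1000 cars the average is already 80.32 min per car, which is the "approaches 1 car/80 min" behavior.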
Applying Assembly Lines to CPUs
• The single-cycle design did everything “at once”
• Can we break the single-cycle design up into stages?
• Use the multi-cycle design to help us decide what can go together
6.1
• Issues:
• Why not base the design on multi-cycle?
• Car assembly works well. Will it be as easy to apply the same technique to a CPU?
Breaking up the Single-Cycle Datapath
6.2
[Figure: single-cycle datapath (PC, Instruction Memory, Registers, sign extend 16 to 32, Sh. Left 2, Add, Data Memory) with instruction fields Imm:[15-0], Rs:[25-21], Rt:[20-16], Rd:[15-11], divided into five stages]
Stages from multi-cycle design:
1. Instr. Fetch, PC=PC+4
2. Instr. Decode, Register Fetch
3. Execute, Address Calc.
4. Memory
5. Reg. Write-back
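The field labels in the datapath figure (Rs:[25-21], Rt:[20-16], Rd:[15-11], Imm:[15-0]) describe bit slices of the 32-bit instruction. As a rough sketch of what the decode stage extracts, the function below pulls those slices out in Python. The function names and the two sample encodings are illustrative assumptions, not part of the slides.

```python
def sign_extend_16(value):
    """Interpret the low 16 bits as a signed number (the 'sign extend' box)."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def decode_fields(instruction):
    """Split a 32-bit MIPS instruction word into the fields used by the datapath."""
    return {
        "op": (instruction >> 26) & 0x3F,    # Op:[31-26]
        "rs": (instruction >> 21) & 0x1F,    # Rs:[25-21]
        "rt": (instruction >> 16) & 0x1F,    # Rt:[20-16]
        "rd": (instruction >> 11) & 0x1F,    # Rd:[15-11]
        "imm": sign_extend_16(instruction),  # Imm:[15-0], sign extended
    }


if __name__ == "__main__":
    # Assumed sample encodings: add $8, $9, $10 and lw $8, -4($9).
    print(decode_fields(0x012A4020))
    print(decode_fields(0x8D28FFFC))
```

All fields are extracted for every instruction; which ones are actually used depends on the instruction type.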
The Key - Pipeline Registers
6.2
[Figure: the same datapath with clocked pipeline registers inserted between the five stages (Instr. Fetch, PC=PC+4; Instr. Decode, Register Fetch; Execute, Address Calc.; Memory; Reg. Write-back); labels: clock, PC+4]
If only one instruction is processed at a time, this is similar to multi-cycle
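A toy model can show what the pipeline registers do: on every clock edge each register latches whatever the stage before it produced, so instructions advance one stage per cycle. The sketch below tracks only which instruction occupies which stage. The stage abbreviations and function name are assumptions made for illustration, and no hazards are modeled.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def run_pipeline(instructions):
    """Return, for each clock cycle, which instruction occupies each stage."""
    slots = [None] * len(STAGES)  # slots[i] is the instruction in stage i
    pending = list(instructions)
    history = []
    while pending or any(s is not None for s in slots):
        # Clock edge: a new instruction enters IF, everything else shifts right.
        slots = [pending.pop(0) if pending else None] + slots[:-1]
        if any(s is not None for s in slots):
            history.append(dict(zip(STAGES, slots)))
    return history


if __name__ == "__main__":
    print(len(run_pipeline(["add"])))                # 5 cycles, like multi-cycle
    print(len(run_pipeline(["add", "sub", "lw"])))   # 7 cycles = 5 + 3 - 1
```

With a single instruction the model takes five cycles, which matches the remark that the design then behaves like the multi-cycle one. The gain only appears when instructions overlap.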
Example: ADD Instruction
6.2
[Figure: pipelined datapath with ADD $Rd, $Rs, $Rt shown moving through each stage in turn; label: PC+4]
A new instruction enters the IF stage each cycle
By the time ADD reaches write-back, the write register number comes from whichever instruction is then in the decode stage, so the datapath writes the correct data to the wrong register
In general, arrows that go backwards across pipeline stages may be bad news...
Correcting the Write Register Problem
6.2
[Figure: pipelined datapath in which the Rt:[20-16] and Rd:[15-11] fields are passed along through the pipeline registers, so the write register number reaches write-back together with its own instruction; label: PC+4]
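The write register problem and its fix can be illustrated with a small simulation. It is a simplified sketch, not the actual datapath: each instruction is reduced to a hypothetical (destination register, result) pair, and when the decode stage is empty the uncorrected version simply skips the write, whereas real hardware would use whatever bits happen to be there.

```python
ID, WB = 1, 4  # positions of the decode and write-back stages in a 5-stage pipe


def run(instructions, carry_write_reg):
    """Run (dest_register, value) pairs through the pipeline; return the registers.

    carry_write_reg=False models the flaw: the write register number is read
    from the instruction currently in decode. True models the fix: the number
    travels through the pipeline registers with its own instruction.
    """
    regs = {}
    pipeline = [None] * 5
    pending = list(instructions)
    while pending or any(p is not None for p in pipeline):
        # Clock edge: one new instruction enters, the rest advance a stage.
        pipeline = [pending.pop(0) if pending else None] + pipeline[:-1]
        in_wb = pipeline[WB]
        if in_wb is None:
            continue
        dest, value = in_wb
        if not carry_write_reg:
            in_id = pipeline[ID]
            if in_id is None:
                continue  # simplification: no instruction in decode, skip write
            dest = in_id[0]
        regs[dest] = value
    return regs


if __name__ == "__main__":
    program = [(8, 100), (9, 200), (10, 300), (11, 400), (12, 500)]
    print(run(program, carry_write_reg=False))  # {11: 100, 12: 200}
    print(run(program, carry_write_reg=True))
```

In the uncorrected run the first instruction's result (100) lands in register 11, the destination of the instruction three slots behind it. That is the "correct data, wrong register" behavior.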
Assembly-line Control Signals
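In an assembly line, the manufacturing instructions can be attached to the car. The instructions then move along with the car.
[Figure: five cars on the line, each carrying only the instructions for the stages it has not yet passed (F, E, B, P, F: frame, engine, body, paint, finish)]
• F: Standard, E: 135 HP, B: 2-door, P: Green, F: Leather
• E: 190 HP, B: 4-door, P: Blue, F: Cotton
• B: 2-door, P: Lavender, F: Leather
• P: Green, F: Vinyl
• F: Leather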
By separating the control signals by stages, only the signals needed for the current stage must be decoded.
All signals for later stages must be passed along.
6.1
The Pipelined Control Logic
6.3
[Figure: pipelined datapath with a Control unit decoding Op:[31-26]; the control signals are grouped as E, M, and W and carried through the pipeline registers, with each register dropping the group that has already been used. Signals shown: ALUOp (to ALU control), ALUSrc, RegDest, Branch, PCSrc, MemRead, MemWrite, MemToReg, RegWrite]
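The idea of decoding once and then passing the E, M, and W groups down the pipeline can be sketched as follows. The assignment of signals to groups and the values for the two sample instructions follow the usual textbook MIPS control table. They are assumptions here, since the slide text lists only the signal names, and the function names are illustrative.

```python
# Assumed textbook control values, grouped by the stage that consumes them.
CONTROL_TABLE = {
    "R-type": {
        "E": {"RegDest": 1, "ALUOp": 0b10, "ALUSrc": 0},
        "M": {"Branch": 0, "MemRead": 0, "MemWrite": 0},
        "W": {"RegWrite": 1, "MemToReg": 0},
    },
    "lw": {
        "E": {"RegDest": 0, "ALUOp": 0b00, "ALUSrc": 1},
        "M": {"Branch": 0, "MemRead": 1, "MemWrite": 0},
        "W": {"RegWrite": 1, "MemToReg": 1},
    },
}


def decode_stage(kind):
    """Control produces all three groups; they are latched after decode."""
    groups = CONTROL_TABLE[kind]
    return {"W": groups["W"], "M": groups["M"], "E": groups["E"]}


def execute_stage(id_ex):
    """Use the E group; pass only W and M along to the next register."""
    return id_ex["E"], {"W": id_ex["W"], "M": id_ex["M"]}


def memory_stage(ex_mem):
    """Use the M group; pass only W along to the last register."""
    return ex_mem["M"], {"W": ex_mem["W"]}


def writeback_stage(mem_wb):
    """Use the W group; nothing is left to pass along."""
    return mem_wb["W"]


if __name__ == "__main__":
    used_e, ex_mem = execute_stage(decode_stage("lw"))
    used_m, mem_wb = memory_stage(ex_mem)
    print(used_e, used_m, writeback_stage(mem_wb))
```

Each pipeline register is narrower than the one before it, which mirrors the car losing one instruction tag per station.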
How’d we do?
• Compared to Single-cycle
• 5 stages --> Potentially 5x speedup
• Not likely: stages won’t all be equally long, and pipeline registers will cause some delays
• Latency --> Greater than in single-cycle design
• More complexity, but nicely divided up
• Compared to Multi-cycle
• Smaller speedup since some multi-cycle instructions are shorter
• May be less complex (but wait…)
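The comparison with single-cycle can be made concrete with a small calculation. The stage delays and the pipeline register overhead below are hypothetical numbers chosen only to illustrate the two effects named above (unequal stages and register delay). They do not come from the slides.

```python
# Hypothetical stage delays in picoseconds: IF, ID, EX, MEM, WB.
STAGE_DELAYS_PS = [200, 100, 200, 200, 100]
REGISTER_OVERHEAD_PS = 20  # assumed extra delay added by each pipeline register


def single_cycle_period(delays):
    """One long clock cycle that covers every stage."""
    return sum(delays)


def pipelined_period(delays, overhead):
    """Clock is set by the slowest stage plus the register overhead."""
    return max(delays) + overhead


def steady_state_speedup(delays, overhead):
    """Throughput gain over single-cycle once the pipeline is full."""
    return single_cycle_period(delays) / pipelined_period(delays, overhead)


def pipelined_latency(delays, overhead):
    """Time for one instruction to pass through all stages."""
    return len(delays) * pipelined_period(delays, overhead)


if __name__ == "__main__":
    print(single_cycle_period(STAGE_DELAYS_PS))                                   # 800
    print(pipelined_period(STAGE_DELAYS_PS, REGISTER_OVERHEAD_PS))                # 220
    print(round(steady_state_speedup(STAGE_DELAYS_PS, REGISTER_OVERHEAD_PS), 2))  # 3.64
    print(pipelined_latency(STAGE_DELAYS_PS, REGISTER_OVERHEAD_PS))               # 1100
```

Under these assumed numbers the speedup is about 3.6x rather than 5x, and the latency of a single instruction rises from 800 ps to 1100 ps.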