ICE 4121 - dept.ru.ac.bd/ice/faculty/download/CmpOrgaPart42015.pdf
ICE 4121
Preliminaries
The Processor Level
Processor level (a.k.a. system level): The highest level in the design hierarchy.
Deals with the storage and processing of blocks of information, e.g., programs and data files.
Components: Complex, sequential circuits based on VLSI technology.
Design: Very much heuristic, i.e., very little design theory exists.
Component types: Four main groups: processors, memories, IO devices and the interconnection bus.
[Figure: microprocessor (CPU), main memory and input/output devices attached to an interconnection network (system bus)]
The Processor Level contd.
Central Processing Unit (CPU): A general-purpose, instruction-set processor responsible for program interpretation and execution.
General purpose: Not restricted to specialized processors like IO processors (IOPs).
Instruction set: Operates on word-organized instructions and data obtained from external memory. Also stores the results in external memory.
Microprocessor: A processor on a single VLSI chip. Most contemporary processors are microprocessors.
The Processor Level contd.
Essential Internal Organization of CPU at register level
[Figure: block diagram of the CPU. Labels: system bus; cache; program control unit (I unit) with program counter PC, instruction register IR, address generation and instruction decoding; control signals; datapath (E unit) with arithmetic logic unit (ALU) and register file; main memory and IO.]
The Processor Level contd.
Cache: A fast buffer designed to hold an active portion of the system’s address space. Each memory request generated by the CPU is first directed to the cache.
I-unit: Fetches instructions or data from the cache or memory. Generates the control signals required for instruction execution.
Datapath: Arithmetic Logic Unit (ALU): performs arithmetic and logical operations. Registers: temporary data storage.
System bus: Main communication link among the CPU-cache, memory and IO devices.
The Processor Level contd.
CPU’s clock: The clock period is the basic unit of time. In one clock cycle the CPU can perform a register-transfer operation.
Example: IR:=M(PC), where IR is the instruction register, M is memory and PC is the program counter.
I-unit: Decodes the instruction to determine the action. Generates the control signals.
The entire process of fetching, decoding and executing is called the instruction cycle.
Processor Level design
Prototype Structure: Basic
[Figure: block diagram. Labels: CPU, memory, interconnection network (ICN), I/O devices.]
Processor Level design
Prototype Structure: Computer with Cache and IOPs
[Figure: block diagram. Labels: CPU, cache, memory, ICN, IO processors IOP1 and IOP2, I/O devices.]
Processor Level design
Prototype Structure: Computer with Multiple CPUs and Main Memory Banks
[Figure: block diagram. Labels: two CPUs, each with a cache; three memory banks; ICN; IO processors IOP1 and IOP2; I/O devices.]
ICE 4121 August 18, 2015,
Class No. 02
Processor Basics
Processor Basics
Fundamentals: Program execution steps
1. The CPU transfers instructions and, when applicable, their input data from main memory to registers in the CPU.
2. The CPU executes the instructions in their stored sequence, unless a branch instruction alters that sequence.
3. The CPU transfers output data from CPU registers to main memory.
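Processor Basics
Fundamentals: Processor Memory Communication
[Figure: two block diagrams. Without cache: the CPU exchanges instructions and data with memory M. With cache: the CPU exchanges instructions and data with cache memory CM, and CM exchanges instructions and data with main memory MM.]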
Processor Basics
Fundamentals: User and Supervisory Modes
User programs: Run in the interest of the computer user, e.g., a word processor.
Supervisory programs: Manage various routine aspects of the computer, e.g., the O/S.
Requests for supervisory services are received directly from secondary memory units or IO devices. Such requests are known as interrupts.
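Processor Basics
Fundamentals: CPU operation: Overview of CPU behavior
[Flowchart: Begin. Instructions waiting? If no (N), keep checking. If yes (Y), fetch the next instruction, then execute the instruction. Interrupts waiting? If yes (Y), transfer to the interrupt handling program. If no (N), return to the instruction check.]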
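The CPU behavior outlined above can be sketched as a simple loop. This is only an illustrative model: the function name `run_cpu` and the two queues are hypothetical, and the sketch stops when no instructions are waiting, whereas the flowchart keeps checking.

```python
from collections import deque

def run_cpu(instructions, interrupts):
    """Fetch and execute each instruction, then service one waiting interrupt."""
    log = []
    program = deque(instructions)   # instructions waiting (hypothetical model)
    pending = deque(interrupts)     # interrupt requests waiting (hypothetical model)
    while program:                  # "Instructions waiting?"
        instr = program.popleft()   # fetch the next instruction
        log.append(f"execute {instr}")              # execute the instruction
        if pending:                 # "Interrupts waiting?"
            log.append(f"handle {pending.popleft()}")  # transfer to the handler
    return log
```

For example, `run_cpu(["LD X", "ADD"], ["io"])` services the interrupt after the first instruction and before the second.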
ICE 4121 August 19, 2015,
Class No. 03
Processor Basics
Processor Basics
Instruction cycle: Sequence of operations performed by the CPU in processing an instruction.
Consists of two steps:
Fetch: The instruction is read from memory.
Execute: The operations specified in the instruction are carried out.
CPU Registers: High-speed memory locations inside the CPU.
Inside the Program Control Unit (PCU): PC (program counter), AR (address register), IR (instruction register).
Inside the Data Processing Unit (DPU): AC (accumulator), DR (data register).
…
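The Processor Level contd.
Accumulator based CPU
[Figure: block diagram of the accumulator-based CPU. Labels: system bus to M and IO; DPU with data register DR, accumulator AC and arithmetic logic unit (ALU); PCU with instruction register IR, program counter PC, address register AR and instruction decoder; control signals.]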
Processor Basics
Basic instruction type: X1:=f(X1,X2)
X1, X2: AC, DR, PC or an external memory location M(adr).
f: Addition, subtraction, shifting and logical (word-gate) operations.
Instruction format: I=op.adr
Fetching means IR.AR:=M(PC). When fetched, IR:=op and AR:=adr. The CPU decodes the op of I in IR and executes it.
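As a rough illustration of the format I=op.adr, the sketch below packs an opcode and an address into one word and splits them again on fetch. The field widths (4-bit op, 12-bit adr) and the names `encode` and `fetch` are assumptions made for this example; the slides do not give field sizes. The PC increment follows the fetch step IR.AR:=M(PC), PC:=PC+1 used in the execution traces of these slides.

```python
OP_BITS = 4     # assumed opcode width (not given in the slides)
ADR_BITS = 12   # assumed address width (not given in the slides)

def encode(op, adr):
    """Pack an instruction word I = op.adr."""
    if not (0 <= op < (1 << OP_BITS) and 0 <= adr < (1 << ADR_BITS)):
        raise ValueError("field out of range")
    return (op << ADR_BITS) | adr

def fetch(memory, pc):
    """IR.AR := M(PC), PC := PC + 1; returns (IR, AR, new PC)."""
    word = memory[pc]
    ir = word >> ADR_BITS                # high-order bits hold the opcode
    ar = word & ((1 << ADR_BITS) - 1)    # low-order bits hold the address
    return ir, ar, pc + 1
```

With these assumed widths, `encode(0x3, 0x0AB)` gives the word `0x30AB`, and fetching it returns the two fields unchanged.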
Processor Basics
An Example: Z:=X+Y (3 operands)
HDL format     Assembly language format    Narrative format (comment)
AC:=M(X)       LD X                        Load X from M into accumulator
DR:=AC         MOV DR, AC                  Move contents of AC to DR
AC:=M(Y)       LD Y                        Load Y from M into accumulator
AC:=AC+DR      ADD                         Add DR to AC
M(Z):=AC       ST Z                        Store contents of AC in M
Uses load/store architecture for memory access.
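The five register transfers above can be followed step by step in a short Python sketch. The dictionary standing in for memory M and the function name `add_load_store` are hypothetical; only the sequence of transfers comes from the slide.

```python
def add_load_store(memory, x, y, z):
    """Compute M(z) := M(x) + M(y) using only AC and DR, as in the listing."""
    ac = memory[x]      # LD X        AC := M(X)
    dr = ac             # MOV DR, AC  DR := AC
    ac = memory[y]      # LD Y        AC := M(Y)
    ac = ac + dr        # ADD         AC := AC + DR
    memory[z] = ac      # ST Z        M(Z) := AC
    return memory
```

Only LD and ST touch memory here; the addition itself works on registers.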
ICE 4121 August 20, 2015,
Class No. 04
Processor Basics
ICE 4121 August 26, 2015,
Class No. 05
Processor Basics
ICE 4121 August 27, 2015,
Class No. 06
Processor Basics
…
Processor Basics
An Example: Z:=X+Y with memory referencing
HDL format     Assembly language format    Narrative format (comment)
AC:=M(X)       LD X                        Load X from M into accumulator
AC:=AC+M(Y)    ADD Y                       Load Y into DR and add to AC
M(Z):=AC       ST Z                        Store contents of AC in M
ADD Y takes more time than ADD, but the overall execution time may decrease since one MOV and one LD are removed.
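One way to see the trade-off is to count memory accesses for the two versions of Z:=X+Y. The sketch below assumes each instruction occupies one memory word, so every instruction costs one fetch, and it does not model how long each access takes, which the slides do not specify. The tuple encoding and the name `memory_accesses` are hypothetical.

```python
# Each entry is (mnemonic, memory operand or None for register-only instructions).
LOAD_STORE = [("LD", "X"), ("MOV", None), ("LD", "Y"), ("ADD", None), ("ST", "Z")]
MEMORY_REF = [("LD", "X"), ("ADD", "Y"), ("ST", "Z")]

def memory_accesses(program):
    """Return (instruction fetches, data accesses) for a straight-line program."""
    fetches = len(program)    # one fetch per instruction (assumed one word each)
    data = sum(1 for _, operand in program if operand is not None)
    return fetches, data
```

Under these assumptions both versions make three data accesses, but the memory-referencing version needs three instruction fetches instead of five.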
ICE 4121 Sep 09, 2015,
Class No. 07
Processor Basics
Processor Basics
Instruction Set: The basic set of commands or instructions that a microprocessor understands.
RISC: Reduced Instruction Set Computing
CISC: Complex Instruction Set Computing
Type             Instruction     HDL format              Assembly language format   Narrative format (comment)
Data transfer    Load            AC:=M(X)                LD X                       Load X from M into AC
                 Store           M(X):=AC                ST X                       Store contents of AC in M
                 Move register   DR:=AC                  MOV DR, AC                 Copy contents of AC to DR
                 Move register   AC:=DR                  MOV AC, DR                 Copy contents of DR to AC
Data processing  Add             AC:=AC+DR               ADD                        Add DR to AC
                 Subtract        AC:=AC-DR               SUB                        Subtract DR from AC
                 And             AC:=AC and DR           AND                        Bitwise AND of DR into AC
                 Not             AC:=not AC              NOT                        Complement contents of AC
Program control  Branch          PC:=adr                 BRA adr                    Jump to instruction with address adr
                 Branch zero     If AC=0 then PC:=adr    BZ adr                     Jump to instruction with address adr if AC=0
Processor Basics
Negation with SUB
HDL format     Assembly language format    Narrative format (comment)
DR:=AC         MOV DR, AC                  Copy contents of AC to DR
AC:=AC-DR      SUB                         Subtract DR from AC (AC becomes 0)
AC:=AC-DR      SUB                         Subtract DR from AC (AC becomes the negative of its original value)
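The negation sequence can be checked with a few lines of Python. The 16-bit word size and two's-complement wraparound are assumptions made for this sketch; the slides do not state a word length.

```python
WORD_BITS = 16                   # assumed word length (not given in the slides)
MASK = (1 << WORD_BITS) - 1

def negate(ac):
    """Negate AC with MOV DR, AC followed by two SUB instructions."""
    dr = ac                      # MOV DR, AC  DR := AC
    ac = (ac - dr) & MASK        # SUB         AC := AC - DR, which gives 0
    ac = (ac - dr) & MASK        # SUB         AC := 0 - DR, the two's complement
    return ac
```

With the assumed word length, negating 5 leaves the bit pattern `0xFFFB` in AC, which is -5 in 16-bit two's complement.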
Processor Basics
A Multiplication Program: AC:=AC×N
Multiplicand: Initial content of AC. Multiplier: N, a variable stored in memory.
Principle: Use the ADD instruction N times, so a loop is needed.
N must be decremented by 1 after every ADD, i.e., N must be moved to AC.
N must be checked for zero to decide between loop continuation and exit, i.e., N must be moved to AC.
Other memory locations will be needed.
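At a high level, the principle above is repeated addition. A minimal sketch, assuming a non-negative integer multiplier; the names `y` (initial content of AC) and `n` (the multiplier N) are chosen for this example only.

```python
def multiply_by_repeated_addition(y, n):
    """Return y * n using only addition, decrement and a zero test."""
    if n < 0:
        raise ValueError("the loop only terminates for a non-negative multiplier")
    product = 0          # partial product
    while n != 0:        # check N for zero: continue or exit
        product += y     # one ADD per pass through the loop
        n -= 1           # decrement N by 1 after the ADD
    return product
```

On the accumulator machine each of these steps needs extra loads and stores, because both the zero test and the arithmetic go through AC.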
Processor Basics
Line   Location   Instruction or data   Comment
0      one        00…001                The constant 1
1      mult       N                     The multiplier
2      ac         00…000                Location for initial value Y of AC
3      prod       00…000                Location for (partial) product P
4                 ST ac                 Save initial value Y of AC
5      loop       LD mult               Load N into AC to test for loop termination
6                 BZ exit               Exit if N=0; otherwise continue
7                 LD one                Load 1 into AC
8                 MOV DR, AC            Move 1 from AC to DR
9                 LD mult               Load N into AC to decrement it
10                SUB                   Subtract 1 from N
11                ST mult               Store decremented N
12                LD ac                 Load initial value Y of AC
13                MOV DR, AC            Move Y from AC to DR
14                LD prod               Load current partial product P
15                ADD                   Add Y to P
16                ST prod               Store the new partial product P
17                BRA loop              Branch to loop
18     exit
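To check the listing, it can be run on a small interpreter for the instructions it uses. This is a sketch under stated assumptions: line numbers serve as memory addresses, instructions are stored as Python tuples rather than encoded words, reaching `exit` is treated as a halt (the listing has no instruction there), and the function name `multiply` is hypothetical.

```python
# Symbolic addresses follow the line numbers of the listing.
ONE, MULT, AC0, PROD, LOOP, EXIT = 0, 1, 2, 3, 5, 18

def multiply(y, n, max_steps=100_000):
    """Run the listing with AC = y and multiplier n.

    Returns (product stored at prod, number of instructions executed).
    """
    memory = {
        ONE: 1, MULT: n, AC0: 0, PROD: 0,   # data words, lines 0 to 3
        4: ("ST", AC0),
        5: ("LD", MULT),
        6: ("BZ", EXIT),
        7: ("LD", ONE),
        8: ("MOV", None),                   # MOV DR, AC
        9: ("LD", MULT),
        10: ("SUB", None),
        11: ("ST", MULT),
        12: ("LD", AC0),
        13: ("MOV", None),                  # MOV DR, AC
        14: ("LD", PROD),
        15: ("ADD", None),
        16: ("ST", PROD),
        17: ("BRA", LOOP),
    }
    ac, dr, pc, steps = y, 0, 4, 0          # execution starts at line 4
    while pc != EXIT:                       # assumption: reaching exit halts
        if steps >= max_steps:
            raise RuntimeError("step limit reached")
        op, adr = memory[pc]                # fetch: IR.AR := M(PC)
        pc += 1                             # PC := PC + 1
        steps += 1
        if op == "LD":
            ac = memory[adr]
        elif op == "ST":
            memory[adr] = ac
        elif op == "MOV":
            dr = ac
        elif op == "ADD":
            ac = ac + dr
        elif op == "SUB":
            ac = ac - dr
        elif op == "BRA":
            pc = adr
        elif op == "BZ":
            if ac == 0:
                pc = adr
        else:
            raise ValueError(f"unknown instruction {op}")
    return memory[PROD], steps
```

Note that when the program reaches `exit`, the product is in memory location `prod`, while AC holds the final (zero) value of N. For Y=7 and N=6 the sketch executes 81 instructions: one ST, six passes through the 13-instruction loop, and a final LD and BZ.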
ICE 4121 September 16, 2015,
Class No.08
Processor Basics
Program Execution
Clock cycle   PCU action (fetch)        DPU action (execute)
1             IR.AR:=M(PC), PC:=PC+1
2                                       M(AR):=AC
2 clock cycles for 1 instruction.
The PC is both read and written in one clock cycle.
Memory locations: 1000->one, 1001->mult, 1002->ac, 1003->prod, 1004->ST ac, 1005->loop, ….., 1017->BRA loop, 1018->exit
Program execution starts at 1004.
Program Execution
Clock cycle   Inst cy.     PC     AR     PCU action                             DPU action
1             ST ac        1004          IR.AR:=M(PC), PC:=PC+1
2                                 1002                                          M(AR):=AC
3             LD mult      1005          IR.AR:=M(PC), PC:=PC+1
4                                 1001                                          AC:=M(AR)
5             BZ exit      1006          IR.AR:=M(PC), PC:=PC+1
6                                 1001   Test AC; no further action if AC!=0    None
7             LD one       1007          IR.AR:=M(PC), PC:=PC+1
8                                 1000                                          AC:=M(AR)
9             MOV DR, AC   1008          IR.AR:=M(PC), PC:=PC+1
10                                dddd                                          DR:=AC
11            LD mult      1009          IR.AR:=M(PC), PC:=PC+1
12                                1001                                          AC:=M(AR)
13            SUB          1010          IR.AR:=M(PC), PC:=PC+1
14                                dddd                                          AC:=AC-DR
15            ST mult      1011          IR.AR:=M(PC), PC:=PC+1
16                                1001                                          M(AR):=AC
17            LD ac        1012          IR.AR:=M(PC), PC:=PC+1
18                                1002                                          AC:=M(AR)
Program Execution
Clock cycle   Inst cy.     PC     AR     PCU action                             DPU action
19            MOV DR, AC   1013          IR.AR:=M(PC), PC:=PC+1
20                                dddd                                          DR:=AC
21            LD prod      1014          IR.AR:=M(PC), PC:=PC+1
22                                1003                                          AC:=M(AR)
23            ADD          1015          IR.AR:=M(PC), PC:=PC+1
24                                dddd                                          AC:=AC+DR
25            ST prod      1016          IR.AR:=M(PC), PC:=PC+1
26                                1003                                          M(AR):=AC
27            BRA loop     1017          IR.AR:=M(PC), PC:=PC+1
28                                1005   PC:=AR                                 None
29            LD mult      1005          IR.AR:=M(PC), PC:=PC+1
30                                1001                                          AC:=M(AR)
31            BZ exit      1006          IR.AR:=M(PC), PC:=PC+1
32                                1018   Test AC: PC:=AR if AC=0                None
33                         1018
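The length of such a trace can be estimated from the program structure. The sketch below assumes two clock cycles per instruction, as stated for this machine, and counts one setup instruction (ST ac), thirteen instructions per loop pass (LD mult through BRA loop) and the final LD mult and BZ exit. The function name `total_cycles` is hypothetical.

```python
CYCLES_PER_INSTRUCTION = 2   # one fetch cycle plus one execute cycle

def total_cycles(n):
    """Clock cycles needed to reach exit for a multiplier value n >= 0."""
    setup = 1        # ST ac
    loop_body = 13   # LD mult ... BRA loop, executed once per pass
    final = 2        # last LD mult and the BZ exit that is taken
    return CYCLES_PER_INSTRUCTION * (setup + loop_body * n + final)
```

For a multiplier of 1 this gives 32 cycles, which agrees with the trace: the BZ that exits the loop completes in clock cycle 32.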
ICE 4121 October 13, 2015,
Class No.09
Processor Basics
Revisiting the multiplication program execution from Class 8
Additional Features of Processor
Architecture extension
Pipelining
Architecture Extension
Register files: A multipurpose register set, known as a register file, replacing AC, DR or AR.
Additional data, instruction and address types: Added circuitry, e.g., for multiplication or division operations.
Status register: Register to indicate computation status (e.g., division by zero).
Program control stack: Supports the transfer of control to and from programs called as subroutines.
…
Accumulator Based
[Figure: block diagram of the accumulator-based CPU. Labels: system bus to M and IO; DPU with DR, AC and arithmetic logic unit (ALU); PCU with IR, PC, AR and instruction decoder; control signals.]
CPU with general register Organization
[Figure: block diagram of a CPU with general register organization. Labels: system bus to M and IO; DPU with data register, status register, ALU and register file; PCU with instruction register, address register, program counter, stack pointer, address generation logic and control circuits; internal control signals.]
Additional Features of Processor
Pipelining: Instruction-level parallelism.
Implemented by overlapping the operations of the DPU and the PCU, as long as they do not work on or with a common resource.
While the DPU executes the current instruction, the PCU can fetch the next instruction.
Pipelining: Example
Negation without pipelining
Clock cycle   Instruction   PC     PCU action                DPU action
1             MOV DR, AC    2000   IR.AR:=M(PC), PC:=PC+1
2                           2001                             DR:=AC
3             SUB           2001   IR.AR:=M(PC), PC:=PC+1
4                           2002                             AC:=AC-DR
5             SUB           2002   IR.AR:=M(PC), PC:=PC+1
6                           2003                             AC:=AC-DR
Negation with pipelining
Clock cycle   Instruction   PC     PCU action                DPU action
1             MOV DR, AC    2000   IR.AR:=M(PC), PC:=PC+1
2             MOV/SUB1      2001   IR.AR:=M(PC), PC:=PC+1    DR:=AC
3             SUB1/SUB2     2002   IR.AR:=M(PC), PC:=PC+1    AC:=AC-DR
4             SUB2          2003                             AC:=AC-DR
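The overlap in the pipelined table can be generated mechanically. The sketch below is a simplified model that assumes straight-line code with no branches and no conflicts over shared resources; the function name `pipelined_schedule` is hypothetical.

```python
def pipelined_schedule(instructions):
    """Return (clock cycle, instruction fetched, instruction executed) rows
    for a two-stage fetch/execute pipeline."""
    rows = []
    count = len(instructions)
    for cycle in range(1, count + 2):
        # The PCU fetches instruction number `cycle`, if there is one left.
        fetched = instructions[cycle - 1] if cycle <= count else None
        # The DPU executes the instruction fetched in the previous cycle.
        executed = instructions[cycle - 2] if cycle >= 2 else None
        rows.append((cycle, fetched, executed))
    return rows
```

For the three-instruction negation sequence this yields four rows, matching the four clock cycles in the pipelined table, against six cycles (two per instruction) without pipelining.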
Pipelining contd.
RISC processors speed up their operation with pipelining.
Graphical representation of the two-stage pipelining mentioned before:
[Figure: timing diagram for instructions I1, I2 and I3, each shown as a fetch stage followed by an execute stage, against a clock cycle axis numbered 1 to 6]