System-on-Chip Design
Jong-Wha Chong
Wireless Location and SOC Lab.
Hanyang University
Agenda
Introduction of SoC
Technology Trend
Technical SoC Challenges
- Electronic System Level (ESL) Design
- Low-Power Design
- Design-For-Manufacturability (DFM)
Closing Remarks
History of IT Industry
Radio C-TV PC Internet
D-TV
HHP
IMT2000
1950~70’s 1980’s 1990’s 2000’s
Transistor
Logic IC
6.0 ~ 3.0um 2.0 ~ 1.0um 0.8 ~ 0.6um 0.5 ~ 0.25um 0.18 ~ 0.06 um
VLSI SOC
8G NAND 256M DRAM 4M DRAM 256K DRAM
What is SoC?
SoC = System + Chip
Complete system in a single package
Need expertise of System & Semiconductor
[Figure: a single chip integrating MODEM, RF, IMAGE, MEMORY, CPU, and I/O blocks]
Inside SoC
“An IC dedicated to a specific application
that contains a computation engine
(microprocessor core, DSP core, MPEG
core or graphics core), memory and logic
on a single chip”
Hardware-independent Software
Applications
User defined
Interface
Libraries
Hardware-Dependent Software
Operating Systems
(Kernel)
Device Drivers
Hardware
Analog
CPU
Core
DSP ROM
Middleware
MPEG Cache
DRAM
Logic [Gartner, 2003.2Q ]
6
SoC Applications
Media Players
SOC
Mobile
Communication
Mobile
Computing
Home
Entertainment
7
Digital TV on SoC
Integration roadmap for D-TV: 7 chips (2000) -> 2 chips (2001 ~ 2002) -> 1 chip (2003 ~ 2005)
Functional blocks: CPU, MPEG2 TS, MPEG2 A/V Decoder, Format Converter, OSGM, HDTV Channel Decoder, HDTV Channel EQ.
Cellular Phone on SoC
Integration roadmap for HHP (CDMA): 5 chips (2000) -> 3 chips (2001 ~ 2003) -> 1 chip (2004 ~ 2005)
Functional blocks: CPU, Modem, BBA, RF, Memory, MP3, DSP, MP3/MPEG-4 Multimedia, GPS
CD-MP3 player on SoC
Integration roadmap for the CD-MP3 player: 7 chips (~2000) -> 4 chips (2001 ~ 2002) -> 1 chip (2003 ~)
Merged parts in the 4-chip stage: Digital Servo + CD DSP + Audio DAC (3 die), MP3 Decoder + ESP (2 die)
Functional blocks: RF, Digital Servo, CD DSP, Audio DAC, MP3 Decoder, ESP, MCU
DVD player on SoC
Integration roadmap for the DVD player: 6 chips (~2000) -> 3 chips (2001 ~ 2002) -> 1 chip (2003 ~)
Merged parts in the 3-chip stage: Data Processor + Servo, MPEG2 A/V Decoder + MCU (the figure labels these 2 die and 3 die)
Functional blocks: RF, Data Processor, Servo, MPEG2 Audio Decoder, MPEG2 Video Decoder, MCU
SoC Market Vision
12
Key SoC Challenges
Challenges
High-Performance
Small Size
Low-Cost
Low power
Requirement
■ Design Technology
- System-Level Design
- Low-Power Design
■ Manufacturing Technology
- DFM (Design-For-Manufacturability) / DFY (Design-For-Yield)
■ Others in Nano Technology
13
14
Market Trend
15
Convergence of Mobile/Home Platforms
Convergence will continue on each side of mobile/home platforms - Home platform case, IBS 2004.
Television
Set-top Box
Game console
PC
IT vs. CE
War!
Consumer Electronics
CE World
Information Tech.
IT World
16
Chip Architecture MPSoC w/ Reconfigurability
HW DSP
CPU
Application Proc.
RF IC
Currently (90nm)
3G modem+AP+RF IC’s
3G modem chip 5M gates, 100MHz, 250mW
Application processor chip 15M gates, 500MHz, 500mW
Reconf.
HW
HW-S
DSP’s
Multi CPU cores
Multi-mode/band RF
In 2010 (45nm)
Single MPSoC
Multi-CPU 10M gates, +1GHz, 400mW
HW-accelerated scalable (HW-S) multi-DSP
10M gates, +1GHz, 300mW
Reconfigurable HW for SDR 10M gates, 100mW
17
Battery Capacity Requirement
Low Power
2x / year increase in
power efficiency!
18
Process Technology Scaling Vs. Cost
19
10M
100M
SubWavelength & Process Variation
20
Electronic System Level (ESL) Design
Low-Power Design
Design for Manufacturability (DFM)
21
22
Evolution of Design Abstraction
1970's: Transistor model
1980's: Gate-level model
1990's: Register-transfer-level model
2000's: Platform with IP reuses (SW models and OS/Drivers on MCU, DSP, MEM, and IP blocks connected by an On Chip Bus)
2010+: Multi-core SoC with HW/SW virtual components (SW tasks, SW adaptors, MCU cores, and HW adaptors communicating with IP's over an On Chip Network)
[Figure labels between successive levels: cluster, abstract]
SoC Design and Verification Flow
HW-SW co-development is important at early design stage.
System-level design: System Spec. -> System Design -> HW/SW Partitioning
HW-SW co-development
- HW Development: HW IP, HW refinement (UT -> T -> RTL), Gate
- SW Development: SW IP, SW refinement (RTOS mapping), Final code
Verification steps shown: HW-SW Co-Design, HW-SW Co-Verification, Functional Verification, Software Verification, Gate-Level Verification
ESL Definition
Electronic System Level (ESL) design and verification is an emerging electronic design methodology that focuses on the higher abstraction level concerns first and foremost.
It is defined in the ESL Design and Verification book[1] as: "the utilization of appropriate abstractions in order to increase comprehension about a system, and to enhance the probability of a successful implementation of functionality in a cost-effective manner."
The basic premise is to model the behavior of the entire system using a high-level language such as C, C++, or MATLAB.
Rapid and correct-by-construction implementation of the system can be automated using EDA and embedded software tools, although much of it is performed manually today. ESL can also be accomplished through the use of SystemC as an abstract modeling language.
[1] Brian Bailey, Grant Martin and Andrew Piziali, ESL Design and Verification: A Prescription for Electronic System Level Methodology. Morgan Kaufmann/Elsevier, 2007.
ESL for SoC
Electronic System Level is now an established approach at most of the world’s leading System-on-a-chip (SoC) design companies, and is being used increasingly in system design.
From its genesis as an algorithm modeling methodology with ‘no links to implementation’, ESL is evolving into a set of complementary methodologies that enable embedded system design, verification, and debugging through to the hardware/software implementation of custom SoC.
Challenges
- Huge complexity (i.e., many heterogeneous IP modules)
- Configurable IP (used to avoid over-design) adds another dimension of difficulty in verification and functional/code coverage
- Hardware/software co-development makes integration harder, especially early in the design flow
Need of Virtual Platform
[Figure: Relative Effort by Designer Role, 0% to 250%, for 350nm, 250nm, 180nm, 130nm, and 90nm; categories: Software, Validation, Physical, Verification, Architecture; roles: HW designer, SW designer. Source: IBS Nov. 2002]
Architecture effort & Software costs!!!!
Get HW/SW designers into a common playground!!
Challenges Requirements
• Fast & Accurate
• Early Availability & Reusability
• Hardware and software co-verification
• Early development and verification of software
• Quantitative analysis and exploration of system architecture
• Executable spec for HW/SW/System designer
• Managing large scale SoCs
27
An Evolution of the “Traditional” Flow
Virtual Platform
Co-Verification
High Level Model
Consistent
Verification
Requirement
Follow Up
28
Main Features of Virtual Platform
Design methodology based on high-level design abstraction
- Captures the concept of platform-based design approach
- Emphasizes the systematic IP reuse
The main features
- Modeled at transaction-level (TL)
- Architecture exploration and software development can happen early in design process
- Software designers can prepare fully-optimized and error-minimized code before RTL design
What is TLM?
(Untimed) Functional Level Modeling (UTF)
- executable specification (C/C++, Matlab, SystemC)
- can verify only algorithm or function
Transaction Level Modeling (TLM)
- analyze SoC architecture
- early SW development
- can estimate timing/power accurately
Register-Transfer Level Modeling (RTL)
- RTL/behavioral HW design and verification
- can simulate very accurately, but slow
At TLM, concerns focus only on mapping out data flow details: the type of data that flows and where it is stored
Basic of Transaction-Level Modeling
ctrl1/cmd1
Req
Addr
Grant
Data
ack1
ack0
Transaction
RTL
Transaction : exchange of a data or an event between two components of a modeled and simulated system
Module : structural entity, which contain processes, ports, channels, and other modules
Channel : implements one or more interfaces, and serves as a container for communication functionality
Port : object through which a module can access a channel’s interface.
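The four terms above can be illustrated with a small Python sketch. This is not SystemC; the names `FifoChannel`, `Port`, `Producer`, and `Consumer` are hypothetical and chosen only to mirror the definitions of transaction, module, channel, and port.

```python
from collections import deque


class FifoChannel:
    """Channel: implements a write/read interface and holds the communication state."""

    def __init__(self):
        self._queue = deque()
        self.transactions = 0  # number of completed data exchanges

    def write(self, data):
        self._queue.append(data)

    def read(self):
        self.transactions += 1  # one transaction = one data item handed over
        return self._queue.popleft()


class Port:
    """Port: the object through which a module reaches a channel's interface."""

    def __init__(self):
        self._channel = None

    def bind(self, channel):
        self._channel = channel

    def write(self, data):
        self._channel.write(data)

    def read(self):
        return self._channel.read()


class Producer:
    """Module: a structural entity that contains a process and a port."""

    def __init__(self):
        self.out = Port()

    def process(self, values):
        for value in values:
            self.out.write(value)


class Consumer:
    """Module that collects whatever arrives on its input port."""

    def __init__(self):
        self.inp = Port()
        self.received = []

    def process(self, count):
        for _ in range(count):
            self.received.append(self.inp.read())


channel = FifoChannel()
producer, consumer = Producer(), Consumer()
producer.out.bind(channel)
consumer.inp.bind(channel)
producer.process([1, 2, 3])
consumer.process(3)
```

The modules never touch each other's signals; they only call the channel interface through their ports, which is what lets the channel be refined later without changing the modules.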
Abstraction Models
A. "Specification model", "Untimed functional models"
B. "Component-assembly model", "Architecture model", "Timed functional model"
C. "Bus-arbitration model", "Transaction model"
D. "Bus-functional model", "Communication model", "Behavior level model"
E. "Cycle-accurate computation model"
F. "Implementation model", "Register transfer model"
[Figure: models A to F placed on two axes, Communication and Computation, each graded un-timed, approximate-timed, cycle-timed]
Time granularity for communication/computation objects can be classified into 3 basic categories.
Models B, C, D and E could be classified as TLMs.
A: “Specification Model”
Objects
- Computation
  - Behaviors
- Communication
  - Variables
[Figure: behaviors B1 (v1 = a*a), B2 (v2 = v1 + b*b), B3 (v3 = v1 - b*b), and B4 (v4 = v2 + v3; c = sequ(v4)) exchange data through variables v1, v2, v3; B2 and B3 are grouped as B2B3]
B: “Component-Assembly Model”
Objects
- Computation
  - Proc
  - IPs
  - Memories
- Communication
  - Variable channels
[Figure: B1 mapped to PE1, B2 to PE2, B3 and B4 to PE3; the PEs communicate through channels cv11, cv12, and cv2]
C: “Bus-Arbitration Model”
Objects
- Computation
  - Proc
  - IPs (Arbiters)
  - Memories
- Communication
  - Abstract bus channels
[Figure: PE1, PE2, PE3, and PE4 (Arbiter) share an abstract bus carrying cv11, cv12, and cv2; interfaces: 1. Master interface, 2. Slave interface, 3. Arbiter interface]
D: “Bus-Functional Model”
Objects
- Computation
  - Proc
  - IPs (Arbiters)
  - Memories
- Communication
  - Protocol bus channels
[Figure: PE1, PE2, PE3, and PE4 (Arbiter) connected through a protocol channel (IProtocolSlave) with signals ready, ack, address[15:0], data[31:0]; interfaces: 1. master interface, 2. slave interface, 3. arbiter interface]
E: “Cycle-Accurate Computation Model”
Objects
- Computation
  - Proc
  - IPs (Arbiters)
  - Memories
  - Wrappers
- Communication
  - Abstract bus channels
[Figure: PE1 and PE2 shown as instruction sequences (MOV r1, 10; MUL r1, r1, r1; ...; MLA r1, r2, r2, r1), PE3 and PE4 as state machines (S0 to S4, S0 to S3); channels cv11, cv12, cv2; interfaces: 1. Master interface, 2. Slave interface, 3. Arbiter interface, 4. Wrapper]
F: “Implementation Model”
Objects
- Computation
  - Proc
  - IPs (Arbiters)
  - Memories
- Communication
  - Buses (wires)
[Figure: PE1 and PE2 as instruction sequences, PE3 and PE4 as state machines (S0 to S4, S0 to S3), connected by bus wires MCNTR, MADDR, MDATA with interrupt and req signals]
Characteristics of Different Abstraction Models
Models                            Communication time   Computation time  Communication scheme  PE interface
Specification model               no                   no                variable              (no PE)
Component-assembly model          no                   approximate       variable channel      abstract
Bus-arbitration model             approximate          approximate       abstract bus channel  abstract
Bus-functional model              time/cycle accurate  approximate       protocol bus channel  abstract
Cycle-accurate computation model  approximate          cycle-accurate    abstract bus channel  pin-accurate
Implementation model              cycle-accurate       cycle-accurate    bus (wire)            pin-accurate
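The timing columns of the table above can be captured as data. The sketch below is one possible reading, not a rule stated on the slides: it assumes a model counts as a TLM when it is neither untimed in both dimensions nor cycle-accurate in both. The name `is_tlm` and the mapping of "no" to "untimed" are assumptions made for illustration.

```python
# (communication timing, computation timing) per model, taken from the table.
MODELS = {
    "Specification model": ("untimed", "untimed"),
    "Component-assembly model": ("untimed", "approximate"),
    "Bus-arbitration model": ("approximate", "approximate"),
    "Bus-functional model": ("cycle-accurate", "approximate"),
    "Cycle-accurate computation model": ("approximate", "cycle-accurate"),
    "Implementation model": ("cycle-accurate", "cycle-accurate"),
}


def is_tlm(communication, computation):
    """Assumed criterion: strictly between the specification and implementation models."""
    fully_untimed = communication == computation == "untimed"
    fully_cycle_accurate = communication == computation == "cycle-accurate"
    return not (fully_untimed or fully_cycle_accurate)


tlm_models = [name for name, timing in MODELS.items() if is_tlm(*timing)]
```

Under this assumption the four models selected are the component-assembly, bus-arbitration, bus-functional, and cycle-accurate computation models, which matches the earlier statement that models B, C, D and E could be classified as TLMs.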
39
Hardware vs. Software
40
System Architect vs. Chip Designer
41
42
Design through Various Levels of Abstraction
Architecture level
- Behavior
- Area, PWR, Speed estimation
Register Transfer Level
- Functions
- Timing
Logic Level
- Bits
- Timing
Tr. Level
- Voltages
- Currents
Physical Level (=Layout)
- Dimensions
Device Level
- In Characteristic
Technology Level
- Impurity Profiles (Dopant A, Dopant B; doping vs. depth)
Level                     Behavioral Domain                                   Structure Domain                               Physical Domain
System Level              Performance specs.                                  CPUs, Memories, Switches, Controllers, Buses   Physical partitions
Algorithm Level           Algorithms (manipulation of data structure)         Hardware modules, Data structures              Clusters
Microarchitectural Level  Operations, Register transfers, State sequencing    ALUs, MUXs, Registers, Microstore              Floorplans
Logic Level               Boolean equations, FSM                              Gates, Flip-Flops, Cells                       Cells, module plan
Circuit Level             Transfer functions, timing                          Transistors, Wires, Contacts                   Layout
47
Memory element input equations
(1), (2): J_{Y1}, K_{Y1}, J_{Y0}, K_{Y0} and the output Z, each a function of x, y_1, y_0
The next state equation for J-K Flip-Flop
y^{v+1} = J \bar{y}^{v} + \bar{K} y^{v}    (3)
For the memory elements y0, y1, the next state equations (4), (5) are obtained by substituting (1), (2) into (3)
[Figure: circuit with two J-K flip-flops Y1 and Y0 (SET, CLR, outputs Q and Q'), input X, output Z, and a common Clock]
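The J-K flip-flop next-state equation referred to above, y(v+1) = J·y' + K'·y, can be checked exhaustively with a short Python sketch. The function name `jk_next` is hypothetical.

```python
def jk_next(j, k, y):
    """Next state of a J-K flip-flop: y(v+1) = J AND (NOT y) OR (NOT K) AND y."""
    return (j & (1 - y)) | ((1 - k) & y)


# Enumerate all input combinations: (J, K, present y) -> next y
truth_table = {
    (j, k, y): jk_next(j, k, y)
    for j in (0, 1)
    for k in (0, 1)
    for y in (0, 1)
}
```

The table reproduces the four familiar modes: hold for J=K=0, reset for J=0, K=1, set for J=1, K=0, and toggle for J=K=1.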
48
1
vy
0 1 0 1
0 0
0 1
1 0
1 1
0 0
0 0
0 0
0 1
1 1
1 0
0 1
1 1
0
0
0
1
1
0
0
1
vx
Transition table
state = 0 0
state = 0 1
state = 1 1
state = 1 0
1q
2q
3q
4q
1y 2y
49
1 0
v vy y
1 1
1 0
v vy y vz
Present
states
0 1
0
0
0
1
1
0
0
1
vq
vx
1q
2q
3q
4q
vz
1vq
State table
1q
2q
3q
4q
/X Z
1/1
0/0
0/0
0/0
0/1
1/0
1/0 1/1
Input
Output
State diagram
50
1q3q
1q4q
1q 4q
2q 3q
Analysis Process
Circuit -> Memory Element Input equation -> Transition table -> State table -> State diagram -> Functional description
Design Process (Logic Synthesis)
Functional description -> State diagram -> State table -> Minimal State table -> State assignment -> Transition table -> Memory Element Input equation -> Circuit
Structure of Digital System
52
8 bits
16 bits
B - Reg Acc T - Reg Inst. Reg HL Reg PC
DATA
Buffer
MEMORY
ALUDecoder
+ 1
ADD
BufferCont. Signal
Timing
Control Contrl Bus
CPU
53
Machine cycle 1
T1: address bus <- (PC)
T2: PC <- (PC) + 1
T3: IR <- (M)
T4: decode
Machine cycle 2
T1: address bus <- (PC)
T2: PC <- (PC) + 1
T3: Z <- (M)
Machine cycle 3
T1: address bus <- (PC)
T2: PC <- (PC) + 1
T3: W <- (M)
Machine cycle 4
T1: address bus <- (WZ)
T2: --------
T3: Acc <- (M)
IAR+1
MAR Memory
IAR
Adder
IAROp
10 bits
10 bits16 bits
Control
55
ADR
ACC
Logic Design Flow & Data, Instruction Format
Logic Design Flow
SPEC -> Func. Design -> Functional Simulation -> Collection of Conditions -> Gate Level Circuit Design
Data & Instruction Format
Data Format: S (bit 0), MAG (bits 1 to 15)
Instruction Format: OP (bits 0 to 5), ADR (bits 6 to 15)
S : Sign (+ if “0”, - if “1”)
MAG : Magnitude
OP : OP Code
ADR : Address
Instruction Set
Instruction      OP Code  Operation
LOAD             000100   ACC <- (M(ADR))
STORE            001000   M(ADR) <- ACC
ADD              010000   ACC <- ACC + (M(ADR))
BRANCH           100000   BRANCH TO ADR
BRANCH-POSITIVE  100001   BRANCH TO ADR IF (ACC) >= 0
ACC : Accumulator
ADR : Address part in instruction
M(ADR) : Memory location at address ADR
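As a rough illustration of the instruction format, the following Python sketch packs a 6-bit OP code and a 10-bit ADR into one 16-bit word. It assumes OP occupies the most significant bits (bits 0 to 5 in the slides' numbering), which agrees with the memory initialization values X'1003', X'4004', and X'2005' used later in the lecture. The helper names `encode` and `decode` are hypothetical.

```python
OPCODES = {
    "LOAD": 0b000100,
    "STORE": 0b001000,
    "ADD": 0b010000,
    "BRANCH": 0b100000,
    "BRANCH-POSITIVE": 0b100001,
}


def encode(mnemonic, adr):
    """Build a 16-bit instruction word: OP in the upper 6 bits, ADR in the lower 10."""
    if not 0 <= adr < 1024:
        raise ValueError("ADR must fit in 10 bits")
    return (OPCODES[mnemonic] << 10) | adr


def decode(word):
    """Split a 16-bit instruction word into (OP, ADR)."""
    return word >> 10, word & 0x3FF


program = [encode("LOAD", 3), encode("ADD", 4), encode("STORE", 5)]
```

Printing the words with `hex()` gives `0x1003`, `0x4004`, and `0x2005`.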
57
State Diagram for instruction Cycle
ADS : Address Set
IFT : Instruction Fetch
DEC : Decode
LDA : Load
STA : Store
ADD : Add
BRA : Branch
BRP : Branch-Positive
[State diagram: ADS -> IFT -> DEC -> one of LDA, STA, ADD, BRA, BRP -> ADS]
library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.std_logic_misc.all;
use IEEE.std_logic_arith.all;
use IEEE.std_logic_components.all;
use WORK.MATH.all;
entity CPU is
port ( CLOCK : in std_logic;
RESET : in std_logic);
end CPU;
architecture BEHAVIORAL of CPU is
type STATE_TYPE is (ADS, IFT, DEC, LDA, STA, ADD, BRA, BRP);
type MEMORY is array(0 to 6) of std_logic_vector(15 downto 0);
signal CURRENT_STATE, NEXT_STATE : STATE_TYPE;
signal MEM : MEMORY := MEMORY'("0001000000000011",
"0100000000000100", "0010000000000101",
"0000000000000111",
"0000000000001001",
"0000000000000000",
"0000000000000000"
);
signal ACC, IR : std_logic_vector(15 downto 0);
signal MAR, IAR : integer range 0 to 1023;
begin
-- process to hold combinational logic
COMBIN : process(CURRENT_STATE, RESET)
variable temp : std_logic_vector(15 downto 0);
begin
if RESET = '1' then
-- MAR and IAR are integers, so they are cleared with integer literals
MAR <= 0;
IAR <= 0;
NEXT_STATE <= ADS;
else
case CURRENT_STATE is
when ADS =>
MAR <= IAR;
IAR <= IAR+1;
NEXT_STATE <= IFT;
when IFT =>
IR <= MEM(MAR);
NEXT_STATE <= DEC;
when DEC =>
MAR <= vector2int(IR(9 downto 0));
case IR(15 downto 10) is
when "000100" =>
NEXT_STATE <= LDA;
when "001000" =>
NEXT_STATE <= STA;
when "010000" =>
NEXT_STATE <= ADD;
when "100000" =>
NEXT_STATE <= BRA;
when "100001" =>
NEXT_STATE <= BRP;
when others =>
null;
end case;
when LDA =>
ACC <= MEM(MAR);
NEXT_STATE <= ADS;
when STA =>
MEM(MAR) <= ACC;
NEXT_STATE <= ADS;
when ADD =>
temp := add_sub(ACC, MEM(MAR), TRUE);
ACC <= temp;
NEXT_STATE <= ADS;
when BRA =>
IAR <= MAR;
NEXT_STATE <= ADS;
when BRP =>
if ACC(15) = '0' then
IAR <= MAR;
end if;
-- return to address set whether or not the branch is taken
NEXT_STATE <= ADS;
end case;
end if;
end process;
-- process to hold synchronous elements (flip-flops)
SYSCH : process
begin
wait until CLOCK'event and CLOCK = '1';
CURRENT_STATE <= NEXT_STATE;
end process;
end BEHAVIORAL;
configuration CFG_CPU_BEHAVIORAL of CPU is
for BEHAVIORAL
end for;
end CFG_CPU_BEHAVIORAL;
61
library IEEE;
use IEEE.std_logic_1164.all;
package MATH is
function add_sub(L, R : std_logic_vector; ADD : BOOLEAN)
return std_logic_vector;
function vector2int(S : std_logic_vector(9 downto 0))
return INTEGER;
end MATH;
package body MATH is
function add_sub(L, R : std_logic_vector; ADD : BOOLEAN)
return std_logic_vector is
variable carry : std_logic;
variable A, B, sum : std_logic_vector(L'length-1 downto 0);
begin
if ADD then
-- prepare for an "add" operation
A := L;
B := R;
carry := '0';
else
-- prepare for a "subtract" operation
A := L;
B := not R;
carry := '1';
end if;
-- create a ripple-carry chain; sum up bits
for i in 0 to A'left loop
sum(i) := A(i) xor B(i) xor carry;
carry := (A(i) and B(i)) or (A(i) and carry) or (carry and B(i));
end loop;
return sum; -- result
end add_sub;
function vector2int(S : std_logic_vector(9 downto 0))
return INTEGER is
variable result : INTEGER range 0 to 1023 := 0;
begin
for i in 9 downto 0 loop
result := result * 2;
if S(i) = '1' then
result := result + 1;
end if;
end loop;
return result;
end vector2int;
end MATH;
63
64
Time CLK <(10)>
<TIME> CLK <(10)>.
<AUTOMATION> CPU : CLK :
<STATES>
ADS : MAR <- IAR, IAR <- IAR+1, -> IFT
IFT : IR <- M(MAR), -> DEC.
DEC : MAR <- ADR,
?OP #4 -> LDA
#8 -> STA
#16 -> ADD
#32 -> BRA
#33 -> BRP.
LDA : ACC <- M(MAR), -> ADS.
STA : M(MAR) <- ACC, -> ADS.
ADD : ACC <- ACC+M(MAR), -> ADS.
BRA : IAR <- ADR, -> ADS.
BRP : |* -ACC(0) *| IAR <- ADR, -> ADS.
<END>.
<END> CPU.
[Timing chart, <TIME> CLK<(1)>: in state ADS (MAR<-IAR, IAR<-IAR+1, ->IFT), MAR changes from n-1 to n and IAR from n to n+1 as the state moves from ADS to IFT]
65
SCL (Simulation Control Language)
Memory Address
0 LOAD 3 Program
1 ADD 4 Program
2 STORE 5 Program
3 7 Data
4 9 Data
5 16 Result
66
Simulation Control Program by SCL
Memory Initialization
INIT M(0) = X’1003’
INIT M(1) = X’4004’
INIT M(2) = X’2005’
INIT M(3) = X’0007’
INIT M(4) = X’0009’
CPU Initialization
INIT CPU = ADS = B’ 0’ CPU
Output Format
TRACEX AT 0 AT 1000 IAR, MAR, IR, ACC, M(5)
Simulation Start Time Setting
START 0
Program Example & memory Initialization
68
Simulation Execution Result
Instruction  CLK  IAR     MAR      IR       ACC      M(5)
             0    X’000’  X’0000’  X’0000’  X’0000’  X’0000’
LOAD 3       10   X’001’  X’0000’  X’0000’  X’0000’  X’0000’
             20   X’001’  X’0000’  X’1003’  X’0000’  X’0000’
             30   X’001’  X’0003’  X’1003’  X’0000’  X’0000’
             40   X’001’  X’0003’  X’1003’  X’0007’  X’0000’
             50   X’002’  X’0001’  X’1003’  X’0007’  X’0000’
ADD 4        60   X’002’  X’0001’  X’4004’  X’0007’  X’0000’
             70   X’002’  X’0004’  X’4004’  X’0007’  X’0000’
             80   X’002’  X’0004’  X’4004’  X’0010’  X’0000’
             90   X’003’  X’0002’  X’4004’  X’0010’  X’0000’
STORE 5      100  X’003’  X’0002’  X’2005’  X’0010’  X’0000’
             110  X’003’  X’0005’  X’2005’  X’0010’  X’0000’
             120  X’003’  X’0005’  X’2005’  X’0010’  X’0010’
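The trace above can be reproduced with a small behavioral model. The Python sketch below is an illustration only, not the simulator used in the lecture: it assumes that all register transfers in a state use the values held before the clock edge, that OP is the upper 6 bits and ADR the lower 10 bits of IR, and that the sign bit is the most significant bit of ACC. The names `step`, `cpu`, and `trace` are hypothetical.

```python
# Execute state selected by the 6-bit OP code (from the instruction set).
EXECUTE_STATE = {0b000100: "LDA", 0b001000: "STA", 0b010000: "ADD",
                 0b100000: "BRA", 0b100001: "BRP"}


def step(cpu, mem):
    """Advance one clock; every right-hand side uses pre-clock values."""
    state, iar, mar = cpu["state"], cpu["IAR"], cpu["MAR"]
    ir, acc = cpu["IR"], cpu["ACC"]
    adr = ir & 0x3FF              # ADR: lower 10 bits of IR
    nxt = "ADS"                   # every execute state returns to ADS
    if state == "ADS":
        cpu["MAR"], cpu["IAR"], nxt = iar, (iar + 1) & 0x3FF, "IFT"
    elif state == "IFT":
        cpu["IR"], nxt = mem[mar], "DEC"
    elif state == "DEC":
        cpu["MAR"], nxt = adr, EXECUTE_STATE[ir >> 10]
    elif state == "LDA":
        cpu["ACC"] = mem[mar]
    elif state == "STA":
        mem[mar] = acc
    elif state == "ADD":
        cpu["ACC"] = (acc + mem[mar]) & 0xFFFF
    elif state == "BRA":
        cpu["IAR"] = adr
    elif state == "BRP" and not acc & 0x8000:   # branch only if sign bit is 0
        cpu["IAR"] = adr
    cpu["state"] = nxt


# Program and data from the memory initialization: LOAD 3, ADD 4, STORE 5, 7, 9
mem = [0x1003, 0x4004, 0x2005, 0x0007, 0x0009] + [0] * 1019
cpu = {"state": "ADS", "IAR": 0, "MAR": 0, "IR": 0, "ACC": 0}
trace = []
for clk in range(10, 130, 10):    # 12 clock edges with a CLK period of 10
    step(cpu, mem)
    trace.append((clk, cpu["IAR"], cpu["MAR"], cpu["IR"], cpu["ACC"], mem[5]))
```

Each tuple in `trace` corresponds to one row of the table from CLK 10 to CLK 120; the final value of `mem[5]` is 16, shown as X'0010'.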
69
Circuit Design
Terminal
HDL Description Circuit Symbol
BUS a
<Terminal> A, B, C
----A
----B
----C
BUS b <Terminal> BUS(4)
----BUS(0)
----BUS(1) --/--BUS
----BUS(2) 4
----BUS(3)
70
Boolean
71
Combination Logic
72
Register
HDL Description Circuit Diagram
F L I P - F L O P
<REGISTER> R
R E G I S T E R a
<REGISTER> R(4)
R E G I S T E R b
<TERMINAL> DATA 1(4)
DATA 2(4), C1, C2, DATA1
<REGISTER> R(4)
<AUTOMATION> MPXR:CLK:
<LOGIC>
|*C1*| R<-DATA1.,
|*C2*| R<-DATA2..
<END> MPXR
Input Data
Clock
cond
Output Data R
L
CG
R / 4
/ 4
/ 4
R(0)
R(1) R(2) R(3)
Data 1
Data 2
C1
C2
4 /
4 /
/ 4
/ 4
R / 4
/ 4
clk
73
Flip-Flop
Name  Inputs  Q    Q'
RSFF  S R
      0 0     Q0   Q0'
      0 1     0    1
      1 0     1    0
      1 1
JKFF  J K
      0 0     Q0   Q0'
      0 1     0    1
      1 0     1    0
      1 1     Q0'  Q0
GLFF  G L
      0 0     Q0   Q0'
      0 1     Q0   Q0'
      1 0     0    1
      1 1     1    0
[Circuit symbols: RSFF with S, R, SET, CLR, Q, Q'; JKFF with J, K, SET, CLR, Q, Q'; GLFF with G, L]
Collection of Conditions
<AUTOMATION> CPU:CLK:
<LOGIC>
|*ADS*| MAR(0:9) <- IAR(0:9),
|*ADS*| IAR(0:9) <- IAR(0:9) +1.,
|*ADS*| -> IFT.,
|*IFT*| IR(0:15) <- M(MAR(0:9), 0:15).,
|*IFT*| -> DEC.,
|*DEC*| MAR(0:9) <- IR(6:15).,
|*DEC & (IR(0:5) := 4)*| -> LDA.,
|*DEC & (IR(0:5) := 8)*| -> STA.,
|*DEC & (IR(0:5) := 16)*| -> ADD.,
|*DEC & (IR(0:5) := 32)*| -> BRA.,
|*DEC & (IR(0:5) := 33)*| -> BRP.,
|*LDA*| ACC(0:15) <- M(MAR(0:9), 0:15).,
|*LDA*| -> ADS.,
|*STA*| M(MAR(0:9), 0:15) <- ACC(0:15).,
|*STA*| -> ADS.,
|*ADD*| ACC(0:15) <- ACC(0:15)+M(MAR(0:9), 0:15).,
|*ADD*| -> ADS.,
|*BRA*| IAR(0:9) <- IR(6:15).,
|*BRA*| -> ADS.,
|*BRP & ¬ACC(0)*| IAR(0:9) <- IR(6:15).,
|*BRP*| ->ADS.,
<END> CPU.
Transition Condition to Reg. & Memory
Device Name        Source                        Transfer Condi.
MAR(0:9)           IAR(0:9)                      ADS
                   IR(6:15)                      DEC
IR(0:15)           M(MAR(0:9), 0:15)             IFT
IAR(0:9)           IAR(0:9)+1                    ADS
                   IR(6:15)                      BRA|BRP & ~ACC(0)
ACC(0:15)          M(MAR(0:9), 0:15)             LDA
                   ACC(0:15)+M(MAR(0:9), 0:15)   ADD
M(MAR(0:9),0:15)   ACC(0:15)                     STA
75
|*ADS*| -> IFT.,
|*IFT*| -> DEC.,
|*DEC & (IR(0:5) := 4)*| -> LDA.,
|*DEC & (IR(0:5) := 8)*| -> STA.,
|*DEC & (IR(0:5) :=16)*| -> ADD.,
|*DEC & (IR(0:5) :=32)*| -> BRA.,
|*DEC & (IR(0:5) :=33)*| -> BRP.,
|*LDA*| -> ADS.,
|*STA*| -> ADS.,
|*ADD*| -> ADS.,
|*BRA*| -> ADS.,
|*BRP*| -> ADS.,
Next State  Current State            Transfer Condition
ADS         LDA, STA, ADD, BRA, BRP
IFT         ADS
DEC         IFT
LDA         DEC                      IR(0:5):=4
STA         DEC                      IR(0:5):=8
ADD         DEC                      IR(0:5):=16
BRA         DEC                      IR(0:5):=32
BRP         DEC                      IR(0:5):=33
Circuit Design
IAR
ADS
DAC
IR(6:15)
CLK
MAR
MAR(0:9)
10
10
10
10
10 10
MAR Design
IR Design
IR
IR(0:15)M(MAR,0:15)
CLK
IFT
16 16
77
DEC
IAR(0:9)+1
ADS
IR(6:15)
CLK
IAR
IAR(0:9)
10
10
10
10
10 10
IAR Design
BRA
BRP
ACC(0)
M(MAR,0:15)
LDA
ACC(0:15)+M(MAR,0:15)
CLK
ACC
ACC(0:15)
16
16
16
16
16 16
ACC Design
ADD
78
M
M(MAR,0:15)
ACC(0:15)
STA
10
16
16
MAR(0:9)
Data Transfer of M
M(MAR(0:9),0:15) ACC(0:15) STA
source Cond
79
Data Path Design
CLK
IAR(0:9)10
1010 10
IR(0:15)
CLK
16
M
CLK
MAR(0:9)10
1010 10
CLK
ACC
STA
CO
NT
RO
L
LO
GIC
ADDER
+110
ACC(0)
ADS
ADS
16
6
10
ADSIFTDECLDSSTAADDBRABRP
OP
ADD
LDA
DEC
ADR
16
ACC(0:15)
16
16 16
1616
16
WRITE ENABLE
80
Control Logic Design
State ST(0) ST(1) ST(2)
ADS 0 0 0
IFT 0 0 1
DEC 0 1 0
LDA 0 1 1
STA 1 0 0
ADD 1 0 1
BRA 1 1 0
BRP 1 1 1
Number of States : 8
Required F.F : 3
ST(0)
ST(1)
ST(2)
81
State Assignment
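1. Purpose
For a sequential circuit, use binary codes of minimum length.
Assign a different code to each state.
Minimize the combinational logic.
2. Necessity
Needed for the optimal design of sequential circuits.
The code assigned to each internal state determines the logic relations of the final circuit.
Different assignments result in different circuit areas.
The optimal code assignment is found by considering the relationships between states.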
State Assignment
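3. Basic structure of a sequential circuit
The next state and the output are determined by the present state and the input conditions.
Components
Combinational circuit (PLA, random logic)
Memory elements (registers)
A sequential circuit is described by a state transition diagram or a state transition table.
Primary input, primary output
Figure 1. General structure of a sequential circuit
State Assignment
4. Basic concept of state assignment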
Figure 2. State transition table (symbolic states)
Present State  Next State (X = 0)  Next State (X = 1)
S1             S4                  S4
S2             S3                  S2
S3             S1                  S4
S4             --                  S2
State codes: S1->00, S2->01, S3->11, S4->10
Figure 3. State transition table (encoded)
Present State Q1Q2  Next State (X = 0)  Next State (X = 1)
00                  10                  10
01                  11                  01
11                  00                  10
10                  --                  01
Figure 4. Output (D1)
Q1Q2  X = 0  X = 1
00    1      1
01    1      0
11    0      1
10    -      0
D1 = Q1`Q2`+Q1`X`+Q1Q2X
Figure 5. Output (D2)
Q1Q2  X = 0  X = 1
00    0      0
01    1      1
11    0      0
10    -      1
D2 = Q1`Q2+ Q1Q2`
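The state assignment example can be checked mechanically. The Python sketch below takes the symbolic state transition table and the codes S1->00, S2->01, S3->11, S4->10, then confirms that the D1 and D2 equations give the encoded next state for every specified entry. The names `CODE`, `TABLE`, `d1`, and `d2` are hypothetical, and the don't-care entry is simply skipped.

```python
CODE = {"S1": (0, 0), "S2": (0, 1), "S3": (1, 1), "S4": (1, 0)}

# Next state for (X = 0, X = 1); None marks the don't-care entry.
TABLE = {
    "S1": ("S4", "S4"),
    "S2": ("S3", "S2"),
    "S3": ("S1", "S4"),
    "S4": (None, "S2"),
}


def d1(q1, q2, x):
    """D1 = Q1'Q2' + Q1'X' + Q1Q2X"""
    return int((not q1 and not q2) or (not q1 and not x) or (q1 and q2 and x))


def d2(q1, q2, x):
    """D2 = Q1'Q2 + Q1Q2'"""
    return int((not q1 and q2) or (q1 and not q2))


mismatches = []
for state, next_states in TABLE.items():
    q1, q2 = CODE[state]
    for x, nxt in enumerate(next_states):
        if nxt is None:
            continue  # don't-care: any value is acceptable
        if (d1(q1, q2, x), d2(q1, q2, x)) != CODE[nxt]:
            mismatches.append((state, x))
```

An empty `mismatches` list means the two D-input equations implement the table under this assignment. Trying a different `CODE` mapping would require re-deriving the equations, which is the point of the state assignment problem.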
|*¬ST(0)&¬ST(1)&¬ST(2)*| ST(2)<-1.
|*¬ST(0)&¬ST(1)&ST(2)*| ST(1)<-1, ST(2)<-0.
|*¬ST(0)&ST(1)&¬ST(2)&(IR(0:5):=4)*| ST(2)<-1.
|*¬ST(0)&ST(1)&¬ST(2)&(IR(0:5):=8)*| ST(0)<-1, ST(1)<-0.
|*¬ST(0)&ST(1)&¬ST(2)&(IR(0:5):=16)*| ST(0)<-1, ST(1)<-0, ST(2)<-1.
|*¬ST(0)&ST(1)&¬ST(2)&(IR(0:5):=32)*| ST(0)<-1.
|*¬ST(0)&ST(1)&¬ST(2)&(IR(0:5):=33)*| ST(0)<-1, ST(2)<-1.
|*¬ST(0)&ST(1)&ST(2)*| ST(1)<-0, ST(2)<-0.
|*ST(0)&¬ST(1)&¬ST(2)*| ST(0)<-0.
|*ST(0)&¬ST(1)&ST(2)*| ST(0)<-0, ST(2)<-0.
|*ST(0)&ST(1)&¬ST(2)*| ST(0)<-0, ST(1)<-0.
|*ST(0)&ST(1)&ST(2)*| ST(0)<-0, ST(1)<-0, ST(2)<-0.
ST(0) SET CONDITION
¬ST(0)&ST(1)&¬ST(2)&((IR(0:5):=8)|(IR(0:5):=16)|(IR(0:5):=32)|(IR(0:5):=33))
ST(0) RESET CONDITION
ST(0)
ST(1) SET CONDITION
¬ST(0)&¬ST(1)&ST(2)
ST(1) RESET CONDITION
ST(1)&(¬ST(0)&¬ST(2)&((IR(0:5):=8)|(IR(0:5):=16))|ST(0)|ST(2))
ST(2) SET CONDITION
¬ST(0)&¬ST(2)&(¬ST(1)|(IR(0:5):=4)|(IR(0:5):=16)|(IR(0:5):=33))
ST(2) RESET CONDITION
ST(2)
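The twelve conditional transfers listed above can be replayed to confirm that the three state flip-flops walk through the instruction cycle. The Python sketch below assumes the state codes from the control logic table (ADS = 000 up to BRP = 111, with ST(0) as the most significant bit); the names `UPDATES`, `DEC_UPDATES`, `next_state`, and `walk` are hypothetical.

```python
STATES = ["ADS", "IFT", "DEC", "LDA", "STA", "ADD", "BRA", "BRP"]


def code(name):
    """Return [ST(0), ST(1), ST(2)] for a state name."""
    i = STATES.index(name)
    return [(i >> 2) & 1, (i >> 1) & 1, i & 1]


# Bit assignments per present state code: {bit index: new value}.
UPDATES = {
    (0, 0, 0): {2: 1},
    (0, 0, 1): {1: 1, 2: 0},
    (0, 1, 1): {1: 0, 2: 0},
    (1, 0, 0): {0: 0},
    (1, 0, 1): {0: 0, 2: 0},
    (1, 1, 0): {0: 0, 1: 0},
    (1, 1, 1): {0: 0, 1: 0, 2: 0},
}
# In DEC (010) the assignment depends on the OP code in IR(0:5).
DEC_UPDATES = {4: {2: 1}, 8: {0: 1, 1: 0}, 16: {0: 1, 1: 0, 2: 1},
               32: {0: 1}, 33: {0: 1, 2: 1}}


def next_state(name, op):
    bits = code(name)
    updates = DEC_UPDATES[op] if name == "DEC" else UPDATES[tuple(bits)]
    for bit, value in updates.items():
        bits[bit] = value          # flip-flops not mentioned keep their value
    return STATES[bits[0] * 4 + bits[1] * 2 + bits[2]]


def walk(op):
    """Follow one instruction cycle starting from ADS."""
    path = ["ADS"]
    for _ in range(4):
        path.append(next_state(path[-1], op))
    return path
```

For example, `walk(16)` yields ADS, IFT, DEC, ADD, ADS, so only the bits named in each transfer need set or reset logic.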
85
Control Path
86
ST(0)
ST(1)
ST(2)
Q Q
Q Q
S R
S R
S R
Q Q
op 0 1 2 3 4 5
ADS IFT DEC LDA STA ADD BRA BRP
ST(0) ST(1)
ADS 0 0
IFT 0 1
DEC 1 1
EXC 1 0
LDA = EXC & (OP := 4)
STA = EXC & (OP := 8)
ADD = EXC & (OP := 16)
BRA = EXC & (OP := 32)
BRP = EXC & (OP := 33)
※ The op code is held in the OP field of the IR register while the instruction is decoded and executed.
IFT
EXC
ADS
DEC
87
Simplified Control Bus
S Q
R Q
S Q
R Q
ADS IFT DEC EXE
210 543OP
LDA
STA
ADD
BRA
EXE
88
Symbol
Truth table
HDL Descript
Circuit Diagram
(a)
Full Adder
A B C’ S C
0 0 0 0 0
0 0 1 1 0
0 1 0 1 0
0 1 1 0 1
1 0 0 1 0
1 0 1 0 1
1 1 0 0 1
1 1 1 1 1
<TERMINAL>
A,B,C′,S,C.
<BOOLEAN>
C = A & B |
(A @ B) & C′,
S = A@B@C′
(b)
Half Adder
A B S C
0 0 0 0
0 1 1 0
1 0 1 0
1 1 0 1
<TERMINAL>
A, B, S, C.
<BOOLEAN>
C=A&B.
S=A@B.
A
B
c
C
S
A
B
C
S
F A
H A
c
A
B
C
S
A
B
C
S
Design of Adder
89
FA
FA
FA
FA
A(0)
B(0)
C(0)
S(0)
C(1)
S(1)
A(1)
B(1)
C(2)
A(14)
B(14)C(14)
S(14)
C(15)
S(15)
A(15)
B(15)
< OPERATOR > S<(A, B)> (16)
< TERMINAL > A(16), B(16), C(16).
< BOOLEAN >
C = A & B | (A @ B) & C(1:15) || B ‘0’,
S = A @ B @ C(1:15) || B ‘0’.
< END > S.
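The 16-bit adder above chains full adders so that each stage's carry feeds the next more significant stage, with a carry-in of 0 at the least significant bit. A minimal Python sketch of the same ripple-carry idea follows; it uses ordinary integer bit positions (bit 0 is the least significant bit here, unlike the slides' numbering), and the function names are hypothetical.

```python
def full_adder(a, b, c_in):
    """One-bit full adder: S = A xor B xor C', C = A*B + (A xor B)*C'."""
    s = a ^ b ^ c_in
    c_out = (a & b) | ((a ^ b) & c_in)
    return s, c_out


def ripple_add(a, b, width=16):
    """Add two unsigned integers bit by bit; the final carry out is discarded."""
    carry = 0                      # carry into the least significant stage
    result = 0
    for i in range(width):
        s, carry = full_adder((a >> i) & 1, (b >> i) & 1, carry)
        result |= s << i
    return result


total = ripple_add(7, 9)           # the data values used in the program example
```

With the data values 7 and 9 from the program example, `total` is 16, the value stored in M(5).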
90
HA
HA
HA
HA
A(0) C(0)
I(0)
C(1)
I(1)
A(1)
C(2)
A(8) C(8)
I(8)
C(9)
I(9)
A(9)
I
< OPERATOR > I <A> (10)
< TERMINAL > A(10), C(10).
< BOOLEAN >
C = A & C (1:9) || B ‘1’,
I = A @ C(1:9) || B ‘1’.
< END > I.
91
ADDER
IR
16
16
6
16
LSI 1
SIR
LIR IR
16
16
6
16
LSI 2
SACC
LACC IR
16
16
6
10
LSI 3
SIAR
LIAR
MAR
LSI 4
1616
16
10
LSI 6
10
LMAR
MMAD16
READ
WRITE
B’1(16)
ONE
BBUS
16 15
ABUS
CBUS
16
ROM
ADR
ROM
LSI 7COND
NE
G
OP
C
SOP
OPA
RBUS
RAD
NEXT
4 4
4
4
4
4
6
ABUS = SIR & IR | SACC & ACC | SIAR & B‘000000’||IAR
BBUS = READ & M(MAR) | ONE & B’0000000000000001’
CBUS INPUT = ABUS + BBUS
CBUS OUTPUT
ACC <- |*LACC*|CBUS.,
IR <- |*LIR*|CBUS.,
IAR <- |*LIAR*|CBUS(0:9).,
MAR <- |*LMAR*|CBUS(0:9).,
M(MAR) <- |*WRITE*|CBUS..
Micro-Programmed CPU Design
92
Address  SIR  LIR  SACC  LACC  SIAR  LIAR  LMAR  READ  WRITE  ONE  SOP  COND  NEXT(4)  Micro-operation
0        0    0    0     0     1     0     1     0     0      0    0    0     0001     ADS : MAR <- IAR
1        0    0    0     0     1     1     0     0     0      1    0    0     0010     ADS : IAR <- IAR + 1.
2        0    1    0     0     0     0     0     1     0      0    0    0     0011     IFT : IR <- M(MAR).
3        1    0    0     0     0     0     1     0     0      0    1    0     0000     DEC : MAR <- ADR.
4        0    0    0     1     0     0     0     1     0      0    0    0     0000     LDA : ACC <- M(MAR).
5        0    0    1     0     0     0     0     0     1      0    0    0     0000     STA : M(MAR) <- ACC.
6        0    0    1     1     0     0     0     1     0      0    0    0     0000     ADD : ACC <- ACC + M(MAR).
7        0    0    0     0     0     0     0     0     0      0    0    1     1000     BRP :
8        1    0    0     0     0     1     0     0     0      0    0    0     0000     BRA : IAR <- ADR.
9        0    0    0     0     0     0     0     0     0      0    0    0     0000     -> ADS.
Empties represent 0
Control ROM
16 bits: first 10 bits => control signals
final 4 bits => next address to read
address change => set the 10th and 11th bits (SOP, COND) to “1”
Address 0 : SIAR and LMAR are 1, so MAR <- IAR
next address is 0001
Address 1 : SIAR, LIAR, and ONE are 1, so IAR <- IAR+1
Address 3 : SIR and LMAR are 1, so MAR <- ADR of IR
Next address <- OP (set SOP=1)
Address 7 : COND=1; if the first bit of ACC = 0 then ROM ADR becomes 1000,
if the first bit of ACC = 1 then ROM ADR becomes 1001.
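The control ROM behavior described above can be sketched as a tiny micro-sequencer in Python. This is an illustration under stated assumptions, not the lecture's simulator: the ROM contents follow the address descriptions above and the control ROM table, the OP-to-address mapping (LOAD to 4, STORE to 5, ADD to 6, BRANCH-POSITIVE to 7, BRANCH to 8) is the one used in the HDL description of this design, IAR and MAR take the lower 10 bits of CBUS, and the COND rule follows the text for address 7 (sign bit 1 selects 1001). The names `ROM`, `OPA`, and `micro_step` are hypothetical.

```python
# One entry per control-ROM address: (asserted control signals, NEXT field).
ROM = [
    ({"SIAR", "LMAR"}, 1),            # 0: MAR <- IAR
    ({"SIAR", "LIAR", "ONE"}, 2),     # 1: IAR <- IAR + 1
    ({"LIR", "READ"}, 3),             # 2: IR <- M(MAR)
    ({"SIR", "LMAR", "SOP"}, 0),      # 3: MAR <- ADR, next address from OP
    ({"LACC", "READ"}, 0),            # 4: LDA
    ({"SACC", "WRITE"}, 0),           # 5: STA
    ({"SACC", "LACC", "READ"}, 0),    # 6: ADD
    ({"COND"}, 8),                    # 7: BRP, next is 1000 or 1001
    ({"SIR", "LIAR"}, 0),             # 8: BRA, IAR <- ADR
    (set(), 0),                       # 9: no operation, back to 0
]
OPA = {0b000100: 4, 0b001000: 5, 0b010000: 6, 0b100000: 8, 0b100001: 7}


def micro_step(reg, mem):
    """Execute one microinstruction; buses are computed from pre-clock values."""
    signals, nxt = ROM[reg["ROMADR"]]
    abus = ((reg["IR"] if "SIR" in signals else 0)
            | (reg["ACC"] if "SACC" in signals else 0)
            | (reg["IAR"] if "SIAR" in signals else 0))
    bbus = ((mem[reg["MAR"]] if "READ" in signals else 0)
            | (1 if "ONE" in signals else 0))
    cbus = (abus + bbus) & 0xFFFF
    if "SOP" in signals:
        romadr = OPA[reg["IR"] >> 10]
    else:
        sign = (reg["ACC"] >> 15) & 1          # assumed: first bit = sign bit
        romadr = nxt | (sign if "COND" in signals else 0)
    if "WRITE" in signals:
        mem[reg["MAR"]] = cbus
    if "LIR" in signals:
        reg["IR"] = cbus
    if "LACC" in signals:
        reg["ACC"] = cbus
    if "LIAR" in signals:
        reg["IAR"] = cbus & 0x3FF
    if "LMAR" in signals:
        reg["MAR"] = cbus & 0x3FF
    reg["ROMADR"] = romadr


mem = [0x1003, 0x4004, 0x2005, 0x0007, 0x0009] + [0] * 1019
reg = {"IR": 0, "ACC": 0, "IAR": 0, "MAR": 0, "ROMADR": 0}
for _ in range(15):                   # 3 instructions x 5 microinstructions
    micro_step(reg, mem)
```

Running LOAD 3, ADD 4, STORE 5 takes five microinstructions per instruction and leaves 16 in `mem[5]`, the same result as the hard-wired state machine.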
94
<SYSTEM> MICRO:
<TIME> CLK<(10)>.
<STORAGE> M(1024,16), ROM(16,16).
<REGISTER> IR(16)=OP(6)||ADR(10).
ACC(16),IAR(10),MAR(10),ROMADR(4).
<TERMINAL> ABUS(16),BBUS(16),CBUS(16),RBUS(4),OPA(4),
CONTROL(16) =SIR||LIR||SACC||LACC||SIAR||LIAR||
LMAR||READ||WRITE||ONE||SOP||COND||NEXT(4).
<BOOLEAN>
ABUS=SIR & IR|SACC & ACC|SIAR & B’000000’||IAR,
BBUS=READ & M(MAR)|ONE & B’1(16)’,
CBUS=ABUS+BBUS,
CONTROL=ROM(ROMADR),
RBUS=SOP & OPA|~SOP & NEXT|B’000’||(COND & ~ACC(0)),
? OP # B’000100’ OPA = B’0100’
# B’001000’ OPA = B’0101’
# B’010000’ OPA = B’0110’
# B’100000’ OPA = B’1000’
# B’100001’ OPA = B’0111’..
<AUTOMATION> CPU:CLK:
<LOGIC>|* LIR *| IR <- CBUS.,
|* LACC *| ACC <- CBUS.,
|* LIAR *| IAR <- CBUS(0:9).,
|* LMAR *| MAR <- CBUS(0:9).,
|* WRITE *| M(MAR) <- CBUS.,
ROMADR <- RBUS.
<END> CPU.
<END> MICRO.
95
Partition
<SYSTEM> LSIS:
<TIME> CLK<(10)>.
<STORAGE> M(1024,16), ROM(16,16).
<TERMINAL> ABUS(16), BBUS(16), CBUS(16), OPC(6), NEG, MAD(10), RAD(4),
CONTROL(16)=SIR||LIR||SACC||LACC||SIAR||LIAR||LMAR||READ||WRITE||ONE||SOP||COND||NEXT(4),
ABUS1(16),ABUS2(16),ABUS3(16).
<BOOLEAN> ABUS=ABUS1|ABUS2|ABUS3, CONTROL=ROM(RAD).
<AUTOMATION> LSI1: CLK:
<REGISTER> IR(16)=OP(6)||ADR(10).
<LOGIC> ABUS1=SIR&IR, IR & a peripheral circuit
OPC=OP,
|* LIR *| IR <- CBUS..
<END> LSI1.
<AUTOMATION> LSI2: CLK:
<REGISTER> ACC(16).
<LOGIC> ABUS2 = SACC & ACC,
NEG = ~ACC(0), ACC & a peripheral circuit
|* LACC *| ACC <- CBUS..
<END> LSI2.
<AUTOMATION> LSI3: CLK:
<REGISTER> IAR(10).
<LOGIC> ABUS3(0:9) = SIAR & IAR,
|* LIAR *| IAR <- CBUS..
<END> LSI3.
96
<AUTOMATION> LSI4: CLK:
<BOOLEAN> CBUS = S < (ABUS,BBUS) >.
<OPERATOR> S < (A,B) > (16)
<TERMINAL> A(16), B(16), C(16).
<BOOLEAN>
C = A & B|(A@B) & C(1:15)||B’0’,
S = A@B@C(1:15)||B’0’.
<END> S.
<END> LSI4.
<AUTOMATION> LSI5: CLK:
<REGISTER> MAR(10).
<LOGIC> MAD = MAR,
|* LMAR *| MAR <- CBUS(0:9)..
<END> LSI5.
<AUTOMATION> LSI6: CLK:
<BOOLEAN>
BBUS = READ & M(MAD)|ONE & B’1(16)’.
<END> LSI6.
<AUTOMATION> LSI7: CLK:
<REGISTER> ROMADR(4).
<TERMINAL> OPA(4), RBUS(4).
<LOGIC>
RBUS = SOP & OPA|~SOP & NEXT|B’000’||(NEG & COND),
RAD = ROMADR,
? OPC # B’000100’ OPA = B’0100’
# B’001000’ OPA = B’0101’
# B’010000’ OPA = B’0110’
# B’100000’ OPA = B’1000’
# B’100001’ OPA = B’0111’.,
ROMADR <- RBUS.
<END> LSI7.
<END> LSIS.
<AUTOMATION> LSI1 IR -- and its peripherals
<AUTOMATION> LSI2 ACC -- and its peripherals
<AUTOMATION> LSI3 IAR -- and its peripherals
<AUTOMATION> LSI4 ADDER
<AUTOMATION> LSI5 MAR --and its peripherals
<AUTOMATION> LSI6 BBUS -- and its peripherals
<AUTOMATION> LSI7 ROMADR, RBUS -- and its peripherals
<SYSTEM> TEST7:
<ENTERANCE> OPC(6), NEG.
<TIME> CLK <(10)>.
<STORAGE> ROM(16,16).
<TERMINAL> RAD(4),
CONTROL(16) = SIR||LIR||SACC||LACC||SIAR||LIAR||LMAR||
READ||WRITE||ONE||SOP||COND||NEXT(4).
<BOOLEAN> CONTROL = ROM(RAD).
<AUTOMATION> LSI7: CLK:
<END> LSI7.
<END> TEST7.
98