Introduction to Multiprocessor System-on-Chip
Prof. Jan Madsen
Informatics and Mathematical Modeling, Technical University of Denmark
Richard Petersens Plads, Building 321, DK-2800 Lyngby, Denmark
(c) Jan Madsen, SoC-MOBINET courseware
Embedded systems
[Figure: an embedded system with CPU, mem, rom and io; the program (if/then/else, for loop, func) is stored as a bit pattern.]
Embedded systems
Systems which use a computer to perform a specific function, but are neither used nor perceived as a computer
They are embedded within larger electronic devices
Repeatedly carrying out a particular function
Often completely unrecognized by the device’s user
Embedded systems design
Several design groups: hardware and software are developed separately
Separated validations: the hardware model and the software model are each validated on their own
Prototype realization: hardware prototype and software prototype
Problems arise at a very late point in the design process
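Principles of Codesign
/* Elevator unit controller, used as the example specification. */
/* The declarations below are assumed to be provided by the platform. */
extern volatile int req, floor;       /* requested floor, current floor */
extern volatile int up, down, open;   /* motor and door control outputs */
extern void delay(int ticks);         /* platform-provided wait */

void UnitControl(void)
{
    up = down = 0;
    open = 1;
    while (1) {
        while (req == floor)
            ;                         /* idle until another floor is requested */
        open = 0;
        if (req > floor) { up = 1; } else { down = 1; }
        while (req != floor)
            ;                         /* travel until the requested floor is reached */
        up = down = 0;                /* stop the motor before opening the door */
        open = 1;
        delay(10);
    }
}
[Figure: the specification is split into two parts; SW synthesis targets the CPU, HW synthesis targets the ASIC, and interface synthesis connects them.]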
Overview
Technology: processors, IC fabrics
Codesign for speed-up: component execution timing (SW and HW)
Building sub-systems: hardware/software partitioning
Building systems: system-level issues of codesign
Software
Elements of computation: store data, transform data, move data
[Figure: a program (if/then/else, for loop, func) executing on a processing element (pe).]
Processor
Architecture components
Processing elements – transform data
Memories – store data
Interconnect – move data
Processor: General Purpose
Availability
Low cost (mass production)
Simple design flow
High flexibility
[Figure: general-purpose processor with instruction memory, controller (pc, ir, cu), datapath (reg, +/-, *) and data memory.]
Processor: General Purpose - example
[Figure: the general-purpose processor executing x = x + A[i] * p1, with operands A[i] and p1.]
x = x + A[i] * p1 takes 5 cycles
Processor: Custom (ASIC)
High performance
Low power
Complex design flow
No flexibility
[Figure: custom processor with controller (cu), datapath (+/-, *, +) and mem.]
Processor: Custom (ASIC) – example
[Figure: the custom processor executing x = x + A[i] * p1, with operands A[i] and p1.]
x = x + A[i] * p1 takes 1 cycle
Processor: Semicustom (ASIP)
Customized datapath – 16, 8 or 4 bit
Optimized for a particular class of programs – MACC (multiply-accumulate)
"Simple" design flow
High flexibility
[Figure: semicustom processor with instruction memory, controller (pc, ir, cu), datapath (reg, +/-, and a combined + * unit) and data memory.]
Processor: Semicustom - example
[Figure: the semicustom processor executing x = x + A[i] * p1, with operands A[i] and p1.]
x = x + A[i] * p1 takes 2 cycles
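The cycle counts quoted for the three processor types can be compared with a small calculation. The following is a minimal sketch, not part of the courseware: the names `mac_loop_cycles` and `speedup_percent` are hypothetical, and it assumes the loop consists only of the multiply-accumulate statement, with no loop overhead or memory stalls.

```c
#include <stdint.h>

/* Cycles per multiply-accumulate, as quoted on the example slides. */
enum { CYCLES_GP = 5, CYCLES_ASIC = 1, CYCLES_ASIP = 2 };

/* Cycles needed for n executions of x = x + A[i] * p1.
 * Assumption: loop overhead and memory stalls are ignored. */
uint64_t mac_loop_cycles(uint64_t n, unsigned cycles_per_mac)
{
    return n * cycles_per_mac;
}

/* Speed-up of a candidate over a baseline in percent (500 means 5x).
 * Assumption: candidate_cycles is non-zero. */
unsigned speedup_percent(unsigned baseline_cycles, unsigned candidate_cycles)
{
    return (baseline_cycles * 100u) / candidate_cycles;
}
```

Under these assumptions, 1000 iterations cost 5000 cycles on the general-purpose processor, 2000 on the ASIP and 1000 on the ASIC.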
IC fabrics
An IC is an interconnection of transistors following one of several possible styles – fabrics
The fabric defines how and when transistors are composed: "the material of processors"
IC fabrics differ in terms of customizability and generality
IC fabrics: Custom
Exact implementation of processor components
High NRE (non-recurring engineering) cost – mask set ~ 1 M$
IC fabrics: Semicustom
Several semicustom fabrics: library of standard cells, cell arrays (sea-of-gates)
Most processing steps are pre-manufactured (high volume)
IC fabrics: Programmable
Set of interconnected modules
Set of modules programmed to implement different components
FPGA: programmable logic modules, storage and interconnect
Chips: Implementing IC fabric
Hardware/software codesign?
Many possible mappings
Processor may not exist yet!
Exploring the design space
Need to estimate
Hardware/Software Codesign
Optimizing: timing (high performance, hard deadlines), area (cost), power consumption, flexibility, reliability, ...
We will focus on timing
Processing element timing
Execution path: control data dependent, input data dependent
Function implementation: component architecture, compiler or synthesis
Formal execution path timing analysis
[Figure: a program (if/then/else, for loop) and its control flow graph of basic blocks b1 to b4.]
$b_i$: basic block or program segment
$t_{pe}(b_i, pe_j)$: execution time of $b_i$ on processing element $pe_j$
$c(b_i)$: execution frequency of $b_i$
Worst/best case timing bounds
$$t_{pe}(F, pe_j) = \sum_{i \in I} t_{pe}(b_i, pe_j) \cdot c(b_i)$$
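The sum over basic blocks can be evaluated directly once per-block times and frequencies are known. The following is a minimal sketch under the assumption that both are given as plain arrays for one fixed processing element; the function name `path_time` is hypothetical.

```c
#include <stddef.h>

/* t[i]: execution time of basic block b_i on the chosen processing element.
 * c[i]: execution frequency of b_i.
 * Returns the sum over all n blocks of t[i] * c[i]. */
unsigned long path_time(const unsigned long *t, const unsigned long *c, size_t n)
{
    unsigned long total = 0;
    for (size_t i = 0; i < n; i++)
        total += t[i] * c[i];
    return total;
}
```

Calling it once with the largest and once with the smallest plausible frequencies gives simple worst-case and best-case bounds for the same set of block times.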
Formal execution path timing analysis
[Figure: the execution time $t_{pe}(b_i, pe_j)$ of basic block b2 is obtained from a model of its operations (+, +, -, *, *), which is implemented either in hardware or in software.]
Memory models
Access time: control overhead, burst access (packets)
Cache: hit/miss time overhead, based on execution history
[Figure: PE with data cache (D$) and instruction cache (I$), connected to Flash, RAM and SDRAM.]
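To illustrate why cache overhead depends on execution history, the sketch below simulates a tiny direct-mapped cache over an address trace. Everything in it is an assumption made for illustration: the cache size, the one-word lines, the names `cache_access` and `trace_time`, and the hit and miss costs passed by the caller.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINES 4u   /* hypothetical: 4 lines, one word per line */

typedef struct {
    bool     valid[CACHE_LINES];
    uint32_t tag[CACHE_LINES];
} cache_t;

/* Returns true on a hit; on a miss the addressed line is filled. */
bool cache_access(cache_t *c, uint32_t addr)
{
    uint32_t index = addr % CACHE_LINES;
    uint32_t tag   = addr / CACHE_LINES;
    if (c->valid[index] && c->tag[index] == tag)
        return true;
    c->valid[index] = true;
    c->tag[index]   = tag;
    return false;
}

/* Total access time of an address trace, starting from an empty cache. */
unsigned long trace_time(const uint32_t *trace, size_t n,
                         unsigned long t_hit, unsigned long t_miss)
{
    cache_t c = { {false}, {0} };
    unsigned long total = 0;
    for (size_t i = 0; i < n; i++)
        total += cache_access(&c, trace[i]) ? t_hit : t_miss;
    return total;
}
```

Two traces with the same number of accesses can take different times: addresses 0, 1, 0, 1 hit on the repeats, while 0, 4, 0, 4 map to the same line and miss every time.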
Advanced architectures
Modern high-performance processors include architectural features which complicate timing analysis: dynamic instruction scheduling, speculative execution
Though fast, these features make the processor very power hungry and tight bounds on timing very difficult to obtain
Computation is less predictable
These issues are important for embedded systems
Building sub-systems
Initial codesign problem: hardware/software partitioning
The LYCOS cosynthesis tool
Automatic partitioning from C (subset) and VHDL (single process)
Developed at DTU
[Figure: a program partitioned between a processor and an ASIC.]
Architectural choices
Which processor should be selected and how fast should it be?
Which ASIC technology should be chosen and how fast should the ASIC be?
How large an ASIC can we afford and which functions should it execute?
How should the processor and ASIC communicate?
Partitioning Model
Determines granularity and simplifying assumptions w.r.t. communication, HW sharing, etc.
[Figure: the specification is turned into a model of basic blocks (BB), each assigned to SW or HW.]
Estimation
SW Estimator with SW Lib: estimates $t_S$ and $a_S$ for SW
HW Estimator with HW Lib: estimates $t_H$ and $a_H$ for HW
Com Estimator with Com Lib: estimates $t_C$ and $a_C$ for communication
Process communication
[Figure: a program and its control flow graph of basic blocks b1 to b4; the else branch contains send(...) and receive(...).]
$$F_s = \sum_{i \in I} s(b_i) \cdot c(b_i)$$
$$F_r = \sum_{i \in I} r(b_i) \cdot c(b_i)$$
$s(b_i)$: sent data in $b_i$
$r(b_i)$: received data in $b_i$
$c(b_i)$: execution frequency of $b_i$
Communication time: $s(b_i)$ and $r(b_i)$ determined by data volume, data encoding, communication protocol
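The sent and received totals can be accumulated per basic block in the same way as execution time. The sketch below is illustrative only: the struct `block_comm_t`, the function names, and in particular the linear cost model in `comm_time` (a fixed time per data unit) are assumptions, not something the slides specify.

```c
#include <stddef.h>

typedef struct {
    unsigned long sent;      /* s(b_i): data sent in block b_i */
    unsigned long received;  /* r(b_i): data received in block b_i */
    unsigned long freq;      /* c(b_i): execution frequency of b_i */
} block_comm_t;

/* Total sent data: sum of s(b_i) * c(b_i). */
unsigned long total_sent(const block_comm_t *b, size_t n)
{
    unsigned long total = 0;
    for (size_t i = 0; i < n; i++)
        total += b[i].sent * b[i].freq;
    return total;
}

/* Total received data: sum of r(b_i) * c(b_i). */
unsigned long total_received(const block_comm_t *b, size_t n)
{
    unsigned long total = 0;
    for (size_t i = 0; i < n; i++)
        total += b[i].received * b[i].freq;
    return total;
}

/* Hypothetical cost model: every data unit costs the same transfer time. */
unsigned long comm_time(const block_comm_t *b, size_t n,
                        unsigned long time_per_unit)
{
    return (total_sent(b, n) + total_received(b, n)) * time_per_unit;
}
```

A real estimate would replace the linear model with one that reflects the data encoding and the communication protocol.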
Solving the Partitioning Problem
[Figure: blocks 1 to 6, each to be assigned to SW or HW.]
Just try all combinations...
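Trying all combinations means enumerating all 2^n assignments of n blocks to SW or HW. The following is a minimal sketch of such an exhaustive search under simplifying assumptions that are mine, not the slides': execution is sequential, communication cost is ignored, hardware areas are additive, and the goal is the smallest total time within an area budget. The type `block_t` and the function `best_partition` are hypothetical names.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    unsigned t_sw;   /* estimated software execution time */
    unsigned t_hw;   /* estimated hardware execution time */
    unsigned a_hw;   /* estimated hardware area */
} block_t;

/* Tries all 2^n SW/HW assignments (requires n < 32) and returns the bitmask
 * with the smallest total time whose hardware area fits the budget.
 * Bit i set means block i is placed in hardware. If best_time is not NULL,
 * the corresponding total time is stored there. */
uint32_t best_partition(const block_t *b, size_t n, unsigned area_budget,
                        unsigned *best_time)
{
    uint32_t best_mask = 0;
    unsigned best = UINT_MAX;
    for (uint32_t mask = 0; mask < (1u << n); mask++) {
        unsigned time = 0, area = 0;
        for (size_t i = 0; i < n; i++) {
            if (mask & (1u << i)) {
                time += b[i].t_hw;
                area += b[i].a_hw;
            } else {
                time += b[i].t_sw;
            }
        }
        if (area <= area_budget && time < best) {
            best = time;
            best_mask = mask;
        }
    }
    if (best_time != NULL)
        *best_time = best;
    return best_mask;
}
```

The loop body runs 2^n times, so the approach is only practical for a handful of blocks, which is why the enumeration is offered here as a baseline rather than a method.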
Solving the Partitioning Problem
Knapsack stuffing: no communication, interleaved execution, additive areas
Parallel execution, non-additive areas
Interleaved communication, additive areas
Large-scale linear/nonlinear integer programming
Heuristics needed!
[Figure: example SW/HW partitions of numbered blocks for the cases above.]
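For the simplest case (no communication, interleaved execution, additive areas) the problem has the shape of a 0/1 knapsack: each block moved to hardware saves some time and consumes some area. The sketch below is the textbook dynamic-programming solution to that knapsack, shown only to make the formulation concrete; it is not claimed to be the algorithm used in LYCOS. The name `knapsack_gain`, the bound `MAX_AREA`, and the use of integer area units are assumptions.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_AREA 256   /* hypothetical upper bound on the area budget */

/* 0/1 knapsack by dynamic programming.
 * gain[i]: time saved by moving block i to hardware (t_sw - t_hw), >= 0.
 * area[i]: hardware area of block i, >= 0.
 * Returns the largest total gain whose total area fits in area_budget. */
int knapsack_gain(const int *gain, const int *area, size_t n, int area_budget)
{
    int best[MAX_AREA + 1] = {0};   /* best[a]: max gain using area <= a */
    assert(area_budget >= 0 && area_budget <= MAX_AREA);
    for (size_t i = 0; i < n; i++) {
        /* Walk the budget downwards so each block is used at most once. */
        for (int a = area_budget; a >= area[i]; a--) {
            int with_i = best[a - area[i]] + gain[i];
            if (with_i > best[a])
                best[a] = with_i;
        }
    }
    return best[area_budget];
}
```

The running time grows with the number of blocks times the area budget, not with 2^n, but the formulation stops being valid once areas are non-additive or communication is taken into account.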
LYCOS Design Flow
[Figure: LYCOS design flow. The specification is translated to a CDFG and analysed; partitioning, guided by the SW, HW and communication estimators, produces SW, HW and communication models; SW, HW and communication synthesis then generate assembler code and a netlist.]
Building Systems
Platform architectures are heterogeneous
Different processing element types
Different interconnection networks and communication protocols
Different memory types
Different scheduling and synchronization strategies
[Figure: heterogeneous platform with processors (P), a DSP, a coprocessor (CoP) and memories (M).]
Managing HW platform complexity
Development of APIs to hide complexity from application programmer and improve portability
Specialized RTOS to control resource sharing and interfaces
Complex multi-level HW/SW architecture
Software architecture
[Figure: layered HW/SW platform. Hardware: CPU, cache, private and shared memory, bus and bus controller, timers, interrupt controller, I/O and periphery. Software: drivers, RTOS and RTOS-APIs, with the applications on top.]
Platform design challenges
Integration: design process integration, heterogeneous component and language integration
Design space exploration and optimization
Verification
Complex run-time interdependencies
Run-time dependencies of independent components via communication
Influence on timing and power
Need to handle resource sharing
Process/task scheduling
Communication scheduling
Scheduling strategies (static, dynamic, time or priority driven)
[Figure: a coprocessor (CoP) and two processing elements (PE).]
Interdependency example
Complex non-functional interdependencies
Periodic task executing on PE
Task writes to bus at the end of each periodic execution
Short execution time → high bus load
Long execution time → low bus load
A local decision to improve performance may impact the global system performance
System-on-Chip challenge
[Figure: a chip composed of processor, memory, io and router blocks.]
Network-on-Chip
Multi-hop: segmented communication
Concurrency: multiple simultaneous communications
Sharing: quasi-simultaneous resource usage, with multiple communication events occupying some or all resources in an interleaved fashion
[Figure: network with nodes a, b, c, d and memories (M).]
Platform-based design
New design paradigm ...
[Figure: platform design provides a platform built from IP; in platform-based design the specification is mapped onto the platform, which is re-configured or re-designed if needed.]