- 1 - P. Marwedel, Univ. Dortmund, Informatik 12, 2005/6
Universität DortmundUniversität Dortmund
Embedded System Hardware
Embedded system hardware is frequently used in a loop(„hardware in a loop“):
Embedded system hardware is frequently used in a loop(„hardware in a loop“):
actuators
- 2 - P. Marwedel, Univ. Dortmund, Informatik 12, 2005/6
Universität DortmundUniversität Dortmund
Summary
Processing units Power efficiency of target technologies ASICs Processors
• Energy efficiency• Code size efficiency and code compaction• Run-time efficiency• DSP processors• Multimedia processors• Very long instruction word (VLIW) machines• Micro-controllers
Reconfigurable Hardware Memory
Processing units Power efficiency of target technologies ASICs Processors
• Energy efficiency• Code size efficiency and code compaction• Run-time efficiency• DSP processors• Multimedia processors• Very long instruction word (VLIW) machines• Micro-controllers
Reconfigurable Hardware Memory
- 3 - P. Marwedel, Univ. Dortmund, Informatik 12, 2005/6
Universität DortmundUniversität Dortmund
Memory
For the memory, efficiency is again a concern: speed (latency and throughput); predictable timing energy efficiency size cost other attributes (volatile vs. persistent, etc)
For the memory, efficiency is again a concern: speed (latency and throughput); predictable timing energy efficiency size cost other attributes (volatile vs. persistent, etc)
- 4 - P. Marwedel, Univ. Dortmund, Informatik 12, 2005/6
Universität DortmundUniversität Dortmund
Access times and energy consumption increases with the size of the memory
Example (CACTI Model): "Currently, the size of some applications is doubling every 10 months" [STMicroelectronics, Medea+ Workshop, Stuttgart, Nov. 2003]
- 5 - P. Marwedel, Univ. Dortmund, Informatik 12, 2005/6
Universität DortmundUniversität Dortmund
Access times and energy consumptionfor multi-ported register files
Rixner’s et al. model [HPCA’00], Technology of 0.18 mRixner’s et al. model [HPCA’00], Technology of 0.18 m
Cycle Time (ns) Area (2x106) Power (W)
1
1.1
1.2
1.3
1.4
1.5
1.6
1.7
1.8
16 32 64 128
Register File Size
0
1
2
3
4
5
6
7
16 32 64 128
GP6M2 GP6M3
0
2
4
6
8
10
12
14
16 32 64 128
Register File Size
Sou
rce
and
© H
. Val
ero,
200
1
- 6 - P. Marwedel, Univ. Dortmund, Informatik 12, 2005/6
Universität DortmundUniversität Dortmund
Reducing energy consumption with Sub-banking
shorter wires,lower capacitances,faster access
shorter wires,lower capacitances,faster access
0
5
10
15
20
25
30
Memory Size (bytes)
Ene
rgy
[uJ]
1-Subbank 2-Subbanks4-Subbanks 8-Subbanks16-Subbanks
http://www.ecse.rpi.edu/frisc/theses/MaierThesis/FIG319.GIF
- 7 - P. Marwedel, Univ. Dortmund, Informatik 12, 2005/6
Universität DortmundUniversität Dortmund
How much of the energy consumption of a system is memory-related?
Mobile PCThermal Design (TDP) System Power
Note: Based on Actual Measurements
600/500 MHz uP37%
LCD 10"19%
HDD9%
Memory+Graphics12%
Power Supply10%
Other13%
Mobile PCAverage System Power
600/500 MHz uP13%
LCD 10"30%
HDD19%
Memory+Graphics15%
Power Supply10%
Other13%
CPU Dominates Thermal Design Power
Multiple Platform Components Comprise
Average Power [Courtesy: N. Dutt; Source: V. Tiwari]
- 8 - P. Marwedel, Univ. Dortmund, Informatik 12, 2005/6
Universität DortmundUniversität Dortmund
„CPU“ Power Dissipation
Power PC
Clock19%
Control L.16%
Cache23%
Data Flow11%
I/O 7%
PLA5%
ROM2%
TLB17%
Strong ARM
Icache26%
Dcache16%
Clock10%
IMMU9%
EBOX8%
DMMU8%
Others5%
Ibox18%
IEEE Journal of SSC Nov. 96
Proceedings of ISSCC 94Based on slide by and ©: Osman S. Unsal, Israel Koren, C. Mani Krishna, Csaba Andras Moritz, University of Massachusetts, Amherst, 2001
42%/40% again memory-related !
- 9 - P. Marwedel, Univ. Dortmund, Informatik 12, 2005/6
Universität DortmundUniversität Dortmund
Access-times will be a problem
Speed gap between processor and main DRAM increasesSpeed gap between processor and main DRAM increases
2
4
8
2 4 5
Speed
years
CPU
(1.5
-2 p
.a.)
DRAM (1.07 p.a.)
31
• early 60ties (Atlas):page fault ~ 2500 instructions
• 2002 (2 GHz µP):access to DRAM ~ 500 instructions
penalty for cache miss about same as for page fault in Atlas
• early 60ties (Atlas):page fault ~ 2500 instructions
• 2002 (2 GHz µP):access to DRAM ~ 500 instructions
penalty for cache miss about same as for page fault in Atlas
[P. Machanik: Approaches to Addressing the Memory Wall, TR Nov. 2002, U. Brisbane]
2x every 2 years
10
- 10 - P. Marwedel, Univ. Dortmund, Informatik 12, 2005/6
Universität DortmundUniversität Dortmund
Predictability is a problem (1)
Embedded system systems are often real-time systems:
Have to guarantee meeting timing constraints.
Predictability: For satisfying timing constraints in hard real-time systems, predictability is the most important concern;pre run-time scheduling is often the only practical means of providing predictability in a complex system [Xu, Parnas]
Time-triggered, statically scheduled operating systems
Embedded system systems are often real-time systems:
Have to guarantee meeting timing constraints.
Predictability: For satisfying timing constraints in hard real-time systems, predictability is the most important concern;pre run-time scheduling is often the only practical means of providing predictability in a complex system [Xu, Parnas]
Time-triggered, statically scheduled operating systems
- 11 - P. Marwedel, Univ. Dortmund, Informatik 12, 2005/6
Universität DortmundUniversität Dortmund
Predictability is a problem (2)
Currently available caches don‘t solve the problem:• Improve the average case behavior• Use „non-deterministic“ cache replacement algorithms
Scratch-pad/tightly coupled memory based predictability
Currently available caches don‘t solve the problem:• Improve the average case behavior• Use „non-deterministic“ cache replacement algorithms
Scratch-pad/tightly coupled memory based predictability
- 12 - P. Marwedel, Univ. Dortmund, Informatik 12, 2005/6
Universität DortmundUniversität Dortmund
Hierarchical memoriesusing scratch pad memories (SPM)
Address space
Address space ARM7TDMI
cores, well-known for low power consumption
scratch pad memory
0
FFF..
main
SPM
processor
HierarchyHierarchy
ExampleExample
no tag memory
- 13 - P. Marwedel, Univ. Dortmund, Informatik 12, 2005/6
Universität DortmundUniversität Dortmund
Comparison of currents using measurements
E.g.: ATMEL board with ARM7TDMI andext. SRAM
Current32 Bit-Load Instruction (Thumb)
48,2 50,9 44,4 53,1
11677,2 82,2
1,16
0
50
100
150
200
Prog Main/ DataMain
Prog Main/ DataSPM
Prog SPM/ DataMain
Prog SPM/ Data SPM
mA
Core+SPM (mA) Main Memory Current (mA)
- 14 - P. Marwedel, Univ. Dortmund, Informatik 12, 2005/6
Universität DortmundUniversität Dortmund
Comparison of energy consumption
Energy32 Bit-Load Instruction (Thumb)
115,8
51,6
76,5
16,4
0,020,040,060,080,0
100,0120,0140,0
Prog Main/ Data Main Prog Main/ Data SPM Prog SPM/ Data Main Prog SPM/ Data SPM
10 n
J
Energy
Example: Atmel ARM-Evaluation board
€
Main memory access takes more cyclessavings (86%) larger than for current.
energy reduction:/ 7.06
energy reduction:/ 7.06
- 15 - P. Marwedel, Univ. Dortmund, Informatik 12, 2005/6
Universität DortmundUniversität Dortmund
Why not just use a cache ?
[P. Marwedel et al., ASPDAC, 2004]
1. Predictability?Worst case execution time
(WCET) may be large
- 16 - P. Marwedel, Univ. Dortmund, Informatik 12, 2005/6
Universität DortmundUniversität Dortmund
Why not just use a cache ? (2)
0
1
2
3
4
5
6
7
8
9
256 512 1024 2048 4096 8192 16384
memory size
En
erg
y p
er
ac
ce
ss
[n
J]
.
Scratch pad
Cache, 2way, 4GB space
Cache, 2way, 16 MB space
Cache, 2way, 1 MB space
[R. Banakar, S. Steinke, B.-S. Lee, 2001]
2. Energy for parallel access of sets, in comparators, muxes.
- 17 - P. Marwedel, Univ. Dortmund, Informatik 12, 2005/6
Universität DortmundUniversität Dortmund
Influence of the associativity
Parameters different from previous slides
[P. Marwedel et al., ASPDAC, 2004]
- 18 - P. Marwedel, Univ. Dortmund, Informatik 12, 2005/6
Universität DortmundUniversität Dortmund
Flash Memory Based on EPROM/EEPROM-Memory
floating gate FG can be charged by high voltage.Charged transistor non-conducting if row selected.Discharging with UV light.
EPROM storage cell
FG charged
FG not charged
Read amplifierGround
Row (j)
- 19 - P. Marwedel, Univ. Dortmund, Informatik 12, 2005/6
Universität DortmundUniversität Dortmund
EEPROMS storage cell
EEPROM= electrically erasable read only memory.EEPROM needs additional transistor per bit
Writing and erasing requires just electrical signals
FG charged
FG not chargedGround
Row (j)
- 20 - P. Marwedel, Univ. Dortmund, Informatik 12, 2005/6
Universität DortmundUniversität Dortmund
NOR- and NAND-Flash
NOR: Transistor between bit line and ground
NAND: Several transistor between bit line and ground
NOR: Transistor between bit line and ground
NAND: Several transistor between bit line and ground
[ww
w.s
amsu
ng.c
om/P
rod
ucts
/S
emic
ondu
ctor
/Fla
sh/F
lash
New
s/F
lash
Str
uctu
re.h
tm]
contact
contact
- 21 - P. Marwedel, Univ. Dortmund, Informatik 12, 2005/6
Universität DortmundUniversität Dortmund
Properties of NOR-
and NAND-Flash
memories
Type/Property NOR NAND
Random access Yes No
Erase block Slow Fast
Size of cell Larger Small
Reliability Larger Smaller
Execute in place Yes No
Applications Code storage, boot flash, set top box
Data storage, USB sticks, memory cards
[ww
w.s
amsu
ng.c
om/P
rodu
cts/
Sem
icon
duct
or/F
lash
/Fla
shN
ews/
Fla
shS
truc
ture
.htm
]
- 22 - P. Marwedel, Univ. Dortmund, Informatik 12, 2005/6
Universität DortmundUniversität Dortmund
Embedded System Hardware
Embedded system hardware is frequently used in a loop(„hardware in a loop“):
Embedded system hardware is frequently used in a loop(„hardware in a loop“):
actuators
For Impressive display technology see http://www.date-conference.com/conference/ 2003/keynotes/index.htm
- 23 - P. Marwedel, Univ. Dortmund, Informatik 12, 2005/6
Universität DortmundUniversität Dortmund
Digital-to-Analog (D/A) Converters
Various types, can be quite simple, e.g.:
- 24 - P. Marwedel, Univ. Dortmund, Informatik 12, 2005/6
Universität DortmundUniversität Dortmund
3
0
3
0123
2
842
i
ii
ref
refrefrefref
xR
VR
Vx
R
Vx
R
Vx
R
VxI
0'1 IRV
'II
01 IRV
3
0
131 )(8
2i
refi
iref xnatR
RVx
R
RVV
Due to Kirchhoff‘s laws:
Due to Kirchhoff‘s laws:
Current into Op-Amp=0:
Hence:
Finally:
Output voltage no. represented by x
- 25 - P. Marwedel, Univ. Dortmund, Informatik 12, 2005/6
Universität DortmundUniversität Dortmund
Possible to reconstruct analog value from digitized value?
Let fs be the sampling frequency
Input signals with frequency components > fs/2 cannot be distinguished from signals with frequency components < fs/2.
Example: Signal: 5.6 Hz; Sampling: 9 Hz
Let fs be the sampling frequency
Input signals with frequency components > fs/2 cannot be distinguished from signals with frequency components < fs/2.
Example: Signal: 5.6 Hz; Sampling: 9 Hz
-1.5
-1
-0.5
0
0.5
1
1.5
- 26 - P. Marwedel, Univ. Dortmund, Informatik 12, 2005/6
Universität DortmundUniversität Dortmund
Frequency spectrum of sampled signal
Let Xc(): frequency spectrum of the continuous signal, cut-off frequency ΩN=2π fN
Let Ωs=2π fs: sampling frequency
Let Xs: frequency spectrum of the sampled signal
Xs consists of multiple copies of Xc, separated by Ωs
Let Xc(): frequency spectrum of the continuous signal, cut-off frequency ΩN=2π fN
Let Ωs=2π fs: sampling frequency
Let Xs: frequency spectrum of the sampled signal
Xs consists of multiple copies of Xc, separated by Ωs
Formally: Xs = Xc folded with S,with S: frequency spectrum of clock
Formally: Xs = Xc folded with S,with S: frequency spectrum of clock
Cannot be distinguished in sampled signal
- 27 - P. Marwedel, Univ. Dortmund, Informatik 12, 2005/6
Universität DortmundUniversität Dortmund
Aliasing
If Ωs < 2 ΩN copies of spectrum will overlap
(we don’t know the original frequencies any more)
If Ωs < 2 ΩN copies of spectrum will overlap
(we don’t know the original frequencies any more)
No problem for signal reconstruction if this is avoided.No problem for signal reconstruction if this is avoided.
- 28 - P. Marwedel, Univ. Dortmund, Informatik 12, 2005/6
Universität DortmundUniversität Dortmund
Nyquist theorem
• Analog input to sample-and-hold can be precisely reconstructed from its output, provided that sampling proceeds at double of the highest frequency found in the input voltage. [Nyquist 1928, Shannon, 1949]
• Analog input to sample-and-hold can be precisely reconstructed from its output, provided that sampling proceeds at double of the highest frequency found in the input voltage. [Nyquist 1928, Shannon, 1949]
S/H A/D-converter D/A-converter
= ?
Inter-polate
Does not capture effect of value quantization:Quantization noise prevents precise reconstruction.
Does not capture effect of value quantization:Quantization noise prevents precise reconstruction.
- 29 - P. Marwedel, Univ. Dortmund, Informatik 12, 2005/6
Universität DortmundUniversität Dortmund
Embedded System Hardware
Embedded system hardware is frequently used in a loop(„hardware in a loop“):
Embedded system hardware is frequently used in a loop(„hardware in a loop“):
actuators
- 30 - P. Marwedel, Univ. Dortmund, Informatik 12, 2005/6
Universität DortmundUniversität Dortmund
Actuators and output
Huge variety of actuators and output devices, impossible to present all of them.Microsystems motors as examples (© MCNC):
(© MCNC)
- 31 - P. Marwedel, Univ. Dortmund, Informatik 12, 2005/6
Universität DortmundUniversität Dortmund
Actuators and output (2)
Courtesy and ©:E. Obermeier, MAT,
TU Berlin
- 32 - P. Marwedel, Univ. Dortmund, Informatik 12, 2005/6
Universität DortmundUniversität Dortmund
Stepper Motor
Stepper motor: rotates fixed number of degrees when given a “step” signal.
In contrast, DC motor just rotates when power applied. Rotation achieved by applying specific voltage sequence
to coils Controller greatly simplifies this
Stepper motor: rotates fixed number of degrees when given a “step” signal.
In contrast, DC motor just rotates when power applied. Rotation achieved by applying specific voltage sequence
to coils Controller greatly simplifies this
http://www.cise.ufl.edu/~prabhat/Teaching/cis6930-f04/comp4.pdf
- 33 - P. Marwedel, Univ. Dortmund, Informatik 12, 2005/6
Universität DortmundUniversität Dortmund
Summary
We have covered: sensors (briefly) sample-and-hold and A/D converter circuits communication information processing hardware
• ASICs (briefly)• processors• FPGAs• memory
D/A-converters actuators (briefly)
We have covered: sensors (briefly) sample-and-hold and A/D converter circuits communication information processing hardware
• ASICs (briefly)• processors• FPGAs• memory
D/A-converters actuators (briefly)
Top Related