Data-flow Analysis for Interrupt-driven Microcontroller Software
Nathan Cooprider
Advisor: John Regehr
Dissertation defense
School of Computing
University of Utah
2
Data-flow Analysis for Interrupt-driven Microcontroller Software
• A whole program analysis
• Targeting embedded C programs
• Suitable for use in a compiler
3
Microcontrollers (MCUs)
• 10 billion units / year
• $12.5 billion market in 2006
• Cheap
• Resource constrained
• e.g. Wireless sensor networks
– Mica2 mote: ATmega 128L (4 MHz 8-bit MCU), 128 kB code, 4 kB data SRAM
4
Problem
• Resources are constrained
• Software outlives hardware
– Code reuse leads to bloat
• Low-level code confuses analysis
– Interrupt-driven concurrency
– Device register access
5
Solution
• Traditional data-flow analysis
– Not adequate precision for MCU software
• New techniques to increase precision
– Deal with concurrency
– Track volatile data
• Use in code transformations
– Optimizations
Thesis statement
6
Contributions
• Analysis techniques
– Interatomic concurrent data-flow (ICD)
– Tracking data through volatile variables
• Tool – cXprop
• Applications
– Practical memory safety – Safe TinyOS
– Offline RAM Compression
7
TinyOS
• Open-source OS for WSNs
• Written in nesC
– Dialect of C
• Concurrency
– Tasks and interrupts
– No threads
– Atomic sections
[Figure: timeline of main, tasks, and interrupts]
8
[Figure: roadmap with boxes for abstract interpretation, ICD, conditional X propagation, pointer analysis, volatile tracking, cXprop, Safe TinyOS, and RAM compression]
9
Abstract interpretation
• Abstract domain
– Abstract values
– Form poset
• Subset relation (⊆)
– Lattice
• Undefined (⊤)
• Unknown (⊥)
switch (x) {
. . .
  break;
case 42: case 7: case -1:
  if (x < 0)
    x *= -1;
  x++;
  if (x == 0)
    assert(0);
  break;
. . .
x = {42,7,-1}
Lattice of value sets for x, from unknown down to undefined:
{42,7,-1} or ⊥
{42,7}  {7,-1}  {42,-1}
{42}  {7}  {-1}
{} or ⊤
10
Abstract interpretation
• Data-flow analysis
– Transfer functions
– Merging
– Fixed point
11
Abstract interpretation
Worked example for the code fragment, starting from x = {42,7,-1}:
x < 0: true branch {-1}, false branch {42,7}
x *= -1: {1}
merge: {42,7,1}
x++: {43,8,2}
x == 0: true branch (leading to assert(0)) ⊤, false branch {43,8,2}
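The walk-through of the switch fragment can be sketched in C with a small value-set domain. This is only an illustration under stated assumptions, not cXprop's implementation: the `valset` type, its fixed capacity, and every function name here are hypothetical, and an empty set stands in for the slide's undefined element. The fragment has no loop, so one pass over the transfer functions already reaches the fixed point.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define VS_MAX 8  /* hypothetical capacity, enough for this example */

/* A value set: the abstract value of x. An empty set plays the role of
   "undefined": no execution reaches that program point. */
typedef struct {
    int vals[VS_MAX];
    size_t len;
} valset;

bool vs_contains(const valset *s, int v) {
    for (size_t i = 0; i < s->len; i++)
        if (s->vals[i] == v) return true;
    return false;
}

/* Add v if absent. A real analysis would fall back to "unknown" on overflow. */
void vs_add(valset *s, int v) {
    if (vs_contains(s, v)) return;
    assert(s->len < VS_MAX);
    s->vals[s->len++] = v;
}

/* Merge: set union of the values arriving along two control-flow edges. */
valset vs_merge(const valset *a, const valset *b) {
    valset r = *a;
    for (size_t i = 0; i < b->len; i++) vs_add(&r, b->vals[i]);
    return r;
}

/* Transfer function for x = x * mul + add, applied to every member. */
valset vs_affine(const valset *s, int mul, int add) {
    valset r = { {0}, 0 };
    for (size_t i = 0; i < s->len; i++) vs_add(&r, s->vals[i] * mul + add);
    return r;
}

typedef bool (*pred_fn)(int);
bool is_negative(int v) { return v < 0; }
bool is_zero(int v) { return v == 0; }

/* Branch refinement: keep the members for which pred(v) == taken. */
valset vs_filter(const valset *s, pred_fn pred, bool taken) {
    valset r = { {0}, 0 };
    for (size_t i = 0; i < s->len; i++)
        if (pred(s->vals[i]) == taken) vs_add(&r, s->vals[i]);
    return r;
}

/* Analyze: if (x < 0) x *= -1; x++; if (x == 0) assert(0);
   Returns x after the fragment; *at_assert receives x on the assert edge. */
valset analyze_fragment(const valset *in, valset *at_assert) {
    valset neg = vs_filter(in, is_negative, true);
    valset nonneg = vs_filter(in, is_negative, false);
    valset negated = vs_affine(&neg, -1, 0);
    valset merged = vs_merge(&negated, &nonneg);
    valset incremented = vs_affine(&merged, 1, 1);
    *at_assert = vs_filter(&incremented, is_zero, true);
    return vs_filter(&incremented, is_zero, false);
}
```

Calling `analyze_fragment` on {42,7,-1} leaves the assert edge empty, which is the fact a compiler would need in order to delete the `assert(0)` call as dead code.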
13
Interrupt-driven concurrency
• Problems
– C statements not necessarily atomic
x = 0x4242;
ldi r24, 0x42
ldi r25, 0x42
[Figure: an interrupt arriving between the two instructions]
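The hazard of a non-atomic 16-bit update can be simulated on a host machine. The sketch below is not AVR code: it assumes, for illustration only, that the assignment is split into two byte stores, and it models an interrupt as a callback invoked between them. All names (`store_x_nonatomic`, `isr_snapshot`, and so on) are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Simulated 8-bit machine: the 16-bit variable x lives in two byte cells. */
static volatile uint8_t x_lo, x_hi;

/* What the simulated interrupt handler saw when it read x. */
static uint16_t observed;

typedef void (*isr_fn)(void);

uint16_t read_x(void) {
    return (uint16_t)(x_lo | ((uint16_t)x_hi << 8));
}

/* Simulated interrupt handler: takes a snapshot of x. */
void isr_snapshot(void) { observed = read_x(); }

/* x = value, performed as two byte stores. `between` models an interrupt
   that fires after the first store (NULL means no interrupt). */
void store_x_nonatomic(uint16_t value, isr_fn between) {
    x_lo = (uint8_t)(value & 0xFF);
    if (between) between();
    x_hi = (uint8_t)(value >> 8);
}

/* The same store inside an atomic section: the pending interrupt is held
   off until both bytes are written. */
void store_x_atomic(uint16_t value, isr_fn pending) {
    x_lo = (uint8_t)(value & 0xFF);
    x_hi = (uint8_t)(value >> 8);
    if (pending) pending();
}
```

Starting from x = 0, the non-atomic store lets the handler observe the torn value 0x0042, while the atomic version only ever exposes 0x4242.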
14
Interrupt-driven concurrency
• Problems
– C statements not necessarily atomic
– Preempts sequential control flow
• Complicated control flow
• Synchronization
– One flow does not “break” another
– Bad synchronization happens
• Difficult or impossible to reason about
• Must deal with conservatively (⊥)
A race
15
Related work
• Thread-based concurrency
– M. B. Dwyer, L. A. Clarke, J. M. Cobleigh, and G. Naumovich. Flow analysis for verifying properties of concurrent software systems. TOSEM 2004.
– M. C. Rinard. Analysis of multithreaded programs. SAS 2001.
• Leveraging race detection
– R. Chugh, J. W. Voung, R. Jhala, and S. Lerner. Dataflow analysis for concurrent programs using datarace detection. PLDI 2008.
• Formal semantics
– X. Feng, Z. Shao, Y. Dong, and Y. Guo. Certifying low-level programs with hardware interrupts and preemptive threads. PLDI 2008.
16
Race detection
• Lockset analysis – standard technique
– Lock status = interrupt enable bit status
– Only one lock – no lock aliasing
– nesC uses lexical nesting
• Data classification
– Unshared – accessed only from main
– Shared – accessed from interrupts
17
Race detection
• Data classification
– Unshared – accessed only from main
– Shared – accessed from interrupts
Conditions shown for a RACE:
• Accessed without locking
• Written in shared or unlocked unshared code
• Accessed in shared code
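One way to read the three conditions is as a conjunction over a per-variable access summary. The sketch below makes that reading explicit; it is an assumption, as are the `access_summary` fields and the `classify` function, none of which come from cXprop. "Locked" means inside an atomic section, since the only lock is the interrupt enable bit.

```c
#include <stdbool.h>

/* Hypothetical per-variable summary gathered by a whole-program pass. */
typedef struct {
    bool accessed_in_shared_code;   /* touched by code reachable from an interrupt */
    bool accessed_without_locking;  /* some access happens outside an atomic section */
    bool written_in_shared_code;    /* written by interrupt-reachable code */
    bool written_unlocked_in_main;  /* written by unshared code outside an atomic section */
} access_summary;

typedef enum { DATA_UNSHARED, DATA_SHARED_NOT_RACING, DATA_RACING } data_class;

/* Assumed reading: a variable races only when all three conditions hold. */
data_class classify(const access_summary *s) {
    if (!s->accessed_in_shared_code)
        return DATA_UNSHARED;
    bool risky_write = s->written_in_shared_code || s->written_unlocked_in_main;
    if (s->accessed_without_locking && risky_write)
        return DATA_RACING;
    return DATA_SHARED_NOT_RACING;
}
```

Under this reading, a variable that interrupts only read and that nothing writes outside an atomic section stays shared but not racing, so the analysis does not have to treat it as unknown.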
18
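Race detection case analysis
[Figure: case table over the executing context (interrupt or task), whether the access is inside an atomic section, and the kind of use or access (read or write); each case is marked racing or not racing]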
Data classification
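[Figure: classification tree. Data is static (global), stack, or heap; data is then split into shared and unshared, and shared data into racing and not racing. Leaf labels: ⊥, concurrent, sequential. Percentages shown: 6%, 44%, 50%.]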
20
Atomic interleaving
[Figure: control flow of main and an interrupt, with atomic sections marked in each]
Interatomic Concurrent Data-flow
Published at LCTES 2006
21
Volatile
• C type qualifier – volatile int
• Special case of C’s memory model
– Read value may change “randomly”
– Write may affect system state
• E.g., racing data, device registers
• Behavior opaque at C level
• Prevents compiler optimizations
22
Tracking volatile RAM
• Locate variables backed by RAM
• Introduce concurrency information
– Interatomic concurrent dataflow
• Have sound approximation of mutators
– Behavior not opaque at system level
• Safely analyze volatile variables in RAM
23
Tracking volatile device registers
• Hardware registers
– Memory mapped I/O
– Hardware not actually random (volatile)
• Can track using MCU-specific information
– OK to track individual bits
• Instead of whole register
• Interrupt bit of status register
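Tracking single bits of a register can be sketched with a known-bits abstract value. The representation below (`abs_reg` with a `known` mask and a `value` mask) and all function names are assumptions made for illustration, not cXprop's bitwise domain. The only hardware fact used is that the global interrupt enable flag is bit 7 of the AVR status register.

```c
#include <stdbool.h>
#include <stdint.h>

#define SREG_I_BIT 7  /* AVR global interrupt enable flag */

/* Abstract 8-bit register: bit i is known iff bit i of `known` is set,
   and then bit i of `value` holds its value. Unknown bits keep value 0. */
typedef struct { uint8_t known; uint8_t value; } abs_reg;

abs_reg reg_unknown(void) { return (abs_reg){ 0, 0 }; }

/* Transfer function for reg |= mask: the masked bits become known 1. */
abs_reg reg_or_const(abs_reg r, uint8_t mask) {
    r.known |= mask;
    r.value |= mask;
    return r;
}

/* Transfer function for reg &= mask: bits cleared in mask become known 0. */
abs_reg reg_and_const(abs_reg r, uint8_t mask) {
    r.known |= (uint8_t)~mask;
    r.value &= mask;
    return r;
}

/* Merge at a control-flow join: keep bits known on both sides that agree. */
abs_reg reg_merge(abs_reg a, abs_reg b) {
    uint8_t agree = (uint8_t)~(a.value ^ b.value);
    abs_reg r;
    r.known = (uint8_t)(a.known & b.known & agree);
    r.value = (uint8_t)(a.value & r.known);
    return r;
}

/* Query one bit; returns false when the bit is not known. */
bool bit_known(abs_reg r, unsigned bit, bool *out) {
    uint8_t m = (uint8_t)(1u << bit);
    if (!(r.known & m)) return false;
    *out = (r.value & m) != 0;
    return true;
}
```

After a transfer function that clears bit 7, the interrupt bit is known to be 0 even though the other seven bits of the register remain unknown, which is the precision a whole-register treatment would lose.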
24
Pointer analysis
• Points-to sets – must and may alias
– Two pluggable domains
– Subtleties from context-insensitivity
• Targets:
– Device registers
– Scalars
– Structs
– Arrays
– not-NULL
– Heap
25
Conditional X propagation
• Pluggable abstract domains
– From conditional constant propagation
• Clean domain interface
– Transfer functions
– Abstract interpretation utility functions
[Figure: the analysis and the abstract domain drawn as separate components]
26
Domains
• Constant
• Bitwise
• Interval
• Value set
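A pluggable domain interface can be sketched in C as a table of function pointers that the analysis calls through. Everything below is hypothetical: the `domain` struct, the shared range representation `absval`, and the two toy domains are assumptions for illustration and do not reproduce cXprop's actual interface (cXprop itself is a CIL extension, not C code).

```c
#include <limits.h>
#include <stdbool.h>

/* Shared abstract-value representation (an assumption of this sketch):
   a closed range [lo, hi]; [INT_MIN, INT_MAX] means "unknown". */
typedef struct { int lo, hi; } absval;

/* Hypothetical domain interface: the analysis only calls through it. */
typedef struct {
    const char *name;
    absval (*lift)(int c);                 /* abstract a concrete constant */
    absval (*merge)(absval a, absval b);   /* join at control-flow merges */
    absval (*add_const)(absval a, int k);  /* transfer function for x += k */
} domain;

absval av_unknown(void) { return (absval){ INT_MIN, INT_MAX }; }
bool av_is_unknown(absval a) { return a.lo == INT_MIN && a.hi == INT_MAX; }
absval av_lift(int c) { return (absval){ c, c }; }

absval av_add_const(absval a, int k) {
    if (av_is_unknown(a)) return a;
    long long lo = (long long)a.lo + k, hi = (long long)a.hi + k;
    if (lo < INT_MIN || hi > INT_MAX) return av_unknown();  /* overflow */
    return (absval){ (int)lo, (int)hi };
}

/* Constant domain: merging two different constants loses all information. */
absval const_merge(absval a, absval b) {
    if (a.lo == a.hi && b.lo == b.hi && a.lo == b.lo) return a;
    return av_unknown();
}

/* Interval domain: merging takes the smallest enclosing range. */
absval interval_merge(absval a, absval b) {
    return (absval){ a.lo < b.lo ? a.lo : b.lo, a.hi > b.hi ? a.hi : b.hi };
}

const domain constant_domain = { "constant", av_lift, const_merge, av_add_const };
const domain interval_domain = { "interval", av_lift, interval_merge, av_add_const };

/* Domain-independent analysis step: join two incoming constants, then x++. */
absval join_then_increment(const domain *d, int a, int b) {
    return d->add_const(d->merge(d->lift(a), d->lift(b)), 1);
}
```

The same `join_then_increment` step yields unknown for the inputs 7 and 42 under the constant domain and the range [8, 43] under the interval domain, which shows how swapping the domain changes precision without touching the analysis code.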
28
cXprop stages, in slide order:
• Struct splitter
• Inliner
• Cleaner
• Fixed point computation
– Value-flow
– Pointer-flow
– ICD
– Volatile tracking
• Cleaner
• Transformations
– Constant propagation
– Dead code elimination
– Dead data elimination
Implemented as a CIL extension
29
Suppose we have a WSN…
30
Suppose we have a WSN…
• What happened?
– State got corrupted – array out-of-bounds
– Hard to debug
• Limited visibility into executing systems
• Difficult to replicate complex bugs
• Memory safety can
– Catch all pointer and array bounds errors
• Before they corrupt state
– Provide a choice of recovery action
• Display error message or reboot
Memory safety error
31
Safe TinyOS
Published at SenSys 2007
Deputy: existing solution for making C safe
Expand into system safety
• Modify TinyOS to work with Deputy
• Enforce Deputy’s safety model under concurrency
• Reduce overhead (cXprop)
32
Safe TinyOS toolchain
• Annotate TinyOS code
• Run modified nesC compiler
• Enforce safety using Deputy
• Deal with concurrency
• Compress error messages
• Whole-program optimization (cXprop)
• Result: Safe TinyOS app
Example annotation: int post(val_t* buf, int n); becomes int post(val_t* COUNT(n) buf, int n);
Stage groups: modify TinyOS to work with Deputy; enforce Deputy’s safety model under concurrency; reduce overhead
33
Concurrency
• Deputy enforces safety in sequential code
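• cXprop avoids extraneous protection
– Only racing variables need protection
[Figure: unprotected, an interrupt can arrive between the Deputy check (the if) and the potentially unsafe read. Protected, an atomic block copies the potentially unsafe read to a local, and the check and the later read use the local.]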
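The check-then-use hazard and its fix can be simulated on a host. In the sketch below the interrupt is modelled as a callback, the atomic block is indicated only by comments, and `bounds_ok` is a hypothetical stand-in for a Deputy-style bounds check; none of these names come from Deputy or Safe TinyOS. The unprotected function returns the index it would have dereferenced instead of dereferencing it, so the demonstration itself stays memory safe.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define BUF_LEN 4
static int buf[BUF_LEN] = { 10, 11, 12, 13 };

/* Racing variable: also written by the simulated interrupt handler. */
static volatile size_t shared_idx;

typedef void (*isr_fn)(void);
void isr_corrupt_index(void) { shared_idx = 99; }

/* Hypothetical stand-in for an inserted bounds check. */
bool bounds_ok(size_t i) { return i < BUF_LEN; }

/* Read through an index that has already been checked. */
int read_checked(size_t i) {
    assert(bounds_ok(i));
    return buf[i];
}

/* Unprotected: the check and the use each read the racing variable.
   Returns the index the use would dereference (BUF_LEN if rejected). */
size_t index_used_unprotected(isr_fn irq) {
    if (!bounds_ok(shared_idx)) return BUF_LEN;
    if (irq) irq();        /* interrupt lands between check and use */
    return shared_idx;     /* the use re-reads the racing variable */
}

/* Protected: copy to a local inside an atomic block, then check and use
   the local. Returns BUF_LEN if the check rejects the index. */
size_t index_used_protected(isr_fn irq) {
    size_t local;
    /* begin atomic block: interrupts are held off here */
    local = shared_idx;
    /* end atomic block */
    if (irq) irq();        /* a pending interrupt may run now, harmlessly */
    if (!bounds_ok(local)) return BUF_LEN;
    return local;
}
```

With `shared_idx` set to 2, the unprotected path ends up using index 99 despite passing its check, while the protected path uses 2. Only variables classified as racing need the extra copy, which is where the race analysis saves overhead.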
35
Code size
36
Code size
[Figure: code size chart for Safe TinyOS; callouts: 35%, 13%, -11%]
38
A closer look at RAM usage
• On-chip RAM for MCUs expensive
– Kilobytes, not megabytes or gigabytes
– Data in SRAM – 6 transistors / bit
– SRAM can dominate power consumption of a sleeping chip
• Is RAM used efficiently?
– Performed value profiling for MCU apps
• Apps already heavily tuned for RAM usage
– Result: Average byte stores four values!
On-chip RAM is persistently scarce in tiny MCU-based systems
39
Offline RAM compression
• Automated sub-word packing for statically allocated scalars, pointers, structs, arrays
– No heap on targeted MCUs
– Trades ROM and CPU cycles for RAM
Published at PLDI 2007
40
Method
x ≝ variable that occupies n bits
Vx ≝ conservative estimate of value set
log2|Vx| < n ⇒ RAM compression possible
Cx ≝ another set such that |Cx| = |Vx|
fx ≝ bijection between Vx and Cx
n - log2|Cx| ⇒ bits saved through compression of x
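The savings rule can be checked with a few lines of C. This is a sketch under one stated assumption: since a variable occupies a whole number of bits, log2|Vx| is rounded up. The function names `bits_needed` and `bits_saved` are hypothetical.

```c
#include <stddef.h>

/* Smallest number of bits that can distinguish `count` values, that is
   log2(count) rounded up; 0 when there is at most one value. */
unsigned bits_needed(size_t count) {
    unsigned bits = 0;
    while (bits < sizeof(size_t) * 8 && ((size_t)1 << bits) < count)
        bits++;
    return bits;
}

/* Bits saved by compressing an n-bit variable whose value set has
   `value_set_size` members; 0 when compression does not help. */
unsigned bits_saved(unsigned n, size_t value_set_size) {
    unsigned c = bits_needed(value_set_size);
    return c < n ? n - c : 0;
}
```

For a 16-bit variable with four possible values, `bits_saved(16, 4)` is 14, so only 2 bits of RAM are needed per instance.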
41
Example Compression
void (*function_queue[8])(void);
42
Example Compression
void (*function_queue[8])(void);
x = one element of function_queue
n = size of a function pointer = 16 bits
43
Example Compression
Vx = { &function_A, &function_B, &function_C, NULL }
44
Example Compression
|Vx| = 4
n = 16 bits
log2|Vx| < n
2 < 16
45
Example Compression
Vx maps to Cx = {0, 1, 2, 3}
fx ≝ Vx to Cx ≝ compression
fx⁻¹ ≝ Cx to Vx ≝ decompression
46
Example Compression
Vx = { &function_A, &function_B, &function_C, NULL }, Cx = {0, 1, 2, 3}
[Figure: the value table for Vx is stored in ROM]
fx ≝ compression ≝ table scan
fx⁻¹ ≝ decompression ≝ table lookup
47
Example Compression
Vx = { &function_A, &function_B, &function_C, NULL }, Cx = {0, 1, 2, 3}
[Figure: the value table for Vx is stored in ROM]
128 bits reduced to 16 bits
112 bits of RAM saved
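The function-queue example can be sketched in C as a constant value table plus eight 2-bit indices packed into one 16-bit word. This is an illustrative reconstruction, not the code the tool generates: the table order, the accessor names `queue_set` and `queue_get`, and the counter bodies of the three functions are all assumptions. On a real target the table would be placed in ROM.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef void (*task_fn)(void);

/* Three distinct tasks; the counters only keep their bodies different. */
static int calls_A, calls_B, calls_C;
void function_A(void) { calls_A++; }
void function_B(void) { calls_B++; }
void function_C(void) { calls_C++; }

/* Decompression table (const data, ROM on the target).
   Index 0 is NULL, so an all-zero queue is an empty queue. */
static const task_fn value_table[4] = { NULL, function_A, function_B, function_C };

/* Eight 2-bit indices: 16 bits of RAM instead of 8 * 16 = 128. */
static uint16_t packed_queue;

/* fx: compression by table scan. */
unsigned compress(task_fn f) {
    for (unsigned i = 0; i < 4; i++)
        if (value_table[i] == f) return i;
    assert(!"value outside the analyzed value set");
    return 0;
}

/* Inverse of fx: decompression by table lookup. */
task_fn decompress(unsigned idx) { return value_table[idx & 3u]; }

/* Replacement for: function_queue[slot] = f; */
void queue_set(unsigned slot, task_fn f) {
    unsigned shift = 2u * (slot & 7u);
    packed_queue = (uint16_t)((packed_queue & ~(3u << shift)) |
                              (compress(f) << shift));
}

/* Replacement for: f = function_queue[slot]; */
task_fn queue_get(unsigned slot) {
    unsigned shift = 2u * (slot & 7u);
    return decompress((packed_queue >> shift) & 3u);
}
```

Every read and write of the queue now costs a shift, a mask, and a table access, which is the ROM and CPU price paid for the 112 bits of RAM.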
49
RAM compression results
50
RAM compression results
Compression: 22% RAM reduction, 3.6% ROM reduction, 29% duty cycle increase
cXprop (no compression): 10% RAM reduction, 20% ROM reduction, 5.9% duty cycle reduction
Tradeoffs
52
Conclusion
• Interatomic concurrent data-flow
• Volatile data may be tracked
• Better analysis → more optimizations
– Safe TinyOS – practical memory safety
– RAM compression – 22% RAM reduction
http://www.cs.utah.edu/~coop/research/cxprop/
http://www.cs.utah.edu/~coop/safetinyos/
http://www.cs.utah.edu/~coop/research/ccomp/
Thank you
53
54
Cost/Benefit Ratio
Su ≝ original size
Sc ≝ compressed size
C ≝ access profile
V ≝ cardinality of value set
A,B ≝ platform-specific costs
Cost/Benefit Ratio = Σi Ci (Ai + Bi·V) / (Su − Sc)
55
Turning the RAM Knob
[Figure: animation across slides 55 to 66 with the knob set to 0%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 100%, and finally 95%]
67
Future work
• Triggering and sequencing
• Caching compressed values
[Figure: triggering and sequencing example with labels: timer interrupt handler, Fire, Trigger, Sense, data ready interrupt handler, Data]
[Figure: caching example in which each of three reads of x is paired with a decompress of x]
68
More related work
• Safe TinyOS
– R. K. Rengaswamy, E. Kohler, and M. Srivastava. Software-based memory protection in sensor nodes. EmNets 2006.
– B. L. Titzer. Virgil: Objects on the head of a pin. OOPSLA 2006.
– S. Kowshik, D. Dhurjati, and V. Adve. Ensuring code safety without runtime checks for real-time control systems. CASES 2002.
• Offline RAM compression
– Y. Zhang and R. Gupta. Compressing heap data for improved memory performance. Software: Practice and Experience 2006.
– L. S. Bai, L. Yang, and R. P. Dick. Automated compile-time and run-time techniques to increase usable memory in MMU-less embedded systems. CASES 2006.
69
PAG
• Program Analysis Generator
– Domain specific language input describes
• Domain lattice
• Transfer functions
• Language-describing grammar
• Fixed point solution method
– Data-flow analyzer as output
• Does not deal with concurrency
• Used to evaluate fixed point solutions
70
Feature comparison
[Figure: feature comparison chart; callouts: 12%, 5.5%]
71
Domain comparison
72
Resource reduction
[Figure: resource reduction chart; callouts: 12%, 8.3%, 2.5%, 1.8%]
73
Atomic interleaving
[Figure: control flow of main and two interrupts, with atomic sections marked in each]
Interatomic Concurrent Data-flow
Published at LCTES 2006
74
Context insensitivity
a is a global variable
foo() { int x = 7; bar(&x); }
bar(int *y) { goo(y); }
goo(int *z) { *z = 42; a = *z; }
Abstract values shown for each function:
foo: a = {27}, x = {7}; then a = {27}, x = {7,42}
bar: a = {27}, y = {&x}
goo: a = {27}, z = {&x}; then a = {7,27,42}, z = {&x}
75
Benchmark descriptions
• AVR ATmega128 code
• TinyOS
• 3,000-26,000 lines of C code
• Analysis times – seconds to an hour
• Metrics
– Duty cycle
• % of time processor is on
• Obtained from Avrora
– Cycle-accurate simulator for WSNs
– Code size and data size
76
Wireless sensor networks
• 10 billion units / year
• $12.5 billion market in 2006
• Cheap
• Resource constrained
• e.g. Wireless sensor networks
– Mica2 mote: ATmega 128L (4 MHz 8-bit MCU), 128 KB code, 4 KB data SRAM