EC6703 -Embedded Real Time Systems Dr.D.Rukmanidevi ... Materials/7...a A high-speed bus, connected...


8/6/2019

EC6703 -Embedded Real Time Systems

Dr.D.Rukmanidevi

Professor

R.M.D. Engineering College


DESIGNING WITH COMPUTING PLATFORMS

Designing with microprocessors.

Development and debugging.

System-level performance analysis.


System architectures

Architectures and components:

software;

hardware.

Some software is very hardware-dependent.


Hardware platform architecture

Contains several elements:

CPU;

bus;

memory;

I/O devices: networking, sensors, actuators, etc.

How big/fast must each one be?


Software architecture

Functional description must be broken into pieces:

division among people;

conceptual organization;

performance;

testability;

maintenance.


Hardware and software architectures

Hardware and software are intimately related:

software doesn’t run without hardware;

how much hardware you need is determined by the software requirements:

speed;

memory.


Evaluation boards

Basic Platform chip and a variety of I/O Devices

Designed by CPU manufacturer or others.

Includes CPU, memory, some I/O devices.

May include prototyping section.

CPU manufacturer often gives out the evaluation board netlist, which can be used as a starting point for your custom board design.


BeagleBoard

The BeagleBoard is an open-source platform: a low-cost board for embedded system development. The board includes an ARM Cortex-A8 processor, several built-in I/O devices, and many connectors (flash memory, video, and audio). It is primarily intended to support software development and to serve as a starting point for a product design.


Adding logic to a board

Programmable logic devices (PLDs) provide low/medium density logic.

Field-programmable gate arrays (FPGAs) provide more logic and multi-level logic.

Application-specific integrated circuits (ASICs) are manufactured for a single purpose.


The PC as a platform

Advantages:

cheap and easy to get;

rich and familiar software environment.

Disadvantages:

requires a lot of hardware resources;

not well-adapted to real-time.

more power-hungry than a custom hardware platform;

more expensive than a custom hardware platform.


Typical PC hardware platform

[Figure: block diagram of a typical PC platform. Labels: CPU, CPU bus, memory, DMA controller, timers, interrupt controller (intr ctrl), bus interface (twice), high-speed bus, low-speed bus, device (twice).]


Typical PC hardware platform

The CPU provides basic computational facilities.

RAM is used for program storage.

ROM holds the boot program.

A DMA controller provides DMA capabilities.

Timers are used by the operating system for a variety of purposes.

A high-speed bus, connected to the CPU bus through a bridge, allows fast devices to communicate efficiently with the rest of the system.

A low-speed bus provides an inexpensive way to connect simpler devices and may be necessary for backward compatibility as well.


Typical busses

PCI: standard for high-speed interfacing

33 or 66 MHz.

PCI Express.

USB (Universal Serial Bus), FireWire (IEEE 1394): relatively low-cost, high-speed serial interfaces.


Software elements

IBM PC uses BIOS (Basic Input/Output System) to implement low-level functions:

boot-up;

minimal device drivers.

BIOS has become a generic term for the lowest-level system software.


Example: StrongARM

StrongARM system includes:

CPU chip (3.686 MHz clock)

system control module (32.768 kHz clock):

• real-time clock;

• operating system timer;

• general-purpose I/O;

• interrupt controller;

• power manager controller;

• reset controller.


Host/target design

Use a host system to prepare software for target system:

[Figure: a host system connected to the target system by a serial line.]


Host-based tools

Cross compiler:

compiles code on host for target system.

Cross debugger:

displays target state, allows target system to be controlled.


Debugging Techniques

The serial port (or a USB port) can be used not only for development debugging but also for diagnosing problems in the field.

A breakpoint lets the user specify an address at which the program’s execution is to break; once execution stops, the user can examine and change the system state.

LEDs can be used to show error conditions, to show when the code enters certain routines, or to show idle-time activity.

The microprocessor in-circuit emulator (ICE) is a specialized hardware tool that can help debug software in a working embedded system. It allows you to stop execution, examine CPU state, and modify registers.

A logic analyzer is an array of low-grade oscilloscopes.


Logic analyzers

A logic analyzer is an array of low-grade oscilloscopes:


Logic analyzer architecture

[Figure: logic analyzer architecture. Labels: UUT, sample memory, microprocessor, controller, system clock, clock gen, state or timing mode, vector address, display, keypad.]


Debugging Challenges

Logical errors in software can be hard to track down, but errors in real-time code can create problems that are even harder to diagnose.

Real-time programs are required to finish their work within a certain amount of time; if they run too long, they can cause very unexpected behavior.

The exact results of missing real-time deadlines depend on the detailed characteristics of the I/O devices and the nature of the timing violation. This makes debugging real-time problems especially difficult


Boundary scan

Simplifies testing of multiple chips on a board.

Registers on pins can be configured as a scan chain.

Used for debuggers, in-circuit emulators.


How to exercise code

Run on host system.

Run on target system.

Run in instruction-level simulator.

Run on cycle-accurate simulator.

Run in hardware/software co-simulation environment.


Debugging real-time code

Bugs in drivers can cause non-deterministic behavior in the foreground program.

Bugs may be timing-dependent.


System-level performance analysis

Performance depends on all the elements of the system:

CPU.

Cache.

Bus.

Main memory.

I/O device.

[Figure: CPU, cache, and memory.]


Bandwidth as performance

Bandwidth applies to several components:

Memory.

Bus.

CPU fetches.

Different parts of the system run at different clock rates.

Different components may have different widths (bus, memory).


Bandwidth and data transfers

Video frame: 320 x 240 x 3 = 230,400 bytes.

Transfer in 1/30 sec.

Transfer 1 byte/µsec, 0.23 sec per frame.

Too slow.

Increase bandwidth:

Increase bus width.

Increase bus clock rate.
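The frame arithmetic can be checked with a short C sketch. It assumes a rate of 10^6 bytes per second (one byte per microsecond), which is the rate consistent with 0.23 sec per frame; the function names are illustrative and not from the slides.

```c
/* Bytes in one frame: width x height pixels, bytes_per_pixel bytes each. */
long frame_bytes(long width, long height, long bytes_per_pixel) {
    return width * height * bytes_per_pixel;
}

/* Seconds needed to move n_bytes at a given rate in bytes per second. */
double transfer_seconds(long n_bytes, double bytes_per_second) {
    return (double)n_bytes / bytes_per_second;
}

/* Returns 1 if a frame can be moved within one frame period, else 0. */
int meets_frame_rate(long n_bytes, double bytes_per_second,
                     double frames_per_second) {
    return transfer_seconds(n_bytes, bytes_per_second)
           <= 1.0 / frames_per_second;
}
```

With these numbers the transfer misses the 1/30 sec budget by roughly a factor of seven, so a wider bus or a faster bus clock has to make up about that much.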


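Bus bandwidth

T: number of bus cycles.

P: time per bus cycle.

Total time for transfer: $t = TP$.

D: data payload length.

O: overhead, $O = O_1 + O_2$.

W: bus width.

N: amount of data to transfer, in the same units as W.

[Figure: one bus transfer of width W, made up of overhead O1, data payload D, and overhead O2.]

$T_{basic}(N) = (D + O)\,N / W$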


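Bus burst transfer bandwidth

T: number of bus cycles.

P: time per bus cycle.

Total time for transfer: $t = TP$.

D: data payload length.

O: overhead, $O = O_1 + O_2$.

W: bus width.

B: number of transfers per burst.

[Figure: a burst of B data transfers of width W (labeled 1, 2, ..., B) with overhead O.]

$T_{burst}(N) = (BD + O)\,N / (BW)$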
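The two cycle-count formulas, Tbasic(N) = (D+O)N/W and Tburst(N) = (BD+O)N/(BW), can be compared numerically. The sketch below is illustrative only: the function names are made up, and N is assumed to be the amount of data to move, in the same units as the bus width W.

```c
/* Cycles on a basic bus: each transfer moves W units of data and costs
   D data cycles plus O overhead cycles. */
double t_basic(double N, double D, double O, double W) {
    return (D + O) * N / W;
}

/* Cycles on a burst bus: each burst moves B*W units of data and costs
   B*D data cycles plus O overhead cycles, so the overhead is shared. */
double t_burst(double N, double D, double O, double W, double B) {
    return (B * D + O) * N / (B * W);
}

/* Wall-clock time t = T * P for a cycle count T and cycle period P. */
double transfer_time(double T, double P) {
    return T * P;
}
```

For example, with N = 1024, D = 1, O = 3, and W = 4, the basic bus needs 1024 cycles while a burst of B = 4 needs 448, because the overhead is paid once per burst instead of once per transfer.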


Memory aspect ratios

[Figure: memory aspect ratios. The same capacity organized as 64 M x 1, 16 M x 4, or 8 M x 8.]


Memory access times

Memory component access times come from the chip data sheet.

Page modes allow faster access for successive transfers on the same page.

If data doesn’t fit naturally into physical words:

A = [(E/w)mod W]+1


COMPONENTS FOR EMBEDDED PROGRAMS

Components that are commonly used in embedded software: the state machine, the circular buffer, and the queue.

State machines are well suited to reactive systems such as user interfaces; circular buffers and queues are useful in digital signal processing.


Software state machine

State machine keeps internal state as a variable, changes state based on inputs.

Uses:

control-dominated code;

reactive systems.


State machine example

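[Figure: state transition diagram for a seat-belt controller with states idle, seated, belted, and buzzer. Transition labels (input/output) recovered from the slide: no seat/-; seat/timer on; no belt and no timer/-; no belt/timer on; belt/-; belt/buzzer off; Belt/buzzer on; no seat/-; no seat/buzzer off.]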


C implementation

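#define IDLE 0
#define SEATED 1
#define BELTED 2
#define BUZZER 3

switch (state) {
  case IDLE:
    if (seat) { state = SEATED; timer_on = TRUE; }
    break;
  case SEATED:
    if (belt) state = BELTED;          /* belt fastened in time */
    else if (timer) state = BUZZER;    /* timer expired without a belt */
    break;
  /* the BELTED and BUZZER cases are not shown on the slide */
}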
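A fuller version of the switch statement might look like the sketch below. The BELTED and BUZZER cases and the "no seat" exits are filled in from one reading of the transition labels on the state diagram, so treat them as assumptions; the struct and function names are hypothetical.

```c
#include <stdbool.h>

#define IDLE 0
#define SEATED 1
#define BELTED 2
#define BUZZER 3

/* State plus the two outputs the machine drives. */
struct belt_fsm {
    int state;
    bool timer_on;
    bool buzzer_on;
};

/* Advance the machine by one step given the current inputs. */
void belt_fsm_step(struct belt_fsm *m, bool seat, bool belt, bool timer) {
    switch (m->state) {
    case IDLE:
        if (seat) { m->state = SEATED; m->timer_on = true; }
        break;
    case SEATED:
        if (!seat) m->state = IDLE;                   /* no seat/- */
        else if (belt) m->state = BELTED;             /* belt/- */
        else if (timer) { m->state = BUZZER; m->buzzer_on = true; }
        break;
    case BELTED:
        if (!seat) m->state = IDLE;                   /* no seat/- */
        else if (!belt) { m->state = SEATED; m->timer_on = true; }
        break;
    case BUZZER:
        if (!seat) { m->state = IDLE; m->buzzer_on = false; }
        else if (belt) { m->state = BELTED; m->buzzer_on = false; }
        break;
    }
}
```

Calling belt_fsm_step once per sampling period keeps the whole control policy in one place, which is the main attraction of the state-variable style for control-dominated code.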


circular buffer

Commonly used in signal processing:

new data constantly arrives;

each datum has a limited lifetime.

Use a circular buffer to hold the data stream.

[Figure: data stream d1 ... d7 with the window of live data shown at time t and at time t+1.]


Circular buffer

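[Figure: a data stream x1 ... x6 with windows at times t1, t2, t3, and the corresponding circular buffer, which first holds x1 ... x4 and then receives x5, x6, x7 in place of the oldest entries.]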


Circular buffers

Indexes locate currently used data, current input data:

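[Figure: at time t1 the buffer holds d1, d2, d3, d4; at time t1+1 it holds d5, d2, d3, d4, with d5 written over d1. The "use" and "input" indexes mark the data currently in use and the slot for the current input.]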


Circular buffer implementation: FIR filter

int circ_buffer[N], circ_buffer_head = 0;
int c[N]; /* coefficients */
int f, ibuf, ic;

/* walk the buffer once, starting at the head and wrapping after N-1 */
for (f=0, ibuf=circ_buffer_head, ic=0;
     ic<N; ibuf=(ibuf==N-1 ? 0 : ibuf+1), ic++)
    f = f + c[ic]*circ_buffer[ibuf];
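To see the FIR loop run end to end, it helps to add a routine that inserts samples. The sketch below assumes that circ_buffer_head indexes the oldest sample (the next slot to be overwritten) and that c[0] weights that oldest sample; the tap count, the coefficient values, and the names circ_put and fir are illustrative choices, not taken from the slides.

```c
#define N 4   /* number of taps, chosen only for illustration */

int circ_buffer[N];          /* the N most recent samples, initially zero */
int circ_buffer_head = 0;    /* index of the oldest sample */
int c[N] = {1, 2, 3, 4};     /* illustrative coefficients */

/* Overwrite the oldest sample with a new one and advance the head. */
void circ_put(int sample) {
    circ_buffer[circ_buffer_head] = sample;
    circ_buffer_head = (circ_buffer_head == N - 1) ? 0 : circ_buffer_head + 1;
}

/* FIR output: walk the buffer once from oldest to newest sample. */
int fir(void) {
    int f = 0, ibuf = circ_buffer_head, ic;
    for (ic = 0; ic < N; ic++) {
        f += c[ic] * circ_buffer[ibuf];
        ibuf = (ibuf == N - 1) ? 0 : ibuf + 1;   /* wrap around */
    }
    return f;
}
```

After the samples 1, 2, 3, 4 have been inserted, fir() returns 1*1 + 2*2 + 3*3 + 4*4 = 30; inserting 5 replaces the 1, and the next output is 2*1 + 3*2 + 4*3 + 5*4 = 40.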


Queues

Elastic buffer: holds data that arrives irregularly.
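One common way to build such an elastic buffer is a fixed-size array with a head index and an element count. The following is a minimal sketch under that assumption; the capacity and all names are hypothetical.

```c
#define Q_SIZE 8   /* capacity, chosen only for illustration */

struct queue {
    int data[Q_SIZE];
    int head;    /* index of the next element to remove */
    int count;   /* number of elements currently stored */
};

void queue_init(struct queue *q) {
    q->head = 0;
    q->count = 0;
}

/* Add an element at the tail; returns 0 if the queue is full. */
int enqueue(struct queue *q, int value) {
    if (q->count == Q_SIZE) return 0;
    q->data[(q->head + q->count) % Q_SIZE] = value;
    q->count++;
    return 1;
}

/* Remove the oldest element into *value; returns 0 if the queue is empty. */
int dequeue(struct queue *q, int *value) {
    if (q->count == 0) return 0;
    *value = q->data[q->head];
    q->head = (q->head + 1) % Q_SIZE;
    q->count--;
    return 1;
}
```

The producer calls enqueue whenever data arrives and the consumer calls dequeue at its own pace; the return codes let each side detect overflow and underflow instead of silently losing data.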


Models of programs

Source code is not a good representation for programs:

clumsy;

leaves much information implicit.

Compilers derive intermediate representations to manipulate and optimize the program.


Data flow graph

DFG: data flow graph.

Does not represent control.

Models basic block: code with a single entry and a single exit.

Describes the minimal ordering requirements on operations.


Single assignment form

Original basic block:

x = a + b;
y = c - d;
z = x * y;
y = b + d;

Single assignment form:

x = a + b;
y = c - d;
z = x * y;
y1 = b + d;


Data flow graph

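Single assignment form:

x = a + b;
y = c - d;
z = x * y;
y1 = b + d;

[Figure: DFG for this code. a and b feed a + node that produces x; c and d feed a - node that produces y; x and y feed a * node that produces z; b and d feed a + node that produces y1.]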


DFGs and partial orders

Partial order:

a+b, c-d; b+d, x*y

Can do pairs of operations in any order.

[Figure: DFG with inputs a, b, c, d; operators +, -, +, *; and values x, y, z, y1.]


Control-data flow graph

CDFG: represents control and data.

Uses data flow graphs as components.

Two types of nodes:

decision;

data flow.


Data flow node

Encapsulates a data flow graph:

Write operations in basic block form for simplicity.

x = a + b;

y = c + d


Control

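[Figure: equivalent forms of a decision node: a two-way branch on cond with T and F exits, and a multiway branch on value with exits v1, v2, v3, v4.]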


for loop

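for loop:

for (i=0; i<N; i++)
    loop_body();

equivalent while loop:

i=0;
while (i<N) {
    loop_body(); i++; }

[Figure: CDFG for the loop. i=0 leads to a decision node i<N; the T branch runs loop_body() and returns to the test, and the F branch exits.]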


Basic Compilation Techniques

Compilation flow.

Basic statement translation.

Basic optimizations.

Interpreters and just-in-time compilers.


Compilation

Compilation strategy (Wirth):

compilation = translation + optimization

Compiler determines quality of code:

use of CPU resources;

memory access scheduling;

code size.


Basic compilation phases

1. High-level language code

2. Parsing, symbol table generation, semantic analysis

3. Machine-independent optimizations

4. Instruction-level optimizations and code generation

5. Assembly

Parsing breaks the source code into statements and expressions.

A symbol table is generated, which includes all the named objects in the program.


Statement translation and optimization

Source code is translated into intermediate form such as CDFG.

CDFG is transformed/optimized.

CDFG is translated into instructions with optimization decisions.

Instructions are further optimized.


Arithmetic expressions

a*b + 5*(c-d)

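[Figure: DFG for the expression. a and b feed a * node; c and d feed a - node, whose result is multiplied by 5; the two products feed the final + node.]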


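Arithmetic expressions, cont’d.

Code for a*b + 5*(c-d):

ADR r4,a
LDR r1,[r4]      ; r1 = a
ADR r4,b
LDR r2,[r4]      ; r2 = b
MUL r3,r1,r2     ; r3 = a*b
ADR r4,c
LDR r1,[r4]      ; r1 = c
ADR r4,d
LDR r5,[r4]      ; r5 = d
SUB r6,r1,r5     ; r6 = c-d
MOV r2,#5        ; ARM MUL takes register operands only
MUL r7,r6,r2     ; r7 = 5*(c-d)
ADD r8,r7,r3     ; r8 = a*b + 5*(c-d)

[Figure: the DFG for the expression, with its nodes numbered 1 to 4 in the order the code evaluates them.]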


Control code generation

if (a+b > 0)

x = 5;

else

x = 7;

[Figure: CDFG with a decision node a+b>0 that leads to x=5 on the true branch and x=7 on the false branch.]


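Control code generation, cont’d.

        ADR r5,a
        LDR r1,[r5]      ; r1 = a
        ADR r5,b
        LDR r2,[r5]      ; r2 = b
        ADD r3,r1,r2     ; r3 = a+b
        CMP r3,#0        ; set the flags for the test a+b > 0
        BLE label3       ; a+b <= 0: take the false branch
        MOV r3,#5        ; true branch: x = 5
        ADR r5,x
        STR r3,[r5]
        B stmtent
label3  MOV r3,#7        ; false branch: x = 7
        ADR r5,x
        STR r3,[r5]
stmtent ...

[Figure: the CDFG for the if statement, with the blocks a+b>0, x=5, and x=7 numbered 1 to 3 to match the code.]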


Procedure linkage

Need code to:

call and return;

pass parameters and results.

Parameters and returns are passed on stack.

Procedures with few parameters may use registers.


Procedure stacks

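proc1(int a) {
    proc2(5);
}

[Figure: stack holding the frame for proc1 and, in the direction of growth, the frame for proc2. SP (stack pointer) marks the end of the current frame; FP (frame pointer) marks the end of the last frame; the argument 5 is accessed relative to SP.]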


ARM procedure linkage

APCS (ARM Procedure Call Standard):

r0-r3 pass parameters into procedure. Extra parameters are put on stack frame.

r0 holds return value.

r4-r7 hold register variables.

r11 is frame pointer, r13 is stack pointer.

r10 holds limiting address on stack size to check for stack overflows.


Data structures

Different types of data structures use different data layouts.

Some offsets into data structure can be computed at compile time, others must be computed at run time.


One-dimensional arrays

C array name points to 0th element:

[Figure: a points to a[0], which is followed in memory by a[1] and a[2].]

With aptr pointing to a[0], a[i] = *(aptr + i).
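The equivalence between indexing and pointer arithmetic can be checked directly in C. The two helper names below are made up for the illustration.

```c
/* Read element i through array indexing. */
int get_by_index(const int a[], int i) {
    return a[i];
}

/* Read the same element through pointer arithmetic: aptr + i advances
   by i elements (i * sizeof(int) bytes), not by i bytes. */
int get_by_pointer(const int *aptr, int i) {
    return *(aptr + i);
}
```

Both functions compile to the same address computation, base plus i times the element size, which is why the offset can be folded at compile time when i is a constant.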


Two-dimensional arrays

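Row-major layout (the C convention):

[Figure: elements stored in the order a[0,0], a[0,1], ..., a[1,0], a[1,1], ...; the array has N rows of M elements each.]

a[i,j] = a[i*M+j]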
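The offset formula i*M + j, with M elements per row, is the one C uses for its own two-dimensional arrays (row-major order, where the second index varies fastest). A small sketch with hypothetical helper names:

```c
/* Offset of element (i, j) in a row-major array with M elements per row. */
int row_major_offset(int i, int j, int M) {
    return i * M + j;
}

/* Read element (i, j) from a flat buffer that stores the rows one after
   another. */
int get2d(const int *flat, int i, int j, int M) {
    return flat[row_major_offset(i, j, M)];
}
```

When i and j are loop variables the multiplication has to be done at run time, which is one of the cases where the offset into a data structure cannot be computed by the compiler.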


Structures

Fields within structures are static offsets:

struct mystruct {
    int field1;
    char field2;
};

struct mystruct a, *aptr = &a;

[Figure: aptr points to field1, which occupies 4 bytes; field2 follows at byte offset 4, that is, at *((char *)aptr + 4).]
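The static offset of a field can be inspected with the standard offsetof macro. The sketch below declares mystruct as a struct tag so that it can be used as a type; the helper function name is hypothetical, and the 4-byte size of int is typical but not guaranteed by the language.

```c
#include <stddef.h>   /* offsetof */

struct mystruct {
    int field1;
    char field2;
};

/* Read field2 by adding its compile-time byte offset to the address of
   the structure. The cast to char * makes the arithmetic count bytes. */
char read_field2_by_offset(const struct mystruct *aptr) {
    return *((const char *)aptr + offsetof(struct mystruct, field2));
}
```

Note that aptr + 4 without the cast would advance by four whole structures, not four bytes, because pointer arithmetic is scaled by the size of the pointed-to type.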


Expression simplification

Constant folding:

8+1 = 9

Algebraic:

a*b + a*c = a*(b+c)

Strength reduction:

a*2 = a<<1
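Each of the three rewrites preserves the value of the expression, which a few before/after pairs can confirm. The functions below are illustrative only; unsigned arithmetic is used so that the shift and the factoring stay well defined for every input.

```c
/* Constant folding: the sum is computed at compile time. */
unsigned fold_before(void) { return 8 + 1; }
unsigned fold_after(void)  { return 9; }

/* Algebraic simplification: two multiplies become one. */
unsigned algebraic_before(unsigned a, unsigned b, unsigned c) {
    return a * b + a * c;
}
unsigned algebraic_after(unsigned a, unsigned b, unsigned c) {
    return a * (b + c);
}

/* Strength reduction: a multiply by 2 becomes a shift by 1. */
unsigned strength_before(unsigned a) { return a * 2; }
unsigned strength_after(unsigned a)  { return a << 1; }
```

A compiler applies these rewrites only when it can show they do not change the result, which is easier for integer types than for floating point.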


Dead code elimination

Dead code:

#define DEBUG 0

if (DEBUG) dbg(p1);

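Can be eliminated by analysis of control flow, constant folding.

[Figure: control flow for the test on DEBUG; the branch that calls dbg(p1) is never taken because the condition is the constant 0.]

Code that will never be executed can be safely removed from the program.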


Procedure inlining

The C++ programming language provides an inline construct that tells the compiler to generate inline code for a function.

int foo(int a, int b, int c) { return a + b - c; }

z = foo(w,x,y);

becomes, after inlining:

z = w + x - y;

An inlined procedure does not have a separate procedure body and procedure linkage; it is generated in expanded form whenever possible.

Inlining eliminates the procedure linkage instructions. However, when a cache is present, having multiple copies of the function body may actually slow down the fetches of these instructions.

Inlining also increases code size, and memory may be precious.
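The effect of inlining can be mimicked by hand. In the sketch below, foo has the body a + b - c, so expanding the call foo(w, x, y) gives w + x - y. The static inline spelling is the usual C99 form of the hint (the compiler may still ignore it), and the wrapper function names are hypothetical.

```c
/* A small function marked as a candidate for inlining. */
static inline int foo(int a, int b, int c) {
    return a + b - c;
}

/* Calls foo; if the compiler inlines the call, no linkage code remains. */
int with_call(int w, int x, int y) {
    return foo(w, x, y);
}

/* The same computation with the call expanded by hand. */
int expanded(int w, int x, int y) {
    return w + x - y;
}
```

Comparing the assembly output for with_call at different optimization levels is a quick way to see whether the compiler actually performed the expansion.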


Loop transformations

Goals:

reduce loop overhead;

increase opportunities for pipelining;

improve memory system performance.


Loop unrolling

Reduces loop overhead, enables some other optimizations.

for (i=0; i<4; i++)

a[i] = b[i] * c[i];

for (i=0; i<2; i++) {

a[i*2] = b[i*2] * c[i*2];

a[i*2+1] = b[i*2+1] * c[i*2+1];

}

Unrolling replicates the code inside the loop body a number of times.
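The rolled and unrolled loops can be wrapped in functions and compared. This is a sketch with an illustrative array length of 4, matching the loop bounds on the slide; the function names are made up.

```c
#define LEN 4   /* array length used in the slide's loop */

/* Original loop: four iterations, four loop tests. */
void mul_rolled(int a[LEN], const int b[LEN], const int c[LEN]) {
    for (int i = 0; i < LEN; i++)
        a[i] = b[i] * c[i];
}

/* Unrolled by two: two iterations, each doing two elements. */
void mul_unrolled(int a[LEN], const int b[LEN], const int c[LEN]) {
    for (int i = 0; i < LEN / 2; i++) {
        a[i*2]     = b[i*2]     * c[i*2];
        a[i*2 + 1] = b[i*2 + 1] * c[i*2 + 1];
    }
}
```

The unrolled version halves the number of loop tests and branches, at the cost of a larger loop body; it is only this simple because LEN is a multiple of the unrolling factor.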


Loop fusion and distribution

Fusion combines two loops into one:

for (i=0; i<N; i++) a[i] = b[i] * 5;
for (j=0; j<N; j++) w[j] = c[j] * d[j];

becomes

for (i=0; i<N; i++) {
    a[i] = b[i] * 5; w[i] = c[i] * d[i];
}

Distribution breaks one loop into two.

Changes optimizations within loop body.
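Fusion is legal here because the two loops run over the same range and neither reads what the other writes. A sketch that checks the separate and fused forms against each other follows; the function names and the explicit length parameter n are assumptions made for the example.

```c
/* Two separate loops, as before fusion. */
void separate_loops(int n, int a[], const int b[],
                    int w[], const int c[], const int d[]) {
    for (int i = 0; i < n; i++) a[i] = b[i] * 5;
    for (int j = 0; j < n; j++) w[j] = c[j] * d[j];
}

/* One fused loop: one index and one loop test per iteration. */
void fused_loop(int n, int a[], const int b[],
                int w[], const int c[], const int d[]) {
    for (int i = 0; i < n; i++) {
        a[i] = b[i] * 5;
        w[i] = c[i] * d[i];
    }
}
```

Distribution is the reverse transformation: starting from fused_loop and splitting it back into the two loops of separate_loops.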


Loop tiling

Breaks one loop into a nest of loops.

Changes order of accesses within array.

Changes cache behavior.


Loop tiling example

Original loop nest:

for (i=0; i<N; i++)
    for (j=0; j<N; j++)
        c[i] = a[i][j]*b[i];

Tiled with 2 x 2 tiles:

for (i=0; i<N; i+=2)
    for (j=0; j<N; j+=2)
        for (ii=i; ii<min(i+2,N); ii++)
            for (jj=j; jj<min(j+2,N); jj++)
                c[ii] = a[ii][jj]*b[ii];
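A runnable version of the tiling transformation is sketched below. It assumes that the inner loops start at the tile origin (ii from i, jj from j) and stop at the tile edge or the array bound, and it writes the array access in C syntax as a[i][j]. The array size of 5 is chosen so that the last tile is partial; min_int and the function names are hypothetical.

```c
#define N 5      /* odd on purpose, so the last tile is partial */
#define TILE 2   /* tile edge length */

static int min_int(int x, int y) { return x < y ? x : y; }

/* Original loop nest: walks each row of a from left to right. */
void untiled(int c[N], int a[N][N], int b[N]) {
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            c[i] = a[i][j] * b[i];
}

/* Tiled loop nest: visits a in TILE x TILE blocks. */
void tiled(int c[N], int a[N][N], int b[N]) {
    for (int i = 0; i < N; i += TILE)
        for (int j = 0; j < N; j += TILE)
            for (int ii = i; ii < min_int(i + TILE, N); ii++)
                for (int jj = j; jj < min_int(j + TILE, N); jj++)
                    c[ii] = a[ii][jj] * b[ii];
}
```

Both versions perform the same assignments, only in a different order, so the final contents of c are identical; what changes is the sequence of addresses touched in a, and therefore the cache behavior.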


Register allocation

Goals:

choose register to hold each variable;

determine lifespan of variable in the register.

Basic case: within basic block.


Register lifetime graph

w = a + b;

x = c + w;

y = c + d;

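[Figure: register lifetime graph. The time axis is marked with steps 1, 2, 3 (t=1, t=2, t=3), one per statement; one bar per variable a, b, c, d, w, x, y shows the steps during which its value must be held.]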


Instruction scheduling

Non-pipelined machines do not need instruction scheduling: any order of instructions that satisfies data dependencies runs equally fast.

In pipelined machines, execution time of one instruction depends on the nearby instructions: opcode, operands.


Software pipelining

Schedules instructions across loop iterations.

Reduces instruction latency in iteration i by inserting instructions from iteration i+1.


Instruction selection

May be several ways to implement an operation or sequence of operations.

Represent operations as graphs, match possible instruction sequences onto graph.

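[Figure: an expression graph in which a * node feeds a + node, and the instruction templates that can cover it: separate MUL and ADD templates, or a single MADD template that matches both nodes.]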


Using your compiler

Understand various optimization levels (-O1, -O2, etc.)

Look at mixed compiler/assembler output.

Modifying compiler output requires care:

correctness;

loss of hand-tweaked code.


Interpreters and Just In Time (JIT) compilers

Interpreter: translates and executes program statements on-the-fly.

JIT compiler: between an interpreter and a stand-alone compiler. Compiles small sections of code into instructions during program execution.

Eliminates some translation overhead.

Often requires more memory.
