Lecture 6

November 19, 2003 Sam Siewert

Service System Integration and IO

ECEN 4623/5623

Lecture 6


Review of Service Response Timeline

[Figure: service response timeline along a Time axis. Markers in order: Event Sensed, Interrupt, Dispatch, Preemption, Dispatch, Completion (IO Queued), Actuation (IO Completion). Segments in order: Input Latency, Dispatch Latency, Execution, Interference, Execution, Output Latency.]

Response Time = Time_Actuation – Time_Sensed

(From Event to Response)
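The timeline can also be read as a sum of its segments. The following is a minimal sketch, assuming the segments do not overlap in time; the function name and the microsecond values are illustrative and not taken from the lecture.

```python
def response_time(input_latency, dispatch_latency, execution,
                  interference, output_latency):
    """Time from event sensed to actuation, as the sum of timeline segments."""
    return (input_latency + dispatch_latency + execution
            + interference + output_latency)

# Hypothetical segment durations in microseconds.
rt = response_time(input_latency=50, dispatch_latency=20, execution=400,
                   interference=150, output_latency=30)
print(rt)  # 650
```

Execution here is the total over all execution segments of one release; interference is the time spent preempted.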


Services and IO

Initial Input IO from Sensors
– Low-Rate Input: Byte or Word Input from Memory-Mapped Registers or IO Ports
– High-Rate Input: Block Input from DMA Channels and Block Reads from FIFOs

Final Response IO to Actuators
– Low-Rate Output: Byte or Word Writes Posted to Output FIFO or Bus Interface Queue
– High-Rate Output: Block Output on DMA Channels and Block Writes

Intermediate IO
– During Service Execution, Memory Mapped Register IO
– External Memory IO
– Cache Loads and Write-Backs
– In Memory Input/Output from Service to Service


Service Integration Concepts

Initial Sensor Input and Final Actuator Response Output
– Device Interface Drivers

Simple Service Has No Intermediate IO, No Synchronization

[Figure: single-service control loop. Blocks: Environment, Sensors, Sensor Electronics, Sensor Input Interface, Digital Control Law Service, Actuator Output Interface, Actuator Electronics, Effectors.]


Complex Multi-Service Systems

Multiple Services
Synchronization Between Services
Communication Between Services
Multiple Sensor Input and Actuator Output Interfaces
Intermediate IO, Shared Memory, Messaging


Multi-Service Pipelines

[Figure: multi-service pipeline with a SW/HW boundary, a Control Plane and a Data Plane, and priority levels P0 to P3. Hardware: Bt878 PCI NTSC decoder (30 Hz), Serial A/B, ISA ethernet. Services: tBtvid (30 Hz), tFrmDisp (10 Hz), tFrmLnk (2 Hz), tNet (2 Hz), tOpnav (15 Hz), tTlmLnk (5 Hz), tThrustCtl (15 Hz), tCamCtl (3 Hz). Data: RGB32 buffer, grayscale buffer, frame, TCP/IP pkts. Other labels: RT Management, frameRdy interrupt, Event releases / Service control, Remote Display.]


Pipelined Architecture Review

Recall that Pipeline Yields CPI of 1 or Less

Instruction Completed Each CPU Clock

Unless Pipeline Stalls!

[Figure: pipeline stages IF, ID, Execute, Write-Back.]







Service Execution

Efficiency of Execution

Memory Access for Inter-Service Communication

Intermediate IO

Ideally Lock Data and Code into Cache or Use Tightly Coupled Memory

WCET = [(CPI_best-case x Longest_Path_Inst_Count) + Stall_Cycles] x Clk_Period
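As a worked illustration, read the WCET relation above as best-case CPI times the longest-path instruction count, plus stall cycles, all multiplied by the clock period. The sketch below evaluates it for made-up numbers; the function name, parameter names, and values are assumptions, not figures from the lecture.

```python
def wcet_ns(cpi_best_case, longest_path_inst_count, stall_cycles, clk_period_ns):
    """Worst-case execution time in nanoseconds for one service release."""
    cycles = cpi_best_case * longest_path_inst_count + stall_cycles
    return cycles * clk_period_ns

# Hypothetical service: 10,000-instruction longest path, CPI of 1.0,
# 2,000 stall cycles, 10 ns clock period (100 MHz).
print(wcet_ns(1.0, 10_000, 2_000, 10.0))  # 120000.0
```

With these numbers the stall cycles add 20% to the stall-free execution time, which is why the slide suggests locking data and code into cache or using tightly coupled memory.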


Service Efficiency

Path Length for a Release
– Instruction Count
– Longest Path Given Algorithm
    Branches
    Loop Iterations
    Data Driven?

Path Execution
– Number of Cycles to Complete Path
– Clocks Per Instruction
– Number of Stall Cycles for Pipeline
    Data Dependencies (Intermediate IO)
    Cache Misses (Intermediate Memory Access)
    Branch Mis-predictions (Small Penalty)


Hiding Intermediate IO Latency
(Overlapping CPU and IO)

ICT = Instruction Count Time (Time to execute a block of instructions with no stalls = CPU Cycles x CPU Clock Period)
IOT = Bus Interface IO Time (Bus IO Cycles x Bus Clock Period)
OR = Overlap Required – percentage of CPU cycles that must be concurrent with IO cycles
NOA = Non-Overlap Allowable for Si to meet Di – percentage of CPU cycles that can be in addition to IO cycle time without missing service deadline
Di = Deadline for Service Si relative to release (interrupt initiating execution)
CPI = Clocks Per Instruction for a block of instructions


Overlap

Di >= IOT is required; otherwise if Di < IOT, Si is IO-Bound
Di >= ICT is required; otherwise if Di < ICT, Si is CPU-Bound
Di >= (IOT + ICT) requires no overlap of IOT with ICT
if Di < (IOT + ICT) where (Di >= IOT and Di >= ICT), overlap of IOT with ICT is required
if Di < (IOT + ICT) where (Di < IOT or Di < ICT), deadline Di can’t be met regardless of overlap
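The conditions above amount to a small decision procedure. The following is a sketch under the assumption that Di, ICT, and IOT are all given in the same time unit; the function name and the returned strings are illustrative.

```python
def classify_overlap(di, ict, iot):
    """Classify a service by its deadline Di, ICT, and IOT (same time unit)."""
    if di < iot or di < ict:
        # The deadline cannot be met regardless of overlap.
        bounds = []
        if di < iot:
            bounds.append("IO-bound")
        if di < ict:
            bounds.append("CPU-bound")
        return "infeasible (" + ", ".join(bounds) + ")"
    if di >= iot + ict:
        return "no overlap required"
    return "overlap required"

# Hypothetical values in nanoseconds, with Di = 5000.
print(classify_overlap(5000, 2000, 2500))  # no overlap required
print(classify_overlap(5000, 3000, 2500))  # overlap required
print(classify_overlap(5000, 3000, 5500))  # infeasible (IO-bound)
```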


Service Execution Efficiency

CPIworst-case = (ICT + IOT) / ICT

CPIbest-case = (max(ICT, IOT)) / ICT

CPIrequired = Di / ICT

OR = 1 - [(Di - IOT) / ICT]

CPIrequired = [ICT(1-OR) + IOT] / ICT

NOA = (Di – IOT) / ICT

OR + NOA = 1 (by definition)
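To see how these expressions fit together, the sketch below evaluates them for one hypothetical service and checks that the two forms of CPIrequired agree. The function name, dictionary keys, and input values are assumptions chosen for illustration; all times are in nanoseconds.

```python
def efficiency(di, ict, iot):
    """Evaluate the slide's efficiency metrics; all times in the same unit."""
    noa = (di - iot) / ict          # Non-Overlap Allowable
    overlap_required = 1 - noa      # OR = 1 - [(Di - IOT) / ICT]
    return {
        "cpi_worst": (ict + iot) / ict,
        "cpi_best": max(ict, iot) / ict,
        "cpi_required": di / ict,
        "or": overlap_required,
        "noa": noa,
    }

m = efficiency(di=5000, ict=4000, iot=2000)
print(m["cpi_worst"], m["cpi_best"], m["cpi_required"])  # 1.5 1.0 1.25
print(m["or"], m["noa"])                                 # 0.25 0.75

# Second form of CPIrequired: [ICT(1-OR) + IOT] / ICT
print((4000 * (1 - m["or"]) + 2000) / 4000)              # 1.25
```

In this example 25% of the CPU cycles must overlap with IO for the service to meet its 5000 ns deadline.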


Overlap Required for ICT and IOT Given Di = 5000 nsecs

[Figure: surface plot of Overlap Required (OR) as a function of Instruction Count Time (ICT, nanosecs) and CPU Core IO Time (IOT, nanosecs). Both time axes run from 100 to 5700 nsecs; the OR axis runs from -0.2 to 1.8, with the legend in bands of 0.1. Legend title: "OR Function(ICT,IOT) for IOT and/or ICT > CT".]


Synchronization and Message Passing

Message Queues
– Provide Communication
– Provide Synchronization

Traditional Message Queue
– Message Size
– Internal Fragmentation

Heap Message Queue
– Messages Are Pointers
– Message Data in Heap

[Figure: two message-passing schemes. Traditional: S1 sends Msg through a queue to S2. Heap: S1 sends Ptr through a queue to S2, with Msg held in a Heap Buffer Pool. Labels: Allocation Of Message Heap, De-Allocation Of Message Heap.]
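The heap message queue idea can be sketched as follows. This is an illustration only, not the lecture's implementation: Python has no raw pointers, so a buffer index stands in for the pointer, and it is an assumption here that the sender allocates and the receiver de-allocates. The names BufferPool, s1_send, and s2_receive are hypothetical.

```python
import queue

class BufferPool:
    """Fixed set of preallocated buffers; a buffer index stands in for a pointer."""
    def __init__(self, count, size):
        self.size = size
        self.buffers = [bytearray(size) for _ in range(count)]
        self.free = queue.Queue()
        for i in range(count):
            self.free.put(i)

    def alloc(self):
        # Raises queue.Empty if the pool is exhausted.
        return self.free.get_nowait()

    def release(self, index):
        self.free.put(index)

pool = BufferPool(count=4, size=64)
mq = queue.Queue(maxsize=4)  # carries (index, length) pairs, not message data

def s1_send(payload):
    """Sender: allocate a buffer, fill it, and enqueue only its index."""
    if len(payload) > pool.size:
        raise ValueError("payload larger than pool buffer")
    idx = pool.alloc()
    pool.buffers[idx][:len(payload)] = payload
    mq.put((idx, len(payload)))

def s2_receive():
    """Receiver: dequeue the index, copy the data out, and free the buffer."""
    idx, n = mq.get()
    data = bytes(pool.buffers[idx][:n])
    pool.release(idx)
    return data

s1_send(b"frame 1")
print(s2_receive())  # b'frame 1'
```

Because the queue entries have a small fixed size, the queue itself avoids the internal fragmentation of a traditional queue sized for the largest message; the blocking `mq.get()` provides the synchronization between S1 and S2.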
