UAV Data Link Design for Dependable Real-Time Communications
Avionics 2009, Amsterdam
Gerardo Pardo-Castellote, Ph.D.
Chief Technology Officer
Real-Time Innovations, Inc.
Outline
UAV Communication Requirements
Why TCP-based solutions do not work
Implementing your own data-link protocol
Using middleware (DDS) for the Data-Link
Conclusions
UAVs part of larger integrated network
Vehicle LAN
Data Link
Ground Station LAN
Avionics
Net Centric GIG
Tactical Backbone
Real-Time
Ground Station
Backend WAN
Characteristics of UAV Communications
In-Vehicle comm.
Data Link comm.
Ground Station comm.
Net-Centric Backbone comm.
Inside Vehicle Communications
Deeply-embedded, low-power
– Limited CPU speed
– General Purpose Processors and FPGAs
Memory constrained devices
– Limited RAM
– Flash filesystem or none
Dedicated IPC transports
– Back plane
Certification requirements
– DO-178B Operating Systems
Challenging environment to operate in!
Data-Link Communications
Multiple traffic types:
– Sensor data streams
– Command & Control data
– Status, Intelligence, Mission, Supervisory
Different traffic requirements for each type:
– Urgency, Priority, Reliability, Volume
– Stealth operations
Challenging communications channel:
– Large latency, low throughput channel
– Lossy links
– Disconnections
– Asymmetric bandwidth (downlink vs uplink)
Data Link Types & Requirements
High Reliability (HRDL)
– Use: C2 data, status data (position, attitude)
– Reqs: moderate throughput, high availability, high integrity
High Capacity (HCDL)
– Use: sensor data
– Reqs: high throughput (streaming data)
Beyond Line of Sight
– Use: relay transfer via High Altitude Platform or UAV
– Reqs: aggregate performance for HRDL + HCDL
Back Up
– Use: C2 and status data transfer in emergency
– Reqs: low throughput, high reliability, high integrity
Ground Station Communications
Modularity
Multiple vendors
Multiple languages
Evolution
Failover
Handover
Redundancy
Ground Station Characteristics
Heterogeneous system
– RT storage & processing of sensor data
– Integration with display
– Integration with C2 / supervision systems
– Integration with net-centric back end
Multiple programming languages: C/C++/Java/.NET
Multi-platform: Linux/Windows/Embedded
Modular, reconfigurable
Varying assignments of Ground-Station to UAV
Ground Station Requirements
Be able to handle and adapt to a variety of:
– CPUs / computer platforms
– Traffic flows
– Programming languages
– Operating systems
Provide a modular framework
– Support reconfiguration
– Support evolution, extensibility
– Support SOA tenets
Support operational use-cases
– Link fallback
– Multi-station handoff
Net-Centric Communications
Multiple vendors
Highly heterogeneous:
– Computer platforms
– Operating systems
– Programming languages
Integration with disparate technologies
– Databases
– REST/HTTP, Web-Services
– ESBs
– Middlewares: DDS, JMS, CORBA, MOMs
Multiple architectural concerns
– SOA
– Event-Driven
– Publish-Subscribe
Security
Auditing, recording requirements
Cross Domain Solution requirements
TCP-based solutions do not work for the Data Link
TCP has fundamental problems in the Data-Link…
– Un-tunable timers & congestion control algorithm
– Bad behavior on lossy networks & networks with dropouts
– Bad behavior on large latency links
The consequences are:
– Protocol problems
  • Head-of-line blocking
  • Brittle connection-oriented model
  • Byte-oriented. Lacks prioritization
  • Inflexible reliability model. Not stealthy
– Performance problems
  • Slow connect
  • Low link utilization
TCP problem: Head-of-line blocking
TCP funnels all traffic over a single reliable stream
A byte cannot be delivered until all previous bytes have been received
– A lost packet will “block” all future traffic until that packet is repaired
– A large message will “block” all future traffic until it is completely delivered
– IMAGE: Broken bicycle blocking race car
– IMAGE: Large tractor blocking race car
TCP’s “stream-oriented” reliability model is not suitable for the Data Link
TCP issues: Brittle connections
TCP relies on hard-coded timers to establish connections
– SYN messages must be answered before the timeout
– SYN timer is 3 secs with doubling exponential backoff: 3s, 6s, 12s, …
– Implementations give up after a fixed number of attempts
– Large latencies (> 60 sec) cause every TCP connection attempt to fail
– Some TCP implementations fail sooner: e.g. 9 sec for Windows
TCP is bad at detecting disconnections
– To detect connection liveliness the application must use the KEEP_ALIVE option; the system-wide timeout defaults to 2 hours
– Common solution is periodic application messaging; detection time is non-deterministic, on the order of minutes
TCP connection failure is drastic
– All state is lost
– No knowledge of which messages were delivered
– Applications must do their own message framing, sequence numbering and acknowledgment to be able to continue upon re-connect
TCP issue: Low bandwidth use
‘Perfect storm’ for the TCP protocol:
– TCP slow start: ramp-up time ~ RTT*log(BW), where RTT is the round-trip time (2 x latency) and BW is the bandwidth
– Insufficient TCP buffer size given large RTT: to utilize a given BW, TCP needs buffer size ~ BW*RTT. For 10 Mbps and 500 msec, buffer size ~ 625 KB! Typical available/configured TCP buffer sizes are far smaller!
– TCP congestion-control algorithm misinterprets packet loss as a sign of congestion
End result: long ramp-up times, low and/or unstable bandwidth use
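The buffer-size and ramp-up estimates above can be checked with a few lines of arithmetic. This is a minimal sketch, not a TCP model: the function names, the 1460-byte segment size and the one-segment initial window are assumptions chosen for illustration.

```python
import math

def bdp_bytes(bandwidth_bps: float, rtt_s: float) -> float:
    """Bandwidth-delay product: bytes that must be in flight to fill the link."""
    return bandwidth_bps * rtt_s / 8

def slow_start_rounds(bandwidth_bps: float, rtt_s: float,
                      mss_bytes: int = 1460, initial_segments: int = 1) -> int:
    """Round trips needed for a window that doubles every RTT to reach the BDP.

    mss_bytes and initial_segments are illustrative assumptions.
    """
    target_segments = bdp_bytes(bandwidth_bps, rtt_s) / mss_bytes
    if target_segments <= initial_segments:
        return 0
    return math.ceil(math.log2(target_segments / initial_segments))

# The two link examples used in the slides: (bandwidth in bit/s, RTT in s).
for bw, rtt in [(10e6, 0.5), (1.54e6, 0.8)]:
    rounds = slow_start_rounds(bw, rtt)
    print(f"{bw / 1e6:.2f} Mbps, RTT {rtt} s: "
          f"window {bdp_bytes(bw, rtt) / 1000:.0f} KB, "
          f"about {rounds} round trips ({rounds * rtt:.1f} s) to ramp up")
```

Both required windows are well above the 64 KB that an unscaled TCP window can express.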
Details: Insufficient buffersize
TCP flow control is based on a send “window size”
– The send window determines how much data can be outstanding (i.e., unacknowledged) in the network. Long-delay networks require large send windows to hold a large amount of “in flight” data without blocking the sender
– DataInTransit ~ bandwidth x delay
Operating systems limit/hardcode the window size
– TCP standard limits window to 64 KB (in practice 32 KB due to signed arithmetic)
– Required windows are much larger: RTT 0.8 s, BW 1.54 Mbps requires 154 KB
New “large-window” TCP extension (TCP-LW) allows windows up to 2^30 bytes (1 GB)
– But that makes the slow-start problem bigger…
Details: Congestion control
TCP congestion avoidance is bad for lossy or long latency links:
– Mistakenly interprets packet loss as congestion
– Excessively long ramp-up for new connections
RED (random early detection) gateways require each gateway to monitor its own queue length. When imminent congestion is detected, the TCP sender is notified: by dropping a packet earlier than it normally would, RED sends an implicit notification of congestion, and the sender is effectively notified by the timeout of this packet. The principle behind RED is that a few earlier-than-usual drops may help avoid more packet drops later on. The TCP sender can then reduce its window before serious congestion occurs.
In TCP Vegas the TCP sender predicts when congestion is about to occur and reduces its transmission window before intermediate routers drop packets:
– TCP can keep track of the minimum round trip time seen during a transfer and use the most recently observed round trip time to compute the data queued in the network.
– TCP can also keep track of the throughput before and after the congestion window changes to estimate the network congestion level.
– If estimates indicate that the number of packets queued in the network is rising, it reduces the congestion window. As it observes the number decreasing, it increases the congestion window.
Although neither approach has been widely adopted, both hold promise for satellite networks. As mentioned earlier, TCP congestion control responds to congestion slowly because of latency. If such congestion can be avoided before it happens, it is a big win for high-speed and long-delay networks.
TCP issue: reliability & congestion control
TCP acknowledgment is non-selective & blunt
– If a segment is lost, TCP will retransmit all data starting from the lost segment, without regard to the successful transmission of later segments
TCP congestion control is fooled by lossy networks
– TCP treats the lost segment as an indication of congestion and cuts its window size in half
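The effect of halving the window on every loss can be illustrated with a toy additive-increase / multiplicative-decrease trace. This is a deliberately simplified sketch: it ignores slow start, timeouts and fast recovery, and the function name, the 64-segment cap and the loss pattern are assumptions for illustration only.

```python
def aimd_window_trace(loss_rounds, rounds, max_window=64):
    """Congestion window (in segments) at the start of each round trip.

    The window grows by one segment per loss-free round and is halved
    after any round listed in loss_rounds (toy model, no slow start).
    """
    window = 1.0
    trace = []
    for r in range(rounds):
        trace.append(window)
        if r in loss_rounds:
            window = max(1.0, window / 2)
        else:
            window = min(max_window, window + 1)
    return trace

# A link that corrupts one packet every 10 round trips, with no real congestion.
lossy = aimd_window_trace(set(range(9, 200, 10)), 200)
clean = aimd_window_trace(set(), 200)
print(f"average window, lossy link: {sum(lossy) / len(lossy):.1f} segments")
print(f"average window, clean link: {sum(clean) / len(clean):.1f} segments")
```

On a link where losses come from the radio channel rather than from queues, the sender keeps backing off even though capacity is available.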
TCP issue: chatty reliability protocol
TCP reliability requires constant ACKs from the receiver
– Even if all messages are received…
ACK traffic consumes power and bandwidth
ACK traffic prevents stealth operations (it can reveal the position of the ACKer)
Other protocols (best effort or NACK-only) may be better suited…
Summary TCP protocol problems
TCP is inflexible
TCP protocol not well suited for Data Link
– Low performance
– Incorrect behavior
NASA and others have tried to spearhead efforts to modify TCP…
– Research on “delay tolerant” networks
– Research on TCP: HACK, SACK, Trunk protocols
These efforts remain in the research domain
TCP “one size fits all” QoS is not suitable for the Data Link
Implementing your own data-link protocol
Session management
Data stream management
Buffering
Traffic prioritization/shaping
Fragmentation / reassembly
Reliability
Redundant links/failover
General Architecture
To solve the reliability, flow control, and disconnection issues we need:
– Data buffers at both ends
– A reliable comm. protocol that sends the data from the send buffer to the receive buffer
[Diagram: Sender Application, Send Buffer, Reliability Protocol, Receive Buffer, Receiver Application]
General Architecture (2)
To avoid head-of-line blocking we need
– Separate buffers for each traffic type
– Separate reliable data streams for each traffic type, each with its own separate session
[Diagram: Sender Application and Receiver Application connected through per-traffic-type Send Buffers and Receive Buffers; each traffic type has its own session]
Reliable Protocol
At a minimum the reliability protocol must
– Identify each message with a sessionId and a sequence number
– Send periodic HeartBeats announcing which sequence numbers should have been received
– Accept ACKs to record the delivered messages and clear them from the send buffer
– Accept NACKs for sequence numbers and send the requested repairs
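The four minimum requirements above map naturally onto a small sender-side state machine. The following is a hedged sketch rather than any real wire protocol: the class name, the tuple message layout and the method names are all hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class ReliableSender:
    """Sender half of a minimal reliability protocol (illustrative only)."""
    session_id: int
    next_seq: int = 1
    send_buffer: dict = field(default_factory=dict)  # seq -> payload

    def write(self, payload: bytes) -> tuple:
        """Tag the payload with (session_id, seq) and keep it for repairs."""
        seq = self.next_seq
        self.next_seq += 1
        self.send_buffer[seq] = payload
        return ("DATA", self.session_id, seq, payload)

    def heartbeat(self) -> tuple:
        """Announce the range of sequence numbers the receiver should hold."""
        first = min(self.send_buffer, default=self.next_seq)
        return ("HB", self.session_id, first, self.next_seq - 1)

    def on_ack(self, up_to_seq: int) -> None:
        """Everything up to up_to_seq was delivered: clear it from the buffer."""
        for seq in [s for s in self.send_buffer if s <= up_to_seq]:
            del self.send_buffer[seq]

    def on_nack(self, missing: list) -> list:
        """Return repair messages for the requested sequence numbers."""
        return [("DATA", self.session_id, s, self.send_buffer[s])
                for s in missing if s in self.send_buffer]

sender = ReliableSender(session_id=7)
for chunk in (b"a", b"b", b"c", b"d"):
    sender.write(chunk)
print(sender.heartbeat())   # sequence numbers 1..4 are outstanding
sender.on_ack(2)            # receiver confirmed 1 and 2
print(sender.on_nack([3]))  # receiver asked for a repair of 3
```

Because the send buffer is keyed by sequence number and owned by the session, it can outlive a link dropout, which a TCP socket buffer cannot.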
Company Confidential
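Confirmed Reliability (TCP style): no packet loss
[Sequence diagram: the sender writes 01-04 and sends them with a HeartBeat (HB); the receiver gets 01-04 and replies ACK 1-4. The sender then sends 05-08 with HB; the receiver gets 05-08 and replies ACK 1-8.]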
Confirmed Reliability (TCP style): some packet loss
[Sequence diagram: the sender sends 01-04 with HB; 03 is lost (X), so the receiver delivers only 01 and 02 and replies ACK 1-2, NACK 3. The sender resends 03, 04 and 05, then sends 06-08 with HB; the receiver replies ACK 1-8.]
Packets 04 and 05 are received, but the protocol drops them because the prior packet 03 is missing. This wastes valuable bandwidth.
Reliable Protocol (II)
For performance the protocol should
– Accept received messages out of order and cache them in the receiver buffer while the missing messages are repaired
– Send selective NACKs (SACKs) for just the sequence numbers that are missed
To handle large sensor data (e.g. images)
– Fragment & re-assemble large messages
– Handle reliability on message fragments as well
To handle small updates
– Bundle small updates into batches
– Flush batches based on max delay or packet size
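The receiver-side cache and selective repair request described on this slide can be sketched as follows. It is an illustrative assumption of how such a reader might be structured, not the behavior of any particular product; the class and method names are hypothetical.

```python
class ReorderingReceiver:
    """Caches out-of-order messages and asks only for the ones missing."""

    def __init__(self):
        self.next_expected = 1
        self.cache = {}       # seq -> payload, received but not yet deliverable
        self.delivered = []   # payloads handed to the application, in order

    def on_data(self, seq, payload):
        if seq < self.next_expected or seq in self.cache:
            return  # duplicate, ignore
        self.cache[seq] = payload
        # Deliver every message that is now contiguous with what was delivered.
        while self.next_expected in self.cache:
            self.delivered.append(self.cache.pop(self.next_expected))
            self.next_expected += 1

    def on_heartbeat(self, last_seq):
        """Return (ack_up_to, missing): a cumulative ACK plus selective NACKs."""
        missing = [s for s in range(self.next_expected, last_seq + 1)
                   if s not in self.cache]
        return self.next_expected - 1, missing

receiver = ReorderingReceiver()
for seq in (1, 2, 4):                 # packet 3 is lost on the link
    receiver.on_data(seq, f"m{seq}")
print(receiver.on_heartbeat(4))       # ACK up to 2, request only 3
receiver.on_data(5, "m5")
receiver.on_data(3, "m3")             # the repair arrives
print(receiver.delivered)             # all five messages, in order
```

Only the lost packet crosses the link twice; packets 4 and 5 are kept in the cache instead of being discarded and resent.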
Confirmed Reliability (Reader Cache + SACK): no packet loss
[Sequence diagram: the sender sends 01-04 with HB; the receiver gets 01-04 and replies ACK 1-4. The sender then sends 05-08 with HB; the receiver gets 05-08 and replies ACK 1-8.]
Confirmed Reliability (Reader Cache + SACK): some packet loss
[Sequence diagram: the sender sends 01-04 with HB; 03 is lost (X), the receiver gets 01, 02 and 04 and replies ACK 1-2, SACK 3. The sender sends 05 and repairs only 03, then sends 06-08 with HB; the receiver replies ACK 1-8.]
Packets 04 and 05 are received and cached, waiting for the repair of 03.
No bandwidth is wasted.
Reliable Protocol (III)
For performance on a wide variety of links the protocol must
– Allow configuration of timers and buffer sizes
– Maintain liveliness of the link via KeepAlive messages
– Allow sessions and buffers to survive link disconnection
– Perform output shaping with rate limits
– Support prioritization between sessions/traffic types
– Support differential shaping for each traffic type
– …
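Output shaping and prioritization are often combined as a token bucket in front of a priority queue. The sketch below assumes strict priority (lower number is more urgent) and a single shared rate limit; the class name and parameters are hypothetical, and messages larger than the burst size are assumed to have been fragmented beforehand.

```python
import heapq
import itertools

class ShapedLink:
    """Token-bucket rate limiter feeding from a strict-priority queue."""

    def __init__(self, rate_bytes_per_s: float, burst_bytes: float):
        self.rate = rate_bytes_per_s
        self.burst = burst_bytes
        self.tokens = burst_bytes
        self.last = 0.0
        self.queue = []
        self.order = itertools.count()  # keeps FIFO order within a priority

    def enqueue(self, priority: int, payload: bytes) -> None:
        heapq.heappush(self.queue, (priority, next(self.order), payload))

    def drain(self, now: float) -> list:
        """Send as many queued messages as the accumulated tokens allow."""
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        sent = []
        while self.queue and len(self.queue[0][2]) <= self.tokens:
            _, _, payload = heapq.heappop(self.queue)
            self.tokens -= len(payload)
            sent.append(payload)
        return sent

link = ShapedLink(rate_bytes_per_s=100, burst_bytes=100)
link.enqueue(2, b"s" * 80)   # sensor data, low priority, queued first
link.enqueue(0, b"c" * 40)   # command & control, high priority
print([len(p) for p in link.drain(now=0.0)])  # C2 goes first, sensor data waits
print([len(p) for p in link.drain(now=0.5)])  # sensor data goes once tokens refill
```

Differential shaping per traffic type would use one bucket per session instead of the single shared bucket shown here.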
Redundancy and Failover
Data-Link may deliver duplicate packets
Data might arrive from redundant transports
Failover requires multiple sources of the same information
How does the protocol identify/filter these duplicates?
– Needs a VirtualSessionId identifying the session independent of data-link or source
– Reader queue must be 2-level. The second level, organized by VirtualSessionId, filters out duplicates
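A second-level filter keyed by VirtualSessionId can be sketched in a few lines. The data structure below is one possible assumption (a contiguous high-water mark plus a set of sequence numbers seen ahead of it); the class and method names are hypothetical.

```python
class DuplicateFilter:
    """Drops messages already seen for a virtual session, whatever link carried them."""

    def __init__(self):
        self.contiguous = {}  # virtual_session_id -> highest seq with no gaps below it
        self.ahead = {}       # virtual_session_id -> seqs seen above that mark

    def accept(self, virtual_session_id, seq: int) -> bool:
        top = self.contiguous.get(virtual_session_id, 0)
        ahead = self.ahead.setdefault(virtual_session_id, set())
        if seq <= top or seq in ahead:
            return False  # duplicate from a redundant link or source
        ahead.add(seq)
        while top + 1 in ahead:   # advance the mark over any filled gap
            top += 1
            ahead.remove(top)
        self.contiguous[virtual_session_id] = top
        return True

dedup = DuplicateFilter()
# The same stream arrives over two redundant links, slightly out of step.
arrivals = [("uav1", 1), ("uav1", 1), ("uav1", 3), ("uav1", 2), ("uav1", 3), ("uav2", 1)]
print([dedup.accept(vsid, seq) for vsid, seq in arrivals])
```

Because the key is the virtual session and not the physical link, the same filter covers link duplicates, redundant transports and failover to a backup source.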
Stealth
Reliability should be tunable:
– Best-effort mode. No ACK traffic
  • Sacrifices reliability
  • Still ensures order & no duplicates
– NACK-only mode. Limits backwards traffic
  • But requires smarter buffer management
– Full reliability. Both ACKs and NACKs
  • Ensures delivery to the receiving application
Example (best effort with packet loss)
[Sequence diagram: the sender sends 01-04 with HB; 03 is lost (X) and the receiver gets 01, 02 and 04. The sender sends 05-08 with HB; the receiver gets 05-08. No ACK or NACK traffic flows back.]
Packet 03 is permanently lost. A repair request would compromise stealth. The application is notified of the packet loss.
Stealth Reliability (no packet loss)
[Sequence diagram: the sender sends 01-04 and then 05-08, each with HB; the receiver gets all eight packets and sends nothing back.]
Stealth is not compromised under normal operating conditions.
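Stealth Reliability (some packet loss)
[Sequence diagram: the sender sends 01-04 with HB; 03 is lost (X), the receiver gets 01, 02 and 04 and replies NACK 3. The sender sends 05-08 with HB and repairs 03; the receiver gets 05-08 and 03.]
Stealth is minimally compromised, and only when some message is lost.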
Message Batching
[Diagram: the sender’s write() calls feed a send queue that delivers to the receiver’s receive queue, shown without and with batching]
Without batching, each message is sent separately. For small messages, protocol headers might be bigger than the payload.
With batching, messages are held briefly and combined into larger batches, maximizing throughput and minimizing CPU use.
Transparent: the receiver still sees individual messages.
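A batcher that flushes on either size or delay, as described above, might look like the following sketch. The thresholds, class name and method names are assumptions for illustration; time is passed in explicitly so the behavior is easy to follow.

```python
class Batcher:
    """Holds small messages and releases them as one batch (size or delay trigger)."""

    def __init__(self, max_bytes: int, max_delay_s: float):
        self.max_bytes = max_bytes
        self.max_delay_s = max_delay_s
        self.pending = []
        self.pending_bytes = 0
        self.oldest = None  # arrival time of the first message in the batch

    def write(self, msg: bytes, now: float):
        """Queue msg; return the batch if the size limit is reached, else None."""
        if self.oldest is None:
            self.oldest = now
        self.pending.append(msg)
        self.pending_bytes += len(msg)
        if self.pending_bytes >= self.max_bytes:
            return self.flush()
        return None

    def poll(self, now: float):
        """Called periodically; flush if the oldest message has waited too long."""
        if self.pending and now - self.oldest >= self.max_delay_s:
            return self.flush()
        return None

    def flush(self):
        batch, self.pending = self.pending, []
        self.pending_bytes = 0
        self.oldest = None
        return batch

batcher = Batcher(max_bytes=10, max_delay_s=0.05)
print(batcher.write(b"aaaa", now=0.00))  # held
print(batcher.write(b"bbbb", now=0.01))  # held
print(batcher.write(b"cccc", now=0.02))  # size limit reached: batch of three
print(batcher.write(b"dd", now=0.03))    # held
print(batcher.poll(now=0.09))            # delay limit reached: batch of one
```

The receiver unpacks the list and hands the messages to the application one at a time, which keeps batching transparent.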
Reliability with Batching
Reliability must work even when messages are batched
ACK or NACK of individual samples would negate some of the benefits of batching…
=> Protocol must be batch aware so that it can ACK/NACK complete batches!
[Diagram: the sender’s write() calls produce batches B1, B2, B3; the receiver replies ACK(B3), NACK(B2); the sender repairs B2]
Batching is hard but it pays!
RTI DDS 4.3b perftest results
[Chart: throughput (Mbps, axis 0 to 1000) versus sample size (bytes, axis 0 to 5000) for two series: Linux Baseline and Linux 10Kb Batch]
Test platform: Intel Core2Duo single-CPU dual-core 2.4 GHz, 4 MB cache, 32-bit CentOS 5 (RHEL 5), 2 GB memory, Intel E1000 NIC
Other considerations
Resource management:
– During disconnected operation buffers might fill or overflow…
– Solution is smart caching:
  • Purge by age
  • Filter by frequency
  • Keep “one of each”: requires additional insight into the data, e.g. some object identifier (track Id)
  • Filter by content
This is HARD!!
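Two of the smart-caching policies listed above, keep “one of each” and purge by age, can be combined in a small cache keyed by object identifier. This is a sketch under assumed names; the track identifiers and the age limit are illustrative values only.

```python
from collections import OrderedDict

class LatestValueCache:
    """Keeps only the newest sample per object id and drops samples that are too old."""

    def __init__(self, max_age_s: float):
        self.max_age_s = max_age_s
        self.entries = OrderedDict()  # object_id -> (timestamp, sample)

    def put(self, object_id, sample, now: float) -> None:
        # "One of each": a newer sample replaces the older one for the same object.
        self.entries.pop(object_id, None)
        self.entries[object_id] = (now, sample)

    def purge(self, now: float) -> None:
        stale = [k for k, (t, _) in self.entries.items() if now - t > self.max_age_s]
        for k in stale:
            del self.entries[k]

    def pending(self, now: float) -> list:
        """Samples still worth sending when the link comes back, oldest first."""
        self.purge(now)
        return [(k, s) for k, (_, s) in self.entries.items()]

cache = LatestValueCache(max_age_s=10)
cache.put("track-1", "p0", now=0)
cache.put("track-2", "q0", now=1)
cache.put("track-1", "p1", now=5)   # replaces p0
print(cache.pending(now=8))
print(cache.pending(now=12))        # track-2 has aged out
```

The cache size is bounded by the number of distinct objects rather than by the length of the disconnection.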
Using a Network Middleware
Network middleware: a library between the operating system and the application
– It insulates the application from the raw network
– Implements reliability, caching, …
[Diagram: layered stack of Hardware (e.g. Radio), Network stack (e.g. IP), Middleware, Application; several Applications share the Middleware, which runs over Ethernet, Wireless, Radio, Shared Memory, cPCI, 1553]
Which middleware to use?
Standards based
Configurable via QoS
Not based on TCP
Manages sessions/fragmentation/reliability…
Failover/handover support
Efficient use of bandwidth
Multi-platform
Embeddable, certifiable…
Integration with net-centric back end
DDS mandated for data-distribution
DISR (formerly JTA)
– DoD Information Technology Standards Registry
US Navy Open Architecture
FCS SOSCOE
– Future Combat System: System of System Common Operating Environment
SPAWAR NESI
– Net-centric Enterprise Solutions for Interoperability
– Mandates DDS for Pub-Sub SOA
DDS Adoption
– European Air Traffic Control
– RETF (USA) Train Communications
– Tokyo, Japan Traffic Control
– Boeing Army Future Combat System
– Boeing AWACS program
– US Navy: DD(X), LCS, LPD-17
– SeaSlice and 13 other Navies
Insitu Unmanned Air Vehicle
“…we have seen a 30% increase in productivity based on not having to handle data communication issues.” Gary Viviani, VP of Engineering
Insitu is a recognized leader in the exploding UAV space
The next generation of UAVs, including the Scan Eagle and newer platforms
The challenge is a successful UAV mission, which requires impressive autonomy and reliable ground control
DDS enables an information flow that is much more orchestrated and flexible, allowing seamless switching of control between multiple ground stations while connecting reliably over unreliable links
© 2008 Real-Time Innovations, Inc.
Advanced Cockpit Ground Control Station
Defense
General Atomics Aeronautical Systems developed advanced cockpit ground control stations (GCSs) for unmanned aircraft systems
Required real-time data distribution for acquisition, analysis, and response of remote controlled aircraft
DDS selected for proven software & services. Application built in under 14 months, significantly less time than with alternative software or building their own
CLIP Mediator Bridge
Transportation
• Common Link Integration Processing (CLIP): U.S. Air Force and Navy joint project to build Tactical Data Link (TDL) aggregator
• Enables information exchange between platforms with incompatible tactical data links
• Challenge: existing system had poor integration with platform mission systems
• With Northrop Grumman, RTI helped architect, design, develop & test mediator bridge between platform systems and CLIP
– RTI Services built a ‘mediator’ bridge between Air Force, Navy, NGC, B1, B52
– First NESI DDS Compliant Product
Defense
“Working with RTI has been both effective and productive.”
– Jim Miller, CLIP Program Manager
BASE 10 Systems Land Vehicles
5 different subsystems on the data bus and communication link between RCC and RoboScout RS
Next release of RoboScout will implement DDS in the vehicular platform and outside services (radio and satellite data-links).
Defense
DARPA Flying Fox - Autonomous Vehicle Systems
Unmanned Vehicles
Autonomous vehicle in the 2005 DARPA Grand Challenge race
Unique characteristic of Flying Fox: adaptive vision system, the vehicle “learns” through example
Complex network of control and vision systems, sensors, processors, operating systems
DDS integrates all kinds of data sources, shares data with minimal latency
DDS B-1B Tactical Systems Upgrade
DDS is being used to seamlessly integrate legacy flight control systems with a new open architecture tactical communication and control system.
Adding new command & control and communications capabilities that need to work with legacy control system
Need architecture that is open & modular for future extensions and upgrades
DDS is open and scalable, reducing integration risk, standards-based ensuring supportability
Defense
AWACS Radar System Upgrade
Airborne control system for surveillance, command & control and battle management
Upgrading system to be open, supportable, less expensive to maintain and extend
DDS is standards-based, open and extensible, reducing integration risk
DDS is a proven COTS solution, reducing total cost of ownership over in-house development
CAE SimXXI Flight Simulation
State-of-the-art full-flight simulator from CAE
Challenge is communication between subsystems (over IEEE 1394) with low-latency data transfer
DDS chosen because it excels in real-time performance and is simple to use and integrate
INDRA: Air Traffic Management
Finance
Air traffic service needed to control flow of traffic through busy metropolitan air space
Reliability is critical − hardware or software failures mean flight delays and substantial costs
DDS high performance permits the fast addition, updating and removal of system nodes without disrupting the data flow
DDS engaged to integrate, extend and design to the customer’s specific needs
Transportation
Next-generation of the U.S. Navy Aegis Weapon System
Challenge to share time-critical data across highly distributed system including radar, weapons, displays and controls
Need to maximize future scalability and flexibility
DDS provides real-time communication infrastructure. Standards-based & extensible for future system enhancements
Lockheed Martin US Navy Aegis Open Architecture Weapon System
Navy Open Architecture Ship Self Defense System (SSDS)
Project to employ standards throughout ship systems (frameworks, OS, etc.)
Goal: Reduce total cost of ownership, ease system upgrades, reduce interoperability issues
DDS selected as middleware: its extensibility enables an open architecture throughout Navy!
DDS Services provided advanced integration, support & consulting
Defense
Sample EU project using DDS
ESO Extremely Large Telescope (E-ELT)
– 43m diameter (see vehicles on picture!)
– 30,000 sensors send data on the bus
– RTI DDS used as middleware for critical data communication and integration
INDRA i-TEC e-FDP ATM program
– European leader in Air Traffic Mgmt applications
– ATM integration for UK, Spain and Germany
– RTI used as integration solution for Flight Data Management and Distribution
EADS Euro Hawk UAV program
– EADS selected RTI for European UAV program
– RTI is used as embedded middleware in UAV versatile payload
Sample EU project using DDS
PLATH (Hamburg, Germany)
– Radio signal analysis experts
– Has decided to use RTI on a large scale for key middleware services
Volkswagen R&D
– After thorough evaluation VW has selected RTI as a middleware for their next generation vehicular R&D platform
– AUTOSAR, ECS, ECU context
MBDA France & UK
– They have been using RTI for 2 years
– Vertical launch missile program « MOUV »
Sample EU project using DDS
BASE 10 RoboScout Technology Reference System (TRS)
– BTSE is a German project-focused company specialized in the defense market within NATO. They are experts in robotics, integrating systems engineering, system qualification, manufacturing and long term support.
– Base 10 has been working with RTI for 1 year
– We delivered Quick-Start training and an architecture study on how to implement RTI on the vehicular platform data flows
– There are 5 different subsystems on the data bus and communication link between RCC and RoboScout RS
– RTI is now implemented in the RCC (bottom left picture)
– Next release of RoboScout will implement RTI in vehicular platform and outside services (radio and satellite data-links).
Many others
Dissecting Messaging Technologies
The alternatives:
Standards based:
– Web-Service/SOAP based (WS-Eventing, WS-Notification…)
– JMS
– CORBA
– Real-Time Data-Distribution Service
Vendor-proprietary:
– ESBs
– IBM WebSphere MQ
– TIBCO
– 29West
– Gigaspaces
Custom build
Architecture
Quality of Service
Performance & Scalability
Best-of breed RT-Messaging: DDS
Data Distribution Service (DDS)
– High performance real-time data distribution
– Object Management Group (OMG)
DDS Standard API (v1.2)
– Specifies user-visible API
– Ensures application portability
– Adopted in June 2003, revised June 2005, 2006
DDS Standard Wire Protocol (v2.1)
– Real-Time Publish-Subscribe (RTPS)
– Ensures application interoperability
– Adopted in June 2006, revised July 2007, 2008
[Diagram: DDS Middleware stack. The Data Distribution Service API provides standards-based services for application developers; the Real-Time Publish-Subscribe (RTPS) Wire Protocol provides a standard protocol for interoperability]
Message bus architectures
Centralized Clustered
Federated Peer to Peer
DDS
JMS IBM
TIBCO
IBM
Message Quality of Service
QoS: Reliability
– Purpose: Let the application decide whether messages should be confirmed and retried when missed, or else sent as best effort.
– Example use cases: Send live voice or video data. Send sensor data (e.g., radar tracks), traffic readings, CPU/network statistics and readings.
QoS: LatencyBudget
– Purpose: Specify the relative importance of different messages and the maximum acceptable delay between the time the message is sent and the time it’s delivered to the reader(s).
– Example use cases: Prioritize critical control information (e.g., live radar tracks) over non-time-critical information such as aircraft schedule changes. Prioritize real-time flows like live audio over traffic that may be buffered (e.g., video replay).
QoS: FlowControl
– Purpose: Control how much load and bandwidth a particular sender can inject into the network. Control the peak load, average load, and size of a burst.
– Example use cases: Provide dedicated bandwidth to the most critical data. Prevent a single source from overwhelming the network. Prevent large low-urgency data (e.g., file downloads) from compromising the performance of critical data (e.g., alarms and critical news updates).
[Overlay labels on the QoS table: DDS, JMS* (partial), WS-* (partial), Proprietary]
Message Quality of Service (Cont.)
QoS: Lifespan
– Purpose: Control how long the data must be kept by the middleware to be delivered to readers. Old data may be of little value; delivering it wastes bandwidth and gets in the way of more recent data.
– Example use cases: Prevent data that loses value with age (e.g., old stock values, old news, old sensor readings) from using valuable system resources, while ensuring that needed historic information is kept (e.g., transaction records).
QoS: History
– Purpose: Control how many related messages (e.g., successive updates to a stock value or successive readings of a sensor) must be maintained by the middleware and delivered to readers.
– Example use cases: Some applications may only be interested in the last 100 events for each server regardless of the time interval when they occurred. Prevent a rapidly changing source from using a lot of resources and starving other less-active sources.
QoS: Transport Priority
– Purpose: Controls the traffic class used for the underlying network transport.
– Example use cases: Configure the network infrastructure to prioritize some messages ahead of others. Exploit the differential service capabilities of the network infrastructure.
QoS: Multicast
– Purpose: Takes advantage of network multicast infrastructure.
[Overlay labels on the QoS table: DDS, JMS, Proprietary]
Message Quality of Service (Cont.)
QoS: Persistence
– Purpose: Externalize message history so that messages survive beyond the life of the application that generates them. Deliver messages reliably in the presence of application failures and restarts.
– Example use cases: Allow short-lived applications (e.g. CGI scripts) to generate messages that are received reliably even by applications that join the network later. Prevent data loss when applications crash.
QoS: ContentFiltering
– Purpose: Provide an application only the data it needs. Filter messages based on content as requested by the consuming application.
– Example use cases: Monitor aircraft in your airspace, alarms in the immediate vicinity, stocks that cross a threshold or in the industries of interest… Filter data at the source or in the infrastructure to avoid wasting CPU and bandwidth delivering data that is not of interest. Support consumers with a slow CPU or network (e.g. wireless).
[Overlay labels on the QoS table: DDS, JMS, Proprietary]
[Figure: Messaging Technologies and Standards placed along a real-time spectrum (Non-real-time, Soft real-time, Hard real-time, Extreme real-time): Java/RMI, Java/JMS, Web Services, CORBA, RT CORBA, MPI, Java, RTSJ (soft RT), RTSJ (hard RT), Data Distribution Service / DDS. Adapted from NSWC-DD OA Documentation]
Data Distribution Service spans a very wide spectrum of application needs
Top reasons to use DDS
Flexibility and Power of the data-centric model
Performance & Scalability
Rich set of built-in services
Interoperability across platforms and Languages
Provides/integrates Pub-Sub into SOA
#1 DDS Data-Centric Model
[Diagram: Data Writers and Data Readers attached to a shared “Global Data Space”]
“Global Data Space” generalizes Subject-Based Addressing
– Data objects addressed by DomainId, Topic and Key
– Domains provide a level of isolation
– Topic groups homogeneous subjects (same data-type & meaning)
– Key is a generalization of subject
  • Key can be any set of fields, not limited to a “x.y.z …” formatted string
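The addressing scheme (DomainId, Topic, Key) can be pictured as a dictionary whose key is that triple. The sketch below is only a mental model in plain Python and deliberately does not use the DDS API; the `Track` type, its fields and the `GlobalDataSpace` class are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Track:
    track_id: int   # key field
    source: str     # key field
    x: float
    y: float

    def key(self) -> tuple:
        """The key is a set of fields, not a formatted subject string."""
        return (self.track_id, self.source)

class GlobalDataSpace:
    """Toy model: one current value per (domain_id, topic, key)."""

    def __init__(self):
        self.objects = {}

    def write(self, domain_id: int, topic: str, sample) -> None:
        self.objects[(domain_id, topic, sample.key())] = sample

    def read(self, domain_id: int, topic: str) -> list:
        return [s for (d, t, _), s in self.objects.items()
                if d == domain_id and t == topic]

space = GlobalDataSpace()
space.write(0, "Tracks", Track(1, "radar", 0.0, 0.0))
space.write(0, "Tracks", Track(1, "radar", 1.0, 1.0))  # update of the same object
space.write(0, "Tracks", Track(2, "radar", 5.0, 5.0))  # a different object
space.write(1, "Tracks", Track(1, "radar", 9.0, 9.0))  # isolated in another domain
print(space.read(0, "Tracks"))
```

Readers in domain 0 see two data objects, each with its latest value, and nothing from domain 1.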
Subscriptions: By Topic, Subject, Content
Topic: “Market Data”
    Subject fields: Source, Symbol, Type, Exchange
    Payload fields: Volume, Bid, Ask, …
Topic: “Order Entry”
    Fields: Symbol, OrderKind, Stop, Limit, OrderNumber, …
Subscription by topic (all of “Market Data”)
    Source = *, Symbol = *, Type = *, Exchange = *, Payload = *
Subject Filter (for a Reader)
    Symbol = *, Type = *, Exchange = NYSE, Payload = *
Subject Filter plus Payload Filter (for a Reader)
    Source = REUTERS, Symbol = *, Type = EQ, Exchange = NYSE, Payload: Volume > x, Ask < y
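The subject and payload filters on this slide can be sketched as reader-side predicates. This is a minimal sketch under stated assumptions: the `MarketData` record, the helper names, the symbol "ABC", and the threshold values standing in for x and y are all hypothetical, and a real DDS implementation expresses such filters through its own content-filter mechanism rather than Python callables.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass(frozen=True)
class MarketData:
    source: str
    symbol: str
    type: str
    exchange: str
    volume: int
    bid: float
    ask: float


def subject_matches(sample: MarketData, pattern: Dict[str, str]) -> bool:
    """A pattern maps subject field names to a required value or the wildcard '*'."""
    return all(want == "*" or getattr(sample, name) == want
               for name, want in pattern.items())


def make_reader_filter(pattern: Dict[str, str],
                       payload_filter: Optional[Callable[[MarketData], bool]] = None
                       ) -> Callable[[MarketData], bool]:
    """Combine a subject filter with an optional payload (content) filter."""
    def accept(sample: MarketData) -> bool:
        if not subject_matches(sample, pattern):
            return False
        return payload_filter is None or payload_filter(sample)
    return accept


X_VOLUME = 1000   # illustrative stand-in for "x" on the slide
Y_ASK = 50.0      # illustrative stand-in for "y" on the slide

nyse_only = make_reader_filter({"symbol": "*", "type": "*", "exchange": "NYSE"})
reuters_nyse_equities = make_reader_filter(
    {"source": "REUTERS", "symbol": "*", "type": "EQ", "exchange": "NYSE"},
    payload_filter=lambda s: s.volume > X_VOLUME and s.ask < Y_ASK,
)

busy = MarketData("REUTERS", "ABC", "EQ", "NYSE", volume=5000, bid=41.0, ask=41.5)
quiet = MarketData("REUTERS", "ABC", "EQ", "NYSE", volume=500, bid=41.0, ask=41.5)
elsewhere = MarketData("REUTERS", "ABC", "EQ", "LSE", volume=5000, bid=41.0, ask=41.5)
```

Both NYSE samples pass the subject filter, but only the high-volume one passes the combined subject and payload filter.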
DDS Demo: Concepts
Topics
– Square, Circle, Triangle
– Attributes
Data types (schemas)
– Shape (color, x, y, size)
Color is instance Key
– Attributes
Shape & color used for key
QoS
– Deadline, Liveliness
– Reliability, Durability
– History, Partition
– Ownership
Control Area: Allows selection of objects and QoS
Display Area: Shows state of objects
Start demo
QoS: Quality of Service
QoS Policy              QoS Policy
USER DATA               DURABILITY
TOPIC DATA              HISTORY (per subject)
GROUP DATA              READER DATA LIFECYCLE
PARTITION               WRITER DATA LIFECYCLE
OWNERSHIP               RESOURCE LIMITS
OWNERSHIP STRENGTH      RELIABILITY
LIVELINESS              TIME BASED FILTER
LATENCY BUDGET          DEADLINE
DESTINATION ORDER       ENTITY FACTORY
PRESENTATION            LIFESPAN
TRANSPORT PRIORITY      CONTENT FILTERS
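To make one of these policies concrete, the sketch below illustrates the idea behind HISTORY (per subject): a cache that keeps only the last N samples for each key. It is a toy model, assuming a keep-last behavior with a fixed depth; the class name `KeepLastHistory` and its methods are hypothetical and are not the DDS API.

```python
from collections import defaultdict, deque
from typing import Any, Deque, Dict, Hashable, List


class KeepLastHistory:
    """Keeps at most `depth` most recent samples for each key (subject)."""

    def __init__(self, depth: int) -> None:
        if depth < 1:
            raise ValueError("depth must be at least 1")
        # A bounded deque silently drops the oldest sample when it is full.
        self._by_key: Dict[Hashable, Deque[Any]] = defaultdict(
            lambda: deque(maxlen=depth))

    def add(self, key: Hashable, sample: Any) -> None:
        self._by_key[key].append(sample)

    def samples(self, key: Hashable) -> List[Any]:
        # Oldest first; an unknown key yields an empty list.
        return list(self._by_key.get(key, ()))


history = KeepLastHistory(depth=2)
for value in (1, 2, 3):
    history.add("RED", value)
history.add("BLUE", 10)
```

With a depth of 2, the three writes to "RED" leave only the last two samples, while "BLUE" is tracked independently.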
Tunable Reliability Protocol
Reliable
• Guaranteed ordered delivery
• “Best effort” also supported
Configurable AckNack reply times to eliminate storms
Fully configurable to bound latency and overhead
– Heartbeats, delays, buffer sizes
Performance can be tracked by senders and recipients
– Configurable high/low watermark, buffer full
Flexible handling of slow recipients
– Dynamically remove slow receivers
[Figure: A Producer/Writer sends samples S1 to S7 to a Consumer/Reader; both sides queue the samples so that lost ones can be repaired and delivered in order]
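The heartbeat and AckNack exchange behind this slide can be sketched as follows. This is a simplified model under stated assumptions, not the actual wire protocol: the class names are hypothetical, timing and delays are omitted, and the AckNack is reduced to a plain list of missing sequence numbers.

```python
from typing import Dict, List, Optional, Tuple


class ReliableWriter:
    """Keeps sent samples so that lost ones can be repaired on request."""

    def __init__(self) -> None:
        self._history: Dict[int, str] = {}
        self._next_seq = 1

    def write(self, payload: str) -> Tuple[int, str]:
        seq = self._next_seq
        self._history[seq] = payload
        self._next_seq += 1
        return seq, payload

    def heartbeat(self) -> Optional[Tuple[int, int]]:
        # Announces the range of sequence numbers available for repair.
        if not self._history:
            return None
        return min(self._history), max(self._history)

    def resend(self, missing: List[int]) -> List[Tuple[int, str]]:
        return [(seq, self._history[seq]) for seq in missing if seq in self._history]


class ReliableReader:
    """Buffers out-of-order samples and delivers them strictly in order."""

    def __init__(self) -> None:
        self._received: Dict[int, str] = {}
        self._next_to_deliver = 1
        self.delivered: List[str] = []

    def on_data(self, seq: int, payload: str) -> None:
        if seq >= self._next_to_deliver:
            self._received[seq] = payload
        while self._next_to_deliver in self._received:
            self.delivered.append(self._received.pop(self._next_to_deliver))
            self._next_to_deliver += 1

    def on_heartbeat(self, first: int, last: int) -> List[int]:
        # The reply (AckNack) lists the sequence numbers still missing.
        start = max(first, self._next_to_deliver)
        return [seq for seq in range(start, last + 1) if seq not in self._received]


writer, reader = ReliableWriter(), ReliableReader()
for name in ("S1", "S2", "S3", "S4", "S5", "S6", "S7"):
    seq, payload = writer.write(name)
    if seq not in (3, 5):              # simulate a lossy link dropping S3 and S5
        reader.on_data(seq, payload)

delivered_before_repair = list(reader.delivered)
missing = reader.on_heartbeat(*writer.heartbeat())
for seq, payload in writer.resend(missing):
    reader.on_data(seq, payload)
```

Before the repair only S1 and S2 reach the application, because S4, S6, and S7 are held back until the gaps are filled; after the resend all seven samples are delivered in order.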
High-Throughput via Aggregation
Increases throughput by aggregating smaller messages into larger network packets
User tunable
– # packets to aggregate for delivery
– Aggregate packet size
– Max elapsed time before data is sent
– Manual flush at any time
[Figure: write() calls accumulate in a batch that is sent when full or on timeout]
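The tunables listed above map naturally onto a small batching component. The sketch below is illustrative only: `Batcher` and its parameter names are hypothetical, the timeout is checked by an explicit `poll()` call rather than a timer thread, and an injected clock is used so the behavior is deterministic.

```python
import time
from typing import Callable, List


class Batcher:
    """Aggregates small payloads and sends them as one batch."""

    def __init__(self, send: Callable[[List[bytes]], None],
                 max_samples: int = 4, max_bytes: int = 1024,
                 max_delay_s: float = 0.010,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._send = send
        self._max_samples = max_samples
        self._max_bytes = max_bytes
        self._max_delay_s = max_delay_s
        self._clock = clock
        self._pending: List[bytes] = []
        self._pending_bytes = 0
        self._oldest_at = 0.0

    def write(self, payload: bytes) -> None:
        if not self._pending:
            self._oldest_at = self._clock()   # start of the elapsed-time window
        self._pending.append(payload)
        self._pending_bytes += len(payload)
        if (len(self._pending) >= self._max_samples
                or self._pending_bytes >= self._max_bytes):
            self.flush()                      # batch is full

    def poll(self) -> None:
        """Call periodically: flushes when the oldest sample has waited too long."""
        if self._pending and self._clock() - self._oldest_at >= self._max_delay_s:
            self.flush()

    def flush(self) -> None:
        """Manual flush, allowed at any time."""
        if self._pending:
            batch, self._pending, self._pending_bytes = self._pending, [], 0
            self._send(batch)


now = [0.0]                       # fake clock for a repeatable demonstration
sent: List[List[bytes]] = []
batcher = Batcher(sent.append, max_samples=3, max_delay_s=0.010, clock=lambda: now[0])

for payload in (b"a", b"b", b"c"):
    batcher.write(payload)        # third write fills the batch
batcher.write(b"d")
now[0] = 0.020
batcher.poll()                    # timeout expires for the batch holding b"d"
batcher.write(b"e")
batcher.flush()                   # manual flush
```

The three triggers (full, timeout, manual flush) each produce one batch in this run.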
Demo: Quality of Service (QoS)
Writers and readers state their needs; RTI DDS delivers
Start demo
#2 Performance & Scalability
DDS was designed to support high performance
RTI DDS was developed to maximize performance and minimize jitter
Advanced techniques employed:
– Pre-allocation of memory
    Never allocate/free memory in the critical path
– Use dedicated threads per receive port
    Minimize thread switching
    Avoid expensive operating system calls (e.g. select())
– Maximize concurrency
    Carefully design critical sections
    Patented concurrent mutex-free thread-safe data structures
– Employ high-performance data-access APIs
    Read data by array (no additional copies)
    Scatter/gather APIs to access transport
    Buffer loaning for zero-copy access
Latency – (Linear Scale)
[Figure: DDS/GSOAP/JMS/Notification Service Comparison - Latency. X axis: message size (bytes), 4 to 16384; Y axis: 0 to 2500; series: DDS, JMS, Notification Service. Adapted from Vanderbilt presentation at July 2006 OMG Workshop on RT Systems.]
Jitter – (Linear Scale)
[Figure: DDS/JMS/CORBA Notification Service Comparison - Jitter. X axis: message size (bytes), 4 to 16384; Y axis: standard deviation (usecs), 0 to 2000; series: DDS, JMS, Notification Service. Source: Vanderbilt presentation at July 2006 OMG Workshop on RT Systems.]
[Figure: DDS/CORBA Notification Service Comparison - Jitter, on an expanded vertical scale. X axis: message size (bytes), 4 to 16384; Y axis: standard deviation (usecs), 0 to 100; series: DDS, JMS, Notification Service.]
Performance: Looking under the hood…
Columns: Performance & Scalability Technique / Description / Why It Matters

Multicast
    Description: Internet technology that allows a single UDP message to be delivered to many receivers.
    Why it matters: Provides the most efficient way to send messages to multiple receivers. Reduces bandwidth, reduces overhead on the sender, and minimizes latency and jitter.

Message Batching
    Description: Middleware technology that combines multiple messages into a single unit.
    Why it matters: Greatly increases the throughput for small messages. Reduces bandwidth and processor utilization for small messages.

Message Fragmentation
    Description: Middleware technology that breaks a large message into smaller units, delivers them separately, and then reassembles them before delivering the message to the application.
    Why it matters: Enables multicast use for larger (greater than 64KB) messages. Prevents “Head of Line” blocking where a high-priority message is queued behind a large message. Reduces jitter. Provides better performance in less reliable networks (wireless / WANs).

Asynchronous Writes
    Description: Middleware technology that allows a write operation to be processed by a separate thread and not block the application thread that performed the write.
    Why it matters: Decouples sender and receiver, providing more predictable performance for the writer and reducing latency jitter. Allows multiple write operations to be performed concurrently over multiple channels, batched, or optimized in other ways.

Zero Copy
    Description: Operating system network-stack technology that allows an application to put and get data from the network buffers “by reference,” without performing extra copy operations.
    Why it matters: Increases performance for large messages, resulting in higher throughput and lower latency. Reduces CPU consumption on both sender and receiver. Makes performance scale better with message size.
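Of the techniques in this table, message fragmentation is the easiest to show in a few lines. The sketch below is a minimal illustration under stated assumptions: the function and class names are hypothetical, only one message is reassembled at a time, and lost fragments are not handled (a real middleware would combine this with its reliability protocol).

```python
from typing import Dict, List, Optional, Tuple

Fragment = Tuple[int, int, bytes]  # (fragment index, total fragments, chunk)


def fragment(message: bytes, max_chunk: int) -> List[Fragment]:
    """Break a message into chunks of at most max_chunk bytes."""
    if max_chunk < 1:
        raise ValueError("max_chunk must be at least 1")
    chunks = [message[i:i + max_chunk]
              for i in range(0, len(message), max_chunk)] or [b""]
    return [(index, len(chunks), chunk) for index, chunk in enumerate(chunks)]


class Reassembler:
    """Collects fragments in any order and returns the message once complete."""

    def __init__(self) -> None:
        self._chunks: Dict[int, bytes] = {}

    def add(self, frag: Fragment) -> Optional[bytes]:
        index, total, chunk = frag
        self._chunks[index] = chunk
        if len(self._chunks) < total:
            return None                      # still waiting for more fragments
        message = b"".join(self._chunks[i] for i in range(total))
        self._chunks.clear()
        return message


message = bytes(range(256)) * 40             # 10,240-byte example payload
fragments = fragment(message, max_chunk=1400)
reassembler = Reassembler()
# Deliver the fragments in reverse order to show that ordering does not matter.
results = [reassembler.add(frag) for frag in reversed(fragments)]
```

Only the final call to `add` returns the reassembled message; every earlier call returns `None`.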
Performance: RTI DDS Low Latency and Jitter
[Figure: Latency and jitter on an unloaded network without message batching. X axis: message/data size (bytes), 32 to 8192; Y axis: latency (microseconds), 0 to 400; series: minimum, median, 99%, 99.99%, maximum.]
Test setup: reliable, ordered delivery over Gigabit Ethernet between 2.0 GHz Opteron processors running 32-bit Red Hat Enterprise Linux 4.0
© 2008 Real-Time Innovations, Inc. - May 1, 2008
Performance: RTI DDS Latency at high Throughput
[Figure: Average latency (microseconds), 0 to 500, versus throughput (messages per second), 0 to 600,000. Series by number of subscribers: 1 (1 per CPU and NIC), 20 (1 per CPU and NIC), 40 (1 per core, 2 per NIC). Callout: half a million msg/sec at less than 300 usec latency.]
Scalability: RTI DDS Reliable Multicast Performance
Subscribers                                   Messages per second
1 subscriber                                  563,498
20 subscribers (1 per CPU and NIC)            556,896
40 subscribers (1 per core, 2 per NIC)        535,883
72 subscribers (1 per core, 2-8 per NIC)      365,760
Test setup:
– 200-byte messages
– Gigabit Ethernet
– Single publishing thread
– All data subscribed
– No message loss; throttled to slowest subscriber
– CentOS 5, 32-bit
– CPUs: 2.4 GHz Intel Core 2 Duo E6600, 2.4 GHz Intel Core 2 Quad Q6600, 2.33 GHz Intel Xeon E5345, 2.4 GHz AMD Opteron 8216
– NICs: Intel PRO/1000, Broadcom NetXtreme II
Performance: RTI DDS High Performance across all Languages
[Figure: Throughput with batching. X axis: message size (bytes), 32 to 16384; Y axis: megabits per second, 0 to 1,000; series: Native C++, .NET (C#), Java.]
Test setup: Windows XP Pro SP2, 32-bit; reliable multicast; Gigabit Ethernet; 2.4 GHz Intel Core 2 Quad Q6600; single Intel PRO/1000 NIC; four producer and consumer threads
#3 Powerful Services & Tools
– High-Availability
– Persistent Data
– Recording service
– Relational Database bridge
– Development & Monitoring Tools
DDS High Availability via Redundancy
Owner determined per subject
Only extant writer with highest strength can publish a subject (or topic for non-keyed topics)
Automatic failover when highest strength writer:
– Loses liveliness
– Misses a deadline
– Stops writing the subject
Shared Ownership allows any writer to update the subject
[Figure: Three Producers/Writers with strength=10, strength=5, and strength=1 publish instances I1 and I2 of Topic T1; each instance has a primary and a backup writer (I1 Primary, I1 Backup, I2 Primary, I2 Backup)]
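The ownership rule above can be sketched as a small arbiter that, for each instance (subject), accepts updates only from the live writer with the highest strength. This is a toy model: the class and writer names are hypothetical, only the loss-of-liveliness trigger is modeled, and the assignment of writers to instances I1 and I2 is an assumption, since the slide does not state which writer backs which instance.

```python
from typing import Dict, Optional


class ExclusiveOwnership:
    """Tracks, per instance, which live writer currently owns it."""

    def __init__(self) -> None:
        # instance -> {writer name: strength}
        self._strengths: Dict[str, Dict[str, int]] = {}

    def register(self, instance: str, writer: str, strength: int) -> None:
        self._strengths.setdefault(instance, {})[writer] = strength

    def lose_liveliness(self, writer: str) -> None:
        # A writer that is no longer alive stops competing for every instance.
        for writers in self._strengths.values():
            writers.pop(writer, None)

    def owner(self, instance: str) -> Optional[str]:
        writers = self._strengths.get(instance)
        if not writers:
            return None
        return max(writers, key=writers.get)

    def accept(self, instance: str, writer: str) -> bool:
        """True if an update from this writer should reach the readers."""
        return self.owner(instance) == writer


arbiter = ExclusiveOwnership()
arbiter.register("I1", "writer_strength_10", 10)   # assumed I1 primary
arbiter.register("I1", "writer_strength_5", 5)     # assumed I1 backup
arbiter.register("I2", "writer_strength_5", 5)     # assumed I2 primary
arbiter.register("I2", "writer_strength_1", 1)     # assumed I2 backup

owner_before = arbiter.owner("I1")
backup_accepted_before = arbiter.accept("I1", "writer_strength_5")
arbiter.lose_liveliness("writer_strength_10")      # primary for I1 fails
owner_after = arbiter.owner("I1")
```

Failover needs no action from the readers: once the strongest writer is gone, updates from the next-strongest writer are accepted.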
DDS Data Persistence
A standalone service that persists data outside of the context of a DataWriter
[Figure: Data Writers and Data Readers in the Global Data Space, with two Persistence Services, each backed by permanent storage]
Can be configured for:
• Redundancy
• Load balancing
Demo:
1. PersistenceService
2. ShapesDemo
3. Application failure
4. Application (ShapesDemo) re-start
5. Persistence Svc failure
6. Application re-start
Cleanup database
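The idea of persisting data outside the context of a DataWriter can be sketched with a tiny stand-alone store. This is only an illustration under stated assumptions: the `PersistenceService` class here is hypothetical, it keeps just the latest sample per key in a JSON file, and it ignores redundancy and load balancing.

```python
import json
import os
import tempfile
from typing import Any, Dict


class PersistenceService:
    """Stores the latest sample per key on disk so it survives restarts."""

    def __init__(self, path: str) -> None:
        self._path = path
        self._latest: Dict[str, Any] = {}
        if os.path.exists(path):
            # Recover whatever was persisted before the previous shutdown or crash.
            with open(path, "r", encoding="utf-8") as stored:
                self._latest = json.load(stored)

    def on_sample(self, key: str, sample: Any) -> None:
        self._latest[key] = sample
        with open(self._path, "w", encoding="utf-8") as stored:
            json.dump(self._latest, stored)

    def replay(self) -> Dict[str, Any]:
        """What a late-joining or restarted reader would receive."""
        return dict(self._latest)


storage = os.path.join(tempfile.mkdtemp(), "shapes.json")
service = PersistenceService(storage)
service.on_sample("RED", {"x": 10, "y": 20, "size": 30})
service.on_sample("RED", {"x": 11, "y": 21, "size": 30})
del service                                  # simulate a service failure

restarted = PersistenceService(storage)      # service re-start
late_joiner_view = restarted.replay()
```

After the simulated failure and restart, the last value written for "RED" is still available to a reader that joins late.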
DDS Real-Time Recording Service
Record high-rate data arriving in real-time
Non-intrusive – multicast reception
Applications:
– Future analysis and debugging
– Post-mortem
– Compliance checking
– Replay for testing and simulation purposes
Demo:
1. Start RecorderService
2. Start ShapesDemo
3. See output files
4. Convert to: HTML, XML, CSV
5. View Data: HTML, XML, CSV
DDS Relational Database Integration
[Figure: Instances I1, I2, I3 of Topic T1 correspond to rows I1, I2, I3 of Table T1]
Messaging Actions       Relational Actions
Write()                 UPDATE & INSERT
Read() & Take()         SELECT
Dispose()               DELETE
Wait() & Listener: Event driven – The fastest way to observe database changes!
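The correspondence between messaging actions and relational actions can be sketched with Python's built-in `sqlite3` module. This is a minimal sketch, not how the RTI database bridge is implemented: the table layout, the column names, and the `write`/`read`/`dispose` helpers are hypothetical, and the event-driven Wait()/Listener path is not modeled.

```python
import sqlite3
from typing import Optional

conn = sqlite3.connect(":memory:")
# One row per instance of the topic; the instance key is the primary key.
conn.execute("CREATE TABLE T1 (instance_key TEXT PRIMARY KEY, value REAL)")


def write(instance_key: str, value: float) -> None:
    # Write(): INSERT for a new instance, UPDATE (replace) for an existing one.
    conn.execute("INSERT OR REPLACE INTO T1 (instance_key, value) VALUES (?, ?)",
                 (instance_key, value))


def read(instance_key: str) -> Optional[float]:
    # Read(): SELECT the current value of the instance.
    row = conn.execute("SELECT value FROM T1 WHERE instance_key = ?",
                       (instance_key,)).fetchone()
    return None if row is None else row[0]


def dispose(instance_key: str) -> None:
    # Dispose(): DELETE the row for the instance.
    conn.execute("DELETE FROM T1 WHERE instance_key = ?", (instance_key,))


write("I1", 1.0)
write("I2", 5.0)
write("I1", 2.0)          # second write to I1 updates the existing row
i1_value = read("I1")
row_count = conn.execute("SELECT COUNT(*) FROM T1").fetchone()[0]
dispose("I2")
```

Two writes to the same instance leave a single row, and disposing an instance removes its row.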
DDS Enables Event Processing
CEP: programmable engines used to transform “data” into “information”
CEP engines are programmed using a derivative of SQL
CEP engines save time: They can implement a lot of the application logic:
– Classification, Correlation, Aggregation, Filter, Cleansing, Pattern Detection, etc.
DDS is the perfect ‘data’ and ‘information’ pipe for CEP engines
– Use high-speed data streams (1,000-1,000,000 msg/sec)
– Require latency measured in sub-milliseconds
– Demand access to events from heterogeneous systems
[Figure: A CEP Engine attached to the RTI Global Data Space, together with Market Data, Trades, Low Latency Messages, Dashboards, Applications, and Alerts]
Tools provide insight into a distributed system
RTI Analyzer
– Understand connections and data flow
– Tune QoS properties without changing code
RTI Scope
– Capture and monitor packet payloads
– Collect time histories of Topic values
RTI Protocol Analyzer
– Sniff the wire and analyze traffic
#4 Interoperability between platforms & languages
Data accessible to all interested applications:
– Data distribution (publishers and subscribers): DDS
– Data management (storage, retrieval, queries): SQL
– ESB Integration, Business process integration: WSDL
– Legacy Java Integration: JMS
[Figure: Distributed nodes and DBMSs attached to the Global Data Space through DDS, SQL, JMS, and WSDL interfaces]
DDS: Multi- Architecture Support
• Same API for all platforms
• Language Independence: C, C++, Java™, C#, .NET, Ada
• Enterprise and Embedded Support
    VxWorks®, INTEGRITY®, LynxOS®
    Linux, Solaris, Windows
• Prototype on any platform
[Figure: RTI DDS running on Linux, Windows, Integrity, and VxWorks]
RTI DDS: Pluggable Transports
• Enables non-IP centric transports (e.g. InfiniBand)
• Allows for multiple transports on same node
• Provides high-performance (zero-copy interface)
• Saves bandwidth (compact messages & encapsulation)
[Figure: Real-time applications on top of RTI DDS, which runs over pluggable transports: UDP (IPv4 & IPv6) on a standard IP network (Ethernet, Wifi, etc.), Shared Memory, InfiniBand, and Custom (e.g. Radio)]
#5 Provides Real-Time Pub-Sub in SOA
[Figure: A Real-Time Pub-Sub/Caching/Messaging layer shown with Real-Time Devices, Fault Tolerance, Auditing & Recording, Tools & Visualization, Database, Event Processing, SOA & Real-Time Web Services, and WS-DDS]
Real-Time SOA Architecture/Implementation
RT Architecture/Technology
– High Performance
– Event-Driven/Publish-Subscribe
– Small footprint
– Quality of Service
– Support for embedded environments
– Support for unreliable & low-bandwidth networks
Traditional Enterprise
– Low Performance
– Client-Server
– Centralized (Server-based)
– TCP based
DDS Data Bus
Conclusions
Implementing your own Data-Link Protocol is HARD
The simplest, most flexible solution is to use middleware to handle the reliability, caching, failover…
Middleware must have special features to support specialized needs of Data Link: Robust to packet loss, disconnects, good use of bandwidth, etc.
DDS is the best choice today
– It is a mature international standard from OMG
    Platform neutral: operating systems and programming languages
    Deployed worldwide in military systems and other demanding real-time applications
– It is mandated by US DoD for publish-subscribe and data-distribution applications
– It is ideally suited to UAVs
    Highly tunable via Quality of Service (QoS)
    Flexible reliability model overcomes TCP problems
    Can accommodate unreliable & high-latency transports
    Uses bandwidth efficiently
    Rich services (persistence, filtering, high-availability)