CONTROLS MIDDLEWARE RENOVATION – TECHNICAL OVERVIEW
26TH JUNE 2013
Wojciech Sliwinski, BE-CO-IN, for the BE-CO Middleware team:
Felix Ehm, Kris Kostro, Joel Lauener, Radoslaw Orecki, Ilia Yastrebov, [Andrzej Dworak]
Special thanks to: Vito Baggiolini and Pierre Charrue
Agenda
Context & Motivation for Renovation
Middleware Review process
Technical evaluation of the transport layer
Changes in the MW Architecture in LS1
MW Upgrade milestones in 2013
Conclusions

Context & Motivation for Renovation

Motivations for MW Renovation
Current CORBA-based CMW-RDA
  Integrated in the control system
  Used to operate all CERN accelerators
  Provides the widely accepted Device/Property model
  More than 10 years old
Why review & upgrade the MW?
  CORBA was chosen 15 years ago
  Technical limitations of the CORBA-based transport
  Functional limitations of the current CMW-RDA
  Codebase with a long history: difficult to maintain, needs an architecture review
  Major issue of long-term support & future evolution
  Evolution of technology over the last 10 years: HW, OS, middleware, 3rd-party libraries
  Human factor: less and less CORBA expertise on the market

Technical limitations of CORBA transport
Became legacy, not actively supported: a maintenance issue
  Shrinking community, slow response time
  omniORB (C++): 1 developer/maintainer, last release mid-2011
  JacORB (Java): few developers, small community
Major technical limitations
  Lack of a fully asynchronous processing channel
  Blocking communication: the infamous JacORB blocking issue
  Lack of low-level control of IO resources (sockets, request queues)
Development issues
  Difficult to extend the wire protocol
  Backward compatibility issue
  Complex, error-prone API
  Heavy memory usage

Summary: Why change CORBA?
CORBA was chosen 15 years ago
  Not actively maintained: a big risk for the MW project
  Better solutions exist on the market
  Invest in a future solution rather than maintaining the old one
With the current CORBA-based middleware we can’t solve the pending operational issues
  We can’t provide better scalability & reliability
  CMW-RDA is difficult to evolve & extend

Middleware Review process

Middleware Renovation process
MW Renovation = MW Review + MW Upgrade
MW Review aims to provide the most appropriate technical solution satisfying the user requirements
MW Upgrade establishes the plan & strategy for introducing the new MW
  Objective: LS1 is the unique opportunity for a major MW upgrade
Middleware Review Process
  Gathering of user feedback and requirements (2010-11)
  Review of communication and serialization libraries (2011-12)
  Prototyping using selected communication products (2012)
  Design & implementation of new RDA3: Data, Client & Server (2012-13)
  Testing & validation of core MW infrastructure (summer’13)
  Upgrade of all dependent MW libraries & services (2013-14)
    ○ JAPC, Directory Service, Proxy, DIP Gateway

Review of users requirements
2010-11: series of interviews with major users
  Lars Jensen, Stephen Jackson (BI)
  Andy Butterworth, Frode Weierud, Roman Sorokoletov (RF)
  Brice Copy, Clara Gaspar (DIP, DIM)
  Frederic Bernard, Herve Milcent, Alexander Egorov (PVSS)
  Alexey Dubrovskiy (CTF), Kris Kostro (DIP gateways)
  Marine Gourber-Pace, Nicolas Hoibian (Logging)
  Nicolas De Metz-Noblat (Front-Ends), Alastair Bland (Infrastructure)
  Michel Arruat (FESA), Stephen Page (FGC)
  Niall Stapley, Mark Buttner, Marek Misiowiec (LASER & DIAMON)
  Nicolas Magnin, Christophe Chanavat (ABT)
  Stephane Deghaye, Jakub Wozniak (InCA, SIS)
  Vito Baggiolini, Roman Gorbonosov (JAPC & DA systems)
  + regular feedback from OP
  + internal team input
http://wikis/display/MW/Interviews+with+Experts

New RDA3: Accepted requirements
General
  Java & C++ API, Win (64-bit) & Linux (SLC5 32-bit & SLC6 64-bit)
  Accelerator Device Model (i.e. Device/Property)
  Get, Set, Async-Get, Async-Set, Subscribe
  Early detection of communication failures
  Improve error reporting in all the layers: client, server, gateways
  Admin interface & runtime diagnostics & statistics
Data support
  Data object: primitives, n-dim arrays, data structures
Subscription mechanism
  Subscription behaviour is the same regardless of the condition of the server (active, down)
  Several client subscription policies (default: continuous)
  Provide subscription notification ordering
  First-Update enforced via CMW on server-side
    ○ Provide callback to front-end framework for the server-side Get
  Drop support for on-change flag
  Standardise use of subscription filters and update flags (e.g. immediate update)
  Add header for acquired Data: common metadata (e.g. acq. stamp, cycle name)
  All loss of data (dropped updates) must be notified to clients
New requirement
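
The First-Update and data-header requirements can be illustrated with a minimal sketch. It is not the RDA3 API: `SubscriptionManager`, `AcquiredData`, `server_get` and all field names are hypothetical, chosen only to show the middleware calling a server-side Get callback on subscription and attaching common metadata to every update.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple
import time


@dataclass
class AcquiredData:
    """Value plus common header metadata (hypothetical field names)."""
    value: Any
    acq_stamp: float
    cycle_name: str
    first_update: bool = False


Listener = Callable[[AcquiredData], None]
# Assumed shape of the callback into the front-end framework:
# (device, property) -> (value, cycle name).
GetCallback = Callable[[str, str], Tuple[Any, str]]


class SubscriptionManager:
    """Middleware-side subscription handling; the server only supplies a Get."""

    def __init__(self, server_get: GetCallback) -> None:
        self._server_get = server_get
        self._listeners: Dict[Tuple[str, str], List[Listener]] = {}

    def subscribe(self, device: str, prop: str, listener: Listener) -> None:
        self._listeners.setdefault((device, prop), []).append(listener)
        # First-Update: the middleware itself performs a server-side Get,
        # so a new subscriber always receives the current value first.
        value, cycle_name = self._server_get(device, prop)
        listener(AcquiredData(value, time.time(), cycle_name, first_update=True))

    def notify(self, device: str, prop: str, value: Any, cycle_name: str) -> None:
        # Regular update: every subscriber of this device/property gets the
        # same data object with its header filled in.
        data = AcquiredData(value, time.time(), cycle_name)
        for listener in self._listeners.get((device, prop), []):
            listener(data)
```

A subscriber registered with `subscribe("DEV1", "Setting", callback)` receives one update flagged `first_update=True` immediately, followed by one update per later `notify` call on the same device and property.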

New RDA3: Accepted requirements
Client side
  RDA3 client API connects with both RDA2 (old) & RDA3 (new) servers
  Efficient mechanism for connection, disconnection & reconnection
  Must be able to recover from any interruption of communication with the server
    ○ Server restarts, IP address change, rename/move of a device to another server
  Improved semantics of Array Calls, i.e. handling of individual parameters
  Enhanced diagnostics & collection of statistics
Server side
  Policies for discarding notifications, i.e. deal with overflows and ’bad clients’
    ○ Instrument with counters & timings to diagnose notification delivery
  Prioritisation of Get/Set requests for high-priority clients
  Server-side subscription tree fully managed by CMW
    ○ Server does not need to manage client subscriptions any more
  Manage the client connections, e.g. forced disconnect of a client
  Client lifetime callbacks (i.e. connected, disconnected)
New requirement
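
One possible discard policy for slow clients is sketched below, assuming a bounded per-client queue that drops the oldest update on overflow. The class `ClientNotificationQueue` and its counters are hypothetical; the slides do not state which policies RDA3 implements. The sketch also reports the number of dropped updates with the next delivery, in line with the requirement that all loss of data must be notified to clients.

```python
from collections import deque
from typing import Any, Deque, Optional, Tuple


class ClientNotificationQueue:
    """Bounded per-client queue with a drop-oldest policy (illustrative only)."""

    def __init__(self, capacity: int) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self._capacity = capacity
        self._queue: Deque[Any] = deque()
        # Counters for diagnostics.
        self.enqueued = 0
        self.dropped_total = 0
        # Drops not yet reported to the client.
        self._dropped_unreported = 0

    def push(self, update: Any) -> None:
        if len(self._queue) >= self._capacity:
            # Overflow: discard the oldest pending update and count it.
            self._queue.popleft()
            self.dropped_total += 1
            self._dropped_unreported += 1
        self._queue.append(update)
        self.enqueued += 1

    def pop(self) -> Optional[Tuple[Any, int]]:
        """Return (update, drops since the last delivery), or None if empty."""
        if not self._queue:
            return None
        dropped = self._dropped_unreported
        self._dropped_unreported = 0
        return self._queue.popleft(), dropped
```

Dropping the oldest update keeps the most recent data for a slow client; a drop-newest or disconnect-the-client policy would change only the overflow branch in `push`.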

New RDA3: Accepted requirements
Server side (cont.)
  Client discovery for diagnostic purposes (i.e. connected clients with payload)
  Enhanced diagnostics & collection of statistics
Ongoing discussions (not accepted yet)
  Prioritisation of subscription notifications for high-priority clients
Technical notes
  Invest in asynchronous & non-blocking communication
  Prefer 0-copy & lock-free data structures, message queues
http://wikis/display/MW/Design+of+New+RDA
New requirement

New RDA3: Summary of requirements
Unchanged
  Device/Property model
  Set of basic operations (Get, Set, Subscribe)
Fixes & improvements
  Subscription mechanism
  Connection management
  Diagnostics & statistics
New functionality
  Policies for subscription management (client & server)
  Client priorities
  Server-side subscription tree
  Extended Data support
  Standardise First-Update concept

Technical evaluation of the transport layer

Middleware transport requirements
Desirable
Mandatory
Fundamental
Lightweight
Friendly API, documentation
Request/reply & pub/sub patterns
Open source license
Asynchronous
Active community
Stability, Maturity & Longevity
Performance & Scalability
C++/Java
Linux/Windows
Over TCP/IP LAN

Evaluated middleware products
  Ice
  Thrift
  omniORB
  JacORB
  YAMI
  OpenSplice DDS
  CoreDX
  RTI DDS
  OpenAMQ
  ZeroMQ
  Qpid
  MQTT RSMB
  Mosquitto
  RabbitMQ
All opinions are based only on our knowledge and evaluation. Each of the products, depending on the requirements, may constitute a good solution.
Andrzej Dworak, ICALEPCS 2011

Products comparison (according to the criteria)
Criteria
  Sync, async & msg patterns
  QoS
  Dependencies & memory footprint
  Performance
  Look & feel, API, docs
  Community & maturity
Score
  ZeroMQ   6
  Ice      5
  YAMI4    4
  RTI      3
  Qpid     3
  CORBA    2
  Thrift   2
Andrzej Dworak, ICALEPCS 2011

Conclusions
  Several good middleware solutions are available
  The choice is dictated by the most critical requirements
  Not easy: performance matters, but also ease of use, community, …
  Prototyping was done with the most promising candidates: ZeroMQ, Ice & YAMI
Finally we decided to choose ZeroMQ (http://www.zeromq.org/)
  Asynchronous & non-blocking communication
  0-copy & lock-free data structures, message queues
  Nice API, good documentation & active community

New RDA3 Java – Sync Get round-trip time
Test setup: 1kB message payload, cs-ccr-* machines, 1 server host & 10 client hosts
[Figure: Sync Get round-trip (1kB message payload). x-axis: Number of clients (0 to 1000); y-axis: Round-trip (ms), 0 to 18; series: max, average]

New RDA3 Java – subscription notification latency
Test setup: 1kB message payload, cs-ccr-* machines, 1 server host & 10 client hosts
[Figure: Subscription notification latency (1kB message payload). x-axis: Number of clients (0 to 1000); y-axis: Latency (ms), 0 to 250; series: min, max, average]

New RDA3 Java – subscription notification latency
Test setup: 1kB message payload, cs-ccr-* machines, 1 server host & 10 client hosts
[Figure: Subscription notification latency (a closer look). x-axis: Number of clients (0 to 200); y-axis: Latency (ms), 0 to 6; series: min, max, average]

Changes in the MW Architecture in LS1

Current MW Architecture
[Architecture diagram; labels as shown on the slide]
Legend: User written, Middleware, Central services
Clients: Java Control Programs; VB, Excel, LabView; C++ Programs; Administration console
Client-side layers: JAPC API; Passerelle C++; RDA Client API (C++/Java), Device/Property Model
Transport: CMW Infrastructure, CORBA-IIOP
Central services: Directory Service; RBAC A1 Service; RBAC Service; Configuration Database (CCDB)
Server-side layers: RDA Server API (C++/Java), Device/Property Model; CMW integration (one per server)
Servers: Virtual Devices (Java); PS-GM Server; FESA Server; FGC Server; PVSS Gateway; More Servers
Physical Devices (BI, BT, CRYO, COLL, QPS, PC, RF, VAC, …)

Changes in MW Architecture in LS1
[Architecture diagram; labels as shown on the slide]
Legend: User written, Middleware, Central services, Upgrade in LS1
Clients: Java Control Programs; VB, Excel, LabView; C++ Programs; Administration console
Client-side layers: JAPC API; Passerelle C++; RDA Client API (C++/Java), Device/Property Model
Transport: CMW Infrastructure, ZeroMQ
Central services: Directory Service; RBAC A1 Service; RBAC Service; Configuration Database (CCDB)
Server-side layers: RDA Server API (C++/Java), Device/Property Model; CMW integration (one per server)
Servers: Virtual Devices (Java); PS-GM Server; FESA Server; FGC Server; PVSS Gateway; More Servers
Physical Devices (BI, BT, CRYO, COLL, QPS, PC, RF, VAC, …)

MW Upgrade milestones in 2013

MW Upgrade Milestones in 2013
Milestone                                  Completed by?
RDA3 Java (client/server) (alpha)          June’13
RDA3 C++ server (alpha)                    July’13
RDA3 integration with: FESA, FGC, PVSS     July-Oct’13
RDA3 C++/Java (client/server) validated    September’13
New JAPC release with RDA3 Java            September’13
New FESA3.2 release with RDA3              December’13
Timeline
  July’13         RDA3 C++
  July-Oct’13     Integration with FESA, FGC, PVSS
  September’13    RDA3 validated, New JAPC
  December’13     New FESA3.2
  Winter’13/14    Tests with eqp.
  August’14       End LS1
End-of-Life for RDA2: LS2

MW Upgrade strategy in LS1 and towards LS2
  No BIG-BANG migration, but a gradual one
  Backward-compatible (connection-wise) new RDA3 client library
  New RDA3 clients can communicate with RDA2 & RDA3 servers
  FESA3 will exist with both old RDA2 (FESA3.1) and new RDA3 (FESA3.2)
[Migration diagram; labels as shown on the slide]
Clients: Old JAPC with Old RDA2 client; New JAPC with New RDA3 client
Servers: FESA2.10 with Old RDA2 server; FESA3.1 with Old RDA2 server; FESA3.2 with New RDA3 server
RDA2 RDA3 Gateway
Annotations
  Client apps will migrate during LS1
  Only for justified, exceptional cases
  FEC developers should migrate to FESA3.2 ASAP
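
The connection-wise backward compatibility can be pictured as a client that picks a transport per device. This is a sketch under stated assumptions, not the RDA3 client library: `CompatClient`, the two transport classes and the `directory` mapping (device name to protocol version) are all hypothetical stand-ins, and the transports only return a descriptive string instead of doing network I/O.

```python
from typing import Dict


class Rda2Transport:
    """Stand-in for the old CORBA-based transport (no real I/O here)."""

    def get(self, device: str, prop: str) -> str:
        return f"RDA2 get {device}/{prop}"


class Rda3Transport:
    """Stand-in for the new ZeroMQ-based transport (no real I/O here)."""

    def get(self, device: str, prop: str) -> str:
        return f"RDA3 get {device}/{prop}"


class CompatClient:
    """One client API that can talk to both RDA2 and RDA3 servers."""

    def __init__(self, directory: Dict[str, str]) -> None:
        # Hypothetical lookup result: device name -> "RDA2" or "RDA3".
        self._directory = directory
        self._transports = {"RDA2": Rda2Transport(), "RDA3": Rda3Transport()}

    def get(self, device: str, prop: str) -> str:
        version = self._directory.get(device)
        if version not in self._transports:
            raise LookupError(f"unknown device or protocol for {device!r}")
        # Application code is identical for both server generations;
        # only the transport chosen underneath differs.
        return self._transports[version].get(device, prop)
```

With such a design, client applications can migrate to the new library first, while servers move from RDA2 to RDA3 one by one.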

Conclusions
  We have to replace CORBA with a new solution
  We collected updated user requirements
  MW upgrade will be performed during LS1 (2013-2014)
  Interoperability between RDA2 and RDA3
  Gradual control system migration until LS2 (end-2017)
  End-of-Life for RDA2: LS2