
Bennaceur et al. Journal of Internet Services and Applications (2015) 6:12 DOI 10.1186/s13174-015-0027-3

RESEARCH Open Access

A unifying perspective on protocol mediation: interoperability in the future internet

Amel Bennaceur1*, Emil Andriescu2,3, Roberto Speicys Cardoso3 and Valérie Issarny2

Abstract

Given the highly dynamic and extremely heterogeneous software systems composing the Future Internet, automatically achieving interoperability between software components, without modifying them, is more than simply desirable; it is quickly becoming a necessity. Although much work has been carried out on interoperability, existing solutions have not fully succeeded in keeping pace with the increasing complexity and heterogeneity of modern software, and meeting the demands of runtime support. On the one hand, solutions at the application layer synthesise intermediary entities, mediators, to compensate for the differences between the interfaces of components and coordinate their behaviours, while assuming the use of the same middleware solution. On the other hand, solutions at the middleware layer deal with interoperability across heterogeneous middleware technologies but do not reconcile the differences between components' interfaces and behaviours at the application layer. In this paper we propose a unified approach for achieving interoperability between heterogeneous software components with compatible functionalities across the application and middleware layers. First, we provide a solution to automatically generate cross-layer parsers and composers that abstract network messages into a uniform representation independent of the middleware used. Second, these generated parsers and composers are integrated within a mediation framework to support the deployment of the mediators synthesised at the application layer. More specifically, the generated parser analyses the network messages received from one component and transforms them into a representation that can be understood by the application-level mediator. Then, the application-level mediator performs the necessary data conversion and behavioural coordination. Finally, the composer transforms the representation produced by the application-level mediator into network messages that can be sent to the other component. The resulting unified mediation framework reconciles the differences between software components from the application down to the middleware layers. We validate our approach through a case study in the area of conference management.

Keywords: Interoperability; Cross-layer parsers and composers; Mediation; Dynamic composition; Middleware

1 Introduction

Enabling the composition of functionally-compatible software components regardless of the technology they use and the protocols according to which they interact is a fundamental challenge in the Future Internet [1]. It has been the focus of extensive research, from approaches that identify the causes of interoperability issues and give guidelines on how to address them [2], to approaches that try to automate the application of such guidelines [3]. This challenge is exacerbated when heterogeneity spans the application, middleware, and network layers. At the application layer, components may exhibit disparate data types and operations, and may have distinct business logics. At the middleware layer, they may rely on different communication standards (e.g., SOAP and JMS) which define disparate data representation formats and induce different architectural constraints. Finally, at the network layer, data may be encapsulated differently according to the network technology in place. Heterogeneity at the network layer has partially been solved by convergence to a common standard (i.e., IP - Internet Protocol). In this paper, we focus on achieving interoperability across the application and middleware layers assuming the use of IP at the network layer.

*Correspondence: [email protected]
1 The Open University, Milton Keynes, UK
Full list of author information is available at the end of the article

© 2015 Bennaceur et al. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited.

Middleware provides an abstraction that facilitates the communication and coordination of distributed components despite the heterogeneity of the underlying platforms, operating systems, and programming languages. However, middleware also defines specific message formats and coordination models, which makes it difficult (or even impossible) for applications using different middleware solutions to interoperate. For example, SOAP-based clients deployed on Mac, Windows, and Linux machines can seamlessly access a SOAP-based Web Service deployed on a Windows server. However, a SOAP-based client cannot access a RESTful Web Service. Furthermore, evolving application requirements lead to a continuous update of existing middleware tools and the emergence of new approaches. For example, SOAP has long been the protocol of choice to interface Web services, but RESTful Web services now tend to prevail. As a result, application developers have to juggle a myriad of technologies and tools, and include ad hoc glue code whenever it is necessary to integrate applications implemented using different middleware. Middleware interoperability solutions [3] facilitate this task, either by providing an infrastructure to translate messages into a common intermediary protocol, as is the case for Enterprise Service Buses [4], or by proposing a Domain Specific Language (DSL) to describe the translation logic and to generate corresponding bridges [5]. These solutions, however, provide only an execution framework and still require developers to implement or specify the translations needed to enable the applications to interoperate.

Solutions oriented toward interoperability at the application layer, on the other hand, target higher automation and loose coupling. In particular, they rely on intermediary entities, mediators [6], to enforce interoperability by mapping the interfaces of functionally-compatible components and coordinating their behaviours. Solutions for the synthesis of mediators [7–13] focus on compensating for the differences between the components at the application layer, based on some domain knowledge, but without specifying how to deploy them on top of heterogeneous middleware solutions. As far as we know, only Starlink [14] allows binding application-layer mediators to different middleware solutions. However, Starlink requires the binding to be explicitly described in terms of the structure of messages that need to be sent or received by the components. Furthermore, this description is monolithic and bindings cannot be reused across many applications.

In summary, existing solutions to interoperability have not fully succeeded in dealing with the increasing heterogeneity of components for the following reasons: (i) they deal with middleware heterogeneity while assuming matching application components atop and rely on developers to provide all the translations that need to be made, (ii) they deal with behavioural mismatches at the application layer and generate corresponding mediators but fail to deploy them on top of heterogeneous middleware, or (iii) they deal with both middleware and application interoperability in conjunction but require the complete, low-level specification of message structure using a general purpose DSL.

Furthermore, a critical issue for enabling heterogeneous components to interoperate is the ability to parse messages from the middleware layer into a format that can be handled by application-layer mediators and then concretise back (a.k.a. compose/un-parse) the messages produced by application-layer mediators into middleware-layer messages. We refer to the process of parsing and composing a message as message translation. However, message translation is challenged by the encapsulation of data according to different middleware protocols, e.g., a SOAP message encapsulated within HTTP for Web Services. As a result, implementing message translators requires dealing with multiple message formats and identifying the parts of the message corresponding to each protocol. What is needed is a declarative solution that facilitates the composition of multiple, and potentially heterogeneous, translators while taking into account the data dependencies between the application and middleware layers.

In this paper, we define a unified approach to deal with interoperability at both the application and middleware layers. We focus on client-service systems which are functionally-compatible, that is, at some high level of abstraction the client requires a functionality that is provided by the service but is unable to interact successfully with it due to mismatching interfaces and behaviours. Our key contribution stems from the systematic and rigorous approach to generate complex message translators and their seamless integration with application-layer mediation techniques in order to manage cross-layer data dependencies. More specifically, we make the following contributions:

• Composite Cross-Layer (CCL) message translators. We devise an approach to automate the composition of message translators, called CCL message translators, that are able to process messages sent or received by software components implemented using different middleware solutions. We generate the message translators based on a declarative high-level specification that: (i) reuses implementations of message translators for legacy protocols (e.g., HTTP, SOAP, CORBA), (ii) easily integrates with interface-description and serialisation languages (e.g., WSDL, XSD, ASN.1), and (iii) builds upon format-specific reverse-engineering tools (e.g., XML learning).

• A unified mediation framework. In previous work [13], we developed an approach based on ontology reasoning and constraint programming to synthesise application-layer mediators automatically. We build upon this approach and extend it with CCL message translators to provide a unified mediation framework that deals with interoperability at both the application and middleware layers. This framework is capable of generating composite message translators as well as synthesising application-layer mediators, which are deployed over a dedicated mediator engine.

• Implementation and experimentation with a real-world scenario. To validate our approach, we implemented a prototype tool and experimented with it on heterogeneous conference management systems. Conference management systems provide various services such as ticketing, attendee management, and payment to organise events like conferences, seminars and concerts. Nevertheless, it is sometimes necessary to interact with different conference management systems. This is the case for Ambientic (http://www.ambientic.com/), which develops mobile software in the domain of Event Management (expos, trade shows, exhibitions, conferences). Depending on the event, organisers may choose to rely on different conference management systems. Our solution helps Ambientic integrate with different conference management systems transparently.

This paper is organised as follows. Section 2 introduces the interoperable conference management example, which we use throughout the paper to motivate and illustrate our mediation approach. Section 3 presents the proposed unified mediation framework that enables the generation of CCL message translators and their integration with mediator synthesis at the application layer. Section 4 then surveys state of the art solutions to the generation and composition of message translators, thereby highlighting the variety of atomic message translators that need to be composed within CCL translators. The latter is the focus of Section 5, which details our approach to the generation of CCL translators by reusing and composing legacy ones. Section 6 describes a prototype implementation of the unified mediation framework and reports on the experiment we conducted using the interoperable conference management example. Finally, Section 7 concludes the paper and discusses future work.

2 The interoperable conference management example

To motivate and illustrate our approach, we consider the Ambientic application for event management, called U-Event (see Fig. 1). U-Event embeds a client component implemented as an Amiando client (http://developers.amiando.com/). U-Event needs to coordinate with a functionally-compatible service, Regonline (http://developer.regonline.com/). Instead of re-implementing the client component, the integration of the U-Event app with Regonline relies on our unified mediation framework.

Fig. 1 Interoperability scenario (diagram labels: U-Event, Event Management Application, Client Component, Mediator, Conference Management; operations EventFind, EventRead, Login, GetEvents; protocols HTTP, SOAP/HTTP)

In the following, we examine the challenges of enabling the Amiando client and the Regonline service to interoperate. The complete description of both systems is beyond the scope of this paper as they define more than 50 operations each. We thus concentrate on the following interaction: the client component must obtain a list of conferences based on keywords found in their title, and browse the information (such as dates or registration fees) of the obtained conferences. Amiando clients have to send an EventFind request containing the keywords to query. For security purposes, each Amiando client is assigned a unique and fixed ApiKey which must be included in every interaction with the service. The EventFind response includes a list of conference identifiers. To get some information about a conference, clients issue an EventRead request with the event identifier as a parameter. To produce the equivalent result, a Regonline client must first invoke the Login operation in order to obtain a session identifier ApiToken, which must be included in all subsequent requests. The Regonline client then sends a GetEvents request, which includes a Filter argument specifying the keywords to search for. The client gets in return the list of conferences matching the search criteria including their details. Both Amiando and Regonline are based on the request/response paradigm, i.e., the client issues a request which includes the appropriate parameters and the server returns the corresponding response. However, Amiando is developed according to the REST architectural style, uses HTTP as the underlying communication protocol, and relies on JSON (http://www.json.org) for data formatting. On the other hand, Regonline is implemented using SOAP, which implies using WSDL (http://www.w3.org/TR/wsdl) to describe the application interface, and is further bound to the HTTP protocol. Although the client component, which is an Amiando client, requires some functionality provided by the Regonline service, it is unable to interact with it because of the mismatches described in the following.

Application-layer mismatches. To interoperate, components have to agree on the syntax and semantics of the operations they require and provide together with the associated input and output data. However, the same concepts (e.g., conferences, tickets, and attendees) may be expressed using different data types. To enable the components to interoperate, the data need to be converted in order to meet the expectations of each component. For example, to search for a conference with a title containing a given keyword, the Amiando client simply specifies the keyword in the title parameter, which is of type String. The Regonline GetEvents operation has a Filter argument used to specify the keywords to search for and which is also of type String. However, contrary to what the WSDL description suggests, the Regonline developer documentation specifies that this String field is in fact a C# expression and can contain some .NET framework method calls (such as Title.contains("keyword")), which is incompatible with the Amiando search string. The granularity and sequencing of operations are also very important. For example, the GetEvents operation of Regonline returns a list of conferences with the corresponding information. To get the same result in Amiando, two operations need to be performed: EventFind, followed by EventRead.
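As a minimal sketch of the data conversion described above, the following Python function wraps an Amiando title keyword in the expression form quoted from the Regonline documentation. The function name and the escaping rule are assumptions made for illustration; the source does not specify how special characters in the Filter expression must be escaped.

```python
def amiando_title_to_regonline_filter(keyword: str) -> str:
    """Turn a plain Amiando title keyword into a Regonline-style Filter string.

    Assumption: backslashes and double quotes must be escaped so that the
    keyword remains a single string literal inside the expression.
    """
    escaped = keyword.replace("\\", "\\\\").replace('"', '\\"')
    return f'Title.contains("{escaped}")'


if __name__ == "__main__":
    print(amiando_title_to_regonline_filter("middleware"))
```

The reverse direction (recovering a keyword from an arbitrary Filter expression) is not a simple string operation, which is one reason such conversions are placed in a mediator rather than in either component.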

Middleware-layer mismatches. Amiando is based on REST while Regonline is based on SOAP. Messages generated by both systems are incompatible and must be translated to allow them to interoperate. Moreover, the mechanisms provided by each middleware to describe the application interface are different: while SOAP-based Web Services rely on a standard interface description language (WSDL) to describe operations, there is no standard description language for RESTful services, although JSON is widely used, and in particular by Amiando.

Cross-layer data mismatches. Even though application and middleware layers are conceptually separate, in real-world scenarios the boundaries between them are ill-defined. This is due to multiple factors such as performance optimisation, simplified development or bad design decisions. For example, the Login operation of Regonline returns an ApiToken value, which is application-specific data. However, instead of including this token in subsequent operations at the application-layer encapsulation, it is inserted in the HTTP message header (i.e., part of the middleware layer) as an optional field.

This example, although simple, demonstrates many problems that are faced by developers, and suggests why existing interoperability approaches still fall short in achieving interoperability. What is needed is a unified approach to interoperability that brings together and enhances the solutions that tackle interoperability at the application and middleware layers, and automates the generation of message translators and mediators.
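The cross-layer dependency on ApiToken can be illustrated with a small Python sketch. It composes a SOAP-over-HTTP request in which the Filter travels in the SOAP body while the token travels in an HTTP header, and then parses the message back into one flat abstract message. The endpoint path, host, header name and XML element names are hypothetical placeholders, not the actual Regonline wire format.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace


def compose_get_events(abstract_msg: dict) -> str:
    """Compose a concrete message: Filter in the SOAP body, token in an HTTP header."""
    body = (
        f'<soap:Envelope xmlns:soap="{SOAP_NS}"><soap:Body><GetEvents><Filter>'
        f"{escape(abstract_msg['Filter'])}"
        "</Filter></GetEvents></soap:Body></soap:Envelope>"
    )
    head = [
        "POST /hypothetical/endpoint HTTP/1.1",  # placeholder path
        "Host: example.invalid",                 # placeholder host
        "Content-Type: text/xml; charset=utf-8",
        f"Content-Length: {len(body.encode('utf-8'))}",
        f"ApiToken: {abstract_msg['ApiToken']}",  # hypothetical header name
    ]
    return "\r\n".join(head) + "\r\n\r\n" + body


def parse_get_events(raw: str) -> dict:
    """Parse both layers and merge their fields into one abstract message."""
    head, body = raw.split("\r\n\r\n", 1)
    headers = dict(line.split(": ", 1) for line in head.split("\r\n")[1:])
    filter_text = ET.fromstring(body).find(".//Filter").text
    return {"ApiToken": headers["ApiToken"], "Filter": filter_text}


if __name__ == "__main__":
    msg = {"ApiToken": "abc123", "Filter": 'Title.contains("middleware")'}
    print(parse_get_events(compose_get_events(msg)) == msg)
```

The point of the sketch is that neither an HTTP translator nor a SOAP translator alone can produce or consume the full abstract message; the two have to be combined with knowledge of where each application-level field lives.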

3 A unified mediation framework

We aim at providing a unified approach to support interoperability between functionally-compatible client-service systems by mediating their protocols from the application down to the middleware layers. Figure 2 depicts the main elements of our unified mediation framework where those with grey background are automatically synthesised. The framework revolves around two key elements: CCL translator generator and synthesis of application-layer mediators.

CCL translator generator: enables fast design of complex message translators while requiring minimal development effort by reusing existing implementations of atomic message translators, when available. Figure 2 depicts the main elements relating to the CCL translator generator:

• Atomic message translators transform one message format into an Abstract Syntax Tree (AST). An AST is a tree representation of the abstract syntactic structure of a protocol message. Each node of the tree denotes a data field of the message, and may contain metadata of the field. ASTs are a format commonly used in message translation and middleware technology. Atomic message translators are reused and composed in order to generate CCL message translators.

Fig. 2 A unified mediation framework (diagram: Input, Synthesis and Execution phases; inputs: Behaviour description, Domain Ontology, Message Model, Atomic Message Translators; synthesis: Synthesis of Application-layer Mediators, CCL Translator Generator, Abstract Message Schemas; execution: Mediator Engine with Application-layer Mediator, CCL Translators, Abstract Messages, Concrete Messages and Communication Manager over a TCP/IP Network; legend: automatically synthesized element)

• Message Model defines the strategy for assembling Atomic message translators in order to deal with the data encapsulation in different middleware solutions and cross-layer data dependencies. The message model also includes annotations that are integrated in the generated Abstract Message Schemas. Each rule or annotation in the Message Model is applied to an Atomic message translator at a particular node of its AST structure to solve or to annotate a cross-layer data dependency.

• Abstract Message Schemas is an abstract description of the component's interface that facilitates the synthesis of application-layer mediators. This schema composes and refines the AST schemas of a set of Atomic message translators. Abstract messages of a generated CCL message translator validate the generated Abstract Message Schema.
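The relationship between atomic translators, ASTs and the Message Model can be sketched as follows. This is a simplified Python illustration under stated assumptions, not the paper's implementation: the `Node` class, the two toy parsers, the rule format `(path, sub_parser)` and the request path are all hypothetical. Each rule names an AST node of one atomic translator whose raw value should be refined by another atomic translator.

```python
import json
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class Node:
    """One AST node: a labelled data field with an optional value and children."""
    label: str
    value: Optional[str] = None
    children: List["Node"] = field(default_factory=list)

    def child(self, label: str) -> "Node":
        return next(c for c in self.children if c.label == label)


def parse_http(raw: str) -> Node:
    """Toy atomic translator (parsing half only): HTTP-like text to AST."""
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    headers = [Node(k, v) for k, v in (ln.split(": ", 1) for ln in lines[1:])]
    return Node("http", children=[
        Node("start-line", lines[0]),
        Node("headers", children=headers),
        Node("body", body),
    ])


def parse_json(text: str) -> Node:
    """Toy atomic translator: JSON text to AST."""
    def build(label: str, obj) -> Node:
        if isinstance(obj, dict):
            return Node(label, children=[build(k, v) for k, v in obj.items()])
        if isinstance(obj, list):
            return Node(label, children=[build(str(i), v) for i, v in enumerate(obj)])
        return Node(label, str(obj))
    return build("json", json.loads(text))


# Hypothetical message model: refine the HTTP "body" node with the JSON translator.
Rule = Tuple[Tuple[str, ...], Callable[[str], Node]]
MESSAGE_MODEL: List[Rule] = [(("body",), parse_json)]


def ccl_parse(raw: str, root_parser: Callable[[str], Node], model: List[Rule]) -> Node:
    """Apply the root translator, then each rule at the AST node it designates."""
    ast = root_parser(raw)
    for path, sub_parser in model:
        node = ast
        for label in path:
            node = node.child(label)
        sub_ast = sub_parser(node.value)
        node.children, node.value = [sub_ast], None
    return ast


if __name__ == "__main__":
    raw = ('POST /hypothetical/find HTTP/1.1\r\n'
           'Content-Type: application/json\r\n\r\n{"title": "middleware"}')
    ast = ccl_parse(raw, parse_http, MESSAGE_MODEL)
    print(ast.child("body").child("json").child("title").value)
```

Keeping the composition strategy in data (the rule list) rather than in glue code is what allows the same atomic translators to be reused across applications.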

Synthesis of application-layer mediators: is responsible for generating application-layer mediators based on the description of the interfaces and behaviours of the components involved, together with the associated domain ontology. The interfaces of the components are described using Abstract Message Schemas. The behaviour of a component then describes the ordering of the messages sent or received by this component in order to interact with other components in the environment. The behaviour of a component may be automatically extracted using automata learning techniques [15–18].
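A behaviour description of this kind can be pictured as a small automaton over message names. The Python sketch below encodes one plausible reading of the two behaviours from Section 2; the state names, the choice of final states and the dictionary layout are assumptions made for illustration, and the paper's actual behaviour formalism may differ.

```python
# Regonline side: Login must come first, then any number of GetEvents.
REGONLINE = {
    "initial": "s0",
    "final": {"s1"},
    "transitions": {("s0", "Login"): "s1", ("s1", "GetEvents"): "s1"},
}

# Amiando client side: EventFind first, then any number of EventRead.
AMIANDO_CLIENT = {
    "initial": "c0",
    "final": {"c1"},
    "transitions": {("c0", "EventFind"): "c1", ("c1", "EventRead"): "c1"},
}


def accepts(behaviour: dict, trace: list) -> bool:
    """Check whether a sequence of message names is allowed by the behaviour."""
    state = behaviour["initial"]
    for action in trace:
        key = (state, action)
        if key not in behaviour["transitions"]:
            return False
        state = behaviour["transitions"][key]
    return state in behaviour["final"]


if __name__ == "__main__":
    print(accepts(REGONLINE, ["Login", "GetEvents"]))
    print(accepts(REGONLINE, ["GetEvents"]))
```

Comparing the two automata makes the sequencing mismatch explicit: the client never sends Login, so a mediator has to insert it.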

The synthesis of application-layer mediators has been the subject of much work, as surveyed in [19]. In their seminal paper, Yellin and Strom [7] propose an algorithm for the automated synthesis of mediators based on predefined correspondences between messages. By considering the semantics of actions, Vaculín et al. [20] are able to infer the correspondences between messages automatically. To generate the application-layer mediator, they generate all requester paths and find the appropriate mapping for each path by simulating the provider process. Cavallaro et al. [21] also consider the semantics of data and rely on model checking to identify mapping scripts between interaction protocols automatically. Nevertheless, they propose to perform the interface mapping beforehand so as to align the actions of both systems. However, many mappings may exist and should be considered during the mediator generation. Indeed, the interface and behavioural descriptions are inter-related and should be considered in conjunction. Moreover, they focus on the mediation at the application layer assuming the use of Web services for the underlying middleware. Finally, Inverardi and Tivoli [11] propose an approach to compute a mediator that composes a set of pre-defined patterns in order to guarantee that the interaction of components is deadlock-free.

The aforementioned research initiatives have made excellent contributions. However, in environments where there is little or no knowledge about the components that are going to meet and interact, the generation of suitable mediators must happen at runtime, whereas in all these approaches, the mediator models or some mediation strategies and patterns are known a priori and applied at runtime. We have specifically developed a solution combining ontology reasoning and constraint programming to synthesise application-layer mediators automatically [13]. In the following, we briefly describe the gist of this approach; details can be found elsewhere [13]. Our main goal in this paper is to define a unified approach that deals with mediation at both the application and middleware layers rather than to present the synthesis of application-layer mediators.

To synthesise an application-layer mediator, we begin by identifying the semantic correspondence between the messages sent by one component and those expected by the other component, that is, interface matching. Indeed, a significant role of the mediator is to convert data available on one side and make it suitable and relevant to the other. This conversion can only be carried out if the data enclosed in the expected messages can be safely produced from the data embedded in the received messages. We formulate interface matching as a constraint satisfaction problem and use constraint programming to solve it. We further incorporate the use of ontology reasoning within constraint solvers by defining an encoding of the ontology relations using arithmetic operators supported by widespread solvers. For each identified correspondence, we generate an associated matching process that performs the necessary data conversions between the actions of the components' interfaces. Then, we analyse the behaviours of components so as to generate the application-layer mediator that combines the matching processes in a way that guarantees that the components progress and reach their final states without errors.

Figure 3 illustrates the data conversion and behavioural coordination performed by the application-layer mediator that enables the Amiando client and the Regonline service to interoperate. The application-layer mediator intercepts the EventFind request sent by the Amiando client and transforms it into two invocations: Login and GetEvents. It generates the EventFind response based on the GetEvents response, which it caches, and is then able to produce the responses to the subsequent EventRead invocations. The reason is that the GetEvents response includes a list of events with their details while each EventRead requires only one event.

As depicted in Fig. 2, the Mediator Engine enables the components to interoperate by executing the synthesised application-layer mediator while relying on the generated CCL message translators to deliver the messages in the expected formats. The Communication Manager keeps track of all network connections and pending message receptions. It supports several IP transport protocols including TCP, UDP, TLS/SSL, and SOCKS.
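The coordination just described for the Amiando client and the Regonline service can be sketched as a hand-written Python class. In the paper the mediator is synthesised automatically; the code below only mimics its observable behaviour at the level of abstract messages. `FakeRegonline`, its method names, the event field names and the sample event are hypothetical stand-ins, not the real Regonline API or data.

```python
class FakeRegonline:
    """Stand-in service used only for this sketch; records the calls it receives."""

    def __init__(self):
        self.calls = []

    def login(self) -> str:
        self.calls.append("Login")
        return "token-1"  # placeholder ApiToken

    def get_events(self, api_token: str, filter_expr: str) -> list:
        self.calls.append("GetEvents")
        # Placeholder result: one event, including its details.
        return [{"ID": "42", "Title": "Example Conference", "StartDate": "2000-01-01"}]


class Mediator:
    """Maps EventFind/EventRead (client side) onto Login/GetEvents (service side)."""

    def __init__(self, service):
        self.service = service
        self.token = None
        self.cache = {}

    def event_find(self, api_key: str, title: str) -> list:
        # The Amiando ApiKey has no counterpart on the Regonline side and is unused.
        if self.token is None:
            self.token = self.service.login()
        events = self.service.get_events(self.token, f'Title.contains("{title}")')
        # Cache full details so that later EventRead requests need no service call.
        self.cache = {event["ID"]: event for event in events}
        return list(self.cache)  # EventFind returns identifiers only

    def event_read(self, api_key: str, event_id: str) -> dict:
        return self.cache[event_id]


if __name__ == "__main__":
    service = FakeRegonline()
    mediator = Mediator(service)
    for event_id in mediator.event_find("client-key", "Example"):
        print(mediator.event_read("client-key", event_id)["Title"])
    print(service.calls)
```

Serving EventRead from the cache is a design choice that follows from the granularity mismatch: GetEvents already returned the details, so no further service invocation is needed.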

Fig. 3 Illustrating the mediator between an Amiando client and the Regonline service. Participants: AmiandoClient, Mediator, RegonlineService. Messages, in order: EventFind request, Login request, Login response, GetEvents request, GetEvents response, EventFind response, EventRead request, EventRead response. Mediator annotations: translate EventFind request into Login request followed by GetEvents; translate GetEvents response into EventRead response and cache; translate cache into EventRead response.

4 Message parsing and composition: state of the art and analysis

In order to deal with cross-layer message translation and generate CCL message translators, it is essential to have a good knowledge and understanding of the various message syntaxes encountered in middleware technologies as well as the issues that can arise when composing heterogeneous message translators. In general, a message translator performs two functions: (i) parsing a stream of bits or characters representing a network message in order to produce an AST, and (ii) processing an AST to produce a network message in the format expected by a given component. Most existing approaches focus on the parsing problem, which is, in the general case, the harder of the two. In this section, we present and analyse existing approaches for generating and composing message translators.
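The two functions of a message translator can be pictured as a pair of inverse operations. The following Python sketch is only an illustration under stated assumptions: the `Ast` class, the `KeyValueTranslator` name and the `k1=v1;k2=v2` wire format are invented for the example and do not correspond to any protocol discussed in this paper.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class Ast:
    """Minimal AST node: a label plus named fields (hypothetical structure)."""
    label: str
    fields: Dict[str, Any] = field(default_factory=dict)


class KeyValueTranslator:
    """Toy translator for an invented 'k1=v1;k2=v2' message format."""

    def parse(self, data: bytes) -> Ast:
        # Function (i): network message -> AST.
        fields = {}
        for pair in filter(None, data.decode("utf-8").split(";")):
            key, _, value = pair.partition("=")
            fields[key] = value
        return Ast("KeyValueMessage", fields)

    def compose(self, ast: Ast) -> bytes:
        # Function (ii): AST -> network message.
        return ";".join(f"{k}={v}" for k, v in ast.fields.items()).encode("utf-8")


translator = KeyValueTranslator()
message = b"op=GetEvents;orderBy=date"
ast = translator.parse(message)
round_trip = translator.compose(ast)
```

For this toy format, composing the parsed AST reproduces the original bytes; real formats may only round-trip up to insignificant differences such as whitespace.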

4.1 Survey of message parsing approaches

There exists a plethora of approaches to build message parsers: some are optimised for low bandwidth overhead (e.g., Google’s mechanism for serialising structured data known as Protocol Buffers), and others are specifically designed to facilitate interoperability (e.g., by using standard data serialisation formats). The forms in which parsers are available also differ: parsers can be precompiled components, or high-level descriptions in a domain-specific language. In Fig. 4, we distinguish four major classes of approaches to build message parsers.

Custom-made parsers. These are parsers implemented in an ad hoc manner using a general purpose programming language. The composition of such parsers often requires adding “glue code” to adapt the data and interactions between the individual parsers.

CFG-based generation of parsers. An efficient alternative to implementing custom-made parsers is provided by parser generators (e.g., Yacc, Bison, ANTLR). Parser generators transform a user-provided Context-Free Grammar (CFG) into an executable component, which parses an input according to the specification given. While parser generators allow the extensible or even incremental [22] generation of parsers, they lack the ability to integrate and compose already existing parser implementations. The problem of parser composition in the context of CFG-based generators has already been addressed by Schwerdfeger et al. [23, 24] with a precise focus on extensible programming languages. Combinatory parsing [25] offers a set of primitive operations for modular parser composition. These operations can define parser composition with respect to the parser’s input (e.g., sequential composition, alternative parsing, optional parsing and repetition), or by applying a transformation to a parser’s output (i.e., result conversion). However, CFGs are a non-compositional formalism in the sense that compositions require in-depth modification of the base CFG derivation rules.

Fig. 4 Parser design methods. Axes: expressive power (fixed encoding, common encoding patterns, context-free languages) and degree of automation. Classes shown: custom-made, CFG-based generation of parsers, DSL-based generation of parsers, IDL-based serialisers.
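The primitive operations of combinatory parsing (sequential composition, alternative parsing, optional parsing, repetition and result conversion) can be sketched as higher-order functions. The Python code below is a generic illustration of the technique, not the library of [25]; all function names are assumptions chosen for the example.

```python
from typing import Any, Callable, Optional, Tuple

# A parser maps an input string to (result, remaining input), or None on failure.
Parser = Callable[[str], Optional[Tuple[Any, str]]]


def literal(expected: str) -> Parser:
    def parse(text):
        if text.startswith(expected):
            return expected, text[len(expected):]
        return None
    return parse


def seq(first: Parser, second: Parser) -> Parser:
    """Sequential composition: run `second` on what `first` left over."""
    def parse(text):
        r1 = first(text)
        if r1 is None:
            return None
        r2 = second(r1[1])
        if r2 is None:
            return None
        return (r1[0], r2[0]), r2[1]
    return parse


def alt(first: Parser, second: Parser) -> Parser:
    """Alternative parsing: try `first`, fall back to `second`."""
    def parse(text):
        result = first(text)
        return result if result is not None else second(text)
    return parse


def optional(parser: Parser) -> Parser:
    """Optional parsing: succeed with None when `parser` fails."""
    def parse(text):
        result = parser(text)
        return result if result is not None else (None, text)
    return parse


def many(parser: Parser) -> Parser:
    """Repetition: apply `parser` zero or more times."""
    def parse(text):
        values = []
        while True:
            result = parser(text)
            if result is None or result[1] == text:  # failure or no progress
                return values, text
            values.append(result[0])
            text = result[1]
    return parse


def convert(parser: Parser, fn: Callable[[Any], Any]) -> Parser:
    """Result conversion: transform the output of `parser`."""
    def parse(text):
        result = parser(text)
        return None if result is None else (fn(result[0]), result[1])
    return parse


# Usage: a parser for non-empty binary numerals, converted to integers.
bit = alt(literal("0"), literal("1"))
binary_number = convert(seq(bit, many(bit)),
                        lambda pair: int(pair[0] + "".join(pair[1]), 2))
```

With these definitions, `binary_number("101 rest")` yields the value 5 together with the unparsed remainder, which is what makes further sequential composition possible.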

DSL-based generation of parsers. DSLs can be used by experts to specify parsers for complex message formats at a higher abstraction level, and in a more compact way, than CFGs. Solutions for generating parsers based on a DSL specification [14, 26, 27] focus on enabling interoperability of already existing systems. However, they are usually associated with a specific kind of message encoding pattern (e.g., text-based, XML, type-length-value encoding, etc.), and thus have a limited expressive power. Further, such approaches are not future proof as more message formats are expected to emerge, which will not be accounted for by DSLs that are defined according to known message encoding patterns. In addition, DSL-based solutions do not support composition and require messages to be defined in a monolithic way, which can easily become unmanageable for complex protocols.

IDL-based serialisers. A different class of approaches for parser generation uses an Interface Description Language (IDL) that allows users to describe abstract structures of data using the IDL’s type system. The description is passed to a compiler that generates source code, or compiled components, capable of serialising and deserialising messages to and from the described data format. A major deficiency of IDL-based approaches (e.g., ASN.1, Protocol Buffers, CORBA OMG IDL, etc.) is that, while they can define an arbitrary data format, they usually support a fixed message encoding mechanism (or, in the case of ASN.1, a small set of them). Therefore, we view serialisers as a specific case of message parsers, with limited expressive power relative to the serialised message format (lower expressive power than DSL-based parsers). To facilitate the integration of serialisers with other systems, development environments, such as CORBA-based ones, provide an IDL mapping (http://www.omg.org/spec/) to data types (e.g., objects, lists, associative arrays, etc.) of various programming languages (e.g., Java, C++, Python, etc.). The mapping is supported by a separate IDL compiler for each programming language. Given the above limitation on message formats, the serialiser composition methods are also limited to IDL message-type encapsulation. While this method enables the reuse of IDL descriptions, it cannot be used to integrate other types of parsers.

All the aforementioned approaches are specific to a single parsing method. These are insufficient for the case of composite message parsing as, in real life situations, protocol stacks may use a mix of message formats that originate from custom-made, CFG, DSL and IDL generators. Hence, message parser composition must deal with the composition of heterogeneous message syntaxes and, in turn, of heterogeneous parsers.

4.2 Composing heterogeneous message syntaxes

In a composite message format, ambiguity can occur between the encapsulating (outer) message format, called host, and the encapsulated message format, called extension. Parsing ambiguity is known to be a theoretically hard problem [28] but in communication protocols, several solutions are commonly implemented to deal with it:

• Context-aware parsing [23, 29] refers to methods and algorithms in which the scanner uses contextual information to disambiguate lexical syntax. This functionality allows a parser to carry out an alternative interpretation for the extension message. When the extension message is instead ignored by the host parser, it can later be parsed by a second parser dedicated to that part of the message (e.g., CDATA escape sections in XML documents). The context change may be triggered by different mechanisms like escape strings (or characters), or implicitly at predefined locations (e.g., SOAP envelope messages can only contain XML extensions, which may only be placed inside the <head> or <body> elements).

• Lexical disambiguation. Escape characters or character replacement can be used to resolve conflicts between the grammars of the host and extension. This method allows input lexemes (i.e., character sequences) from the base language to also appear in the extension language without causing ambiguities, which would otherwise result in parsing errors. For example, the string Hello <World> can be transformed into Hello &lt;World&gt; to disambiguate it from XML markup syntax.

• Re-encoding. Extension messages may also be entirely transformed into a different representation that does not conflict with the host syntax. This transformation can be done (i) by the host parser, in which case the behaviour is similar to escape sequences, (ii) by the extension parser, or (iii) by a separate component. For example, the Simple Mail Transfer Protocol (SMTP), which uses only text-encoded messages, uses Base64 binary-to-text re-encoding to include binary data within SMTP messages.
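Lexical disambiguation and re-encoding can both be tried out with the Python standard library. The snippet below is a small illustration under stated assumptions: the `<note>` host element and the sample payloads are invented for the example, and only `html.escape`, `html.unescape` and the `base64` module are real library calls.

```python
import base64
from html import escape, unescape

# Lexical disambiguation: escape markup characters of the extension text
# so that it cannot be confused with the XML host syntax.
extension_text = "Hello <World>"
escaped = escape(extension_text)            # 'Hello &lt;World&gt;'
host_message = f"<note>{escaped}</note>"    # hypothetical host element
recovered_text = unescape(escaped)

# Re-encoding: turn arbitrary bytes into a text-only representation that
# cannot conflict with a text-based host syntax (as SMTP does with Base64).
binary_payload = bytes([0, 255, 60, 62])    # includes the bytes for '<' and '>'
encoded = base64.b64encode(binary_payload).decode("ascii")
decoded = base64.b64decode(encoded)
```

In both cases the transformation is reversible, so the extension parser (or a separate component) can restore the original data once the host parser has isolated it.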

In the following, we present the classes of syntax composition based on the principles of context-aware parsing. Figure 5 shows a schematic example for each class.

• Sequential inclusion. It is common in many protocols (e.g., protocols of the TCP/IP stack, HTTP, etc.) to compose messages by simply arranging the content in a sequential manner (e.g., one parser analyses a part of the input, and returns the remaining part in its result). In Fig. 5(a), we observe that the parsing context delimited by a ... a (corresponding to the host syntax) ends before the parsing context delimited by b ... b (corresponding to the extension syntax) begins.

• Bounded stratified inclusion. A middleware protocol parser is syntactically “unaware” of encapsulated messages, which are treated as a collection of binary data or arbitrary character strings. Because of this containment property, we can state that whenever two message parsers are composed to handle an encapsulated message format, they specialise (or restrict) the set of messages initially accepted. Thus, bounded stratified inclusion is a special case of syntax composition, which may only restrict the expressiveness of the base language, in the same sense explained by Cardelli et al. in [30]. Figure 5(b) illustrates such an example, where data associated with an extension syntax (shown in blue) is included at a specific point in the data of a message associated with a host syntax (shown in white). Although the context b ... b is included in the context a ... a, both are properly delimited such that this message may be parsed even in the presence of lexical ambiguity between tokens of the host message and tokens of the encapsulated message.

• Bounded composition represents a generalisation of bounded stratified inclusion where the parsing context is not strictly delimited. This means that lexemes from the host syntax can appear alongside lexemes of the extension syntax. Sections of the data block where this composed syntax is used (exemplified using hatched blue in Fig. 5(c)) can be parsed neither by the host parser, nor by the extension parser. Unlike bounded stratified inclusion, bounded composition may include both syntax extensions and syntax restrictions. Syntax extensions expand the initial language with new message types, while syntax restrictions introduce intentional limitations on the expressiveness of a language.

• Mutually-recursive syntax composition refers to the case where the syntax of two distinct message formats can mutually be included inside one another. A technique commonly used to support this case of composition is recursive descent parsing (in particular implemented by parser combinators [25, 31]), where a composed parser is defined from a set of mutually recursive procedures. This class of syntax composition has been extensively studied in the domain of extensible programming languages [23, 24], where parser composition extends the syntax of a host programming language, for instance Java (e.g., context a ... a in Fig. 5(d)), with another syntax, such as SQL (e.g., context b ... b). Intuitively, the syntax is mutually-recursive because SQL queries can appear within Java expressions, and, at the same time, Java expressions can appear within SQL queries, allowing an unbounded chain of compositions. The same cannot be said about messages exchanged by protocol stacks where mutually-recursive compositions are unlikely given the fixed number of layers.

Fig. 5 Illustrative example of the four classes of syntax composition: (a) sequential inclusion, (b) bounded stratified inclusion, (c) bounded composition, (d) mutually-recursive composition. Legend: data block, host syntax, extension syntax, composed syntax, parsing context boundaries.

As far as we know, in existing protocol stacks, messages are encapsulated either using (a) sequential inclusion, or (b) bounded stratified inclusion. In [32], we further show that for these cases, heterogeneous parsers can be composed as black box functions (i.e., without requiring in-depth modification of the already existing parsers).
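Black-box composition can be illustrated with two unrelated parsers: a hand-written host parser that treats the message body as opaque bytes, and an off-the-shelf extension parser applied to that body afterwards. The Python sketch below is an assumption-laden illustration, not the composition mechanism of [32]: `parse_host` is a deliberately simplified HTTP-like parser and `compose_parsers` is a hypothetical helper; only `json.loads` is a real library function.

```python
import json


def parse_host(data: bytes) -> dict:
    """Simplified HTTP-like host parser: start line, headers, opaque body."""
    head, _, body = data.partition(b"\r\n\r\n")
    lines = head.decode("ascii").split("\r\n")
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    # The host parser does not look inside the body: it stays raw bytes.
    return {"start": lines[0], "headers": headers, "body": body}


def compose_parsers(host, extension, field):
    """Chain two parsers without modifying either of them."""
    def parse(data):
        ast = host(data)
        ast[field] = extension(ast[field])  # parse the encapsulated message
        return ast
    return parse


parse_message = compose_parsers(parse_host, json.loads, "body")

raw = (b"POST /events HTTP/1.1\r\n"
       b"Content-Type: application/json\r\n"
       b"\r\n"
       b'{"title": "ICSE"}')
ast = parse_message(raw)
```

Neither parser knows about the other; the composition only relies on the host parser exposing the encapsulated data as a well-delimited field of its output.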

4.3 Atomic message translators

The Atomic message translators that can be used as input for composition are either Legacy (i.e., re-using an existing implementation) or Generated (i.e., generated at design-time).

Legacy Atomic message translators are appropriate for middleware protocols given that they are based on industry-wide standards, with reference implementations widely available, and are unlikely to change frequently.

Generated Atomic message translators are useful for application-specific protocols, where changes in message structure are frequent. Generated Atomic message translators are further categorised depending on the availability of a message description language: DSL and IDL-based, and Inferred. As the name suggests, some message formats can indeed be inferred automatically. This is the case when protocols represent/encode data using an extensible serialisation format (e.g., JSON, YAML) or an encoding format (e.g., ASN.1 at the syntactical level with BER at the lexical level, or XSD at the syntactical level with XML at the lexical level; see endnote a). For this case to be applicable in a protocol mediation scenario, we require a set of Concrete Message Samples that are used as input for type inference. In our experimental implementation, we rely on the tool Trang (http://www.thaiopensource.com/relaxng/trang.html) that can infer a schema from XML, JSON or other similar serialisation formats. Based on this schema, we automatically generate the corresponding syntactical parsers.

In the above, we make the assumption that parsers output ASTs using a uniform format that can be manipulated. In our implementation, we reduce the scope to object-oriented parser implementations. This is because AST instances represented as Objects may be examined or even manipulated at runtime using reflection and bytecode manipulation and may be easily serialised to other formats, like XML.

Assuming that all necessary Atomic message translators (either inferred, generated or off-the-shelf) for the mediated applications are available, we generate a set of CCL message translators corresponding to the set of message types exchanged. In the Amiando client to Regonline service scenario, the set of Atomic message translators contains: (a) legacy message translators for HTTP and SOAP, (b) custom XML parsers generated from the WSDL/XSD description provided by Regonline, and (c) custom JSON parsers (see endnote b) for Amiando inferred using a set of pre-collected Concrete Message Samples.
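The idea of inferring a message structure from Concrete Message Samples can be illustrated with a toy inference over flat JSON objects. The Python sketch below is not how Trang works and is far simpler than real schema inference; the `infer_schema` function, the sample messages and their field names are all assumptions made for the example.

```python
import json


def infer_schema(samples):
    """Infer field names, observed value types and optionality from samples."""
    observed = {}
    for sample in samples:
        for name, value in sample.items():
            entry = observed.setdefault(name, {"types": set(), "count": 0})
            entry["types"].add(type(value).__name__)
            entry["count"] += 1
    # A field missing from at least one sample is considered optional.
    return {
        name: {"types": sorted(entry["types"]),
               "optional": entry["count"] < len(samples)}
        for name, entry in observed.items()
    }


# Two hypothetical response samples.
samples = [
    json.loads('{"id": 1, "title": "ICSE"}'),
    json.loads('{"id": 2, "title": "FSE", "location": "Paris"}'),
]
schema = infer_schema(samples)
```

The quality of the inferred schema depends directly on how representative the samples are: a field that happens to be present in every sample will be reported as required even if the protocol allows omitting it.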

5 Cross-layer mediation

In this section, we describe our approach for generating CCL message translators by composing multiple, and possibly heterogeneous, Atomic message translators.

5.1 Composition of message translators

We mentioned that Atomic message translators are combined based on a Message Model. In Fig. 6 we present a fragment of the Message Model describing the Regonline service. This description is used to generate the corresponding CCL Translator and Abstract Message Schema. A Message Model comprises three sections: translator chaining, syntactic annotations, and semantic annotations. The translator chaining section of the Message Model defines the composition of Atomic message translators to form the set of CCL message translators associated with an application. Each CCL message translator is generated according to an operation (i.e., a pair of request and response messages and associated data) of the component’s interface. Using the extension composition rule, we declare how a specific field in the output (i.e., the output AST) of an Atomic or CCL message translator can be used as input of a second Atomic message translator determined by the identifier tag. Generated Atomic message translator extensions require an extra description element containing a URI pointing to the message description and, optionally, a domain-specific content tag that specifies which part of the message description must be used, in the case where the provided description covers multiple operations. Field selection inside the AST is done using path expressions. For convenience, the syntax is borrowed from XPath (http://www.w3.org/TR/xpath/). Extension declarations may also contain the optional attribute oper, which defines the operation for which the rule is relevant in the form [Operation|*]:[Request|Response|*]. Wildcards may be used on both sides of the attribute to specify that this rule applies to multiple operations or to both requests and responses.

Fig. 6 Fragment of the message model for the Regonline component

We use the Message Model to create a tree structure based on the user defined path attributes and the ASTs corresponding to the referenced Atomic translators. We then recursively construct the composite message translators corresponding to each protocol operation, by applying composition and syntactical rules. This phase allows the composite message translators to produce Internal AST instances. As an illustration, a CCL AST instance of the Regonline GetEvents Request message is given in Fig. 7. In this particular case, the initial input is parsed by an HttpRequest parser, then the body element encapsulating a SOAP message is further processed by a SoapMessage parser, and finally the SOAP body element is parsed by a dynamically generated WsdlMessage translator. The problem of inferring the data schema of the Internal ASTs is non-trivial. For this reason, in [32] we provide a formal mechanism, using tree automata, which, based on a path expression (using a subset of the navigational core of the W3C XML query language XPath), generates an associated AST data schema for the translator composition.

Fig. 7 AST of the Regonline GetEvents request

Second, we refine each Internal AST structure into a middleware-independent message schema which defines the syntax of the Abstract Message. This process includes pruning all middleware-specific fields of the Internal AST schema, and also flattening the structure when possible without introducing ambiguity. The generated message schemas are enhanced with semantic annotations defined in the Message Model. This is the structure on which the application-layer mediator synthesis tool will reason, and infer appropriate mapping of data. Finally, we generate the functions necessary to transform Abstract Messages into their corresponding Internal AST representation, and the inverse.
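Two small mechanisms from the translator chaining section lend themselves to a sketch: matching the oper attribute against a concrete operation, and selecting a field of the AST with a path expression. The Python code below is only an illustration under stated assumptions: the concrete syntax of the Message Model is not reproduced, the ASTs are plain nested dictionaries, the function names are invented, and the path support is limited to simple child steps rather than the XPath subset handled in [32].

```python
def oper_matches(oper: str, operation: str, direction: str) -> bool:
    """Check an 'Operation:Direction' pattern, where either side may be '*'."""
    operation_pattern, _, direction_pattern = oper.partition(":")
    return (operation_pattern in ("*", operation)
            and direction_pattern in ("*", direction))


def select(ast: dict, path: str):
    """Follow a simple '/a/b/c' path of child steps through a nested AST."""
    node = ast
    for step in path.strip("/").split("/"):
        node = node[step]
    return node


# Hypothetical Internal AST of a GetEvents request (HTTP > SOAP > WSDL message).
internal_ast = {
    "head": {"soapAction": "GetEvents"},
    "body": {"body": {"orderBy": "date"}},
}

applies = oper_matches("GetEvents:*", "GetEvents", "Request")
order_by = select(internal_ast, "/body/body/orderBy")
```

A rule declared with the pattern `*:*` would apply to every message of the component, while `GetEvents:Request` restricts it to a single message type.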

5.2 Overcoming cross-layer data dependency

We now take a closer look at how syntactic and semantic annotations can help solve cross-layer data dependencies and also support the synthesis of application-layer mediators.

A first step is ensuring that all necessary data requirements are made explicit. While most abstract message structures (i.e., AST schemas) are automatically extracted from Atomic message translators and composed using our algorithm in [32], the syntactic annotations section of the Message Model further augments this description. This may include specifying whether some fields are required or optional, or if there are any additional restrictions on the value of certain fields. For instance, in our scenario, the Regonline GetEvents operation accepts an optional orderBy parameter (see Fig. 7) to specify the return order of conferences. If the application-layer mediator synthesis tool is unaware that this field is optional, it may fail to map an operation between components because a required input is not provided. Thus, we annotate this field as optional. For specific fields, the valuerestrict annotation allows specifying detailed value patterns for simple data types. While it may increase the complexity of the specification, this feature leads to more precise data mediation and message validation than relying only on type definitions and/or semantic annotations.

Message formats may encapsulate sequences (e.g., lists or maps) of values of the same type. In some cases, the application may have requirements on the presence of a value at a certain position. For example, the Regonline protocol requires that all requests except Login contain a session identifier provided as an HTTP header with the key ApiToken. The extract annotation makes this requirement explicit with respect to the structure of the message by removing the specific field from the headers sequence, and reattaching it as a field at a higher level of the tree format.

When protocols are based on multiple middleware solutions, message composition may require data to be mapped internally across multiple translators. The map element associates the values of middleware fields internal to a single CCL translator. For example, in the case of the message instance illustrated in Fig. 7, the WSDL message translator field body/body/soapAction is mapped to the HTTP request header field /head/soapAction.

The last section of the Message Model, semantic annotations, enables the annotation of parsed data at various granularities. We support two types of semantic annotation: (i) domain knowledge (i.e., references to concepts in an ontology) and (ii) the scope of data. One may annotate an operation, a message, and/or any message field (either of complex or simple type). Such annotations support the synthesis of application-layer mediators in finding relevant matches between available data and data required to perform an operation. The data scope is important whenever applications configure the underlying middleware, causing application-specific data to be scattered over multiple layers. We mentioned that, in order to achieve mediation, we must identify and forward all application data. The element datascope set to application or middleware marks that the synthesis of application-layer mediators must consider this data as part of the application scope or, respectively, the middleware scope (in which case it should be ignored). However, the separation of middleware data is not sufficient as components may exhibit more complex data scoping. For example, Amiando uses a static key called ApiKey to control service access while Regonline uses a session id called ApiToken. Both data are instances of the same domain concept, but the mediator should never assign the ApiToken to ApiKey or vice versa: Amiando will not recognise session keys created by Regonline and Regonline will not accept access keys generated by Amiando. Still, the application-layer mediator synthesis tool must map the ApiToken between the subsequent Regonline requests.

In response to the above data scoping challenge, we allow the datascope annotation to take the following values: (i) middleware when data is purely middleware specific and should not be exposed to the application-layer mediator synthesis; (ii) application when data belongs to the application layer, and must be forwarded to the application-layer mediator; (iii) replay-only when application layer data should only be shared between the set of operations from the same component; (iv) operation-only when application layer data may only be included in certain operations; (v) one-way when application layer data may only flow in one direction, i.e., only Request or Response messages may include this data.
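The effect of the extract annotation and of three of the datascope values can be sketched on dictionary-based ASTs. The Python code below is an illustration under stated assumptions, not the prototype's implementation: the function names, the AST layout and the component names passed as strings are invented, and the operation-only and one-way scopes are deliberately not modelled.

```python
import copy


def extract(ast: dict, container: str, sequence: str, key: str) -> dict:
    """Move `key` out of ast[container][sequence] and reattach it one level up."""
    result = copy.deepcopy(ast)
    # Raises KeyError when the required value is absent from the sequence.
    value = result[container][sequence].pop(key)
    result[container][key] = value
    return result


def forwardable(scope: str, source: str, target: str) -> bool:
    """Decide whether data with the given datascope may flow from source to target."""
    if scope == "middleware":
        return False              # never exposed to the application-layer mediator
    if scope == "replay-only":
        return source == target   # only reused within the same component
    return scope == "application"


# Hypothetical parsed Regonline request with the session token among the headers.
request = {
    "head": {"headers": {"Host": "example.org", "ApiToken": "abc"}},
    "body": {},
}
explicit = extract(request, "head", "headers", "ApiToken")

token_to_amiando = forwardable("replay-only", "Regonline", "Amiando")
token_to_regonline = forwardable("replay-only", "Regonline", "Regonline")
```

Under these assumptions the ApiToken becomes a first-class field of the message, and a replay-only scope keeps it within Regonline operations, which matches the requirement that it is never assigned to the Amiando ApiKey.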

6 Implementation and validation

We have implemented a prototype tool of the proposed mediation framework using Java, following the architecture described in Fig. 8. The third-party tool and library dependencies for each component are mentioned between parentheses. The Mediator Engine implements the interfaces necessary to interact with the artefacts generated by the Composite Message Translator Generator and Application-layer Mediator Synthesis (MICS).

Fig. 8 Implementation of the unified mediation framework. Labels recovered from the figure: CCL Message Translator Generator; Message Model parser; Atomic Translator Generator Plug-ins; XML@XSD (XJC); JSON (StAXON + Trang + XJC); Intermediate translator construction (Javassist); Chaining; Transformation; Deployment Schema generator (JAXB); Concrete msg. cache; Application-level Mediator Synthesis (MICS); Mediator Engine; Mediator Interface; CCL Translator Interface; Communication manager; Transport bindings; Network.

In the case of the CCL Message Translator Generator, Legacy and Generated Atomic message translators are chained, transformed and refined using the bytecode manipulation library Javassist (http://www.csg.is.titech.ac.jp/~chiba/javassist/). To express richer constraints on the syntactic structure of ASTs beyond the basic means provided by Java type definitions, we use the standard Java Architecture for XML Binding (JAXB). In this way, each class structure is bound to an XSD schema. Since value restrictions, as described by the Message Model, cannot be injected as compile-time JAXB annotations, they are transformed to a JAXB External Binding Customization File. Generated Atomic message translators are obtained using external tools, which are integrated as plug-ins. XJC (http://jaxb.java.net) is used for generating XML message translators based on XML Schemas. Since there is no well-established data schema for JSON, we use the StAXON (https://github.com/beckchr/staxon/) tool to transform JSON messages to XML before learning their data schema using the XML learning tool Trang. We consider the integration of additional Atomic message translator generators like, for example, Java Asn.1 Compiler (http://sourceforge.net/projects/jac-asn1/) for ASN.1 parser specifications.

In what follows, we assess our approach by comparing the time to perform a conversation in the mediated and non-mediated case between Amiando client/service and Regonline client/service. Figure 9 shows the result. On the server-side, we use the services operated by Amiando and Regonline. On the client-side we use a Java implementation provided by Amiando, while for Regonline, we partially generate the client source-code using the provided WSDL service description.

Fig. 9 Comparison between mediated and non-mediated executions

We first specify a Message Model (https://www.rocq.inria.fr/arles/software/ccl-Mediation-Framework/) for each system, as well as two message samples containing the JSON formatted responses of the Amiando service. The Composite Message Translator Generator is then able to generate eight different composite message translators (listed in Fig. 10) and their associated Semantically Annotated XSDs (SAXSDs) (http://www.w3.org/2002/ws/sawsdl/). The SAXSDs are obtained by injecting semantic annotations obtained from the Message Models into the XSD schemas generated using JAXB. At runtime, MICS generates the two mediators given the SAXSDs, a conference management ontology and the behavioural descriptions of each system.

We compare the mediated execution-time with the non-mediated case. Each test was repeated 30 times, in similar conditions, and connection delays were excluded (e.g., opening sockets, SSL handshake, etc.).

In Fig. 9, we evaluate the execution-time overhead of the mediation. Since this test is performed using the real online services, the response time varies depending on the network conditions. As expected, the mediated execution-time is higher than in the non-mediated case, given that the number of messages exchanged is doubled. We show the decomposition of the execution-time into mediation, composing and access/parsing. Network access and parsing cannot be distinguished in this case because parsing is done in multiple steps when data is available on the communication channel. While the overhead of mediation and message composition is low, we see that parsing and network reception introduce the largest overhead. This is why, in Fig. 10, we detail the decomposition of parsing time over each Atomic parser used internally by a specific generated translator. We see that the EventFind response message parsing has a peak of 1662 ms. We also observe that the entire time is associated with the HTTP parser, and given that the size of the message is only 869 bytes, we can conclude it is almost entirely due to the response delay of the Amiando Service. The same reasoning applies for the GetEvents response message of the Regonline service, but in this case 197 ms are associated with the SOAP parser which is chained to parse the HTTP response’s payload (the HTTP body). Knowing that in this particular implementation, the SOAP parser does not wait for network access, we observe that the SOAP Atomic parser introduces a significant SOAP-Envelope parsing overhead. This observation confirms that the Amiando/Regonline (i.e., Amiando Client mediated to the Regonline Service) mediator execution-time (in Fig. 9) can be reduced by using a more efficient SOAP Atomic parser. Hence, we can conclude that our mediation approach introduces an acceptable overhead while enabling seamless interoperability between two originally incompatible systems.

Fig. 10 Access & parsing time decomposed by atomic parsers

7 Conclusion and future work

Interoperability is a very challenging topic. Over the years, interoperability has been the subject of a great deal of work, both theoretical and practical. However, existing approaches focus on achieving interoperability either at the application or middleware layer. This paper presented a unified mediation framework to achieve interoperability from application down to middleware layers. We have shown via our implemented prototype that the framework successfully enables interoperability in a transparent way, while introducing acceptable overhead.

Future work includes increasing automation by inferring, at least partially, the Message Model by cooperating with discovery mechanisms and packet inspection software. We also intend to experiment with various learning techniques, both active and passive, for the inference of component behaviour. Finally, incremental re-synthesis of mediators and runtime refinement of composite message translators would be useful in order to respond to changes in the individual systems or in the ontology. A further direction is to consider improved modelling capabilities that take into account the probabilistic nature of systems and the uncertainties in the ontology. This would facilitate the construction of mediators where we have only partial knowledge about the system.

Endnotes
a. Note the difference between: (i) lexical parsers that consume streams of characters or bytes and, in case of success, output a result in the form of an AST, and (ii) syntactical parsers that consume streams of tokens to produce the corresponding ASTs.

b. Syntactical parsers defined on XML or JSON tokens.

Competing interests
The authors declare that they have no competing interests.

Authors’ contributions
AB developed the mediator synthesis approach and implemented the MICS tool. EA developed the translator generation approach and implemented the prototype tool. RSC proposed the case study and participated in the tests. VI supervised and coordinated the work. All authors read and approved the final manuscript.

Author details
1 The Open University, Milton Keynes, UK. 2 Inria, Paris-Rocquencourt, Rocquencourt, France. 3 Ambientic, Paris, France.

Received: 2 February 2015 Accepted: 15 May 2015

References
1. Issarny V, Georgantas N, Hachem S, Zarras A, Vassiliadis P, Autili M, Gerosa MA, Hamida AB (2011) Service-oriented middleware for the future internet: state of the art and research directions. J Internet Serv Appl 2(1):23–45. doi:10.1007/s13174-011-0021-3

2. Garlan D, Allen R, Ockerbloom J (1995) Architectural mismatch or why it’s hard to build systems out of existing parts. In: ICSE. ACM, New York, NY, USA

3. Issarny V, Bennaceur A, Bromberg YD (2011) Middleware-layer connector synthesis: Beyond state of the art in middleware interoperability. In: LNCS SFM-11. Springer, Berlin, Heidelberg

4. Chappell DA (2004) Enterprise Service Bus. O’Reilly Media

5. Bromberg YD, Grace P, Réveillère L (2011) Starlink: Runtime Interoperability between Heterogeneous Middleware Protocols. In: ICDCS. IEEE Computer Society, Washington, DC, USA

6. Wiederhold G (1992) Mediators in the architecture of future information systems. IEEE Comput 25(3):38–49

7. Yellin DM, Strom RE (1997) Protocol specifications and component adaptors. ACM TOPLAS 19(2):292–333

8. McIlraith SA, Son TC, Zeng H (2001) Semantic web services. IEEE Intell Syst 16(2):46–53

9. Spitznagel B, Garlan D (2003) A compositional formalization of connector wrappers. In: ICSE. IEEE Computer Society, Washington, DC, USA

10. Mateescu R, Poizat P, Salaün G (2012) Adaptation of service protocols using process algebra and on-the-fly reduction techniques. IEEE Trans Softw Eng 38(4):755–777

11. Inverardi P, Tivoli M (2013) Automatic synthesis of modular connectors via composition of protocol mediation patterns. In: Proc. of the 35th International Conference on Software Engineering, ICSE. IEEE Press, Piscataway, NJ, USA. pp 3–12

12. D’Ippolito N, Braberman VA, Kramer J, Magee J, Sykes D, Uchitel S (2014) Hope for the best, prepare for the worst: multi-tier control for adaptive systems. In: Proc. of the 36th International Conference on Software Engineering, ICSE. ACM, New York, NY, USA. pp 688–699

13. Bennaceur A, Issarny V (2015) Automated synthesis of mediators to support component interoperability. IEEE Trans Softw Eng 41(3):221–240

14. Bromberg YD, Grace P, Réveillère L, Blair GS (2011) Bridging the Interoperability Gap: Overcoming Combined Application and Middleware Heterogeneity. In: Middleware. Springer, Berlin Heidelberg

15. Oldham N, Thomas C, Sheth AP, Verma K (2004) METEOR-S web service annotation framework with machine learning classification. In: SWSWPC. pp 137–146



16. Lorenzoli D, Mariani L, Pezzè M (2008) Automatic generation of software behavioral models. In: Proc. of the International Conference on Software Engineering, ICSE. ACM, New York, NY, USA. pp 501–510

17. Krka I, Brun Y, Popescu D, Garcia J, Medvidovic N (2010) Using dynamic execution traces and program invariants to enhance behavioral model inference. In: Proc. of the 32nd International Conference on Software Engineering, ICSE (2). ACM, New York, NY, USA. pp 179–182

18. Bennaceur A, Issarny V, Sykes D, Howar F, Isberner M, Steffen B, Johansson R, Moschitti A (2012) Machine learning for emergent middleware. In: Proc. of the Joint Workshop on Intelligent Methods for Software System Engineering, JIMSE. Springer, Berlin Heidelberg

19. Issarny V, Bennaceur A (2012) Composing distributed systems: Overcoming the interoperability challenge. In: Formal Methods for Components and Objects - 11th International Symposium, FMCO 2012, Bertinoro, Italy, September 24–28, 2012, Revised Lectures. Springer Berlin, Heidelberg. pp 168–196

20. Vaculín R, Neruda R, Sycara KP (2009) The process mediation framework for semantic web services. Int J Agent-Oriented Softw Eng IJAOSE 3(1):27–58

21. Cavallaro L, Nitto ED, Pradella M (2009) An automatic approach to enable replacement of conversational services. In: ICSOC. Springer, Berlin, Heidelberg

22. Heering J, Klint P, Rekers J (1989) Incremental Generation of Parsers. In: PLDI. doi:10.1145/73141.74834. http://doi.acm.org/10.1145/73141.74834

23. Schwerdfeger AC, Van Wyk ER (2009) Verifiable Composition of Deterministic Grammars. PLDI. doi:10.1145/1543135.1542499

24. Schwerdfeger AC, Van Wyk ER (2009) Verifiable Parse Table Composition for Deterministic Parsing. In: SLE. Springer-Verlag, Berlin, Heidelberg

25. Swierstra SD (2000) Combinator parsers - from toys to tools 41:38–59

26. Burgy L, Reveillere L, Lawall J, Muller G (2011) Zebu: A Language-Based Approach for Network Protocol Message Processing. IEEE TSE. doi:10.1109/TSE.2010.64

27. Borisov N, Brumley DJ, Wang HJ (2007) Generic Application-Level Protocol Analyzer and its Language. In: NDSS, The Internet Society

28. Basten H (2007) Ambiguity detection methods for context-free grammars. Master’s thesis, Universiteit van Amsterdam

29. Van Wyk ER, Schwerdfeger AC (2007) Context-aware Scanning for Parsing Extensible Languages. In: GPCE. doi:10.1145/1289971.1289983. http://doi.acm.org/10.1145/1289971.1289983

30. Cardelli L, Matthes F, Abadi M (1994) Extensible grammars for language specialization. In: DBPL. http://dl.acm.org/citation.cfm?id=648290.754352

31. Moors A, Piessens F, Odersky M (2008) Parser combinators in Scala. KU Leuven, CW Reports vol:CW491

32. Andriescu EM, Martinez T, Issarny V (2015) Composing Message Translators and Inferring their Data Types using Tree Automata. In: Proc. of the 18th International Conference on Fundamental Approaches to Software Engineering, FASE, London, United Kingdom. https://hal.inria.fr/hal-01097389
