©Ian Sommerville 2006 MSc module: Advanced Software Engineering Slide 1

Service dependability


Objectives

To explain what is required to ensure that services are dependable

To discuss mechanisms which can be used to deliver service dependability

To introduce you to some research work in service-oriented software engineering


Topics covered

Reliable message interchange

Service fault tolerance

A generic mechanism to implement fault-tolerant services


Dependability

The dependability of a system reflects the extent to which the system can be trusted to deliver its required services in a safe and secure way

Dependability (in general) will be covered in the 2nd semester.
• Today, I’ll simply focus on two aspects of dependability as applied to services
• Availability - the service is available to service requests
• Reliability - the service delivers results according to its specification


Dependability requirements

Requests for services from service clients and responses to these requests are reliably delivered

Advertised services are available
• Both the service software and the servers must be available

Services deliver results as advertised
• Services must be reliable


Reliable messaging

When one service sends a message to another, the sending service should be confident that that message will (eventually) reach its destination

Sometimes, it is essential that one and only one copy of each message is delivered
• E.g. a withdrawal from a bank account

Sometimes, messages must be received in exactly the order in which they were sent
• E.g. a sequence of database updates


Messaging standards

Two similar but incompatible standards have been proposed
• WS-Reliability
  • Has the backing of most companies except Microsoft and IBM
• WS-ReliableMessaging
  • Has the backing of Microsoft and IBM

It is not yet clear which of these will emerge as the dominant standard.

I will illustrate the topic using WS-ReliableMessaging


Reliable message processors

[Figure: message path from Service A through the sending RMP and the receiving RMP to Service B]


Reliable message processors

Part of middleware that facilitates service interaction so may be used by any service

They are NOT part of the services themselves

Sending RMP
• Submit. Transfers a message from the producer to the sending RMP
• Notify. Transfers a response message from the sending RMP to the producer

Receiving RMP
• Deliver. Transfers a message from the receiving RMP to a consumer
• Respond. Transfers a response message from the consumer to the receiving RMP


RMP operation

RMPs add information to the SOAP message header that allows message exchanges to be correlated

This allows the middleware to identify duplicate messages, acknowledge message delivery and ensure that messages are delivered in the correct sequence.
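
The behaviour described above can be illustrated with a minimal Python sketch of a receiving RMP. It is not the WS-ReliableMessaging protocol itself: the `Message` fields (`sequence_id`, `number`) are hypothetical stand-ins for the correlation information carried in the SOAP header, and the consumer is modelled as a plain list.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Message:
    sequence_id: str   # hypothetical header field: which exchange this belongs to
    number: int        # hypothetical header field: position in the sequence, from 1
    body: str


@dataclass
class ReceivingRMP:
    """Delivers each message once, and in order, within a sequence."""
    delivered: list = field(default_factory=list)    # stands in for the consumer
    next_expected: dict = field(default_factory=dict)
    held: dict = field(default_factory=dict)         # messages that arrived early

    def receive(self, msg: Message) -> int:
        """Accept a message; return the highest number delivered so far (the ack)."""
        expected = self.next_expected.setdefault(msg.sequence_id, 1)
        waiting = self.held.setdefault(msg.sequence_id, {})
        if msg.number >= expected:
            # A second copy of a message that is already waiting is ignored.
            waiting.setdefault(msg.number, msg)
        # Messages numbered below `expected` were already delivered: duplicates.
        while expected in waiting:
            self.delivered.append(waiting.pop(expected).body)
            expected += 1
        self.next_expected[msg.sequence_id] = expected
        return expected - 1


rmp = ReceivingRMP()
arrivals = [(2, "update B"), (1, "update A"), (1, "update A"), (3, "update C")]
acks = [rmp.receive(Message("seq-1", n, body)) for n, body in arrivals]
print(acks)           # [0, 2, 2, 3]
print(rmp.delivered)  # ['update A', 'update B', 'update C']
```

The sending side would retransmit any message whose number is above the last acknowledgement, which is why the receiver has to tolerate duplicates.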



Service availability and reliability

To ensure availability, it must be possible to continue to deliver a service when software and/or hardware has failed
• This suggests that there has to be more than one instance of a service plus a mechanism to switch between them in the event of a failure

To ensure reliability, it must be possible to check if the results of a service are consistent with its specification
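
Both requirements can be sketched together in Python: several instances of a service are tried in turn, and each result is checked against an acceptance test before it is returned. This is only an illustration under stated assumptions; the two weather services and the plausible temperature range are hypothetical and do not come from the slides.

```python
from typing import Callable, Iterable


def first_acceptable(versions: Iterable[Callable[[str], int]],
                     acceptance_test: Callable[[int], bool],
                     request: str) -> int:
    """Return the first result that passes the acceptance test."""
    for version in versions:
        try:
            result = version(request)
        except Exception:
            continue                  # lack of service: switch to the next instance
        if acceptance_test(result):
            return result             # result is consistent with the specification
    raise RuntimeError("no instance produced an acceptable result")


# Hypothetical service instances used only for this illustration.
def primary_weather(city: str) -> int:
    return 999                        # a faulty, out-of-range reading


def backup_weather(city: str) -> int:
    return 18


def plausible_temperature(celsius: int) -> bool:
    return -90 <= celsius <= 60       # assumed specification of a valid result


print(first_acceptable([primary_weather, backup_weather],
                       plausible_temperature, "Lancaster"))  # 18
```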


Fault tolerance

Fault tolerance is:
• The ability of a system to continue execution (perhaps offering a degraded service) without system failure in the presence of system faults.

Fault tolerance is mostly used to provide enhanced system availability but may also be used where a system has critical reliability requirements.

For web services, fault tolerance mechanisms may be used to ensure service availability and reliability.


Redundancy and diversity

Redundancy
• A system includes functionality that replicates other functionality provided elsewhere in the system.

Diversity
• Redundant functionality is provided in different ways.

Providing diversity and redundancy in systems is expensive. It is generally confined to critical systems, i.e. systems where the costs associated with system failure are high.


Fault tolerance policies

Re-try (with/without diversion)
• Simple re-try (of a service) handles transient failures; re-try with diversion handles platform failures.

N-version execution with voting
• Simultaneous execution of multiple versions of a service. Handles system failure - either lack of service or inconsistent computation.

Constrained execution
• Execution of a service with checks on inputs. Handles out-of-range computation.

Alternative version execution with acceptance tests
• Handles system failure - either lack of service or out-of-range computation.
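
The first policy in this list, re-try with and without diversion, can be sketched in a few lines of Python. This is an illustration only: the endpoints are modelled as plain callables, and `ServiceUnavailable`, `make_flaky` and `dead` are hypothetical names introduced for the example.

```python
from typing import Callable, Sequence


class ServiceUnavailable(Exception):
    """Raised by a (hypothetical) endpoint when it cannot answer."""


def call_with_retry(endpoints: Sequence[Callable[[str], str]],
                    request: str,
                    attempts_per_endpoint: int = 2) -> str:
    last_error = None
    for endpoint in endpoints:                    # diversion: try another platform
        for _ in range(attempts_per_endpoint):    # simple re-try: same platform
            try:
                return endpoint(request)
            except ServiceUnavailable as error:
                last_error = error
    raise ServiceUnavailable("all endpoints failed") from last_error


def make_flaky(failures: int) -> Callable[[str], str]:
    """Build an endpoint that fails `failures` times, then answers."""
    remaining = [failures]

    def endpoint(request: str) -> str:
        if remaining[0] > 0:
            remaining[0] -= 1
            raise ServiceUnavailable("transient failure")
        return f"ok:{request}"

    return endpoint


def dead(request: str) -> str:
    raise ServiceUnavailable("platform down")


print(call_with_retry([make_flaky(1)], "q"))        # ok:q  (re-try was enough)
print(call_with_retry([dead, make_flaky(0)], "q"))  # ok:q  (diversion was needed)
```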


Service fault tolerance

The services used in an application may be provided by unknown providers on unknown hardware.

Service users do not know anything about how well these services
• Meet their specification
• Have been tested for defects

It therefore makes sense to try to ensure that a service-oriented system can continue to operate even when faulty services are used

This means that the system should be fault-tolerant.


Fault tolerant services

In its most general form, this approach relies on the existence of multiple versions of a service.

The vision of competing services offered by different providers provides redundancy and diversity so fault tolerance can be widely deployed.
• Service provider
  • May use fault tolerance to help achieve an advertised quality of service.
• Service client
  • May use fault tolerance to achieve a required quality of service or to enhance service trust.

Ideally, a common mechanism should be available that can be used by both providers and clients.


The container model

Used in component-based software engineering e.g. EJB
• Common services (e.g. transaction management) required by different components are provided by a ‘container’.
• To deploy a component, it is included in a container and thus it ‘inherits’ access to common services.

We decided to adopt a comparable approach where services are deployed in a container which is configured to provide fault tolerance support.


Conceptual model


Container components

Policies
• A generic container is configured by a fault tolerance policy.
• This is an XML description of the strategy to be used to achieve fault tolerance.

Procedures
• The container includes a set of procedures (currently in Java) that implements the defined policy.

A proxy service
• This manages access to the actual services that are used.


Anatomy of a container


A voting policy


Policy models

An XML description of the fault tolerance policy to be adopted.

This includes references to the procedures to be executed, the conditions for fault detection and the mechanisms for fault recovery.

This description is then interpreted by the container to implement the fault tolerance procedures.
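
As a rough illustration of what interpreting such a description could involve, the Python sketch below reads a policy document, finds the procedure marked as the starting point and checks that every connection refers to a declared procedure. The element and attribute names follow the policy model shown elsewhere in these slides, but the fragment here is shortened and closed by hand so that it parses, and the rule that exactly one procedure has `start="true"` is an assumption, not a documented constraint.

```python
import xml.etree.ElementTree as ET

# Shortened, hand-closed fragment for illustration only.
POLICY = """<policyModel>
  <procedure name="VotingProcedure"
             class="net.sourceforge.digs.endpoints.voting.VotingProcedure"
             start="true">
    <connections>
      <connection id="Proxy1" procedure="Proxy1"/>
      <connection id="Proxy2" procedure="Proxy2"/>
    </connections>
  </procedure>
  <procedure name="Proxy1" class="net.sourceforge.digs.endpoints.proxy.ProxyProcedure"/>
  <procedure name="Proxy2" class="net.sourceforge.digs.endpoints.proxy.ProxyProcedure"/>
</policyModel>"""


def load_policy(xml_text: str):
    """Return (start procedure name, {procedure name: [connected procedures]})."""
    root = ET.fromstring(xml_text)
    procedures = {p.get("name"): p for p in root.findall("procedure")}
    starts = [name for name, p in procedures.items() if p.get("start") == "true"]
    if len(starts) != 1:              # assumption: a single entry point
        raise ValueError("expected exactly one start procedure")
    connections = {
        name: [c.get("procedure") for c in p.findall("connections/connection")]
        for name, p in procedures.items()
    }
    for name, targets in connections.items():
        for target in targets:
            if target not in procedures:
                raise ValueError(f"{name} refers to unknown procedure {target}")
    return starts[0], connections


start, connections = load_policy(POLICY)
print(start)                  # VotingProcedure
print(connections[start])     # ['Proxy1', 'Proxy2']
```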


Fault-tolerant service provision

Research project on service dependability to investigate how to provide service fault tolerance

Goals

• The F/T controller should not require any change in existing services or service usage by clients.

• Services should not have to be aware that they are accessed through a F/T controller.

• The F/T controller should accommodate a range of fault tolerance policies.


Policy model representation

<?xml version="1.0" encoding="UTF-8"?>
<policyModel>
  <procedure name="VotingProcedure" class="net.sourceforge.digs.endpoints.voting.VotingProcedure" start="true">
    <voting requirement="5">
      <vote>
        <xpath>//maxTemp</xpath>
        <voteClass>net.sourceforge.digs.endpoints.voting.vote.IntegerVote</voteClass>
        <majority>3</majority>
        <tolerance>2</tolerance>
      </vote>
      <vote>
        <xpath>//minTemp</xpath>
        <voteClass>net.sourceforge.digs.endpoints.voting.vote.IntegerVote</voteClass>
        <majority>3</majority>
        <tolerance>2</tolerance>
      </vote>
    </voting>


Policy model representation

    <connections>
      <connection id="Proxy1" procedure="Proxy1"/>
      <connection id="Proxy2" procedure="Proxy2"/>
      <connection id="Proxy3" procedure="Proxy3"/>
      <connection id="Proxy4" procedure="Proxy4"/>
      <connection id="Proxy5" procedure="Proxy5"/>
    </connections>
  </procedure>
  <procedure name="Proxy1" class="net.sourceforge.digs.endpoints.proxy.ProxyProcedure">
    <endpoint url="http://in-ega051000012.lancs.ac.uk:8080/weather/weather.bbc.co.uk" proxyHost="wwwcache.lancs.ac.uk" proxyPort="8080"/>
  </procedure>
  ….
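
To make the `majority` and `tolerance` values in this policy more concrete, here is a minimal Python sketch of one way an integer vote could be decided. The matching rule is an assumption: two readings are treated as agreeing when they differ by at most `tolerance`, and a vote succeeds when at least `majority` readings agree. The slides do not state how `IntegerVote` actually works, and the `requirement` attribute is not modelled because its meaning is not given.

```python
from statistics import median_low
from typing import Sequence


class NoMajority(Exception):
    """Raised when too few responses agree with each other."""


def integer_vote(values: Sequence[int], majority: int, tolerance: int) -> int:
    """Return a representative of the largest group of agreeing values."""
    best_group = []
    for candidate in sorted(values):
        # All responses that lie within `tolerance` of this candidate.
        group = sorted(v for v in values if abs(v - candidate) <= tolerance)
        if len(group) > len(best_group):
            best_group = group
    if len(best_group) < majority:
        raise NoMajority(f"only {len(best_group)} of {len(values)} responses agree")
    return median_low(best_group)     # a value that was actually reported


# Hypothetical maxTemp readings from five proxied weather services.
max_temps = [21, 22, 20, 35, 21]
print(integer_vote(max_temps, majority=3, tolerance=2))  # 21
```

With these readings the faulty value 35 is outvoted by the four readings between 20 and 22. Five readings that all differ by more than the tolerance would raise `NoMajority`, which a container could treat as a detected fault.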


Procedures

We provide a set of generic procedures that can be adapted for use in different policy models

Flow control
• Implement the fault tolerance pattern

Redirection
• Handle redirection to actual services

Caching
• Store messages within the container

Query
• Used for message querying and manipulation


Container hosting

Containers are not stand-alone entities but are hosted on a Java servlet engine (currently Apache Tomcat).

This means that we need be less concerned with issues of performance, security and stability and can focus on fault tolerance.

It also allows the dynamic deployment of containers thus opening up the possibility of responsive fault tolerance policies that change according to the services available.


Deployment support

To simplify the development of fault tolerant services, we have developed a deployment tool that allows:
• Graphical editing of F/T policies and generation of associated XML description.
• Access to reusable components
• Automated support for deployment to the servlet engine by creating a Web Archive (WAR) file.


Policy and procedure reuse

We do not envisage that developers will normally develop policies and procedures from scratch.

Our support tool is geared to supporting policy and procedure reuse and gives users access to a library of existing policies and procedures that can be modified.

We are currently developing a ‘wizard’ that will allow commonly used policies (e.g. N-version execution with voting) to be configured with no requirement for the user to be aware of the underlying implementation.


The deployment process

Assuming that policies and procedures can be reused, the steps involved are:
• Create or edit the fault tolerance policy using the graphical editing system and generate the policy model.
• Choose and adapt or implement the procedures to implement the policy.
• Deploy the results as a WAR file.

Depending on the extent of reuse, creating a fault tolerant service can take between a few minutes and a few hours to complete.


The deployment tool


Dynamic adaptation

The model used allows for the possibility of dynamic adaptation of the fault tolerance policy to be used

• As the policy is simply an XML document, clients could specify their own policies in the SOAP message that initiates the service call.

• The container could dynamically switch policy models depending on the QoS provided.

• Dynamic discovery and replacement of services could be provided.


Current issues

Semantic equivalence of services
• Different services may provide similar functionality but with different interfaces. We need to be able to specify semantic equivalences across services.

Stateful services
• Services with state offer both problems and opportunities when implementing fault tolerance policies.

Checkpointing
• How should checkpointing be supported for composed and computationally intensive services?

Container failure
• How should container failure be handled?


Policy extensions

Service failure simulation
• One of the problems we have faced is in testing our system - failures of externally provided services are uncontrollable.
• Failure policies could be embedded thus allowing simulators for service testing to be created.

Service monitoring
• Monitoring policies (see next slide) could be developed.

Service access control
• Rather than embedding access control in the service itself, access control policies could be defined.
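
The service failure simulation idea above can be illustrated with a small Python sketch: a wrapper that makes a service fail at a controllable rate, so that fault tolerance logic can be tested without waiting for a real external failure. It is not part of the container described in these slides; `with_failures`, `SimulatedFailure` and the `echo` service are hypothetical names.

```python
import random
from typing import Callable


class SimulatedFailure(Exception):
    """Raised in place of calling the real service."""


def with_failures(service: Callable[[str], str],
                  failure_rate: float,
                  rng: random.Random) -> Callable[[str], str]:
    """Wrap a service so that a fraction of calls fail on purpose."""
    def wrapped(request: str) -> str:
        # rng.random() lies in [0.0, 1.0), so a rate of 1.0 always fails
        # and a rate of 0.0 never does.
        if rng.random() < failure_rate:
            raise SimulatedFailure("injected failure")
        return service(request)
    return wrapped


def echo(request: str) -> str:        # hypothetical service under test
    return f"echo:{request}"


rng = random.Random(42)               # seeded so that test runs are repeatable
always_down = with_failures(echo, 1.0, rng)
always_up = with_failures(echo, 0.0, rng)
print(always_up("ping"))              # echo:ping
```

Passing the random number generator in explicitly keeps a simulated failure pattern reproducible from one test run to the next.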


Service monitoring

Monitoring involves maintaining information about the quality of the service.

In general, monitoring should be separated from the service provision.

Provider-side monitoring
• Providers use monitoring to assess the effectiveness of their service and to inform re-configuration for service improvement.

Client-side monitoring
• Clients use monitoring to assess the actual quality of service which THEY receive.
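
Client-side monitoring, kept separate from the service itself, can be sketched in Python as a thin wrapper around each call. This is an illustration only: the `ClientMonitor` class and the two measures it keeps (observed availability and call latency) are assumptions chosen for the example, not the monitoring policies proposed in these slides.

```python
import time
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ClientMonitor:
    """Records the quality of service that this client actually receives."""
    calls: int = 0
    failures: int = 0
    latencies: list = field(default_factory=list)   # seconds per call

    def call(self, service: Callable[[str], str], request: str) -> str:
        started = time.perf_counter()
        self.calls += 1
        try:
            return service(request)
        except Exception:
            self.failures += 1
            raise                      # the client still sees the failure
        finally:
            self.latencies.append(time.perf_counter() - started)

    @property
    def availability(self) -> float:
        """Fraction of calls that returned a result."""
        if self.calls == 0:
            return 1.0
        return (self.calls - self.failures) / self.calls


def good(request: str) -> str:         # hypothetical services for the example
    return "ok"


def bad(request: str) -> str:
    raise ConnectionError("no response")


monitor = ClientMonitor()
for service in (good, good, bad, good):
    try:
        monitor.call(service, "request")
    except ConnectionError:
        pass
print(monitor.availability)            # 0.75
```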


Conclusions

Our goal of developing a flexible and adaptable fault-tolerance mechanism for services has been achieved (for atomic services).
• Current work is concerned with extending the model to cope with composite services and long transactions.

The mechanism is a generic mechanism that has significant potential for use in other areas
• Work is about to start on how it can be adapted for service monitoring.


Key Points

Service dependability relies on
• Reliable interchange of messages
• Service availability
• Service reliability

Reliable messaging is supported by adding message delivery information to the SOAP header, with middleware using this information to ensure reliable message interchange


Key Points

Availability and reliability can be enhanced by using fault tolerance techniques

Container-based fault tolerance provides a fault tolerance mechanism in a container which can be used by all services.

By deploying several services in a container, a range of fault tolerance policies may be supported.
