
RISK ASSESSMENT AND ADAPTIVE GROUP

TESTING OF SEMANTIC WEB SERVICES

XIAOYING BAI

Department of Computer Science and Technology, Tsinghua University and

State Key Laboratory of Software Development Environment

BeiHang University, Beijing, 100084, [email protected]

RON S. KENETT

Department of Statistics and Applied Mathematics

Universita Di Torino, Italy

[email protected]

WEI YU

State Key Laboratory of Software Development Environment

BeiHang University, Beijing, 100084, [email protected]

Received 15 July 2011

Revised 10 September 2011

Accepted 17 September 2011

Testing is necessary to ensure the quality of web services that are loosely coupled, dynamically bound and integrated through standard protocols. Exhaustive testing of web services is usually impossible due to unavailable source code, diversified user requirements and the large number of possible service combinations delivered by the open platform. This paper proposes a risk-based approach for selecting and prioritizing test cases for testing service-based systems. We specifically address the problem in the context of semantic web services. Semantic web services introduce semantics to service integration and interoperation using ontology models and specifications. Semantic errors are considered more difficult to detect than syntactic errors. Due to the complexity of conceptual uniformity, it is hard to ensure the completeness, consistency and unified quality of an ontology model. A failure of the semantic service-based software may result from many factors such as misused data, unsuccessful service binding, and unexpected usage scenarios. This work analyzes the two factors of risk estimation, failure probability and importance, from three aspects: ontology data, service and composite service. With this approach, test cases are associated with semantic features, and are scheduled based on the risks of their target features. Risk assessment is used to control the process of Web Services progressive group testing, including test case ranking, test case selection and service ruling out. This paper discusses the control architecture and adaptive measurement mechanism for adaptive group testing. As a statistical testing technique, the proposed approach aims to detect, as early as possible, the problems with the highest impact on the users.

Keywords: Semantic web services; risk-based testing; adaptive testing; progressive group testing.

International Journal of Software Engineering and Knowledge Engineering, Vol. 22, No. 5 (2012) 595-620
© World Scientific Publishing Company. DOI: 10.1142/S0218194012500167
http://dx.doi.org/10.1142/S0218194012500167

Int. J. Soft. Eng. Knowl. Eng. 2012.22:595-620. Downloaded from www.worldscientific.com by PEKING UNIVERSITY on 01/19/13. For personal use only.

1. Introduction

Service Oriented Architecture (SOA), and its Web Services (WS) implementations, introduce an open architecture for integrating heterogeneous software through standard internet protocols [15, 29]. From the providers' perspective, proprietary in-house components are encapsulated into standard programmable interfaces and delivered as reusable services for public access and invocation. From the consumers' perspective, applications are built following a model-driven approach where business processes are translated into control flows and data flows, of which the constituent functions can be automatically bound to existing services discovered by service brokers. In this way, large-scale software reuse of Internet available resources is enabled, providing support for agile and fast response to dynamically changing business requirements. SOA is believed to be the current major trend in the software paradigm shift.

However, due to the potential instability, unreliability, and unpredictability of the open environment in SOA and WS, these developments present challenging quality issues compared with traditional software. Traditional software is built and maintained within a trusted organization. In contrast, service-based software is characterized by dynamic discovery and composition of loosely coupled services that are published by independent providers. A system can be constructed, on-the-fly, by integrating reusable services through standard protocols [26]. For example, a housing map application can be the integration of two independent services: Google Map service and housing rental services. In many cases, the constituent data and functional services of a composite application are out of the control of the application builder. As a consequence, service-based software has a potentially higher probability to fail compared with in-house developed software. Moreover, as services are open to all Internet users, the provider may not envision all the usage scenarios and track the usage status at runtime. Hence, a failure in the service may affect a wide range of consumers and result in unpredictable consequences. For example, Gmail reported a failure of "service unavailable due to outage in contacts system" on 8 November 2008 for 1.5 hours; millions of customers were affected. In early research on software reliability, Kenett and Pollak [18] discovered that software reliability decays over time due to the side effects of bug correction and software evolution. In a service-based system, consumers may not be aware of such a decaying process until a failure occurs.

Testing is thus important to ensure the functionality and quality of individual

services as well as composed integrated services. Proper testing can ensure that the

selected services best satisfy the users' needs and that services that are dynamically


composed can interoperate with each other. However, testing is usually expensive and confidence in a specific service is hard to achieve, especially in an open Internet environment. The users may have multiple dimensions of expected features, properties and functional points that result in a large number of test cases. It is both time and resource consuming to run a large set of test cases on the large number of service candidates.

To overcome these issues, the concept of group testing was introduced to services testing [3, 4, 33, 34]. With this approach, test cases are categorized into groups and activated in groups. In each group testing stage, the failed services are eliminated through a predefined ruling-out strategy. Effective test case ranking and service ruling-out strategies remove a large number of unreliable services at the early stage of testing, and thus reduce the total number of executed tests. Measurement is essential for test ranking and selection. In particular, Bayesian techniques such as Bayesian inference and Bayesian Networks can help with estimation and prediction. In our previous work, Kenett et al. [19, 20] present the applications of statistics and DOE (Design of Experiments) methods in modern industry in general, and in software development in particular.

This paper introduces a risk-based approach to evaluate service qualitatively and rank test cases. With this approach, test cases are prioritized and scheduled based on the risks of their target software features. Intuitively, a risky feature deserves more testing effort and has a high priority to test. When time and resources are limited, test engineers can select a subset of test cases representing the highest risk targets, in order to achieve good enough quality with an affordable testing effort. An important issue in the risk-based approach is the measurement of risk. In most cases, risk factors are estimated subjectively based on experts' experiences. Hence, different people may produce different results and the quality of risk assessment is hard to control. Some researchers began to practice the application of ontology modeling and reasoning to operational and financial risk management, combining statistical modeling of qualitative and quantitative information on a SOA platform [31, 37]. This paper presents an objective risk measurement based on ontology and service analysis. A Bayesian network (BN) is constructed to model the complex relationships between ontology classes. Failure probability and importance are estimated at three levels (ontology, service and service workflow) based on their dependency and usage relationships.

Services are changed continuously online ("The Perpetual Beta" principle in Web 2.0 [26]). The composition and collaboration of services are also built on-demand. As a result, testing cannot be fully planned beforehand and has to be adaptive so that tests are selected and composed as the services change. The paper introduces an adaptive testing framework that monitors the changes in service artifacts and dependencies, and reactively reassesses their risks and reschedules the test cases for group testing.

More and more researchers are beginning to realize the unique requirements

of WS testing and propose innovative methods and techniques from various


perspectives, such as collaborative testing architecture, test case generation, distributed test execution, test confidence and model checking [10]. Most of the current WS testing research focuses on the application and adaptation of traditional testing techniques. We address here the uniqueness of WS and discuss WS testing from a system engineering and interdisciplinary research perspective. Compared with existing work in this area, this paper covers the following aspects:

• It proposes an objective method to assess software risks quantitatively based on both the static structure and dynamic behavior of service-based systems.

• It analyzes the quality issues of semantic WS and measures the risks of the ontology-based software services based on semantic analysis using stochastic models like Bayesian networks.

• It improves WS progressive testing with effective test ranking and selection techniques.

The rest of this paper is organized as follows. Section 2 introduces the background technology including risk-based testing, web services group testing, and semantic web services. Section 3 defines the problem of adaptive WS testing. Section 4 analyzes the characteristics of semantic WS and offers a method for estimating and predicting failure probability and importance. Section 5 proposes the methods of adaptive measurement and adaptation rules. These techniques are the basis for incorporating a dynamic mechanism into the WS group testing schema. Section 6 presents the metrics and the experiments used to evaluate the proposed approach. Section 7 concludes the research and discusses future work.

2. Background

2.1. Risk-based testing

Testing is expensive, especially for today's software with its growing size and complexities. Exhaustive testing is not feasible due to the limitations in time and resources. A key testing strategy is to improve test efficiency by selecting and planning a subset of tests with a high probability to find defects. However, selective testing usually faces difficulties in answering questions like "What should we test first given our limited resources?" and "When can we stop testing?".

Software risk assessment identifies the most demanding and important software aspects, and provides the basis for test selection and prioritization. In general, the risk of a software feature is defined by two factors: the probability to fail, and the consequence of the failure. That is,

\[ \mathrm{Risk}(f) = P(f) \times C(f) \tag{1} \]

where $f$ is a software feature, $\mathrm{Risk}(f)$ is its risk exposure, $P(f)$ is the failure probability and $C(f)$ is the cost of the failure.



Intuitively, a software feature is risky if it has a high probability to fail or its failure will result in serious consequences. Hence, a risky feature deserves more testing effort and has a high priority to test. Risk-based testing was proposed to select and schedule test cases based on software risk analysis [1, 2, 12, 20, 25, 27, 28, 32]. The general process of risk-based testing is as follows:

(1) Identify risk indicators;

(2) Evaluate and measure the failure probability of software features;

(3) Evaluate and measure the failure consequence of software features;

(4) Associate test cases to their target software features;

(5) Rank test cases based on the risks of their target features. Test cases for risky

features should be ranked to be exercised earlier; and

(6) Define risk-related coverage to control the testing process and test exit criteria.
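As an illustration of steps (2) to (5), the following minimal Python sketch computes the risk exposure of Eq. (1) for a few features and ranks test cases by the risk of their targets. The feature names, probabilities, costs and test-case identifiers are hypothetical values chosen for the example; they are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    name: str
    failure_probability: float  # P(f), in [0, 1]
    failure_cost: float         # C(f), in arbitrary cost units


def risk(feature: Feature) -> float:
    """Risk exposure from Eq. (1): Risk(f) = P(f) * C(f)."""
    return feature.failure_probability * feature.failure_cost


def rank_test_cases(targets: dict[str, Feature]) -> list[str]:
    """Order test-case ids so that tests of riskier features come first."""
    return sorted(targets, key=lambda tc: risk(targets[tc]), reverse=True)


# Hypothetical features and test-to-feature associations (step 4).
login = Feature("login", 0.10, 9.0)    # risk 0.9
search = Feature("search", 0.30, 2.0)  # risk 0.6
export = Feature("export", 0.05, 4.0)  # risk 0.2
targets = {"tc_export": export, "tc_login": login, "tc_search": search}

ranking = rank_test_cases(targets)
```

In this sketch each test case targets exactly one feature; a test that covers several features would need an aggregation rule, which is left open here.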

Kenett and Tapiero proposed a convergence between risk engineering and quality control from a statistical perspective [19, 21]. In software engineering, risk-based software testing has been gaining attention since the late 1990s. Amland [1] established a generic risk-based testing approach based on Karolak's risk management process model [17]. Bach [2] identified the general categories of risks during software system development including complexity, change, dependency, distribution, third-party, etc. Practices and case studies show that, by spending more time on critical functions, testing can benefit from the risk-based approach in two ways: reduced resource consumption and improved quality.

2.2. Web services group testing

The group testing technique was originally developed at Bell Laboratories for efficiently inspecting products [30]. The approach was further expanded to general cases [13]. It is routinely applied in testing large numbers of blood samples to speed up the test and reduce the cost [14]. In this case, a negative test result of the group under test indicates that none of the individuals in the group has the disease; otherwise, at least one of them is affected. Group testing has been used in many areas such as medical, chemical and electrical testing, coding, etc., using either combinatorial or probabilistic mathematical models.

WS progressive group testing is proposed to enable heuristic-based selective testing in an open service environment [4, 33, 34]. We define the potency of a test as its capability to find bugs or defects. Test cases are ranked and organized hierarchically according to their potency to detect defects, from low potency to high. Test cases are exercised layer-by-layer, following a hierarchical structure, on groups of services. Ruling-out strategies are defined so that web services that fail at one layer cannot enter the next testing layer. Test cases with high potency are exercised first with the purpose of removing as many web services as early as possible.
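The layered ruling-out idea can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `progressive_group_test`, the rule that a service is ruled out at its first failure within a layer, and the example failure data are all hypothetical.

```python
def progressive_group_test(services, layers, run_test):
    """Run test layers in order; rule out any service that fails a layer.

    services: iterable of service ids
    layers:   list of lists of test-case ids, most potent layer first
    run_test: callable (service, test_case) -> bool, True when the test passes
    Returns (surviving services, number of test runs executed).
    """
    survivors = list(services)
    runs = 0
    for layer in layers:
        still_alive = []
        for service in survivors:
            passed = True
            for test_case in layer:
                runs += 1
                if not run_test(service, test_case):
                    passed = False
                    break  # assumed rule: stop at the first failure in a layer
            if passed:
                still_alive.append(service)
        survivors = still_alive  # failed services do not enter the next layer
    return survivors, runs


# Hypothetical data: the set of test cases that each service fails.
failures = {"s1": set(), "s2": {"t1"}, "s3": {"t3"}}


def run_test(service, test_case):
    return test_case not in failures[service]


survivors, runs = progressive_group_test(
    ["s1", "s2", "s3"], [["t1"], ["t2", "t3"]], run_test
)
```

With this data, exhaustive testing would need 9 runs (3 services times 3 test cases), while the layered schedule executes 7, because `s2` is ruled out after the first layer.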

Group testing is by nature a selective-testing strategy, which is beneficial in terms

of reduced number of test runs and shortened test time. The key challenge in



progressive group testing is the ranking and selection of test cases. Bai et al. [3]

further refined the model and proposed a dependency-based approach. Two test

cases tc1 and tc2 are dependent if a service that fails tc1 will also fail tc2. For any

service, a test case will be exercised only if the service passes all its dependent test

cases. We further propose the windowing technique that organizes web services in

windows [34] and incorporates an adaptive mechanism [4, 7]. It follows software

cybernetics theories [9, 11] to dynamically adjust the window size, determine web

service ranking and derive test case ranking.

Several challenges need to be addressed in setting up progressive group testing

strategies, including:

(1) Estimation of probabilities;

(2) Specification of dependencies;

(3) Dynamic updating of estimates; and

(4) Sensitivity evaluation of group testing rule parameters.

2.3. Semantic web services

WS was initiated as W3C standards to enable interoperation between heterogeneous web applications through self-descriptive programmable interfaces [38]. It defines the XML-encoded protocols to support service communication, registration, discovery, composition, and collaboration, such as SOAP (Simple Object Access Protocol), WSDL (Web Services Description Language), UDDI (Universal Description, Discovery and Integration), BPEL (Business Process Execution Language), and so on. These specifications define the expected structure and behavior of the service-based system, which can be used to understand the system at an abstract level. The WS standards enable the analysis, verification and validation of the software behavior before service deployment, at runtime and during online evolution.

This paper analyzes risks based on the WS semantic model. It takes the specifications as the models of the system, particularly OWL-S (Web Ontology Language for Services) [39] as the service composition model. Semantic Web is a new form of web content in which the semantics of information and services are defined to be understood by computers [8]. Ontology techniques are widely used to provide a unified conceptual model of Web semantics [16, 23, 31]. For example, Spies [31] and the MUSING project [37] introduce a comprehensive approach of ontology engineering to financial and operational risks [22]. OWL-S is an OWL-based semantic markup language. It provides a semantic model for composite services. It specifies the intended system behavior in terms of inputs, outputs, process, pre-/post-conditions and constraints using ontologies. Figure 1 shows the OWL-S ontology model. The ServiceProcess ontology is modeled as a workflow of processes including atomic, simple, and composite processes. Two components are used to define the OWL-S process model: the Process Ontology describes the service IOPE (inputs, outputs, preconditions and


effects) properties; and the Process Control Ontology describes the process control constructs such as sequence, iterate, if-then-else, and split.

3. Problem Statement

Fig. 1. OWL-S ontology model.

Given a set of services $S = \{s_i\}$ and a set of test cases $T = \{t_i\}$, selective testing is the process of finding an ordered set of test cases to detect bugs as early as possible, and as many as possible. Suppose that $\forall s \in S$, $B(s) = \{b_i\}$ is the set of bugs in the



service, $T(s) \subseteq T$ is the set of test cases for the service $s$, and $\forall b \in B(s)$, $\exists\, T(b) \subseteq T(s)$ so that $\forall t_i \in T(b)$, $t_i$ can detect $b$. Ideally, the potency of a test case $t$, $\wp(t)$, is defined as the capability of a test case to detect bugs. That is,

\[ \wp(t) = \frac{|B(t)|}{\sum_i |B(s_i)|} \tag{2} \]

where $B(t)$ is the set of bugs that test case $t$ can detect, $|B(t)|$ is the number of bugs $t$ can detect, and $\sum_i |B(s_i)|$ is the total number of bugs in the system. The problem is to find an ordered subset $T'$ of $T$ so that $\forall t_i, t_j \in T'$, $0 \le i, j \le n$, if $i < j$, then $\wp(t_i) > \wp(t_j)$.

However, it is usually hard to obtain the accurate number of bugs present in the software that the test case can detect. In this work, we transform the bug distribution problem into a failure probability problem so that, rather than measuring the number of bugs in a service, we measure the failure probability of the service. We further consider the failures' impact and combine the two factors into a risk indicator for ranking services. In this way, testing is a risk mitigation process. Suppose that $\mathrm{Risk}(s) = P(s) \times C(s)$ is the risk of a service, then the potency of a test case is defined as:

\[ \wp(t) = \frac{\sum_i \mathrm{Risk}(ts_i)}{\sum_i \mathrm{Risk}(s_i)} \tag{3} \]

where the $ts_i$ are the services that $t$ tests.
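A small Python sketch may help to make Eq. (3) concrete. It assumes that the risk of every service and the set of services covered by each test case are already known; the service names, risk values and coverage sets are hypothetical.

```python
def potency(tested_services, risks):
    """Eq. (3): risk covered by a test case divided by the total system risk."""
    total = sum(risks.values())
    if total == 0:
        return 0.0  # no risk in the system, so no test case has potency
    return sum(risks[s] for s in tested_services) / total


# Hypothetical Risk(s_i) values and test-to-service coverage.
risks = {"s1": 0.5, "s2": 0.25, "s3": 0.25}
coverage = {"t1": {"s1"}, "t2": {"s2", "s3"}, "t3": {"s1", "s2"}}

potencies = {t: potency(services, risks) for t, services in coverage.items()}
ordered = sorted(potencies, key=potencies.get, reverse=True)
```

Here `t3` covers 75% of the total risk and is therefore ranked first, while `t1` and `t2` each cover 50%.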

WS testing and bug detection are like a "moving target" shooting problem. As testing progresses, bugs are continuously detected and removed. As a result, the bug distribution and service quality change. The service and composite service under test may also change. Therefore, an adaptive mechanism is necessary during the testing process to re-calculate the ranking and ordering of test cases in reaction to the dynamic changes in services, test case potencies and bugs.

To define the process, we make the following simplified assumptions:

(1) A service can have at most n bugs;

(2) All the bugs are independent; and

(3) A bug is removed immediately after being detected.

The general process of risk-based test selection is as follows:

(1) Calculate the risks of each service, and order the services in a sequence $\{s_i\}$ such that $\forall s_i, s_j \in S$, $0 \le i, j \le n$, if $i < j$, then $\mathrm{Risk}(s_i) > \mathrm{Risk}(s_j)$.

(2) Select the sets of test cases for each service and order the test sets in sequence $\{T_i \mid T_i = T(s_i)\}$ and $\forall T_i, T_j \subseteq T$, $0 \le i, j \le n$, if $i < j$, then $\mathrm{Risk}(s_i) > \mathrm{Risk}(s_j)$.

(3) Rank test cases in each test set according to their potencies. That is, $T_i = \{t_{ij}\}$ such that $\forall t_{ij}, t_{ik} \in T_i$, $0 \le j, k \le n$, if $j < k$, then $\wp(t_{ij}) > \wp(t_{ik})$.



(4) Select the service si with the highest risk and select the corresponding set of test

cases Ti. Exercise the test cases in Ti in sequence.

(5) Re-calculate service risks and test case potencies.

(6) Repeat steps 4–5 until certain criteria are met. We can define different exit

criteria such as the percentage of services (test cases) covered, the number of

bugs detected, etc.
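The loop formed by steps (4) to (6) can be sketched as follows. This is only an illustrative simplification: it runs one test case per iteration instead of a whole test set, uses a fixed test budget as the exit criterion, and takes the re-assessment rule as a caller-supplied function. The names `adaptive_selection`, `update` and `budget`, and the rule that halves the failure probability after a passed test, are assumptions made for the example.

```python
def adaptive_selection(prob, cost, tests_for, run_test, update, budget):
    """Repeatedly test the service with the highest current risk.

    prob, cost: dict service -> failure probability P(s) / failure cost C(s)
    tests_for:  dict service -> list of test-case ids, ranked by potency
    run_test:   callable (service, test_case) -> bool, True when the test passes
    update:     callable (old_probability, passed) -> new failure probability
    budget:     maximum number of test runs (the exit criterion used here)
    """
    prob = dict(prob)
    queues = {s: list(ts) for s, ts in tests_for.items()}
    log = []
    for _ in range(budget):
        candidates = [s for s in queues if queues[s]]
        if not candidates:
            break
        # Steps 1 and 4: choose the service with the highest current risk.
        target = max(candidates, key=lambda s: prob[s] * cost[s])
        test_case = queues[target].pop(0)  # most potent remaining test
        passed = run_test(target, test_case)
        log.append((target, test_case, passed))
        # Step 5: re-calculate the failure probability, and hence the risk.
        prob[target] = update(prob[target], passed)
    return log, prob


def halve_on_pass(p, passed):
    """Hypothetical re-assessment rule: a passed test halves the estimate."""
    return p / 2 if passed else p


log, final_prob = adaptive_selection(
    prob={"a": 0.5, "b": 0.4},
    cost={"a": 1.0, "b": 1.0},
    tests_for={"a": ["a1", "a2"], "b": ["b1"]},
    run_test=lambda service, test_case: True,
    update=halve_on_pass,
    budget=3,
)
```

In this run, service `a` is tested first, its risk then drops below that of `b`, so `b` is tested next: the order of execution adapts to the re-assessed risks.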

4. Risk Assessment

Risks are analyzed based on the WS semantic model. Semantics introduce additional risks to WS. The ontology may be defined, used, and maintained by different parties. A service may define the inputs/outputs of its interface functions as instances of ontology classes in a domain model that is out of the control of the service provider and consumer. For example, AAWS (Amazon Associate Web Services) is an open service platform for online shopping. Its WSDL service interface provides 19 operations with complicated data structure definitions. By translating the WSDL data definition to an ontology definition for semantic-based service analysis, we identified 514 classes (including ontology and property classes) and 1096 dependency relationships between the classes. Figure 2 gives an example to show the usage of an ontology class in different service contexts. In this example, ontology classes are defined for the publication domain. A class Book is defined as a subclass of Publication. Different BookStore services may use the Book ontology to specify input parameters of the operation BookDetails() in the interface BookQuery. An application builder defines a business process BookPurchase as a workflow of services. We can see from the example that a domain model can be used by various services and

Fig. 2. Example of ontology usage in different contexts.



workflows. Therefore, the quality of the domain model has significant impacts on the services and applications in the domain.

Moreover, ontologies introduce complex relationships among data and services which result in an increased possibility of misuse of the ontology classes. For example, in the publication domain, two ontology classes CourseBook and EditBook inherit from the Book ontology. From the domain modeler's perspective, the two categories of books are used for different purposes and are mutually exclusive. However, from the user perspective, such as that of a courseware application builder, the two classes could overlap because an edited book can also be used as course reading material for graduate students. Such conflicting views will result in a possible misuse of the ontology classes when developing education software systems.

Based on the analysis of semantic services, the two factors of risk, failure probability and importance, are assessed at three layers: domain ontology, service and composite service.

4.1. Failure probability estimation

The failure probability of the service-based system is estimated as follows:

(1) Estimate the initial failure probability of each ontology class in the domain.

(2) Adjust the estimation of each class by taking its dependencies into consideration.

(3) Estimate the initial failure probability of a service.

(4) Adjust the service's estimation by taking into consideration the failure probability of its input/output parameters defined by the domain ontology.

(5) Given the failure probability of a set of functions and their input/output

parameters, estimate the failure probability of the composite service based on its

control construct analysis.

Here, we use Bayesian statistics to analyze the initial failure probability $p$ of each ontology class and service function. Given a certain artifact $a$, suppose that we test $a$ $n$ times and observe $x$ failures. Let $X$ be the random variable for the number of failures and suppose that $X$ follows a binomial distribution, that is, $X \sim b(n, p)$. Suppose that the failure probability follows a uniform prior distribution, that is, $p \sim U(0, 1)$ with density function $\pi(p) = 1$ $(0 < p < 1)$. Then, the posterior distribution $P(p \mid X)$ is the beta distribution $\beta(x + 1, n - x + 1)$, and the expectation of the posterior with respect to $p$ is the estimator of the initial failure probability as follows:

\[ E_{P(p \mid X)}(p) = \frac{x + 1}{n + 2} \tag{4} \]
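Eq. (4) follows from the mean of a beta distribution: for $\beta(a, b)$ the mean is $a / (a + b)$, so with $a = x + 1$ and $b = n - x + 1$ it equals $(x + 1)/(n + 2)$. A minimal Python sketch of the estimator is given here; the function name is illustrative, and exact fractions are used only to keep the example free of rounding effects.

```python
from fractions import Fraction


def initial_failure_probability(failures: int, runs: int) -> Fraction:
    """Posterior mean of p under a U(0, 1) prior and a binomial likelihood.

    Implements Eq. (4): (x + 1) / (n + 2) for x failures in n test runs.
    """
    if not 0 <= failures <= runs:
        raise ValueError("need 0 <= failures <= runs")
    return Fraction(failures + 1, runs + 2)


no_data = initial_failure_probability(0, 0)       # prior mean only
some_failures = initial_failure_probability(2, 8)
clean_history = initial_failure_probability(0, 98)
```

With no test history the estimate is 1/2, the mean of the uniform prior. Even a failure-free history never yields an estimate of exactly zero, which keeps every artifact eligible for testing.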

4.1.1. Ontology analysis

A domain usually contains a large number of ontology classes with complex relationships. An error in a class may propagate to others along their dependencies.



Hence, the initial failure probability of an ontology is adjusted with its dependency

analysis.

In this research, dependency relationships are identified from the following three perspectives:

(1) Inheritance. A subclass inherits all the properties of its super-class and can extend the super-class with its own definitions. Generally, a subclass has more restrictions than its super-class.

(2) Collection Computing. An ontology class represents a concept that can be instantiated by a set of instances. Based on set theory, ontology classes may have the following relationships:

• Equivalence. Equivalent classes must have precisely the same instances. That is, for any two classes $C_1$ and $C_2$ and an instance $c$, if $C_1 \equiv C_2$ and $c \in C_1$, it implies that $c \in C_2$, and vice versa.

• Disjointness. The disjointness of a set of classes guarantees that an individual of one class cannot simultaneously be an instance of another specified class. That is, for any two classes $C_1$ and $C_2$, if they are two disjoint classes, then $C_1 \cap C_2 = \emptyset$.

(3) Containment. The relationships above can be nested to form a complex ontology class. The complex class and its nested ontology classes have the containment relationship.

In addition, an ontology class may have properties. The property of an ontology class can also be defined as a class and have the relationships listed above. A dependency graph is defined to model the dependency relationships among the ontologies. In this directed graph, a node represents a class of ontology or property and a link represents a dependency between two classes. Different types of links are defined to represent various types of dependency relationships. Figure 3 illustrates an example dependency graph.

The dependence graph (DG) is transformed into a Bayesian Network (BN) for

inferring the failure probability of each class. The nodes in the BN represent ontology

Fig. 3. Ontology dependency graph.



classes and links represent dependency relationships. Depending on class dependency types, transformation rules are defined as follows:

(1) An ontology class in DG is mapped directly to a node in a BN.

(2) Property classes in DG are not shown in the BN. However, the failure probabilities of those property classes are used to estimate the ontology classes that the properties belong to. As an error in a property will cause errors in its ontology class, we use the product over all the properties as the affecting factor. The adjusted failure probability of the ontology class is defined as follows:

\[ P_{adj}(o) = 1 - (1 - P(o)) \times \prod_{j=1..n} (1 - P(p_j)) \tag{5} \]

where

• $P(o)$ is the estimated failure probability of ontology $o$ and $P_{adj}$ is the adjusted probability taking relationships into consideration.

• $p_j \in Prop(o)$ where $Prop(o)$ is the set of properties of $o$ of length $n$ and $P(p_j)$ is the failure probability of property class $p_j$.

(3) For two classes with an Inheritance relationship, that is, $Inherit(o_1, o_2)$ where $o_1$ is the parent of $o_2$, a directed link is added to the BN between $o_1$ and $o_2$, starting from $o_1$ and ending at $o_2$, to denote that $o_2$ is affected by $o_1$.

(4) For two classes with a CollectionComputing relationship, that is, $Equiv(o_1, o_2)$ or $Disjoin(o_1, o_2)$,

• two nodes $o'_1$ and $o'_2$ are added to the BN with $P(o_1) = P(o'_1)$ and $P(o_2) = P(o'_2)$; and

• two links are added from $o_1$ to $o'_2$ and from $o_2$ to $o'_1$ to denote the mutual dependence relationship.
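The adjustment in rule (2), Eq. (5), can be illustrated with a short Python sketch. It reads the formula as "the class works only if the class itself and all of its properties work"; the function name and the numeric values are hypothetical.

```python
from math import prod


def adjusted_failure_probability(p_class, p_properties):
    """Eq. (5): P_adj(o) = 1 - (1 - P(o)) * prod_j (1 - P(p_j))."""
    return 1.0 - (1.0 - p_class) * prod(1.0 - p for p in p_properties)


# Hypothetical class with failure probability 0.5 and two properties at 0.5.
p_adj = adjusted_failure_probability(0.5, [0.5, 0.5])

# A class without properties keeps its initial estimate.
p_plain = adjusted_failure_probability(0.2, [])
```

Because every factor $(1 - P(p_j))$ is at most 1, the adjusted probability can only stay equal to or exceed the initial estimate $P(o)$.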

Once the BN is created, the standard BN formulas can be used to calculate the probabilities as follows:

\[ P(o_i \mid E_c) = \frac{P(o_i, E_c)}{P(E_c)} \tag{6} \]

where $E_c$ is the current evidence (or the currently observed nodes) and $o_i$ is the node of an ontology class.
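For a very small network, Eq. (6) can be evaluated by brute-force enumeration of all node states. The Python sketch below does this for a two-node network with one Inheritance link from Publication to Book; the conditional probability values are hypothetical, and a real model with hundreds of classes would need a dedicated BN inference library rather than enumeration.

```python
from itertools import product


def joint(assignment, priors, cpt, parents):
    """Joint probability of a full assignment (True = failed, False = ok)."""
    prob = 1.0
    for node, failed in assignment.items():
        if node in parents:
            key = tuple(assignment[q] for q in parents[node])
            p_fail = cpt[node][key]
        else:
            p_fail = priors[node]
        prob *= p_fail if failed else 1.0 - p_fail
    return prob


def posterior(query, evidence, nodes, priors, cpt, parents):
    """Eq. (6): P(query fails | evidence) = P(query, evidence) / P(evidence)."""
    numerator = denominator = 0.0
    for values in product([True, False], repeat=len(nodes)):
        assignment = dict(zip(nodes, values))
        if any(assignment[n] != v for n, v in evidence.items()):
            continue  # inconsistent with the observed nodes
        p = joint(assignment, priors, cpt, parents)
        denominator += p
        if assignment[query]:
            numerator += p
    return numerator / denominator


# Hypothetical network: Publication -> Book (Book inherits from Publication).
nodes = ["Publication", "Book"]
priors = {"Publication": 0.1}
parents = {"Book": ["Publication"]}
cpt = {"Book": {(True,): 0.9, (False,): 0.2}}  # P(Book fails | Publication state)

p_parent_given_child = posterior(
    "Publication", {"Book": True}, nodes, priors, cpt, parents
)
```

Observing a failure in Book raises the failure probability of Publication from the prior 0.1 to 1/3 in this example, which is the kind of evidence propagation the BN is meant to provide.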

4.1.2. Service analysis

Ontologies can be used to define inputs, outputs, operation, process and collaboration of services [35]. An ontology class can be misused in many ways such as different scopes, restrictions, properties, and relationships, which may cause failures in the service-based software. It is necessary to trace how an ontology is used in a diversified context so as to facilitate software analysis, quality control and maintenance. For



example, if an error is detected in an ontology definition, all the affected atomic and composite services can be located by tracing the usage of the ontology in those software artifacts. Given an ontology domain $D$, we define $\mathrm{Ont}(a) = \{o_i\}$ as the set of ontology classes $\{o_i \mid o_i \in D\}$ used in an artifact $a$, and $\mathrm{Art}(o) = \{a_i\}$ as the set of service artifacts that are affected by an ontology class $o \in D$, where $a_i$ can be any type of service artifact, such as a message, operation, interface, service endpoint or workflow.
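The two mappings can be kept consistent by deriving one from the other. The sketch below builds $\mathrm{Art}(o)$ by inverting a given $\mathrm{Ont}(a)$ table. The artifact and class names are hypothetical, and the sketch covers only direct usage; artifacts affected indirectly (for example through class dependencies) would need an additional traversal.

```python
from collections import defaultdict


def build_art(ont: dict[str, set[str]]) -> dict[str, set[str]]:
    """Invert Ont(a) (artifact -> classes used) into Art(o) (class -> artifacts)."""
    art: dict[str, set[str]] = defaultdict(set)
    for artifact, classes in ont.items():
        for ontology_class in classes:
            art[ontology_class].add(artifact)
    return dict(art)


# Hypothetical usage table: which ontology classes each artifact refers to.
ont = {
    "findBook.operation": {"Book", "Author"},
    "findBook.message": {"Book"},
    "orderWorkflow": {"Book", "Publisher"},
}

art = build_art(ont)
# Artifacts to re-examine if an error is found in the class "Book".
print(sorted(art["Book"]))
```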

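The failure probability of a service combines the failure probability of its functionality with the failure probabilities of the ontology classes it uses. Assuming these failures are independent, the service succeeds only if its functionality and every ontology class it uses are fault-free, so:

$$P(s) = 1 - (1 - P_f(s)) \times \prod_{i=1,\ldots,n} (1 - P_{adj}(o_i)) \qquad (7)$$

where

. $P(s)$ is the failure probability of a service.

. $P_f(s)$ is the failure probability of the service functionality.

. $o_i \in \mathrm{Ont}(s)$, $i = 1, \ldots, n$, are the ontology classes used in the service definition.

. $P_{adj}(o_i)$ is the adjusted failure probability of each ontology class.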
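Eq. (7) can be evaluated directly, as in the minimal sketch below. The function name and the input values are assumptions chosen for illustration.

```python
from math import prod


def service_failure_probability(p_f: float, p_adj: list[float]) -> float:
    """Eq. (7): P(s) = 1 - (1 - P_f(s)) * prod_i (1 - P_adj(o_i))."""
    for p in [p_f, *p_adj]:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"probability out of range: {p}")
    return 1.0 - (1.0 - p_f) * prod(1.0 - p for p in p_adj)


# Hypothetical service: functional failure probability 0.1, using two
# ontology classes with adjusted failure probabilities 0.2 and 0.05.
p_s = service_failure_probability(0.1, [0.2, 0.05])
print(round(p_s, 3))  # 1 - 0.9 * 0.8 * 0.95 = 0.316
```

Note that a service with no ontology classes reduces to $P(s) = P_f(s)$, since the empty product is 1.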

4.1.3. Composite service analysis

A composite service defines the workflow of a set of constituent services. For example, in OWL-S, each composite process holds a ControlConstruct, which is one of the following: Sequence, Split, Split-Join, Any-Order, Iterate, If-Then-Else, and Choice. The constructs can contain each other recursively, so a composite process can model all of the possible workflows of WS.

The failure probability of a composite service depends on that of each constituent service and on the control constructs over them [6]. Depending on the control construct, a service is executed either conditionally (e.g. If-Then-Else) or unconditionally (e.g. Sequence). For unconditioned execution, the failure of any service in a construct results in a construct failure; hence the product formula is used to calculate the construct failure probability. For a conditioned construct, we use a weighted-sum formula in which each weight denotes the execution probability of a branch.

We use $cc$ to denote a control construct, and $S(cc) = \{s_i\}$ is the set of services in $cc$. $\rho_i$ is the execution probability of a service $s_i$, with $\sum_i \rho_i = 1$. Table 1 shows the formulas used to calculate the failure probability $P(cc)$ of a control construct.
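The two formulas of Table 1 can be sketched as follows. Because constructs can be nested, the result of one construct can be passed as an input to another. The function names and the probability values are assumptions for illustration; they are not the values of the later example.

```python
from math import prod


def unconditioned_failure(p_services: list[float]) -> float:
    """Unconditioned construct: P(cc) = 1 - prod_i (1 - P(s_i))."""
    return 1.0 - prod(1.0 - p for p in p_services)


def conditioned_failure(branches: list[tuple[float, float]]) -> float:
    """Conditioned construct: P(cc) = sum_i rho_i * P(s_i).

    Each branch is a pair (execution probability, failure probability).
    """
    total_weight = sum(rho for rho, _ in branches)
    if abs(total_weight - 1.0) > 1e-9:
        raise ValueError("execution probabilities must sum to 1")
    return sum(rho * p for rho, p in branches)


# Hypothetical nested workflow: a two-service sequence is one branch of a
# choice that is taken 80% of the time; the other branch is a single service.
sequence = unconditioned_failure([0.1, 0.2])                 # 1 - 0.9 * 0.8
choice = conditioned_failure([(0.8, sequence), (0.2, 0.5)])  # 0.8*0.28 + 0.2*0.5
print(round(sequence, 3), round(choice, 3))
```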

4.2. Importance estimation

We use an importance measurement to estimate the consequences of service failures. A failure in an important service may result in a high loss; thus the importance of a service


implies the severity level of its failures. The importance of an element is evaluated from two perspectives:

(1) Based on dependence analysis: the more other elements depend on an element, the more important it is.

(2) Based on usage analysis: the more contexts an element is used in, the more important it is.

4.2.1. Dependence-based estimation

Given a domain $D$, the importance of an ontology class $o$ is calculated as a weighted sum of the number of its dependent classes, including both directly and indirectly dependent classes. For any two classes $o_1$ and $o_2$, if there is a path between them in the dependence graph $DG$, we define the distance $\mathrm{Dis}(o_1, o_2)$ between them as the number of links on that path. Assume that there exists at least one dependence relationship in the domain, that is, $\exists\, o_1, o_2 \in D$ such that $\mathrm{Dep}(o_1, o_2)$. Then, the dependence-based importance of an ontology class is calculated as follows:

$$C_{da}(o) = \sum_{i} e^{1-i}\, |\mathrm{Dep}_i(o)| \qquad (8)$$

$$C_{dr}(o) = \frac{C_{da}(o)}{\max_{D} C_{da}(o_j)} \qquad (9)$$

where

. $o \in D$ is an ontology class in the domain; $C_{da}(o)$ is the absolute importance of $o$ and $C_{dr}(o)$ is the relative importance of $o$ under the dependence-based approach.

. $\mathrm{Dep}_i(o) = \{o_j\}$ is the set of ontology classes that depend on $o$ at distance $i$, that is, all $o_j \in D$ with $\mathrm{Dep}(o, o_j)$ and $\mathrm{Dis}(o, o_j) = i$.

. $|\mathrm{Dep}_i(o)|$ is the cardinality of the set, that is, the number of dependent ontology classes at distance $i$.

. $\max_{D} C_{da}(o_j)$ is the maximum absolute importance over all $o_j \in D$.
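A minimal sketch of Eqs. (8) and (9) follows. It assumes the dependence graph is given as a map from each class to the classes that directly depend on it, and that the distance is the length of the shortest path, found by breadth-first search; the text does not say which path is used when several exist. The class names are hypothetical.

```python
from collections import deque
from math import exp


def absolute_importance(dependents: dict[str, list[str]], o: str) -> float:
    """Eq. (8): C_da(o) = sum over distances i of e^(1-i) * |Dep_i(o)|."""
    distance = {o: 0}
    queue = deque([o])
    while queue:  # breadth-first search gives shortest distances
        current = queue.popleft()
        for nxt in dependents.get(current, ()):
            if nxt not in distance:
                distance[nxt] = distance[current] + 1
                queue.append(nxt)
    # Each dependent class at distance d contributes a weight of e^(1-d).
    return sum(exp(1 - d) for d in distance.values() if d > 0)


def relative_importance(dependents: dict[str, list[str]]) -> dict[str, float]:
    """Eq. (9): C_dr(o) = C_da(o) / max over the domain of C_da."""
    nodes = set(dependents) | {n for deps in dependents.values() for n in deps}
    c_da = {o: absolute_importance(dependents, o) for o in nodes}
    top = max(c_da.values())
    if top == 0:
        raise ValueError("the domain needs at least one dependence relationship")
    return {o: value / top for o, value in c_da.items()}


# Hypothetical domain: Book and Article depend on Entry; Manual depends on Book.
dependents = {"Entry": ["Book", "Article"], "Book": ["Manual"]}
c_dr = relative_importance(dependents)
print({name: round(value, 3) for name, value in sorted(c_dr.items())})
```

In this example `Entry` has two dependents at distance 1 and one at distance 2, so its absolute importance is $2 + e^{-1}$, the largest in the domain.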

Table 1. Failure probability of control construct.

Category      | Graphic expression | OWL-S example                               | $P(cc)$
Unconditioned | (graphic omitted)  | Sequence, Split, Any-Order                  | $P(cc) = 1 - \prod_{i=1,\ldots,n} (1 - P(s_i))$
Conditioned   | (graphic omitted)  | If-Then-Else, Choice, Iterate, While-Repeat | $P(cc) = \sum_i \rho_i P(s_i)$


4.2.2. Usage-based estimation

As shown in Fig. 2, an ontology class can be instantiated in various services and a service can be integrated in various business processes. The usage model tracks how an ontology or a service is used in different contexts and measures the importance of the element as a weighted sum of its usage counts across contexts. Suppose that, for an element $e$ (an ontology class or a service), $\mathrm{Context}(e) = \{ct_i\}$ is the set of contexts in which $e$ is used. Assume that the element is used at least once in at least one context; then the importance can be measured as follows:

$$C_u(e) = \frac{\sum_{i=1}^{|\mathrm{Context}(e)|} w_i\, \mathrm{Num}(e, ct_i)}{\sum_{i=1}^{|\mathrm{Context}(e)|} \mathrm{Num}(e, ct_i)} \qquad (10)$$

where

. $C_u(e)$ is the importance of $e$ using the usage-based approach.

. $|\mathrm{Context}(e)|$ is the number of contexts that $e$ is used in.

. $w_i$ is the weight for a context $ct_i \in \mathrm{Context}(e)$.

. $\mathrm{Num}(e, ct_i)$ is the number of times $e$ is used in context $ct_i$.
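Eq. (10) is a usage-weighted average of the context weights. A minimal sketch, with hypothetical weights and usage counts:

```python
def usage_importance(usage: list[tuple[float, int]]) -> float:
    """Eq. (10): sum_i w_i * Num(e, ct_i) / sum_i Num(e, ct_i).

    Each entry is a pair (context weight w_i, usage count Num(e, ct_i)).
    """
    total_uses = sum(count for _, count in usage)
    if total_uses <= 0:
        raise ValueError("the element must be used at least once")
    return sum(weight * count for weight, count in usage) / total_uses


# Hypothetical element used 10 times in a context of weight 1.0
# and 30 times in a context of weight 0.5.
c_u = usage_importance([(1.0, 10), (0.5, 30)])
print(c_u)  # (10 + 15) / 40 = 0.625
```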

Given a large number of measured elements, statistical models can also be used to normalize the values, such as the following Bayesian model for a collective choice:

$$C_{ub}(e) = \frac{\frac{1}{N}\sum_{i=1}^{N} C_u(e_i) + C_u(e)}{\frac{1}{N}\sum_{i=1}^{N} |\mathrm{Context}(e_i)| + |\mathrm{Context}(e)|} \qquad (11)$$

where

. $E = \{e_i\}$ is the set of measured elements.

. $N = |E|$ is the number of measured elements.

. $|\mathrm{Context}(e)|$ is the number of contexts that an element $e$ is used in.
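The sketch below evaluates Eq. (11) literally as it is printed: the mean usage-based importance over all measured elements is added to the element's own value in the numerator, and the mean context count is added to the element's own context count in the denominator. The function name and the sample values are assumptions.

```python
def bayesian_usage_importance(
    c_u_e: float,
    n_contexts_e: int,
    c_u_all: list[float],
    n_contexts_all: list[int],
) -> float:
    """Eq. (11): (mean C_u + C_u(e)) / (mean |Context| + |Context(e)|)."""
    n = len(c_u_all)
    if n == 0 or n != len(n_contexts_all):
        raise ValueError("need one C_u value and one context count per element")
    mean_c_u = sum(c_u_all) / n
    mean_contexts = sum(n_contexts_all) / n
    return (mean_c_u + c_u_e) / (mean_contexts + n_contexts_e)


# Hypothetical population of three measured elements; e is the first one.
c_u_all = [0.6, 0.8, 1.0]
n_contexts_all = [2, 4, 6]
c_ub = bayesian_usage_importance(c_u_all[0], n_contexts_all[0], c_u_all, n_contexts_all)
print(round(c_ub, 4))  # (0.8 + 0.6) / (4 + 2)
```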

5. Risk-Based Adaptive Group Testing

A key technique of WS group testing is to rank test cases and exercise them progressively. However, as testing progresses, changes can occur in the service-based system in various ways:

. Services could change, as providers maintain software, update its functionalities and improve its quality.

. The service composition could change. An application builder may select different service providers for a constituent service and may change the workflow of a composite service to meet changed requirements of business processes.

. The risks of software could change due to changes in services and service compositions.


. The potency of test cases could change due to changes in services and service compositions.

. The quality preference could change. For example, in a safety-critical or mission-critical usage context, comprehensive test case coverage may be required before a service is accepted. Otherwise, less strict criteria can be used to reduce the time and cost of testing.

To accommodate these changes, adaptation is necessary to continuously adjust the measurement of software and test cases, and the rules for test case selection, prioritization and service evaluation. In our previous research, we proposed an adaptive testing framework based on software cybernetics theory. Software cybernetics applies cybernetics to control the behaviors of software systems and/or software development activities [11]. The purpose is to control the software development process based on mathematics. This is a new discipline with initial results in process management, requirement acquisition, software integration, and software aging. Cai et al. introduced the Controlled Markov Chain model to software testing, reliability assessment, and fault-tolerance [9]. As shown in Fig. 4, we applied a feedback control model to adaptive WS testing. The controller controls testing activities such as test generation, test selection, test execution, test case ranking, and WS ranking. The optimizer changes the parameters of the controller based on test and evaluation results from the feedback loop. Specifically, the optimizer changes the test case ranking so that highly ranked test cases are used first.

The generic architecture is further refined for group testing with a windowing mechanism. With group testing, WS are tested simultaneously in bulk. The windowing mechanism breaks WS into subsets called windows, and testing is exercised window by window [34]. Windowing allows test cases to be re-ranked and the test hierarchy of the group testing to be re-organized at each window. It improves test effectiveness with fewer but more potent tests and flexibly adjusts the test strategies across windows. In the windowing approach, the controller controls:

(1) Selection of a group of WS of the window size.

(2) Potency calculation of each test case.

(3) Ranking and classification of test cases.

(4) Organization of test cases into test hierarchies.

(5) Testing of WS with the test cases through the test hierarchies.

Fig. 4. Control architecture of adaptive testing.


(6) Ranking of WS based on their testing statistics.

(7) Ruling out of WS that do not meet the passing criteria.

The optimizer optimizes the test strategies from the following aspects:

(1) It adjusts the window size, which controls the rate of re-ranking. If changes are few, the refresh rate can be low.

(2) It adjusts the potency of test cases based on recent test results, so that the controller can re-rank and re-organize the test cases and choose the most potent test cases for testing on the new window.

(3) It adjusts the WS ranking algorithm and rankings, and decides on the WS rule-out criteria.
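The interplay between controller and optimizer can be sketched as a simple loop. The sketch below is an illustration under several assumptions that are not stated in the text: the window size is fixed, a service is ruled out on its first failed test case, and potency is re-estimated after each window as the fraction of runs in which a test case detected a failure. The function name, the service and test case names, and the `run_test` callback are all hypothetical.

```python
from typing import Callable


def windowed_group_testing(
    services: list[str],
    potency: dict[str, float],
    run_test: Callable[[str, str], bool],
    window_size: int,
) -> tuple[list[str], dict[str, float]]:
    """Test services window by window; return the survivors and final potencies."""
    potency = dict(potency)  # do not mutate the caller's estimates
    runs = {test: 0 for test in potency}
    failures = {test: 0 for test in potency}
    passed: list[str] = []
    for start in range(0, len(services), window_size):
        window = services[start:start + window_size]
        # Controller: rank test cases so that the most potent ones run first.
        ranked = sorted(potency, key=potency.get, reverse=True)
        for service in window:
            survived = True
            for test in ranked:
                runs[test] += 1
                if not run_test(service, test):
                    failures[test] += 1
                    survived = False
                    break  # rule the service out on its first failure
            if survived:
                passed.append(service)
        # Optimizer: re-estimate potency from the results observed so far.
        for test in potency:
            if runs[test]:
                potency[test] = failures[test] / runs[test]
    return passed, potency


# Hypothetical setup: t2 exposes faults in ws2 and ws4; t1 never fails.
faulty = {("ws2", "t2"), ("ws4", "t2")}
passed, final_potency = windowed_group_testing(
    services=["ws1", "ws2", "ws3", "ws4"],
    potency={"t1": 0.5, "t2": 0.1},
    run_test=lambda service, test: (service, test) not in faulty,
    window_size=2,
)
print(passed, final_potency)
```

In this run the initial ranking puts `t1` first, but after the first window `t2` is re-ranked to the top, so `ws4` is ruled out after a single test execution.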

Figure 5 shows the architecture of the controller and optimizer. Different strategies can be defined to re-rank test cases and adjust the window size.

This paper proposes a risk-based approach that introduces a dynamic mechanism to enable adaptive risk assessment and test case ranking based on runtime monitoring and profiling of target services and systems.

Figure 6 presents an overview of the proposed approach. As shown in Fig. 6, the risks of the services are identified in two ways: static analysis and dynamic analysis. Static analysis is based on service dependencies and usage topologies. As services may be recomposed online, dynamic analysis is introduced to detect changes at runtime and recalculate the service risks based on runtime profiling of service composition, configuration, and operation. Test cases are associated with their target features under test and are ranked based on the risks of the services. With the support of runtime monitoring, information can be gathered on ontology dependencies, service usage, and service workflows. Profiling and statistical analysis of logged information can

Fig. 5. The adaptive WS group testing architecture.


facilitate detecting changes in the system and adjusting the measurement of risks and test case potencies. Rules are defined to control the testing process. In the generic WS group testing process, rules are defined to control the following testing activities:

. Risk levels used to categorize the test cases and arrange them hierarchically into different layers.

. The strategies for ranking the test cases, such as cost, potency, criticality, dependency, etc.

. The strategies for ruling out services after each layer of testing.

. The strategies for ranking the services, such as importance, failure rates on the test cases, etc.

. The entry and exit criteria for each layer of group testing.

For example, in an experiment, a sensor is instrumented in the process engine of a

composite service [5] to monitor service calls. The number and sequence of calls to

external services are recorded. Table 2 lists the typical sequence of service

Table 2. Example service composition profile (number of invocations in the profile).

Time interval | Execution sequences                       | s1  | s2  | s3 | s4  | s5 | s6 | s7
[t1, t2]      | {s1,s2,s4,s5}, {s1,s2,s5,s4}, {s1,s3,s6}  | 100 | 80  | 20 | 80  | 80 | 20 | --
[t2, t3]      | {s1,s2,s4}, {s1,s3,s6}                    | 50  | 25  | 25 | 25  | -- | 25 | --
[t3, t4]      | {s1,s2,s4,s7}, {s1,s2,s7,s4}, {s1,s3,s6}  | 200 | 140 | 60 | 140 | -- | 60 | 140

Fig. 6. Approach overview.


invocations and the number of invocations to each service in three time intervals. Observation of the logged data shows that:

. In interval $[t_1, t_2]$, $s_2$ and $s_3$ are executed conditionally after $s_1$, with execution probabilities $\rho_{s_2} = 0.8$ and $\rho_{s_3} = 0.2$. $s_4$ and $s_5$ are executed in parallel after $s_2$. $s_6$ is executed after $s_3$.

. In interval $[t_2, t_3]$, service $s_5$ becomes unavailable and there is no subsequent invocation of $s_5$ after $s_2$. In addition, the call distribution between $s_2$ and $s_3$ changes from 0.8:0.2 to 0.5:0.5.

. In interval $[t_3, t_4]$, service $s_7$ is newly bound to the system and invoked after $s_2$ in parallel with $s_4$. The call distribution between $s_2$ and $s_3$ changes to 0.7:0.3.
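The execution probabilities quoted above follow from the invocation counts in Table 2: each branch probability is the number of calls to the branch service divided by the number of calls to $s_1$. A minimal sketch of that calculation, using the counts from Table 2 (the function and variable names are assumptions):

```python
def branch_probabilities(parent_calls: int, branch_calls: dict[str, int]) -> dict[str, float]:
    """Estimate each branch's execution probability as calls / parent calls."""
    if parent_calls <= 0:
        raise ValueError("the parent service must have been invoked")
    return {name: calls / parent_calls for name, calls in branch_calls.items()}


# Invocation counts of s1, s2 and s3 per interval, taken from Table 2.
profile = {
    "[t1,t2]": {"s1": 100, "s2": 80, "s3": 20},
    "[t2,t3]": {"s1": 50, "s2": 25, "s3": 25},
    "[t3,t4]": {"s1": 200, "s2": 140, "s3": 60},
}

rho = {
    interval: branch_probabilities(c["s1"], {"s2": c["s2"], "s3": c["s3"]})
    for interval, c in profile.items()
}
for interval, probabilities in rho.items():
    print(interval, probabilities)
```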

The changes in the composition structure are shown in Fig. 7. In the same way as it monitors composition changes, the system can also detect changes in the ontology domain and in ontology usage. Such changes trigger a reassessment of the two factors of risk: failure probability and importance. For this example, Table 3 shows the changes in the risk of the composite service.

In this application, testing is controlled as a risk mitigation process. That is, the test cases that have a high probability of detecting a risky bug should be exercised first, so that the risky bugs can be detected and removed early and the risk of the whole system can be reduced. As bugs are detected and removed, the rules of the strategies and criteria are also adapted to reflect the changes in the risks of the services. The rule adaptation is by nature a problem of dynamic planning. Each layer

Fig. 7. Example service composition changes.

Table 3. Example of adaptive risk assessment.

Time interval | Risk factor | s1  | s2  | s3  | s4  | s5  | s6  | s7  | Composite service
[t1, t2]      | P(s)        | 0.1 | 0.3 | 0.1 | 0.3 | 0.7 | 0.3 | --  | 0.654
[t1, t2]      | C(s)        | 1   | 0.8 | 0.2 | 0.3 | 0.3 | 0.3 | --  |
[t2, t3]      | P(s)        | 0.1 | 0.3 | 0.1 | 0.3 | --  | 0.3 | --  | 0.840
[t2, t3]      | C(s)        | 1   | 0.5 | 0.5 | 0.5 | --  | 0.5 | --  |
[t3, t4]      | P(s)        | 0.1 | 0.3 | 0.1 | 0.3 | --  | 0.3 | 0.3 | 0.751
[t3, t4]      | C(s)        | 1   | 0.7 | 0.3 | 0.3 | --  | 0.3 | 0.3 |


in the WS progressive group testing is a stage in decision making, and the goal is to select a set of test cases with maximum potential risks.

6. Experiments and Evaluation

6.1. Experiment setup

The BookFinder OWL ontology is used to illustrate OWL features. Suppose a WS provides book-searching services: it accepts queries, searches the repository, and returns the required book information. The user can submit many query conditions such as ISBN, title, authors, publisher, category, cover type, publish year, edition, etc. The OWL-S file (bookfinder.owl, simply bookfinder) defines the atomic service model, as shown in Fig. 8. The service has an input of the class type Book, declared in the domain ontology, and an output of string type. The OWL file (bibtex.owl, simply bibtex) defines the domain ontology related to the service. The overall class hierarchy in the bibtex ontology is shown in Fig. 9.

6.2. Evaluation

Suppose that $T = \{t_i\}$ is the set of test cases, $T_E = \{t_i\} \subseteq T$ is the set of exercised test cases, $B = \{bug_i\}$ is the set of all bugs in the system, and $B_D = \{bug_i\} \subseteq B$ is the set of bugs that have been detected. To evaluate the test results, we define the following two metrics:

(1) Test Cost. The average number of test cases required to detect a bug in a system.

$$\mathrm{Cost}(T) = \frac{|T_E|}{|B_D|} \qquad (12)$$

Fig. 8. The example BookFinder service.


(2) Test Efficiency. Given a number of exercised test cases, the ratio between the percentage of bugs detected and the percentage of test cases exercised.

$$\mathrm{Effect}(T, T_E) = \frac{|B_D|}{|B|} \Big/ \frac{|T_E|}{|T|} \qquad (13)$$

The goal of testing is to detect as many bugs as possible with as few test cases as possible [24]. That is, low test cost and high test efficiency are preferred.
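Both metrics are simple ratios, as the sketch below shows. The function names and the counts in the example are hypothetical and are not results from the experiments.

```python
def cost_per_bug(exercised: int, detected: int) -> float:
    """Eq. (12): Cost(T) = |T_E| / |B_D|."""
    if detected <= 0:
        raise ValueError("cost is undefined until at least one bug is detected")
    return exercised / detected


def efficiency(detected: int, total_bugs: int, exercised: int, total_tests: int) -> float:
    """Eq. (13): Effect(T, T_E) = (|B_D| / |B|) / (|T_E| / |T|)."""
    if exercised <= 0:
        raise ValueError("efficiency is undefined until a test case is exercised")
    return (detected / total_bugs) / (exercised / total_tests)


# Hypothetical run: 1600 test cases and 8 bugs in total; after exercising
# 400 test cases, 4 bugs have been detected.
cost = cost_per_bug(exercised=400, detected=4)
effect = efficiency(detected=4, total_bugs=8, exercised=400, total_tests=1600)
print(cost, effect)  # 100.0 2.0
```

An efficiency above 1 means that bugs are being found faster than the share of test cases exercised would suggest; uniformly random selection would be expected to give a value close to 1.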

Fig. 9. The bibtex ontology example.

Fig. 10(a). The Bayesian network.


Two evaluation experiments were conducted on case studies. For simplicity, we consider only bugs in ontology classes. Assume that each ontology class has exactly one bug, which can be detected by one test case. Test cases are categorized based on their target ontology classes. Each ontology class is assigned 200 test cases. All test cases are independent, and each is designed for exactly one ontology class.

In Experiment 1, eight ontology classes are identified; the corresponding BN is shown in Fig. 10(a). We simulate the joint distribution of the BN. Table 4 shows the calculated risk of each node during iterations of node risk analysis and test

Fig. 10(b). The Bayesian network (continued).

Table 4. Adaptive risk assessment of the ontology classes in experiment 1.

Nodes  | 1      | 2        | 3        | 4        | 5        | 6        | 7        | 8
Round1 | 0.3000 | 0.5000   | 0.8000   | 0.5300   | 0.4130   | 0.8060   | 0.7612   | 0.8418
Round2 | 0.3015 | 0.5036   | 0.8000   | 0.5478   | 0.4148   | 0.8617   | 0.7723   | observed
Round3 | 0.3052 | 0.5124   | 0.8000   | 0.5918   | 0.4192   | observed | 0.7723   | observed
Round4 | 0.3052 | 0.5124   | 0.8000   | 0.5918   | 0.4192   | observed | observed | observed
Round5 | 0.3052 | 0.5124   | observed | 0.5918   | 0.4592   | observed | 0.7723   | observed
Round6 | 0.3396 | 0.5943   | observed | observed | 0.4130   | observed | observed | observed
Round7 | 0.3333 | observed | observed | observed | 0.4130   | observed | observed | observed
Round8 | 0.3333 | observed | observed | observed | observed | observed | observed | observed


selection. Figure 11 shows the results of the experiment, which compares the proposed risk-based approach (RBT) with the random testing approach (RT). In the figure, the horizontal axis shows the number of bugs detected. The curves show the trend of test cost (Fig. 11(a)) and test efficiency (Fig. 11(b)) as the number of bugs detected increases. Here, the RT approach is estimated at two probability levels, 80% and 90%; that is, we estimate the number of test cases required to detect $n$ bugs with the given probability.

In Experiment 2, 16 ontology classes are identified; the corresponding BN is shown in Fig. 10(b). Taking experiment 1 as a training process for experiment 2, we introduce the adaptation mechanism to rank test cases based on their defect detection history. Figure 12 shows the experiment results, which compare test cost (Fig. 12(a)) and test efficiency (Fig. 12(b)) as the number of bugs detected increases.

The experiments indicate that the risk-based approach can greatly reduce test cost and improve test efficiency compared with random testing, and that the learning process and the adaptation mechanism can greatly improve test quality.

Fig. 11. Comparison of test cost and test effectiveness between RBT and RT in experiment 1: (a) cost comparison; (b) efficiency comparison.


7. Conclusion and Outlook

Testing is critical to assure the quality properties of a service-based system so that services can deliver on their promise. To reduce test cost and improve test efficiency, this paper proposes a risk-based approach to ranking and selecting test cases for controlling the process of WS group testing. An adaptation mechanism is also introduced so that the risks can be dynamically measured, and control rules can be dynamically adjusted online. Motivated by unique testing issues, the work shows the convergence among various disciplines including statistics, service-oriented computing, and semantic engineering. Some preliminary results illustrate the feasibility and advantages of the proposed approach.

Future work includes: (1) application and experiments with a real application domain; (2) service behavior analysis to identify typical fault models and risky scenarios; (3) model enhancement to achieve better test evaluation results by using more sophisticated mathematical models and reasoning techniques.

Acknowledgments

This research is supported by the National Science Foundation China (No.

60603035), the Open Fund of the State Key Laboratory of Software Development

Environment (No. SKLSDE-2009KF-2-0X) of Beijing University of Aeronautics and

Fig. 12. Comparison of test cost and test effectiveness between RBT and RT in experiment 2: (a) cost comparison; (b) efficiency comparison.


Astronautics, and the National Basic Research Program of China (973 Program) under Grant (No. 2005CB321901). The second author was partially supported by the FP6 MUSING project (IST-FP6 27097).

References

1. S. Amland, Risk-based testing: Risk analysis fundamentals and metrics for software testing including a financial application case study, Journal of Systems and Software 53(3) (2000) 287-295.

2. J. Bach, Heuristic risk-based testing, STQE Magazine 1(6) (1999).

3. X. Bai, Z. Cao and Y. Chen, Design of a trustworthy service broker and dependence-based progressive group testing, Int. J. Simulation and Process Modeling 3(1/2) (2007) 66-79.

4. X. Bai, Y. Chen and Z. Shao, Adaptive Web Services testing, IWSC, 2007, pp. 233-236.

5. X. Bai, S. Lee, R. Liu, W. T. Tsai and Y. Chen, Collaborative Web Services monitoring with active service broker, COMPSAC, 2008, pp. 84-91.

6. L. Wang, X. Bai, Y. Chen and L. Zhou, A hierarchical reliability model of service-based software system, COMPSAC 1 (2009) 199-208.

7. X. Bai and R. S. Kenett, Risk-based adaptive group testing of Web Services, COMPSAC 2 (2009) 485-490.

8. T. Berners-Lee, J. Hendler and O. Lassila, The Semantic Web, Sci. Am. May 2001. (Revised 2008)

9. K.-Y. Cai, Optimal software testing and adaptive software testing in the context of software cybernetics, Information and Software Technology 44 (2002) 841-844.

10. G. Canfora and M. Penta, Service-oriented architectures testing: A survey. Lecture Notes in Computer Science, Vol. 5413 (2009), pp. 78-105.

11. J. W. Cangussu, S. D. Miller, K. Y. Cai and A. P. Mathur, Software cybernetics, in Encyclopedia of Computer Science and Engineering, to be published by John Wiley & Sons.

12. Y. Chen and R. L. Probert, A risk-based regression test selection strategy, in 14th ISSRE, 2003.

13. R. Dorfman, The detection of defective members of large population, Annals of Mathematical Statistics 14 (1943) 436-440.

14. F. M. Finucan, The blood testing problem, Applied Statistics 13 (1964) 43-50.

15. M. P. Papazoglou and D. Georgakopoulos, Service-oriented computing, Commun. ACM 46(10) (2003) 25-28.

16. A. Gomez-Perez, M. Fernandez-Lopez and O. Corcho, Ontological Engineering (Springer, 2005).

17. D. V. Karolak, Software Engineering Risk Management (IEEE Computer Society Press, 1996).

18. R. S. Kenett and M. Pollak, A semi-parametric approach to testing for reliability growth with an application to software systems, IEEE Transactions on Reliability 35(3) (1986) 304-311.

19. R. S. Kenett and S. Zacks, Modern Industrial Statistics: Design and Control of Quality and Reliability (Duxbury Press, San Francisco, 1998).

20. R. S. Kenett and E. Baker, Process Improvement and CMMI for Systems and Software (Auerbach Publications, 2010).

21. R. S. Kenett and C. Tapiero, Quality and risk: Convergence and perspectives, http://ssrn.com/abstract=1433490, QPRC (2009).

22. R. S. Kenett and Y. Raanan, Operational risk management: A practical approach to intelligent data analysis, Printed and online versions (John Wiley & Sons, 2010).


23. D. Martin et al., Bringing semantics to Web Services: The OWL-S approach, SWSWPC, 2004.

24. G. Myers, The Art of Software Testing (John Wiley & Sons, 1979).

25. I. Ottevanger, A risk-based test strategy, STARWest, 1999.

26. Tim O'Reilly, What is Web 2.0, http://oreilly.com/web2/archive/what-is-web-20.html (2005).

27. F. Redmill, Exploring risk-based testing and its implications, Software Testing, Verification and Reliability 14 (2004) 3-15.

28. L. H. Rosenberg, R. Stapko and A. Gallo, Risk-based object oriented testing, in 24th SWE, 1999.

29. M. P. Singh and M. N. Huhns, Service-Oriented Computing (John Wiley & Sons, 2005).

30. M. Sobel and P. A. Groll, Group testing to eliminate all defectives in a binomial sample, Bell System Technical Journal 38(5) (1959) 1179-1252.

31. M. Spies, An ontology modeling perspective on business reporting, Information Systems, 2009.

32. H. Stallbaum, A. Metzger and K. Pohl, An automated technique for risk-based test case generation and prioritization, ICSE (2008), pp. 67-70.

33. W. T. Tsai, Y. Chen, Z. Cao, X. Bai, H. Huang and R. Paul, Testing Web Services using progressive group testing, Advanced Workshop on Content Computing, 2004, pp. 314-322.

34. W. T. Tsai, X. Bai, Y. Chen and X. Zhou, Web Services group testing with windowing mechanisms, SOSE, 2005, pp. 213-218.

35. W. T. Tsai, Q. Huang, J. Xu, Y. Chen and R. Paul, Ontology-based dynamic process collaboration in service-oriented architecture, SOCA, 2007, pp. 39-46.

36. G. S. Watson, A study of the group screening method, Technometrics 3(3) (1961) 371-388.

37. MUSING, Multi-Industry Semantic-Based Business Intelligence Solutions, available at: http://www.musing.eu.

38. Web Services Architecture[s], W3C Working Draft, http://www.w3.org/TR/ws-arch/, Nov. 14, 2002.

39. W3C, "OWL-S: Semantic Markup for Web Services," http://www.w3.org/Submission/OWL-S/.

