
e-Informatica Software Engineering Journal, Volume 10, Issue 1, 2016, pages: 69–87, DOI 10.5277/e-Inf160104

Highly Automated Agile Testing Process: An Industrial Case Study

Jarosław Berłowski∗, Patryk Chruściel∗, Marcin Kasprzyk∗, Iwona Konaniec∗,Marian Jureczko∗∗

∗ NetworkedAssets Sp. z o. o.
∗∗ Faculty of Computer Science and Management, Wrocław University of Science and Technology

[email protected]

Abstract
This paper presents a description of an agile testing process in a medium size software project that is developed using Scrum. The research method used is the case study; surveys, quantifiable project data sources and qualitative project members' opinions were used for data collection. Challenges related to the testing process regarding a complex project environment and unscheduled releases were identified. Based on the obtained results, we concluded that the described approach addresses the aforementioned issues well. Therefore, recommendations were made with regard to the employed principles of agility, specifically: continuous integration, responding to change, test automation and test driven development. Furthermore, an efficient testing environment that combines a number of test frameworks (e.g. JUnit, Selenium, Jersey Test) with custom-developed simulators is presented.

Keywords: software engineering, testing process, agile software development, case study

1. Introduction

Software testing is a very costly part of the software development process; it is sometimes estimated to make up 50% of the whole development cost [1], [2]. It is (or at least should be) one of the main activities in agile software development methods. Beck and Andres [3] claimed it to be a measure of project progress and the main means of assuring software quality. On the other hand, applying agile software development methods significantly affects the testing process. The agile methods usually require testing early and having the tests automated. In consequence, testing is not left to the final phase, but requires sustainable investments during the whole process (including after-release maintenance of automated test cases). Some of the authors even recommend to start with testing [4]. Agile testing has been studied for several years; however, there are still unanswered questions regarding scalability [5], the role of testers [6] or test automation [7].

The goal of this research is to extend the body of knowledge concerning agile testing by documenting a real life software testing process. This paper presents a case study of a medium-size software project with special factors that affect the aforementioned process, i.e. requests for unscheduled releases and high complexity of the project environment. The project is a Java Enterprise system in the telecommunication domain and its main purpose is to ease the control and management of network devices. Thus, there is a number of features that concern integration and communication with other systems and devices. Functional tests of such features involve items that are outside of the tested system, but are necessary for a successful test execution. Therefore, the test environment is complex and its management can consume a considerable amount of resources, specifically in the case of automated tests, where all changes that come from test execution should be verified and reverted in order to ensure test repeatability.

On the other hand, there are many unscheduled release requests that concern the newest version of the developed system. Those releases regard presentations for end users or deployments into an environment for acceptance tests. Nonetheless, the released version must satisfy company quality standards and must be 'potentially shippable' [8]. The unscheduled releases could be explained as releases in the middle of a Sprint (the project is developed using Scrum) that are announced one or two days in advance and are requested to contain some features from the current Sprint and, of course, all the features developed in previous Sprints. The unscheduled releases create challenges. It is critical to ensure that the 'old' functionality still works correctly, and there are very limited resources for regression tests, since there are the 'in-progress' features that must be straightened up before the release. The solution is somewhat obvious – test automation. In order to support the unscheduled releases a complex set of automated regression tests must be present and frequently executed. In consequence, it is possible to evaluate the current version of the software instantly and decide if it is 'potentially shippable'. The paper describes how the challenges are satisfied, in which ways they affected the testing process and what results were obtained.

The case study presents an insider perspective, i.e. it is conducted by members of the team that develops the project. Therefore, it is possible to get a really deep insight, but unfortunately the risk of a biased observation or conclusion grows simultaneously. In consequence, the case study is focused on a quantitative approach, which is associated with a lower risk of subjectivity, rather than the qualitative one, which is usually employed in similar case studies. The study is designed according to the guidelines given by Runeson and Höst [9].

A number of concepts is used in this paper. Let us present explanations of them in order to clarify possible ambiguities. The authors refer to project size. The concept is in fact the metric which Mall [10] identified as one of the two most popular and the simplest measures of project size, namely the number of Lines of Code (LOC). The project investigated here is described as a medium-size software project. The small, medium and large scale is not a precise term; it is very intuitive (e.g. more intuitive than LOC). The investigated project has been classified as a medium-size one, since it seems to be a bit smaller than the projects described in [5] and [6], which are reported as large ones. The authors refer to costs of development and test. This should be considered as the amount of time committed by the development team during project related activities. One of the investigated aspects of the testing process is availability for an unscheduled release. A possibility of making an application release is evaluated using the results of the functional tests (Selenium tests and REST tests) executed in a continuous integration system (100% success of unit tests is a prerequisite of the functional ones). The application is truly ready for release if 100% of the functional tests are passed. The application is fit for a presentation in the case when at least 90% of the tests are passed. If more than 10% of the tests failed, no form of release shall be considered. The investigated testing process was assessed using the concept of the level of agility that evaluates the adoption of different agile principles using the concept of the level of adoption. Both aforementioned concepts are borrowed from [5]; further details about them can also be found in section 2.3.

The rest of this paper is organised as follows. The next section presents the goal and study design as well as a detailed description of the study context. The obtained results are documented in section 3. In section 4, the threats to validity are discussed and in section 5, related work is summarized. Finally, the discussion and conclusions are given in section 6.

2. Goal and Study Design

2.1. Context

The context is described according to the guidelines given by Petersen and Wohlin [11]. It corresponds with the suggested description structure as well as the recommended elements of the context.

2.1.1. Product

The investigated project regards the development of a system which is used in the management of telecommunication services delivery. The main functionalities of the product are:
– managing customer access devices,
– managing and configuring network delivery devices,
– managing resources of IP addresses from the public and private pools.
The following technologies are used in the software development:
– GWT – Google Web Toolkit, a development toolkit for building and optimizing complex browser-based applications (https://developers.google.com/web-toolkit/);
– Spring – a comprehensive programming and configuration platform for modern Java-based enterprise applications without unnecessary ties to specific deployment environments (http://www.springsource.org/);
– Hibernate – a framework which provides data storage services in a database, regardless of the database system (http://www.hibernate.org/);
– Oracle DB – a relational database management system which is commonly used in corporations because of its data maintenance capabilities, security and reliability (http://www.oracle.com).
Maturity. The project has been conducted since May 2011; the first release was carried out in autumn 2011. The study was conducted in January 2013.
Quality. The main means of ensuring the product quality are tests. Further details regarding the testing process are presented in the next sections.
Size. The total number of Lines of Code is 327387 (calculated by StatSVN – all lines are counted).
System type. An enterprise system with a web UI. The development team makes use of functionalities accessible as REST services and also integrates them with existing systems.
Customisation. It is custom software development. The product is tailored to the requirements of a customer (a telecommunication vendor).
Programming language. The system is developed mostly in Java.

2.1.2. Processes

Figure 1 presents the testing process. In fact, no defined process is used; what the diagram shows is the result of the 'definition of done' that is being employed. New user stories are first identified, then acceptance criteria for them are defined and the stories are estimated. When the Sprint begins, the development team makes the Sprint planning and later conducts software development and unit tests. At the same time a specification for functional tests must be prepared. If pair programming was not carried out, then a code review is needed. Before the end of the Sprint, functional tests must be conducted. When there is time, automated functional tests are prepared; otherwise, automation is postponed to the next Sprint. Test automation is not a part of the 'definition of done.' It might seem strange as automation is very important in agile processes, but unfortunately automation is very time-consuming and for technical reasons sometimes must be done after implementation (this remark does not regard unit tests). In consequence, it was not possible to always do implementation and automation in the same Sprint. Having the choice to either do longer Sprints or postpone automation, it was preferable to postpone the automation as there is the requirement for unscheduled releases, which does not correspond well with long Sprints. Postponing test automation is not a good practice, but conducting functional tests manually [12,13] or automating after the sprint [5,14] is not unique in software development. Puleio [14] even invented a name for not doing the test automation on time, i.e. testing debt. At the end of the Sprint the software is potentially shippable as the new features which are not covered by automated tests (if there are such) are tested manually. As presumably already emerged from the above description, the employed development process is Scrum.


Figure 1. Testing process

2.1.3. Practices, Tools, Techniques

CASE tools
Eclipse STS (http://www.springsource.org/sts) is used as the primary development environment. Furthermore, the development team uses a set of frameworks for test automation which are employed in a continuous integration system (details in the next section).
Practices and techniques
– Time-boxing.
– Frequent (in fact continuous) integration.
– Frequent, small releases.
– Early feedback.
– Test automation.

2.1.4. People

Roles
• Product Owner – comes up with new ideas, updates priorities and chooses the most important issues; he is also responsible for contact with customers.
• Scrum Master – takes control of the Scrum process and enables team members to reach the goal of the Sprint by helping them with all impediments. For a short period of time there were two Scrum Masters, as the project was developed by two different teams.
• Team – responsible for reaching the goal of the Sprint. The development team currently consists of 8 people including software developers, testers (one tester also performs the role of a business analyst), the Scrum Master (who takes part in software development) and the Product Owner (who does not take part in the software development):
– Software developers (5),
– Scrum Master (1),
– Testers (2).

2.2. Experience

Some software developers have long-term experience in the IT field: two people have 5+ years of experience (one of them is the Scrum Master), the Product Owner has 15+ years of experience and one of the testers, who is also a business analyst, has 10+ years of experience. For the rest of the team, this is their first job. They take part in the project, learn technical skills and expand their knowledge using different tools. The entire team has completed bachelor's or master's degree studies in computer science. In order to improve their qualifications, the members of the development team have participated in trainings and certification programs, for example ISTQB (International Software Testing Qualifications Board) – both testers, Professional Scrum Master I – the Scrum Master, and CISCO certificates (CCNA/CCNP, CCAI) – one of the testers.

The development process was configured by the experienced team members, who also take care of the introduction of their younger colleagues. Thus, there are no reasons to believe that there are issues in the process resulting from a lack of experience. Additionally, the trainings and certification programs were used to ensure high development standards and to avoid shortcomings.

2.3. The Level of Agility of the Investigated Testing Process

The paper is focused on an agile testing process. Hence, it is important to carefully analyze the principles of agility and assess the process. Otherwise, there would be a considerable risk of investigating phenomena other than the ones that should be investigated.

2.3.1. Survey Design

The issue is addressed with a survey conducted among all the staff that is or was involved in the project development. The survey is designed according to a set of weighted criteria suggested by Jureczko¹ [5], i.e. the following criteria are used (each of them can be met fully, partly or not at all):
– Test driven development – TDD is one of the most remarkable improvements which has been brought to testing with the agile methods [15], [3]. Presumably not every agile testing process uses TDD; on the other hand, TDD affects the process and system design in such a significant way that it cannot be ignored when evaluating the level of agility (weight = 3).
– Test automation – most of the agile methods move the focus from manual to automated tests, e.g. [3]. Furthermore, the automation is often considered an essential practice [16], [14] (weight = 3).
– Continuous integration – what is the use of automated tests when they are not executed frequently? (weight = 2).
– Communication – it is considered at two levels, i.e. not only between team members but also with the customer; however, with respect to the testing process the most relevant communication is between developers and testers (weight = 2).
– Pair programming – pair programming does not relate to testing directly. Nevertheless, it affects the quality of the source code – it could be considered as on-the-fly code review (weight = 1).
– Root-cause analysis – it is one of the quality related eXtreme Programming corollary practices. It forces a complex analysis for each of the identified defects. Not only should the defect be resolved, but also the software development process should be improved to avoid similar defects in the future. The root-cause analysis is not recommended when some of the essential eXtreme Programming practices are not adopted. Therefore, there are agile testing processes that do not employ it (weight = 1).
– Working software (over comprehensive documentation) – one of the Manifesto for Agile Software Development rules that may have a strong influence on tests (weight = 1).
– Responding to change (over following the plan) – another rule from the aforementioned manifesto. In the context of a testing process, this rule is mainly considered with respect to the flexibility of test plans (weight = 1).
The above listed criteria are extracted from principles suggested in eXtreme Programming [3] and the Agile Manifesto. Each of them is somehow connected with testing and, as the Authors believe, they make a rich enough subset of all agile principles to offer a good understanding of the project reality with respect to the testing process. The questionnaire was generated in a paper form; nonetheless, it is available on-line: http://purl.org/MarianJureczko/TestingProcess/Survey.

¹ The author argued the usage of weights by stating that the principles of agility have different relevancy from the testing perspective. It is also noteworthy that some of the agile software development methods, e.g. XP [3], identify categories of principles that are based on the relevancy of their adoption.

2.3.2. Survey Results

The questionnaire was completed by all present team members and those past members that are still working for the company. The results are discussed in detail in the forthcoming subsections and presented in Figure 2. Each of the sub-figures shows which respondents recognised given levels of adoption. It is worth mentioning that there is not a single 'I do not know' response. Presumably it is a consequence of the fact that each of the respondents actively participates or participated in the investigated project.

Test driven development More than half of the respondents recognised test driven development as partly implemented and the rest of them as fully implemented (Fig. 2a). TDD has been used in the project for a long time; however, not all features are developed in this way. It is always the developer's responsibility to choose the most efficient way of development and there are no repercussions for not doing TDD. Regardless of the selected development method, the unit tests are obligatory and hence TDD is often a good choice. Nonetheless, there is also a number of features that are closely coupled to moderately documented, external services, which do not make a perfect environment for TDD.

Test automation Almost all respondents recognised test automation as fully adopted (Fig. 2b). Test automation is very important in the investigated testing process. The automated unit tests are explicitly written in the project's 'Definition of Done' [8]. The automation of functional tests is also obligatory; however, sometimes it is postponed and is not conducted in the very same sprint as the functionality which has to be tested. Typically, there is one automated test case per user story, but there are also some very complex stories that are tested by more test cases and some complex test cases that cover more stories. All the automated test cases are used as regression tests and are executed in a continuous integration system, which brings us to the next evaluation criterion.

Continuous integration Each of the respondents acknowledged that the continuous integration principle was fully adopted (Fig. 2c). There is only one development line, and as far as it is possible the development team avoids using branches and promotes frequent commits. Moreover, there is a number of jobs defined in the Hudson (http://hudson-ci.org/) system to support automatic compilation, testing and deployment of the developed system.

The unit tests and some of the functional tests are executed after each commit. The functional tests are split into two groups. The first of them contains test cases that do not need a great amount of time for execution, which in fact means no interaction with the graphical interface – these tests are executed after each commit. The second group contains the GUI related tests and for performance reasons it is executed nightly.
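As an illustration of how such a split can be wired into the test suites (this is a hypothetical sketch, not the project's actual configuration), JUnit categories can be used to mark slow, GUI-related test classes so that the commit-time job excludes them while the nightly suite includes them. The GuiTest marker and the class names below are invented for illustration:

import org.junit.Test;
import org.junit.experimental.categories.Categories;
import org.junit.experimental.categories.Category;
import org.junit.runner.RunWith;
import org.junit.runners.Suite;

public class NightlyGuiSuiteExample {

    /** Hypothetical marker interface for slow, GUI-related Selenium tests. */
    public interface GuiTest {}

    /** A GUI-related test class tagged with the category. */
    @Category(GuiTest.class)
    public static class DeviceListPageTest {
        @Test
        public void opensDeviceList() {
            // Selenium interactions would go here.
        }
    }

    /** Nightly suite: runs only classes tagged as GuiTest; a commit-time job would exclude the category instead. */
    @RunWith(Categories.class)
    @Categories.IncludeCategory(GuiTest.class)
    @Suite.SuiteClasses({ DeviceListPageTest.class })
    public static class NightlyGuiSuite {}
}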

The Hudson continuous integration system also supports releases. After a Sprint Review Meeting, i.e. when the sprint ends, a Hudson job is executed and performs the following actions:
– The version number of the developed system is updated.
– A SubVersion 'TAG' is created for the new version of the developed system.
– Release notes regarding newly implemented features are generated.
– A new version of the developed system is deployed to the Apache Maven repository and saved on an FTP server.
– The new version is installed automatically in the customer acceptance tests environment.

Communication All respondents recognised the Communication principle as partly adopted (Fig. 2d). The communication was considered at two levels, namely among team members, specifically between testers and developers, and between team members and the customer. The communication between team members is acknowledged to be effective. Testers and developers work in the same location. The communication is mostly verbal (but all major issues are reported in an issue tracking system) and supported by the Scrum meetings, e.g. the Daily Scrum Standup Meeting. Furthermore, the roles (i.e. testers and developers) are not fixed, thus a team member has an opportunity to do developing as well as testing tasks.

The communication with the customer was not assessed so well. The eXtreme Programming 'on site customer' principle [3] has not been adopted. Furthermore, customer representatives do not participate in Planning and Review meetings [8]. Communication channels usually go through the Product Owner, which sometimes makes the communication inefficient.

(a) Test driven development, (b) Test automation, (c) Continuous integration, (d) Communication, (e) Pair programming, (f) Root-cause analysis, (g) Working software (over comprehensive documentation), (h) Responding to change (over following the plan)
Figure 2. The level of agility in the testing process

Pair programming Most of the respondents recognised pair programming as partly adopted (Fig. 2e). The usage of pair programming in the project is limited – more often informal code reviews are used instead, usually in the form of a code walk-through. Nonetheless, presumably each team member experienced this practice since it is extensively used as a knowledge transfer tool. New employees work in pairs with the more experienced ones in order to improve their learning curve.

Root-cause analysis Presumably there is some confusion over what it means to adopt this principle, since no clear message comes from the questionnaires (Fig. 2f). Definitely the root-cause analysis is not executed for all identified defects; only a fraction of them is so closely investigated. On the other hand, there are Sprint Retrospective Meetings [8] which address the most important issues and give the team an opportunity to identify remedies.

Working software (over comprehensive documentation) Most of the respondents acknowledged that the working software is valued over comprehensive documentation (Fig. 2g). As a matter of fact the customer is not interested in technical documentation. He is provided only with the user and installation guide. Therefore, the team could decide which technical documents to use and there are no reasons for preparing documents that are not useful. The Agile Modeling principles [17] are followed and in consequence the number of created documents is limited and always corresponds with one of two purposes, i.e. a model to communicate or a model to understand. The Product and Sprint Backlogs are used, but neither of these documents is delivered with the released product.

Responding to change (over following the plan) Most of the respondents recognised that responding to change is valued over following the plan (Fig. 2h). With regard to test plans the principle is fully adopted, since test plans are never fixed and there is always room for an update. It looks slightly different in the case of the set of features (user stories) that are chosen for implementation during a Sprint Planning Meeting. The plan created during the aforementioned meeting shall not be changed according to Schwaber [8]. However, there is a possibility to terminate a Sprint and plan a new one (which has been done several times during the project). Furthermore, sprints are not long (for several months one week sprints were used, currently they have been extended to two weeks), thus waiting for a new sprint with an unplanned change is not painful. It should also be stressed that there is no fixed long term plan. Each new sprint is planned from scratch and hence unexpected changes can be easily handled.

Conclusion The results were evaluated in the way suggested in [5], i.e. the value 1 was used for full adoption of an agile principle and 0.5 for partial adoption, then a weighted average was calculated (the weights are given in the previous subsection) and the value of 76.6% was obtained. The value is higher than the one calculated for the project investigated in [5], which could be interpreted as a slightly better adoption of agile testing principles in the process described in this study.
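For clarity, the sketch below shows how such a weighted score can be computed from the criteria and weights listed in section 2.3.1. The per-principle adoption values used here (1.0 for full, 0.5 for partial adoption) are placeholders for illustration only and are not the exact survey figures behind the reported 76.6%:

import java.util.LinkedHashMap;
import java.util.Map;

public class AgilityScore {

    public static void main(String[] args) {
        // Weights taken from section 2.3.1; adoption values are illustrative placeholders.
        Map<String, double[]> criteria = new LinkedHashMap<>();
        criteria.put("Test driven development", new double[] {3, 0.5});
        criteria.put("Test automation",         new double[] {3, 1.0});
        criteria.put("Continuous integration",  new double[] {2, 1.0});
        criteria.put("Communication",           new double[] {2, 0.5});
        criteria.put("Pair programming",        new double[] {1, 0.5});
        criteria.put("Root-cause analysis",     new double[] {1, 0.5});
        criteria.put("Working software",        new double[] {1, 1.0});
        criteria.put("Responding to change",    new double[] {1, 1.0});

        double weightedSum = 0, weightSum = 0;
        for (double[] weightAndAdoption : criteria.values()) {
            weightedSum += weightAndAdoption[0] * weightAndAdoption[1];
            weightSum += weightAndAdoption[0];
        }
        // Level of agility expressed as a percentage (weighted average of adoption levels).
        System.out.printf("Level of agility: %.1f%%%n", 100 * weightedSum / weightSum);
    }
}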

2.4. Objective

The study is focused on a testing process in a medium-size software project with challenging requirements for unscheduled releases and a complex infrastructure to deal with. The paper presents how the testing process was tuned with respect to the aforementioned requirements, hence it can be considered a descriptive or exploratory study [9]. The Authors believe that other practitioners who operate in a similar context will find this work helpful and use it as an example of a well working testing process. The study objective can be defined as follows:

Describe and evaluate an agile testing process with support for unscheduled releases in the development of a software system that depends on a number of external services and devices.

The investigated testing process deals with two challenges. The first of them comes from business, i.e. there are often unexpected opportunities which must be addressed immediately, i.e. within one or two days depending on a release, otherwise they are missed. In consequence, sometimes it is not possible to wait with the release till the end of a Sprint. The second challenge comes from the project domain. The developed system operates in a complex environment that consists of a number of different network devices and services. Thus, it is a typical case that the configuration of the surroundings requires more time than the test execution itself. This challenge significantly affected the testing process, and therefore the Authors believe that it is a very important part of the project's big picture and must not be ignored in the case study.

2.5. Research Questions and Data Analysis Methods

RQ1: To what extent is the requirement for unscheduled releases satisfied? The unscheduled releases are one of the main drivers of the definition of the testing process. Thus, it is critical to study this aspect. It will be evaluated using a quantitative approach. The continuous integration server will be employed as a data source and the results of the builds that execute functional tests (builds with unit tests are a prerequisite) will be used as a measure of the possibility of a release. In order to quantify the data we assumed that a release is possible when all tests are passed. Additionally, we assumed the possibility of a release with an acceptable risk of failure when not more than 10% of the tests failed. Such a risk can be accepted only when the release is conducted exclusively for presentation purposes. The 10% threshold is provided to give insight into the variability of the continuous integration outcomes. It also corresponds with the business requirements, as 90% of available functionality is usually enough to conduct a presentation. The Authors assumed that a release is possible when the build is successfully finished, according to the aforementioned criteria, before 1 PM. The time, i.e. 1 PM, was selected as it was the latest possible hour that enables installation in the customer environment or preparation of a presentation (depending on the goal of the unscheduled release) during the same working day.
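A minimal sketch of this decision rule is given below, assuming the pass/fail counts and the build finish time are read from the continuous integration server; the class and method names are illustrative and not part of the project:

import java.time.LocalTime;

public class ReleaseReadiness {

    public enum Status { READY_FOR_RELEASE, PRESENTATION_ONLY, NOT_READY }

    /**
     * Applies the thresholds described above: 100% passed functional tests means the
     * version is releasable, at least 90% passed is acceptable for presentation purposes
     * only, and anything below that (or a build finished after 1 PM) blocks the release.
     */
    public static Status evaluate(int passed, int total, LocalTime buildFinished) {
        if (total == 0 || buildFinished.isAfter(LocalTime.of(13, 0))) {
            return Status.NOT_READY;
        }
        double passRate = (double) passed / total;
        if (passRate == 1.0) {
            return Status.READY_FOR_RELEASE;
        }
        return passRate >= 0.9 ? Status.PRESENTATION_ONLY : Status.NOT_READY;
    }
}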

RQ2: How is the test automation performed? Test automation is the main means of addressing unscheduled releases, as it is the only way to quickly assess the current software quality and decide if it is acceptable for a release. Furthermore, automation has a crucial role in the testing process and requires significant effort, specifically in a project that operates in a complex environment that creates non-standard requirements for test configuration, where, in consequence, out-of-the-box solutions are not sufficient. Hence, it is vital to describe in detail how the test automation is done. This research question will be addressed using the qualitative approach. All the employed testing frameworks, tools and home-made solutions will be described with respect to their role in the project.

RQ3: How much effort does the test automation require (with respect to different types of activities and tools)? Automated tests play a crucial role in deciding about an unscheduled release, and since two different test frameworks were used it could be very interesting to evaluate the process from a business perspective as well. Hence, the study presents data regarding the costs of test automation that allow one to judge whether it is worth making the investments and which framework should be chosen to have support for unscheduled releases. The company tracks data about committed efforts using the issue tracking system (Atlassian JIRA). Therefore, it is possible to address the third research question by mining the collected data and extracting information regarding efforts connected with creating new automated functional tests as well as with maintaining the existing ones. The results are presented using descriptive statistics and statistical tests.

3. Results

3.1. To what Extent is the Requirementfor Unscheduled Releases Satisfied?

Research regarding availability for an unscheduled release was conducted over a period of two months. The results are shown in Figure 3. For 47.4% of the time the application was ready for the release; for 10.5% of the time it was ready for a release with an acceptable risk of failure; and for 42.1% of the time the application was not ready for release. The obtained results can be claimed satisfactory with respect to availability for an unscheduled release.

Figure 3. Results obtained from the continuous integration system

Kaner et al. [18] considered test coverage in the context of requirements-based testing, which is very similar to the REST and Selenium functional tests, as they are intended for proving that the program satisfies certain requirements. There is at least one automated test per requirement, which in the investigated project is expressed using user stories. Therefore, there is 100% test coverage at the requirement level, i.e. each of the requirements is tested. However, it is an overoptimistic interpretation, as not all corner cases in some of the user stories are automated and thus there are places where errors cannot be detected using the REST or Selenium functional tests. The test execution results may also misleadingly show errors in the case of a database malfunction or an overload of the machine on which the test environment is located. Adding further tests would decrease the probability of releasing a low quality version of the system by limiting the number of not covered corner cases, but it would also increase the probability of blocking a high quality release, which can happen as a result of the aforementioned continuous integration system malfunctions. Additional tests would also increase the costs of test maintenance.

3.2. How is Test Automation Performed?

The results of the execution of automated tests are used as the primary criterion in making decisions regarding unscheduled releases. In order to ensure that the test results correspond with the system quality, the team uses a wide variety of tools and frameworks. This also includes several self-developed solutions which contribute to conducting test automation in complex environments.

3.2.1. JUnit Test Framework

JUnit is a test framework for the Java language. On its own it provides the basic functionality for writing tests, which can be further enhanced by other extensions. JUnit tests also serve as an entry point to nearly every type of test in the project development (including the aforementioned self-developed solutions). It is a first class citizen with regard to the way the unit testing and test driven development are conducted. The unit tests are a part of the build, and thus the system cannot be built (and in consequence released) when some of the tests do not pass. There is also a positive side effect, i.e. developers are forced to instantly fix issues detected by unit tests.

EasyMock class extensions. EasyMock provides Mock Objects for interfaces and objects through class extension, which is used to replace couplings to external systems or dependencies in unit tests.
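A minimal sketch of this style of unit test is shown below; the DeviceGateway collaborator, the DeviceService class under test and the expected command are hypothetical and only illustrate how an external coupling can be replaced by an EasyMock mock:

import static org.easymock.EasyMock.createMock;
import static org.easymock.EasyMock.expect;
import static org.easymock.EasyMock.replay;
import static org.easymock.EasyMock.verify;
import static org.junit.Assert.assertEquals;

import org.junit.Test;

public class DeviceServiceTest {

    /** Hypothetical collaborator that talks to an external device over SSH. */
    interface DeviceGateway {
        String execute(String command);
    }

    /** Hypothetical class under test that delegates to the gateway. */
    static class DeviceService {
        private final DeviceGateway gateway;
        DeviceService(DeviceGateway gateway) { this.gateway = gateway; }
        String reloadDhcp() { return gateway.execute("dhcp reload"); }
    }

    @Test
    public void reloadsDhcpThroughGateway() {
        // The external coupling is replaced by a mock with a recorded expectation.
        DeviceGateway gateway = createMock(DeviceGateway.class);
        expect(gateway.execute("dhcp reload")).andReturn("100 Ok");
        replay(gateway);

        assertEquals("100 Ok", new DeviceService(gateway).reloadDhcp());
        verify(gateway);
    }
}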

3.2.2. Automated Selenium Tests

Automated regression tests with Selenium WebDriver are used to test whether high level functionalities and the graphical user interface work and react properly to input. These tests are combined into a suite and executed nightly within the continuous integration system (a new build of the system under test is deployed at the end of every day on a dedicated server and its database is set to a predefined state). These tests are not executed after each commit, as it is in the case of the JUnit and REST tests, since, due to their complexity, the execution takes more than one hour. It is more than the recommended 10 minutes [3] and thus would have decreased the comfort of developers' work.

Tests using Selenium WebDriver are functional; they simulate a user action on the interface and check whether the output matches expectations. Typically a test covers exactly one user story, however, there are exceptions that span multiple test cases and user stories. Additionally, for tests where a connection to an external device is required (and it is not feasible to keep a real device connected to a test machine at all times), the team developed simulators which can be set up to respond like this particular device would (see 3.2.5). The development team strives to have at least one automatic functional test for every user story.
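The fragment below sketches what such a functional test can look like with Selenium WebDriver; the URL, element locators and assertion are invented for illustration and do not reflect the actual application:

import static org.junit.Assert.assertTrue;

import org.junit.After;
import org.junit.Before;
import org.junit.Test;
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.firefox.FirefoxDriver;

public class LoginPageSeleniumTest {

    private WebDriver driver;

    @Before
    public void openBrowser() {
        driver = new FirefoxDriver();
    }

    @Test
    public void loggedInUserSeesDeviceList() {
        // Hypothetical URL and element ids; a real test would point at the nightly test deployment.
        driver.get("http://test-server.example/app/login");
        driver.findElement(By.id("username")).sendKeys("tester");
        driver.findElement(By.id("password")).sendKeys("secret");
        driver.findElement(By.id("loginButton")).click();
        assertTrue(driver.getPageSource().contains("Device list"));
    }

    @After
    public void closeBrowser() {
        driver.quit();
    }
}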


3.2.3. Jersey Test Framework

Tests prepared using the Jersey Test Framework are referred to as the REST tests for short. The framework enables functional tests for REST services. It launches the service on an embedded container, sends HTTP requests and captures the responses.

The REST tests are used to test the functionality of the system (each user story is covered by at least one Selenium or REST test). The communication between a client and a server and its validity is tested; unfortunately, defects from a graphical user interface cannot be detected in such an approach. The REST tests are a part of the build and thus broken tests are detected and fixed early.
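As an illustration of the approach (not actual project code), a REST test built on the Jersey Test Framework can look roughly as follows. The sketch uses the Jersey 2.x JerseyTest API, which deploys the resource on an embedded test container; the PingResource class and its path are hypothetical:

import static org.junit.Assert.assertEquals;

import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.core.Application;
import javax.ws.rs.core.Response;

import org.glassfish.jersey.server.ResourceConfig;
import org.glassfish.jersey.test.JerseyTest;
import org.junit.Test;

public class DeviceResourceRestTest extends JerseyTest {

    /** Hypothetical REST resource under test. */
    @Path("devices/ping")
    public static class PingResource {
        @GET
        public String ping() {
            return "pong";
        }
    }

    @Override
    protected Application configure() {
        // Deploys only the resource under test on the embedded test container.
        return new ResourceConfig(PingResource.class);
    }

    @Test
    public void respondsToPing() {
        Response response = target("devices/ping").request().get();
        assertEquals(200, response.getStatus());
        assertEquals("pong", response.readEntity(String.class));
    }
}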

3.2.4. DbUnit

DbUnit is a JUnit extension targeted at database-driven tests that, among other things, puts a database into a known state between test runs. DbUnit is used for preparing the database for tests. Specifically, it helps in creating the HSQLDB in-memory database that is used during the Selenium and REST tests.
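A sketch of how DbUnit can seed such an in-memory HSQLDB instance before a test is shown below; the connection URL, data set file and its contents are hypothetical and only illustrate the mechanism:

import java.io.FileInputStream;
import java.sql.Connection;
import java.sql.DriverManager;

import org.dbunit.database.DatabaseConnection;
import org.dbunit.database.IDatabaseConnection;
import org.dbunit.dataset.IDataSet;
import org.dbunit.dataset.xml.FlatXmlDataSetBuilder;
import org.dbunit.operation.DatabaseOperation;
import org.junit.Before;

public class DatabaseBackedTest {

    @Before
    public void seedDatabase() throws Exception {
        // Hypothetical in-memory HSQLDB instance used by the system under test.
        Connection jdbc = DriverManager.getConnection("jdbc:hsqldb:mem:testdb", "sa", "");
        IDatabaseConnection connection = new DatabaseConnection(jdbc);

        // Known state loaded from a flat XML data set before every test run.
        IDataSet dataSet = new FlatXmlDataSetBuilder()
                .build(new FileInputStream("src/test/resources/devices-dataset.xml"));
        DatabaseOperation.CLEAN_INSERT.execute(connection, dataSet);
    }
}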

3.2.5. Custom Developed Simulators

In some cases the system under test requires communication with a device or an external service. To solve this problem the team developed:
– SSH Simulator (https://github.com/NetworkedAssets/ssh-simulator/) to simulate communication with a device through the SSH protocol.
– Policy Server Simulator to test communication with a policy server through HTTP.
– WebService simulators to test WebService clients.
The simulators are used in the Selenium and REST tests.

SSH Simulator can be configured to read and respond to commands received through the SSH protocol. The exact behaviour, such as what response should be given to a request, is configured in an XML settings file.

The SSH Simulator can be used in two ways. It is possible to launch it as an SSH server or as a temporary service for a single JUnit test. The SSH Simulator tool is configured using XML files. The XML configuration file contains the expected requests and responses that will be generated by the simulator:

SSH Simulator configuration file

<test_case ...>
    <login>login</login>
    <password>password</password>
    <device_type>CNR</device_type>
    <request delay_in_ms="3000">
        <request_command>dhcp reload</request_command>
        <response_message>100 Ok</response_message>
        <response_prompt>nrcmd>$</response_prompt>
    </request>
</test_case>

The configuration file contains the <test_case> entry that represents the sequence of requests and responses, mapped using a series of <request> nodes (in the presented listing there is only one) that define the expected client requests and instruct the simulator how to reply to them. When the XML configuration file is ready, it is enough to use it in a JUnit test case as presented in the listing below:

Using SSH-Simulator in JUnit

public class MyTest extends SshSimulatorGenericTestCase {

    @Test
    public void sampleTest() {
        initializeNewSshServer(xmlConfigFile, ipAddress, port);
        // do the tests here
    }
}

The WebService simulators are made using J2EE classes from the javax.jws and javax.xml.ws packages. There is a dedicated class that is responsible for launching the WebServices in a separate thread.

Executing a simulated WebService for test purposes


public class WebServiceExecutor {
    private static Endpoint endpoint;
    private static ExecutorService executor;
    ...
    public WebServiceExecutor(NaWebService ws) {
        if (endpoint != null && endpoint.isPublished()) {
            endpoint.stop();
        }
        endpoint = Endpoint.create(ws);
    }

    /** Starts the web service */
    public void publish() {
        if (executor == null) {
            executor = Executors.newSingleThreadScheduledExecutor();
        }
        endpoint.setExecutor(executor);
        endpoint.publish(WS_URL);
    }

    /** Closes the web service and verifies execution results */
    public void shutdown() throws Throwable {
        endpoint.stop();
        assertTrue(webService.isOk());
    }
}

The web service has its own thread, but the 'shutdown' method is called from the JUnit context. Therefore, it is possible to assess what calls the WebService received. It is done by using the 'isOk' method, which should be implemented by each of the simulated WebServices:

Example of a WebService implementation

@WebService(name = "WS", serviceName = "WSService")
public class MyWs extends NaWebService {
    ...
    @WebMethod(operationName = "OP", action = "urn#OP")
    @WebResult(name = "Result", targetNamespace = "http://...")
    @RequestWrapper(...)
    @ResponseWrapper(...)
    public Result op(@WebParam(name = "NAME" ...) String name ...)
            throws OperationFault_Exception {
        try {
            // Verifies that the client passed the expected parameter value.
            assertEquals("PCSCF", name);
        } catch (Throwable t) {
            log.error(t.getMessage(), t);
            reportError(t);
        }
        return new GenericResponseType.Result();
    }
}

The presented example shows how to test thevalue of a WebService parameter. WebServicesimulators are usually employed in the RESTtests and are useful in testing WebService clients.

3.3. How Much Effort Does the TestAutomation Require (with Respectto Different Types of Activities andTools)?

To answer this question, it is necessary to refer to historical data about the efforts made by team members to create and maintain the existing automated tests. For this purpose, data from the issue tracking system used by the company (Atlassian JIRA) were collected.

Then the data about all automated tests have been divided into two categories according to the used tool: Selenium tests and REST tests. This distinction has been made due to the fact that these tools test different layers of the software. The REST tests are focused on the REST services, while the Selenium tests examine the correctness of the operations of the GUI, which in turn may use the aforementioned services. The difference between these two types of tests is also noticeable in their maintenance because of the time needed to activate them. The Selenium tests are started periodically at a specified time by the continuous integration system (or manually by a developer in their environment), which prolongs the response time to errors in tests. The REST tests are executed in the continuous integration system after each commit and in a local environment when developers build the application, so test errors are usually spotted immediately.


Team effort has been measured based on the time which has been logged on specific tasks which regard the creation of automated tests and their subsequent maintenance. Figure 4 presents the measured percentage of time spent on creation and maintenance related to the total time of the REST tests. A relatively low percentage of REST tests maintenance (18%) is due to the fact that fixing is usually easy to perform, so the overwhelming amount of time is spent on the implementation of new test cases. Figure 5 presents the percentage of time spent on the creation and maintenance of the Selenium tests. Figure 6 shows the average time spent on the implementation and maintenance tasks of the Selenium and REST tests. The average time spent by a developer on Selenium test creation (13.46 h) was more than two times longer than on the creation of a REST test (5.53 h). An even greater difference between the average times was observed between the maintenance of the Selenium and the REST tests. The average time of Selenium test maintenance tasks (6.75 h) was more than three times longer than the average time logged on REST test maintenance. It results from the fact that repairing Selenium tests is usually difficult to perform (in most cases there is a need to fix the code on both the client and the server sides). Figure 6 shows the differences between the effort committed to the REST and Selenium tests. This is largely due to the difference in the complexity of these tests. On the other hand, the Selenium tests (which generally require more effort) detect errors in both the server and the client side code.

In the case of the creation of both the Selenium and the REST tests, it is possible to present further statistics, see Table 1. The average numbers of hours committed to the automation of a test case are reported once more, but also the variabilities and the results of analysing the differences between the Selenium and the REST tests are given. In the case of the REST tests not only the mean effort but also the variance is noticeably smaller. In order to compare the two types of tests a two-sample t-test was used to verify the following null hypothesis:

    The average effort committed to the creation of a Selenium test is equal to the average effort of a REST test creation.

versus an alternative hypothesis:

    The average effort committed to the creation of a Selenium test is larger than in the case of a REST test.

Table 1. Efforts committed to the creation of the Selenium and REST tests – statistics

                        Selenium tests    REST tests
Mean                         13.46            5.83
Variance                    146.89            5.67
F                               24
P(F ≤ f) one-tail           0.0004
t-Stat                         3.2
P(T ≤ t) one-tail           0.0014

Since the alternative hypothesis is asymmetric, the one-tail version of the t-test was used. The hypotheses are tested at the significance level α = 0.05. The F-test was used to check whether the variances of the two populations are equal and, according to the obtained value of P(F ≤ f), i.e. smaller than 0.05, the two-sample t-test assuming unequal variances² was used. A P(T ≤ t) value equal to 0.0014 was obtained, thus the null hypothesis can be rejected and the alternative one accepted. In other words, the effort needed to create a REST test is significantly smaller than in the case of the Selenium tests.

² Satterthwaite's approximate t-test, a method in the Behrens–Welch family.
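As an illustration of how such a comparison can be reproduced (this is not the tool used in the study, and the effort arrays below are placeholders, since the per-test efforts are not published), the unequal-variances two-sample t-test is available, for example, in Apache Commons Math:

import org.apache.commons.math3.stat.inference.TTest;

public class EffortComparison {

    public static void main(String[] args) {
        // Placeholder effort samples in hours, for illustration only.
        double[] seleniumEfforts = { 4.0, 8.5, 12.0, 20.0, 35.0 };
        double[] restEfforts = { 3.0, 4.5, 5.5, 6.0, 8.0 };

        TTest tTest = new TTest();
        // t(...) and tTest(...) use the unequal-variances (Welch) form for two independent samples.
        double tStatistic = tTest.t(seleniumEfforts, restEfforts);
        double twoTailedP = tTest.tTest(seleniumEfforts, restEfforts);

        // The study tests a one-sided hypothesis, so the two-tailed p-value is halved
        // (valid here when the t statistic points in the hypothesised direction).
        System.out.printf("t = %.3f, one-tailed p = %.4f%n", tStatistic, twoTailedP / 2);
    }
}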

It is not possible to present analogous statistics for the efforts related to the maintenance of the tests. The maintenance task almost always spans multiple test cases, hence there are no data regarding individual tests and the average values have already been reported.

The execution of the automated test cases is done in the continuous integration system, hence it does not involve additional effort. The project compilation and build process, which is frequently executed by developers as a part of their work, includes only unit tests and REST tests and in consequence it is below the 10 minutes suggested by Beck [3]. Therefore, the Authors do not consider test execution as a source of additional effort for a development team.

Figure 4. Time spent on the REST tests
Figure 5. Time spent on the Selenium tests
Figure 6. Average time spent on implementation and maintenance of a single Selenium and REST test

The next logical step in answering this research question leads to the unit tests. Unfortunately, there are no empirical data in this area. As in the case of test driven development, the unit tests are created simultaneously with the code. There were no dedicated tasks regarding unit tests that carry effort related information. Furthermore, it is not possible to decide which part of the tracked effort was committed to the production and which to the test code. However, test driven development considers the preparation of a unit test as an integral part of the development process, hence decisions regarding unit tests should not be driven by cost and effort related criteria in an agile testing process.

There is one more type of tests in the investigated project, i.e. the manual tests. These tests are outside of the scope of RQ3 as they are not appropriate for supporting unscheduled releases, but they could still be interesting. Unfortunately, it was not possible to correlate the manual test efforts with particular requirements and, according to the development team, there is a significant fraction of untracked effort. Therefore, the Authors decided to report the subjective interpretations of the team members involved in manual testing. The interviewed team members estimated the cost of implementing and maintaining an automated test scenario in the investigated project to be up to 20 times higher than the cost of manual execution. Furthermore, the overall testing effort (it covers both manual and automated tests) was estimated by the members of the development team to be close to 25% of the overall development effort.


4. Threats to Validity

In this section the trustworthiness of the results is evaluated: to what extent they are true and not biased toward the Authors' subjective point of view.

4.1. Construct Validity

Most of the employed measures are indirect. The level of agility was assessed using questionnaires, hence there is a risk of misunderstanding the questions or giving biased answers due to unintentional company influence or context. To address the risk a meeting was organized, where all concepts from the questionnaires were explained and all questions were answered. The possibility of unintentional influence is connected with the fact that all of the questioned people (as well as the Authors) were involved in the project development. Therefore, they have a wide knowledge about the object of study, but simultaneously they cannot have the objective point of view which is necessary to spot the influence. In consequence, it must be stated that the level of agility was assessed only from a subjective perspective. Moreover, the Authors' involvement creates an even greater threat to validity in terms of bias. To mitigate the issue quantitative measures were preferred over qualitative ones in the study design, as the latter are more vulnerable to influence.

The ability of making an unscheduled release was assessed from the perspective of automated tests and the continuous integration system. The test results carry information about the quality of the developed system. Nevertheless, using them exclusively is a simplification of the subject. There could be some unmeasurable factors involved (like team members' intuition). Still, the test results obtained from the continuous integration system are the most convincing, quantitative measure we were able to come up with.

Test automation efforts were evaluated using data stored in the issue tracking system. There are no serious doubts about the quality of the data, but when it comes to completeness the situation changes. There is a considerable probability that a fraction of the efforts was not tracked. There could be small issues that were not reported at all or issues that were missed by the developers, e.g. if they forgot about tracking their efforts. We did not find a way to evaluate which fraction of the collected data suffers from such problems. However, the data filtering was done manually. The developers who were assigned to the test automation tasks were approached and asked how the collected data correspond with their real efforts.

4.2. Internal Validity

According to Runeson and Höst [9] internal validity is a concern when causal relations are examined. This study does not provide such a relation explicitly. Nonetheless, it may create an impression that following the principles that are used in the project described here should lead to similar results, which the Authors in fact believe is true. However, there is still a possibility that some relevant factors are missing as they were not identified by the Authors. To mitigate the risk of this threat, the context was described according to the guidelines suggested by Petersen and Wohlin [11] and the rest of the study was designed following Runeson and Höst's [9] recommendations for a descriptive and, to some extent, exploratory case study. Specifically, the Authors used the suggested research process, terminology, research instruments and validity analysis.

4.3. External Validity

The context of this case study is described in Section 2.1 and there is no evidence that the same or similar results could be achieved in another context. Nonetheless, the Authors believe that the environment complexity (e.g. the need for simulators) increases the efforts related to test automation and hence better results may be obtained when there are fewer couplings with external systems.

Specifically, it must be stressed that the comparison of efforts related to different test frameworks has very limited external validity. Statistical tests were employed, but the investigated data were mined from only one project. Therefore, it is not clear whether the results hold for other projects and it cannot be empirically verified which project-specific factors are relevant for the comparison results. A plausible explanation is presented in the 'Discussion and Conclusions' section; however, data from additional projects are required to confirm it.

4.4. Reliability

According to Runeson and Höst [9], reliability is concerned with the extent to which the data and the analysis are dependent on specific researchers. The conducted analyses are rather straightforward and several perspectives have been presented to ensure reliability. Nonetheless, the Authors do not publish the raw data, as they contain business-critical information, and that can be an obstacle to replicating the study. Additionally, the Authors were involved in the project development and thus the observations as well as the conclusions may be biased – the issue is discussed in Subsection 4.1.

5. Related Work

The agile methods have been used for a number of years. Thus, a number of case studies with regard to the testing process have already been conducted. Nonetheless, the Authors are not familiar with any works that analyze the agile testing process with respect to unscheduled releases. On the other hand, the complexity of the developed system is always a factor taken into account; however, it seldom becomes the object of a study – none of the works reported in this section consider a similar approach to handling system complexity.

Kettunen et al. [2] compared testing processes between software organizations that applied agile practices and ones that employed traditional plan-driven methods. Altogether twelve organizations were investigated and as a result the authors concluded that agile practices:
– tend to allow more time for testing activities, while the total time for the project remains the same,
– smooth the load of test resources,
– require stakeholders to understand and conform to the practices in agile methods,
– are usually supported by internal customers,
– allow faster reaction time for change.
The findings advocate agile testing but do not correspond with the goals of the process investigated in this study, i.e. support for unscheduled releases and complex infrastructure.

Jureczko [5] investigated the level of agility in a testing process in a large scale financial software project. The studied project is of different size and comes from another domain; nevertheless, the suggested criteria for agility evaluation have been used to assess the testing process described in this study. Therefore, a comparison is possible. The process from [5] is significantly outperformed in the scope of continuous integration, pair programming, root-cause analysis and working software (over comprehensive documentation); however, it performs better in the field of communication. The overall assessment is higher in the process studied in this work. Let us elaborate on the development and testing process from [5]. The project was planned for 150 working years, but later the duration was extended. More than 150 highly skilled developers, testers and managers were involved. The project was divided into five sub-projects and the paper focuses on only one of them. The sub-project was developed by a group of about 40 people. The project is a custom-built solution that supports more than 1500 initially identified case scenarios. The system is based on a well defined technical framework, called Quasar. The development team frequently delivers new releases with new functionalities. There are two major releases per year: in spring and autumn. They are used to deliver large changes and, consequently, possibly bulks of bugs. The two major ones are supplemented by hotfix releases that target the most important bug-fixes only. The employed testing process enforces practices borrowed from the V-Model. Testers work on test concepts once the specification of a requirement is ready. Subsequently, developers write the source code and perform manual tests, while testers write new automated tests. Each of the tests is immediately added to the regression test set that is executed daily. Subsystem tests are performed after the integration. They are usually manual and focused on new features.

The role of test automation in a testing process was empirically evaluated by Karhu et al. [7] in five software organizations, one of which was developing a complex system with a number of interfaces to customer-specific external systems, which creates challenges in test automation. The authors identified a number of interesting consequences of automation. Quality improvement through better test coverage and an increase in the number of executed test cases were noted among the benefits, whereas the costs of implementation, maintenance and training were pointed out as the main disadvantages. Moreover, according to Bertolino [1] great emphasis is put on this issue and methods of extending the degree of attainable automation are in the focus of testing research. Unfortunately, there is a dichotomy with regard to the industrial reality, i.e. many companies which claim to have adopted XP practice the automation only in a limited scope [19]. Considerable research has been conducted on test automation [16, 20–23] and in general this practice is strongly recommended. Among the aforementioned works, [21] is especially noteworthy as it was conducted in the context of the Scrum method. Another commonly investigated agile testing practice is test driven development. There is evidence for its usefulness in the industrial environment [24, 25]; reports of controlled experiments are also available [26, 27].

Winter [28] investigated the challenges regarding testing in a complex environment; however, the complexity came from evaluating usability for a variety of end users. One of the areas of interest was the agility of the testing process and the balance between formal and informal approaches. The agility was considered in the context of the Agile Manifesto, and hence there was limited overlap with the criteria employed in our study (i.e. only Working software (over comprehensive documentation) and Responding to change (over following the plan) are considered in both studies).

Talby et al. [6] described the installation of agile software testing practices in a real large-scale project in a traditional environment. The authors pointed out that in the investigated case study the agile methods dramatically improved quality and productivity. The testing process was analyzed and described in four key areas: test design and activity execution, working with professional testers, planning, and defect management. The investigated project is a sub-project of a larger one, which was developed by 60 skilled developers and testers. Thus, there is a considerable probability that the sub-project is similar in size to the project described in this study. More details about the software project investigated in [6] can be found in [29].

A lesson learned from a transformation from a plan-driven to an agile testing process is documented in [14]. Extreme programming and Scrum were adopted, which makes the process similar to the one described here. Hence, the challenges described by Puleio [14] may arise when trying to adopt the principles advocated in this paper in a traditional (i.e. not agile) environment.

6. Discussion and Conclusions

This paper contributes to the body of knowledge in two ways. The Authors provide a detailed case study of an agile testing process, which can be used in further research and in combination with other studies to draw general conclusions. The documentation of the testing process in this project could also be beneficial to practitioners interested in installing an agile testing process in a similar environment. In particular, a number of ready-to-use testing tools are recommended (e.g. the simulators).

A survey was conducted in order to assess the level of agility and the results showed a high level of adoption, i.e. 76.6%. Furthermore, each of the assessment criteria was compared against the project reality in a qualitative approach, which gives an insight into the way the agile principles work. A detailed analysis indicated some areas of possible improvement, e.g. communication with the customer and pair programming.


The project's preparedness for an 'on short notice' release was analysed and assessed according to the test execution results in the continuous integration system. The results showed that for about 47% of the time the project was ready to be released, and for 10% of the time it could have been released with an acceptable risk. Hence, the Authors can conclude that the requirement for unscheduled releases is supported to a considerable extent.

A comparative evaluation of the cost of the REST and Selenium tests was conducted. It measured how much time is necessary in both test frameworks for the implementation of new tests and the maintenance of existing ones. A significant disadvantage was found in the case of the Selenium tests, which we believe is a result of using Google Web Toolkit for the graphical interface – this framework uses dynamic identifiers and generates complex (at least from the Selenium point of view) web pages. This extra cost is counterbalanced by the ability to detect bugs in the GUI, which is covered only by the Selenium tests. Nonetheless, the big difference in costs encourages reconsideration of the testing approach.
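
To make the source of this extra cost concrete, a minimal sketch of a Selenium WebDriver check (written in the JUnit style, but not taken from the project's test suite) is shown below. The URL, the button label and the use of ChromeDriver are assumptions made purely for illustration; the point is that a locator based on a GWT-generated identifier, such as By.id("gwt-uid-123"), breaks whenever the identifiers are regenerated, so the test has to rely on more verbose text- or structure-based XPath locators, which are slower to write and more expensive to maintain.

import org.junit.After;
import org.junit.Test;
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;
import static org.junit.Assert.assertTrue;

public class DeviceListPageTest {

    // Requires a chromedriver binary on the PATH; any WebDriver implementation would do.
    private final WebDriver driver = new ChromeDriver();

    @Test
    public void addDeviceButtonOpensForm() {
        // Hypothetical URL of the tested GWT application.
        driver.get("http://localhost:8080/app#devices");

        // A locator such as By.id("gwt-uid-123") would break whenever GWT
        // regenerates its identifiers, so the element is located by its visible
        // label instead – more stable, but verbose and sensitive to text changes.
        driver.findElement(By.xpath("//button[text()='Add device']")).click();

        assertTrue(driver.findElement(
                By.xpath("//div[contains(@class,'device-form')]")).isDisplayed());
    }

    @After
    public void quitBrowser() {
        driver.quit();
    }
}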

A brief description of all the tools that were used to implement automated tests is provided. The main contribution in this area regards the suggestion for mitigating environment complexity with simulators. There is a detailed description of an open-source project called SSH-simulator, which was co-developed by the Authors; in fact, it is officially presented in this paper for the first time. Furthermore, the Authors also suggested a straightforward solution for simulating WebServices in a test environment. The paper contains listings that show how to mock a WebService in the context of the JUnit tests. The Authors believe that the detailed descriptions of those simulators will help other practitioners who face similar challenges during test automation, since the presented solutions are ready to use (the necessary listings and references to external sources are given in Sec. 3.2.5).
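
The actual listings are given in Section 3.2.5; purely as an illustration of the general idea, and not as the project's code, a WebService can be stubbed in a JUnit test by publishing a local JAX-WS endpoint that returns canned answers, so the system under test never contacts the real external system. The service name, operation, port and URL below are hypothetical, and the javax.* packages assume a pre-Jakarta Java runtime (e.g. Java 8) where JAX-WS is bundled with the JDK.

// DeviceInventoryStub.java – a stub standing in for the real, external inventory WebService.
import javax.jws.WebMethod;
import javax.jws.WebService;

@WebService(serviceName = "DeviceInventory")
public class DeviceInventoryStub {
    @WebMethod
    public String getDeviceStatus(String deviceId) {
        return "ACTIVE"; // canned answer, independent of any real device
    }
}

// InventoryServiceMockTest.java – publishes the stub locally for the duration of a test.
import java.net.HttpURLConnection;
import java.net.URL;
import javax.xml.ws.Endpoint;
import org.junit.After;
import org.junit.Before;
import org.junit.Test;
import static org.junit.Assert.assertEquals;

public class InventoryServiceMockTest {

    // Hypothetical local address; the tested component would be configured to use it
    // instead of the address of the real external system.
    private static final String ADDRESS = "http://localhost:8088/mock/inventory";

    private Endpoint endpoint;

    @Before
    public void publishStub() {
        endpoint = Endpoint.publish(ADDRESS, new DeviceInventoryStub());
    }

    @After
    public void stopStub() {
        endpoint.stop();
    }

    @Test
    public void stubIsReachableInsteadOfRealSystem() throws Exception {
        HttpURLConnection connection =
                (HttpURLConnection) new URL(ADDRESS + "?wsdl").openConnection();
        assertEquals(200, connection.getResponseCode());
    }
}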

The overall results are satisfactory with regard to the project goals. Therefore, we would like to recommend following the same rules (in similar projects), i.e. adopting the principles of agility, specifically assuring high quality and coverage of automated tests and employing a continuous integration system for automated builds.

References

[1] A. Bertolino, “Software testing research: Achievements, challenges, dreams,” in Future of Software Engineering, FOSE ’07. IEEE, 2007, pp. 85–103.

[2] V. Kettunen, J. Kasurinen, O. Taipale, and K. Smolander, “A study on agility and testing processes in software organizations,” in Proceedings of the 19th International Symposium on Software Testing and Analysis. ACM, 2010, pp. 231–240.

[3] K. Beck and C. Andres, Extreme programming explained: embrace change. Addison–Wesley Professional, 2004.

[4] L. Koskela, Test driven: practical TDD and acceptance TDD for Java developers. Manning Publications Co., 2007.

[5] M. Jureczko, “The level of agility in the testing process in a large scale financial software project,” in Software engineering techniques in progress, T. Hruška, L. Madeyski, and M. Ochodek, Eds. Oficyna Wydawnicza Politechniki Wrocławskiej, 2008, pp. 139–152.

[6] D. Talby, A. Keren, O. Hazzan, and Y. Dubinsky, “Agile software testing in a large-scale project,” IEEE Software, Vol. 23, No. 4, 2006, pp. 30–37.

[7] K. Karhu, T. Repo, O. Taipale, and K. Smolander, “Empirical observations on software testing automation,” in International Conference on Software Testing Verification and Validation, ICST ’09. IEEE, 2009, pp. 201–209.

[8] K. Schwaber, Agile project management with Scrum. Microsoft Press, 2004.

[9] P. Runeson and M. Höst, “Guidelines for conducting and reporting case study research in software engineering,” Empirical Software Engineering, Vol. 14, No. 2, 2009, pp. 131–164.

[10] R. Mall, Fundamentals of software engineering. PHI Learning Pvt. Ltd., 2009.

[11] K. Petersen and C. Wohlin, “Context in industrial software engineering research,” in Proceedings of the 3rd International Symposium on Empirical Software Engineering and Measurement. IEEE Computer Society, 2009, pp. 401–404.

[12] S. Harichandan, N. Panda, and A.A. Acharya, “Scrum testing with backlog management in agile development environment,” International Journal of Computer Science and Engineering, Vol. 2, No. 3, 2014.


[13] K.K. Jogu and K.N. Reddy, “Moving towards agile testing strategies,” CVR Journal of Science & Technology, Vol. 5, 2013.

[14] M. Puleio, “How not to do agile testing,” in Agile Conference. IEEE, 2006, pp. 305–314.

[15] K. Beck, Test driven development: By example. Addison–Wesley Professional, 2003.

[16] M. Jureczko and M. Mlynarski, “Automated acceptance testing tools for web applications using test-driven development,” Electrical Review, Vol. 86, No. 09, 2010, pp. 198–202.

[17] S. Ambler, Agile modeling: effective practices for extreme programming and the unified process. Wiley, 2002.

[18] C. Kaner, J. Bach, and B. Pettichord, Lessons learned in software testing. John Wiley & Sons, 2008.

[19] D. Martin, J. Rooksby, M. Rouncefield, and I. Sommerville, “ ‘Good’ organisational reasons for ‘bad’ software testing: An ethnographic study of testing in a small software company,” in 29th International Conference on Software Engineering, ICSE 2007. IEEE, 2007, pp. 602–611.

[20] M. Catelani, L. Ciani, V.L. Scarano, and A. Bacioccola, “Software automated testing: A solution to maximize the test plan coverage and to increase software reliability and quality in use,” Computer Standards & Interfaces, Vol. 33, No. 2, 2011, pp. 152–158.

[21] R. Löffler, B. Güldali, and S. Geisen, “Towards model-based acceptance testing for Scrum,” Softwaretechnik-Trends, Vol. 30, No. 3, 2010.

[22] X. Wang and P. Xu, “Build an auto testing framework based on selenium and fitnesse,” in International Conference on Information Technology and Computer Science, ITCS 2009, Vol. 2. IEEE, 2009, pp. 436–439.

[23] T. Xie, “Improving effectiveness of automated software testing in the absence of specifications,” in 22nd IEEE International Conference on Software Maintenance, ICSM ’06. IEEE, 2006, pp. 355–359.

[24] N. Nagappan, E.M. Maximilien, T. Bhat, and L. Williams, “Realizing quality improvement through test driven development: results and experiences of four industrial teams,” Empirical Software Engineering, Vol. 13, No. 3, 2008, pp. 289–302.

[25] A.P. Ress, R. de Oliveira Moraes, and M.S. Salerno, “Test-driven development as an innovation value chain,” Journal of Technology Management & Innovation, Vol. 8, 2013, p. 10.

[26] L. Madeyski, “The impact of pair programming and test-driven development on package dependencies in object-oriented design—an experiment,” in Product-Focused Software Process Improvement. Springer, 2006, pp. 278–289.

[27] L. Madeyski, “The impact of test-first programming on branch coverage and mutation score indicator of unit tests: An experiment,” Information and Software Technology, Vol. 52, No. 2, 2010, pp. 169–184.

[28] J. Winter, K. Rönkkö, M. Ahlberg, and J. Hotchkiss, “Meeting organisational needs and quality assurance through balancing agile and formal usability testing results,” in Software Engineering Techniques. Springer, 2011, pp. 275–289.

[29] Y. Dubinsky, D. Talby, O. Hazzan, and A. Keren, “Agile metrics at the Israeli air force,” in Agile Conference, Proceedings. IEEE, 2005, pp. 12–19.