Escaping Test Hell - ACCU 2014

Post on 19-Aug-2014


My talk delivered on 10th of April 2014 in Bristol at the ACCU Conference. It combines a few talks I delivered over 2012 and 2013 with some recent updates. It is an experience report based on the work of many developers from Atlassian and Spartez working for years on Atlassian JIRA. If you have (or are going to have) thousands of automated tests and you are interested in how that may impact you, this presentation is for you.

Transcript of Escaping Test Hell - ACCU 2014

Automated Test Hell

or There and Back Again

Wojciech Seliga

wojciech.seliga@spartez.com, @wseliga

About me

• Coding since 6yo

• Former C++ developer (’90s, early ’00s)

• Agile Practices (inc. TDD) since 2003

• Dev Nerd, Tech Leader, Agile Coach, Speaker, PHB

• 6.5 years with Atlassian (JIRA Dev Manager)

• Spartez Co-founder & CEO

XP Promise

[Chart: Cost of Change over Time; curves: Waterfall, XP]

The Story

2.5 years ago

About 50 engineers

Obsessed with Quality

Almost 10 years of accumulating

garbage automatic tests

18 000 tests* on all levels

*excluding tests of the libraries

13 000 unit tests

4 000 func and integration tests

1 000 Selenium tests

Atlassian JIRA

Our Continuous Integration environment

Test frameworks

• JUnit 3 and 4

• JMock, Easymock, Mockito

• Powermock, Hamcrest

• QUnit, HTMLUnit, Jasmine.js, Sinon.js

• JWebUnit, Selenium, WebDriver

• Custom runners

Bamboo Setup

• Dedicated server with 70+ remote agents (including Amazon Elastic)

• Build engineers

• Bamboo devs on-site

Looks good so far?

for each main branch

Run in parallel in batches

Run first

There is

Much More

Type of tests

• Unit

• Functional

• Integration

• Platform

• Performance

Platforms

• Dimension - DB: MySQL, PostgreSQL, MS SQL, Oracle

• Dimension - OS: Linux, Windows

• Dimension - Java ver.: 1.5, 1.6, 1.7, 1.8

• Dimension - CPU arch.: 32-bit, 64-bit

• Dimension - Deployment Mode: Standalone, Tomcat, Websphere, Weblogic
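To see why the platform tests cannot run on every commit, it helps to count how many combinations the dimensions above span. The following is a minimal sketch, assuming every value can be combined with every other one (the slides do not say which combinations are actually supported); the class name PlatformMatrix is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class PlatformMatrix {
    // Dimension values exactly as listed on the "Platforms" slide.
    static final List<List<String>> DIMENSIONS = List.of(
        List.of("MySQL", "PostgreSQL", "MS SQL", "Oracle"),
        List.of("Linux", "Windows"),
        List.of("1.5", "1.6", "1.7", "1.8"),
        List.of("32-bit", "64-bit"),
        List.of("Standalone", "Tomcat", "Websphere", "Weblogic"));

    // Cartesian product: every combination of one value per dimension.
    static List<List<String>> combinations(List<List<String>> dimensions) {
        List<List<String>> result = new ArrayList<>();
        result.add(new ArrayList<>());
        for (List<String> dimension : dimensions) {
            List<List<String>> next = new ArrayList<>();
            for (List<String> partial : result) {
                for (String value : dimension) {
                    List<String> extended = new ArrayList<>(partial);
                    extended.add(value);
                    next.add(extended);
                }
            }
            result = next;
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(combinations(DIMENSIONS).size() + " platform combinations");
    }
}
```

Under that assumption the matrix has 4 × 2 × 4 × 2 × 4 = 256 cells, so running even a modest suite on each of them is only realistic as a nightly job.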

Run Nightly

Coming

Triggering Builds

• On Commit (hooks, polling)

• Dependent Builds

• Nightly Builds

• Manual Builds

Very slow (long hours) and fragile feedback loop

Serious performance and reliability issues

It takes time to fix it...

Sometimes very long

You commit at 3 PM

You get “Unit Test Green” email at 4PM

You get flood of “Red Test X” emails at 4 - 9PM

Your colleagues on the other side of the globe

You happily go home

You

“We probably spend more time dealing with the JIRA

test codebase than the production codebase”

Dispirited devs accepting RED as a norm

Broken window theory

Feedback Speed

Test Quality

Problem: Catching up with UI changes

Solution: Page Objects Pattern

Page Objects Pattern

• Page Objects model UI elements (pages, components, dialogs, areas) your tests interact with

• Page Objects shield tests from changing internal structure of the page

• Page Objects generally do not make assertions about data. They can assert the state.

• Designed for chaining

Page Objects Example

    public class AddUserPage extends AbstractJiraPage
    {
        private static final String URI = "/secure/admin/user/AddUser!default.jspa";

        @ElementBy(name = "username")
        private PageElement username;

        @ElementBy(name = "password")
        private PageElement password;

        @ElementBy(name = "confirm")
        private PageElement passwordConfirmation;

        @ElementBy(name = "fullname")
        private PageElement fullName;

        @ElementBy(name = "email")
        private PageElement email;

        @ElementBy(name = "sendemail")
        private PageElement sendEmail;

        @ElementBy(id = "user-create-submit")
        private PageElement submit;

        @ElementBy(id = "user-create-cancel")
        private PageElement cancelButton;

        @Override
        public String getUrl()
        {
            return URI;
        }

        ...

        @Override
        public TimedCondition isAt()
        {
            return and(username.timed().isPresent(),
                    password.timed().isPresent(), fullName.timed().isPresent());
        }

        public AddUserPage addUser(final String username, final String password,
                final String fullName, final String email, final boolean receiveEmail)
        {
            this.username.type(username);
            this.password.type(password);
            this.passwordConfirmation.type(password);
            this.fullName.type(fullName);
            this.email.type(email);
            if (receiveEmail) {
                this.sendEmail.select();
            }
            return this;
        }

        public ViewUserPage createUser()
        {
            return createUser(ViewUserPage.class);
        }

        public <T extends Page> T createUser(Class<T> nextPage, Object... args)
        {
            submit.click();
            return pageBinder.bind(nextPage, args);
        }
    }

Using Page Objects

    @Test
    public void testServerError()
    {
        jira.gotoLoginPage().loginAsSysAdmin(AddUserPage.class)
            .addUser("username", "mypassword", "My Name", "sample@email.com", false)
            .createUser();

        // assertions here
    }

Problem: Opaque Test Fixtures

Solution: REST-based Set-up

REST-based Setup

    @Before
    public void setUpTest() {
        restore("some-big-xml-file-with-everything-needed-inside.xml");
    }

VS

    @Before
    public void setUpTest() {
        restClient.restoreEmptyInstance();
        restClient.createProject(/* project params */);
        restClient.createUser(/* user params */);
        restClient.createUser(/* user params */);
        restClient.createSomethingElse(/* ... */);
    }

Problem: Flaky Tests

Solution: Timed Conditions, Mock Unreliable Deps, Test-friendly Markup
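The idea behind timed conditions is to poll for a state until a deadline, instead of sleeping for a fixed time or asserting immediately. Below is a generic sketch of that idea only; it is not the Atlassian Selenium TimedCondition API, and the names Poller and waitUntil as well as the timeout values are assumptions made for illustration.

```java
import java.util.function.BooleanSupplier;

class Poller {
    // Polls the condition until it holds or the timeout elapses.
    // Returns true as soon as the condition is met, false on timeout or interruption.
    static boolean waitUntil(BooleanSupplier condition, long timeoutMillis, long intervalMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() >= deadline) {
                return false;
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for callers
                return false;
            }
        }
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        // Simulated asynchronous page logic: the "element" appears after about 200 ms.
        boolean present = waitUntil(() -> System.currentTimeMillis() - start >= 200, 2_000, 50);
        System.out.println("element present: " + present);
    }
}
```

A test built on such a helper waits only as long as the page actually needs, and fails with a timeout rather than racing against the browser.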

Problem: Flaky Tests

Solution: Quarantine, Fix, Eradicate

Quarantine

• @Ignore

• @Category

• Quarantine on CI server

• Recover or Die
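Quarantining boils down to tagging a flaky test so that the main build skips it while a separate build keeps running it until it is fixed or deleted. In JUnit 4 that tag would typically be a @Category annotation; the sketch below imitates the mechanism with plain reflection so that it runs without any test framework. The annotation Quarantined, the sample class SearchTests and the class QuarantineFilter are all hypothetical.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical marker; with JUnit 4 a @Category annotation would play this role.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface Quarantined {
    String reason();
}

// Made-up test class with one healthy and one quarantined test.
class SearchTests {
    public void testSimpleSearch() { }

    @Quarantined(reason = "races with asynchronous page logic")
    public void testSearchAutocomplete() { }
}

class QuarantineFilter {
    // Returns the names of the test methods to run: either only the quarantined
    // ones (separate build) or only the healthy ones (main build).
    static List<String> select(Class<?> testClass, boolean quarantined) {
        List<String> names = new ArrayList<>();
        for (Method method : testClass.getDeclaredMethods()) {
            if (!method.getName().startsWith("test")) {
                continue;
            }
            if (method.isAnnotationPresent(Quarantined.class) == quarantined) {
                names.add(method.getName());
            }
        }
        Collections.sort(names); // reflection order is unspecified
        return names;
    }

    public static void main(String[] args) {
        System.out.println("main build:       " + select(SearchTests.class, false));
        System.out.println("quarantine build: " + select(SearchTests.class, true));
    }
}
```

Keeping a reason on the marker makes the "Recover or Die" decision easier later, because every quarantined test documents why it was pulled out.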

Non-deterministic tests are a strong inhibitor of change instead of a catalyst

Execution Time: Test Level

Unit Tests

REST API Tests

JWebUnit/HTMLUnit Tests

Selenium/WebDriver Tests

Speed Confidence

Our example: Front-end-heavy web app

100 WebDriver tests: 15 minutes

100 QUnit tests: 1.2 seconds

Test Pyramid

Unit Tests (including JS tests)

REST / HTML Tests

Selenium

Good!

Test Code is Not Trash

Design

Maintain

Refactor

Share

Review

Prune

Respect

Discuss

Restructure

Optimum Balance

Isolation Speed Coverage Level Access Effort

Dangerous to tamper with

Maintainability

Quality / Determinism

Two years later…

People - Motivation

Making GREEN the norm

Shades of Red

Pragmatic CI Health

Build Tiers and Policy

Tier A1 - green soon after all commits: unit tests and functional* tests

Tier A2 - green at the end of the day: WebDriver and bundled plugins tests

Tier A3 - green at the end of the iteration: supported platforms tests, compatibility tests

Wallboards: Constant Awareness

Training

• assertThat over assertTrue/False and assertEquals

• avoiding races - Atlassian Selenium with its TimedElement

• Favouring unit tests over functional tests

• Promoting Page Objects

• Brownbags, blog posts, code reviews

Quality

Automatic Flakiness Detection Quarantine

Re-run failed tests and see if they pass
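The re-run rule can be sketched in a few lines: a test that fails and then passes on an unchanged codebase is flaky, while a test that keeps failing is genuinely broken. This is an illustrative sketch only; the slides do not show how the detection was implemented, and the names FlakinessDetector, Verdict and classify as well as the number of re-runs are assumptions.

```java
import java.util.function.BooleanSupplier;

class FlakinessDetector {
    enum Verdict { PASSED, FLAKY, FAILED }

    // Runs the test once; on failure re-runs it up to maxReruns times.
    // A failure followed by a pass marks the test as FLAKY (a quarantine candidate);
    // consistent failure is reported as a genuine FAILED.
    static Verdict classify(BooleanSupplier test, int maxReruns) {
        if (test.getAsBoolean()) {
            return Verdict.PASSED;
        }
        for (int i = 0; i < maxReruns; i++) {
            if (test.getAsBoolean()) {
                return Verdict.FLAKY;
            }
        }
        return Verdict.FAILED;
    }

    public static void main(String[] args) {
        int[] runs = {0};
        // Simulated test that fails on its first run only.
        BooleanSupplier failsOnce = () -> ++runs[0] > 1;
        System.out.println(classify(failsOnce, 3));    // FLAKY
        System.out.println(classify(() -> false, 3));  // FAILED
        System.out.println(classify(() -> true, 3));   // PASSED
    }
}
```

A FLAKY verdict is what would feed the quarantine, so the main build stays green without hiding tests that fail consistently.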

Quarantine - Healing

SlowMo - expose races

Selenium 1

Selenium ditching

Sky did not fall in

Ditching - benefits

• Freed build agents - better system throughput

• Boosted morale

• Gazillions of developer hours saved

• Money saved on infrastructure

Ditching - due diligence

• conducting the audit - analysis of the coverage we lost

• determining which tests need to be rewritten (e.g. security related)

• rewriting the tests (good job for new hires + a senior mentor)

Flaky Browser-based Tests

Races between test code and asynchronous page logic

Playing with "loading" CSS class does not really help

Races Removal with Tracing

    // in the browser:
    function mySearchClickHandler() {
        doSomeXhr().always(function() {
            // This executes when the XHR has completed (either success or failure)
            JIRA.trace("search.completed");
        });
    }
    // In production code JIRA.trace is a no-op

    // in my page object:
    @Inject
    TraceContext traceContext;

    public SearchResults doASearch() {
        Tracer snapshot = traceContext.checkpoint();
        getSearchButton().click(); // causes mySearchClickHandler to be invoked
        // This waits until the "search.completed"
        // event has been emitted, *after* previous snapshot
        traceContext.waitFor(snapshot, "search.completed");
        return pageBinder.bind(SearchResults.class);
    }

Can we halve our build times?

Speed

Parallel Execution - Theory

[Diagram: batches running in parallel between Start of Build and End of Build]

Parallel Execution - Reality Bites

[Diagram: batches between Start of Build and End of Build, delayed by agent availability]

Dynamic Test Execution Dispatch - Hallelujah
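The difference between fixed batches and dynamic dispatch can be shown with a small simulation. The slides do not describe how the dispatch was implemented, so the model below is an assumption: with static batching the tests are split into equal-sized batches up front, while with dynamic dispatch every agent that becomes free pulls the next test from a shared queue. The class DispatchSimulation and the durations are made up.

```java
import java.util.PriorityQueue;

class DispatchSimulation {
    // Static batching: tests are split into equal-sized contiguous batches up front,
    // one batch per agent. The build ends when the slowest batch ends.
    static int staticBatches(int[] durations, int agents) {
        int batchSize = (durations.length + agents - 1) / agents;
        int longest = 0;
        for (int start = 0; start < durations.length; start += batchSize) {
            int sum = 0;
            for (int i = start; i < Math.min(start + batchSize, durations.length); i++) {
                sum += durations[i];
            }
            longest = Math.max(longest, sum);
        }
        return longest;
    }

    // Dynamic dispatch: whichever agent becomes free first takes the next test
    // from the shared queue.
    static int dynamicDispatch(int[] durations, int agents) {
        PriorityQueue<Integer> freeAt = new PriorityQueue<>();
        for (int i = 0; i < agents; i++) {
            freeAt.add(0);
        }
        int end = 0;
        for (int duration : durations) {
            int finish = freeAt.poll() + duration;
            freeAt.add(finish);
            end = Math.max(end, finish);
        }
        return end;
    }

    public static void main(String[] args) {
        int[] minutes = {9, 8, 1, 1, 1, 1}; // made-up test durations
        System.out.println("static batches:   " + staticBatches(minutes, 3) + " min");
        System.out.println("dynamic dispatch: " + dynamicDispatch(minutes, 3) + " min");
    }
}
```

With these made-up durations and three agents, the static split finishes after 17 minutes because the two slow tests land in the same batch, while dynamic dispatch finishes after 9 minutes.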

"You can't manage what you can't measure."

not by W. Edwards Deming

If you believe just in it, you are doomed.

You can't improve something if you can't measure it

Profiler, Build statistics, Logs, statsd → Graphite

Anatomy of Build*

• Compilation

• Packaging

• Executing Tests

• Fetching Dependencies

• SCM Update

• Agent Availability/Setup

• Publishing Results

*Any resemblance to maven build is entirely accidental

JIRA Unit Tests Build

Compilation (7min)

Packaging (0min)

Executing Tests (7min)

Fetching Dependencies (1.5min)

SCM Update (2min)

Agent Availability/Setup (mean 10min)

Publishing Results (1min)

Decreasing Test Execution Time to ZERO alone would not let us achieve our goal!
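The arithmetic behind that statement follows from the phase durations on the "JIRA Unit Tests Build" slide and the goal of halving the build time. A minimal sketch; the class name BuildAnatomy is hypothetical and the figures are the ones from the slide.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class BuildAnatomy {
    // Phase durations in minutes, as reported on the "JIRA Unit Tests Build" slide.
    static Map<String, Double> phases() {
        Map<String, Double> phases = new LinkedHashMap<>();
        phases.put("Compilation", 7.0);
        phases.put("Packaging", 0.0);
        phases.put("Executing Tests", 7.0);
        phases.put("Fetching Dependencies", 1.5);
        phases.put("SCM Update", 2.0);
        phases.put("Agent Availability/Setup", 10.0); // mean value
        phases.put("Publishing Results", 1.0);
        return phases;
    }

    // Sum of all phase durations.
    static double total(Map<String, Double> phases) {
        double sum = 0;
        for (double minutes : phases.values()) {
            sum += minutes;
        }
        return sum;
    }

    public static void main(String[] args) {
        Map<String, Double> phases = phases();
        double before = total(phases);
        phases.put("Executing Tests", 0.0); // hypothetical: tests take no time at all
        double withoutTests = total(phases);
        System.out.printf("total: %.1f min, target (half): %.2f min, with zero-time tests: %.1f min%n",
                before, before / 2, withoutTests);
    }
}
```

The phases add up to 28.5 minutes, so half of that is 14.25 minutes. Even with test execution at zero, the remaining phases still take 21.5 minutes, which is why the other phases had to be attacked as well.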

Agent Availability/Setup

• starved builds due to busy agents building very long builds

• time synchronization issue - NTPD problem

SCM Update - Checkout time

• Proximity of SCM repo

• shallow git clones are not so fast and lightweight + generating extra git server CPU load

• git clone per agent/plan + git pull + git clone per build (hard links!)

• Stash was thankful (queue)

2 min → 5 seconds

Fetching Dependencies

• Fix Predator

• Sandboxing/isolation agent trade-off: changed

    rm -rf $HOME/.m2/repository/com/atlassian/*

  into

    find $HOME/.m2/repository/com/atlassian/ -name "*SNAPSHOT*" | xargs rm

• Network hardware failure found (dropping packets)

1.5 min → 10 seconds

Compilation

• Restructuring multi-pom maven project and dependencies

• Maven 3 parallel compilation FTW: -T 1.5C*

*optimal factor thanks to scientific trial and error research

7 min → 1 min

Unit Test Execution

• Splitting unit tests into 2 buckets: good and legacy (much longer)

• Maven 3 parallel test execution (-T 1.5C)

7 min → 5 min

3000 poor tests (5min)

11000 good tests (1.5min)

Functional Tests

• Selenium 1 removal did help

• Faster reset/restore (avoid unnecessary stuff, intercepting SQL operations for debug purposes - building stacktraces is costly)

• Restoring via Backdoor REST API

• Using REST API for common setup/teardown operations


Publishing Results

• Server log allocation per test → using now Backdoor REST API (was Selenium)

• Bamboo DB performance degradation for rich build history - to be addressed

1 min → 40 s

Unexpected Problem

• Stability Issues with our CI server

• The bottleneck changed from I/O to CPU

• Too many agents per physical machine

JIRA Unit Tests Build Improved

Compilation (1min)

Packaging (0min)

Executing Tests (5min)

Fetching Dependencies (10sec)

SCM Update (5sec)

Agent Availability/Setup (3min)*

Publishing Results (40sec)

Improvements Summary

Tests              Before     After     Improvement %

Unit tests         29 min     17 min    41%

Functional tests   56 min     34 min    39%

WebDriver tests    39 min     21 min    46%

Overall            124 min    72 min    42%

* Additional ca. 5% improvement expected once new git clone strategy is consistently rolled-out everywhere
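The improvement column is the relative reduction in build time, (before - after) / before. A small sketch that recomputes it from the Before and After figures in the summary; the class name Improvement is hypothetical.

```java
class Improvement {
    // Relative reduction in build time, in percent.
    static double percent(double beforeMinutes, double afterMinutes) {
        return 100.0 * (beforeMinutes - afterMinutes) / beforeMinutes;
    }

    public static void main(String[] args) {
        String[] names = {"Unit tests", "Functional tests", "WebDriver tests", "Overall"};
        // {before, after} in minutes, taken from the summary.
        double[][] minutes = {{29, 17}, {56, 34}, {39, 21}, {124, 72}};
        for (int i = 0; i < names.length; i++) {
            System.out.printf("%-17s %3.0f%%%n", names[i], percent(minutes[i][0], minutes[i][1]));
        }
    }
}
```

Rounded to whole percent, the recomputed values match the reported 41%, 39%, 46% and 42%.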

Better speed increases responsibility

Fewer commits (authors) per single build

vs.

The Quality Follows

But that's still bad

We want CI feedback loop in a few minutes maximum

Splitting The Codebase

Inevitable Split - Fears

• Organizational concerns - understanding, managing, integrating, releasing

• Mindset change - if something worked for 10+ years, why change it?

• Trust - does this library still work?

• We damned ourselves with big buckets for all tests - where do they belong?

Splitting code base

• Step 0 - JIRA Importers Plugin (3.5 years ago)

• Step 1 - New Issue View and Navigator

• Step 2 - now everything else follows JIRA 6.0

We are still escaping hell. Hell sucks in your soul.

Conclusions

• Visibility and problem awareness help

• Maintaining a huge testbed is difficult and costly

• Measure the problem - to baseline

• No prejudice - no sacred cows

• Automated tests are not a one-off investment, they are a continuous journey

• Performance is a damn important feature

Test performance is a damn important feature!

XP vs Sad Reality

[Chart: Cost of Change over Time; curves: Waterfall, XP - ideal, Sad Reality]

Interested in such stuff?

http://www.spartez.com/careers

We are hiring in Gdańsk

Images - Credits

• Turtle - by Jonathan Zander, CC-BY-SA-3.0

• Loading - by MatthewJ13, CC-SA-3.0

• Magic Potion - by Koolmann1, CC-BY-SA-2.0

• Merlin Tool - by L. Mahin, CC-BY-SA-3.0

• Choose Pills - by *rockysprings, CC-BY-SA-3.0

• Flashing Red Light - by Chris Phan, CC BY 2.0

• Frustration - http://www.flickr.com/photos/striatic

• Broken window - http://www.flickr.com/photos/leeadlaf/

Thank You!