GTAC 2014: What lurks in test suites?

We all want "better" test suites. But what makes for a good test suite? Certainly, test suites ought to aim for good coverage, at least at the statement coverage level. To be useful, test suites should run quickly enough to provide timely feedback. This talk will investigate a number of other dimensions on which to evaluate test suites. The talk claims that better test suites are more maintainable, more usable (for instance, because they run faster, or use fewer resources), and have fewer unjustified failures. In this talk, I'll present and synthesize facts about 10 open-source test suites (from 8,000 to 246,000 lines of code) and evaluate how they are doing.

Transcript of GTAC 2014: What lurks in test suites?

  • 1. Beyond Coverage: What Lurks in Test Suites?
    Patrick Lam, @uWaterlooSE (and Felix Fang)
    University of Waterloo

  • 2. Test Suites: Myths vs. Realities
  • 3. Subjects: Open-Source Test Suites
  • 4. Basic Test Suite Properties
    Benchmark sizes: 30 kLOC (google-visualization) to 495 kLOC (weka)
    % of system represented by tests: 5.3% (weka) to 50.4% (joda-time)
  • 5. Static Test Suite Properties
  • 6. Test suite versus benchmark size
    slopes: m = 0.3002, m = 0.03514
  • 7. # test cases versus # test methods
  • 8. apache-commons-collections tests
    Consider map.TestFlat3Map: it contains 14 test methods, yet 156 test cases
    (superclass tests: 42 tests + 4 Apache Commons Collections bulk tests)
  • 9. Run-time Test Suite Properties
  • 10. Test suites run quickly
    joda-time 4.9 s, jdom 5.0 s, google-vis 5.1 s, jgrapht 16.9 s, weka 28.9 s,
    apache-cc 34.0 s, poi 36.5 s, jmeter 53.0 s, jfreechart 241.0 s
  • 11. Failing tests
    (per-suite failure counts; e.g. 76/3840 and 103/1109 failing, several suites with 0, one n/a)
  • 12. Continuous Integration: Daily Builds
  • 13. Continuous Integration: Daily Tests (via SonarQube, Travis CI, Surefire)
  • 14. Myth #1: Coverage is a key property of test suites.
  • 15. Coverage is central in textbooks
    (Ammann and Offutt, Introduction to Software Testing)
  • 16. Coverage metrics from EclEmma
  • 17. Coverage metrics
  • 18. Reality #1: Coverage is sometimes important, but tools only give limited data.
  • 19. Guideline #1: Consider metrics beyond reported coverage results:
    weka uses peer review for QA; input space coverage is not measured by tools
  • 20. Myth #2: Tests are simple. (test complexity, test dependencies)
  • 21. Static Code Complexity
  • 22. Test methods with at least 5 asserts, e.g. from Joda-Time:

        public void testEquality() {
            assertSame(getInstance(TOKYO), getInstance(TOKYO));
            assertSame(getInstance(LONDON), getInstance(LONDON));
            assertSame(getInstance(PARIS), getInstance(PARIS));
            assertSame(getInstanceUTC(), getInstanceUTC());
            assertSame(getInstance(), getInstance(LONDON));
        }

  • 23. % Test methods with 5 asserts
  • 24. Test Methods with Branches, from apache-cc:

        if (isAllowNullKey() == false) {
            try {
                assertEquals(null, o.nextKey(null));
            } catch (NullPointerException ex) {}
        } else {
            assertEquals(null, o.nextKey(null));
        }

  • 25.
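The apache-cc test on slide 24 branches on the map's configuration inside the test body. A common refactoring is to split it into two branch-free tests, one per configuration. A minimal sketch, assuming a hypothetical `SimpleMap` stand-in (not an actual Apache Commons Collections class):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a map whose null-key policy is configurable.
class SimpleMap {
    private final boolean allowNullKey;
    private final Map<Object, Object> data = new HashMap<>();
    SimpleMap(boolean allowNullKey) { this.allowNullKey = allowNullKey; }
    Object get(Object key) {
        if (key == null && !allowNullKey)
            throw new NullPointerException("null keys not allowed");
        return data.get(key);
    }
}

public class NullKeyTests {
    // Branch-free test for the configuration that allows null keys.
    static void testGetNullKeyAllowed() {
        SimpleMap m = new SimpleMap(true);
        if (m.get(null) != null) throw new AssertionError("expected null");
    }
    // Branch-free test for the configuration that forbids null keys.
    static void testGetNullKeyForbidden() {
        SimpleMap m = new SimpleMap(false);
        try {
            m.get(null);
            throw new AssertionError("expected NullPointerException");
        } catch (NullPointerException expected) {
            // pass: the map rejected the null key as required
        }
    }
    public static void main(String[] args) {
        testGetNullKeyAllowed();
        testGetNullKeyForbidden();
        System.out.println("ok");
    }
}
```

Each test now exercises exactly one configuration, so a failure points directly at the configuration that broke.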
    Test Methods with Loops, from jgrapht:

        counter = 0;
        while (this.complexPerm.hasNext()) {
            this.complexPerm.getNext();
            counter++;
        }
        assertEquals(maxPermNum, counter);

  • 26. % Test Methods with Control-Flow
  • 27. Tests Which Use the Filesystem
  • 28. Filesystem Usage Details
    new File(tempDir, "tzdata"); verifies against canonical forms of serialized collections on disk
  • 29. More Filesystem Usage Details
    resources, serialization; creates charts and tests their existence; some comparisons against test data
  • 30. Tests Which Use the Network*
  • 31. Network Usage Details
    connects to http://sc.openoffice.org; tests HTTP mirror server at localhost
  • 32. Flip side: Mocks and Stubs
    True mocks only in Google Visualization.
  • 33. Flip side: Mocks and Stubs
    True mocks only in Google Visualization. Found stubs/fakes in 4 other suites.
  • 34. Reality #2: Test cases are mostly simple.
    Few asserts, little branching, some filesystem/network usage.
  • 35. Consequence #2: Many tests don't need high expertise to write, but some do!
  • 36. Myth #3: Test cases are written by hand.
  • 37. Types of reuse (standard Java):
    1. test class setUp()/tearDown()
    2. inheritance: e.g. in apache-cc, TestFastHashMap extends AbstractTestMap
    3. composition: e.g. in jfreechart, helper class RendererChangeDetector
  • 38. JUnit setUp/tearDown usage
  • 39. Inheritance is heavily used (> 50% of test classes inherit functionality)
  • 40. Test Classes with Custom Superclasses
  • 41. Helper Classes Example, from poi:

        /** Test utility class to get Records
         *  out of HSSF objects. */
        public final class RecordInspector {
            public static Record[] getRecords(...) {}
        }

  • 42. Helper Class Count
    weka 1, google-vis 3, jdom 6, joda-time 7, jfreechart 7, jmeter 12,
    jgrapht 15, apache-cc 22, hsqldb 31, poi 54
  • 43.
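The inheritance-based reuse described on slides 37-40 (e.g. TestFastHashMap extending AbstractTestMap) boils down to an abstract test class with a factory method: each subclass supplies the object under test and inherits every test. A minimal sketch with hypothetical class names, not the actual apache-cc hierarchy:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Abstract superclass: tests are written once against the factory method.
abstract class AbstractMapTest {
    // Each concrete subclass supplies the map implementation under test.
    abstract Map<String, Integer> makeEmptyMap();

    // Inherited test: runs against every subclass's map.
    void testPutThenGet() {
        Map<String, Integer> m = makeEmptyMap();
        m.put("k", 1);
        if (!Integer.valueOf(1).equals(m.get("k")))
            throw new AssertionError("put/get round trip failed");
    }
}

class HashMapTest extends AbstractMapTest {
    Map<String, Integer> makeEmptyMap() { return new HashMap<>(); }
}

class TreeMapTest extends AbstractMapTest {
    Map<String, Integer> makeEmptyMap() { return new TreeMap<>(); }
}

public class InheritanceDemo {
    public static void main(String[] args) {
        new HashMapTest().testPutThenGet();
        new TreeMapTest().testPutThenGet();
        System.out.println("2 maps tested");
    }
}
```

This is also how one test method can expand into many test cases, as in the TestFlat3Map example: every subclass re-runs the whole inherited suite.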
    Test Clone Example:

        public void testNominalFiltering() {
            m_Filter = getFilter(Attribute.NOMINAL);
            Instances r = useFilter();
            for (int i = 0; i < r.numAttributes(); i++)
                assertTrue(r.attribute(i).type() != Attribute.NOMINAL);
        }

        public void testStringFiltering() {
            m_Filter = getFilter(Attribute.STRING);
            Instances r = useFilter();
            for (int i = 0; i < r.numAttributes(); i++)
                assertTrue(r.attribute(i).type() != Attribute.STRING);
        }

  • 44. Assertion Fingerprints: detect clones by identifying similar tests
  • 45. Incidence of cloning
  • 46. How to Refactor?
    setUp/tearDown/subclassing; JUnit 4: Parameterized Unit Tests; Test Theories
  • 47. apache-cc: Bulk tests

        public BulkTest bulkTestKeySet() {
            return new TestSet(makeFullMap().keySet());
        }

    runs all tests in the TestSet class with the object returned from makeFullMap().keySet()
  • 48. jdom: Generated Test Case Stubs
    class ClassGenerator makes e.g.:

        class TestDocument {
            void test_TCC__List();
            void test_TCM__int_hashCode();
        }

    Developer still needs to populate the tests.
  • 49. Automated Testing Technology
    In our test suites, the principal automation technology was cut-and-paste.
  • 50. Reality #3: Automated test generation is uncommon in our test suites.
  • 51. Guideline: Maximize reuse: setUp/tearDown, inheritance, parameterized tests, whatever works for you!
  • 52. Suggestion: Use automated test generation tools! Some examples:
    Korat (structurally complex tests), Randoop (random testing), CERT Basic Fuzzing Framework
  • 53. Summary. Myths:
    1. Coverage is a key property of test suites.
    2. Tests are simple.
    3. Tests are written by hand.
  • 54. Data
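The cloned weka tests on slide 43 differ only in the attribute type being filtered; the parameterized-test refactoring suggested on slide 46 collapses such clones into one body run once per parameter. JUnit 4 does this declaratively with @RunWith(Parameterized.class); this dependency-free sketch uses a plain loop and a hypothetical filter in place of weka's:

```java
import java.util.ArrayList;
import java.util.List;

public class ParameterizedDemo {
    enum AttrType { NOMINAL, STRING, NUMERIC }

    // Hypothetical stand-in for weka's filtering: drop all attributes of one type.
    static List<AttrType> filterOut(List<AttrType> attrs, AttrType t) {
        List<AttrType> out = new ArrayList<>();
        for (AttrType a : attrs) if (a != t) out.add(a);
        return out;
    }

    // One test body; the filtered type is the parameter.
    static void testFiltering(AttrType filtered) {
        List<AttrType> input =
            List.of(AttrType.NOMINAL, AttrType.STRING, AttrType.NUMERIC);
        for (AttrType a : filterOut(input, filtered)) {
            if (a == filtered)
                throw new AssertionError(filtered + " survived filtering");
        }
    }

    public static void main(String[] args) {
        // The test runner's job: invoke the single body once per parameter.
        for (AttrType t : AttrType.values()) testFiltering(t);
        System.out.println("all parameters passed");
    }
}
```

With a parameterized runner, adding a new attribute type means adding one parameter, not pasting another near-identical test method.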