
Are You Sure What Failures Your Tests Produce?

Lee White

Results on Testing GUI Systems

• CIS (Complete Interaction Sequences) approach for testing GUI systems: Applied to four large commercial GUI systems

• Testing a GUI system in different environments: operating system, CPU speed, memory

• Modified CIS approach applied to regression test two versions of a large commercial GUI system

Three Objectives for this Talk

• The use of memory tools during GUI testing uncovered many more defects, revealing observability problems

• In GUI systems, defects manifested themselves as different failures (or not at all) in different environments

• In GUI systems, many more behaviors reside in the code than the designer intended.

Complete Interaction Sequence (CIS)

• Identify all responsibilities (GUI activity that produces an observable effect on the surrounding user environment).

• CIS: Operations on a sequence of GUI objects that collectively implement a responsibility.

• Example (assuming a file is already open): File_Menu -> Print -> Print_Setup_Selection -> Confirm_Print; a sketch of driving such a sequence follows

FSM for a CIS (Finite State Model)

• Design a FSM to model a CIS

• Requires experience to create FSM model

• To test for all effects in a GUI, all paths within the CIS must be executed

• Loops may be repeated, but not consecutively (the same loop may not be taken twice in a row); a path-enumeration sketch follows

Figure 1 Edit-Cut-Copy-Paste CIS FSM (figure omitted; its states include Init, File, Open, NameFile, SelectFile, Highlight, Edit, Cut/Copy, Ready-Edit, FinishMove, Cursor2File, Ready, SelectFile2, NameFile2, Open2, Paste, and FileMoveCursor)

How to Test a CIS?

• Design Tests: FSM model based upon the design of the CIS is used to generate tests.

• Implementation Tests: In the actual GUI, check all CIS object selections and find every transition to another GUI object within the CIS; add these transitions, along with any new inputs or outputs to/from the CIS, to the FSM model and generate tests from the augmented model (a sketch of this augmentation follows Figure 3).

Figure 2 Design Tests for a Strongly Connected Component (figure omitted; inputs I1, I2; states A, B, C, D; output O1)

Design test sequences: [(I1,B,C,D,A,B,C,O1), (I2,A,B,C,D,A,B,C,O1)]

Figure 3 Implementation Tests for a Strongly Connected Component (figure omitted; inputs I1, I2, I3; states A, B, C, D; outputs O1, O2)

Implementation test sequences: [(I1,B,C,D,B,C,D,A,B,C,D,A*,B,C,O1), (I1,B,C,D,B,C,D,A,B,C,D,A*,B,C,D,O2), (I2,A,B,C,D,B,C,D,A,B,C,D,A*,B,C,O1), (I2,A,B,C,D,B,C,D,A,B,C,D,A*,B,C,D,O2), (I3,D,A,B,C,D,B,C,D,A*,B,C,O1), (I3,D,A,B,C,D,B,C,D,A*,B,C,D,O2)]

1) Real Networks Suite: RealJukeBox, Team A (3); RealDownload, Team B (4); RealPlayer, Team B (4)

2) Adobe Suite: PhotoDeluxe, Team B (4); EasyPhoto, Team A (3); Acrobat Reader, Team A (3)

3) Inter: WinDVD, Team C (3)

4) Multi-Media DB: Team D (4) (GVisual, VStore, AdminSrvr, ObjectBrowser)

GUI System              GUI Objects   Design Tests   Design Faults   Impl. Tests   Impl. Faults
Real Networks           443           84             9               242           19
Adobe PS, Acrobat R.    507           223            2               612           10
Inter WinDVD            112           56             0               154           3
Multi-Media DB          294           98             0               241           9

Table 1 Case Study of 4 Systems

Memory Tools

• Memory tools monitor memory changes, CPU changes and register changes

• These tools detected failures that would otherwise have eluded detection, accounting for 34% of the faults found in these empirical studies

• Two such tools were used: Memory Doctor and Win Gauge from Hurricane Systems; a sketch of this kind of memory check follows

Table 2 Hidden Faults Detected by Memory Tools

GUI System              Hidden Faults   All Faults   Percent
Real Networks           7               19           37%
Adobe PS, Acrobat Rd.   4               10           40%
Inter WinDVD            1               3            33%
Multi-Media DB          2               9            22%
Total                   14              41           34%

Failures of GUI Tests on Different Platforms

Lee White and Baowei Fei

EECS Department

Case Western Reserve University

Environment Effects Studied

• Environment Effects: Operating System, CPU Speed, Memory Changes

• Same software tested: RealOne Player

• 950 implementation tests

• For the OS comparison, the same computer was used, running Windows 98 and Windows 2000

Table 3. Faults detected by implementation tests for different operating systems

Operating System   Surprises   Defects   Faults
Windows 98         96          35        131
Windows 2000       37          24        61

Table 4. Faults detected by implementation tests for different CPU speeds

PC     Surprises   Defects   Faults
PC1    31          19        50
PC2    34          19        53
PC3    37          24        61

Table 5. Faults detected by implementation tests for different memory sizes

Memory Size    Surprises   Defects   Faults
PC3 (256 MB)   96          35        131
PC3 (192 MB)   99          36        135
PC3 (128 MB)   101         38        139

Regression Testing GUI Systems

A Case Study to Show the Operations of the GUI Firewall for Regression Testing

GUI Features

• Feature: A set of closely related CISs with related responsibilities

• New Features: Features in a new version not in previous versions

• Totally Modified Features: Features so drastically changed in a new version that the change cannot be modeled as an incremental change; the simple firewall cannot be used (a sketch of this feature classification follows)

Software Under Test

• Two versions of Real Player (RP) and RealJukeBox (RJB): RP7/RJB1, RP8/RJB2

• 13 features; RP7: 208 obj, 67 CIS, 67 des. tests, 137 impl. tests; RJB1: 117 obj, 30 CIS, 31 des. tests, 79 impl. tests

• 16 features; RP8: 246 obj, 80 CIS, 92 des. tests, 176 impl. tests; RJB2: 182 obj, 66 CIS, 127 des. tests, 310 impl. tests.

Figure 4 Distribution of Faults Obtained by Testers T1 and T2 (figure omitted; labels include RP7/RJB1, tested by T1, 53 faults in the original system; RP8/RJB2, 16 features, 59 faults, tested from scratch by T2; the firewall; 8 features with 17 and 21 faults; 5 totally modified features; 3 new features; 0 faults)

Failures Identified in Version1, Version2

• We could identify identical failures in Version1 and Version2.

• This left 9 failures in Version 2 and 7 failures in Version 1 unmatched.

• The challenge here was to show which pair of failures might be due to the same fault.

Different Failures in Versions V1, V2 for the Same Fault

• V1: View track in RJB freezes if an album cover is included

• V2: View track in RJB loses the album cover

• Env. Problem: The graphical settings from V2 were needed to test V1

Different Failures (cont)

• V1: Add/Remove channels in RP does not work when RJB is also running

• V2: Add/Remove channels lose previous items

• Env. Problem: A personal browser is used in V1, but V2 uses a special RJB browser

Different Failures (cont)

• V1: No failure present

• V2: In RP, pressing Forward before the stream file starts playing crashes the system

• Env. Problem: The Forward button can only be pressed during play in V1, but in V2 it can be selected at any time; regression testing now finds this fault

Conclusions for Issue #1

• The use of memory tools illustrated extensive observability problems in testing GUI systems:

• In testing four commercial GUI systems, 34% of the faults would have been missed without these tools.

• In the regression testing study, 85% and 90% of the faults would have been missed.

• Implication: GUI testing can miss defects or surprises (or produce minor failures).

Conclusions for Issue #2

• Defects manifested as different failures (or not at all) in different environments:

• Discussed in regression testing study

• Also observed in testing case studies, as well as for testing in different HW/SW environments.

Implication for Issue #2

• When testing, you think you understand what failures will occur for certain tests and defects in the same software, but you don't know what failures (if any) will be seen by the user in another environment.

Conclusions for Issue #3

• The differences between design and implementation tests are due to non-design transitions in the actual FSMs for each GUI CIS:

• Observed in both case studies

• Implication: Faults are commonly associated with these unknown FSM transitions, and are not due to the design.

Question for the Audience

• Are these same three effects valid to this extent for software other than just GUI systems?

• If so, why haven't we seen many reports and papers in the software literature documenting this?