Transcript of “The Future of Software Testing”: Probabilistic Stochastic Test Data · 2019-07-25
Probabilistic Stochastic Test Data
Bj Rollison, Microsoft, USA
Europe’s Premier Software Testing Event · World Forum Convention Centre, The Hague, Netherlands
WWW.QUALTECHCONFERENCES.COM
“The Future of Software Testing”
Bj Rollison, Test Architect
Microsoft
http://www.TestingMentor.com
http://blogs.msdn.com/imtesty
Customer provided data
- Domain expertise
- Generally very limited in scope
Tester generated data
- Happy path, probabilistic data
- Input population poorly defined, human bias
- Random data not representative of population
Static data files
- Library of historical failure indicators
- Too restrictive
- Ineffective with multiple iterations
Large number of variables
Variable sequences can result in a virtually infinite number of combinations
Impractical to test all values and combinations of values in any reasonable testing cycle
Example:
NetBIOS name: 15 alphanumeric characters
Using ASCII-only characters, 82 allowable characters (0x20 \ * + = | : ; “ ? < > , are invalid)
Total number of possible input tests equals 82^15 + 82^14 + 82^13 + … + 82^1 = 51,586,566,049,662,994,687,009,994,574
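That total is easy to verify with a short script (a sketch; any language with arbitrary-precision integers works):

```python
# The input space for a 1- to 15-character name over 82 allowable
# characters is a geometric series: 82^15 + 82^14 + ... + 82^1.
total = sum(82 ** k for k in range(1, 16))
print(f"{total:,}")  # 51,586,566,049,662,994,687,009,994,574
```

Exhaustively testing a space that size is clearly off the table, which motivates sampling instead.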
It does not “look” like real world test data.
Years ago developers would argue that a name textbox couldn’t contain a number!
To a computer, what is the difference between the strings Margaret and ksjCu9ls?
Random data is not reproducible.
A seeded random generator will produce the same exact result given the same seed value
Random data violates constraints of real data
Representative data from population
Deterministic algorithms
Sampling is commonly used in risk based testing
Samples must be representative
Samples must be statistically unbiased
Samples set must include variability for breadth
Random data generation provides variability, but
Simple random data may not be representative
Simple random data hard to reproduce
Goal – generate random data that is
Representative of the input data set
Statistically unbiased - random sample of elements from a probability distribution
Value – input test data that
Provides greater variability
Includes expected and unexpected sequences
Eliminates human bias
Is better at evaluating robustness
Is dynamic!
System.Security.Cryptography.RandomNumberGenerator class
Encrypted data indistinguishable from random
Cannot be seeded; no repeatability
System.Random class
Sequence of numbers that meet certain statistical requirements for randomness
Can be seeded for repeatability
Not perfect, but reasonably random for practical purposes
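The repeatability property can be demonstrated with any seedable PRNG; here is a sketch using Python’s random module standing in for .NET’s System.Random:

```python
import random

seed = 12345  # arbitrary example seed

# Two generators created with the same seed...
run1 = random.Random(seed)
run2 = random.Random(seed)

# ...produce exactly the same sequence, so a failing test run
# can be replayed just by logging and reusing its seed value.
assert [run1.randint(0, 99) for _ in range(5)] == \
       [run2.randint(0, 99) for _ in range(5)]
```

This is why a seedable generator, not a cryptographic one, is the right tool for reproducible test data.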
[Chart: comparison between the RandomNumberGenerator class (red) and the Random class (blue). Both are pseudo-random; no obvious pattern. Based on a sample by Jeff Atwood, http://www.codinghorror.com]
User defined seed
- Tester provides seed value for repeatability
Dynamic seed
- New seed value generated at runtime
- Seed variable must be preserved in test log
public static int GetSeedValue(string seedValue)
{
    int seed = 0;
    if (seedValue != string.Empty)
    {
        // Tester-supplied seed for repeatability
        seed = int.Parse(seedValue);
    }
    else
    {
        // No seed given: generate a dynamic seed at runtime
        // (preserve it in the test log so the run can be reproduced)
        Random r = new Random();
        seed = r.Next();
    }
    return seed;
}
Define the representative data set
Example – Credit card numbers
341846580149320
Bank Identification Number (BIN) – between 1 and 4 digits, depending on card type
Card length (BIN + digits) – between 14 and 19, depending on card type
Checksum – Luhn (Mod 10) algorithm
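The Luhn (Mod 10) check itself is only a few lines; a sketch in Python (the talk’s tooling is .NET, but the algorithm is language-neutral):

```python
def luhn_valid(number: str) -> bool:
    """Luhn (Mod 10): double every second digit from the right,
    subtract 9 from any double greater than 9, then the total
    must be divisible by 10."""
    total = 0
    for i, digit in enumerate(int(ch) for ch in reversed(number)):
        if i % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0

print(luhn_valid("341846580149320"))  # True: the American Express example above
```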
Equivalence class partitioning decomposes data into discrete valid and invalid class subsets
Card type          Valid class subsets         Invalid class subsets
American Express   BIN – 34, 37                Unassigned BINs
                   Length – 15 digits          Length >= 16 digits
                   Checksum – Mod 10           Length <= 14 digits
                                               Fail checksum
Maestro            BIN – 5020, 5038,           Unassigned BINs
                     6034, 6759                Length <= 15 digits
                   Length – 16, 18             Length >= 19 digits
                   Checksum – Mod 10           Length == 17 digits
                                               Fail checksum
[Diagram: credit card number generator. Inputs – card type and an optional seed. A seed generator feeds a random number generator; the “get credit card info” step looks up the valid BIN number(s) & length and card length(s) by type; the Luhn algorithm checks “is valid”. Output – card number.]
Assigned BINs ensures the data looks real
The Mod10 check ensures the data feels real
Result is representative of real data!
Deterministic algorithm to generate a valid random credit card number:

GetCardNumber(int cardType, int seed)
    Get BIN(cardType, seed);
    Get CardLength(cardType, seed);
    Assign BIN to cardNumber;
    Generate a new random object;
    while (cardNumberLength < CardLength)
        Generate a random number 0 to 9;
        Append it to the cardNumber;
    while (IsNotValidCardNumber(cardNumber))
        increment last number by 1;
    return cardNumber;
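A runnable sketch of this algorithm, in Python rather than the talk’s .NET: the CARD_TYPES table and the function names are illustrative, with the BINs and lengths taken from the slides.

```python
import random

# Illustrative card-type table; BINs and lengths are from the slides.
CARD_TYPES = {
    "amex":    {"bins": ["34", "37"], "lengths": [15]},
    "maestro": {"bins": ["5020", "5038", "6034", "6759"], "lengths": [16, 18]},
}

def luhn_valid(number: str) -> bool:
    """Luhn (Mod 10) checksum over a digit string."""
    total = 0
    for i, ch in enumerate(reversed(number)):
        d = int(ch)
        if i % 2 == 1:                 # double every second digit from the right
            d = d * 2 - 9 if d > 4 else d * 2
        total += d
    return total % 10 == 0

def get_card_number(card_type: str, seed: int) -> str:
    """Deterministically generate a valid random card number from a seed."""
    rng = random.Random(seed)                    # seeded for repeatability
    info = CARD_TYPES[card_type]
    bin_ = rng.choice(info["bins"])              # pick a BIN for the type
    length = rng.choice(info["lengths"])         # pick a length for the type
    number = bin_ + "".join(str(rng.randint(0, 9))
                            for _ in range(length - len(bin_)))
    # Fix up the checksum: bump the last digit (mod 10) until Luhn passes;
    # exactly one of its ten values satisfies Mod 10, so this terminates.
    while not luhn_valid(number):
        number = number[:-1] + str((int(number[-1]) + 1) % 10)
    return number
```

Because the generator is seeded, logging the seed is enough to reproduce any failing card number.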
1. Model test data – decompose the data set for each parameter using equivalence class partitioning
2. Generate test data – generate valid and invalid test data adhering to parameter properties, business rules, and the test hypothesis
3. Apply test data – apply the test data to the application under test
4. Verify results – verify the actual results against the expected results – the oracle!
JCB Type 1: BIN = 35, Len = 16
JCB Type 2: BIN = 1800, 2131; Len = 15
Robust testing
Multi-language input testing
- String length: fixed or variable
- Seed value
- Custom range for greater control
- Unicode language families
- Assigned code points
- Reserved characters
- Unicode surrogate pairs
1000 Unicode characters from the sample population
Character corruption and data loss: 135 characters (bytes), obvious data loss
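A seeded multi-language string generator along these lines might look as follows in Python; the code-point ranges and names here are illustrative, and the surrogate check matters because 0xD800–0xDFFF is reserved and cannot stand alone as characters:

```python
import random

def random_unicode_string(rng, length, ranges):
    """Build a string of `length` code points drawn from the given
    (lo, hi) ranges, skipping reserved surrogate code points."""
    chars = []
    while len(chars) < length:
        lo, hi = rng.choice(ranges)
        cp = rng.randint(lo, hi)
        if 0xD800 <= cp <= 0xDFFF:   # reserved surrogate range
            continue
        chars.append(chr(cp))
    return "".join(chars)

# Illustrative Unicode language families: Basic Latin, Cyrillic, CJK.
RANGES = [(0x0020, 0x007E), (0x0400, 0x04FF), (0x4E00, 0x9FFF)]

s = random_unicode_string(random.Random(42), 15, RANGES)
assert len(s) == 15  # same seed always yields the same 15 characters
```

Feeding such strings through a round trip (save, reload, compare) makes character corruption and data loss immediately visible.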
Static test data wears out!
Random test data that is not repeatable or not representative may find defects, but…
Probabilistic stochastic test data
Is a modeled representation of the population
Is statistically unbiased
Is especially good at testing robustness
Recommend using both static (real-world) test data and probabilistic stochastic test data for breadth
Practice .NET Testing with IR Data – Bj Rollison – http://www.stpmag.com/issues/stp-2007-06.pdf
Automatic test data generation for path testing using a new stochastic algorithm – Bruno T. de Abreu, Eliane Martins, Fabiano L. de Sousa – http://www.sbbd-sbes2005.ufu.br/arquivos/16-%209523.pdf
Data Generation Techniques for Automated Software Robustness Testing – Matthew Schmid & Frank Hill – http://www.cigital.com/papers/download/ictcsfinal.pdf
Tools: http://www.TestingMentor.com