Transcript of "Feedback-directed Random Test Generation"

Authors: Carlos Pacheco, Shuvendu K. Lahiri, Michael D. Ernst, Thomas Ball (MIT CSAIL and Microsoft Research)

Page 1:

Feedback-directed Random Test Generation

Carlos Pacheco, Shuvendu K. Lahiri, Michael D. Ernst, Thomas Ball

Page 2:

• Introduction

• Technique

• Evaluation

• Conclusion

Table of Contents

Page 3:

• What is feedback-directed random test generation?

Improve random test generation by incorporating feedback obtained from previously-constructed inputs.

Feedback is the key feature: it greatly improves the effectiveness of random test generation by pruning illegal and redundant sequences as they are created.

• Input and Output:

Input: the code to be tested.

Output: a suite of test cases.

Introduction

Page 4:

• Competing approaches

Systematic testing

Undirected random test generation

• Direction

Our work addresses random generation of unit tests for object-oriented programs, such as Java programs.

Introduction

Page 5:

• Randoop

We have implemented the technique in RANDOOP, which is fully automatic, requires no input from the user, and scales to realistic applications with hundreds of classes.

• Target libraries

JDK

.NET

Introduction

Page 6:

• Example to run Randoop

After downloading randoop.jar, we can run the following command:
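A hedged example invocation (the slide's exact commands did not survive the transcript; flag names vary across Randoop versions, and java.util.TreeSet is just an illustrative target class):

    java -classpath randoop.jar randoop.main.Main gentests \
        --testclass=java.util.TreeSet --timelimit=60

This asks Randoop to spend 60 seconds generating unit tests for java.util.TreeSet.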

After the command runs successfully, the working directory contains the generated files RandoopTest.java and RandoopTest0.java.

Running these generated tests then reports the testing results.

Introduction

Page 7:

• A test case generated by RANDOOP:
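The slide's code did not survive the transcript. Reconstructed from the paper's opening example (details may differ slightly), a RANDOOP-generated test looks like this; it exposes a JDK bug where a set fails to be equal to itself:

    import java.util.*;
    import junit.framework.Assert;

    public class RandoopTest0 {
      public static void test1() throws Throwable {
        LinkedList l1 = new LinkedList();
        Object o1 = new Object();
        l1.addFirst(o1);
        TreeSet t1 = new TreeSet(l1);
        Set s1 = Collections.unmodifiableSet(t1);
        // Fails: s1.equals(s1) does not return true, violating
        // the reflexivity contract of Object.equals.
        Assert.assertTrue(s1.equals(s1));
      }
    }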

Introduction

Page 8:

Steps (a sketch in code follows the list):

• Randomly select methods to call

• Check the resulting sequences against the contracts

• Classify each sequence after checking

• Output the error-revealing sequences as test cases
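A minimal sketch of this loop in Java-like pseudocode (names such as extend, execute, and isExtensible are illustrative, not Randoop's actual internal API):

    // Feedback-directed generation loop (illustrative sketch).
    Set<Sequence> components = seedSequences();   // e.g., primitive values
    List<Sequence> errorSeqs = new ArrayList<>();
    while (!timeLimitReached()) {
      Method m = chooseRandomPublicMethod();
      // Extend previously built sequences with a call to m,
      // drawing m's arguments from values those sequences produce.
      Sequence s = extend(m, chooseArgumentsFrom(components));
      if (components.contains(s)) continue;       // filter: duplicate
      ExecutionResult r = execute(s);             // the feedback step
      if (violatesContract(r)) {
        errorSeqs.add(s);                         // error-revealing: report
      } else if (isExtensible(r)) {
        components.add(s);                        // reuse in future sequences
      }                                           // otherwise discard
    }
    emitJUnitTests(errorSeqs);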

Technique

Page 9:

How does feedback-directed random test generation work?

• First, we should know what materials it uses.

• Second, we can analyze its process.

Technique

Page 10:

Technique

Page 11:

Technique

Page 12:

Contracts and filters:

• Contracts and filters play an important role in this technique, because they help us validate whether every generated sequence is correct.

• After a sequence has been executed and checked against the contracts, filters set the S.i.extensible flag to true or false, which determines whether the sequence may be extended into new ones. A sketch of one contract check follows.
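For instance, one default contract is that equals must be reflexive. A minimal sketch of such a check (my own illustration, not Randoop's source):

    // Reflexivity contract: for any object o created by a sequence,
    // o.equals(o) must return true and must not throw.
    static boolean checkEqualsReflexive(Object o) {
      if (o == null) return true;   // contract applies only to created objects
      try {
        return o.equals(o);
      } catch (Throwable t) {
        return false;               // throwing also violates the contract
      }
    }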

Technique

Page 13:

Default contracts checked by RANDOOP. Users can extend these with additional contracts, including domain-specific ones.
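The table itself did not survive the transcript. Reconstructed from the paper, the defaults are roughly: a method may not throw a NullPointerException when no null inputs were passed, and may not throw an AssertionError; and every created object o must satisfy o.equals(o) == true, with equals, hashCode, and toString throwing no exception (plus analogous contracts for .NET).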

Technique

Page 14:

Repetition:

• A good test case sometimes needs to call a given method multiple times.

• In our work, when generating a new sequence there are two important parameters we can set, N and M: N is the probability that a chosen method call is repeated, and M is the maximum number of repetitions. A sketch follows.
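A sketch of how such repetition could be implemented, with N and M as described above (helper names are illustrative):

    // With probability N, append the chosen call between 1 and M times;
    // otherwise append it once.
    int times = (rand.nextDouble() < N) ? 1 + rand.nextInt(M) : 1;
    for (int i = 0; i < times; i++) {
      sequence.appendCall(chosenMethod, arguments);
    }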

Technique

Page 15:

Three steps:

• Evaluate the coverage that RANDOOP achieves on a collection of container data structures.

• Use RANDOOP-generated test cases to find API contract violations in widely used libraries, such as the JDK and .NET.

• Use RANDOOP-generated regression test cases to find regression errors.

Evaluation

Page 16:

Step 1:

• Test on four container classes: a binary tree, a binomial heap, a Fibonacci heap, and a red-black tree.

• Use four techniques: systematic testing in JPF, RANDOOP, undirected random testing implemented in JPF, and undirected random testing implemented in RANDOOP.

Evaluation

Page 17:

For each data structure, we performed the following steps:

1. We ran Visser et al.'s systematic tests on the containers using their optimal parameters (JPF).

2. We ran RANDOOP on the containers (RP).

3. To compare against unguided random generation, we also reproduced Visser et al.'s random-generation results (JPFu).

4. We ran RANDOOP a second time, turning off all filters (RPu).

Evaluation

Page 18:

Evaluation

Page 19:

Results:

• RANDOOP's coverage is equal to or greater than that of the other techniques, and its running time is much lower.

• Our results suggest that further experiments are required on how systematic and random techniques compare in detecting errors in data structures.

Evaluation

Page 20:

Step 2: Checking API contracts

• Test subjects: the Java JDK and the .NET Framework.

• We use three techniques, feedback-directed random generation, undirected random generation, and systematic generation, to create test suites for the widely used libraries mentioned above.

Evaluation

Page 21:

First, for each library, we applied RANDOOP using the following steps:

1. We ran RANDOOP on the library, specifying all of its public classes as targets for testing. This produces a test suite.

2. We compiled the test suite and ran it with REDUCE.

3. We manually inspected the failing test cases reported by REDUCE.

Evaluation

Page 22:

Evaluation

Page 23:

Second, we used systematic testing to test the libraries, as follows:

• Write a test driver that checks the same contracts as RANDOOP (because systematic testing cannot generate a test suite the way RANDOOP does).

• Run the driver with JPF.

• Report the results.

Evaluation

Page 24:

Comparison with RANDOOP:

• For all the libraries, JPF ran out of memory (after 32 seconds on average) without reporting any errors.

• RANDOOP explored the space more effectively: it sampled the state space sparsely but broadly, while JPF thoroughly sampled only a tiny, localized portion of the space.

Evaluation

Page 25:

Third, we performed undirected random testing on each library:

• We reran RANDOOP, this time with the filters turned off.

• Results:

Violation-inducing test cases: 1,326.

Test cases reported by REDUCE: 60.

None of these revealed an actual error.

Evaluation

Page 26:

Regression and compliance testing:

• This section describes a case study in which we used feedback-directed random testing to find inconsistencies between different implementations of the same API.

• We tested three implementations: Sun JDK 1.5, Sun JDK 1.6, and IBM JDK 1.5. An illustrative test of this kind appears below.
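An example of such a regression test (my own sketch, not a test from the study): the assertion records the behavior observed on the implementation used for generation, so a differing implementation makes it fail:

    import java.util.LinkedList;
    import junit.framework.Assert;

    public class RegressionExample {
      public static void test42() throws Throwable {
        LinkedList l1 = new LinkedList();
        l1.add(Integer.valueOf(5));
        // "[5]" was observed when the test was generated; any
        // implementation that prints differently fails here.
        Assert.assertEquals("[5]", l1.toString());
      }
    }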

Evaluation

Page 27:

Results:

We generated the test suite using Sun JDK 1.5, then ran it once under Sun 1.6 and a second time under IBM 1.5. A total of 25 test cases failed on Sun 1.6, and 73 test cases failed on IBM 1.5.

Evaluation

Page 28:

Feedback-directed random testing scales to large systems, quickly finds errors in heavily tested, widely deployed applications, and achieves behavioral coverage on par with systematic techniques.

Conclusion

Page 29:

Thanks!