Transcript of "Feedback-directed Random Test Generation"

Authors: Carlos Pacheco, Shuvendu K. Lahiri, Michael D. Ernst, Thomas Ball (MIT CSAIL and Microsoft Research)

Page 1:

Feedback-directed Random Test Generation

Carlos Pacheco, Shuvendu K. Lahiri, Michael D. Ernst, Thomas Ball

Page 2:

• Introduction

• Technique

• Evaluation

• Conclusion

Table of Contents

Page 3:

• What is feedback-directed random test generation?

Improve random test generation by incorporating feedback obtained from previously-constructed inputs.

Feedback is the key feature: it greatly improves the effectiveness of random test generation by pruning illegal and redundant sequences as they are created.

• Input and Output:

Input: the code to be tested.

Output: a suite of test cases.

Introduction

Page 4:

• Competing approaches

Systematic testing

Undirected random test generation

• Direction

Our work addresses random generation of unit tests for object-oriented programs, such as Java programs.

Introduction

Page 5:

• Randoop

We have implemented the technique in RANDOOP, which is fully automatic, requires no input from the user, and scales to realistic applications with hundreds of classes.

• Target libraries

JDK

.NET

Introduction

Page 6:

• Example to run Randoop

After downloading randoop.jar, we can run the following command:
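A hedged example invocation (the slide's exact commands did not survive the transcript; flag names vary across Randoop versions, and java.util.TreeSet is just an illustrative target class):

    java -classpath randoop.jar randoop.main.Main gentests \
        --testclass=java.util.TreeSet --timelimit=60

This asks Randoop to spend 60 seconds generating unit tests for java.util.TreeSet.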

After the command runs successfully, the working directory contains the generated files RandoopTest.java and RandoopTest0.java.

Running these generated tests then reports the testing results.

Introduction

Page 7:

• A test case generated by RANDOOP:
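The slide's code did not survive the transcript. Reconstructed from the paper's opening example (details may differ slightly), a RANDOOP-generated test looks like this; it exposes a JDK bug where a set fails to be equal to itself:

    import java.util.*;
    import junit.framework.Assert;

    public class RandoopTest0 {
      public static void test1() throws Throwable {
        LinkedList l1 = new LinkedList();
        Object o1 = new Object();
        l1.addFirst(o1);
        TreeSet t1 = new TreeSet(l1);
        Set s1 = Collections.unmodifiableSet(t1);
        // Fails: s1.equals(s1) does not return true, violating
        // the reflexivity contract of Object.equals.
        Assert.assertTrue(s1.equals(s1));
      }
    }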

Introduction

Page 8:

Steps (a sketch in code follows the list):

• Randomly select methods to call

• Check the resulting sequences against the contracts

• Classify each sequence after checking

• Output the error-revealing sequences as test cases
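A minimal sketch of this loop in Java-like pseudocode (names such as extend, execute, and isExtensible are illustrative, not Randoop's actual internal API):

    // Feedback-directed generation loop (illustrative sketch).
    Set<Sequence> components = seedSequences();   // e.g., primitive values
    List<Sequence> errorSeqs = new ArrayList<>();
    while (!timeLimitReached()) {
      Method m = chooseRandomPublicMethod();
      // Extend previously built sequences with a call to m,
      // drawing m's arguments from values those sequences produce.
      Sequence s = extend(m, chooseArgumentsFrom(components));
      if (components.contains(s)) continue;       // filter: duplicate
      ExecutionResult r = execute(s);             // the feedback step
      if (violatesContract(r)) {
        errorSeqs.add(s);                         // error-revealing: report
      } else if (isExtensible(r)) {
        components.add(s);                        // reuse in future sequences
      }                                           // otherwise discard
    }
    emitJUnitTests(errorSeqs);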

Technique

Page 9:

How does feedback-directed random test generation work?

• First, we should know what materials it uses.

• Second, we can analyze its process.

Technique

Page 10:

Technique

Page 11:

Technique

Page 12:

Contracts and filters:

• Contracts and filters play an important role in this technique, because they help us validate whether every generated sequence is correct.

• After a sequence has been executed and checked against the contracts, filters set the S.i.extensible flag to true or false, which determines whether the sequence may be extended into new ones. A sketch of one contract check follows.
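For instance, one default contract is that equals must be reflexive. A minimal sketch of such a check (my own illustration, not Randoop's source):

    // Reflexivity contract: for any object o created by a sequence,
    // o.equals(o) must return true and must not throw.
    static boolean checkEqualsReflexive(Object o) {
      if (o == null) return true;   // contract applies only to created objects
      try {
        return o.equals(o);
      } catch (Throwable t) {
        return false;               // throwing also violates the contract
      }
    }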

Technique

Page 13:

Default contracts checked by RANDOOP. Users can extend these with additional contracts, including domain-specific ones.
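The table itself did not survive the transcript. Reconstructed from the paper, the defaults are roughly: a method may not throw a NullPointerException when no null inputs were passed, and may not throw an AssertionError; and every created object o must satisfy o.equals(o) == true, with equals, hashCode, and toString throwing no exception (plus analogous contracts for .NET).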

Technique

Page 14:

Repetition:

• A good test case sometimes needs to call a given method multiple times.

• In our work, when generating a new sequence there are two important parameters we can set, N and M: N is the probability that a chosen method call is repeated, and M is the maximum number of repetitions. A sketch follows.
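A sketch of how such repetition could be implemented, with N and M as described above (helper names are illustrative):

    // With probability N, append the chosen call between 1 and M times;
    // otherwise append it once.
    int times = (rand.nextDouble() < N) ? 1 + rand.nextInt(M) : 1;
    for (int i = 0; i < times; i++) {
      sequence.appendCall(chosenMethod, arguments);
    }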

Technique

Page 15:

Three steps:

• Evaluate the coverage that RANDOOP achieves on a collection of container data structures.

• Use RANDOOP-generated test cases to find API contract violations in widely used libraries, such as the JDK and .NET.

• Use RANDOOP-generated regression test cases to find regression errors.

Evaluation

Page 16:

Step 1:

• Test on four container classes: a binary tree, a binomial heap, a Fibonacci heap, and a red-black tree.

• Use four techniques: systematic testing in JPF, RANDOOP, undirected random testing implemented in JPF, and undirected random testing implemented in RANDOOP.

Evaluation

Page 17:

For each data structure, we performed the following steps:

1. We ran Visser et al.'s systematic tests on the containers using their optimal parameters (JPF).

2. We ran RANDOOP on the containers (RP).

3. To compare against unguided random generation, we also reproduced Visser et al.'s random-generation results (JPFu).

4. We ran RANDOOP a second time, turning off all filters (RPu).

Evaluation

Page 18:

Evaluation

Page 19:

Results:

• RANDOOP's coverage is equal to or greater than that of the other techniques, and its running time is much lower.

• Our results suggest that further experiments are required on how systematic and random techniques compare in detecting errors in data structures.

Evaluation

Page 20:

Step 2: Checking API contracts

• Test subjects: the Java JDK and the .NET Framework.

• We use three techniques, feedback-directed random generation, undirected random generation, and systematic generation, to create test suites for the widely used libraries mentioned above.

Evaluation

Page 21:

First, for each library, we applied RANDOOP using the following steps:

1. We ran RANDOOP on the library, specifying all of its public classes as targets for testing. This produces a test suite.

2. We compiled the test suite and ran it with REDUCE.

3. We manually inspected the failing test cases reported by REDUCE.

Evaluation

Page 22:

Evaluation

Page 23:

Second, we used systematic testing to test the libraries, as follows:

• Write a test driver that checks the same contracts as RANDOOP (because systematic testing cannot generate a test suite the way RANDOOP does).

• Run the driver with JPF.

• Report the results.

Evaluation

Page 24:

Comparison with RANDOOP:

• For all the libraries, JPF ran out of memory (after 32 seconds on average) without reporting any errors.

• RANDOOP explored the space more effectively: it sampled the state space sparsely but broadly, while JPF thoroughly sampled only a tiny, localized portion of the space.

Evaluation

Page 25:

Third, we performed undirected random testing on each library:

• We reran RANDOOP, this time with the filters turned off.

• Results:

Violation-inducing test cases: 1,326.

Test cases reported by REDUCE: 60.

None of these revealed an actual error.

Evaluation

Page 26:

Regression and compliance testing:

• This section describes a case study in which we used feedback-directed random testing to find inconsistencies between different implementations of the same API.

• We tested three implementations: Sun JDK 1.5, Sun JDK 1.6, and IBM JDK 1.5. An illustrative test of this kind appears below.
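An example of such a regression test (my own sketch, not a test from the study): the assertion records the behavior observed on the implementation used for generation, so a differing implementation makes it fail:

    import java.util.LinkedList;
    import junit.framework.Assert;

    public class RegressionExample {
      public static void test42() throws Throwable {
        LinkedList l1 = new LinkedList();
        l1.add(Integer.valueOf(5));
        // "[5]" was observed when the test was generated; any
        // implementation that prints differently fails here.
        Assert.assertEquals("[5]", l1.toString());
      }
    }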

Evaluation

Page 27:

Results:

We generated the test suite using Sun JDK 1.5, then ran it once under Sun 1.6 and a second time under IBM 1.5. A total of 25 test cases failed on Sun 1.6, and 73 test cases failed on IBM 1.5.

Evaluation

Page 28:

Feedback-directed random testing scales to large systems, quickly finds errors in heavily tested, widely deployed applications, and achieves behavioral coverage on par with systematic techniques.

Conclusion

Page 29:

Thanks!