SmartUnit: Empirical Evaluations for Automated …SmartUnit: Empirical Evaluations for Automated...

SmartUnit: Empirical Evaluations for Automated Unit Testing of Embedded Software in Industry

Chengyu Zhang, Yichen Yan, Hanru Zhou, Yinbo Yao, Ke Wu, Ting Su, Weikai Miao, Geguang Pu

East China Normal University, ChinaNanyang Technological University, Singapore

National Trusted Embedded Software Engineering Technology Research Center, China

2018.5.30

Outline

• Background

• Approach

• Implementation

• Evaluation

• Conclusion

MotivationBackground

IEC 61508ISO26262

RTCA DO-178B/C

Statement Coverage

Decision Coverage

Unit Testing Code Coverage Criterion

Background

RTCA DO-178B/C

MC/DCLevel A

Level B

Level C

Condition Coverage

Background

If ( A || B ) && C ) {/*Instructions*/

} else{

/*Instructions*/}

A = True B = True C = True

A = False B = False C = False

Decision Coverage

Background

} else{

/*Instructions*/}

A = True B = True C = True => True

A = False B = False C = False => False

Modified Condition/Decision Coverage （MC/DC)

Background

} else{

/*Instructions*/}

A = False B = True C = True

A = False B = True C = False

A = False B = False C = True

A = True B = False C = True

Background

} else{

/*Instructions*/}

A = False B = True C = True

A = False B = True C = False

A = False B = False C = True=> FalseA = True B = False C = True=> True

Background

} else{

/*Instructions*/}

A = False B = True C = True=>TrueA = False B = True C = False=>FalseA = False B = False C = True

A = True B = False C = True

Background

} else{

/*Instructions*/}

A = False B = True C = True=> TrueA = False B = True C = False

A = False B = False C = True=> FalseA = True B = False C = True

Dynamic Symbolic Execution (Concolic Testing)

Approach

int checkSign (int x){ if (x > 0) return 1; else if (x == 0) return 0; else return -1; }

1. input x

3. return 1

5. return 0 7. return -1

2. x > 0

4. x == 0

12345678

Approach

1. input x := 11

3. return 1

2. x > 0T

Random generated input : 111

2345678

Conjunction constraint : x <= 0New input : -7

Approach

3. return 1

7. return -1

2. x > 0

4. x == 0

12345678

1. input x := -7 New input : -7

Conjunction constraint : x <= 0 ^ x== 0New input : 0

Approach

3. return 1

2. x > 0

4. x == 0

12345678

1. input x := 0 New input : 0

Conjunction constraint : noneNew input : none

Approach

3. return 1

2. x > 0

4. x == 0

12345678

1. input x := 0

Test suite :{ 11, -7, 0 }

Cloud-based PlatformImplementation

Implementation

Web UI

Server

File System Database

Worker

SmartUnit Core

Cloud-based Platform Workflow

SourceCode

Source Code with Stub

Test Suite

Report

Implementation

SearcherExecutorMemory Model

Test Suite

SmartUnit Execution Core

Constraint Solver

Execution Module

C Source code

Implementation

Features

Automatically{generate test report.

generate test suite.

insert stubs for function calls & variables (Global variables, function inputs, etc.)

(For LDRA Testbed, Tessy, etc.)

(Statement, branch, MC/DC coverage)

Evaluation

Research Questions

RQ1: How about the performance of SmartUnit on both commercial embedded software and open-source database software?

RQ2: What factors make dynamic symbolic execution get low coverage?

RQ3: Can SmartUnit find the potential runtime exceptions in real-world software?

RQ4: What are the differences in terms of time, cost and quality between automatically generated test cases and manually written test cases?

Evaluation

Subjects # Files # Functions # LOC

Aerospace Software 8 54 3,769

Automotive Software 4 330 31,760

Subway Signal Software 108 874 37,506

SQLite 2 2,046 126,691

PostgreSQL 906 6,105 279,809

Total 1,028 9,409 479,535

RQ1： How about the performance of SmartUnit ？

Evaluation

Subjects Statement Coverage* N/A 0-50% 50-99% 100%

Branch Coverage* N/A 0-50% 50-99% 100%

MC/DC Coverage* N/A 0-50% 50-99% 100%

Aerospace Software 1 3 10 41 1 5 8 41 45 2 - 8

Automotive Software 1 3 11 315 1 6 8 315 274 5 1 50

Subway Signal Software 6 1 50 817 6 2 55 811 558 11 11 294

SQLite 86 86 206 1668 86 119 205 1636 1426 118 149 351

PostgreSQL 687 732 1044 3642 687 1102 804 3512 4083 1308 249 465

RQ1： How about the performance of SmartUnit ？

* : Statistic with the number of functions for corresponding coverage range

Evaluation

RQ2： What factors make dynamic symbolic execution get low coverage ？

• Environment variables and Environment functions

• Complex operations

• Limitation of SMT solver

Evaluation

RQ3： Can SmartUnit find the potential runtime exceptions in real-world software?

• Array index out of bounds

• Fixed memory address

• Divided by zero

Evaluation

• Array index out of bounds

static char *cmdline_option_value(int argc, char **argv, int i) {if (i == argc) {utf8_printf(stderr, "Error: missing argument to %s\n",

argv[0], argv[argc - 1]);exit(1);

}return argv[i];

Evaluation

• Fixed memory address

(*0X00000052) or (*(symbolic_variable+12)

Evaluation

• Divided by zerostatic void getLocalPayload(int nUsable, u8 flags, int nTotal, int *pnLocal){int nLocal, nMinLocal, nMaxLocal;if( flags==0x0D ){nMinLocal = (nUsable - 12) * 32 / 255 - 23;nMaxLocal = nUsable - 35;

}else{nMinLocal = (nUsable - 12) * 32 / 255 - 23;nMaxLocal = (nUsable - 12) * 64 / 255 - 23;

}nLocal = nMinLocal + (nTotal - nMinLocal) % (nUsable - 4);

Evaluation

RQ4： What are the differences between automatically generated and manually written test cases?

Subjects # Functions Time (s) Average (s/func)

Aerospace Software 54 318 6

Automotive Software 330 329 1

Subway Signal Software 874 2,476 3

SQLite 2,046 13,482 6

PostgreSQL 6,105 18,857 3

Total 9,409 35,462 3.77

A trained test engineer can only product test case for 5-8 functions per day.

Conclusion

SmartUnit{Industry application

Potential runtime exceptions

Dynamic symbolic execution (High coverage unit testing)

(Out of bounds, divided by zero, etc.)

(Insert stubs, test report, test suite)

SmartUnit: Empirical Evaluations for Automated …SmartUnit: Empirical Evaluations for Automated...

Documents

Transcript of SmartUnit: Empirical Evaluations for Automated …SmartUnit: Empirical Evaluations for Automated...

Effective Evaluations

Towards Certified Separate Compilation for Concurrent Programs · Concurrent Programs Hanru Jiang∗ University of Science and Technology of China, China hanru219@mail.ustc.edu.cn

[WI 2017] Context Suggestion: Empirical Evaluations vs User Studies

Some empirical evaluations of a temperature forecasting module based on Artificial Neural Networks for a domotic home environment - KDIR 2012

Concrete Models and Empirical Evaluations for the Categorical Compositional ...anthology.aclweb.org/J/J15/J15-1004.pdf · · 2015-04-16for the Categorical Compositional Distributional

Overview of Health Communication Campaigns · 2019-05-29 · CAMPAIGNS Empirical Evidence Supporting the Effectiveness of Mass Media Communication Campaigns Mass media campaign evaluations

Student evaluations

Church evaluations

Surrogate models via Polynomial Chaos Expansions Diane ... · A CMG model for peak gas extraction, empirical CDFs plots (3000 evaluations) from the original model (taking days) and

Www.hpa-midas.org.uk Evaluations and Validations post Carter Keith Perry, Head of Evaluations Evaluations & Standards Laboratory BSMT, November 2006.

Customer evaluations of after-sales service contact modes ... · Customer evaluations of after-sales service contact modes: An empirical analysis of national culture’s consequences

STEP 1: Go to the Results website at // MCE.pdfActive Evaluations Tab - Shows evaluations currently in progress. Recent Evaluations Tab - Shows results for evaluations within the past

Empirical evaluations on the cost-effectiveness of state ... · Empirical evaluations on the cost-effectiveness of state-based testing: An industrial case study Nina Elisabeth Holta,⇑,

The Truth, the Whole Truth, and Nothing but the Truth: A Pragmatic Guide to Assessing Empirical Evaluations

DFID COUNTRYPROGRAMME EVALUATIONS SYNTHESISOF2006/2007 EVALUATIONS · 2016-03-29 · DFID COUNTRYPROGRAMME EVALUATIONS SYNTHESISOF2006/2007 EVALUATIONS Julian Barr and CharlotteVaillant

Evaluations presentation

Personal evaluations

Course Evaluations

STORYTIME AND BEYOND! - LibraryLinkNJ€¦ · • Staff evaluations • Write-ups of current programs • Youth room observations • Patron evaluations • Storytime evaluations

Draft Evaluations