GPCE16: Automatic Non-functional Testing of Code Generators Families

Automatic Non-functional Testing of Code Generators Families. Mohamed BOUSSAA, Olivier BARAIS, Gerson SUNYE, Benoit BAUDRY. INRIA Rennes, France. 2016 IEEE International Conference on Software Quality, Reliability & Security (QRS 2016), August 1-3, 2016, Vienna, Austria. 15th International Conference on Generative Programming: Concepts & Experiences (GPCE 2016), Amsterdam, Netherlands, October 31 – November 1, 2016.

Transcript of GPCE16: Automatic Non-functional Testing of Code Generators Families

Page 1: GPCE16: Automatic Non-functional Testing of Code Generators Families

Automatic Non-functional Testing of Code Generators Families

Mohamed BOUSSAA

Olivier BARAIS

Gerson SUNYE

Benoit BAUDRY

2016 IEEE International Conference on Software Quality, Reliability & Security (QRS 2016)

August 1-3, 2016 - Vienna, Austria

INRIA Rennes, France

15th International Conference on Generative Programming: Concepts & Experiences (GPCE 2016) Amsterdam, Netherlands, October 31 – November 1, 2016


Page 2: GPCE16: Automatic Non-functional Testing of Code Generators Families

Outline

1. Context

2. Motivation

3. Automatic Non-functional Testing of Code Generators Families

4. Performance Evaluation

5. Conclusion

Page 3: GPCE16: Automatic Non-functional Testing of Code Generators Families

Context

Code generators are used everywhere

They automatically transform high-level system specifications (models, DSLs, GUIs, etc.) into general-purpose languages (Java, C++, C#, etc.)

They target diverse and heterogeneous software platforms

All tests pass successfully, but…

What about the non-functional properties (quality) of the generated code?

Page 4: GPCE16: Automatic Non-functional Testing of Code Generators Families

Context

• Testing issues:

- Defective code generators may generate poor-quality code

- Testing the non-functional properties is time-consuming

- It requires examining different non-functional requirements

- Code generators are complex and difficult to understand (they involve complex and heterogeneous technologies)

Page 5: GPCE16: Automatic Non-functional Testing of Code Generators Families

Motivation

Non-functional testing of code generators: the traditional way

• Analyze the non-functional properties of generated code using platform-specific tools, profilers, etc.

Lack of tools for automatic non-functional testing of code generators

Page 6: GPCE16: Automatic Non-functional Testing of Code Generators Families

Automatic Non-functional Testing of Code Generators Families

https://testingcodegenerators.wordpress.com

Page 7: GPCE16: Automatic Non-functional Testing of Code Generators Families

Contributions

We propose:

• A runtime monitoring infrastructure, based on system containers (Docker) as execution platforms, that allows code-generator developers to evaluate the non-functional properties of generated code

• A black-box testing approach to automatically detect potentially inefficient code generators

Page 8: GPCE16: Automatic Non-functional Testing of Code Generators Families

Microservice-based infrastructure

Execute and monitor the generated code using system containers

Different configurations, instances, images, machines, etc.

Resource isolation and management

Less performance overhead

Provide a fine-grained understanding and analysis of compiler behavior

Automatic extraction of non-functional properties related to resource usage

Page 9: GPCE16: Automatic Non-functional Testing of Code Generators Families

Approach Overview


Page 10: GPCE16: Automatic Non-functional Testing of Code Generators Families

Approach Overview

1. Code Execution: Compile and execute the generated code within a new container instance

2. Runtime Monitoring: Gather at runtime the non-functional properties of running programs under test

3. Time-series Database: Save information about resource consumption in a time-series database

4. Performance Analysis: Analyze the performance and non-functional properties of programs under test
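The first step, running each generated program in its own fresh container, can be sketched in Python as follows. This is only an illustration under stated assumptions: the slides do not show the actual commands. The image name, container name, and resource limits are hypothetical. The function only builds the `docker run` argument list and does not start a container.

```python
import shlex


def docker_run_command(image, command, name, memory_limit="512m", cpus=1.0):
    """Build (but do not execute) a `docker run` argv for one program under test.

    All parameter values are hypothetical; the slides do not specify them.
    """
    return [
        "docker", "run",
        "--rm",                    # remove the container once the program exits
        "--name", name,            # a stable name makes monitoring records easy to match
        "--memory", memory_limit,  # cap memory so that runs are comparable
        "--cpus", str(cpus),       # cap CPU share for the same reason
        image,
    ] + shlex.split(command)


if __name__ == "__main__":
    # Hypothetical image and entry point for the PHP target of one test suite.
    argv = docker_run_command("haxe-php-target", "php index.php", "color_ts4_php")
    print(" ".join(argv))
```

Passing the resulting list to `subprocess.run` would launch the container. Keeping command construction separate from execution makes the sketch easy to inspect without Docker installed.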

Page 11: GPCE16: Automatic Non-functional Testing of Code Generators Families

Testing Infrastructure

[Figure: testing infrastructure. Labels: Code Generation + Compilation; Component Under Test (running); Cgroup file systems (CPU, Memory); Monitoring Component; Monitoring records; HTTP Requests; Back-end Database Component (time-series database, port 8086); Front-end: Visualization Component]

Page 12: GPCE16: Automatic Non-functional Testing of Code Generators Families

Testing Method

Definition (Code generator family): We define a code generator family as a set of code generators that take the same language/model as input and generate code for different target platforms (examples: Haxe, ThingML, etc.)

Differential Testing: Compare equivalent implementations of the same program written in different languages

Standard deviation (std_dev): Quantify the amount of variation among the execution traces in terms of memory usage and execution time
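As a minimal sketch of the std_dev metric, the Python snippet below computes the standard deviation of one metric for one test suite across all targets. It assumes the population standard deviation, since the slides do not say whether the population or the sample form is used. The measurement values are made up for illustration.

```python
from statistics import pstdev


def std_dev_across_targets(measurements):
    """Return the population standard deviation of one metric across targets.

    `measurements` maps a target language to the value observed for the same
    test suite (for example execution time in seconds).
    """
    return pstdev(list(measurements.values()))


if __name__ == "__main__":
    # Hypothetical execution times (seconds) for one test suite on five targets.
    times = {"C#": 1.0, "C++": 2.0, "Java": 3.0, "JS": 4.0, "PHP": 5.0}
    print(std_dev_across_targets(times))
```

A value close to 0 means that all targets behave similarly for that test suite. A large value means at least one target deviates from the others.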

Page 13: GPCE16: Automatic Non-functional Testing of Code Generators Families

Testing Method

Test suites whose std_dev exceeds a threshold value are interpreted as code generator inconsistencies
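The threshold rule can be sketched in Python as below. This is an illustration rather than the authors' implementation: the data layout, the suite names, and the choice to report the target farthest from the mean are assumptions, and the population standard deviation is assumed as well.

```python
from statistics import mean, pstdev


def find_inconsistencies(results, threshold):
    """Return the test suites whose std_dev across targets exceeds `threshold`.

    `results` maps a test suite name to a dict {target: metric value}.
    For each flagged suite, the result holds (std_dev, most deviating target).
    """
    flagged = {}
    for suite, per_target in results.items():
        values = list(per_target.values())
        std_dev = pstdev(values)
        if std_dev > threshold:
            center = mean(values)
            # Assumption: the suspect target is the one farthest from the mean.
            suspect = max(per_target, key=lambda t: abs(per_target[t] - center))
            flagged[suite] = (std_dev, suspect)
    return flagged


if __name__ == "__main__":
    # Hypothetical execution times (seconds); not data from the evaluation.
    results = {
        "suite_a": {"C#": 1.0, "C++": 1.0, "Java": 1.0, "JS": 1.0, "PHP": 1.0},
        "suite_b": {"C#": 1.0, "C++": 1.0, "Java": 1.0, "JS": 1.0, "PHP": 101.0},
    }
    print(find_inconsistencies(results, threshold=8))
```

Only the flagged suites need manual inspection, which keeps the approach black-box: no knowledge of the code generator internals is needed to decide where to look.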

Page 14: GPCE16: Automatic Non-functional Testing of Code Generators Families

Evaluation

https://testingcodegenerators.wordpress.com/experimental-results/

Page 15: GPCE16: Automatic Non-functional Testing of Code Generators Families

Experimental Setup

Programs under test: Haxe libraries + test suites

Code generators under test: Haxe compilers (5 targets: C#, C++, Java, JS, PHP)

Non-functional metrics: memory usage (MBytes), execution time (s)

For monitoring: Google cAdvisor

For storage: InfluxDB
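To make the storage side concrete, the sketch below formats one monitoring record as a line in InfluxDB line protocol (`measurement,tags fields timestamp`). The slides do not show the schema actually used, and cAdvisor can write to InfluxDB on its own, so the measurement, tag, and field names here are hypothetical. The sketch also assumes that names contain no spaces, commas, or equals signs, so no escaping is performed.

```python
def to_line_protocol(measurement, tags, fields, timestamp_ns):
    """Format one record as an InfluxDB line-protocol string.

    Assumes keys and tag values contain no spaces, commas, or equals signs.
    All field values are written as floats.
    """
    tag_part = ",".join(f"{key}={value}" for key, value in sorted(tags.items()))
    field_part = ",".join(
        f"{key}={float(value)}" for key, value in sorted(fields.items())
    )
    head = f"{measurement},{tag_part}" if tag_part else measurement
    return f"{head} {field_part} {timestamp_ns}"


if __name__ == "__main__":
    # Hypothetical record: memory usage of the PHP target for one test suite.
    line = to_line_protocol(
        "memory_usage",
        {"target": "php", "suite": "Color_TS4"},
        {"value": 256},
        1,
    )
    print(line)
```

Tagging each record with the target and the test suite lets a later query group measurements per suite, which is the grouping the std_dev comparison needs.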

Page 16: GPCE16: Automatic Non-functional Testing of Code Generators Families

Validation

• Comparison results of running each test suite across the five target languages: the metric used is the standard deviation between execution times

• Standard deviations mostly fall within the 0-8 interval

• 8 data points where the std_dev was extremely high

Page 17: GPCE16: Automatic Non-functional Testing of Code Generators Families

Validation

Test suites with the highest variation in terms of execution time (k=60)

We can identify a singular behavior of the PHP code regarding the execution time

Page 18: GPCE16: Automatic Non-functional Testing of Code Generators Families

Validation

• Comparison results of running each test suite across the five target languages: the metric used is the standard deviation between memory consumptions

• Standard deviations mostly fall within the 0-150 interval

• 6 data points where the std_dev was extremely high

Page 19: GPCE16: Automatic Non-functional Testing of Code Generators Families

Validation


Test suites with the highest variation in terms of memory usage (k=400)

We can identify a singular behavior of the PHP code regarding the memory usage

Page 20: GPCE16: Automatic Non-functional Testing of Code Generators Families

Validation

For Color_TS4 in PHP:

• We observe intensive use of "arrays"

• We replace "arrays" with "SplFixedArray"

=> Speedup x5

=> Memory usage reduction x2

Page 21: GPCE16: Automatic Non-functional Testing of Code Generators Families

Conclusion


Page 22: GPCE16: Automatic Non-functional Testing of Code Generators Families

Conclusion

Summary

• Approach for testing and monitoring code generator families using a container-based infrastructure

• Automatically extract information about resource usage

• The evaluation results show that we can find real issues in existing code generators (i.e., PHP)

Future directions

• Detect more code generator issues (e.g., CPU consumption)

• Evaluate our approach:

- On other code generator families

- Compare to other state-of-the-art approaches

Page 23: GPCE16: Automatic Non-functional Testing of Code Generators Families

Questions?

https://testingcodegenerators.wordpress.com

Page 24: GPCE16: Automatic Non-functional Testing of Code Generators Families

Tool Support


Page 25: GPCE16: Automatic Non-functional Testing of Code Generators Families

Visualization


Page 26: GPCE16: Automatic Non-functional Testing of Code Generators Families


Code Generators Testing: ThingML