Generation of Random EMF Models for Benchmarks
Generation of Random Software Models for Benchmarks
Markus Scheidgen
1
Agenda
▶ Benchmarks for MDE
▶ Input models for MDE benchmarks
▶ Generation of random models
■ Language
■ Examples
▶ Related Work
▶ Conclusion
2
Benchmarks (I)
▶ in the small, MDE technology¹ is solely evaluated by its functionality
▶ Big MDE technology is evaluated by its functionality and its performance (execution time, memory consumption, ...)
▶ Benchmarks enable sound comparison of technologies based on their performance
1) technology = algorithms ∪ methods ∪ tools ∪ frameworks
3
Benchmarks (II)
▶ A benchmark describes the measure ...
■ of a well-defined property
■ acquired in a well-defined process
■ with a well-defined workload (tasks and inputs)
■ in a well-defined environment (a sketch of recording it follows below)
4
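The environment item in particular is easy to underreport. As a minimal sketch (assuming a JVM-based setup; the class name is illustrative), much of it can be recorded automatically alongside the results:

import java.lang.management.ManagementFactory;

public class EnvironmentReport {
    public static void main(String[] args) {
        // software versions and JVM configuration, as demanded above
        System.out.println("JVM:  " + System.getProperty("java.vm.name") + " " + System.getProperty("java.version"));
        System.out.println("OS:   " + System.getProperty("os.name") + " " + System.getProperty("os.version") + " (" + System.getProperty("os.arch") + ")");
        System.out.println("Heap: " + Runtime.getRuntime().maxMemory() / (1024 * 1024) + " MB max");
        System.out.println("Args: " + ManagementFactory.getRuntimeMXBean().getInputArguments());
    }
}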
Example: a published benchmark description, annotated

"All measurements were performed on a Notebook computer with Intel Core i5 2.4 GHz CPU, 8 GB 1067 MHz DDR3 RAM, running Mac OS 10.7.3."
+ some environment
- software versions, JVM configuration

"All experiments were repeated at least 20 times, and all presented results are respective averages."
+ some process
- distribution, variation, outliers
- warmup: JIT, caches, GC
(a measurement-loop sketch follows after this example)

"We measured the performance of instantiating and persisting objects."
+ some property description
- exact task?
- comparable between technologies?

(M. Scheidgen, A. Zubow, J. Fischer, T. H. Kolbe: Automated and Transparent Model Fragmentation for Persisting Large Models; ACM/IEEE 15th International Conference on Model Driven Engineering Languages & Systems (MODELS); Innsbruck; 2012; LNCS, Springer)
[Figure: objects per second (×10^4) for XMI, CDO, Morsa, and EMFFrag, each measured with and without cross references]

"We created test models with 10^5 objects, a binary containment hierarchy, and two different densities of cross references: one cross reference per object and no cross references."
+ two specific shapes
- only two specific shapes
- real-world likeness?
5
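The process annotations above (warmup for JIT and caches, at least 20 repetitions, reporting of averages and variation) translate into a simple measurement loop. A minimal sketch, not the setup used in the cited paper:

import java.util.Arrays;

public class MeasurementLoop {
    public static void measure(Runnable task) {
        for (int i = 0; i < 10; i++) task.run();           // warmup: JIT, caches
        int repetitions = 20;                               // "repeated at least 20 times"
        long[] samples = new long[repetitions];
        for (int i = 0; i < repetitions; i++) {
            System.gc();                                    // reduce GC noise between runs
            long start = System.nanoTime();
            task.run();
            samples[i] = System.nanoTime() - start;
        }
        double mean = Arrays.stream(samples).average().getAsDouble();
        double sd = Math.sqrt(Arrays.stream(samples)
                .mapToDouble(s -> (s - mean) * (s - mean)).average().getAsDouble());
        System.out.printf("mean=%.0f ns, stddev=%.0f ns%n", mean, sd);  // report variation, not just averages
    }
}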
Input for Benchmarks (I)
▶ A benchmark input model should
■ include no bias
■ invoke real-world behavior
■ cover different scenarios
■ lie on a metrical scale
6
Input for Benchmarks – In MDE
▶ For MDE technology, the input is usually a software engineering artifact, which we commonly refer to as a model
▶ Usually the models from the GraBaTs 2009 graph transformation contest are used
■ MoDisco models of JDT
■ different sizes and shapes (with and without method implementations)
■ sizes do not scale linearly
7
Input for Benchmarks – Properties: Size & Shape
▶ different properties to mimic different scenarios and invoke different behavior/performance characteristics
▶ goal: understand the correlation between performance properties and model sizes and shapes
▶ ordinal vs. metrical scales
▶ What defines a shape?
■ metrics (depending on the language, e.g. methods per class in OO programming)
■ graph/tree properties (degree, connectedness, sparse vs. dense, etc.)
▶ What defines size? (see the counting sketch below)
■ # objects
■ # values
■ # links
8
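The three size measures can be counted directly on a loaded EMF model. A minimal sketch (counting a set attribute once regardless of its multiplicity, which is a simplification):

import java.util.Iterator;
import org.eclipse.emf.ecore.EAttribute;
import org.eclipse.emf.ecore.EObject;
import org.eclipse.emf.ecore.resource.Resource;

public class SizeMetrics {
    public static void print(Resource resource) {
        long objects = 0, values = 0, links = 0;
        for (Iterator<EObject> it = resource.getAllContents(); it.hasNext(); ) {
            EObject o = it.next();
            objects++;                                             // # objects
            for (EAttribute a : o.eClass().getEAllAttributes())
                if (o.eIsSet(a)) values++;                         // # values (set attributes)
            links += o.eContents().size() + o.eCrossReferences().size();  // # links
        }
        System.out.printf("#objects=%d #values=%d #links=%d%n", objects, values, links);
    }
}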
Input for Benchmarks – Approaches
▶ handcraft input models – does not scale
▶ take existing models – only the given shapes are available
▶ generate models – generated models do not mimic the real world
▶ bias?
■ bias in creation, selection, algorithm
■ a social problem; technology cannot be used to solve social problems
10
Input for Benchmarks – Random Models
▶ random means neither arbitrary nor uniform
▶ surprise element
▶ probability distributions as abstractions for typical usage of language constructs
■ e.g., a class typically has a negative-binomially distributed number of methods (with certain parameters) [1]; a sampling sketch follows below
▶ distribution parameters to define shapes
▶ random models can be sensible representatives of a large class of models
11
[1] Tetsuo Tamai, Takako Nakatani: Analysis of Software Evolution Processes Using Statistical Distribution Models, IWPSE '02
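To make this concrete: a NegBinomial(r, p) value can be sampled as the number of failures before the r-th success of repeated Bernoulli trials. A minimal sketch, not from the talk, that could draw e.g. the number of methods per class:

import java.util.Random;

public class Distributions {
    // number of failures before the r-th success, each success having probability p
    public static int negBinomial(Random rnd, int r, double p) {
        int failures = 0;
        for (int successes = 0; successes < r; successes++)
            while (rnd.nextDouble() > p) failures++;   // a failure before the next success
        return failures;
    }
}

With suitable parameters, such a sampler yields the skewed, long-tailed counts observed in real software [1].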
Generation of Random Models – A Generator DSL (I)
12
generator RandomEcore for ecore in "...ecore/model/Ecore.ecore" {
    ePackage: EPackage ->
        name := RandomID(Normal(8,3))
        eClassifiers += eClass#NegBinomial(5,0.5)
    ;
    eClass: EClass ->
        name := RandomID(Normal(10,4))
        abstract := UniformBool(0.2)
        eStructuralFeatures += eReference(UniformBool(0.3))#NegBinomial(4,0.7)
        eStructuralFeatures += eAttribute#NegBinomial(6,0.5)
    ;
    eReference(boolean composite): EReference ->
        name := RandomID(Normal(10,4))
        upperBound := if (UniformBool(0.5)) -1 else 1
        ordered := UniformBool(0.2)
        containment := composite
        eType: EClass := Uniform(model.EClassifiers.filter[it instanceof EClass])
    ;
    ...
}
http://github.com/markus1978/RandomEMF
Generation of Random Models – A Generator DSL (II)
13
▶ Maps a meta-model to a grammar-like description
▶ Rule based
▶ Each rule creates an object of a certain meta-class
▶ Each rule calls other rules to create features
▶ Rules can have parameters
▶ Expressions with random values
■ different distributions for random number generation
■ random number of rule applications
■ random values (e.g. identifiers, choices)
▶ Xtext + Xbase DSL (a plain-Java analogue of the rule semantics is sketched below)
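To illustrate these rule semantics (this is not RandomEMF's actual implementation), a rough plain-Java analogue of the ePackage and eClass rules from the listing above, using EMF's EcoreFactory; negBinomial is the sampler sketched earlier, and randomId is a hypothetical helper:

import java.util.Random;
import org.eclipse.emf.ecore.EClass;
import org.eclipse.emf.ecore.EPackage;
import org.eclipse.emf.ecore.EcoreFactory;

public class RandomEcoreByHand {
    static EPackage ePackage(Random rnd) {                    // ePackage: EPackage -> ...
        EPackage p = EcoreFactory.eINSTANCE.createEPackage();
        p.setName(randomId(rnd, 8, 3));                       // name := RandomID(Normal(8,3))
        int n = Distributions.negBinomial(rnd, 5, 0.5);       // eClassifiers += eClass#NegBinomial(5,0.5)
        for (int i = 0; i < n; i++) p.getEClassifiers().add(eClass(rnd));
        return p;
    }

    static EClass eClass(Random rnd) {                        // eClass: EClass -> ...
        EClass c = EcoreFactory.eINSTANCE.createEClass();
        c.setName(randomId(rnd, 10, 4));
        c.setAbstract(rnd.nextDouble() < 0.2);                // abstract := UniformBool(0.2)
        return c;
    }

    // hypothetical helper: identifier with Normal(mean, sd)-distributed length
    static String randomId(Random rnd, double mean, double sd) {
        int len = Math.max(1, (int) Math.round(mean + sd * rnd.nextGaussian()));
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < len; i++) sb.append((char) ('a' + rnd.nextInt(26)));
        return sb.toString();
    }
}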
Generation of Random Models – Generated Example
14
package dabobobues;

class Dues {

    DuBoBuTus begubicus;
    ELius brauguslus;

    void Dues(Alius donus, FanulAudaCio aubetin) {
    }

    void baGusFritus() {
        eudaguslius = "";
        bigusdaGubolius();
        if ("") {
            annulAugusaugusfrigustin("");
            albucio = Dues()<=++12;
            bi();
            eBoTor();
        } else {
            brauguslus = 9;
            baGusFritus();
            duLus = ""=="";
        }
    }

    void aufribonulAubufrinus(Dues e) {
        dobubogutor();
        aubiguTus = 9;
    }
}
[Figure, slide 15: counts of classes/interfaces, methods, statements, expressions, and other constructs, comparing randomly generated code, synthetically generated code, and code from actual Java projects]
15
Generation of Random Models – Problems
▶ randomness is a tool to reduce bias, but clients still have to apply it correctly
▶ it is hard to generate models that are correct with respect to static semantics
16
Related Work
▶ Test-Model generation with SAT-Solvers
■ meta-model/constraints divided into small partitions that cover test cases
■ translation into logical equations
■ SAT solver
■ translation of results into model fragments
■ composition of test models from model fragments
➡ small, valid models with statistically proven test coverage
17
Sagar Sen, Benoit Baudry, Jean-Marie Mottu: Automatic Model Generation Strategies for Model Transformation Testing, Theory and Practice of Model Transformations, Springer, 2009
Erwan Brottier, Franck Fleurey, Jim Steel, Benoit Baudry, Yves Le Traon: Metamodel-based Test Generation for Model Transformations: an Algorithm and a Tool, ISSRE’06, IEEE, 2006
Related Work
▶ Translation into a constructive formalism
■ Meta-modeling is not constructive (the full set of instances cannot be generated from a meta-model alone)
■ translation into context-free grammars or graph grammars
■ random application of rules to generate random models
➡ large models; shape can be influenced via probability distributions on rule selection (see the weighted-sampling sketch below)
18
K Ehrig, JM Küster, G Taentzer: Generating instance models from meta models, Formal Methods for Open Object-Based Distributed Systems, Springer, 2006
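The lever for shaping is non-uniform rule selection, which reduces to weighted sampling. A minimal generic sketch (names illustrative):

import java.util.List;
import java.util.Random;

public class RuleSelection {
    // pick rules.get(i) with probability weights[i] / sum(weights)
    public static <T> T weightedChoice(Random rnd, List<T> rules, double[] weights) {
        double total = 0;
        for (double w : weights) total += w;
        double x = rnd.nextDouble() * total;
        for (int i = 0; i < rules.size(); i++) {
            x -= weights[i];
            if (x < 0) return rules.get(i);
        }
        return rules.get(rules.size() - 1);   // guard against rounding
    }
}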
Related Work
▶ Fitting meta-model instances onto randomly generated tree/graph structures
■ existing methods for random tree or graph generation
■ interpretation of randomly generated trees/graphs as meta-model instances
➡ large but uniform models, not aware of static semantics
19
A Mougenot, A Darrasse, X Blanc, M Soria: Uniform random generation of huge metamodel instances, ECMDA, Springer, 2009
Related Work
▶ benchmark definitions for graph transformations
▶ different distributions for graph edges to create different shapes (see the attachment sketch below)
■ binomial
■ hypergeometric
■ uniform
■ preferential attachment
➡ large models, not aware of static semantics
20
Izsó, B., Szatmári, Z., Bergmann, G., Horváth, Á., Ráth, I.: Towards Precise Metrics for Predicting Graph Query Performance, 28th IEEE/ACM International Conference on Automated Software Engineering (ASE), IEEE, 2013
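Of these, preferential attachment is the one that produces skewed, real-world-like degree distributions. A minimal Barabási–Albert-style sketch (one edge per new node; illustrative, not the cited authors' generator):

import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class PreferentialAttachment {
    public static List<int[]> edges(Random rnd, int nodeCount) {
        List<int[]> edges = new ArrayList<>();
        List<Integer> endpoints = new ArrayList<>();   // one entry per edge endpoint
        endpoints.add(0);                              // seed node 0 with one virtual endpoint
        for (int v = 1; v < nodeCount; v++) {
            // sampling from endpoints makes the pick probability proportional to a node's degree
            int target = endpoints.get(rnd.nextInt(endpoints.size()));
            edges.add(new int[]{v, target});
            endpoints.add(v);
            endpoints.add(target);
        }
        return edges;
    }
}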
Conclusions
▶ benchmarking in MDE can be improved
▶ there are other options for input models than the GraBaTs '09 contest models
▶ different shapes (preferably on a metrical scale) should be used to find the distinctive merits and flaws of compared technologies
▶ generators for random models
■ parameters to create differently shaped models
■ randomness and suitable distributions for real-world-like input
■ linearly scaled sizes
21