
Model-Based Mutation Testing of Reactive Systems

From Semantics to Automated Test-Case Generation

Bernhard K. Aichernig

Institute for Software Technology, Graz University of Technology, Austria

[email protected]

Abstract. In this paper we give an overview of our work on combining model-based testing and mutation testing. Model-based testing is a black-box testing technique that avoids the labour of manually writing hundreds of test cases, but instead advocates capturing the expected behaviour in a model of the system-under-test. The test cases are automatically generated from this model. The technique is receiving growing interest in the embedded-systems domain, where models are the rule rather than the exception. Mutation testing is a technique for assessing and improving a test suite. A number of faulty versions of a program-under-test are produced by injecting bugs into its source code. These faulty programs are called mutants. A tester analyses whether his test suite can "kill" all mutants. We say that a test kills a mutant if it is able to distinguish it from the original. The tester improves his test suite until all faulty mutants get killed. In model-based mutation testing, we combine the central ideas of model-based testing and mutation testing: we inject bugs into a model and generate a test suite that will kill these bugs. In this paper, we discuss its scientific foundations and tools. The foundations include semantics and conformance relations; the supporting tools involve model checkers, constraint solvers and SMT solvers.

1 Introduction

Is testing able to show the absence of bugs? The most prominent negative answer was given by the late Edsger Dijkstra: "Program testing can be a very effective way to show the presence of bugs, but it is hopelessly inadequate for showing their absence." [15]. Dijkstra was always motivating the need for formally verified software. Of course, in general Dijkstra is right, in the same way as Popper was right when he stated that we can never verify that a theory is correct by a finite set of experiments. In principle, only refutation (falsification) is possible [20]. However, this should not lead to an over-pessimistic judgement rejecting testing completely. This would be futile, since testing is the only way of building trust in a running system embedded in a complex environment. Testing is needed to check our assumptions. With wrong assumptions, even formally verified software may fail. A famous example of such a rare and subtle software bug was found in the binary search algorithm implemented in the Java JDK 1.5 library in 2006 [13].

As mentioned, Sir Karl Popper proposed the process of falsification. The idea is to build up trust by trying to disprove a theory. Translated to computer-based systems, we form a theory by modelling the system that is under investigation. We call these models test models. By testing, we try to disprove that the constructed system conforms to the test model. The tests are guided by educated guesses of possible faults that have been made during the construction.

If these falsification attempts fail, we build up trust. More importantly, this trust is measurable, since we know what kind of challenges the system survived, i.e. what kind of faults are absent. In this paper we present our work on model-based mutation testing, which follows this fault-oriented strategy. The main advantage of this testing technique is that it can guarantee the absence of specific faults.

Our goal is to generate a small set of test cases that cover these anticipated faults. This is in contrast to more traditional model-based testing approaches, which often aim for structural model coverage, e.g. state coverage or transition coverage.

The remainder of this paper is structured as follows. Next, in Section 2 we introduce mutation testing. Then, in Section 3 we explain the process of model-based mutation testing. In Section 4 we develop its general theory. In Section 5 the general theory is instantiated for transformational systems. In Section 6 we show how to handle reactive systems. Finally, in Section 7 we draw our conclusions.

2 Mutation Testing

Mutation testing is a way of assessing and improving a test suite by checking if its test cases can detect a number of injected faults in a program. The faults are introduced by syntactically changing the source code following patterns of typical programming errors. These deviations in the code are called mutations. The resulting faulty versions of the program are called mutants. Usually, each mutant includes only one mutation. Examples of typical mutations include renaming of variables, replacing operators, e.g., an assignment for an equivalence operator, and slightly changing Boolean and arithmetic expressions. Note that we only consider mutations that are syntactically correct. The number and kind of mutations depend on the programming language and are defined as so-called mutation operators.

A mutation operator is a rewrite rule that defines how certain terms in the programming language are replaced by mutations. For every occurrence of the term, the mutation operator rewrites the original program into a new mutant. After a set of mutants has been generated, the test cases are run on the original and on each mutant. If a test case can distinguish a mutant from the original program, i.e. a test case passes the original but fails on a mutant, we say that this test case kills a mutant. The goal is to develop a test suite that kills all mutants.

1  object triangle {
2
3    def tritype(a: Int, b: Int, c: Int) = (a,b,c) match {
4      case _ if (a <= c-b) => "no triangle"
5      case _ if (a <= b-c) => "no triangle"
6      case _ if (b <= a-c) => "no triangle"
7      case _ if (a == b && b == c) => "equilateral"
8      case _ if (a == b) => "isosceles"
9      case _ if (b == c) => "isosceles"
10     case _ if (a == c) => "isosceles"
11     case _ => "scalene"
12   }
13 }

Fig. 1. Scala function returning the type of a triangle.

Mutation testing can also be lifted to a test-case generation technique. The aim is to automatically search for test cases that kill the mutants, i.e. the faults. However, this is still an open research problem, as was recently pointed out in a survey on mutation testing: "There is a pressing need to address the, currently unresolved, problem of test case generation." [17]. It is the objective of our research to solve this problem.

Example 1 (Mutation of Programs). Consider the Scala program in Figure 1. The function tritype takes the three side lengths of a triangle and returns the resulting type of triangle. An example mutation operator could rewrite every equality into a greater-equal operator (== ⇒ >=). This would produce five mutants, each containing exactly one mutation. For example, in the first mutant (Mutant 1), Line 7 would be replaced by case _ if (a >= b && b == c) => "equilateral". □
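As an illustration of the rewrite-rule view, the following sketch applies the (== ⇒ >=) operator mechanically to the source text of tritype, producing one mutant per occurrence. This is a minimal sketch under our own naming; a real mutation tool would rewrite the abstract syntax tree rather than the raw text:

object MutationOperator {
  // One mutant per occurrence of `from`, each containing exactly one
  // mutation (here: the operator  ==  replaced by  >=).
  def mutants(src: String, from: String, to: String): Seq[String] = {
    def positions(i: Int): List[Int] = src.indexOf(from, i) match {
      case -1 => Nil
      case p  => p :: positions(p + from.length)
    }
    positions(0).map(p =>
      src.substring(0, p) + to + src.substring(p + from.length))
  }

  def main(args: Array[String]): Unit = {
    // Assumes the program of Figure 1 is stored in triangle.scala.
    val src = scala.io.Source.fromFile("triangle.scala").mkString
    println(mutants(src, "==", ">=").size) // 5 mutants
  }
}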

Mutation can also be applied at the modelling level, as the following example illustrates.

Example 2 (Mutation of Models). Consider the UML diagram of a car alarm system in Figure 2. From the initial state OpenAndUnlocked one can traverse to ClosedAndLocked by closing all doors and locking the car. The actions of closing, opening, locking, and unlocking are modelled by corresponding signals Close, Open, Lock, and Unlock. The alarm system is armed after 20 seconds in ClosedAndLocked. Upon entry of the Armed state, the model calls the method AlarmArmed.SetOn. Upon leaving the state, which can be done by either unlocking the car or opening a door, AlarmArmed.SetOff is called. Similarly, when entering the Alarm state, the optical and acoustic alarms are enabled. When leaving the alarm state, either via a timeout or via unlocking the car, both acoustic and optical alarm are turned off. When leaving the alarm state after a timeout, the system returns to an armed state only in case it receives a close signal. Turning off the acoustic alarm after 30 seconds is reflected in the time-triggered transition leading to the Flash sub-state of the Alarm state.

[Figure 2: UML state machine AlarmSystem_StateMachine with the states OpenAndUnlocked, ClosedAndUnlocked, OpenAndLocked, ClosedAndLocked, Armed (entry: Show Armed, exit: Show Unarmed), Alarm (entry: Activate Alarms, exit: Deactivate Alarms; sub-states FlashAndSound and Flash) and SilentAndOpen, connected by Open/Close/Lock/Unlock signals and time-triggered transitions after 20, 30 (Deactivate Sound) and 300 time units.]

Fig. 2. State machine model of a car alarm system in UML.

Let us consider a mutation operator for state machines that turns every transition into a reflexive transition, i.e., redirects its target back to its source state. This operator produces 17 mutants of the car alarm system's UML diagram, one for each transition. For example, applying this mutation operator to the Lock-transition at the state OpenAndUnlocked results in a faulty behaviour staying in the state after a Lock-event. □

Our project partner, the Austrian Institute of Technology (AIT), has developed a large set of mutation operators for UML state machines, including removing trigger events on transitions, mutating transition signal events, mutating transition time trigger events, mutating transition OCL expressions, mutating transition effects, mutating transition guards, and removing entry and exit actions in states. We have recently studied the effectiveness of these mutation operators for different modelling styles [21]. This study shows that the number of generated mutants per mutation operator heavily depends on the style of the UML models.

After generating the mutants, we try to kill them. A test case kills a mutant if its execution on the mutant shows a different behaviour than on the original. We say that a mutant survives a test case if it is not killed.

Example 3. Let us consider a set of test cases for the triangle example of Figure 1: tritype(0,1,1), tritype(1,0,1), tritype(1,1,0), tritype(1,1,1), tritype(2,3,3), tritype(3,2,3), tritype(3,3,2), tritype(2,3,4). These test cases cover all states, all branches, and all paths of the program. They even satisfy the MC/DC coverage criterion, and yet our mutant of Example 1 survives this test suite. To kill the mutant, we need an isosceles test case with a > b, e.g., tritype(3,2,2). This test case kills the mutant by returning "equilateral" instead of "isosceles". □
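As a small executable illustration (our own harness, not part of the tools discussed here), the suite of Example 3 plus the killing test can be run against the original tritype and Mutant 1; a test kills the mutant exactly when the two results differ:

object KillCheck {
  // Original function for mutant = false; Mutant 1 of Example 1
  // (Line 7: == replaced by >=) for mutant = true.
  def tritype(mutant: Boolean)(a: Int, b: Int, c: Int): String = (a, b, c) match {
    case _ if a <= c - b || a <= b - c || b <= a - c     => "no triangle"
    case _ if (if (mutant) a >= b else a == b) && b == c => "equilateral"
    case _ if a == b || b == c || a == c                 => "isosceles"
    case _                                               => "scalene"
  }

  def main(args: Array[String]): Unit = {
    val suite = Seq((0, 1, 1), (1, 0, 1), (1, 1, 0), (1, 1, 1),
                    (2, 3, 3), (3, 2, 3), (3, 3, 2), (2, 3, 4), (3, 2, 2))
    suite.foreach { case (a, b, c) =>
      val (orig, mut) = (tritype(false)(a, b, c), tritype(true)(a, b, c))
      val verdict = if (orig != mut) "kills Mutant 1" else "Mutant 1 survives"
      println(s"tritype($a,$b,$c) = $orig; $verdict")
    }
  }
}

Running the harness reports "Mutant 1 survives" for the first eight tests and "kills Mutant 1" only for tritype(3,2,2).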

It is our goal to generate such test cases. This is not only possible for programs, but also for models.

Example 4. Let us return to the transition mutation discussed in Example 2. This mutant may survive function coverage, state coverage and even transition coverage, because the fault of staying in the state is only observable after waiting for 20 seconds and checking whether the alarm system has been armed. Hence, a test sequence Lock(); Close(); Wait(20) is needed to kill this mutant. The expected behaviour of this test case is that the red flashing light indicating the arming will be switched on. In contrast, the mutant will show quiescence, i.e. the absence of any observation. □

In recent years, mutation testing has received growing interest in academia [17]. Today, it is most frequently used as a technique to analyse the quality of a given test suite. The quality is measured in terms of the mutation score, which is the ratio of killed mutants to the total number of mutants: the higher the mutation score, the better the test suite. Test suites can also be minimised by reducing the number of test cases while keeping the mutation score. Obviously, the aim is to have a test suite with a maximal mutation score.
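The score and a simple greedy minimisation fit in a few lines; this is a generic sketch under our own naming, not the procedure of any particular tool (it assumes a non-empty set of mutants):

object MutationScore {
  // kills(t, m) holds if test t distinguishes mutant m from the original.
  def score[T, M](suite: Seq[T], mutants: Seq[M],
                  kills: (T, M) => Boolean): Double =
    mutants.count(m => suite.exists(t => kills(t, m))).toDouble / mutants.size

  // Greedy minimisation: drop a test whenever the score is preserved.
  def minimise[T, M](suite: Seq[T], mutants: Seq[M],
                     kills: (T, M) => Boolean): Seq[T] = {
    val target = score(suite, mutants, kills)
    suite.foldLeft(suite) { (kept, t) =>
      val without = kept.filterNot(_ == t)
      if (without.nonEmpty && score(without, mutants, kills) >= target) without
      else kept
    }
  }
}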

Our research aims at automatically generating the test cases that maximise the mutation score. Hence, rather than analysing a given test suite, we are interested in its synthesis. Our idea is to use and develop model checkers that analyse the equivalence between the original and a mutant. These tools produce a counter-example to equivalence, which can be turned into a test case. However, there are some challenges.

Challenges. Unfortunately, achieving a mutation score of 1 (100%) is often impossible. The problem is that some mutants show an equivalent behaviour and, therefore, cannot be killed by any test case. The reason is that some syntactic changes do not have an effect on the semantics, e.g. mutations in code fragments that are never executed (dead code). Hence, these equivalent mutants have to be identified in order to normalise the mutation score¹. For model checkers this means that in case of equivalence, the full state space has to be explored, which may lead to the well-known state-space explosion. In general, equivalence checking is undecidable; for bounded models it is NP-complete. Therefore, we apply the technique to abstract models of the system-under-test (SUT). This leads to a combination of model-based testing and mutation testing, which we call model-based mutation testing.

3 Model-Based Mutation Testing

Figure 3 summarises the process of model-based mutation testing. As in classical model-based testing, the user creates a test model out of the given requirements. A test case generator then analyses the model and generates an abstract test case (or a test suite). This test case is on the same abstraction level as the test model and includes expected outputs. A test driver maps the abstract test case to the concrete test interface of the SUT and executes the test case. The test driver compares the expected outputs with the actual outputs of the SUT and issues a verdict (pass or fail).

¹ Therefore, originally, the mutation score is defined as the ratio of killed mutants to the total number of non-equivalent mutants.

[Figure 3: a model-mutation tool derives a model mutant from the test model; the test case generator checks conformance between model and mutant and, if the mutant does not conform, produces an abstract test case; a test driver executes it on the SUT, where a SUT conforming to the mutant yields a fail verdict.]

Fig. 3. Model-Based Mutation Testing.

If the SUT conforms to the model, i.e. the SUT implements the model correctly, the verdict will always be pass (assuming that the tool chain generates sound test cases). In case of non-conformance (¬ conforms), i.e. a bug exists, we may issue a fail verdict. However, due to the incompleteness of testing, we may miss the bug and issue a pass verdict. Dijkstra was referring to this incompleteness of testing when he pointed out that testing cannot show the absence of bugs. However, in model-based mutation testing, we can improve this situation considerably.

In model-based mutation testing, we mutate the models automatically and then generate an abstract test case that will cover this mutation. What this coverage means will be defined later, when we define the conformance relation. For now we want to point out an important difference to other testing techniques: if a bug exists and this bug is represented by the generated mutant, then the test case will find this bug. This important property is illustrated in Figure 3 by the two conformance arrows: if the SUT does not conform to the model, but conforms to the mutant, the execution of the generated test case will result in a fail verdict. Here we are assuming a deterministic implementation. For non-deterministic SUTs, we have to repeat the test cases a given number of times.

4 General Theory

In this section we present the general theory of our model-based mutation testing approach. The theory is general in the sense that it does not define what kind of conformance relation is used. It can be any suitable order relation. In the next sections, we will instantiate the conformance relation for transformational and reactive systems. The first property follows directly from Figure 3.

Theorem 1. Given a transitive conformance relation ⊑, then

(Model ⋢ SUT) ∧ (Mutant ⊑ SUT) ⇒ (Model ⋢ Mutant)

Proof. By contradiction: let us assume Model ⊑ Mutant; then by transitivity it follows from Mutant ⊑ SUT that Model ⊑ SUT. This is a contradiction to the assumption Model ⋢ SUT, hence Model ⋢ Mutant. □
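Theorem 1 depends only on the transitivity of the conformance relation, so it can be checked mechanically. A minimal Lean 4 rendering (the names Sys and conf are ours):

-- Any transitive conformance relation satisfies Theorem 1.
theorem fault_detection {Sys : Type} (conf : Sys → Sys → Prop)
    (trans : ∀ a b c, conf a b → conf b c → conf a c)
    (model mutant sut : Sys)
    (h1 : ¬ conf model sut) (h2 : conf mutant sut) :
    ¬ conf model mutant :=
  fun h => h1 (trans model mutant sut h h2)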

The theorem expresses the fact that if a SUT has a fault and this fault is captured in the mutant, then the mutant is non-conforming to the model, i.e. the mutant is non-equivalent. Our test case generation algorithm looks for the cases of non-conformance in Model ⋢ Mutant. These cases are then turned into test cases and executed on the SUT. Such a test case will detect if the SUT is an implementation of its Mutant.

Next, we characterize the test cases we are looking for. In general, a test case can be interpreted as a partial specification (model). It defines the expected output for one input, and the rest is undefined. In this sense, a test case is highly abstract, because every behaviour different from its input-output pair is underspecified. This view sometimes causes confusion, since the syntax of a test case is very concrete, but its semantics as a specification is very abstract. Consequently, if a SUT (always) passes a test case, we have conformance between the test case and the SUT:

Test case ⊑ SUT

If we generate a test case from a model, we have selected a partial behaviour such that the model conforms to this test case:

Test case ⊑ Model

If the SUT conforms to this model, we can relate all three:

Test case ⊑ Model ⊑ SUT

We can now define fault-detecting test cases:

Definition 1. Given a model and a mutant, its fault-detecting test case is (1) generated from the model and (2) kills the mutant, i.e.

Test case ⊑ Model ∧ Test case ⋢ Mutant

Such a test case only exists for non-equivalent mutants:

Theorem 2.

Model ⋢ Mutant iff ∃ Test case : (Test case ⊑ Model ∧ Test case ⋢ Mutant)

The theorem shows that the fault-detecting test case is the counter-example to conformance. We presented the specification-view on test cases first in the weakest-precondition semantics of the refinement calculus [1, 2]. The definition of fault-detecting test cases and their existence was developed in our mutation testing theory formulated in the relational semantics of the Unifying Theories of Programming (UTP) [6]. Next, we instantiate the general theory for transformational systems.

5 Transformational Systems

Transformational systems transform inputs and a pre-state to some output and post-state, and then they terminate. Hence, the model and mutant of a transformational system can be interpreted as predicates Model(s, s′) and Mutant(s, s′) describing their state transformations (s → s′). For such relational models, conformance is defined via implication in the standard way [16]:

Definition 2 (Conformance as Implication).

Model ⊑ Mutant =df ∀ s, s′ : Mutant(s, s′) ⇒ Model(s, s′)

Here conformance between a mutant and a model means that all behaviour of the mutant is allowed by the model. Consequently, non-conformance is expressed via the existence of a behaviour of the mutant that is not allowed by the model:

Theorem 3.

Model ⋢ Mutant = ∃ s, s′ : Mutant(s, s′) ∧ ¬Model(s, s′)

Note that this is a constraint satisfaction problem. Hence, a constraint solver can be used to search for a pre-state (input) s leading to the fault.
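In place of a real constraint solver, a brute-force search over a small domain already illustrates Theorem 3 for the triangle program and Mutant 1. The sketch below reuses KillCheck.tritype from the harness in Section 2; the bound 0..10 is an arbitrary choice of ours (a solver needs no such bound):

object FaultSearch {
  // Searches for Mutant(s, s′) ∧ ¬Model(s, s′): for a deterministic
  // program, an input whose mutant output differs from the single
  // output allowed by the model.
  def main(args: Array[String]): Unit = {
    val witnesses = for {
      a <- 0 to 10; b <- 0 to 10; c <- 0 to 10
      if KillCheck.tritype(mutant = true)(a, b, c) !=
         KillCheck.tritype(mutant = false)(a, b, c)
    } yield (a, b, c, KillCheck.tritype(mutant = false)(a, b, c))
    println(witnesses.headOption) // Some((3,2,2,isosceles))
  }
}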

Example 5. Contract languages, like e.g. the Java Modelling Language (JML), support the specification of the transition relation of a method. A contract of our triangle example would look very similar to the Scala code in Figure 1. Their predicative semantics would be equivalent. Let us consider the semantics of our triangle example and its mutant.

Mutant(a, b, c, res′) ∧ ¬Model(a, b, c, res′) =df

  (. . . ¬(a ≤ c − b ∨ a ≤ b − c ∨ b ≤ a − c) ∧ (a ≥ b ∧ b = c ∧ res′ = equilateral))

∧ ¬(. . . ¬(a ≤ c − b ∨ a ≤ b − c ∨ b ≤ a − c) ∧ (a = b ∧ b = c ∧ res′ = equilateral))

The mutated subterm, a ≥ b in place of a = b, is the difference in the semantics due to the mutation. Simplifying the formula results in the condition that all fault-detecting test cases must satisfy: a > b ∧ b = c ∧ res′ = equilateral. A constraint solver would produce, e.g., the solution a = 3, b = 2, c = 2, res′ = equilateral. This input together with the expected output of the original comprises the fault-detecting test case a = 3, b = 2, c = 2, res′ = isosceles. □

[Figure 4: a linear test case of 17 steps (states 0 to 17) alternating controllable (ctr) and observable (obs) events, beginning ctr Close, ctr Lock, obs after(20), obs AlarmArmed_SetOn, ctr Open, obs AlarmArmed_SetOff, obs OpticalAlarm_SetOn, obs AcousticAlarm_SetOn, obs after(30), obs AcousticAlarm_SetOff, obs after(270), ... and ending ctr Close, obs AlarmArmed_SetOn, ctr Unlock, obs pass.]

Fig. 4. An abstract test case for the car alarm system.

We developed this theory for transformational systems together with He Jifeng [6]. The technique was implemented with different solvers for different specification languages, e.g. OCL [11], Spec# [18], and the Reo connector language [19].

6 Reactive Systems

Reactive systems continuously react to their environment and do not necessarily terminate. Common examples of such systems are controllers and servers. The points of observation from a tester's perspective are controllable (input) and observable (output) events. A test case for such systems is a sequence of controllable and observable events in the deterministic case. For non-deterministic systems, test cases have to branch over all possible observations. Such tree-like test cases are known as adaptive test cases.

The operational semantics of such systems is usually given in terms of Labelled Transition Systems (LTS), and the abstract test cases are LTSs, too. Hence, in the deterministic case an abstract test case is a sequence of (input and output) labels.

Example 6. The car alarm system of Example 2 is a reactive system. Figure 4 shows a generated abstract test case for this system.

A prominent testing theory for this kind of semantics was developed by Tretmans [22]. Its conformance relation ioco is defined as follows.

Definition 3.

SUT ioco Model =df ∀ σ ∈ traces(Model) : out(SUT after σ) ⊆ out(Model after σ)

Here after denotes the set of reachable states after a trace σ, and out denotes the set of all observable events in a set of states. The observable events are all output events plus one additional quiescence event for indicating the absence of any output.
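For finite LTSs, Definition 3 can be executed directly. The following sketch (our own types; simplified in that quiescence only appears in the out-sets and suspension traces are not followed) checks ioco along all model traces up to a bounded depth:

object IocoSketch {
  type State = Int
  case class LTS(init: State, trans: Set[(State, String, State)],
                 inputs: Set[String]) {
    def next(ss: Set[State], a: String): Set[State] =
      for ((s, l, t) <- trans if ss(s) && l == a) yield t
    // Enabled outputs of a state set, plus "delta" (quiescence)
    // if some state enables no output at all.
    def out(ss: Set[State]): Set[String] = {
      val outs = for ((s, l, _) <- trans if ss(s) && !inputs(l)) yield l
      val quiescent = ss.exists(s =>
        !trans.exists { case (q, l, _) => q == s && !inputs(l) })
      if (quiescent) outs + "delta" else outs
    }
  }

  // sut ioco model, checked along all traces of the model up to `depth`.
  def ioco(sut: LTS, model: LTS, depth: Int): Boolean = {
    val labels = model.trans.map(_._2)
    def go(ms: Set[State], is: Set[State], d: Int): Boolean =
      sut.out(is).subsetOf(model.out(ms)) &&
        (d == 0 || labels.forall { a =>
          val ms2 = model.next(ms, a)
          ms2.isEmpty || go(ms2, sut.next(is, a), d - 1)
        })
    go(Set(model.init), Set(sut.init), depth)
  }
}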

[Figure 5: three LTS fragments over the events !flashOn, !soundOn, !soundOff and ?unlock; in the product graph, the common behaviour leads to pass states, while the mutant's unexpected !soundOff leads to a fail state.]

Fig. 5. Labelled transition systems of a part of a non-deterministic model of the car alarm system (left), a mutant (centre), and their synchronous product graph (right).

This input-output conformance relation ioco supports non-deterministic models (see the subset relation) as well as partial models (only traces of the Model are tested). For input-complete models, ioco is equivalent to trace inclusion (language inclusion).

Example 7. The left-hand side of Figure 5 shows the LTS semantics of switching on both kinds of alarm non-deterministically. Exclamation marks denote observable events, question marks controllable events. In this model, either the flash or the sound is switched on first. An implementer may decide for one of the two interleavings according to ioco. He might even add additional controllable events at any point, like the ?unlock event in the LTS at the centre. However, the subset relation of output events has to be respected. Therefore, it is the !soundOff event in the mutant in the centre that causes non-conformance. □

6.1 Explicit Conformance Checking

The conformance between a model and its mutant can be checked by building the synchronous product of their LTSs modulo ioco. The right-hand side of Figure 5 shows this product graph for our example. Product modulo ioco means that we limit the standard product construction if the mutant has either (1) an additional (unexpected) output event (here !soundOff), or (2) an additional input event (here ?unlock). In the first case, we have detected non-conformance and add a fail state after the unexpected event. Furthermore, we add all expected observables of the model. In the second case, we stop the exploration, because we have reached an unspecified input behaviour.

Different strategies for extracting a test case from such a product graph exist. We can select a linear or adaptive test case, the shortest path or a random path to a fail state, and cover each fail state or only one. Our experiments have shown that a combination of random and lazy shortest-path strategies works well [3]. Lazy refers to the strategy of generating new test cases only if the existing test cases do not kill a mutant.
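The shortest-path strategy itself is plain breadth-first search on the product graph, as the following sketch with our own graph representation shows; the lazy variant would first run the existing suite on the mutant and search only if no test kills it:

object TestExtraction {
  type Node = Int
  // Shortest event sequence from `init` to a fail state of the product
  // graph, or None if no fail state is reachable (equivalent mutant).
  def shortestToFail(init: Node,
                     edges: Map[Node, List[(String, Node)]],
                     fail: Set[Node]): Option[List[String]] = {
    val queue = scala.collection.mutable.Queue((init, List.empty[String]))
    val seen  = scala.collection.mutable.Set(init)
    while (queue.nonEmpty) {
      val (n, revPath) = queue.dequeue()
      if (fail(n)) return Some(revPath.reverse)
      for ((label, m) <- edges.getOrElse(n, Nil) if seen.add(m))
        queue.enqueue((m, label :: revPath))
    }
    None
  }
}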

We have applied this explicit conformance-checking technique to several case studies: testing an HTTP server using LOTOS models [5], SIP servers using LOTOS models [23], controllers using UML models [3], and, most challenging, hybrid systems [14] using Action Systems extended with qualitative reasoning models [4].

var closed : Bool := false;
    locked : Bool := false;
    armed  : Bool := false;
    sound  : Bool := false;
    flash  : Bool := false;

actions
  Close   :: ¬closed → closed := true;
  Open    :: closed → closed := false;
  SoundOn :: armed ∧ ¬closed ∧ ¬sound → sound := true;
  FlashOn :: armed ∧ ¬closed ∧ ¬flash → flash := true;
  . . .

do
     Close
  [] Open
  [] SoundOn; FlashOn
  [] FlashOn; SoundOn
  . . .
od

Fig. 6. Action System model of the car alarm system.

Explicit checking works well with event-oriented systems, but we ran into scalability issues with parametrised events. Therefore, we have developed a second line of tools using symbolic conformance checkers.

6.2 Symbolic Conformance Checking

The idea is to use a similar approach as for transformational systems. Therefore, we have decided to use Back's Action Systems [12] as our input language. Action systems are a kind of guarded command language for modelling concurrent reactive systems. They are similar to Dijkstra's iterative statement.

Example 8. Figure 6 shows an Action System model of our car alarm system example. First, the model variables and their initial values are declared. Next, the actions in the form of guarded commands are listed. Note that each action is labelled, establishing the link to the LTS semantics. The do-od block forms the protocol layer of actions, which further restricts the possible order of actions. The standard composition operator for actions is non-deterministic choice (A [] B); however, sequential (A ; B) or prioritised compositions (A // B) are also possible. The protocol layer establishes a loop that iterates while any action is enabled. Action Systems terminate if all actions are disabled. □
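Operationally, such an action system is straightforward to encode: a state record, a list of guarded actions, and a loop that repeatedly picks one enabled action. A minimal Scala rendering of the fragment of Figure 6 (our own encoding; the protocol layer is ignored, so every enabled action may fire):

object CarAlarmAS {
  case class S(closed: Boolean = false, locked: Boolean = false,
               armed: Boolean = false, sound: Boolean = false,
               flash: Boolean = false)

  case class Action(label: String, guard: S => Boolean, body: S => S)

  val actions = List(
    Action("Close",   s => !s.closed,                        s => s.copy(closed = true)),
    Action("Open",    s => s.closed,                         s => s.copy(closed = false)),
    Action("SoundOn", s => s.armed && !s.closed && !s.sound, s => s.copy(sound = true)),
    Action("FlashOn", s => s.armed && !s.closed && !s.flash, s => s.copy(flash = true)))

  // One run: while some action is enabled, choose one non-deterministically
  // (here: randomly) and record its label; terminate when none is enabled.
  def run(s0: S, steps: Int): List[String] = {
    var (s, trace) = (s0, List.empty[String])
    for (_ <- 1 to steps) {
      val enabled = actions.filter(_.guard(s))
      if (enabled.isEmpty) return trace.reverse // action system terminates
      val a = enabled(scala.util.Random.nextInt(enabled.size))
      s = a.body(s); trace ::= a.label
    }
    trace.reverse
  }
}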

Originally, Action Systems were defined in weakest-precondition semantics. However, for our purposes a relational semantics suffices. Therefore, we have given Action Systems a predicative semantics in the style of UTP, as shown in Figure 7 [7].

The state changes of actions are defined via predicates relating the pre-state of variables s and their post-state s′. Furthermore, the labels form a visible trace of events tr that is updated to tr′ whenever an action runs through. Hence, a guarded action's transition relation is defined as the conjunction of its guard g, the body of the action B, and the appending of the action label l to the previously observed trace. In case of parameters x, these are added as local variables to the predicate. An assignment updates one variable x with the value of an expression e and leaves the rest unchanged. Sequential composition is standard: there must exist an intermediate state s0 that can be reached from the first body predicate and from which the second body predicate can lead to its final state. Finally, non-deterministic choice is defined as disjunction. The semantics of the do-od block is as already mentioned: while actions are enabled in the current state, one of the enabled actions is chosen non-deterministically and executed. An action is enabled in a state if it can run through, i.e. if a post-state exists such that the semantic predicate can be satisfied.

l :: g → B            =df  g ∧ B ∧ tr′ = tr ⌢ [l]
l(x) :: g → B         =df  ∃ x : g ∧ B ∧ tr′ = tr ⌢ [l(x)]
x := e                =df  x′ = e ∧ y′ = y ∧ . . . ∧ z′ = z
g → B                 =df  g ∧ B
B1(s, s′) ; B2(s, s′) =df  ∃ s0 : B1(s, s0) ∧ B2(s0, s′)
B1 [] B2              =df  B1 ∨ B2

Fig. 7. Predicative UTP semantics of Action Systems.

This semantics is already close to a constraint satisfaction problem. However, the existential quantifiers need to be eliminated first, before we can negate the formula. For further details see [8].

With this predicative semantics, conformance can be defined via implication, too. However, we also have to take the reachability via a trace of actions into account. For mutation testing, we are interested in non-conformance, which can be defined as follows.

Definition 4. Given two Action Systems, a model and its mutant, then non-conformance is given iff

∃ s, s′, tr, tr′ : reachable(s, tr) ∧ Mutant(s, s′, tr, tr′) ∧ ¬Model(s, s′, tr, tr′)

The predicate reachable(s, tr) holds if a state s is reachable via trace tr from the initial state of the Action System.

This definition of non-conformance is much stronger than non-ioco, since any difference in the post-state s′ causes non-conformance. Here, we also do not distinguish between input and output labels. However, it is very efficient to search for, since reachability has to be computed on the model only.
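Definition 4 suggests a simple bounded procedure: enumerate the model's reachable states up to a given depth, which realises reachable(s, tr), and at each state check whether the mutant offers a labelled step that the model does not. A schematic sketch over the encoding of Example 8 (our own names; the actual tools perform this search symbolically with a constraint or SMT solver [8, 9]):

object ConformanceCheck {
  import CarAlarmAS.{S, Action}

  // Labelled successor relation of a list of guarded actions.
  def steps(actions: List[Action], s: S): Set[(String, S)] =
    actions.collect { case a if a.guard(s) => (a.label, a.body(s)) }.toSet

  // Depth-bounded search for a model-reachable state where the mutant
  // makes a (label, post-state) step the model does not allow. The depth
  // bound also guards against revisiting states in cyclic systems.
  def nonConformance(model: List[Action], mutant: List[Action],
                     init: S, depth: Int): Option[(List[String], String)] = {
    def go(s: S, revTr: List[String], d: Int): Option[(List[String], String)] =
      steps(mutant, s).find(st => !steps(model, s).contains(st)) match {
        case Some((label, _)) => Some((revTr.reverse, label))
        case None if d > 0 =>
          steps(model, s).view
            .flatMap { case (l, t) => go(t, l :: revTr, d - 1) }
            .headOption
        case None => None
      }
    go(init, Nil, depth)
  }
}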

We have recently implemented two symbolic conformance checkers using this formula. One is implemented in SICStus Prolog and uses Constraint Logic Programming. The other is implemented in Scala and uses the SMT solver Z3. In our first experiments, both show similar performance [9].

For an ioco check of input-complete models, non-conformance amounts to a language-inclusion check:

∃ s1, s′1, s2, s′2, tr, !a : reachable(Mutant, tr, s1) ∧ reachable(Model, tr, s2)

∧ Mutant(s1, s′1, tr, tr ⌢ [!a]) ∧ ¬Model(s2, s′2, tr, tr ⌢ [!a])

Here, non-conformance is only due to a common trace leading to an output label (output action) in the mutant that is not allowed by the model. Note that this ioco formula holds for deterministic models only. In the non-deterministic case, we have to check that none of the reachable states leads to an unexpected observation.

Most recently, we have implemented a similar formula in a test-case generator for timed automata (TA). The tool reads TA models and checks for tioco, a timed version of ioco. Here, Scala and the Z3 solver are used [10].

7 Conclusions

We have shown how model-based testing and mutation testing can be combined into model-based mutation testing. We started with transformational systems and then developed explicit and symbolic techniques for reactive systems. Several tools have been implemented that show that the approach is feasible.

The underlying test-case generation techniques are closely related to formal semantics. With a precise semantics we can define our notion of conformance. Non-conformance is the basis for our fault models and test-case generation algorithms. Test cases are derived from counter-examples of a conformance check. With a predicative semantics, such counter-examples may be found using constraint or SMT solvers.

The novelty in this research is the general theory and the test-case generators that can deal with non-deterministic models. For related work we refer to the recent survey on mutation testing, where model-based mutation testing is also covered [17].

The presented work shows that model-based mutation testing involves a variety of research directions and is far from being a closed case. As of today, no commercial tool has adopted this technique yet. Scalability is certainly an issue, but we firmly believe that advances are possible.

Acknowledgement. The recent research has received funding from the ARTEMIS Joint Undertaking under grant agreement No 269335 and from the Austrian Research Promotion Agency (FFG) under grant agreement No 829817 for the implementation of the project MBAT, Combined Model-based Analysis and Testing of Embedded Systems. The work was also funded by the Austrian Research Promotion Agency (FFG), program line "Trust in IT Systems", project number 829583, TRUst via Failed FALsification of Complex Dependable Systems Using Automated Test Case Generation through Model Mutation (TRUFAL).

References

1. Bernhard K. Aichernig. Test-case calculation through abstraction. In Jose Nuno Oliveira and Pamela Zave, editors, Proceedings of Formal Methods Europe 2001, FME 2001: Formal Methods for Increasing Software Productivity, March 12–16, 2001, Berlin, Germany, volume 2021 of Lecture Notes in Computer Science, pages 571–589. Springer-Verlag, 2001.

2. Bernhard K. Aichernig. Mutation testing in the refinement calculus. Formal Aspects of Computing, 15(2–3):280–295, 2003.

3. Bernhard K. Aichernig, Harald Brandl, Elisabeth Jöbstl, and Willibald Krenn. Efficient mutation killers in action. In IEEE Fourth International Conference on Software Testing, Verification and Validation, ICST 2011, Berlin, Germany, March 21–25, 2011, pages 120–129. IEEE Computer Society, 2011.

4. Bernhard K. Aichernig, Harald Brandl, and Franz Wotawa. Conformance testing of hybrid systems with qualitative reasoning models. In B. Finkbeiner, Y. Gurevich, and A. K. Petrenko, editors, Proceedings of the Fifth Workshop on Model Based Testing (MBT 2009), York, England, 22 March 2009, volume 253(2) of Electronic Notes in Theoretical Computer Science, pages 53–69. Elsevier, October 2009.

5. Bernhard K. Aichernig and Carlo Corrales Delgado. From faults via test purposes to test cases: on the fault-based testing of concurrent systems. In Luciano Baresi and Reiko Heckel, editors, Proceedings of FASE'06, Fundamental Approaches to Software Engineering, Vienna, Austria, March 27–29, 2006, volume 3922 of Lecture Notes in Computer Science, pages 324–338. Springer-Verlag, 2006.

6. Bernhard K. Aichernig and Jifeng He. Mutation testing in UTP. Formal Aspects of Computing, 21(1–2):33–64, February 2009.

7. Bernhard K. Aichernig and Elisabeth Jöbstl. Towards symbolic model-based mutation testing: Combining reachability and refinement checking. In 7th Workshop on Model-Based Testing (MBT 2012), volume 80 of EPTCS, pages 88–102, 2012.

8. Bernhard K. Aichernig and Elisabeth Jöbstl. Towards symbolic model-based mutation testing: Pitfalls in expressing semantics as constraints. In Workshops Proceedings of the 5th International Conference on Software Testing, Verification and Validation (ICST 2012), pages 752–757. IEEE Computer Society, 2012.

9. Bernhard K. Aichernig, Elisabeth Jöbstl, and Matthias Kegele. Incremental refinement checking for test case generation. In TAP 2013: 7th International Conference on Tests & Proofs, Budapest, Hungary, June 18–19, 2013, Lecture Notes in Computer Science, pages 1–19. Springer, 2013.

10. Bernhard K. Aichernig, Florian Lorber, and Dejan Ničković. Time for mutants: Model-based mutation testing with timed automata. In TAP 2013: 7th International Conference on Tests & Proofs, Budapest, Hungary, June 18–19, 2013, Lecture Notes in Computer Science, pages 20–39. Springer, 2013.

11. Bernhard K. Aichernig and Percy Antonio Pari Salas. Test case generation by OCL mutation and constraint solving. In Kai-Yuan Cai and Atsushi Ohnishi, editors, QSIC 2005, Fifth International Conference on Quality Software, Melbourne, Australia, September 19–21, 2005, pages 64–71. IEEE Computer Society, 2005.

12. Ralph-Johan Back and Reino Kurki-Suonio. Decentralization of process nets with centralized control. In Proceedings of the 2nd ACM SIGACT-SIGOPS Symposium on Principles of Distributed Computing, pages 131–142, Montreal, Quebec, Canada, 1983. ACM.

13. Joshua Bloch. Extra, extra - read all about it: Nearly all binary searches and mergesorts are broken. Google Research Blog, June 2006. http://googleresearch.blogspot.com/2006/06/extra-extra-read-all-about-it-nearly.html (last visit May 17, 2013).

14. Harald Brandl, Martin Weiglhofer, and Bernhard K. Aichernig. Automated conformance verification of hybrid systems. In Ji Wang, W. K. Chan, and Fei-Ching Kuo, editors, Proceedings of the 10th International Conference on Quality Software, QSIC 2010, Zhangjiajie, China, 14–15 July 2010, pages 3–12. IEEE Computer Society, 2010.

15. Edsger W. Dijkstra. The humble programmer. Communications of the ACM, 15(10):859–866, October 1972.

16. C. A. R. Hoare and Jifeng He. Unifying Theories of Programming. Prentice-Hall International, 1998.

17. Yue Jia and Mark Harman. An analysis and survey of the development of mutation testing. IEEE Transactions on Software Engineering, 37(5):649–678, 2011.

18. Willibald Krenn and Bernhard K. Aichernig. Test case generation by contract mutation in Spec#. In B. Finkbeiner, Y. Gurevich, and A. K. Petrenko, editors, Proceedings of the Fifth Workshop on Model Based Testing (MBT 2009), York, England, 22 March 2009, volume 253(2) of Electronic Notes in Theoretical Computer Science, pages 71–86. Elsevier, October 2009.

19. Sun Meng, Farhad Arbab, Bernhard K. Aichernig, Lacramioara Astefanoaei, Frank S. de Boer, and Jan Rutten. Connectors as designs: Modeling, refinement and test case generation. Science of Computer Programming, in press, corrected proof, 2011.

20. Karl Popper. Logik der Forschung. Mohr Siebeck, 10th edition, 2005.

21. Stefan Tiran. On the effects of UML modeling styles in model-based mutation testing. Master's thesis, Institute for Software Technology, Graz University of Technology, 2013.

22. Jan Tretmans. Test generation with inputs, outputs and repetitive quiescence. Software – Concepts and Tools, 17(3):103–120, 1996.

23. Martin Weiglhofer, Bernhard Aichernig, and Franz Wotawa. Fault-based conformance testing in practice. International Journal of Software and Informatics, 3(2–3):375–411, June/September 2009. Special double issue on Formal Methods of Program Development edited by Dines Bjoerner.