S-CUBE LP: Using Data Properties in Quality Prediction
S-Cube Learning Package
Using Data Properties in Quality Prediction
Universidad Politécnica de Madrid (UPM)
Learning Package Categorization
S-Cube
WP-JRA-1.3: End-to-End Quality Provision and SLA Conformance
Quality Assurance and Quality Prediction
Using Data Properties in Quality Prediction
Service Compositions and QoS
Service compositions are an essential element of the Service-Oriented Architecture (SOA):
• Putting together several “lower level” (specialized) services
• Leveraging low coupling and platform independence
• Achieving a more complex goal, e.g. a business process
• Often cross-organizational, i.e. using services from different providers

Quality of Service (QoS) for compositions is often critically important:
• Relates to composition-level running time, computational cost, bandwidth, etc.
• Depends on the QoS of component services + composition internals + environment factors (such as system and network loads/failures)
• Can affect business-level KPIs (key performance indicators)
• Influences applicability and usability in a particular context
• Constrained by a Service-Level Agreement (SLA)
Learning Package Overview
1 Problem Description
2 Using Data Properties in Quality Prediction
3 Discussion
4 Conclusions
1 Problem Description
Components Impacting Orchestration QoS
Two groups of factors are usually encountered when analyzing the QoS of a service composition:
• External variations:
  ◦ Bandwidth, current load and throughput, network status
  ◦ Behavior of component services (e.g., meeting deadlines?)
  ◦ Usually not under the designer’s control; they change dynamically
• Composition structure:
  ◦ What does it do with incoming requests?
  ◦ Which other services are invoked, and how?
  ◦ Partially under designer control, known in advance.

Focusing on the latter: what kind of knowledge about composition behavior can we extract to predict composition QoS?

Besides, can we make the prediction more precise by taking into account characteristics of the data fed to the composition?
Automotive Scenario Example
Suppose you are a car-part provider hired by a factory to purchase a series of parts for its assembly line.
• You are given a list of parts and their quantities.
• The parts must come from the same maker (be mutually compatible).
• You contact a number of part makers and reserve each of the parts in the required quantity.
• If a maker cannot provide all parts, you cancel all reserved parts from that maker and move on to another maker.
[Diagram: the Factory contacts the Provider, which contacts Maker 1 … Maker K.]
• Time is essential: you want the process to take the least amount of time and to include the smallest number of cancellations.
Automotive Scenario Example (contd.)

In the service world, you publish to the Client (the factory) your Provider service, which uses a series of Maker services.
[Sequence diagram: the Client sends a Request to the Provider; the Provider sends part requests (and possibly Cancel messages) to Maker 1 … Maker K, each answering OK / not OK.]
• The protocol requires reserving one car-part type at a time. If a maker answers with “not OK,” the provider sends “Cancel” messages for all reserved parts and starts reserving from another maker.

The total time is linked to the computation cost of serving the client.
• It depends heavily (among other things) on the number of parts (in the input message) and the characteristics of individual makers.
Computation Cost of Service Networks

[Diagram: provider A can bind to either maker B1 or maker B2.]

TA(n) = 2n + 3 + n·S(n)
TB1(n) = n + 1
TB2(n) = 0.1n + 7

Binding to B1? Binding to B2?
[Plot: QoS / computational cost vs. input data size (for a given metric, n = 4..10) for the two bindings A+B1 and A+B2.]
The input message is abstracted as the number of parts n.

The time TA for the provider (A) depends on n and the time S(n) of the chosen maker (B1 or B2).

The structural part 2n + 3 in TA does not depend on the choice of maker.

The graph shows the QoS / computation cost for the two possible bindings:

TA with B1(n) = 2n + 3 + n(n + 1) = n² + 3n + 3
TA with B2(n) = 2n + 3 + n(0.1n + 7) = 0.1n² + 9n + 3
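Taking the cost functions above at face value, a small Python sketch (hypothetical function names) of the data-dependent binding decision:

```python
# Hypothetical sketch of the running example: composition cost TA(n)
# depends on the input size n and on the cost S(n) of the chosen maker.
def t_b1(n):           # maker B1: S(n) = n + 1
    return n + 1

def t_b2(n):           # maker B2: S(n) = 0.1n + 7
    return 0.1 * n + 7

def t_a(n, s):         # provider A: structural part 2n + 3 plus n maker calls
    return 2 * n + 3 + n * s(n)

def cheaper_binding(n):
    """B2's larger constant pays off only for larger n."""
    return "B1" if t_a(n, t_b1) < t_a(n, t_b2) else "B2"
```

For the functions above the curves cross between n = 6 and n = 7, so the best binding genuinely depends on the input data size.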
Ivanovic et al. (UPM, IMDEA) Data-Aware QoS-Driven Adaptation 2010-07-07 5 / 1
Computation Cost of Service Networks
Computation cost information for B1 and B2 can be made available together with other service-related information (e.g., WSDL extensions):
• Computation cost expressed as a function of some metrics of the input data.
• Relationships between the size of the input data and the size of the output data (when they exist).

A should in turn publish synthesized information (for reuse in other compositions involving A).

Such abstract descriptions of computation cost do not compromise the privacy of implementation details.
• They act as higher-level contracts on composition behavior.
Problem
Inferring, representing, and using computation cost information for service compositions for QoS prediction.
2 Using Data Properties in Quality Prediction
Overview of the Approach
[Diagram: BPEL and WSDL descriptions are translated into an intermediate language, then into a logic program, which is analyzed; the analysis results feed back to the original descriptions.]
1 Service / orchestration descriptions are represented in an intermediate language.
  • Provides independence from the source language (BPEL, Windows Workflow, etc.)
2 The intermediate representation is translated into an (annotated) logic program.
  • Can capture just the relevant characteristics of the orchestration.
3 The logic program is analyzed for computation cost bounds.
4 The analysis results are useful for design-time quality prediction, predictive monitoring, matchmaking, etc.
Background: Alternatives in S-Cube
Other S-Cube Approaches Include:
Detecting Possible SLA Violations Using Data Mining
• Extracting information from event logs of successful and failed executions of a composition, in combination with event monitoring, to identify critical points and influential factors that are used as predictors of possible SLA violations.

Using Online Testing to Predict Fault Points in Compositions
• Using model-checking-based techniques on post-mortem traces of failed composition executions to identify activities that are likely to fail, both at the level of the composition definition and in particular cases of executing instances.
Benefits of the Computation Cost Approach

Statistical approaches: structural and environmental factors contribute to QoS variability:

[Diagram: environment factors and structural factors jointly determine QoS.]

• Hard to separate structural & environmental variations.
• The whole range of input data may not be represented / sampled.
• Runs may not be representative.
• Results reflect historic variations in the environment.

Structural approaches with data information: safe approximations of structural contributions.

[Diagram: environment factors and structural factor bounds are separately composed into QoS.]

• Structural and environmental factors are separately composed into QoS.
• The entire input data range is accounted for.
• Results are safe and hold for all possible runs.
• Results reflect current variations in the environment.
Computation Cost Analysis and SOA
The computation cost approach relies on applying static cost analysis to service orchestrations:
• Traditionally concerned with running time: number of execution steps, worst-case execution time (WCET).
• Generalized to counting and measuring events: number of iterations, number of partner service invocations, number of exchanged messages, network traffic (number of bytes sent/received).

Data awareness: bounds expressed as functions of the input data.
• Magnitude of scalars: floating-point, ordinal and cardinal values.
• Measures of data structures: number of items in a list, depth of a tree, size of a collection.

Leveraging existing analysis tools.
• In this case, for logic programs.
Approximating Actual Behavior with Upper and Lower Bounds

Cost analysis (either automatic or manual) often can only determine safe upper and lower bounds on the computation cost; the exact computation cost function lies somewhere in between.
[Plot: upper and lower bounds of QoS / computational cost vs. input data size (for a given metric) for the two bindings A+B1 and A+B2.]
Assumption: different instances of the same event type contribute equally to the overall computation cost.

Safe computation cost bounds are combined with current environment parameters from monitoring (e.g., network speed) to produce QoS bounds.

QoS bounds approximated by combining cost bounds and environment factors are not strictly safe, but:
• More informed than data-unaware, single-point predictions, static bounds, or averages.
• Can be used to predict the future behavior of a composition.
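A minimal sketch of this combination, under the assumption that cost bounds are expressed in abstract steps and monitoring supplies a time-per-step environment factor (all names hypothetical):

```python
# Sketch: turn computation-cost bounds (in abstract steps, as functions
# of input size n) into predicted QoS (time) bounds by scaling with the
# currently monitored environment factor (e.g., seconds per step).
def qos_bounds(cost_lb, cost_ub, n, env_factor):
    return cost_lb(n) * env_factor, cost_ub(n) * env_factor

# Example with linear step-count bounds for input size n = 10:
lo, hi = qos_bounds(lambda n: 5 * n, lambda n: 8 * n, 10, env_factor=0.02)
```

If the environment factor later changes, the same cost bounds can simply be re-scaled, which is what makes the prediction adaptive.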
Benefits of Upper/Lower Bounds Approach
[Four schematic plots: QoS vs. input data measure, combining data-insensitive vs. data-sensitive predictions with an average-case vs. upper/lower-bounds focus.]

Focus on the average case:
• Good for aggregate measures.
• Usually simpler to calculate.
• Not very informative for individual running instances.

Focus on upper / lower bounds:
• Can be combined with the average-case approach.
• More difficult to calculate.
• Useful for monitoring / adapting individual running instances.
General idea: More information ⇒ more precision
Orchestration Intermediate Language
Intermediate language (partly) inspired by common BPEL constructs:
Data types: XML-style data structures with basic (string, Boolean, number) and complex types (structures, lists, optionality).

Expression language: XPath restricted to child/attribute navigation that can be resolved statically. Basic arithmetic/logical/string operations.

Basic constructs: assignment, sequence, branching, and looping.

Partner invocation: invoke follows the synchronous pattern. The moment of reply reception is not accounted for.
Scopes and fault handlers: usual lexical scoping and exception processing.
Parallel flows: using logical link dependencies.
Translation into Logic Program

Service: Translated into a logic predicate expressing a mapping from the input message to a reply or a fault.

Invocation: Translated into a predicate call. Returns a reply or a fault.

Assignment: Passes the expression value to subsequent predicate calls.

Branching: Mutually exclusive clauses for the then and else parts.

Looping: Recursive predicate with a base case that corresponds to the loop exit condition.

Scopes: Sub-predicates for the scope body and each defined fault handler.

Flows: Statically serialized according to logical link dependencies.
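As an illustration of the looping rule (a Python analogue, not actual translation output), a "reserve n parts" loop becomes a recursive definition whose base case is the loop exit condition:

```python
# Sketch: an iterative "reserve n parts" loop rewritten as a recursive
# function, mirroring how looping constructs become recursive predicates
# with a base case for the loop exit condition. 'invocations' counts the
# analyzed resource (partner invocations), one per reserved part.
def reserve(n, invocations=0):
    if n == 0:                                  # base case = loop exit
        return invocations
    return reserve(n - 1, invocations + 1)      # one invocation per part
```

On such a definition a cost analyzer can read off the recurrence T(n) = T(n - 1) + 1, T(0) = 0, and hence the bound n.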
Concrete Semantics and Resource Consumption
The resulting logic program does not aim to mimic the operational semantics of, e.g., BPEL processes.

It reflects just the semantics necessary for resource analyzers to infer computation costs with minimal precision loss.
Obtaining Computation Cost Functions
Example analysis of a simple scenario (one provider - one maker):
[Sequence diagram: the Client sends a Request to the Provider; the Provider sends part requests (and possibly Cancel messages) to the Maker, which answers OK / not OK.]
• “not OK” is treated as a fault by the provider.
• Two analysis variants: without fault handling (the ideal case) and with fault handling (the general case).

As the generalized resource to be analyzed, here we take the number of Provider→Maker invocations for different n.
• It can be related to Key Performance Indicators (KPIs):
  ◦ Some events are related to business value for the provider and/or maker.
  ◦ E.g., minimizing cancellations (undesirable in general).
Example of Analysis Results
Computation cost analysis results are returned as upper and lower bound functions of n (the number of parts to reserve).
• These functions express the number of events:
  ◦ executions of simple activities in the orchestration
  ◦ reservations of a single part type
  ◦ cancellations of previously reserved types
• In the case without fault handling, we assume that each invocation is successful (i.e., the optimistic case).
Resource                   With fault handling        Without fault handling
                           lower bound  upper bound   lower bound  upper bound
No. of simple activities   2            7n            5n + 2       5n + 2
Single reservations        0            n             n            n
Cancellations              0            n − 1         0            0
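The bound functions from the table (fault-handling case) can be written down directly; a small sketch, with a hypothetical consistency check such as a monitor might perform:

```python
# Sketch of the inferred bound functions (fault-handling case) from the
# analysis table: number of events as a function of n, the number of
# parts to reserve. Each entry is a (lower_bound, upper_bound) pair.
bounds_with_faults = {
    "simple_activities":   (lambda n: 2, lambda n: 7 * n),
    "single_reservations": (lambda n: 0, lambda n: n),
    "cancellations":       (lambda n: 0, lambda n: n - 1),
}

def within_bounds(resource, n, observed):
    """Is an observed event count consistent with the safe bounds?"""
    lb, ub = bounds_with_faults[resource]
    return lb(n) <= observed <= ub(n)
```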
3 Discussion
Application to Predictive Monitoring
[Plot: a QoS metric accumulating over time, with observation points A, B, C, D; curves for the initially expected behavior, the actual profile, and updated predictions after observations B and C, against the maximum allowed value (Max).]
The notion of pending QoS: the remaining metric until the composition finishes.

At point B, a deviation from the initial prediction is detected ⇒ it must come from the environment. The updated prediction (densely dotted) for D is still within range.

At point C, a further deviation is detected. The updated prediction (loosely dotted) can fall outside the range ⇒ a violation of QoS concerns can be predicted ahead of time.
Experiment in Predictive Monitoring

Simulation of a service-to-service call with time constraint Tmax:
• Service A is invoked with an input message of size n in the range 1..50.
• A invokes service B between 50 and 100 times for n = 1, and between 250 and 500 times for n = 50 (the bounds are linear).
• B performs between 8 and 16 steps on each invocation.
• Each iteration of A and each step of B take some time between known bounds. Message and reply transfer times are environment factors.

During the execution of an orchestration instance for a given n, the system takes into account:
• the known computation cost bounds (iterations and steps, as above)
• the current environment factors

and gives the following signals:
• OK: time limit compliance guaranteed
• Warn: time limit violation possible
• Alarm: time limit violation certain

The actual results are: OK for time limit compliance and ¬OK for violation.
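The signalling rule can be sketched as follows (hypothetical names; the predicted interval [t_lo, t_hi] for the remaining work is assumed to come from the cost bounds and the current environment factors):

```python
# Sketch of the monitor's decision rule: compare the predicted interval
# for the remaining running time against the time budget left.
def signal(t_elapsed, t_lo, t_hi, t_max):
    if t_elapsed + t_lo > t_max:
        return "Alarm"   # violation certain: even the lower bound overshoots
    if t_elapsed + t_hi > t_max:
        return "Warn"    # violation possible: only the upper bound overshoots
    return "OK"          # compliance guaranteed
```

Alarm/OK outcomes (false alarms) are impossible by construction as long as the lower bound is safe, which matches the experimental results below.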
Experiment in Predictive Monitoring (Cont.)

Scenario 1: Environment factors suddenly double (on average) at time Tmax/3 into the execution of a composition instance.

[Bar plots: for each input size n = 1..50, the percentage of runs falling into each outcome class: OK, Warn/OK, Warn/¬OK, Alarm/¬OK.]
Fig. 6. Ratio of true and false positives for two environmental regimes.
Under the first regime, composition executions for small values of n take little time to complete, so they comply with the time limit (marked by OK) and no alerts are raised. For slightly larger input sizes (e.g. n = 9), executions still comply with the time limit, but warnings are raised (Warn/OK), since the monitor’s estimate of the upper-bound running time exceeds Tmax. As n increases, the number of false warning positives decreases in favor of the true warning positives (Warn/¬OK), because the average running time increases, and with it the possibility of the execution being affected by a sudden deterioration of the environment factors. In the same region (around n = 20) some warnings start to be promoted to alarms, as the lower-bound time estimates increasingly start to overshoot Tmax. These cases are marked with Alarm/¬OK, and they are all true positives, since the system degrades monotonically (things never get better). Further increases in n (to around 30) lead to a rapid disappearance of the false warning positives. After n = 38, all executions fall into Alarm/¬OK, because the monitor is always able to detect ahead of time that the lower execution time bound overshoots the time limit.

In the second regime in Figure 6, featuring a gradual degradation, the upper execution time bound overshoots Tmax in some cases even for very small input sizes (e.g. n = 3). A warning is then raised although no actual violations happen (executions are OK). As n increases, a pattern similar to that of the first regime is followed. For large values of n (but before the point at which it happened in the previous regime), all alarms rightly correspond to the ¬OK case.

6 Concluding Remarks

We have sketched a resource analysis for service orchestrations based on a translation to an intermediate programming language for which complexity analyzers are available. The translation process approximates the behavior of the original process network in such a way that the analysis results (the cost functions) are valid for the original network. We have presented a mechanism to use these functions, together with environmental characteristics, to predict the future behavior of the system even when the environment deviates from its expected behavior. We have applied them to perform predictive monitoring, and the approach has been validated with a simulation which detects when the complexity bounds and the actual (simulated) execution cross the deadline and extracts statistical data regarding the accuracy of the predictions.
• For small n, violations are not predicted and do not happen (OK).
• For slightly larger n, some false warnings arise (Warn/OK).
• For large n, false warnings yield to true violation warnings (Warn/¬OK) and true alarms (Alarm/¬OK).
• There are no false alarms (Alarm/OK).

Conclusion: very good prediction accuracy, with some false warnings in the lower mid-range of n.
Experiment in Predictive Monitoring (Cont.)

Scenario 2: Environment factors gradually deteriorate (quadrupling on average) during the period Tmax from the start of the execution.

[Bar plots: for each input size n = 1..50, the percentage of runs falling into each outcome class: OK, Warn/OK, Warn/¬OK, Alarm/¬OK.]
• For small n, violations do not happen (OK), but there are some false warnings (Warn/OK).
• For larger n, false warnings yield to true violation warnings (Warn/¬OK) and true alarms (Alarm/¬OK).
• There are again no false alarms (Alarm/OK).

Conclusion: when conditions gradually deteriorate, the prediction tends to become more accurate on average.
Experiment in Proactive Adaptation
[Diagram and plot: the Client chooses among first-tier providers P1..PN with upper-bound cost functions UB1(m)..UBN(m); each provider invokes second-tier makers S1..SN with upper-bound cost functions ub1(n)..ubN(n). The plot shows the family of upper-bound functions ub_1(x)..ub_12(x) and their least upper bound lub(x).]
The Client chooses a provider Pj from the first tier of services, passing the input argument m = 0..50.

The chosen provider chooses, M = 5 times, a part maker (from the second tier) with the input n = m.

The plot depicts the family of upper-bound functions for the structural computation cost of the first and second tiers.

The structural computation cost models the number of messages exchanged (excluding messages between the tiers).

A fault rate is used to model service unavailability.
Experiment in Proactive Adaptation (Cont.)
Selection of the first/second-tier service is done using:
• random choice;
• fixed preference (lowest computation cost for n = 12); and
• data-aware computation cost minimization.

Message passing times for the services are simulated using the following two regimes:
(A) Random Gaussian choice with average 5 ms for all services.
(B) Varying average, 4–8 ms.

The effectiveness of the policies is compared w.r.t. total simulated time.
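The three policies can be sketched as follows (hypothetical setup: `ub` is a list of upper-bound cost functions, one per candidate service, and n is the measured input size):

```python
import random

# Sketch of the three selection policies compared in the experiment.
def select(policy, ub, n, fixed_n=12):
    if policy == "random":
        return random.randrange(len(ub))           # ignore cost information
    if policy == "fixed":
        # lowest upper-bound cost at a fixed reference size, data-unaware
        return min(range(len(ub)), key=lambda j: ub[j](fixed_n))
    if policy == "data":
        # data-aware: minimize the upper bound at the actual input size n
        return min(range(len(ub)), key=lambda j: ub[j](n))
    raise ValueError(policy)
```

With two candidates whose bound curves cross, the fixed policy picks the same service for every request, while the data-aware policy switches services as n changes, which is where its advantage comes from.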
A Simulation Experiment (Cont.)
Simulation results indicate that, for both cases (A and B) of service running time variations, the data-aware policy outperforms both the random choice and fixed preference policies.
• The x-axis gives the input data size in the range 0–50.
• The y-axis gives the total simulated running time.
• The fault rate is pf = 0.001.
[Plots: total simulated time (ms) vs. input size 5–50 for the random, fixed, and data-aware policies; regimes A (sim_s1_pf001.data) and B (sim_s2_pf001.data).]
Experiment in Proactive Adaptation (4)
Another set of simulation results, for pf = 0.1 (below), indicates that the advantages of the data-aware service selection policy persist even under very high noise / failure / unavailability rates.
• Both cases (A and B) of service running time variations are included.
• Overall, data awareness gives the best results for very small and very large input data sizes.
[Plots: total simulated time (ms) vs. input size 5–50 for the random, fixed, and data-aware policies; regimes A (sim_s1_pf100.data) and B (sim_s2_pf100.data).]
Current Restrictions on Orchestrations
Currently, we are looking at “common” orchestrations that respect some restrictions w.r.t. behavior.
• Overcoming these limitations is a goal for future work.

Orchestrations must follow the receive-reply interaction pattern:
• All processing happens between the reception of the initiating message and the dispatching of the (final) response.
• Applicable to processes that accept one among several possible input messages.
• Future work: relax this restriction by using fragmentation to identify/separate reply-response service sections.

Orchestrations must have no stateful callbacks:
• I.e., no correlation sets / WS-Addressing.
• Practical problem: current analyzers lose precision when passing opaque objects containing state.
• Future work: improve the translation and the analysis itself.
4 Conclusions
Conclusions
Data-aware computation cost functions can be used to predict QoS and thus drive QoS-aware adaptation, or to signal certain or possible QoS violations.

The approach is based on a translation scheme in which a logic program is generated from an orchestration represented in an intermediate language and analyzed by existing tools.
• The analysis derives computation cost functions which are safe upper and lower bounds on the orchestration’s computation cost.
• The computation cost functions are expressed as functions of the size of the input data, measured with some appropriate data metrics.
• The computation cost functions are combined with environment factors to build more precise QoS bound estimations as a function of the input data.
Conclusions (Cont.)
In predictive monitoring, simulation results suggest high accuracy of predictions ahead of time, including situations in which environmental conditions gradually deteriorate.
• The time between the detection and the occurrence of a violation may be used for preparing and triggering the appropriate adaptive action.

Simulation results indicate the usefulness of the approach in improving the efficiency of dynamic, run-time adaptation based on QoS-aware service selection.
• In general, data-aware adaptation gives better results than other service selection policies, even with very large variability in service availability.

The idea is to integrate the presented approach into service composition provision systems, collect empirical data, and compare and combine it with statistical / data mining approaches.
References
This presentation is based on [ICH10a, ICH10b].
Some pointers on QoS analysis and prediction for Web servicecompositions: [Car05, Car07, LWR+09, HKMP08, DMK10]
Some pointers on automatic complexity analysis / computational cost/ resource consumption analysis:[HBC+12, HPBLG05, NMLH09, NMLGH08, ABG+11]
Bibliography I

[ABG+11] E. Albert, R. Bubel, S. Genaim, R. Hähnle, G. Puebla, and G. Román-Díez. Verified resource guarantees using COSTA and KeY. In Siau-Cheng Khoo and Jeremy G. Siek, editors, PEPM, pages 73–76. ACM, 2011.
[Car05] J. Cardoso. About the Data-Flow Complexity of Web Processes. In 6th International Workshop on Business Process Modeling, Development, and Support: Business Processes and Support Systems: Design for Flexibility, pages 67–74, 2005.
[Car07] J. Cardoso. Complexity analysis of BPEL web processes. Software Process: Improvement and Practice, 12(1):35–49, 2007.
[DMK10] Dimitris Dranidis, Andreas Metzger, and Dimitrios Kourtesis. Enabling proactive adaptation through just-in-time testing of conversational services. In Elisabetta Di Nitto and Ramin Yahyapour, editors, ServiceWave, volume 6481 of Lecture Notes in Computer Science, pages 63–75. Springer, 2010.
Bibliography II
[HBC+12] M. V. Hermenegildo, F. Bueno, M. Carro, P. López, E. Mera, J. F. Morales, and G. Puebla. An Overview of Ciao and its Design Philosophy. Theory and Practice of Logic Programming, 12(1–2):219–252, January 2012. http://arxiv.org/abs/1102.5497.
[HKMP08] Julia Hielscher, Raman Kazhamiakin, Andreas Metzger, and Marco Pistore. A framework for proactive self-adaptation of service-based applications based on online testing. In Petri Mähönen, Klaus Pohl, and Thierry Priol, editors, Towards a Service-Based Internet, volume 5377 of Lecture Notes in Computer Science, pages 122–133. Springer Berlin / Heidelberg, 2008.
[HPBLG05] M. Hermenegildo, G. Puebla, F. Bueno, and P. López-García. Integrated Program Debugging, Verification, and Optimization Using Abstract Interpretation (and The Ciao System Preprocessor). Science of Computer Programming, 58(1–2):115–140, 2005.
Bibliography III

[ICH10a] D. Ivanović, M. Carro, and M. Hermenegildo. An Initial Proposal for Data-Aware Resource Analysis of Orchestrations with Applications to Predictive Monitoring. In Asit Dan, Frédéric Gittler, and Farouk Toumani, editors, International Workshops, ICSOC/ServiceWave 2009, Revised Selected Papers, number 6275 in LNCS. Springer, September 2010.
[ICH10b] D. Ivanović, M. Carro, and M. Hermenegildo. Towards Data-Aware QoS-Driven Adaptation for Service Orchestrations. In Proceedings of the 2010 IEEE International Conference on Web Services (ICWS 2010), Miami, FL, USA, 5–10 July 2010, pages 107–114. IEEE, 2010.
[LWR+09] Philipp Leitner, Branimir Wetzstein, Florian Rosenberg, Anton Michlmayr, Schahram Dustdar, and Frank Leymann. Runtime prediction of service level agreement violations for composite services. In Asit Dan, Frédéric Gittler, and Farouk Toumani, editors, ICSOC/ServiceWave Workshops, volume 6275 of Lecture Notes in Computer Science, pages 176–186, 2009.
Bibliography IV
[NMLGH08] J. Navas, E. Mera, P. López-García, and M. Hermenegildo. Inference of User-Definable Resource Bounds Usage for Logic Programs and its Applications. Technical Report CLIP5/2008.0, Technical University of Madrid (UPM), School of Computer Science, July 2008.
[NMLH09] J. Navas, M. Méndez-Lojo, and M. Hermenegildo. User-Definable Resource Usage Bounds Analysis for Java Bytecode. In Proceedings of the Workshop on Bytecode Semantics, Verification, Analysis and Transformation (BYTECODE’09), volume 253 of Electronic Notes in Theoretical Computer Science, pages 6–86. Elsevier – North Holland, March 2009.
Acknowledgments
The research leading to these results has received funding from the European Community’s Seventh Framework Programme [FP7/2007-2013] under grant agreement 215483 (S-Cube).