Predictive Process Monitoring with Hyperparameter Optimization

Predictive Process Monitoring Framework with Hyperparameter Optimization
Chiara Di Francescomarino, Chiara Ghidini (Fondazione Bruno Kessler)
Marlon Dumas, Fabrizio Maria Maggi (University of Tartu)
Marco Federici, Williams Rizzi (University of Trento)

Transcript of Predictive Process Monitoring with Hyperparameter Optimization

Page 1: Predictive Process Monitoring with Hyperparameter Optimization

Predictive Process Monitoring Framework with Hyperparameter Optimization

Chiara Di Francescomarino, Chiara Ghidini (Fondazione Bruno Kessler)

Marlon Dumas, Fabrizio Maria Maggi (University of Tartu)

Marco Federici, Williams Rizzi (University of Trento)

Page 2: Predictive Process Monitoring with Hyperparameter Optimization

Predictive Business Process Monitoring

Figure: historical execution traces, the running trace, and a prediction problem (for example, "Does Alice need a given exam?") are the inputs; a prediction is the output.

Page 3: Predictive Process Monitoring with Hyperparameter Optimization

Predictive Process Monitoring Frameworks

• Framework instance or configuration: combination of techniques and their input parameters (hyperparameters).

• No unique framework instance for all prediction problems and datasets.

Predictive Process Monitoring Framework

Figure: the inputs are the historical execution traces, the running trace, and the prediction problem; the output is a framework instance. The techniques and hyperparameters shown are:

• Clustering: K-means, DBScan, Agglomerative
• Encoding: Frequency-based, Sequence-based
• Classification: Decision Tree, Random Forest
• Voting
• Hyperparameters: Cluster Number, Minpoints, Epsilon, Voters, Seed
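To make the notion of a framework instance concrete, the following is a minimal Python sketch that enumerates instances as combinations of techniques and hyperparameter values. The technique names come from the slide; the candidate values, the dictionary layout, the assignment of each hyperparameter to a technique, and the function names are assumptions made for illustration only. The voting step is left out for brevity.

```python
from itertools import product

# Hypothetical search space. Technique names follow the slide; the candidate
# values and the hyperparameter-to-technique mapping are illustrative guesses.
CLUSTERING = {
    "kmeans": {"n_clusters": [5, 10]},
    "dbscan": {"min_points": [4], "epsilon": [0.1, 0.5]},
    "agglomerative": {"n_clusters": [5, 10]},
}
ENCODING = ["frequency", "sequence"]
CLASSIFICATION = {
    "decision_tree": {"seed": [0]},
    "random_forest": {"seed": [0]},
}


def expand(options):
    """Yield (technique, hyperparameters) pairs for one step of the pipeline."""
    for name, grid in options.items():
        keys = sorted(grid)
        for values in product(*(grid[k] for k in keys)):
            yield name, dict(zip(keys, values))


def framework_instances():
    """Yield every combination of clustering, encoding, and classifier."""
    for (clu, clu_hp), enc, (clf, clf_hp) in product(
        expand(CLUSTERING), ENCODING, expand(CLASSIFICATION)
    ):
        yield {
            "clustering": clu,
            "clustering_params": clu_hp,
            "encoding": enc,
            "classifier": clf,
            "classifier_params": clf_hp,
        }


instances = list(framework_instances())
print(len(instances))
```

Each technique only contributes its own hyperparameters, so the space is a union of small grids rather than one large Cartesian product over every parameter.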

Page 4: Predictive Process Monitoring with Hyperparameter Optimization

In the “Real” World

Does Alice need the exams tumor marker CA-19.9 or CA-125 using MEIA?

Which framework instance best suits my dataset and problem? Which one if I would like to have only accurate predictions?

Figure: the Predictive Process Monitoring Framework with its candidate techniques (K-means, DBScan, and Agglomerative clustering; Frequency-based and Sequence-based encoding; Decision Tree; Random Forest; Voting) and hyperparameters (Cluster Number, Minpoints, Epsilon, Voters, Seed).

Page 5: Predictive Process Monitoring with Hyperparameter Optimization

The Existing Landscape

• Existing approaches address:
– the selection of machine learning techniques
– the tuning of their hyperparameters
– the combined optimization of machine learning techniques and their hyperparameters
• Challenge: we need to deal with the combination of more than one machine learning technique, each depending on the others.

Page 6: Predictive Process Monitoring with Hyperparameter Optimization

How to Avoid Users’ Panic?

• A Predictive Process Monitoring Framework enhanced with technique and hyperparameter optimization:
1. An exhaustive exploration of a set of framework configurations. (How to do it efficiently?)
2. Comparison and analysis of the results. (How to support users?)

Page 7: Predictive Process Monitoring with Hyperparameter Optimization

The Enhanced Framework

Figure: the Predictive Process Monitoring Framework (inputs: prediction problem, historical execution traces, running trace; output: prediction) alongside the Technique and Hyperparameter Tuner with its Replayer and Evaluator (input: validation execution traces). Labels on the connections: Framework Instance, Aggregated Metrics.

Page 8: Predictive Process Monitoring with Hyperparameter Optimization

The Predictive Process Monitoring Framework

Figure: architecture of the framework. Inputs: historical execution traces, running trace, prediction problem. Output: prediction.

• Pre-processing (historical execution traces): prefix extraction → trace prefixes
– Control flow: control flow encoding → encoded control flow; clustering → clusters
– Data: data encoding → encoded data; labeling function; supervised learning → classifiers
• Runtime (running trace): predictive monitoring with control flow encoding, cluster(s) identification, data encoding, and classification

Page 9: Predictive Process Monitoring with Hyperparameter Optimization

The Predictive Process Monitoring Framework Instances

• Each technique has its own hyperparameters
• Other framework parameters:
– Trace prefix size
– Voting mechanism
– Interval choice in case of interval time predictions

Page 10: Predictive Process Monitoring with Hyperparameter Optimization

Technique and Hyperparameter Tuning

• A trace is replayed until an evaluation point with a prediction confidence above a given threshold is reached.

• Three metrics/evaluation dimensions:
– Accuracy
– Failure rate
– Earliness

Figure: components shown are ProM, the ProM Operational Support Service 2.0, the Predictive Monitor, and the Technique and Hyperparameter Tuner with its Replayer (input: validation execution traces), Configuration Sender, and Evaluator. Labels on the connections: Framework Instance, Aggregated Metrics.
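The replay rule and the three metrics can be sketched in Python as follows. This is an illustration under stated assumptions, not the implementation behind the slides: the `predict` callable, the `ReplayResult` type, and the exact metric formulas are hypothetical. In particular, failure rate is taken here as the share of traces that never reach the confidence threshold, accuracy is computed only over traces that received a prediction, and earliness is taken as one minus the fraction of the trace consumed before predicting.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Sequence, Tuple

# Hypothetical predictor interface: prefix -> (label, confidence in [0, 1]).
Predictor = Callable[[Sequence[str]], Tuple[str, float]]


@dataclass
class ReplayResult:
    predicted: Optional[str]  # None means no confident prediction (a failure)
    position: int             # number of events seen when the replay stopped
    length: int               # total number of events in the trace


def replay(trace: Sequence[str], predict: Predictor,
           eval_points: Sequence[int], min_confidence: float) -> ReplayResult:
    """Replay a trace and stop at the first evaluation point (ascending
    prefix lengths) whose prediction confidence exceeds the threshold."""
    for k in eval_points:
        if k > len(trace):
            break
        label, confidence = predict(trace[:k])
        if confidence > min_confidence:
            return ReplayResult(label, k, len(trace))
    return ReplayResult(None, len(trace), len(trace))


def aggregate(results: List[ReplayResult], actual: List[str]) -> Dict[str, float]:
    """Aggregate per-trace results into the three evaluation dimensions."""
    answered = [(r, a) for r, a in zip(results, actual) if r.predicted is not None]
    failure_rate = 1 - len(answered) / len(results)
    if not answered:
        return {"accuracy": 0.0, "failure_rate": failure_rate, "earliness": 0.0}
    accuracy = sum(r.predicted == a for r, a in answered) / len(answered)
    # Assumed definition: 1 - (events seen / trace length), higher is earlier.
    earliness = sum(1 - r.position / r.length for r, _ in answered) / len(answered)
    return {"accuracy": accuracy, "failure_rate": failure_rate, "earliness": earliness}
```

Raising `min_confidence` in this sketch tends to trade a higher failure rate and later predictions for more reliable ones, which is one reason the three dimensions are reported side by side.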

Page 11: Predictive Process Monitoring with Hyperparameter Optimization

Improving Efficiency

• Scheduling mechanism for parallel replayers
• Reuse of data structures

Figure: components shown are ProM, the ProM Operational Support Service 2.0, the Predictive Monitor, and the Technique and Hyperparameter Tuner (with a GUI), which contains an Unfolding Module, a Configuration Sender, a Replayer Scheduler, and Replayers 1 to N, plus a repository labeled "Structured structure Repository". Labels on the connections: configuration {Run ID}, <Run ID, Trace>.
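The two efficiency ideas above, parallel replayers and reuse of data structures, can be sketched in Python as follows. Everything here is an assumption made for illustration: the `StructureRepository` class, the `build_clusters` placeholder, the shape of a configuration, and the use of threads. Only the idea of dispatching <Run ID, Trace> pairs to several replayers and sharing already built structures comes from the slide.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class StructureRepository:
    """Caches structures that several configurations can share."""

    def __init__(self):
        self._lock = threading.Lock()
        self._store = {}
        self.builds = 0  # how many structures were actually built

    def get(self, key, build):
        with self._lock:  # simple but coarse: builds are serialized
            if key not in self._store:
                self._store[key] = build()
                self.builds += 1
            return self._store[key]


def build_clusters(clustering, n_clusters):
    """Placeholder for an expensive step such as clustering trace prefixes."""
    return [f"{clustering}-{i}" for i in range(n_clusters)]


def run_replayer(job, configs, repo):
    """Process one <run ID, trace> pair, reusing cached structures."""
    run_id, trace = job
    clustering, n_clusters, _classifier = configs[run_id]
    clusters = repo.get((clustering, n_clusters),
                        lambda: build_clusters(clustering, n_clusters))
    # Placeholder result: a real replayer would classify the trace here.
    return run_id, tuple(trace), clusters[len(trace) % len(clusters)]


def schedule(configs, traces, repo, n_replayers=8):
    """Dispatch every <run ID, trace> pair to a pool of parallel replayers."""
    jobs = [(run_id, trace) for run_id in configs for trace in traces]
    with ThreadPoolExecutor(max_workers=n_replayers) as pool:
        return list(pool.map(lambda job: run_replayer(job, configs, repo), jobs))
```

Two configurations that differ only in the classifier share the same cache key here, so the clustering is built once. For CPU-bound work in Python, a process pool would be the more realistic choice than threads.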

Page 12: Predictive Process Monitoring with Hyperparameter Optimization

Supporting Users in the Analysis of the Results

Page 13: Predictive Process Monitoring with Hyperparameter Optimization

Evaluation

• A suitable configuration for the prediction problem and dataset in practice:
1. Does it return a set of configurations suitable for the prediction problem?
2. Does the selected configuration meet the choice criteria?
3. Does it require a reasonable amount of time?

Page 14: Predictive Process Monitoring with Hyperparameter Optimization

Experimental Settings

• Two datasets and two prediction problems
– BPI Challenge 2011
– BPI Challenge 2015

Dataset preparation:
• Training set (70%)
• Validation set (20%)
• Testing set (10%)

Identification of the most suitable configurations (among 160)

Evaluation of the identified configurations (with the testing set)
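A 70/20/10 split of this kind can be sketched in Python as follows. The function name and the random shuffling are assumptions; the slides do not say whether the traces were split randomly or in temporal order.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_traces(traces: Sequence[T], seed: int = 0) -> Tuple[List[T], List[T], List[T]]:
    """Split traces into training (70%), validation (20%), and testing (10%)."""
    shuffled = list(traces)
    random.Random(seed).shuffle(shuffled)  # assumed: random, seeded split
    # Integer arithmetic avoids floating-point rounding surprises.
    n_train = len(shuffled) * 70 // 100
    n_val = len(shuffled) * 20 // 100
    train = shuffled[:n_train]
    validation = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # the remainder, about 10%
    return train, validation, test
```

The validation set drives the choice among configurations, and the testing set is held back to check the chosen configurations afterwards.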

Page 15: Predictive Process Monitoring with Hyperparameter Optimization

Configuration Set Variability

• Higher variability for the first dataset → tuning depends on users’ needs

• Lower variability for the second dataset → configurations do not change that much

Page 16: Predictive Process Monitoring with Hyperparameter Optimization

Configuration Selection

• No unique best configuration.
• Evaluation values are aligned with the tuning ones.
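One way to see why there may be no unique best configuration is to keep only the configurations that no other configuration beats on all three dimensions at once. The sketch below is illustrative and is not the selection procedure used in the slides; the function names are hypothetical, and it assumes that higher accuracy, lower failure rate, and higher earliness are better.

```python
from typing import Dict, List

Metrics = Dict[str, float]


def dominates(a: Metrics, b: Metrics) -> bool:
    """True if a is at least as good as b everywhere and better somewhere."""
    no_worse = (a["accuracy"] >= b["accuracy"]
                and a["failure_rate"] <= b["failure_rate"]
                and a["earliness"] >= b["earliness"])
    better = (a["accuracy"] > b["accuracy"]
              or a["failure_rate"] < b["failure_rate"]
              or a["earliness"] > b["earliness"])
    return no_worse and better


def pareto_front(configs: Dict[str, Metrics]) -> List[str]:
    """Return the names of configurations that no other one dominates."""
    return [name for name, m in configs.items()
            if not any(dominates(other, m) for other in configs.values())]
```

When the front contains more than one configuration, the choice among them depends on which dimension the user cares about most, for example accuracy alone.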

Page 17: Predictive Process Monitoring with Hyperparameter Optimization

Computation Time

• Computation time can depend on the trace length.
• Data structure reuse → 20% time reduction
• 8 replayers → 13% time reduction

Page 18: Predictive Process Monitoring with Hyperparameter Optimization

Summing up & Looking Ahead

• A predictive monitoring framework enhanced with technique and hyperparameter optimization
• Three directions:
– Increase user support
– Optimize exhaustive search
– Prescriptive process monitoring

THANK YOU!!