Chapter 10 Verification and Validation of Simulation Models



    Chapter 10

    Verification and Validation

    of Simulation Models

    Banks, Carson, Nelson & Nicol

    Discrete-Event System Simulation

    Learning Outcomes

After completing this chapter, you should be able to:

Distinguish between verification and validation.

Explain the importance of verification and validation.

Define key terms like high face validity, Turing test, calibration, and others.

Conduct simple statistical tests on assumed distributional forms.

Conduct a Turing test.

Compare model output to system output by statistical tests.


Purpose & Overview

The goal of the validation process is:

• To produce a model that represents true behavior closely enough for decision-making purposes.

• To increase the model's credibility to an acceptable level.

Validation is an integral part of model development:

• Verification: building the model correctly (correctly implemented with good input and structure).

• Validation: building the correct model (an accurate representation of the real system).

Most methods are informal subjective comparisons, while a few are formal statistical procedures.

The process of verification and validation

Validation is the process of ensuring that the model represents the real world as closely as possible. The validation process helps a modeler be certain that the correct model is built. It relies heavily on the data collected from the real world and on the modeler's perception and understanding of the process. The validation process ensures that the model is doing what the real process is doing. (See Figure 1.)

Verification is the process of comparing two or more results to ensure their accuracy.


    Overview

Validation is an integral part of model development:

Verification: building the model correctly (correctly implemented with good input and structure)

Validation: building the correct model (an accurate representation of the real system)

Usually achieved through calibration

Most methods are informal subjective comparisons, while a few are formal statistical procedures

Statistical procedures on output are the subject of Chapters 11 and 12

    10.1 Model building, verification, and validation

The first step in model building is observing the real system:

Interactions of components, collecting data

Take advantage of people with special knowledge

Construct a conceptual model:

Assumptions about components (hypotheses)

Structure of the system

Implementation of an operational model using software

This is not a linear process: you will return to each step many times while building, verifying, and validating the model.

Model-Building, Verification & Validation

Steps in Model-Building:

• Observing the real system, the interactions among its various components, and collecting data on their behavior.

• Construction of a conceptual model.

• Implementation of an operational model.

    10.2 Verification

Purpose: ensure the conceptual model is reflected accurately in the computerized representation.

The conceptual model usually involves some abstraction or some amount of simplification of actual operations.

Many common-sense suggestions, for example:

Have someone else check the model.

Make a flow diagram that includes each logically possible action a system can take when an event occurs.

Closely examine the model output for reasonableness under a variety of input parameter settings. (Often overlooked!)


    Verification (cont.)

Print the input parameters at the end of the simulation; make sure they have not been changed inadvertently.

Verify that the animation imitates the real system.

Monitor the simulation via a debugger: step through it and check values.

Use graphical interfaces as a form of self-documentation.

Are these the steps a software engineer would use?

Examination of Model Output for Reasonableness (see p. 392) [Verification]

Example: a model of a complex network of queues consisting of many service centers.

Response time is the primary interest; however, it is important to collect and print out many statistics in addition to response time.

Examination of Model Output for Reasonableness (cont.) [Verification]

Two statistics that give a quick indication of model reasonableness are current contents and total counts. For example:

If the current contents grow in a more or less linear fashion as the simulation run time increases, it is likely that a queue is unstable.

If the total count for some subsystem is zero, no items entered that subsystem, a highly suspect occurrence.

If the total and current counts are equal to one, this can indicate that an entity has captured a resource but never freed it.

Compute certain long-run measures of performance, e.g., compute the long-run server utilization and compare it to the simulation results (a sketch follows).
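A minimal sketch of such a check, assuming a single-server queue whose long-run utilization is ρ = λ · E[S]; the function name, rates, and tolerance are illustrative, not from the text:

```python
# Compare the analytic long-run utilization of a single-server queue
# (rho = lambda * E[S]) with the utilization reported by a simulation
# run; flag a suspicious discrepancy.
def check_utilization(sim_utilization, lam_per_hr, mean_service_min, tol=0.05):
    rho = lam_per_hr * (mean_service_min / 60.0)  # long-run utilization
    if abs(sim_utilization - rho) > tol:
        print(f"suspect: simulated {sim_utilization:.3f} vs long-run {rho:.3f}")
    return rho

# e.g., with the bank example's rates: 45 arrivals/hr, 1.1-min services
check_utilization(sim_utilization=0.83, lam_per_hr=45, mean_service_min=1.1)
```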


    Other Important Tools [Verification]

Documentation

A means of clarifying the logic of a model and verifying its completeness.

Use of a trace

A detailed printout of the state of the simulation model over time.

Gives the value of every variable in a program, every time one changes in value.

Can use print/write statements.

Use a selective trace (e.g., set up in an IDE such as Eclipse); a sketch follows.
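A minimal sketch of a selective trace built from print statements; the event tuples and the watched variable are illustrative stand-ins for a real event list:

```python
# Selective trace: print the simulation state only when a watched
# variable (here, the queue length) changes in value.
events = [(0.0, "arrival", 1), (0.8, "arrival", 2),
          (1.1, "departure", 1), (2.3, "departure", 0)]

watched_prev = None
for clock, event, queue_len in events:
    if queue_len != watched_prev:      # trace only on change
        print(f"t={clock:5.2f}  {event:9s}  queue_len={queue_len}")
        watched_prev = queue_len
```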


    10.3 Calibration and Validation

Verification and validation are usually conducted simultaneously.

Validation: the overall process of comparing the model and its behavior to the real system.

Calibration: the iterative process of comparing the model to the real system and making adjustments.

Some tests are subjective and others objective.

Objective tests require data on the system's behavior and corresponding data produced by the model.

Calibration and Validation

[Figure: the iterative process of calibrating a model]

Danger during the calibration phase:

• Typically few data sets are available, in the worst case only one, and the model is only validated for these.

• Solution: if possible, collect new data sets.

No model is ever a perfect representation of the system.

The modeler must weigh the possible, but not guaranteed, increase in model accuracy versus the cost of increased validation effort.

Three-step approach:

Build a model that has high face validity.

Validate model assumptions.

Compare the model input-output transformations with the real system's data.

    10.3.1 High Face Validity [Calibration & Validation]

Ensure a high degree of realism: potential users should be involved in model construction (from its conceptualization to its implementation).

Sensitivity analysis can also be used to check a model's face validity.

Example: in most queueing systems, if the arrival rate of customers were to increase, it would be expected that server utilization, queue length, and delays would tend to increase (a monotonicity sketch follows this list).

For large-scale simulation models, there are many input variables and thus possibly many sensitivity tests. It is sometimes not possible to perform all of these tests; select the most critical ones.
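A minimal sketch of that sensitivity check, using the standard M/M/1 formulas as a stand-in for the simulation model (an assumption; in practice you would rerun the simulation at each arrival rate):

```python
# Face-validity check: utilization, queue length, and delay should all
# increase with the arrival rate.
def mm1_measures(lam, mu):
    """Utilization, mean queue length, and mean delay of a stable M/M/1 queue."""
    assert lam < mu, "unstable queue"
    rho = lam / mu
    wq = rho / (mu - lam)   # mean delay in queue (hours)
    lq = lam * wq           # mean queue length, by Little's law
    return rho, lq, wq

mu = 60 / 1.1               # services per hour for a 1.1-minute mean
prev = (0.0, 0.0, 0.0)
for lam in [30, 35, 40, 45]:                 # arrivals per hour
    cur = mm1_measures(lam, mu)
    assert all(c >= p for c, p in zip(cur, prev)), "face-validity check failed"
    prev = cur
print("monotonicity check passed")
```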


    10.3.2 Validate Model Assumptions [Cal & Val]

General classes of model assumptions:

Structural assumptions: how the system operates.

Data assumptions: reliability of data and its statistical analysis.

Bank example: customer queueing and service facility in a bank.

Structural assumptions, e.g., customers waiting in one line versus many lines, served FCFS versus by priority.

Data assumptions, e.g., interarrival times of customers, service times for commercial accounts.

Verify data reliability with bank managers.

Test correlation and goodness of fit for the data (see Chapter 9 for more details).

Analysis of input data (from Chapter 9):

1. Identify an appropriate probability distribution.

2. Estimate the parameters of the hypothesized distribution.

3. Validate the assumed statistical model by a goodness-of-fit test, such as the chi-square or Kolmogorov-Smirnov test, and by graphical methods (a sketch follows).
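A minimal sketch of steps 2 and 3 with SciPy, testing hypothesized exponential interarrival times with a Kolmogorov-Smirnov test; the data array is synthetic for illustration, and in practice you would use the collected observations:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Stand-in for 90 collected interarrival times (minutes).
interarrivals = rng.exponential(scale=60 / 45, size=90)

# Step 2: estimate the parameter of the hypothesized distribution.
mean_hat = interarrivals.mean()

# Step 3: K-S goodness-of-fit test against Exp(mean_hat). Note that
# estimating the parameter from the same data makes the standard K-S
# critical values only approximate.
d_stat, p_value = stats.kstest(interarrivals, "expon", args=(0, mean_hat))
print(f"D = {d_stat:.3f}, p = {p_value:.3f}")
if p_value < 0.05:
    print("Reject the hypothesized exponential form.")
```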


    10.3.3 Validate Input-Output Transformation (I-O)

Goal: validate the model's ability to predict future behavior.

This is the only objective test of the model as a whole.

If the input variables were to increase or decrease, the model should accurately predict what would happen in the real system.

The structure of the model should be accurate enough to make good predictions for the range of input data sets of interest.

In this process the model is viewed as an input-output transformation.

One possible approach: use historical data that have been reserved for validation purposes only.

Criteria: use the main responses of interest.

If the model is used for a different purpose later, revalidate it.

    Validate I-O Transformations [Cal & Val]

Note that for validation of the I-O transformation, some version of the system under study must exist.

Without the system, other types of validation should be used.

Maybe just subsystems exist.

Transfer to a new set of input parameters may cause changes in the operational model and may require revalidation.

This is an interesting validation problem that really concerns what-if questions about the model.


    Changing the Input-Output transformations

The model may be used to compare different system designs, or

the model may be used to investigate system behavior under new input conditions.

What can be said about the validity of the model under these new conditions?

The responses of the two models are used to compare the two systems.

Hope that confidence in the old model can be transferred to the new one.

    Typical changes in the operational model

Minor changes of a single numerical parameter, such as the arrival rate of customers

Minor changes of the form of a statistical distribution, such as the service-time distribution

Major changes involving the logical structure of a subsystem

Major changes involving a different design for the new system (say, computerizing the inventory)

Minor changes can be carefully verified and the model accepted with confidence.

It is possible to do a partial validation on the major changes.

There is no way to completely validate the input-output transformations of a nonexistent system.

Example 10.2: The Fifth National Bank of Jasper [Validate I-O Transformation]

One drive-in window serviced by one teller; only one or two transactions are allowed at the window.

So it was assumed that each service time was drawn from some underlying population.

Data collection: 90 customers during 11 am to 1 pm.

Observed service times {Si, i = 1, 2, …, 90}.

Observed interarrival times {Ai, i = 1, 2, …, 90}.

Data analysis (Chapter 9) led to the conclusion that:

Interarrival times: exponentially distributed with rate λ = 45/hr; that is, a Poisson arrival process.

Service times: assumed to be N(1.1, (0.2)²) minutes.


    The Black Box [Bank Example: Validate I-O Transformation]

A model was developed in close consultation with bank management and employees.

Model assumptions were validated.

The resulting model is now viewed as a "black box":

Input variables:

Uncontrolled variables, X:

Poisson arrivals, λ = 45/hr: X11, X12, …

Service times, N(D2, 0.2²): X21, X22, …

Controlled decision variables, D:

D1 = 1 (one teller)

D2 = 1.1 min (mean service time)

D3 = 1 (one line)

Model ("black box"): f(X, D) = Y

Model output variables, Y:

Primary interest:

Y1 = teller's utilization

Y2 = average delay

Y3 = maximum line length

Secondary interest:

Y4 = observed arrival rate

Y5 = average service time

Y6 = sample std. dev. of service times

Y7 = average length of the line
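A minimal simulation sketch of this black box under the stated assumptions (Poisson arrivals at 45/hr, N(1.1, 0.2²) service times, one teller, one line, a 2-hour run); the function and variable names are illustrative:

```python
import numpy as np

def bank_model(lam_per_min=45 / 60, mean_s=1.1, sd_s=0.2,
               horizon_min=120.0, seed=None):
    """One replication of f(X, D) = Y for a single-teller FCFS queue.
    Returns Y1 (utilization), Y2 (average delay, min), Y3 (max line length)."""
    rng = np.random.default_rng(seed)

    # X: uncontrolled inputs -- Poisson arrivals and normal service times
    # (clipped at zero, since N(1.1, 0.2^2) can in principle go negative).
    arrivals, t = [], 0.0
    while True:
        t += rng.exponential(1.0 / lam_per_min)
        if t > horizon_min:
            break
        arrivals.append(t)
    services = np.clip(rng.normal(mean_s, sd_s, len(arrivals)), 0.0, None)

    # FCFS recursion: a customer starts service when both it and the
    # teller are ready.
    start, depart, prev_depart = [], [], 0.0
    for a, s in zip(arrivals, services):
        b = max(a, prev_depart)
        start.append(b)
        prev_depart = b + s
        depart.append(prev_depart)

    y1 = float(services.sum()) / horizon_min                   # utilization
    y2 = float(np.mean(np.array(start) - np.array(arrivals)))  # avg delay
    # line length seen by each arrival: customers still present, minus
    # the one in service
    depart = np.array(depart)
    in_sys = [int((depart[:i] > a).sum()) for i, a in enumerate(arrivals)]
    y3 = max((max(n - 1, 0) for n in in_sys), default=0)
    return y1, y2, y3

print(bank_model(seed=1))   # one replication of (Y1, Y2, Y3)
```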

Comparison with Real System Data [Bank Example: Validate I-O Transformation]

Real system data are necessary for validation.

System responses should have been collected during the same time period (from 11 am to 1 pm on the same Friday).

Compare the average delay from the model, Y2, with the actual delay, Z2:

The average delay observed was Z2 = 4.3 minutes; consider this to be the true mean value, μ0 = 4.3.

When the model is run with generated random variates X1n and X2n, Y2 should be close to Z2.

Six statistically independent replications of the model, each of 2-hour duration, are run.

Comparison with Real System Data (cont.) [Bank Example: the six replication results are given in Table 10.2]

Hypothesis Testing [Bank Example: Validate I-O Transformation]

Compare the average delay from the model, Y2, with the actual delay, Z2 (continued):

Null hypothesis testing: evaluate whether the simulation and the real system are the same (w.r.t. output measures):

H0: E(Y2) = 4.3 minutes
H1: E(Y2) ≠ 4.3 minutes

If H0 is not rejected, then there is no reason to consider the model invalid.

If H0 is rejected, the current version of the model is rejected, and the modeler needs to improve the model.

Hypothesis Testing (cont.) [Bank Example: Validate I-O Transformation]

Conduct the t test:

Choose a level of significance (α = 0.05) and a sample size (n = 6); see the results in Table 10.2.

Compute the sample mean and sample standard deviation over the n replications:

Ȳ2 = (1/n) Σ(i=1..n) Y2i = 2.51 minutes
S = sqrt[ Σ(i=1..n) (Y2i − Ȳ2)² / (n − 1) ] = 0.82 minutes

Compute the test statistic:

t0 = (Ȳ2 − μ0) / (S/√n) = (2.51 − 4.3) / (0.82/√6) ≈ −5.34

Since |t0| > t(0.025, 5) = 2.571 (for a 2-sided test), reject H0. Conclude that the model is inadequate.

Check the assumptions justifying a t test: that the observations (Y2i) are normally and independently distributed. (A computational sketch follows.)
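A minimal sketch of this test from the summary statistics quoted above (SciPy supplies the critical value):

```python
from math import sqrt
from scipy import stats

# n = 6 replications, sample mean 2.51 min, sample sd 0.82 min,
# hypothesized true mean mu0 = 4.3 min (from the observed system data).
n, ybar, s, mu0, alpha = 6, 2.51, 0.82, 4.3, 0.05

t0 = (ybar - mu0) / (s / sqrt(n))
t_crit = stats.t.ppf(1 - alpha / 2, df=n - 1)   # two-sided critical value
print(f"t0 = {t0:.2f}, t_crit = {t_crit:.3f}")
if abs(t0) > t_crit:
    print("Reject H0: the model is inadequate at this significance level.")
```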

Hypothesis Testing (cont.) [Bank Example: Validate I-O Transformation]

Similarly, compare the model output with the observed output for the other measures: Y4 with Z4, Y5 with Z5, and Y6 with Z6.


    Type II Error [Validate I-O Transformation]

For validation, the power of the test is:

Probability[detecting an invalid model] = 1 − β

β = P(Type II error) = P(failing to reject H0 | H1 is true)

Considering failure to reject H0 as a strong conclusion, the modeler would want β to be small.

The value of β depends on:

The sample size, n

The true difference, δ, between E(Y) and μ0, scaled by the standard deviation: δ = |E(Y) − μ0| / σ

In general, the best approach to controlling the β error is:

Specify the critical difference, δ.

Choose a sample size, n, by making use of the operating characteristic curve (OC curve). (A power-computation sketch follows this list.)
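As an alternative to reading β off the OC curves, a minimal sketch computes the power 1 − β of the two-sided one-sample t test directly from the noncentral t distribution, assuming the standardized difference δ = |E(Y) − μ0|/σ:

```python
from math import sqrt
from scipy import stats

def power_t_test(delta, n, alpha=0.05):
    """Power (1 - beta) of a two-sided one-sample t test, where
    delta = |E(Y) - mu0| / sigma is the standardized critical difference."""
    df = n - 1
    t_crit = stats.t.ppf(1 - alpha / 2, df)
    ncp = delta * sqrt(n)                        # noncentrality parameter
    beta = stats.nct.cdf(t_crit, df, ncp) - stats.nct.cdf(-t_crit, df, ncp)
    return 1.0 - beta

# e.g., the chance of detecting a one-sigma difference with n = 6 replications
print(f"power = {power_t_test(delta=1.0, n=6):.2f}")
```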

Type I and II Error [Validate I-O Transformation]

Type I error (α):

Error of rejecting a valid model.

Controlled by specifying a small level of significance, α.

Type II error (β):

Error of accepting a model as valid when it is invalid.

Controlled by specifying the critical difference and finding the corresponding sample size, n.

For a fixed sample size n, increasing α will decrease β.

Confidence Interval Testing [Validate I-O Transformation]

Confidence interval testing: evaluate whether the simulation and the real system are close enough.

If Y is the simulation output and μ = E(Y), the confidence interval (C.I.) for μ is:

Ȳ ± t(α/2, n−1) S/√n

Validating the model (see Figure 10.6): Suppose the C.I. does not contain μ0:

If the best-case error is > ε, the model needs to be refined.

If the worst-case error is ≤ ε, accept the model.

If the best-case error is ≤ ε but the worst-case error is > ε, additional replications are necessary.

Suppose the C.I. contains μ0:

If either the best-case or worst-case error is > ε, additional replications are necessary.

If the worst-case error is ≤ ε, accept the model.

Confidence Interval Testing (cont.) [Validate I-O Transformation]

Bank example: μ0 = 4.3 minutes, and "close enough" is ε = 1 minute of expected customer delay.

A 95% confidence interval, based on the 6 replications, is [1.65, 3.37] because:

Ȳ ± t(0.025, 5) S/√n = 2.51 ± 2.571 (0.82/√6) = 2.51 ± 0.86

μ0 = 4.3 falls outside the confidence interval: the best-case error |3.37 − 4.3| = 0.93 < 1, but the worst-case error |1.65 − 4.3| = 2.65 > 1, so additional replications are needed to reach a decision. (A computational sketch follows.)
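A minimal sketch of the confidence-interval test using the summary statistics quoted above; the decision rules mirror the list on the previous slide:

```python
from math import sqrt
from scipy import stats

# n = 6 replications, sample mean 2.51 min, sample sd 0.82 min,
# observed system mean mu0 = 4.3 min, "close enough" eps = 1 min.
n, ybar, s, mu0, eps, alpha = 6, 2.51, 0.82, 4.3, 1.0, 0.05

half = stats.t.ppf(1 - alpha / 2, n - 1) * s / sqrt(n)
lo, hi = ybar - half, ybar + half
best = 0.0 if lo <= mu0 <= hi else min(abs(lo - mu0), abs(hi - mu0))
worst = max(abs(lo - mu0), abs(hi - mu0))
print(f"C.I. = [{lo:.2f}, {hi:.2f}], best = {best:.2f}, worst = {worst:.2f}")

if best > eps:
    print("Refine the model.")
elif worst <= eps:
    print("Accept the model as valid.")
else:
    print("Additional replications are necessary.")
```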

Confidence Interval Testing [Figure 10.6: the possible validation outcomes]

10.3.4 Using Historical Input Data [Validate I-O Transformation]

An alternative to generating input data:

Use the actual historical record.

Drive the simulation model with the historical record and then compare model output to system data.

In the bank example, use the recorded interarrival and service times for the customers, {An, Sn, n = 1, 2, …}.

The procedure and validation process are similar to the approach used for system-generated input data (a trace-driven sketch follows this list).

You should follow the Candy Factory example (Example 10.3) to become familiar with using computer-system trace (historical) data.
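A minimal trace-driven sketch: the same FCFS logic as the earlier black-box sketch, but driven by recorded {An, Sn} pairs instead of generated variates; the arrays are illustrative stand-ins for the historical record:

```python
# Drive a single-teller FCFS model with recorded interarrival times A[n]
# and service times S[n], in minutes (illustrative values).
A = [1.2, 0.7, 2.9, 0.4, 1.8]
S = [1.0, 1.3, 0.9, 1.2, 1.1]

clock, prev_depart, delays = 0.0, 0.0, []
for a, s in zip(A, S):
    clock += a                          # arrival time of this customer
    start = max(clock, prev_depart)     # FCFS: wait for the teller
    delays.append(start - clock)        # delay in queue
    prev_depart = start + s

avg_delay = sum(delays) / len(delays)
print(f"model average delay = {avg_delay:.2f} min")  # compare with Z2
```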


    Using a Turing Test [Validate I-O Transformation]

A Turing test is a method of inquiry in artificial intelligence (AI) for determining whether or not a computer is capable of thinking like a human being.

The test is named after Alan Turing, an English computer scientist, cryptanalyst, mathematician, and theoretical biologist.

Using a Turing Test (cont.) [Validate I-O Transformation]

Use it in addition to statistical tests, or when no statistical test is readily applicable.

Utilize persons' knowledge about the system.

For example:

Present 10 system performance reports to a manager of the system. Five of them are from the real system and the rest are "fake" reports based on simulation output data.

If the person identifies a substantial number of the fake reports, interview the person to get information for model improvement.

If the person cannot distinguish between fake and real reports with consistency, conclude that the test gives no evidence of model inadequacy. (A sketch quantifying this follows.)
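One way to quantify "a substantial number": under pure guessing, the number of fake reports correctly identified when the manager labels 5 of the 10 as fake follows a hypergeometric distribution. This framing is an illustration, not a prescription from the text:

```python
from scipy import stats

# 10 reports, 5 fake; the manager labels 5 as fake. Under pure guessing,
# the number of correctly identified fakes is Hypergeometric(10, 5, 5).
guessing = stats.hypergeom(M=10, n=5, N=5)

k = 5                           # fakes correctly identified (illustrative)
p_value = guessing.sf(k - 1)    # P(X >= k) by chance alone
print(f"P(>= {k} correct by guessing) = {p_value:.3f}")
if p_value < 0.05:
    print("The manager can tell them apart: interview for model improvements.")
```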


    Summary

Model validation is essential:

Model verification

Calibration and validation

Conceptual validation

It is best to compare system data to model data, and to make the comparison using a wide variety of techniques.

Some techniques that we covered (in increasing cost-to-value ratios):

Ensure high face validity by consulting knowledgeable persons.

Conduct simple statistical tests on assumed distributional forms.

Conduct a Turing test.

Compare model output to system output by statistical tests.